Designing with Versal AI Engine: Kernel Programming and Optimization – 3

Course Code: AIE-KERNEL

This course covers the advanced features of the AMD Versal adaptive SoC AI Engine, including kernel function development, optimizing an AI Engine kernel program, using filter intrinsics and AI Engine APIs, and debugging an application in the Vitis unified software platform.

The emphasis of this course is on:

  • Reviewing the features of the Versal device AI Engine architecture
  • Optimizing AI Engine kernels using compiler directives, programming style, and efficient movement of data
  • Describing C++ kernel template functionality
  • Identifying the different types of kernel instance states
  • Programming FIR filters using AI Engine APIs
  • Debugging applications using the Vitis unified software platform

Click here for more information about the AMD Versal Adaptive SoC (formerly ACAP).

See Course Outline

3-Day Instructor-led CoursePrice USDTraining Credits
Hosted Online - $600/day$180018
In-Person Public Registration - $600/day$180018
Printed Course Book (A PDF book is included in the course fee)
Cannot be purchased without registration.
$1001
Private TrainingLearn MoreLearn More
CoachingLearn MoreLearn More

Scheduled Classes

No Scheduled Sessions - Contact Us to ask about setting one up!

View our Full Calendar for class date status.
(Confirmed, Closed, Full)

Training Duration:

3 Days

College course fit into 3 days

The instructor certainly knew the material and could explain the concepts as well as answer questions. Even the instructor said that this is a college course fit into 3 days.

Student from Designing with VDHL

Labs were great

The labs were great and really reinforced the topics.

– Student from Designing with Versal AI Engine 1: Architecture and Design Flow

I had a wonderful instructor

I had a wonderful instructor. His pacing throughout the course was good and made sure to allow for student questions and have conversations about related topics and experiences. I think the atmosphere was great for everyone to both learn and to share experiences, tips, and tricks about using the tool and the features discussed throughout the course.

Student from Vivado Boot Camp for the FPGA User Phase 3

Thanks for a great class!

I have attended a bunch of training courses over the years. This one was definitely one of the best I have attended. Erich did a great job, and the material is very well done. Thanks for a great class!

– Student from Vivado Boot Camp for the FPGA User Phase 1

One of the best experiences for AMD Xilinx training that I’ve had

Bill was a great instructor and answered all of our questions. He went above and beyond to make this course a great experience. If/When I use BLT for Xilinx training in the future I will be on the lookout to see if he’s leading the lecture. One of the best experiences for AMD Xilinx training that I’ve had.

– Student from Designing with VHDL

All in all a great experience

Tom was a great instructor, very knowledgeable and polite throughout the course. All in all a great experience.

– Student from Vivado Boot Camp for the FPGA User Phase 2

I gained a lot of information

The class was pretty great and I gained a lot of information from it that I will certainly be applying at my job going forward!!

– Student from Vivado Boot Camp for the FPGA User Phase 1

Knowledgeable instructor

Elie was a knowledgeable instructor, and did a really good job of making sure students were comfortable interrupting for questions. He answered questions well and communicated very clearly.

– Student from Designing with VHDL

Expert tidbits

I liked the expert tidbits my instructor threw in to keep in mind when working on projects in the future regarding best practices. I also appreciated the questions the more experienced students asked, and how he was knowledgeable in order to address them.

Student from Designing with VHDL

This one was definitely one of the best

I have attended a bunch of training courses over the years. This one was definitely one of the best I have attended. Erich did a great job, and the material is very well done. Thanks for a great class!

– Student from Vivado Boot Camp for the FPGA User Phase 1

My instructor was very professional

My instructor was very professional and answered all of my questions thoroughly. I enjoyed hearing about his professional experience with certain aspects of the course / labs as we went through the course.

– Student from Vivado Boot Camp for the FPGA User Phase 1

My instructor took time

My instructor took time during some of the breaks to look up and distribute information about questions that he didn’t happen to know direct answers to, and I always appreciate when instructors take the time to do that.

Student from Vivado Boot Camp for the FPGA User Phase 3

Elie was an exceptional instructor

Elie was an exceptional instructor, and I would welcome the opportunity to take another class from him and BLT in the future.

– Student from Designing with Verilog

I would endorse him to teach a friend

Cole was a fantastic instructor and was very proactive in answering any questions that came up. I would endorse him to teach if a friend had to learn from this course.

– Student from Designing with Verilog

Erich was engaging

Erich was engaging and had good pacing during the course. Although the course was all day for 3 days I didn’t feel exhausted at the end of sessions.

– Student from Vivado Boot Camp for the FPGA User Phase 1

Impressed with the effort

Glenn is a good instructor – I’m impressed with the effort he put into the presentation.
I hope I didn’t annoy him with too many questions.

– Student from Designing with Versal AI Engine 3: Kernel Programming and Optimization

Can quickly and concisely answer technical questions

I really like the expertise of the presenters and that they can quickly and concisely answer technical questions, Tom did great!

– Student from Vivado Boot Camp for the FPGA User Phase 3

A lot of insights beyond the course

Glenn was a great instructor and provided us with a lot of insights beyond the course material

– Student from Embedded Design with PetaLinux Tools

I have a great grasp of HLS and how to use Vitis effectively

I really enjoyed this class and feel like I have a great grasp of HLS and how to use Vitis effectively. Cole was a great instructor, and I
would easily take another class with him. Thank you very much for running this class!

– Student from High-Level Synthesis with the Vitis HLS Tool

They had answers for just about every question

Erich and Nathaniel were great, they had answers for just about every question/issue and linked relevant Xilinx/Vivado user manuals for further explanation/documentation.

– Student from Vivado Boot Camp for the FPGA User Phase 2

My instructor was very capable

My instructor was very capable of answering any of my questions even when they were an extension of the material being presented. If he wasn’t sure of an answer, he made sure to verify his thoughts before answering my question

– Student from Vivado Boot Camp for the FPGA User Phase 1

The instructor was excellent

The instructor for this class, Glenn, was excellent. He presented the material with great examples and encouraged students to ask questions at any point in the course. Whenever there was a question he could not answer, he mentioned that he would bring it to his colleagues for answers, and after we came back from lunch, he had the answer.

– Student from Embedded Design with PetaLinux Tools

Be the first to know. Sign up for our newsletter.

Who should attend:

Software and hardware developers, system architects, and anyone who needs to accelerate their software applications using AMD devices.

Skills Gained

After completing this comprehensive training, you will know how to:

  • Utilize various Versal AI Engine kernel optimization techniques, such as compiler directives, software pipelining, coding for performance, and core utilization
  • Apply C coding guidelines for performance improvement, including function inlining, pointer restricting, and code shuffling
  • Identify and implement the different types of kernel instance states using C++ kernel development
  • Implement AI Engine kernels using AI Engine APIs for symmetric and non-symmetric FIRs using aie::sliding_mul_sym_xy_ops
  • Debug an application using the simulation debugging methodology and event traces

Software Tools

  • Vitis unified software platform

Hardware

  • Architecture: Versal adaptive SoCs

Course Outline

Day 1Day 2Day 3
AI Engine and Memory Module Architecture
Introduces the architecture of the AI Engine and describes the memory module architecture for the AI Engine. {Lecture}

Versal AI Engine Data Movement and Interfaces
Describes the data movement and memory access by the AI Engines in the AI Engine arrays. Also reviews the AI Engine interfaces that are available, including the lock, core debug, cascaded stream, and AXI-Stream interfaces. {Lecture}

Overview of AI Engine Kernel Optimization
Explains the various AI Engine kernel optimization techniques, such as compiler directives, software pipelining, coding for performance, and core utilization. {Lecture}

AI Engine Kernel Optimization – Compiler Directives
Describes the usage of compiler directives for loop unrolling, loop flattening, and software pipelining to help improve the performance of AI Engine kernels. {Lecture}

AI Engine Kernel Optimization – Coding Style
Covers the coding guidelines for performance improvement, including function inlining, pointer restricting, and code shuffling. Also covers calculating AI Engine utilization for the kernels to help improve performance. The lab illustrates applying kernel optimization techniques such as the restrict keyword, custom pragmas, and code restructuring. {Lecture, Lab}
Advanced C++ Kernel Programming
Provides an overview of C++ kernel template functionality and the different types of states and kernel instance states using C++ classes. Also covered are kernel instance states with scalar parameters in a constructor as well as kernel instance states with array parameters in a constructor. {Lecture, Lab}

Vector Data Types (Review)

Provides an AI Engine functional overview and identifies the supported vector data types and high-width registers for allowing single instruction, multiple data (SIMD) instructions. {Lecture}

AI Engine Symmetric and Asymmetric Filter Implementation

Describes AI Engine APIs for symmetric and asymmetric FIR implementation, such as aie::sliding_mul_sym_xy_ops operators. Also, provides an overview of the DSP library, which can help with creating filters more easily and faster. {Lecture, Lab}

Debugging AI Engine Applications – Event Trace

Describes the application simulation debugging methodology as well as debugging with event traces, such as AI Engine events, DMA events, lock events, and stream events. Also demonstrates how to visualize these events in the Vitis unified software platform. {Lecture}

Debugging AI Engine Applications – Use Cases
Reviews various use cases of problems that arise, such as memory conflicts and deadlock analysis. Also covers performance analysis (profiling) in hardware. {Lecture, Lab}
AI Engine: DSP Library Overview
Provides an overview of the available DSP library, which enables faster development and comes with ready-to-use example designs that help with using the library and tools. {Lecture, Labs}

AI Engine Symmetric Filter Implementation Using Intrinsics

Describes advanced MAC intrinsic syntax, including the intrinsics for symmetric FIR implementation, such as mul4_sym and mac4_sym. Also provides guidelines for choosing the right fixed-point intrinsics for a FIR filter. {Lecture}

AI Engine Non-Symmetric Filter Implementation Using Intrinsics

Describes the intrinsics for non-symmetric FIR implementations, such as mul4_nc and mac4_nc. Also provides guidelines for choosing the right intrinsics for a FIR filter. {Lecture}

Floating-Point Operations Using Intrinsics

Reviews the floating-point operations fpmul, fpmac, and fpmsc as well as the fully configurable, floating-point intrinsics fpmac_conf. {Lecture}

Please note: The instructor may change the content order to provide a better learning experience.

Prerequisites:

  • Comfort with the C/C++ programming language
  • Software development flow
  • Vitis software for application acceleration development flow
  • Designing with Versal AI Engine: Quick Start
  • Designing with Versal AI Engine: Architecture and Design Flow - 1
  • Designing with Versal AI Engine: Graph Programming with AI Engine Kernels - 2

Related Courses:

Updated 8-18-2024
©2024 Advanced Micro Devices, Inc. Xilinx, Inc. is now part of AMD. Xilinx, the Xilinx logo, AMD, the AMD Arrow logo, Alveo, Artix, Kintex, Kria, Spartan, Versal, Vitis, Virtex, Vivado, Zynq, and other designated brands included herein are trademarks of Advanced Micro Devices, Inc.