Course Outline

Introduction

Grasping the Fundamentals of Heterogeneous Computing Methodology

Rationale for Parallel Computing: Understanding the Need

Multi-Core Processors: Architecture and Design

Introduction to Threads, Thread Basics, and Core Concepts of Parallel Programming

Comprehending the Fundamentals of GPU Software Optimization Processes

OpenMP: A Standard for Directive-Based Parallel Programming

Hands-on Demonstration of Various Programs on Multicore Machines

Introduction to GPU Computing

Leveraging GPUs for Parallel Computing

GPU Programming Model

Hands-on Demonstration of Various Programs on GPUs

SDK, Toolkit, and Environment Installation for GPU

Working with Various Libraries

Demonstration of GPU and Tools with Sample Programs and OpenACC

Understanding the CUDA Programming Model

Learning the CUDA Architecture

Exploring and Setting Up the CUDA Development Environment

Working with the CUDA Runtime API

Understanding the CUDA Memory Model

Exploring Additional CUDA API Features

Efficient Global Memory Access in CUDA: Global Memory Optimization

Optimizing Data Transfers in CUDA Using CUDA Streams

Utilizing Shared Memory in CUDA

Understanding and Using Atomic Operations and Instructions in CUDA

Case Study: Basic Digital Image Processing with CUDA

Working with Multi-GPU Programming

Advanced Hardware Profiling and Sampling on NVIDIA GPUs with CUDA

Using the CUDA Dynamic Parallelism API for Dynamic Kernel Launches

Summary and Conclusion

Requirements

  • Proficiency in C Programming
  • Experience with GCC on Linux

Duration

  • 21 Hours
