6. GPU and Hardware Acceleration
6.1. Part 1
6.1.1. Install packages
6.1.2. Preparations
6.1.3. GPU Architecture
6.1.4. Window Sum Example
6.1.5. Matrix Multiplication
6.1.6. Shared Memory Blocking
6.1.7. Leveraging Automatic Program Optimization
6.1.8. Summary
6.2. Part 2
6.2.1. Preparations
6.2.2. Hardware Specialization Trend
6.2.3. Tensorization
6.2.4. Discussions
6.2.5. Summary