cuTile Code Samples

This repository contains various examples demonstrating the use of cuTile for implementing high-performance GPU kernels in Python. cuTile simplifies writing CUDA kernels by providing a Pythonic interface for GPU programming concepts like tiling, shared memory, and warp-level operations.

Each sample showcases one fundamental operation, implemented directly as a cuTile kernel.

Samples Included

Batch Matrix Multiplication (BatchMatMul.py)

Purpose: Implements batched matrix multiplication (C = A * B) for 3D tensors.

Key Concepts: 3D grid launches, ct.mma for efficient matrix multiply-accumulate.

Dependencies: torch, math, numpy
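
The 3D launch idea can be sketched on the CPU as follows. This is a plain-Python illustration, not cuTile code: the function name `batch_matmul_grid`, the tile sizes, and the grid axis order (batch, row tile, column tile) are assumptions made for the example, and the per-tile sum stands in for the work the sample performs with ct.mma.

```python
import itertools
import math


def batch_matmul_grid(A, B, tile_m=2, tile_n=2):
    """CPU reference: A is [batch][M][K], B is [batch][K][N] (nested lists)."""
    batch, M, K = len(A), len(A[0]), len(A[0][0])
    N = len(B[0][0])
    C = [[[0.0] * N for _ in range(M)] for _ in range(batch)]

    # Hypothetical 3D grid: one "block" per (batch index, row tile, column tile).
    grid = (batch, math.ceil(M / tile_m), math.ceil(N / tile_n))

    for b, bm, bn in itertools.product(*(range(g) for g in grid)):
        # Each block owns one output tile; min() clamps edge tiles
        # when M or N is not a multiple of the tile size.
        for i in range(bm * tile_m, min((bm + 1) * tile_m, M)):
            for j in range(bn * tile_n, min((bn + 1) * tile_n, N)):
                C[b][i][j] = sum(A[b][i][k] * B[b][k][j] for k in range(K))
    return C
```

Because every grid index writes a disjoint output tile, the iterations are independent of one another, which is the property that lets a GPU run them as parallel blocks.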

Fast Fourier Transform (FFT) (FFT.py)

Purpose: Implements a batched 1D FFT using a multi-dimensional factorization approach.

Key Concepts: Tensor factorization, complex arithmetic, pre-computed rotation (W) and twiddle (T) factors.

Dependencies: torch, math
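
A minimal sketch of the factorization idea, assuming a two-factor split N = n1 * n2 (the sample may use a different number of factors). It is plain Python rather than cuTile; the names `dft_matrix`, `fft_factored`, and `dft_naive` are hypothetical, and mapping W to the small DFT matrices and T to the twiddle table is an interpretation of the terms above, not taken from FFT.py.

```python
import cmath


def dft_matrix(n):
    """Pre-computed rotation factors: W[r][c] = exp(-2*pi*i*r*c/n)."""
    return [[cmath.exp(-2j * cmath.pi * r * c / n) for c in range(n)]
            for r in range(n)]


def fft_factored(x, n1, n2):
    """Length n1*n2 DFT computed as two small DFTs joined by twiddle factors."""
    n = n1 * n2
    assert len(x) == n
    W1, W2 = dft_matrix(n1), dft_matrix(n2)
    # Twiddle factors T[k1][j2] = exp(-2*pi*i*k1*j2/n).
    T = [[cmath.exp(-2j * cmath.pi * k1 * j2 / n) for j2 in range(n2)]
         for k1 in range(n1)]

    # Steps 1 and 2: view x as an n1 x n2 array with element (j1, j2) at
    # x[n2*j1 + j2], take a length-n1 DFT along j1, then scale by T.
    Y = [[T[k1][j2] * sum(W1[k1][j1] * x[n2 * j1 + j2] for j1 in range(n1))
          for j2 in range(n2)]
         for k1 in range(n1)]

    # Step 3: length-n2 DFT along j2; output index is k1 + n1*k2.
    X = [0j] * n
    for k1 in range(n1):
        for k2 in range(n2):
            X[k1 + n1 * k2] = sum(Y[k1][j2] * W2[j2][k2] for j2 in range(n2))
    return X


def dft_naive(x):
    """Direct O(n^2) DFT, used only to check the factored version."""
    n = len(x)
    return [sum(x[j] * cmath.exp(-2j * cmath.pi * j * k / n) for j in range(n))
            for k in range(n)]
```

A batched version would apply `fft_factored` to each row independently. Both small DFTs are matrix products, which is why this formulation suits tile-based matrix hardware.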

Matrix Multiplication (MatMul.py)

Purpose: Implements standard (non-batched) matrix multiplication (C = A * B) for 2D matrices.

Key Concepts: Tiled processing, efficient inner loop computation.

Dependencies: torch, math
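
The tiled inner loop can be illustrated with a CPU reference in plain Python. This is a sketch under assumptions, not the code in MatMul.py: the name `matmul_tiled` and the tile sizes are hypothetical, and a dictionary plays the role of the per-tile accumulator.

```python
def matmul_tiled(A, B, tm=2, tn=2, tk=2):
    """CPU reference: A is [M][K], B is [K][N] (nested lists); returns [M][N]."""
    M, K, N = len(A), len(A[0]), len(B[0])
    C = [[0.0] * N for _ in range(M)]

    for i0 in range(0, M, tm):          # one output tile per (i0, j0) pair
        for j0 in range(0, N, tn):
            rows = range(i0, min(i0 + tm, M))
            cols = range(j0, min(j0 + tn, N))
            # The accumulator tile stays "local" for the whole K loop.
            acc = {(i, j): 0.0 for i in rows for j in cols}

            for k0 in range(0, K, tk):  # inner loop: walk K in tile-sized steps
                for i in rows:
                    for j in cols:
                        for k in range(k0, min(k0 + tk, K)):
                            acc[(i, j)] += A[i][k] * B[k][j]

            # Write the finished tile back once, after the K loop completes.
            for (i, j), value in acc.items():
                C[i][j] = value
    return C
```

Keeping the accumulator local and writing each output tile once is the main design point: the A and B tiles are the only data re-read as the K loop advances.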

Matrix Transposition (Transpose.py)

Purpose: Demonstrates transposing a 2D matrix.

Key Concepts: Tiled processing, index swapping for transposition.

Dependencies: torch, math
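
Index swapping per tile can be sketched as below. This is a plain-Python CPU illustration with a hypothetical function name (`transpose_tiled`) and tile size, not the kernel in Transpose.py.

```python
def transpose_tiled(X, tile=2):
    """CPU reference: X is [M][N] (nested lists); returns its [N][M] transpose."""
    M, N = len(X), len(X[0])
    Y = [[None] * M for _ in range(N)]

    for i0 in range(0, M, tile):
        for j0 in range(0, N, tile):
            # The input tile at (i0, j0) lands at (j0, i0) in the output,
            # and each element inside it swaps its row and column index.
            for i in range(i0, min(i0 + tile, M)):
                for j in range(j0, min(j0 + tile, N)):
                    Y[j][i] = X[i][j]
    return Y
```

Working tile by tile keeps both the reads and the writes confined to small contiguous regions, which is the usual motivation for tiling a transpose on a GPU.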

Fused Multi-Head Attention (AttentionFMHA.py)

Purpose: Demonstrates a fused multi-head attention operation, common in transformer models.

Key Concepts: Causal and non-causal attention.

Dependencies: torch, math, numpy
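
The difference between the causal and non-causal modes can be shown with a single-head CPU reference in plain Python. This is a sketch, not the fused kernel: the name `attention` and its `causal` flag are hypothetical, and a real multi-head version would repeat this per head and per batch element.

```python
import math


def attention(Q, K, V, causal=False):
    """CPU reference: softmax(Q K^T / sqrt(d)) V for one head (nested lists)."""
    d = len(Q[0])
    scale = 1.0 / math.sqrt(d)
    out = []
    for i, q in enumerate(Q):
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in K]
        if causal:
            # Query i may only attend to keys at positions j <= i.
            scores = [s if j <= i else -math.inf for j, s in enumerate(scores)]
        # Subtract the row maximum before exponentiating for numerical stability.
        m = max(scores)
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append([sum(w * v[c] for w, v in zip(weights, V)) / total
                    for c in range(len(V[0]))])
    return out
```

With the causal mask, the first query sees only the first key, so its output equals the first row of V; the last query sees every key, so its output matches the non-causal result.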

