|
www.thanassis.space | ||
| | | | |
siboehm.com
|
|
| | | | | In this post, I'll iteratively optimize an implementation of matrix multiplication written in CUDA. My goal is not to build a cuBLAS replacement, but to deepl... | |
| | | | |
www.da.vidbuchanan.co.uk
|
|
| | | | | [AI summary] The blog post discusses an implementation of Conway's Game of Life in Python using optimized techniques such as SWAR (SIMD Within A Register) and parallel processing. The author uses Python's native integer operations for SIMD-like acceleration, with no need for explicit SIMD instructions. The post also uses SDL2 for rendering and achieves high frame rates by running in parallel across multiple processes. It compares the optimized approach with naive implementations and explores possible further optimizations using lower-level languages or GPU acceleration. | |
| | | | |
cprimozic.net
|
|
| | | | | A detailed summary of the techniques I used to optimize my Advent of Code 2024 solution for Day 9 Part 2. Employs a variety of techniques including algorithmic shortcuts, bespoke data structures, and low-level optimizations + SIMD. | |
| | | | |
www.gfxstrand.net
|
|
| | | [AI summary] The post discusses the complexities and challenges of descriptor sets in graphics APIs like Vulkan and D3D12, focusing on hardware differences and the trade-offs between various descriptor binding methods. | ||