You are here: www.modular.com

ashvardanian.com
This blogpost is a mirror of the original post on Modular.com. Modern CPUs have an incredible superpower: super-scalar operations, made available through single instruction, multiple data (SIMD) parallel processing. Instead of doing one operation at a time, a single core can do up to 4, 8, 16, or even 32 operations in parallel. In a way, a modern CPU is like a mini GPU, able to perform a lot of simultaneous calculations. Yet, because it's so tricky to write parallel operations, almost all that potential remains untapped, resulting in code that only does one operation at a time.

mcyoung.xyz
[AI summary] The text provides an in-depth exploration of SIMD (Single Instruction, Multiple Data) programming, focusing on its application in optimizing algorithms like base64 decoding. It outlines the challenges of writing portable SIMD code across different architectures, the role of compilers and instruction sets, and the importance of avoiding branches in performance-critical code. The article transitions into a practical example of implementing a SIMD version of the base64 decoding algorithm, emphasizing the use of shuffles and data reordering to efficiently process data in parallel. It also touches on the trade-offs between using intrinsics, portable SIMD libraries, and compiler optimizations, while highlighting the complexities of cross-platform deve...

www.thanassis.space
Optimizing code for the European Space Agency

winterbe.com
Learn Java SE 8 by example: Lambda Expressions, Default Interface Methods, Method References, Streams, Date API, Annotations and more