Understanding Memory Latency
The Memory Wall
Processor speeds have increased exponentially over the last few decades, but memory speeds have not kept pace. This discrepancy is often referred to as the “Memory Wall”.
When a CPU needs data, it first checks its caches (L1, L2, L3). If the data isn’t there, it has to fetch it from main memory (RAM). This fetch is orders of magnitude slower than a register access.
L1 vs L2 vs RAM
- L1 Cache: ~0.5 ns
- L2 Cache: ~7 ns
- Main Memory: ~100 ns
As you can see, missing the cache is expensive.
Data Locality
To mitigate this, we need to write “cache-friendly” code. This means organizing data so that it is accessed sequentially (spatial locality) and reusing data once it’s loaded (temporal locality).
By understanding the hardware, we can write software that runs significantly faster.