
Apologies if I'm being too academic here but not all memory bottlenecks or communication bottlenecks are the "Von Neumann Bottleneck".

The term originally referred to both data and program memory sitting on the other side of a shared bus from the CPU, which meant you could only access one at a time. If you were fetching your next instruction, you waited. If you then needed data, you waited again. It wasn't a problem back with slowly executing EDVAC code [0]. By this definition, most architectures today do not have the Von Neumann Bottleneck, because they are not Von Neumann architectures [1, 2, 3].
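To make the shared-bus serialization concrete, here's a toy cycle-count model (my own sketch, not any real ISA or timing data): with one bus, instruction fetch and data access take turns; with split buses, they can overlap.

```python
# Toy cycle-count model of bus contention. All numbers are illustrative.
MEM_LATENCY = 1  # cycles per bus transaction (assumed)

def run(instructions, shared_bus=True):
    """Count bus cycles for a list of instructions.

    Each entry is a bool: does this instruction also need a data access?
    """
    cycles = 0
    for needs_data in instructions:
        if shared_bus:
            # Von Neumann: one bus, so fetch and data access serialize.
            cycles += MEM_LATENCY          # fetch the instruction
            if needs_data:
                cycles += MEM_LATENCY      # then wait again for the data
        else:
            # Harvard-style split buses: the two accesses can overlap,
            # so each instruction costs max(fetch, data) = one transaction.
            cycles += MEM_LATENCY
    return cycles

program = [True, False, True, True]  # which instructions touch memory
print(run(program, shared_bus=True))   # 7 cycles
print(run(program, shared_bus=False))  # 4 cycles
```

The gap grows with the fraction of instructions that touch memory, which is the "you wait" in the comment above.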

A slightly looser definition of the Von Neumann Bottleneck refers to the separation between CPU and memory connected by a single bus. This usage likely arose because fully Von Neumann architectures are so rare, yet the general problem is similar enough to share the name. GPUs don't have this issue because they employ parallelism through multiple memory ports talking to off-chip RAM. TrueNorth also doesn't have this issue because it has 4096 parallel cores, each with its own localized memory and no off-chip memory. There could certainly be other bottlenecks in the system, even within the memory system, but those wouldn't be the Von Neumann Bottleneck [0].
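The parallelism point can be sketched the same way (again a toy model with made-up numbers, not GPU or TrueNorth specs): if requests that would queue behind a single bus are instead spread across independent memory ports, total transfer time divides by the port count.

```python
import math

# Toy model: n_requests memory transactions, each taking one time unit,
# spread across `ports` independent memory channels. Illustrative only.
def transfer_time(n_requests, ports):
    return math.ceil(n_requests / ports)

print(transfer_time(4096, 1))    # 4096 units: everything queues on one bus
print(transfer_time(4096, 16))   # 256 units: e.g. many parallel channels
```

Per-core local memory (the TrueNorth case) is the limit of this: effectively one port per core, so the shared-bus queue disappears entirely.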

[0] https://en.wikipedia.org/wiki/Von_Neumann_architecture#Von_N...

[1] https://news.ycombinator.com/item?id=2645652

[2] http://ithare.com/modified-harvard-architecture-clarifying-c...

[3] http://cs.stackexchange.com/questions/24599/which-architectu...



Yeah, this is true, but the point I wanted to make is that physics doesn't care what it's called, just that you're moving data incredibly long distances through massive decoders, and that is, in essence, what costs the huge amount of energy. By contrast, "pouring memory on die" solves this problem almost completely, but your compute (which was your major problem anyway) is still your biggest issue, and it's gotten worse!

By "pour memory on die" I simply mean that the memory is on die. Clearly there are some special techniques being used to manage that memory, but physically, this is what's saving power.


Here's at least a start: http://www.slideshare.net/embeddedvision/tradeoffs-in-implem...

As you can see, roughly 10-1000X more (the scale is logarithmic) is spent on compute than on data movement, and that's with DDR, not even HBM2, let alone on-chip memory!

