Comments by "Vitaly L" (@vitalyl1327) on "Low Level" channel.
@SillySussySally this is exactly what I said: GPUs issue instructions from other threads (warps, in NVIDIA parlance), while OoO CPUs issue instructions from the same thread that they know do not depend on anything that is currently stalled.
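To make the warp-switching side of this concrete, here is a toy model in Python. It is only a sketch under made-up assumptions, not a description of any real GPU: one issue slot per cycle, every warp runs a hypothetical load/use pair in a loop, the "use" must wait `mem_latency` cycles for its load, and the scheduler simply picks the lowest-numbered ready warp. The names `run`, `num_warps`, `iterations` and `mem_latency` are invented for the illustration.

```python
def run(num_warps, iterations, mem_latency):
    """Toy single-issue core that switches between warps to hide load latency.

    Returns (total_cycles, instructions_issued). All parameters are
    hypothetical; no real hardware is being modelled.
    """
    # ready_at[w] is the first cycle at which warp w may issue again.
    ready_at = [0] * num_warps
    # Each iteration is a load followed by a dependent use: 2 instructions.
    remaining = [2 * iterations] * num_warps
    cycle = 0
    issued = 0
    while any(remaining):
        for w in range(num_warps):
            if remaining[w] and ready_at[w] <= cycle:
                # An even count of remaining instructions means a load is next.
                is_load = remaining[w] % 2 == 0
                # A load blocks this warp for mem_latency cycles; a use for 1.
                ready_at[w] = cycle + (mem_latency if is_load else 1)
                remaining[w] -= 1
                issued += 1
                break  # only one instruction may issue per cycle
        cycle += 1  # if no warp was ready, this cycle was an idle slot
    return cycle, issued


if __name__ == "__main__":
    for warps in (1, 4):
        cycles, instrs = run(warps, iterations=2, mem_latency=4)
        print(warps, "warp(s):", instrs, "instructions in", cycles, "cycles")
```

With a single warp most issue slots are spent waiting on the load; with several warps the same stalls are filled with instructions from other warps. An OoO CPU attacks the same problem differently, by looking for independent instructions inside one thread, which this sketch does not model.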
So, yes, OoO CPUs have higher latency. Simpler CPUs (such as the ones you'll find in microcontrollers) have much lower latency and, more importantly, predictable latency. GPUs also have lower latency than OoO CPUs (in terms of cycle count, not time - they normally run at lower clock frequencies) simply by virtue of being much simpler cores with shorter and simpler pipelines.
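The cycle-count versus time distinction is just a division by clock frequency. A quick sketch, with numbers that are entirely made up for illustration (they are not measurements of any real core) and a hypothetical helper `cycles_to_ns`:

```python
def cycles_to_ns(cycles, freq_ghz):
    """Convert a latency in clock cycles to nanoseconds.

    One cycle at f GHz lasts 1/f ns, so the latency is cycles / f.
    """
    return cycles / freq_ghz


# Hypothetical figures, chosen only to show the effect:
simple_core_ns = cycles_to_ns(10, 1.0)  # fewer cycles, slower clock
ooo_core_ns = cycles_to_ns(15, 3.0)     # more cycles, faster clock

if __name__ == "__main__":
    print("simple core:", simple_core_ns, "ns")
    print("OoO core:   ", ooo_core_ns, "ns")
```

So a core can win on cycle count and still lose on wall-clock time if it is clocked low enough.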
Keep in mind that the exact NVIDIA microarchitecture is not public knowledge, so we can only make assumptions here. There are other GPU designs that are far more open and well documented, though, so we can extrapolate from them. I personally worked on two mobile GPU cores, ARM Mali 6xx and Broadcom VC5, which are wildly different from each other. Latencies in both (in clock cycles) were still lower than in high-performance Intel cores and high-end ARM cores (but higher than in the in-order ARM cores).