Comments by "Valen Ron" (@valenrn8657) on "The Evolution Of CPU Processing Power Part 2: Rise Of The x86" video.

@nextlifeonearth Having a variable-length instruction set disqualifies ARM and RISC-V from being pure RISC.
1
@nextlifeonearth https://www.youtube.com/watch?v=Ii_pEXKKYUg That lecture omitted ARM's T32/Thumb.
1
@nextlifeonearth Against Chris Celio's argument: it compares compiled C code, not assembly hand-coded to the same skill level on each platform. So your comparisons do not hold any water, because the compiler's optimizations for the target will vary wildly from arch to arch.
1
@nextlifeonearth X86 ASM is well known when compared to RV64G. Using the same linked source: https://youtu.be/Ii_pEXKKYUg?t=821
Instruction count:
RV64G executes 16% more instructions than X86-64.
RV64G executes 4% more instructions than ARMv7.
Dynamic bytes:
RV64G fetches 23% more instruction bytes than X86-64.
RV64GC fetches 28% fewer instruction bytes than ARMv7.
RV64GC fetches 8% fewer instruction bytes than X86-64.
The uops argument is nearly useless, since that translation occurs within the CPU.
1
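The 16% and 23% figures in the comment above can be combined into one more number. A minimal sketch, assuming every RV64G instruction is 4 bytes (true for the base ISA without the C extension) and taking the percentages from the linked talk at face value. The function name `implied_x86_avg_bytes` is made up for illustration.

```cpp
// Hypothetical helper: average x86-64 instruction length implied by the
// quoted ratios. RV64G (no C extension) uses fixed 4-byte instructions.
double implied_x86_avg_bytes(double rv_instr_ratio, double rv_byte_ratio) {
    const double rv64g_bytes_per_instr = 4.0;
    // bytes_rv = 4 * N_rv = 4 * rv_instr_ratio * N_x86
    // bytes_rv = rv_byte_ratio * bytes_x86
    // => bytes_x86 / N_x86 = 4 * rv_instr_ratio / rv_byte_ratio
    return rv64g_bytes_per_instr * rv_instr_ratio / rv_byte_ratio;
}
```

Calling `implied_x86_avg_bytes(1.16, 1.23)` gives roughly 3.77, so under these assumptions the average dynamically executed X86-64 instruction in that benchmark set would be a little under 4 bytes long.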
@nextlifeonearth (D) concerning memory as the direct target of ALU instructions is the only aspect that's central to what distinguishes the x86 pseudo-RISC kernel from a true RISC kernel. If some person complaining about x86 doesn't mention this, his or her argument is half-baked. Here, again, Chris was not as forthcoming as he ought to have been. He actually comments on write ports to the register file as a pertinent modern design issue with complex trade-offs. x86 requires fewer write ports through its unique capacity to exploit the dcache as part of a (hugely) extended register file. The rmw instruction family in x86 is a bit like zero page in 6502/6809, as both of these allow memory to substitute for registers you don't have at far less cost than you would otherwise experience. The rmw instructions form a computed address on the fly—without committing this to a named register—and then operate on the memory location (both a read and a write), also without committing this to a named register file. This is why the register colouring algorithm for the original x86 ever survived to live another day, despite the gross inadequacy of the named register file. What does end up a bit stressed out in silicon is what the Pentium Pro used to call the MOB: memory order buffer. A lot more addresses need to be checked for ordering requirements (mostly use of overlapping memory addresses in close succession). I once read a discussion by a core member of the Athlon design team who said that this was almost a blessing in disguise. In a pure RISC design, you have to perform virtual address translation twice: once on read, again on write. In implicitly fused rmw on x86, you only need to perform virtual address translation once. And so the final score: a busier (and hotter) MOB, but a less busy (and less hot) TLB.
1
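To make the rmw point in the comment above concrete, here is a small sketch. The function `add_to_memory` is a hypothetical example, and the assembly in the comments is what optimizing compilers typically emit for it; that is an assumption, and exact output varies by compiler and flags.

```cpp
// Adds v to the integer stored at *p. The source has a single statement;
// how many instructions it becomes depends on the target ISA.
void add_to_memory(int* p, int v) {
    *p += v;
    // x86-64, typical optimized output (assumed, not checked on every compiler):
    //   add dword ptr [rdi], esi    ; one rmw instruction, one address computed
    // RV64, typical optimized output (assumed):
    //   lw   a5, 0(a0)              ; load into a named register
    //   addw a5, a5, a1             ; add in registers
    //   sw   a5, 0(a0)              ; store back to the same address
}
```

In the x86-64 form no named register ever holds the loaded value, which is the property the comment describes. The load/store form names the same address in two separate memory instructions.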
@nextlifeonearth This is bullshit. I'm a C++ (with inline x86 ASM) software engineer on Windows.
1
@nextlifeonearth Instruction set is only a part of the answer when microarchitecture's implementation can influence performance.
1
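One common way to express the point in the comment above is the textbook "iron law" of processor performance: execution time is instruction count times cycles per instruction, divided by clock rate. A minimal sketch; the function name and every number in the usage note are hypothetical and not measurements of any real CPU.

```cpp
// Iron law: time = instructions * cycles-per-instruction / clock rate.
// The ISA mostly influences the first factor; the microarchitecture
// mostly influences the other two.
double exec_seconds(double instructions, double cpi, double clock_hz) {
    return instructions * cpi / clock_hz;
}
```

With made-up inputs, `exec_seconds(1.16e9, 0.8, 3e9)` is smaller than `exec_seconds(1.0e9, 1.0, 3e9)`: a design that executes 16% more instructions still finishes sooner if its implementation averages 0.8 cycles per instruction instead of 1.0 at the same clock.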
@nextlifeonearth How about this: offer a RISC-V solution that competes at the Xbox Series S level.
1
@nextlifeonearth According to GDC 2014, Jaguar beats CELL in IPC. AMD created a CELL-like solution from its GPU CU IP for Sony's PS5 DSP audio solution while AMD was designing Zen 2 and RDNA 2. IBM couldn't match AMD's concurrent CPU, GPU and DSP development. AMD also designed the K12 ARMv8 CPU as a clone from its Zen R&D. The PS5 has an extra CU (for DSP workloads) in addition to its 36 CU RDNA 2 GPU. Xbox Series S and X also have a similar CU DSP solution from AMD.
1