x86-64 Is Weird and I Love It

An Appreciation Post

Look, I know ARM is elegant. I know RISC-V is clean. But there’s something deeply satisfying about x86-64’s chaotic energy.

Variable-Length Instructions

In AArch64, every instruction is 4 bytes. Predictable. Boring.

In x86-64, instructions range from 1 to 15 bytes. You get prefix bytes, REX bytes, opcode bytes, ModR/M bytes, SIB bytes, displacement bytes, and immediate bytes. It’s like a choose-your-own-adventure book where every path leads to a different encoding.

; These are all valid x86-64 instructions:
nop                     ; 1 byte:  90
mov rax, rbx            ; 3 bytes: 48 89 D8
lock cmpxchg16b [rsi]   ; 5 bytes: F0 48 0F C7 0E
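
To see how those pieces fit together, here is a minimal Python sketch that pulls apart the `mov rax, rbx` bytes from the listing above. The helper names (`split_rex`, `split_modrm`, `decode_mov_rr`) are made up for this post, and it only handles the register-to-register form of opcode 89, so treat it as an illustration rather than a real decoder.

```python
# Hypothetical helpers for illustration; only register-to-register `mov` (opcode 89) is handled.
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def split_rex(byte):
    """Return the W, R, X, B bits of a REX prefix (0x40-0x4F), or None."""
    if byte & 0xF0 != 0x40:
        return None
    return {"W": (byte >> 3) & 1, "R": (byte >> 2) & 1,
            "X": (byte >> 1) & 1, "B": byte & 1}

def split_modrm(byte):
    """Return the mod (2 bits), reg (3 bits), and rm (3 bits) fields."""
    return {"mod": byte >> 6, "reg": (byte >> 3) & 7, "rm": byte & 7}

def decode_mov_rr(code):
    """Decode REX.W + 89 /r with mod == 3, i.e. a 64-bit register-to-register mov."""
    if len(code) != 3:
        raise ValueError("expected exactly 3 bytes: REX, opcode, ModR/M")
    rex = split_rex(code[0])
    modrm = split_modrm(code[2])
    if rex is None or not rex["W"] or code[1] != 0x89 or modrm["mod"] != 3:
        raise ValueError("not a 64-bit register-to-register mov (opcode 89)")
    src = REG64[modrm["reg"] | (rex["R"] << 3)]  # REX.R extends the reg field
    dst = REG64[modrm["rm"] | (rex["B"] << 3)]   # REX.B extends the rm field
    return f"mov {dst}, {src}"

print(decode_mov_rr(bytes.fromhex("4889D8")))  # mov rax, rbx
```

Three bytes, and the registers are spread across two of them: the low three bits of each register number live in ModR/M, and the fourth bit lives in the REX prefix.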

The NOP Sled of Dreams

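x86 has multiple ways to encode a NOP. The classic 0x90 started out as plain xchg eax, eax (64-bit mode special-cases it as a true NOP, so it doesn’t zero the upper half of rax). But modern processors also recognize multi-byte NOPs: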

; All of these are NOPs
90                    ; 1-byte NOP
66 90                 ; 2-byte NOP
0F 1F 00              ; 3-byte NOP
0F 1F 40 00           ; 4-byte NOP

Intel literally defined a family of increasingly long instructions that do nothing, just so compilers can pad code out to an alignment boundary with one or two instructions instead of a long run of 0x90s. Beautiful.
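
Here is roughly how an assembler might use that family when padding to an alignment boundary. This is a sketch, not how any particular toolchain does it: `nop_padding` is a hypothetical name, the greedy "longest NOP first" strategy is my assumption, and the 5- to 9-byte entries are not from the listing above but are the longer forms from Intel's recommended multi-byte NOP table as I remember them.

```python
# Multi-byte NOP encodings keyed by length in bytes.
# Entries 1-4 match the listing above; 5-9 are assumed from Intel's recommended table.
NOPS = {
    1: "90",
    2: "66 90",
    3: "0F 1F 00",
    4: "0F 1F 40 00",
    5: "0F 1F 44 00 00",
    6: "66 0F 1F 44 00 00",
    7: "0F 1F 80 00 00 00 00",
    8: "0F 1F 84 00 00 00 00 00",
    9: "66 0F 1F 84 00 00 00 00 00",
}

def nop_padding(address, alignment=16):
    """Return NOP bytes that advance `address` to the next multiple of `alignment`."""
    gap = -address % alignment          # bytes needed to reach the boundary
    out = bytearray()
    while gap:
        size = min(gap, max(NOPS))      # greedily take the longest NOP that fits
        out += bytes.fromhex(NOPS[size])
        gap -= size
    return bytes(out)

# 13 bytes of padding becomes one 9-byte NOP plus one 4-byte NOP.
print(nop_padding(0x1003).hex(" "))  # 66 0f 1f 84 00 00 00 00 00 0f 1f 40 00
```

Two instructions instead of thirteen 0x90s, and the CPU does exactly as much nothing either way.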

Why This Matters for ML

This variability is exactly what makes binary analysis hard, and what makes it interesting for ML. A model that understands x86-64 encoding has to learn:

How prefixes modify instruction semantics
How the same opcode means different things in different contexts
The implicit relationships between register choices and encoding lengths

It’s not just pattern matching. It’s learning an entire encoding scheme that evolved over 40 years of backwards compatibility.
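
The last item in that list, register choice quietly changing the encoding length, is easy to show. The sketch below encodes `mov eax, [base]` for any 64-bit base register. The function name `encode_mov_eax_from` is made up, and it covers only this one instruction shape, but the special cases are the real ones: r8 to r15 need a REX prefix, rsp and r12 force a SIB byte, and rbp and r13 force a displacement byte.

```python
REG64 = ["rax", "rcx", "rdx", "rbx", "rsp", "rbp", "rsi", "rdi",
         "r8", "r9", "r10", "r11", "r12", "r13", "r14", "r15"]

def encode_mov_eax_from(base):
    """Encode `mov eax, [base]` (opcode 8B /r) for a 64-bit base register."""
    n = REG64.index(base)
    low3 = n & 7
    out = bytearray()
    if n >= 8:
        out.append(0x41)            # REX.B supplies the fourth bit for r8-r15
    out.append(0x8B)                # mov r32, r/m32
    if low3 == 4:                   # rsp/r12: rm=100 means "a SIB byte follows"
        out += bytes([0x04, 0x24])  # ModR/M (mod=00, rm=100), SIB (no index, base=100)
    elif low3 == 5:                 # rbp/r13: mod=00 with rm=101 means RIP-relative,
        out += bytes([0x45, 0x00])  # so fall back to mod=01 with a zero 8-bit displacement
    else:
        out.append(low3)            # ModR/M: mod=00, reg=000 (eax), rm=low3
    return bytes(out)

for base in ("rax", "rsp", "rbp", "r8", "r12", "r13"):
    code = encode_mov_eax_from(base)
    print(f"mov eax, [{base}]".ljust(18), len(code), "bytes:", code.hex(" "))
```

Same mnemonic, same operand shape, and the length is 2, 3, or 4 bytes depending on nothing but which register you picked. That is the kind of implicit structure a model has to absorb from raw bytes.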

The Takeaway

RISC architectures are designed by engineers. x86 was designed by history. And history, like machine learning, is messy, unpredictable, and occasionally brilliant.