I've decoded the entry point PLA of the 80286 (not the actual microcode though). It also has separate entries for real and protected mode, but only for segment loads from a general purpose register, HLT, and for those opcodes that aren't allowed in real mode like ARPL.
Loading a segment register from memory on the 286 uses the same microcode in both modes, as does everything else that would certainly have to act differently, like jump/call far. That was a bit surprising, since it would have to decide at run time which mode it's in. Is this the same on the 386?
Tested on my 286 machine what happens when opcodes are decoded while in real mode but executed after PE is set: Segment load from memory works (using protected mode semantics), whereas the load from register only changes the visible selector and nothing else. The base in the descriptor cache keeps whatever was set there before -- I assume on the 386, SBRM would update the base the same way it does in real mode in that situation, because it's also used for V86 mode there. Illegal-in-real-mode instructions trap, but do so correctly using the protected mode IDT.
Also seems like executing three pre-decoded instructions without a jump after setting PE causes a triple fault for some reason.
The decode vs. execution behavior is more interesting. From both Intel docs and my own core, PE is effectively checked in both stages independently, but decode happens ahead of execution (prefetch queue). So if an instruction is decoded in real mode, it’ll still follow the real-mode path even if PE is set before it executes.
That’s exactly why Intel requires a jump right after setting PE — it flushes the prefetch queue and forces re-decode in protected mode. As the 80386 System Software Writer’s Guide (Ch. 6.1) puts it: "Instructions in the queue were fetched and decoded while the processor was in real mode; executing them after switching to protected mode can be erroneous."
It's been a while, but I recall Intel documenting that a jump was required almost immediately after setting PE. Probably because documenting "you must jump soon" was easy, whereas handling the complexities of decoded-real/executed-PE, and documenting how that worked, would have been a giant PITA.
The two-instruction grace period was to let you load a couple of segment or descriptor table registers or something, which were kinda needed for the jump. And that triple fault, if you failed to jump in time, sounds right in line with Intel's "when in doubt, fault or halt" philosophy for the 286.
Some later documentation contradicted this, saying instead that this first jump had to be to the protected mode segment.
From the patent (US4442484), it is apparent that the processor decodes opcodes into a microcode entry point before they are executed, and the PE bit is one of the inputs for the entry point PLA. So that would be the obvious reason for flushing the prefetch queue - but it turns out that at least on the 80286, most instructions go to the same entry point regardless of the mode they are decoded in. So they should work the same without flushing the queue.
And yet for some reason, what I've seen in my experiments is that the system would reset if there were three instructions following the "LMSW" without a jump. Even something harmless like "NOP" or "MOV AX,AX", that couldn't be different between real and protected mode. Maybe there is some clock phase where the PE bit changing during the decoding of an instruction leads to an invalid entry point, that either causes a triple fault or resets the processor?
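To make the decode-vs-execute point concrete, here is a rough toy model in Python of a PLA that samples PE at decode time, ahead of execution. Everything in it is an assumption for illustration: the entry labels are made up, the queue depth of three is arbitrary, and the set of mode-sensitive opcodes is just the one described earlier in the thread (segment load from register, HLT, ARPL). It is not derived from the actual microcode.

```python
# Toy model only. Entry labels are hypothetical, not real 80286 microcode addresses.
# The mode-sensitive set follows the PLA description earlier in the thread.
MODE_SENSITIVE = {"MOV sreg,reg", "HLT", "ARPL"}


def entry_point(mnemonic, pe):
    """Return the entry label the PLA would pick for this opcode and PE value."""
    if mnemonic in MODE_SENSITIVE:
        return f"{mnemonic} [{'protected' if pe else 'real'}]"
    return f"{mnemonic} [shared]"


def run(program, queue_depth=3):
    """Decode up to queue_depth instructions ahead of execution, execute in order.

    Assumptions: 'LMSW' sets PE when it executes, and 'JMP' (to the next
    instruction) throws away everything that was already decoded.
    """
    pe = False
    next_to_decode = 0
    decoded = []  # (program index, entry label) pairs awaiting execution
    trace = []
    while decoded or next_to_decode < len(program):
        # Decode stage: PE is sampled here, ahead of execution.
        while len(decoded) < queue_depth and next_to_decode < len(program):
            mnemonic = program[next_to_decode]
            decoded.append((next_to_decode, entry_point(mnemonic, pe)))
            next_to_decode += 1
        # Execute stage: run the oldest decoded entry.
        index, entry = decoded.pop(0)
        trace.append(entry)
        if program[index] == "LMSW":
            pe = True
        elif program[index] == "JMP":
            decoded.clear()
            next_to_decode = index + 1
    return trace


print(run(["LMSW", "MOV sreg,reg", "NOP"]))
print(run(["LMSW", "JMP", "MOV sreg,reg"]))
```

Without the jump, the register load still runs through its real-mode entry even though PE is already set, which fits the "only changes the visible selector" result from the test on real hardware. With the jump it gets re-decoded and takes the protected-mode entry. Note that NOP comes out the same either way, so a model like this does not account for the reset after three harmless instructions.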
seg000:FD56 Unreal_FFD56 proc near ; CODE XREF: CPU_MicrocodeUpdate+A↑j
seg000:FD56 ; VGA_BIOS_Shadow+20↑p ...
seg000:FD56 lgdt fword ptr cs:[bx]
seg000:FD5A mov eax, cr0
seg000:FD5D or al, 1
seg000:FD5F mov cr0, eax
seg000:FD62 jmp short $+2
seg000:FD64 ; ---------------------------------------------------------------------------
seg000:FD64
seg000:FD64 loc_FFD64: ; CODE XREF: Unreal_FFD56+C↑j
seg000:FD64 mov ax, 8
seg000:FD67 mov ds, ax
seg000:FD69 assume ds:nothing
seg000:FD69 mov es, ax
seg000:FD6B assume es:nothing
seg000:FD6B mov eax, cr0
seg000:FD6E and al, 0FEh
seg000:FD70 mov cr0, eax
seg000:FD73 jmp short $+2
seg000:FD75 ; ---------------------------------------------------------------------------
seg000:FD75
seg000:FD75 loc_FFD75: ; CODE XREF: Unreal_FFD56+1D↑j
seg000:FD75 xor ax, ax
seg000:FD77 mov ds, ax
seg000:FD79 assume ds:nothing
seg000:FD79 mov es, ax
seg000:FD7B assume es:nothing
seg000:FD7B retn
seg000:FD7B Unreal_FFD56 endp
Two short jumps, no far jumps in sight. Apparently works just fine on Pentium 4, Core 2s and Atoms.
How hard would it be for Mr Github to add rss/atom feeds, I wonder?