VM whose bytecode is a strict subset of x86-64, so you can just run the code directly without recompilation if you have access to JIT memory and are on an x86-64 machine, and otherwise you emulate the strict subset.

and you could still recompile if it's profitable. just make sure it's a very strict subset.

i think the minimal subset is CALL (E8 xx xx xx xx) and RET (C3) and having a mechanism for loading primitives (& other imports), like a dynamic linker. but adding some arithmetic instructions is probably a good idea.
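For the emulated fallback, a rough sketch of an interpreter for just CALL and RET could look like this. The `imports` dict (target address to Python callable) is a made-up stand-in for the dynamic-linker mechanism, and `PRINT_ADDR`, `call` and `run` are hypothetical names, not part of any real tool.

```python
import struct

RET = b"\xC3"

def call(at, target):
    """Encode CALL rel32 placed at offset `at`, jumping to `target`."""
    # rel32 is relative to the end of the 5-byte CALL instruction.
    return b"\xE8" + struct.pack("<i", target - (at + 5))

def run(code, imports, entry=0, max_steps=10_000):
    """Interpret CALL (E8) and RET (C3) only.

    `imports` is an assumed mechanism: a CALL whose target is a key of this
    dict runs the Python callable and continues after the CALL.
    """
    pc, stack = entry, []
    for _ in range(max_steps):
        if not 0 <= pc < len(code):
            raise ValueError(f"pc {pc} is outside the code")
        op = code[pc]
        if op == 0xE8:
            if pc + 5 > len(code):
                raise ValueError("truncated CALL")
            (rel,) = struct.unpack_from("<i", code, pc + 1)
            ret = pc + 5
            target = ret + rel
            if target in imports:
                imports[target]()      # primitive runs, then "returns"
                pc = ret
            else:
                stack.append(ret)      # push return address
                pc = target
        elif op == 0xC3:
            if not stack:
                return                 # RET with an empty stack ends the run
            pc = stack.pop()
        else:
            raise ValueError(f"byte {op:#04x} at {pc} is not in the subset")
    raise RuntimeError("step limit exceeded")

PRINT_ADDR = 0x1000  # hypothetical import address, outside the code
log = []
program = (
    call(0, PRINT_ADDR)     # offset 0: call the primitive
    + call(5, 11)           # offset 5: call the subroutine at offset 11
    + RET                   # offset 10: end of main
    + call(11, PRINT_ADDR)  # offset 11: subroutine calls the primitive
    + RET                   # offset 16: return to offset 10
)
run(program, {PRINT_ADDR: lambda: log.append("hi")})
```

After the run, `log` holds two entries: one from the top-level CALL and one from the subroutine.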

One of the problems with this idea is that x86 has variable-length instructions. You might think that a program consisting of only CALL and RET could not do too much damage (other than stack overflow or infinite loops). But that's wrong, because a CALL could jump into the middle of another CALL instruction, and the CPU would then decode that CALL's displacement bytes as whatever instructions they happen to encode.
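To make that concrete, here is a small sketch with an 11-byte blob that decodes front to back as nothing but CALL, CALL, RET. The first CALL targets offset 6, which is inside the second CALL, and the bytes there are 0F 05 (SYSCALL). `decode_linear` and `misaligned_targets` are hypothetical helper names.

```python
import struct

def decode_linear(code):
    """Decode front to back, accepting only CALL rel32 (E8) and RET (C3)."""
    insns, off = [], 0
    while off < len(code):
        if code[off] == 0xE8 and off + 5 <= len(code):
            (rel,) = struct.unpack_from("<i", code, off + 1)
            insns.append((off, "CALL", off + 5 + rel))
            off += 5
        elif code[off] == 0xC3:
            insns.append((off, "RET", None))
            off += 1
        else:
            raise ValueError(f"unexpected byte {code[off]:#04x} at {off}")
    return insns

def misaligned_targets(code):
    """CALL targets that are not the start of a linearly decoded instruction."""
    insns = decode_linear(code)
    starts = {off for off, _, _ in insns}
    return [t for _, op, t in insns if op == "CALL" and t not in starts]

# offset 0:  E8 01 00 00 00   CALL to offset 6 (5 + 1)
# offset 5:  E8 0F 05 00 00   CALL whose displacement starts with 0F 05
# offset 10: C3               RET
blob = bytes.fromhex("e801000000e80f050000c3")

print([op for _, op, _ in decode_linear(blob)])  # ['CALL', 'CALL', 'RET']
print(misaligned_targets(blob))                  # [6, 1305]
print(blob[6:8].hex())                           # 0f05
```

The second target (1305) is also out of range, but the interesting one is 6: a check that only looks at opcodes in linear order never notices it.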

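So either you verify that every CALL target lands on an instruction boundary, or you require that every instruction is 8-byte aligned, by padding with NOP (90), and verify that every CALL target is 8-byte aligned. The two instructions could then be CALL (90 90 90 E8 xx xx xx xx) and RET (C3 90 90 90 90 90 90 90). Since the displacement is relative to the end of the CALL, which sits on a slot boundary, the alignment check amounts to xx xx xx xx being a multiple of 8 (and in range).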
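A minimal sketch of a verifier for the 8-byte-slot scheme, assuming every CALL target has to stay inside the same blob (imports would need an extra allowed range, which is left out here). `verify` and `encode_call` are hypothetical names.

```python
import struct

SLOT = 8
RET_SLOT = bytes([0xC3]) + bytes([0x90]) * 7   # C3 90 90 90 90 90 90 90
CALL_PREFIX = bytes([0x90, 0x90, 0x90, 0xE8])  # 90 90 90 E8, then rel32

def encode_call(slot_index, target_index):
    """Build the CALL slot at `slot_index` that jumps to slot `target_index`."""
    end = (slot_index + 1) * SLOT              # rel32 is relative to the slot end
    return CALL_PREFIX + struct.pack("<i", target_index * SLOT - end)

def verify(code):
    """True if `code` is only CALL/RET slots and every target is a slot start."""
    if len(code) % SLOT != 0:
        return False
    for off in range(0, len(code), SLOT):
        slot = code[off:off + SLOT]
        if slot == RET_SLOT:
            continue
        if slot[:4] != CALL_PREFIX:
            return False
        (rel,) = struct.unpack("<i", slot[4:])
        target = off + SLOT + rel
        # Assumption: targets must be inside this blob and slot-aligned.
        if target % SLOT != 0 or not 0 <= target < len(code):
            return False
    return True

good = encode_call(0, 2) + RET_SLOT + RET_SLOT
bad = CALL_PREFIX + struct.pack("<i", -3) + RET_SLOT  # target is offset 5

print(verify(good))  # True
print(verify(bad))   # False
```

Because every slot is checked independently, this is a single linear pass with no need to follow control flow.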

@typeswitch IIUC, x86 has multiple different NOP encodings of varying lengths? Using a single longer NOP might be preferable to take up fewer slots in the ROB. Idk if there are other tradeoffs involved.

@glaebhoerl Yeah, Intel recommends using different-length NOPs. Idk if it makes a difference but it's worth trying if anyone uses this. Though 90 is iconic & easy to spot on a hexdump.
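As a sketch of what that would look like: the slots below use the multi-byte NOP encodings from Intel's recommended table (0F 1F /0 forms) in place of runs of 90. Whether this is actually faster is untested here, and `call_slot` / `RET_SLOT_LONG` are hypothetical names. Note that the padding after RET never executes, so it only matters for the CALL slot.

```python
import struct

# Recommended multi-byte NOP encodings, indexed by length in bytes.
NOP = {
    1: bytes.fromhex("90"),
    2: bytes.fromhex("6690"),
    3: bytes.fromhex("0f1f00"),
    4: bytes.fromhex("0f1f4000"),
    5: bytes.fromhex("0f1f440000"),
    6: bytes.fromhex("660f1f440000"),
    7: bytes.fromhex("0f1f8000000000"),
}

SLOT = 8
RET_SLOT_LONG = b"\xC3" + NOP[7]  # C3 0F 1F 80 00 00 00 00

def call_slot(slot_index, target_index):
    """One 3-byte NOP, then CALL rel32 to the start of slot `target_index`."""
    end = (slot_index + 1) * SLOT
    return NOP[3] + b"\xE8" + struct.pack("<i", target_index * SLOT - end)

print(call_slot(0, 2).hex(" "))  # 0f 1f 00 e8 08 00 00 00
```

A verifier would then compare against these prefixes instead of 90 90 90, at the cost of the hexdump being a bit less recognizable.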
