Learning Ethereum Virtual Machine Opcodes With EVM Puzzles

February 24, 2022 by patrickd

Franco Victorio (opens in a new tab), one of the devs from Hardhat (opens in a new tab), created a collection of "EVM Puzzles (opens in a new tab)" that we'll be solving here. Consider attempting them yourself first before getting spoiled!

Puzzle #1 (opens in a new tab)

So first of all, how do these puzzles work? They all consist of the puzzleCode, which is a smart contract's binary code, that was annotated with human-readable opcode names as comments. These contracts are deployed and then we can send data (calldata) and value (wei) to them as part of a transaction. The goal of each puzzle is to create a transaction (by setting data and value) that will succeed (and doesn't revert).

const puzzleCode = [
  '36',   // CALLDATASIZE
  '56',   // JUMP
  'fdfd', // REVERT REVERT
  'fdfd', // REVERT REVERT
  'fdfd', // REVERT REVERT
  '5b',   // JUMPDEST
].join('')
 
// Enter your solution here
const solution = {
  data: "0x",
  value: 0
}

The evm.codes (opens in a new tab) website has a handy table of all existing opcodes and what they do. Of these opcodes we need to prevent the instruction pointer to ever reach a REVERT or INVALID. Instead, we need it to either reach the end of the code or reach a STOP operation.

When our solution is sent as a transaction to the contract's address, the EVM will always begin executing it with the instruction pointer on the first opcode. In the first puzzle that would be the CALLDATASIZE opcode. This pushes the length of our solution's data field onto the stack.

The JUMP instruction tells the EVM to have the instruction pointer "jump" to another location in the bytecode. It gets this location by popping off the first value from the stack. It's important to note that only offsets where the JUMPDEST opcode is located are valid jumping targets.

So knowing that we can now see that we have to influence the value that CALLDATA pushes onto the stack to ensure that JUMP makes the program continue at JUMPDEST reaching the valid end of bytecode. In this case, we can determine the offset of JUMPDEST simply by counting the number instructions, it's at 8 (start counting at 0) and that's also the number of bytes we have to send as data.

To check this solution, edit it like this:

const solution = {
  data: "0x0000000000000000",
  value: 0
}

And then execute the script:

ubuntu@evmpuzzles:~/evm-puzzles$ npx hardhat run puzzles/puzzle_1.js
Puzzle solved!

Puzzle #2 (opens in a new tab)

const puzzleCode = [
  '36',   // CALLDATASIZE
  '38',   // CODESIZE
  '03',   // SUB
  '56',   // JUMP
  'fdfd', // REVERT REVERT
  'fdfd', // REVERT REVERT
  '5b',   // JUMPDEST
].join('')

This puzzle introduces the CODESIZE opcode which simply puts the size of the current contract's bytecode on top of the stack. Once deployed this value is static, and in this case we can once again simply count the number of instructions which is 9, and once again the JUMPDEST location is at 8.

The SUB instruction subtracts the second value, from the stack, off the first one. In this case the stack would look like [CODESIZE, CALLDATASIZE] before the subtraction and [CODESIZE - CALLDATASIZE] after, because both values were popped off and the result was pushed on.

This means we have to solve the following: 9 - CALLDATASIZE = 8 - and to do that we simply send a single byte as data. This will leave the value 8 on top of the stack and just like last time the JUMP can do its job.

const solution = {
  data: "0x00",
  value: 0
}

Puzzle #3 (opens in a new tab)

const puzzleCode = [
  '6000', // PUSH1 00
  '35',   // CALLDATALOAD
  '56',   // JUMP
  'fdfd', // REVERT REVERT
  'fdfd', // REVERT REVERT
  'fdfd', // REVERT REVERT
  '5b',   // JUMPDEST
].join('')

The new PUSH1 instruction simply pushes a single byte on top of the stack, the byte that follows immediately after the instruction in the bytecode: 00. This zero value is then consumed by CALLDATALOAD as an offset of where to start reading the call data. It reads 32 bytes from our transaction data and pushes it onto the stack. If it finds less than 32 bytes to copy it will add zeros as padding to reach that length.

So whatever we send as calldata here will get pushed onto the stack and then consumed by JUMP. This means we simply have to send the correct offset for the jump destination as data, which is 10 in this case or 0x0a in hexadecimal. But since this would become 0x0a00000000000000000000000000000000000000000000000000000000000000 after padding, which is a really large number (opens in a new tab), we have to make sure to send the full 32 byte word with 0x0a being the least significant byte.

const solution = {
  data: "0x000000000000000000000000000000000000000000000000000000000000000a",
  value: 0
}

Puzzle #4 (opens in a new tab)

const puzzleCode = [
  '6000', // PUSH1 00
  '35',   // CALLDATALOAD
  '38',   // CODESIZE
  '18',   // XOR
  '56',   // JUMP
  'fdfd', // REVERT REVERT
  'fdfd', // REVERT REVERT
  'fdfd', // REVERT REVERT
  '5b',   // JUMPDEST
].join('')

The XOR opcode is similar to SUB, it takes two inputs from the stack, applies the exclusive-or operation on them, finally pushing the result back.

What's effectively happening here is the following: jumpTo(calldata ^ codesize). And the location we want to jump to is 12 (0x0c) while the codesize is 13 (0x0d). This means we have to solve calldata ^ 0x0d = 0x0c which we can by doing 0x0d ^ 0x0c = calldata = 0x01.

Sending a 32 byte word with a value of 1, will be XORed with 13, which results in 12, pointing to the correct JUMPDEST:

const solution = {
  data: "0x0000000000000000000000000000000000000000000000000000000000000001",
  value: 0
}

Puzzle #5 (opens in a new tab)

const puzzleCode = [
  '34',     // CALLVALUE
  '610004', // PUSH2 0004
  '01',     // ADD
  '56',     // JUMP
  'fdfd',   // REVERT REVERT
  'fdfd',   // REVERT REVERT
  '5b',     // JUMPDEST
].join('')

This time the puzzle uses the value, which is the amount of wei, sent with the transaction, instead of the transaction data itself. CALLVALUE simply pushes this amount of wei onto the stack. After that a PUSH2 operation pushes the two bytes following it onto the stack: 0004 or simply a 4.

You have probably already guessed that ADD stands for addition. In pseudocode this puzzle can be rewritten as jumpTo(value + 4). The JUMPDEST is at offset 10, which means we need to send 6 wei to make it jump there.

const solution = {
  data: "0x",
  value: 6
}

Puzzle #6 (opens in a new tab)

const puzzleCode = [
  '34',     // CALLVALUE
  '80',     // DUP1
  '02',     // MUL
  '6009',   // PUSH1 09
  '14',     // EQ
  '600d',   // PUSH1 0d
  '57',     // JUMPI
  'fdfd',   // REVERT REVERT
  'fdfd',   // REVERT REVERT
  '5b',     // JUMPDEST
].join('')

A whole bunch of new opcodes are introduced in this puzzle. Starting with the easiest one, MUL which is the opcode for multiplication and works just like SUB, ADD and XOR did. EQ is also quite similar, but instead of calculating with the first two stack items, it compares them and pushes a 1 if they're equal and otherwise a 0 back onto the stack.

Then there's DUP1 which simply duplicates the first stack item. For example if your stack looks like [1, 2, 3] it will be [1, 1, 2, 3] after DUP1 was executed.

Finally there's JUMPI which is a conditional JUMP. Additionally to the destination offset, it consumes another stack item and checks whether it equals 0. If it's anything other than 0 it will jump to the location specified by the first stack item.

The puzzle can be rewritten like this:

if (value * value == 9) {
  jumpTo(0x0d)
}

This time the correct jump destination is already hardcoded. Instead, we have to specify a value of wei to send which ensures that JUMPI does jump at all. And as you can already tell from the pseudocode, that value must be 3.

const solution = {
  data: "0x",
  value: 3
}

To easier understand what's happening, here's how the stack changes after each instruction:

0x00   CALLVALUE       [3]
0x01   DUP1            [3, 3]
0x02   MUL             [9]
0x03   PUSH1 09        [9, 9]
0x05   EQ              [1]
0x06   PUSH1 0d        [0x0d, 1]
0x08   JUMPI           []            ---|
0x09   REVERT                           |
0x0a   REVERT                           |
0x0b   REVERT                           |
0x0c   REVERT                           |
0x0d   JUMPDEST        []            <--|
       STOP

Puzzle #7 (opens in a new tab)

const puzzleCode = [
  '36',     // CALLDATASIZE
  '6003',   // PUSH1 03
  '10',     // LT
  '6009',   // PUSH1 09
  '57',     // JUMPI
  'fdfd',   // REVERT REVERT
  '5b',     // JUMPDEST
  '34',     // CALLVALUE
  '36',     // CALLDATASIZE
  '02',     // MUL
  '6008',   // PUSH1 08
  '14',     // EQ
  '6014',   // PUSH1 14
  '57',     // JUMPI
  'fd',     // REVERT
  '5b',     // JUMPDEST
].join('')

This puzzle is quite a bit more complex but it only introduces one new opcode: LT or lower-than, is similar to EQ in that it compares the first two items on stack and pushes 1 or 0 as a result:

  • [1, 2] - 1 < 2? - Yes (1)
  • [2, 1] - 2 < 1? - No (0)

The first part of the puzzle (until the first REVERTs) can be rewritten like this:

if (3 < CALLDATASIZE) {
  jumpTo(0x09)
}

The second part like this:

if (CALLDATASIZE * CALLVALUE == 8) {
  jumpTo(0x14)
}

Since the length of our calldata must be greater than 3, and when multiplied with the value it must result in 8, it must be that we have to send 4 bytes and 2 wei.

const solution = {
  data: "0x00000000",
  value: 2
}

Let's once again see how this transaction is actually being executed within the EVM:

0x00   CALLDATASIZE   [4]
0x01   PUSH1 03       [3, 4]
0x03   LT             [1]
0x04   PUSH1 09       [0x09, 1]
0x06   JUMPI          []            ---|
0x07   REVERT                          |
0x08   REVERT                          |
0x09   JUMPDEST       []            <--|
0x0a   CALLVALUE      [2]
0x0b   CALLDATASIZE   [4, 2]
0x0c   MUL            [8]
0x0d   PUSH1 08       [8, 8]
0x0f   EQ             [1]
0x10   PUSH1 14       [0x14, 1]
0x12   JUMPI          []            ---|
0x13   REVERT                          |
0x14   JUMPDEST       []            <--|
       STOP

Puzzle #8 (opens in a new tab)

const puzzleCode = [
  '38',     // CODESIZE
  '34',     // CALLVALUE
  '90',     // SWAP1
  '11',     // GT
  '6008',   // PUSH1 08
  '57',     // JUMPI
  'fd',     // REVERT
  '5b',     // JUMPDEST
  '36',     // CALLDATASIZE
  '610003', // PUSH2 03
  '90',     // SWAP1
  '06',     // MOD
  '15',     // ISZERO
  '34',     // CALLVALUE
  '600a',   // PUSH1 0a
  '01',     // ADD
  '57',     // JUMPI
  'fdfd',   // REVERT REVERT
  'fdfd',   // REVERT REVERT
  '5b',     // JUMPDEST
].join('')

It doesn't take much to guess that GT indeed stands for greater-than and works just like LT. And yet another comparing opcode is introduced with ISZERO, but this one only consumes one item of the stack's top and pushes a 1 or a 0 depending whether that value was a zero or not.

Next is the MOD operator, which allows calculation of the modulo and works just like the other arithmetic instructions.

The use of SWAP1 in this puzzle appears to have been added for creating confusion by switching the first two value of the stack around (eg. [1, 2] becomes [2, 1]).

With that in mind, puzzle can again be broken down into two parts:

if (CODESIZE > CALLVALUE) {
  jumpTo(0x08)
}
if (CALLDATASIZE % 3 == 0) {
  jumpTo(0x0a + CALLVALUE)
}

The size of the bytecode is a static value of 26, and the first condition is that the sent wei value is smaller than that. It's also required that CALLDATASIZE % 3 = 0, which means we can send any length of data as long as it's a multiple of 3 - let's just send 3 bytes!

The final JUMPDEST is at offset 0x19 and depends on the wei value sent: 0x0a + CALLVALUE = 0x19 (or simply 10 + 15 = 25). Sending 15 wei is also smaller than 26 and therefore passing the first condition as well.

const solution = {
  data: "0x000000",
  value: 15
}

Once more, step by step:

0x00   CODESIZE      [26]
0x01   CALLVALUE     [15, 26]
0x02   SWAP1         [26, 15]
0x03   GT            [1]
0x04   PUSH1 08      [0x08, 1]
0x06   JUMPI         []            ---|
0x07   REVERT                         |
0x08   JUMPDEST      []            <--|
0x09   CALLDATASIZE  [3]
0x0a   PUSH2 0003    [3, 3]
0x0d   SWAP1         [3, 3]
0x0e   MOD           [0]
0x0f   ISZERO        [1]
0x10   CALLVALUE     [15, 1]
0x11   PUSH1 0a      [0x0a, 15, 1]
0x13   ADD           [0x18, 1]
0x14   JUMPI         []            ---|
0x15   REVERT                         |
0x16   REVERT                         |
0x17   REVERT                         |
0x18   REVERT                         |
0x19   JUMPDEST      []            <--|
       STOP

That's it already! Although these puzzles were rather simple they were still good fun and I'm looking forward to solving more of them in the future.