In Feb. this year, I and some other Balsn members participated in Paradigm CTF, an Ethereum-focused security competition held by Paradigm. First of all, thanks to the organizers for preparing such high-quality and impressive challenges! Luckily, our team,
whoami, got fourth place with 11 out of 17 challenges solved.
List of members and solved challenges
When browsing the official repository, I noticed a clever trick used in a coding challenge,
REVER, and wondered whether I could improve the solution furthermore. Here, I will share the optimal solution I can develop and some tips for EVM bytecode golfing.
Writing contracts that can only be used in one direction is so last year.
- Type: Smart contract
- Solves: 5/67
- Keywords: Yul, EVM opcode, EVM bytecode, code golfing
- Source files
The goal of this challenge is to write a contract (i.e., a sequence of EVM bytecodes) that can identify palindromes. Even more complicated, our written contract should work as well when executed in the reverse direction (somehow inspired by the movie
Both original and reversed versions of our contract are deployed after passing checks on the length and the included opcodes. When we request a flag, the backend judges the two contracts with fixed and randomly generated test cases. Our contract should answer all test cases correctly to get a flag.
By a careful inspection, we may realize that this challenge is a real coding challenge since the possible ways to call external contracts are forbidden. Besides, our contract’s length is limited to a maximum of 100 bytes.
How can we make our contract valid when executed in the reverse direction? A simple way is to palindrome it (i.e.,
s' = s[:-1] + s[::-1]). As a result, only half of our contract is executed in both directions. From now on, we only need to focus only on the first half of our contract, whose length is at most 50 bytes long.
This code implements a straight-forward byte-by-byte comparison algorithm. Notice that after deploying the two contracts,
1 wei is transferred to both of them to let the return value of
selfbalance() become 1. As for
returndatasize(), they are always 0 afterwards.
Although the above implementation is good enough to solve this challenge, one may find space to improve by inspecting the EVM bytecode directly (e.g., redundant opcodes, stack elements re-arrangement). Let me show you the optimal solution I can figure out, which is 24 bytes long:
pc instruction stack (top -> bottom)
This piece of bytecode consists of 4 parts:
iat the beginning of each loop.
s[size-1-i]. If they are the same, continue. If not, jump to offset 0 to revert the transaction.
- Jump to the beginning of the loop if
i+1<size; otherwise, stop the execution.
For reference, below is the pseudocode with the same behavior as the bytecode.
Same as the official solution, we should transfer
1 wei to both of the contracts after deploying them. Now, let me explain the tips I used to construct this solution.
Tip #1: Replace constants with environmental and block variables
This trick is applied to optimize the use of constants 0 and 1 in the official solution. As shown above,
callvalue() replace the constant 0, while
selfbalance() replace the constant 1. By this method, the use of 0 and 1 reaches the optimal of 1 byte for each use.
The reason for choosing
selfbalance() to replace the constants is that the former is always 0 (before any function calls), and the latter is usually controllable by us. Remember that the balance scale is calculated in
wei. Other variables such as
number() might also be exploited but are less useful in general.
Tip #2: Push once, duplicate multiple times
For other frequently-used constants that cannot be presented by an environmental or block variable, push them onto the stack (e.g.,
PUSH2 0x9453) then duplicate them (e.g.,
DUP1) whenever using them to save (at least) 1 byte for every additional use.
Tip #3: Take advantage of special opcodes
Use powerful opcodes such as
BYTE to simplify arithmetic operations. For example, to get the first byte of
byte(0, x) instead of
shr(248, x) to save 1 byte. If there is no need to change the input data, consider pushing it onto the stack by
calldataload instead of
calldatacopy to the memory and then
mload to the stack.
Tip #4: Avoid POP by reusing the old copy of variables
We choose to increase
i by one (part 2) before indexing the input data (part 3) to avoid wasting the old copy. In normal cases, the compiler tends to pop out the old copy of a variable after updating it:
0010 DUP1 [i i size]
However, making use of the old copy would let us save a
POP and a
0010 DUP1 [i i size]
Tip #5: Exploit execution error to revert the transaction
Instead of explicitly using the
REVERT opcode, an execution error in EVM has the same effect of reverting without any data returned. The most common way is by the byte
fe, also called the
INVALID instruction, which is only 1 byte long. If you want to revert based on a particular condition, use
JUMPI to jump to an offset that is not a
JUMPDEST to revert. See the 3rd part of the above bytecode as an example.
Actually, I didn’t come up with the above solution during the competition. I used a more complicated way to calculate the numerical difference between the first and second half of the input string, which even has a certain chance of failure.
By the way, there seems to be an easy unintended solution: return a hard-coded answer for the five fixed test cases (determined by the length of the input string), and guess for the randomly generated ones. The probability of success is to be
2^-10, which is feasible to brute-force until success within only hours.
License: CC BY-NC 4.0