Smart Contract Optimization: How the Bit Width of Solidity Types Affects Transaction Cost

“Programmers waste enormous amounts of time thinking about, or worrying about, the speed of noncritical parts of their programs, and these attempts at efficiency actually have a strong negative impact when debugging and maintenance are considered. We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. Yet we should not pass up our opportunities in that critical 3%.”
Donald Knuth

When auditing smart contracts, we sometimes ask ourselves whether their development belongs to the 97% of cases where there is no need to think about optimization, or to the 3% where it matters. In our opinion, it is rather the latter. Unlike other applications, smart contracts cannot be updated after deployment, so they cannot be optimized “on the go” (unless an upgrade mechanism is built into their design, but that is a separate topic). The second argument in favor of early optimization is this: in most systems, inefficiency shows up only at scale, depends on the specifics of hardware and environment, and is measured by a huge number of metrics, whereas a smart contract has essentially a single performance metric: gas consumption.

The efficiency of a contract is therefore technically easier to evaluate, yet developers often keep relying on intuition and commit the same blind “premature optimization” that Professor Knuth was talking about. We will check how well an intuitive decision matches reality, using the choice of a variable’s bit width as the example. In this example, as in most practical cases, we will not achieve any savings; on the contrary, our contract will become more expensive in terms of gas consumed.

What is gas?

Ethereum is like a global computer: its “processor” is the EVM virtual machine, its “program code” is the sequence of instructions and data recorded in a smart contract, and calls are transactions coming from the outside world. Transactions are packed into linked structures called blocks, which are produced every few seconds. Since the block size is limited by definition and the processing protocol is deterministic (all network nodes must process all transactions in a block identically), the system has to serve potentially unlimited demand with limited node resources while protecting itself from DoS attacks. It therefore needs a fair algorithm for choosing whose request to serve and whose to ignore. Many public blockchains use a simple principle for this: the sender chooses the reward offered to the miner for executing the transaction, and the miner chooses which transactions to include in a block, picking the ones most profitable for themselves.

For example, in Bitcoin, where a block is limited to one megabyte, the miner decides whether to include a transaction in a block based on its length and the fee offered (choosing those with the highest satoshi-per-byte ratio).

This approach does not suit the more complex Ethereum protocol, because a single byte can stand either for an operation that does nothing (for example, the STOP opcode) or for an expensive and slow write to storage (SSTORE). Therefore each opcode in Ethereum has its own price, depending on how resource-intensive it is.

Figure: Fee schedule. Table of gas consumption by different types of operations, from the Ethereum Yellow Paper protocol specification.

Unlike in Bitcoin, the sender of an Ethereum transaction does not set a fee in cryptocurrency directly. Instead, the sender sets the maximum amount of gas they are willing to spend (startGas) and the price per unit of gas (gasPrice). As the virtual machine executes the code, the gas cost of each successive operation is subtracted from startGas until either execution finishes or the gas runs out. This is apparently why such a strange name is used for this unit of work: a transaction is filled with gas like a car, and whether it reaches its destination depends on whether there is enough in the tank. When execution completes, the amount of ether charged to the sender is the gas actually consumed multiplied by the price the sender specified (in wei per unit of gas). On the public network this happens when the block containing the transaction is mined; in the Remix environment a transaction is “mined” instantly, for free, and unconditionally.
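The fee arithmetic described above can be sketched as a small pure function. This is only an illustration: the protocol charges the fee itself, the FeeExample contract and its fee function are hypothetical names, and the pragma is raised to ^0.4.16 because the sketch uses the pure keyword.

```solidity
pragma solidity ^0.4.16;

// Hypothetical helper for illustration; the protocol itself charges the fee.
contract FeeExample {
    // Fee in wei = gas actually consumed * price per unit of gas (in wei).
    // No overflow check: Solidity 0.4.x arithmetic wraps silently.
    function fee(uint gasUsed, uint gasPriceWei) public pure returns (uint) {
        return gasUsed * gasPriceWei;
    }
}
```

For instance, a plain transfer consuming 21,000 gas at an assumed price of 10 gwei (10,000,000,000 wei) gives 210,000,000,000,000 wei, that is 0.00021 ether.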

Our tool - Remix IDE

To “profile” gas consumption, we will use Remix IDE, the online environment for developing Ethereum contracts. This IDE includes a code editor with syntax highlighting, an artifact viewer, a rendering of contract interfaces, a visual debugger for the virtual machine, JavaScript builds of the compiler in every available version, and many other important tools. I highly recommend starting your study of Ethereum with it. An additional plus is that it requires no installation: just open it in a browser from the official site.

Variable type selection

The Solidity language specification offers the developer as many as thirty-two widths of the unsigned integer type uint, from 8 to 256 bits in steps of 8. Imagine that you are developing a smart contract designed to store a person’s age in years. Which uint width would you choose?

It would be quite natural to choose the smallest type sufficient for the task; mathematically, uint8 fits here. It seems logical that the smaller the object we store in the blockchain and the less memory we use during execution, the lower the overhead and the less we pay. In most cases, however, this assumption turns out to be wrong.

For the experiment, let’s take the simplest contract offered by the official Solidity documentation and build it in two versions: one using the uint256 type and one using a type 32 times smaller, uint8.

simpleStorage_uint256.sol

    pragma solidity ^0.4.0;

    contract SimpleStorage {
        // uint is an alias for uint256
        uint storedData;

        function set(uint x) public {
            storedData = x;
        }

        function get() public view returns (uint) {
            return storedData;
        }
    }

simpleStorage_uint8.sol

    pragma solidity ^0.4.0;

    contract SimpleStorage {
        uint8 storedData;

        function set(uint8 x) public {
            storedData = x;
        }

        function get() public view returns (uint) {
            return storedData;
        }
    }

Measuring "savings"

So, the contracts are created, loaded into Remix and deployed, and the .set() methods have been called with transactions. What do we see? Writing the short type (uint8) costs more than writing the long one (uint256): 20,464 versus 20,205 gas units! How? Why? Let’s figure it out!

Figure: Ethereum gas consumption in Remix IDE.

Storage uint8 vs uint256

Writing to persistent storage is one of the most expensive operations in the protocol, for obvious reasons. Writing state increases the disk space required by a full node. The size of this storage grows constantly, and the more state the nodes hold, the slower synchronization becomes and the higher the infrastructure requirements (partition size, number of IOPS). At peak times, slow disk I/O determines the performance of the entire network.

It would be logical to expect that storing a uint8 should cost many times less than storing a uint256. However, the debugger shows that both values are placed in storage in exactly the same way: each occupies a slot holding a 256-bit value.

So in this particular case, using uint8 gives no advantage in the cost of writing to storage.
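One way to see this without the debugger is to read the raw storage word directly. The following is a minimal sketch, assuming a compiler of at least version 0.4.16; the SlotPeek contract and its rawSlot function are hypothetical names introduced only for illustration.

```solidity
pragma solidity ^0.4.16;

// Hypothetical contract for illustration only.
contract SlotPeek {
    uint8 storedData; // first state variable, so it lives in storage slot 0

    function set(uint8 x) public {
        storedData = x;
    }

    // Returns the entire 256-bit word held in slot 0, not just the low byte.
    function rawSlot() public view returns (bytes32 word) {
        assembly {
            word := sload(0)
        }
    }
}
```

After a call to set(42), rawSlot() should return a full 32-byte word whose last byte is 0x2a and whose remaining bytes are zero: the single uint8 still sits in a whole slot.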

Handling uint8 vs uint256

Maybe uint8 offers an advantage, if not in storage, then at least when manipulating data in memory? Let’s compare the instructions generated for the same function with the two variable types.

You can see that the uint8 version contains even more instructions than the uint256 one. This is because the machine converts the 8-bit value to its native 256-bit word, so the code gains extra instructions that the sender pays for. In this case, not only the write but also the execution of the uint8 code is more expensive.
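The extra work amounts to masking the 256-bit word down to its low 8 bits. Here is a minimal sketch of that equivalence; the Narrowing contract and its function names are hypothetical, and the exact instruction sequence the compiler emits depends on its version and optimizer settings.

```solidity
pragma solidity ^0.4.16;

// Hypothetical contract for illustration only.
contract Narrowing {
    // Explicit conversion to uint8 keeps only the low 8 bits of the word.
    function viaCast(uint x) public pure returns (uint8) {
        return uint8(x);
    }

    // The same result spelled out as the bit mask the EVM has to apply.
    function viaMask(uint x) public pure returns (uint) {
        return x & 0xff;
    }
}
```

Both functions give 44 for an input of 300, since 300 - 256 = 44. With uint256 no such mask is needed, because the value already fills the native word.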

Where can the use of short types be justified?

Our team has been auditing smart contracts for a long time, and we have not yet seen a single practical case where using a small type in the audited code led to savings. Still, in some very specific cases savings are theoretically possible. For example, if your contract stores a large number of small state variables or structures, they can be packed into fewer storage slots.

The difference will be most apparent in the following example:

1. A contract with 32 uint256 variables

simpleStorage_32x_uint256.sol

    pragma solidity ^0.4.0;

    contract SimpleStorage {
        uint storedData1;
        uint storedData2;
        uint storedData3;
        uint storedData4;
        uint storedData5;
        uint storedData6;
        uint storedData7;
        uint storedData8;
        uint storedData9;
        uint storedData10;
        uint storedData11;
        uint storedData12;
        uint storedData13;
        uint storedData14;
        uint storedData15;
        uint storedData16;
        uint storedData17;
        uint storedData18;
        uint storedData19;
        uint storedData20;
        uint storedData21;
        uint storedData22;
        uint storedData23;
        uint storedData24;
        uint storedData25;
        uint storedData26;
        uint storedData27;
        uint storedData28;
        uint storedData29;
        uint storedData30;
        uint storedData31;
        uint storedData32;

        function set(uint x) public {
            storedData1 = x;
            storedData2 = x;
            storedData3 = x;
            storedData4 = x;
            storedData5 = x;
            storedData6 = x;
            storedData7 = x;
            storedData8 = x;
            storedData9 = x;
            storedData10 = x;
            storedData11 = x;
            storedData12 = x;
            storedData13 = x;
            storedData14 = x;
            storedData15 = x;
            storedData16 = x;
            storedData17 = x;
            storedData18 = x;
            storedData19 = x;
            storedData20 = x;
            storedData21 = x;
            storedData22 = x;
            storedData23 = x;
            storedData24 = x;
            storedData25 = x;
            storedData26 = x;
            storedData27 = x;
            storedData28 = x;
            storedData29 = x;
            storedData30 = x;
            storedData31 = x;
            storedData32 = x;
        }

        function get() public view returns (uint) {
            return storedData1;
        }
    }

2. A contract with 32 uint8 variables

simpleStorage_32x_uint8.sol

    pragma solidity ^0.4.0;

    contract SimpleStorage {
        uint8 storedData1;
        uint8 storedData2;
        uint8 storedData3;
        uint8 storedData4;
        uint8 storedData5;
        uint8 storedData6;
        uint8 storedData7;
        uint8 storedData8;
        uint8 storedData9;
        uint8 storedData10;
        uint8 storedData11;
        uint8 storedData12;
        uint8 storedData13;
        uint8 storedData14;
        uint8 storedData15;
        uint8 storedData16;
        uint8 storedData17;
        uint8 storedData18;
        uint8 storedData19;
        uint8 storedData20;
        uint8 storedData21;
        uint8 storedData22;
        uint8 storedData23;
        uint8 storedData24;
        uint8 storedData25;
        uint8 storedData26;
        uint8 storedData27;
        uint8 storedData28;
        uint8 storedData29;
        uint8 storedData30;
        uint8 storedData31;
        uint8 storedData32;

        function set(uint8 x) public {
            storedData1 = x;
            storedData2 = x;
            storedData3 = x;
            storedData4 = x;
            storedData5 = x;
            storedData6 = x;
            storedData7 = x;
            storedData8 = x;
            storedData9 = x;
            storedData10 = x;
            storedData11 = x;
            storedData12 = x;
            storedData13 = x;
            storedData14 = x;
            storedData15 = x;
            storedData16 = x;
            storedData17 = x;
            storedData18 = x;
            storedData19 = x;
            storedData20 = x;
            storedData21 = x;
            storedData22 = x;
            storedData23 = x;
            storedData24 = x;
            storedData25 = x;
            storedData26 = x;
            storedData27 = x;
            storedData28 = x;
            storedData29 = x;
            storedData30 = x;
            storedData31 = x;
            storedData32 = x;
        }

        function get() public view returns (uint) {
            return storedData1;
        }
    }

Deploying the first contract (32 uint256) costs less, only 89,941 gas, but .set() is much more expensive because it occupies 32 storage slots, which costs 640,639 gas per call. The second contract (32 uint8) is about two and a half times more expensive to deploy (221,663 gas), but each call to .set() is several times cheaper, since it changes only one storage slot (185,291 gas).
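The same packing applies to the members of a structure. Below is a minimal sketch, assuming a compiler of at least version 0.4.16; the People contract and the Person record with its fields are hypothetical and serve only to show small members sharing a slot.

```solidity
pragma solidity ^0.4.16;

// Hypothetical contract for illustration only.
contract People {
    // 1 + 2 + 4 = 7 bytes in total, so all three members share one 32-byte slot.
    struct Person {
        uint8 age;
        uint16 birthYear;
        uint32 id;
    }

    Person[] people;

    function add(uint8 age, uint16 birthYear, uint32 id) public {
        people.push(Person(age, birthYear, id));
    }

    function count() public view returns (uint) {
        return people.length;
    }
}
```

If all three members were declared as uint256, each record would take three slots instead of one. Whether this pays off in a real contract is worth measuring in Remix rather than assuming.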

Is this optimization worth applying?

How significant the effect of type optimization is remains a moot point. As you can see, even in such a specially chosen, synthetic case the difference is nowhere near the 32-fold ratio of the type sizes: the packed contract is only about 3.5 times cheaper per call. The choice between uint8 and uint256 is rather an illustration of the fact that optimization should either be applied thoughtfully (understanding the tools and profiling) or not attempted at all. Here are some general guidelines:

if the contract keeps many small numbers or compact structures in storage, optimization is worth considering;
if you use a “shortened” type, remember overflow and underflow vulnerabilities;
for memory variables and function arguments that are not written to storage, it is always better to use the native type uint256 (or its alias uint). For example, there is no point in giving a loop counter the type uint8: you will only lose;
the order of variables in the contract is of paramount importance for the compiler to pack them correctly into storage slots.
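Two of the guidelines above, declaration order and overflow, can be sketched as follows. All contract and function names here are hypothetical, and the slot counts in the comments follow the storage layout rules of the Solidity documentation rather than any measurement from this article.

```solidity
pragma solidity ^0.4.16;

// Hypothetical contracts for illustration only.

// a sits alone in slot 0, b needs the whole of slot 1, c lands in slot 2: three slots.
contract Scattered {
    uint8 a;
    uint256 b;
    uint8 c;
}

// a and c share slot 0, b takes slot 1: two slots.
contract Grouped {
    uint8 a;
    uint8 c;
    uint256 b;
}

contract Checked {
    // uint8 addition wraps modulo 256 in Solidity 0.4.x, so check the result.
    function add8(uint8 x, uint8 y) public pure returns (uint8) {
        uint8 sum = x + y;
        require(sum >= x); // reverts on overflow, e.g. 200 + 100
        return sum;
    }
}
```

Without the require check, add8(200, 100) would silently return 44 instead of 300.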

Links

I’ll finish with a tip that has no downsides: experiment with the development tools, and know the specifications of your language, libraries, and frameworks. Here are what I consider the most useful links for starting to explore the Ethereum platform:

The Remix contract development environment, a very capable browser IDE;
The Solidity language specification; the link leads directly to the section on the layout of state variables;
A very interesting contract repository from the well-known OpenZeppelin team, with example implementations of tokens and crowdsale contracts and, most importantly, the SafeMath library, which helps to work with types safely;
Ethereum Yellow Paper, the formal specification of the Ethereum virtual machine;
Ethereum White Paper, the Ethereum platform specification, a more general and high-level document with a large number of references;
Ethereum in 25 minutes, a short but nevertheless substantial technical introduction to Ethereum from the platform’s creator, Vitalik Buterin;
Etherscan blockchain explorer, a window into the real Ethereum world: a browser of blocks, transactions, tokens, and contracts on the main network. Etherscan also has explorers for the Rinkeby, Ropsten, and Kovan test networks (networks with free ether, built on different consensus protocols).

Source: https://habr.com/ru/post/415791/
