Deep Mental Models for Solidity ABI Encoding: Part 2

Zaryab Afser

29 Jun 2025 — 10 min read

Note: Read the previous parts before starting:
1. Part-0
2. Part-1 (must-read)

With Part 1 of this article series, you've already internalized the fundamentals of ABI encoding.

We’re now ready to move deeper into complex structures that Solidity developers use every day: structs, arrays, and deeply nested types.

This part of the article isn’t just about adding more types — it’s about unlocking recursive patterns in ABI encoding.

In this part, we will primarily focus on how Solidity structs are encoded.

To demonstrate the step-by-step encoding, we will use the example of 3 different types of structs:

Static Struct: struct with only static type values
Dynamic Struct: struct with both static and dynamic type fields
Super Dynamic Struct: struct whose dynamic fields include a dynamic array (more complex, huh)

Primer on Struct Encoding

Before proceeding with a complex encoding mechanism in struct types, let’s first learn some imperatives:

A struct in Solidity is a user-defined type that groups multiple variables under a single name. Each field in a struct can be of a different type, and the encoding of the struct depends entirely on the composition of these fields.

How do we know a struct type?

As we learned in part 1, when encoding, it's crucial to understand the type of arguments we are encoding. The rules rely on the type we encode.

But what if a struct has both static and dynamic fields?

The ABI specification classifies a struct as static or dynamic based on its fields:

Static struct: a struct where all fields are static types.
Dynamic struct: a struct that contains at least one dynamic type field.

For instance:

Struct ABC is a static type, but
Struct XYZ is a dynamic type (because it has one dynamic type field, i.e., bytes):

struct ABC{
  uint256 a;
  address b;
  bool    c;
}

struct XYZ{
  uint256 x;
  address y;
  bytes   c;
}

Note: Structs are always treated as tuples for ABI purposes.
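One way to see the static/dynamic split is to look at the shape of the encoded bytes. The following is a minimal sketch, assuming Solidity 0.8.4 or later; the contract name StructKindSketch and its function names are hypothetical and only exist for illustration.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract StructKindSketch {
    struct ABC { uint256 a; address b; bool c; }
    struct XYZ { uint256 x; address y; bytes c; }

    // Static struct: encoded in place as three 32-byte words.
    function staticLength() external pure returns (uint256) {
        ABC memory s = ABC(1, address(0), true);
        return abi.encode(s).length;
    }

    // Dynamic struct: the first word is an offset to the struct body.
    function dynamicFirstWord() external pure returns (uint256 firstWord) {
        XYZ memory s = XYZ(1, address(0), hex"abcd");
        bytes memory enc = abi.encode(s);
        assembly {
            // Skip the 32-byte length prefix of the in-memory bytes array.
            firstWord := mload(add(enc, 0x20))
        }
    }
}
```

Under these assumptions, staticLength() should return 96 (three words, no offsets), while dynamicFirstWord() should return 0x20, the offset that a dynamic type places in its head.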

The fundamental laws of encoding remain the same as discussed in Part 1. All we need to do is:

determine the type of struct
create the head-tail pattern
encode heads and tails
connect them all to get the final calldata

Let’s start with encoding now.

a. Encoding Structs with Only Static Types

Let’s begin with a struct that contains only static types — all fields have a fixed size at compile-time.


struct StaticStruct {
    uint256 id;
    bool isActive;
    address owner;
}

function encodeStaticStruct(StaticStruct memory s) public pure returns (bytes memory) {
    return msg.data;
}

// Arguments : [10, true, "0x5B38Da6a701c568545dCfcB03FcB875f56beddC4"]

We now apply the Fundamental Laws of ABI Encoding, exactly as we defined them in part 1.

Step 1: Determine the struct type and visualize its Head-Tail Layout

We have a single struct s with all types as static. This makes our struct a static type variable.

So the argument tuple looks like:

tuple(s.id, s.isActive, s.owner)

or

tuple(10, true, 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4)

Because all fields are static, there is no tail section.

The layout is simple:


encoded = [head(id)][head(isActive)][head(owner)]

No dynamic types → no offsets → no tail section.

Step 2: Start Encoding the Heads and Tails

Let’s now encode each head individually.

Recall that the encoding of static type heads is the encoded value itself.

For argument id = 10:
1. Encoded in big-endian (value right-aligned)
2. Left-padded to 32 bytes

000000000000000000000000000000000000000000000000000000000000000a

For argument isActive = true ( boolean value )
1. Encoded as a uint8 → becomes 0x01
2. Left-padded to 32 bytes

0000000000000000000000000000000000000000000000000000000000000001

For argument owner = 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4
1. address is 20 bytes
2. Right-aligned, left-padded to 32 bytes

0000000000000000000000005b38da6a701c568545dcfcb03fcb875f56beddc4
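The three head words above can be reproduced with plain type conversions. This is a rough sketch, assuming Solidity 0.8.4 or later; StaticHeadSketch, OWNER, heads and agreesWithAbiEncode are hypothetical names introduced here, not part of the original example.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract StaticHeadSketch {
    address constant OWNER = 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4;

    // Each static value becomes one left-padded 32-byte word.
    function heads() external pure returns (bytes32 id, bytes32 isActive, bytes32 owner) {
        id = bytes32(uint256(10));
        isActive = bytes32(uint256(1)); // true is encoded as the number 1
        owner = bytes32(uint256(uint160(OWNER)));
    }

    // The hand-built words should match what abi.encode produces.
    function agreesWithAbiEncode() external pure returns (bool) {
        bytes memory manual = bytes.concat(
            bytes32(uint256(10)),
            bytes32(uint256(1)),
            bytes32(uint256(uint160(OWNER)))
        );
        return keccak256(abi.encode(uint256(10), true, OWNER)) == keccak256(manual);
    }
}
```

The conversions go through uint256 on purpose: integer-to-bytes32 conversion keeps the value right-aligned, which is exactly the left-padding the ABI asks for on uint, bool, and address.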

Step 3: Combine All Encoded Heads

Putting them together:

 Function Selector: 118e7fac
 ------------
 [000]: 000000000000000000000000000000000000000000000000000000000000000a
 [020]: 0000000000000000000000000000000000000000000000000000000000000001
 [040]: 0000000000000000000000005b38da6a701c568545dcfcb03fcb875f56beddc4
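Since a static struct is just a tuple of its fields, encoding the struct and encoding the fields one by one should give the same bytes. Here is a small sketch of that check, assuming Solidity 0.8.4 or later; the contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract StaticStructSketch {
    struct StaticStruct {
        uint256 id;
        bool isActive;
        address owner;
    }

    // Compares the struct encoding with the field-by-field encoding.
    function structEqualsFieldTuple() external pure returns (bool) {
        StaticStruct memory s = StaticStruct(
            10,
            true,
            0x5B38Da6a701c568545dCfcB03FcB875f56beddC4
        );
        bytes memory viaStruct = abi.encode(s);
        bytes memory viaFields = abi.encode(s.id, s.isActive, s.owner);
        // Three 32-byte heads, no offsets, no tail.
        return viaStruct.length == 96 && keccak256(viaStruct) == keccak256(viaFields);
    }
}
```

Note that this compares only the argument bytes. The msg.data returned by encodeStaticStruct additionally starts with the 4-byte function selector.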

That was easy because StaticStruct was a simple static type.

Let’s now move to the complicated world of dynamic structs.

b. Encoding Structs with Dynamic Types

For a dynamic struct, let’s consider the following example.

struct DynamicStruct {
    uint256 id;
    string name;
    bytes data;
}

function encodeDynamicStruct(DynamicStruct memory s) public pure returns (bytes memory) {
    return msg.data;
}

// Arguments: [10, "DecipherClub", "0xabcd1234"]

Step 1: Determine the struct type and visualize its Head-Tail layout

We are passing a single struct s, and this time, it contains both static and dynamic fields:

id → static (uint256)
name → dynamic (string)
data → dynamic (bytes)

This makes the entire struct a dynamic type (since name and data are dynamic types).

This indicates the encodeDynamicStruct function has a dynamic type argument.

Recall the rule of encoding that says, for a dynamic type variable, we have:

encoding of head, which is the offset that points to where the tail is,
encoding of tail, which is the encoding of the actual value of the arguments.

This means our initial head-tail layout looks something like this

encoding = [head(s)][tail(s)]

where s is the DynamicStruct type passed as argument

Encoding the head(s)

As we know, the encoding of the head of a dynamic type should simply point to where the tail starts.

Since this function takes one dynamic struct, the first 32 bytes of the encoded calldata is the offset pointing to where the struct's actual encoding starts — the beginning of the tuple(id, name, data) layout.

Therefore, we have:

offset = 0x20 = 32 bytes

encode(head(s)):
[000]: 0000000000000000000000000000000000000000000000000000000000000020

Encoding of tail(s)

Now this is where things get interesting.

The tail of a dynamic type is the encoding of the values themselves.

In this case, the tail of “s” is a tuple of multiple arguments (id, name, data ).

This means:

We treat s.id, s.name, and s.data as if they were regular function arguments
And we apply the same ABI rules:
- Encode all heads
- Encode all tails
- Stitch together into the full struct encoding

In other words, the tail of the struct is a mini ABI-encoded payload in itself, composed of a head–tail layout for each member.

We are now talking about nested encoding. Encoding within encoding.

Think of the arguments in terms of tuples now, so that means:

tuple(s.id, s.name, s.data)

or

tuple(10, "DecipherClub",  "0xabcd1234")

The head-tail layout would look something like this:


encoded = [head(id)][head(name)][head(data)][tail(name)][tail(data)]

// Note: no tail for id because id is static type
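The "encoding within encoding" idea can be checked directly: the struct encoding should equal one offset word followed by the encoding of its fields as a plain tuple. A minimal sketch, assuming Solidity 0.8.4 or later (for bytes.concat); the contract and function names are hypothetical.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract DynamicStructLayoutSketch {
    struct DynamicStruct {
        uint256 id;
        string name;
        bytes data;
    }

    function tailIsFieldTuple() external pure returns (bool) {
        DynamicStruct memory s = DynamicStruct(10, "DecipherClub", hex"abcd1234");

        // head(s) followed by tail(s)
        bytes memory whole = abi.encode(s);

        // tail(s) on its own: the fields encoded like regular arguments
        bytes memory body = abi.encode(s.id, s.name, s.data);

        // head(s) is a single word holding the offset 0x20
        bytes memory rebuilt = bytes.concat(bytes32(uint256(0x20)), body);
        return keccak256(whole) == keccak256(rebuilt);
    }
}
```

If the mental model holds, tailIsFieldTuple() returns true, which is another way of saying that tail(s) is itself a complete head-tail payload.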

Start Encoding the Heads and Tails

We now encode the head-tails of the struct values individually.

Let’s compute each head:

head(id) = value 10 → left-padded uint256:


encode(head(id)): 
[020]: 000000000000000000000000000000000000000000000000000000000000000a

head(name) = offset to tail(name)
- Remember that name is a dynamic type.
- And encoding of the head of a dynamic type shows the offset of where the values are present for that dynamic type.
- To calculate the offset for tail(name) , remember that:
  - The struct's head (id, name, data) occupies 3 × 32 = 96 bytes
  - And the argument just before name is a static type - so we don’t need to save any space for its tail ( because static types don’t have any tail )
  - Therefore, tail(name) starts after 96 bytes from the beginning of the struct encoding, i.e., 0x60 in hex

Encoded as:

encode(head(name)): 
[040]: 0000000000000000000000000000000000000000000000000000000000000060

head(data) = offset to tail(data)
- data is also a dynamic type (follow the same steps as before).
- head(data) should give us the offset (location) where tail(data) starts.
- To calculate the offset stored in head(data), remember that:
  - The struct's head (id, name, data) occupies 3 × 32 = 96 bytes.
  - The arguments just before data are:
    - id → static, so no tail.
    - name → string type, whose tail takes two 32-byte slots (remember that dynamic types store their length + encoded data):
      - one storing the length of the string
      - one storing the encoded value of the string
  - This means the total space after which the tail of data can be stored is:
    - 96 bytes (head) + 64 bytes (tail of name) = 160 bytes = 0xa0
- This means head(data) will include 0xa0 as the offset.

encode(head(data)): 
[060]: 00000000000000000000000000000000000000000000000000000000000000a0
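The offset arithmetic above is mechanical enough to write down as code. The following is only a sketch of the calculation, assuming Solidity 0.8.4 or later; OffsetSketch, tailSize and offsets are hypothetical names, and the compiler does not expose helpers like these.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract OffsetSketch {
    // Size of a string/bytes tail: one length word plus the data
    // rounded up to a multiple of 32 bytes.
    function tailSize(uint256 dataLength) public pure returns (uint256) {
        return 32 + ((dataLength + 31) / 32) * 32;
    }

    // Offsets are counted from the start of the struct body.
    function offsets() external pure returns (uint256 nameOffset, uint256 dataOffset) {
        uint256 headSize = 3 * 32; // heads of id, name, data

        // id is static and has no tail, so tail(name) follows the heads directly.
        nameOffset = headSize;

        // tail(data) comes after tail(name).
        dataOffset = nameOffset + tailSize(bytes("DecipherClub").length);
    }
}
```

For the example arguments this works out to 96 (0x60) and 160 (0xa0), the same values derived by hand.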

We are now done with head encoding, so let’s move forward with the encoding of the tail.

Encode tail(id)
1. id is a static uint256 type
2. Therefore, there will be no tail for id.

Encode tail(name):
1. name is a string, which is a dynamic type.
2. dynamic types store their length and encoded value.
3. The provided string as an argument is: "DecipherClub"
  - Length: 12 bytes or 0xc in hex
  - Encoded Value: 4465636970686572436C7562 (the UTF-8 bytes of the string)
4. The encoding of tail(name) will therefore be:

encode(tail(name)):
 [080]: 000000000000000000000000000000000000000000000000000000000000000c
 [0a0]: 4465636970686572436c75620000000000000000000000000000000000000000

Encode tail(data)
1. data is of type bytes, which is a dynamic type.
2. Similar to name, this will also store length + encoded value.
3. The provided bytes argument is: "0xabcd1234"
  1. Length: 4 bytes → encoded as 0x04, left-padded to 32 bytes
  2. The value is already provided, i.e., abcd1234, which is right-padded to 32 bytes
4. The encoding of tail(data) will therefore be stored as:

encode(tail(data)):
 [0c0]: 0000000000000000000000000000000000000000000000000000000000000004
 [0e0]: abcd123400000000000000000000000000000000000000000000000000000000
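Both tails follow the same recipe: a length word, then the raw bytes right-padded with zeros to a multiple of 32. Below is a rough sketch of that recipe, assuming Solidity 0.8.4 or later; DynamicTailSketch and its functions are hypothetical helpers written for this walkthrough.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract DynamicTailSketch {
    // Builds the tail of a string/bytes value by hand.
    function tail(bytes memory value) public pure returns (bytes memory) {
        // Zero bytes needed to reach the next multiple of 32.
        uint256 padding = (32 - (value.length % 32)) % 32;
        return bytes.concat(bytes32(value.length), value, new bytes(padding));
    }

    function matchesWalkthrough() external pure returns (bool) {
        // [080] length 0x0c, [0a0] "DecipherClub" right-padded
        bytes memory nameTail = bytes.concat(
            bytes32(uint256(12)),
            bytes32("DecipherClub")
        );
        // [0c0] length 0x04, [0e0] abcd1234 right-padded
        bytes memory dataTail = bytes.concat(
            bytes32(uint256(4)),
            bytes32(bytes4(0xabcd1234))
        );
        return keccak256(tail(bytes("DecipherClub"))) == keccak256(nameTail)
            && keccak256(tail(hex"abcd1234")) == keccak256(dataTail);
    }
}
```

Notice the asymmetry in padding: the length is a number, so it is left-padded, while the string and bytes contents are right-padded.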

Final Step: Combine Everything

 Function Selector: 2b01f4b4
 ------------
[000]: 0000000000000000000000000000000000000000000000000000000000000020   ← top-level offset
[020]: 000000000000000000000000000000000000000000000000000000000000000a   ← id
[040]: 0000000000000000000000000000000000000000000000000000000000000060   ← offset to tail(name)
[060]: 00000000000000000000000000000000000000000000000000000000000000a0   ← offset to tail(data)
[080]: 000000000000000000000000000000000000000000000000000000000000000c   ← tail(name) length
[0a0]: 4465636970686572436c75620000000000000000000000000000000000000000   ← tail(name) data
[0c0]: 0000000000000000000000000000000000000000000000000000000000000004   ← tail(data) length
[0e0]: abcd123400000000000000000000000000000000000000000000000000000000   ← tail(data) data
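To double-check the dump, we can rebuild it word by word and compare it with what the compiler produces. This is a sketch under the assumption of Solidity 0.8.4 or later; DynamicStructDumpSketch, w and matchesAbiEncode are hypothetical names. The 4-byte selector is left out, since abi.encode only produces the argument bytes.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract DynamicStructDumpSketch {
    struct DynamicStruct {
        uint256 id;
        string name;
        bytes data;
    }

    // One left-padded 32-byte word.
    function w(uint256 v) private pure returns (bytes32) {
        return bytes32(v);
    }

    function matchesAbiEncode() external pure returns (bool) {
        bytes memory heads = bytes.concat(
            w(0x20), // [000] top-level offset
            w(10),   // [020] id
            w(0x60), // [040] offset to tail(name)
            w(0xa0)  // [060] offset to tail(data)
        );
        bytes memory tails = bytes.concat(
            w(12),                       // [080] tail(name) length
            bytes32("DecipherClub"),     // [0a0] tail(name) data
            w(4),                        // [0c0] tail(data) length
            bytes32(bytes4(0xabcd1234))  // [0e0] tail(data) data
        );
        DynamicStruct memory s = DynamicStruct(10, "DecipherClub", hex"abcd1234");
        return keccak256(abi.encode(s)) == keccak256(bytes.concat(heads, tails));
    }
}
```

The heads and tails are concatenated in two steps mainly to keep the number of stack values small; the resulting bytes are the same eight words as in the dump.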

c. Encoding Structs with Nested Dynamic Types

This is our most advanced struct so far — it includes dynamic types and a dynamic array inside it.

However, we will follow similar rules to encode this as well.


struct SuperDynamicStruct {
    uint256 id;
    string label;
    address[] signers;
    bytes metadata;
}

function encodeSuperDynamicStruct(SuperDynamicStruct memory s) public pure returns (bytes memory) {
    return msg.data;
}

// Arguments: [99, "DecipherClub", [0x5B38Da6a701c568545dCfcB03FcB875f56beddC4, 0xAb8483F64d9C6d1EcF9b849Ae677dD3315835Cb2], "0xbeefcafebabe"]

Step 1: Determine the struct type and visualize its Head-Tail layout

This struct contains:

id → static (uint256)
label → dynamic (string)
signers → dynamic (address[])
metadata → dynamic (bytes)

→ Because it contains dynamic fields, this struct is a dynamic type.

The initial head-tail layout looks like this:

encoding = [head(s)][tail(s)]

where s is the SuperDynamicStruct type passed as argument

Following the same rules that we used for DynamicStruct earlier, the first 32 bytes represent the offset to where the struct encoding begins.

encode(head(s)):
[000]: 0000000000000000000000000000000000000000000000000000000000000020

Step 2: Think of the struct's tail as a tuple

Once we jump to that offset, we’re inside the struct body, which is conceptually:


encode(tail(s)) = encode( tuple(s.id, s.label, s.signers, s.metadata) )

or 

encode( tuple(99, "DecipherClub", [address, address], 0xbeefcafebabe) )

We now apply our head-tail logic again:

encoded = [head(id)][head(label)][head(signers)][head(metadata)][tail(label)][tail(signers)][tail(metadata)]

This means now we need to individually solve all of these to get the final encoded value.

But hey, we have done this before. No big deal, anon.

Step 3: Encode Heads

head(id) → 99 → static → left-padded uint256:

[020]: 0000000000000000000000000000000000000000000000000000000000000063

head(label) → should provide offset to tail(label)
- label is a dynamic type and its head encoding should show where the tail begins.
- To calculate the offset, we have:
  - 4 different heads with 32 bytes each = 4 × 32 bytes = 128 bytes
- So tail(label) starts 128 bytes (0x80) from the beginning of the struct encoding.

Note: The argument just before label is a static type, so we don’t need to reserve any space for its tail (because static types don’t have any tail).


encode(head(label)): 
[040]: 0000000000000000000000000000000000000000000000000000000000000080

head(signers) → offset to tail(signers)
- now we have a dynamic array.
- To calculate its offset, we have:
  - 4 different heads with 32 bytes each = 4 × 32 bytes = 128 bytes
  - The arguments right before signers are id and label.
  - While id is static and doesn’t have any tail, the label argument does have a tail which takes two 32-byte slots (one for the length and one for the encoded value).
  - So the total space needed becomes 128 bytes + 2 × 32 bytes (for the tail of label) = 192 bytes or 0xc0
- This means head(signers) will include 0xc0 as offset


encode(head(signers)): 
[060]: 00000000000000000000000000000000000000000000000000000000000000c0

head(metadata) → offset to tail(metadata)
- next up is the head for metadata , which is again a dynamic type.
- To calculate offset, we have:
  - 4 different heads with 32 bytes each = 4 × 32 bytes = 128 bytes
  - The arguments right before metadata are id, label, and the signers array:
    - id takes no space because there are no tails.
    - label takes two 32 bytes slots for its length and value.
    - signers array will take three 32-byte slots for the following:
      - to store its length
      - to store the first address at signers[0]
      - to store the second address at the signers[1]
  - So total space needed becomes = 128 bytes + 64 bytes ( for tail(label) ) + 96 bytes ( for tail(signers) )
  - Total space in bytes needed = 288 bytes or 0x120 in hex
- This means, head(metadata) will include 0x120 as its offset.

[080]: 0000000000000000000000000000000000000000000000000000000000000120

So far, we have included all the heads encoding and our overall calldata looks something like this:

Function Selector: d291d14f
----------------------------
[000]: 0000000000000000000000000000000000000000000000000000000000000020   ← head(s)
[020]: 0000000000000000000000000000000000000000000000000000000000000063   ← id
[040]: 0000000000000000000000000000000000000000000000000000000000000080   ← offset to label
[060]: 00000000000000000000000000000000000000000000000000000000000000c0   ← offset to signers
[080]: 0000000000000000000000000000000000000000000000000000000000000120   ← offset to metadata
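These five head words can also be read straight out of a real encoding. The sketch below assumes Solidity 0.8.4 or later and uses a little inline assembly to load words from memory; SuperHeadSketch and headWords are hypothetical names used only here.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract SuperHeadSketch {
    struct SuperDynamicStruct {
        uint256 id;
        string label;
        address[] signers;
        bytes metadata;
    }

    function headWords()
        external
        pure
        returns (uint256 top, uint256 id, uint256 labelOff, uint256 signersOff, uint256 metadataOff)
    {
        address[] memory signers = new address[](2);
        signers[0] = 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4;
        signers[1] = 0xAb8483F64d9C6d1EcF9b849Ae677dD3315835Cb2;

        SuperDynamicStruct memory s =
            SuperDynamicStruct(99, "DecipherClub", signers, hex"beefcafebabe");
        bytes memory enc = abi.encode(s);

        assembly {
            // enc points at the length prefix; the payload starts 0x20 later.
            top := mload(add(enc, 0x20))         // [000]
            id := mload(add(enc, 0x40))          // [020]
            labelOff := mload(add(enc, 0x60))    // [040]
            signersOff := mload(add(enc, 0x80))  // [060]
            metadataOff := mload(add(enc, 0xa0)) // [080]
        }
    }
}
```

With the example arguments, the five return values should be 0x20, 99, 0x80, 0xc0 and 0x120. The last three are measured from the start of the struct body (position 0x20 in the dump), not from the start of the calldata.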

Step 4: Encode Tails

It's now time to encode all the tails.

Encode tail(id)
1. id is a uint256 type which is static
2. Therefore, there will be no tail for id.

Encode tail(label)

String = "DecipherClub"
UTF-8 = 4465636970686572436C7562 (12 bytes)
Padding → 32 bytes total

We have done this before.

encode(tail(label)):
[0a0]: 000000000000000000000000000000000000000000000000000000000000000c
[0c0]: 4465636970686572436c75620000000000000000000000000000000000000000

Encode tail(signers)

Address array = length 2
Each address is 20 bytes, right-aligned into 32 bytes


[0e0]: 0000000000000000000000000000000000000000000000000000000000000002
[100]: 0000000000000000000000005b38da6a701c568545dcfcb03fcb875f56beddc4
[120]: 000000000000000000000000ab8483f64d9c6d1ecf9b849ae677dd3315835cb2
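The array tail is a length word followed by one left-padded word per address. A minimal sketch of that layout, assuming Solidity 0.8.4 or later; ArrayTailSketch and signersTailMatches are hypothetical names.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract ArrayTailSketch {
    function signersTailMatches() external pure returns (bool) {
        address[] memory signers = new address[](2);
        signers[0] = 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4;
        signers[1] = 0xAb8483F64d9C6d1EcF9b849Ae677dD3315835Cb2;

        // Length word, then each element as a 32-byte word.
        bytes memory manualTail = bytes.concat(
            bytes32(signers.length),
            bytes32(uint256(uint160(signers[0]))),
            bytes32(uint256(uint160(signers[1])))
        );

        // Encoding the array on its own adds a 0x20 offset word in front,
        // because a lone dynamic argument also gets a head.
        bytes memory viaAbi = abi.encode(signers);
        bytes memory rebuilt = bytes.concat(bytes32(uint256(0x20)), manualTail);

        return manualTail.length == 96 && keccak256(viaAbi) == keccak256(rebuilt);
    }
}
```

Because the elements here are static (address), the array tail needs no inner offsets. An array of dynamic elements, such as string[], would repeat the head-tail pattern one level deeper.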

Encode tail(metadata)
1. Bytes = 0xbeefcafebabe = 6 bytes
2. Value is already provided as beefcafebabe


encode(tail(metadata)):
[140]: 0000000000000000000000000000000000000000000000000000000000000006
[160]: beefcafebabe0000000000000000000000000000000000000000000000000000

Final Step: Combine Everything

Function Selector: d291d14f
----------------------------
[000]: 0000000000000000000000000000000000000000000000000000000000000020   ← head(s)
[020]: 0000000000000000000000000000000000000000000000000000000000000063   ← id
[040]: 0000000000000000000000000000000000000000000000000000000000000080   ← offset to label
[060]: 00000000000000000000000000000000000000000000000000000000000000c0   ← offset to signers
[080]: 0000000000000000000000000000000000000000000000000000000000000120   ← offset to metadata
[0a0]: 000000000000000000000000000000000000000000000000000000000000000c   ← tail(label) length
[0c0]: 4465636970686572436c75620000000000000000000000000000000000000000   ← tail(label) value
[0e0]: 0000000000000000000000000000000000000000000000000000000000000002   ← tail(signers) length
[100]: 0000000000000000000000005b38da6a701c568545dcfcb03fcb875f56beddc4   ← signer[0]
[120]: 000000000000000000000000ab8483f64d9c6d1ecf9b849ae677dd3315835cb2   ← signer[1]
[140]: 0000000000000000000000000000000000000000000000000000000000000006   ← metadata length
[160]: beefcafebabe0000000000000000000000000000000000000000000000000000   ← metadata
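As a final sanity check, the twelve words can be rebuilt by hand, compared with abi.encode, and decoded back into the struct. This is a sketch assuming Solidity 0.8.4 or later; SuperDynamicStructSketch and all of its function and constant names are hypothetical. The function selector is not included, since abi.encode only covers the arguments.

```solidity
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.4;

// Hypothetical helper contract, for illustration only.
contract SuperDynamicStructSketch {
    struct SuperDynamicStruct {
        uint256 id;
        string label;
        address[] signers;
        bytes metadata;
    }

    address constant SIGNER_0 = 0x5B38Da6a701c568545dCfcB03FcB875f56beddC4;
    address constant SIGNER_1 = 0xAb8483F64d9C6d1EcF9b849Ae677dD3315835Cb2;

    // One left-padded 32-byte word.
    function w(uint256 v) private pure returns (bytes32) {
        return bytes32(v);
    }

    // Rebuilds the dump from [000] to [160].
    function expectedDump() public pure returns (bytes memory) {
        bytes memory heads = bytes.concat(
            w(0x20), w(99), w(0x80), w(0xc0), w(0x120)
        );
        bytes memory tails = bytes.concat(
            w(12), bytes32("DecipherClub"),            // tail(label)
            w(2), w(uint160(SIGNER_0)), w(uint160(SIGNER_1)), // tail(signers)
            w(6), bytes32(bytes6(0xbeefcafebabe))      // tail(metadata)
        );
        return bytes.concat(heads, tails);
    }

    function matchesAbiEncode() external pure returns (bool) {
        address[] memory signers = new address[](2);
        signers[0] = SIGNER_0;
        signers[1] = SIGNER_1;
        SuperDynamicStruct memory s =
            SuperDynamicStruct(99, "DecipherClub", signers, hex"beefcafebabe");
        return keccak256(abi.encode(s)) == keccak256(expectedDump());
    }

    // Decoding the hand-built bytes should give the original values back.
    function roundTrip() external pure returns (bool) {
        SuperDynamicStruct memory d = abi.decode(expectedDump(), (SuperDynamicStruct));
        return d.id == 99
            && d.signers.length == 2
            && d.signers[1] == SIGNER_1
            && keccak256(d.metadata) == keccak256(hex"beefcafebabe");
    }
}
```

If both functions return true, the hand-derived layout and the compiler agree in both directions: encoding produces the dump, and the dump decodes back to the struct.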