Anchor data serialization

Clients send transactions that include instruction data. Before a transaction is sent over the wire, its instructions are first serialized into bytes, and those bytes are then encoded as Base58 text.

Base58 is a type of binary-to-text encoding.

While everything on Solana is stored as bytes, encoding those bytes as Base58 for over-the-wire transfer lets them travel safely inside text-based formats such as JSON.
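To make the encoding step concrete, here is a minimal sketch of Base58 encoding and decoding, assuming the Bitcoin-style alphabet that Solana addresses use. The function names `encodeBase58` and `decodeBase58` are hypothetical helpers written for this illustration, not part of Kit or Anchor; in practice a library would handle this.

```typescript
// Bitcoin-style Base58 alphabet: no 0, O, I, or l, to avoid look-alike characters.
const BASE58_ALPHABET =
  "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Hypothetical helper: bytes -> Base58 text.
function encodeBase58(bytes: Uint8Array): string {
  // Each leading zero byte is written as a leading '1'.
  let zeros = 0;
  while (zeros < bytes.length && bytes[zeros] === 0) zeros++;

  // Treat the rest as one big-endian number and convert it to base 58.
  // `digits` holds base-58 digits, least significant first.
  const digits: number[] = [];
  for (let i = zeros; i < bytes.length; i++) {
    let carry = bytes[i];
    for (let j = 0; j < digits.length; j++) {
      carry += digits[j] * 256;
      digits[j] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }

  let text = "1".repeat(zeros);
  for (let k = digits.length - 1; k >= 0; k--) text += BASE58_ALPHABET[digits[k]];
  return text;
}

// Hypothetical helper: Base58 text -> bytes.
function decodeBase58(text: string): Uint8Array {
  let zeros = 0;
  while (zeros < text.length && text[zeros] === "1") zeros++;

  // `rest` holds the decoded bytes, least significant first.
  const rest: number[] = [];
  for (let i = zeros; i < text.length; i++) {
    let carry = BASE58_ALPHABET.indexOf(text[i]);
    if (carry < 0) throw new Error(`Invalid Base58 character: ${text[i]}`);
    for (let j = 0; j < rest.length; j++) {
      carry += rest[j] * 58;
      rest[j] = carry & 0xff;
      carry >>= 8;
    }
    while (carry > 0) {
      rest.push(carry & 0xff);
      carry >>= 8;
    }
  }

  const out = new Uint8Array(zeros + rest.length);
  for (let k = 0; k < rest.length; k++) out[zeros + k] = rest[rest.length - 1 - k];
  return out;
}

console.log(encodeBase58(Uint8Array.of(0x61))); // 2g
console.log(encodeBase58(new Uint8Array(32))); // 32 '1' characters
```

The second call shows why an all-zero 32-byte key is displayed as a string of 32 `1` characters: each leading zero byte maps to one `1`.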

On the receiving side, you must also be able to translate that Base58 text back into something meaningful, such as a struct. Once the data has been deserialized, validated, and executed, the result is serialized into bytes again and written to the account's data buffer.

Serialization

The client and the program must serialize data in an identical way. To facilitate this, Anchor uses Borsh as its standard serialization format, and it is the default serializer when using Anchor with Rust.

Starting on the client side, we would use Kit to encode the data we want to send:

import {
  type Codec,
  getStructCodec,
  addCodecSizePrefix,
  getUtf8Codec,
  getU32Codec,
} from "@solana/kit";

// Use composable codecs to build complex data structures.
type Person = { name: string; age: number };

const getPersonCodec = (): Codec<Person> =>
    getStructCodec([
        ['name', addCodecSizePrefix(getUtf8Codec(), getU32Codec())],
        ['age', getU32Codec()],
    ]);

const personData = { name: 'John', age: 42 };

A codec in this context is a composable object that knows how to serialize (encode) and deserialize (decode) a specific data type to and from a binary format.

Kit provides a set of fundamental codecs we can combine together to create byte layouts for our own data types.

Having both the data and the codec means we can encode and decode to and from our type on the client:

const personCodec = getPersonCodec();
const encodedPerson: Uint8Array = personCodec.encode(personData);
const decodedPerson: Person = personCodec.decode(encodedPerson);
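To see what the encoded bytes look like, the following dependency-free sketch writes the same layout by hand: a 4-byte little-endian length prefix, the UTF-8 bytes of the name, then a 4-byte little-endian age. It assumes the codec above uses little-endian `u32` values, which matches Borsh; `writeU32LE` and `encodePersonByHand` are hypothetical names used only for this illustration.

```typescript
type PersonLayout = { name: string; age: number };

// Write a number as a 4-byte little-endian unsigned integer.
function writeU32LE(value: number): Uint8Array {
  const out = new Uint8Array(4);
  new DataView(out.buffer).setUint32(0, value, true); // true = little-endian
  return out;
}

// Hypothetical helper that mirrors the struct codec's layout by hand.
function encodePersonByHand(person: PersonLayout): Uint8Array {
  const nameBytes = new TextEncoder().encode(person.name); // UTF-8 bytes
  const out = new Uint8Array(4 + nameBytes.length + 4);
  out.set(writeU32LE(nameBytes.length), 0); // size prefix: byte length of the name
  out.set(nameBytes, 4); // the name itself
  out.set(writeU32LE(person.age), 4 + nameBytes.length); // the age
  return out;
}

const personBytes = encodePersonByHand({ name: "John", age: 42 });
console.log(Array.from(personBytes).join(" "));
// 4 0 0 0 74 111 104 110 42 0 0 0
```

The first four bytes say "4 bytes of text follow", the next four are the ASCII codes for `John`, and the last four hold `42`. The size prefix is what lets a decoder know where the variable-length string ends.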

On-chain, a program's entrypoint is called once per instruction. It receives that instruction's encoded data (the Uint8Array built by the client) as a byte slice, alongside the program ID and the accounts, in this format:

fn process_instruction(
    program_id: &Pubkey,
    accounts: &[AccountInfo],
    instruction_data: &[u8],
) -> ProgramResult

Given that the Pubkey and AccountInfo values have already been parsed into formats the Rust program can understand, the only thing left to parse is the instruction_data byte slice (&[u8]).

This is where the program would use Borsh deserialization to transform that instruction data into the specific structs it needs to handle.
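The program itself is written in Rust, but the steps a Borsh-style deserializer takes can be sketched in TypeScript to keep the examples in one language. The sketch below assumes the `Person` layout from the client example (a `u32` length prefix, UTF-8 name, `u32` age) and ignores details such as the discriminator Anchor places in front of the arguments; `decodePersonArgs` is a hypothetical name, not a real API.

```typescript
type PersonArgs = { name: string; age: number };

// Hypothetical helper: walk the byte slice field by field, checking bounds as we go.
function decodePersonArgs(instructionData: Uint8Array): PersonArgs {
  const view = new DataView(
    instructionData.buffer,
    instructionData.byteOffset,
    instructionData.byteLength,
  );
  let offset = 0;

  // 1. Read the 4-byte little-endian length prefix of the name.
  if (instructionData.length < 4) throw new Error("missing name length prefix");
  const nameLength = view.getUint32(offset, true);
  offset += 4;

  // 2. Make sure the name and the trailing u32 actually fit in the data.
  if (instructionData.length < offset + nameLength + 4) {
    throw new Error("instruction data too short");
  }
  const nameBytes = instructionData.subarray(offset, offset + nameLength);
  const name = new TextDecoder("utf-8", { fatal: true }).decode(nameBytes);
  offset += nameLength;

  // 3. Read the 4-byte little-endian age.
  const age = view.getUint32(offset, true);
  offset += 4;

  // 4. Reject leftover bytes so malformed input is not silently accepted.
  if (offset !== instructionData.length) throw new Error("unexpected trailing bytes");

  return { name, age };
}

const args = decodePersonArgs(
  Uint8Array.of(4, 0, 0, 0, 74, 111, 104, 110, 42, 0, 0, 0),
);
console.log(args.name, args.age); // John 42
```

The bounds checks matter because instruction data comes from outside the program: a deserializer should fail cleanly on truncated or oversized input rather than read past the end of the slice.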

Deserializing

On the client side, we generate a TypeScript client that maps the native Rust types to their corresponding TypeScript types.

Rust                     TypeScript               Example
bool                     boolean                  true
u8/u16/u32/i8/i16/i32    number                   99
u64/u128/i64/i128        anchor.BN                new anchor.BN(99)
f32/f64                  number                   1.0
String                   string                   "hello"
[T; N]                   Array<T>                 [1, 2, 3]
Vec<T>                   Array<T>                 [1, 2, 3]
Option<T>                T | null | undefined     null or 42 (some)
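The table maps 64-bit and wider integers to a big-number class rather than `number` because a JavaScript `number` only represents integers exactly up to 2^53 - 1. The sketch below illustrates the problem using the built-in `bigint` type as a dependency-free stand-in for `anchor.BN`; `readU64LE` is a hypothetical helper, and it assumes a runtime with BigInt support.

```typescript
// Hypothetical helper: read a little-endian u64 from 8 bytes without losing precision.
function readU64LE(bytes: Uint8Array): bigint {
  if (bytes.length < 8) throw new Error("need at least 8 bytes for a u64");
  let value = BigInt(0);
  // Start from the most significant byte (index 7) and shift in one byte at a time.
  for (let i = 7; i >= 0; i--) {
    value = (value << BigInt(8)) | BigInt(bytes[i]);
  }
  return value;
}

// 2^53 + 1 = 9007199254740993, stored little-endian.
const rawU64 = Uint8Array.of(0x01, 0, 0, 0, 0, 0, 0x20, 0);

const exact = readU64LE(rawU64);
const lossy = Number(exact); // converting to a JS number rounds the value

console.log(exact.toString()); // 9007199254740993
console.log(lossy); // 9007199254740992
console.log(Number.isSafeInteger(lossy)); // false
```

The exact value and the `number` version differ by one, which is the kind of silent error a big-number type avoids for token amounts, lamport balances, and similar `u64` fields.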

More complicated objects like structs become types themselves:

// Rust
struct MyStruct {
  val: u16
}

This would have the equivalent type in TypeScript:

// TypeScript
type MyStruct = {
  val: number;
}