Anchor data serialization

Clients send transactions that include instruction data. Before a transaction is sent over the wire, its instructions are first serialized into bytes, and those bytes are then encoded into a Base58 string.

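Base58 is a binary-to-text encoding: it represents arbitrary bytes using 58 alphanumeric characters, leaving out the easily confused 0, O, I, and l.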

While everything on Solana is stored as bytes, encoding those bytes as Base58 gives them a text form that is safe to transfer over the wire.
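To make the encoding concrete, here is a minimal, dependency-free sketch of Base58 using the common Bitcoin-style alphabet. The function names `base58Encode` and `base58Decode` are illustrative and not part of any library; in a real client you would normally rely on a maintained implementation rather than this one.

```typescript
// Bitcoin-style Base58 alphabet: no 0, O, I, or l.
const BASE58_ALPHABET =
  "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz";

// Illustrative helper (not a library API): bytes -> Base58 text.
function base58Encode(bytes: Uint8Array): string {
  // Base-58 digits of the input, least significant digit first.
  const digits: number[] = [];
  for (let i = 0; i < bytes.length; i++) {
    let carry = bytes[i];
    for (let j = 0; j < digits.length; j++) {
      carry += digits[j] * 256;
      digits[j] = carry % 58;
      carry = Math.floor(carry / 58);
    }
    while (carry > 0) {
      digits.push(carry % 58);
      carry = Math.floor(carry / 58);
    }
  }
  // Each leading zero byte is written as a leading '1'.
  let out = "";
  for (let i = 0; i < bytes.length && bytes[i] === 0; i++) out += "1";
  for (let i = digits.length - 1; i >= 0; i--) out += BASE58_ALPHABET[digits[i]];
  return out;
}

// Illustrative helper (not a library API): Base58 text -> bytes.
function base58Decode(text: string): Uint8Array {
  // Bytes of the result, least significant byte first.
  const bytes: number[] = [];
  for (let i = 0; i < text.length; i++) {
    let carry = BASE58_ALPHABET.indexOf(text[i]);
    if (carry < 0) throw new Error(`Invalid Base58 character: ${text[i]}`);
    for (let j = 0; j < bytes.length; j++) {
      carry += bytes[j] * 58;
      bytes[j] = carry & 0xff;
      carry >>= 8;
    }
    while (carry > 0) {
      bytes.push(carry & 0xff);
      carry >>= 8;
    }
  }
  // Each leading '1' stands for a leading zero byte.
  for (let i = 0; i < text.length && text[i] === "1"; i++) bytes.push(0);
  return new Uint8Array(bytes.reverse());
}

const encoded = base58Encode(new Uint8Array([0, 0, 1])); // "112"
const decoded = base58Decode(encoded); // Uint8Array [0, 0, 1]
```

The two functions are inverses of each other, so a receiver can always recover the exact bytes the sender encoded.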

On the receiving side, you must also be able to decode that Base58 text back into bytes and deserialize those bytes into something meaningful, like a struct. Once the data has been deserialized, validated, and executed, it is serialized into bytes once again to be written to the account's data buffer.

Serialization

The client and the program must serialize data in an identical way. To facilitate this, Anchor uses Borsh as its standard serialization format, which makes it our default serializer when we're using Anchor with Rust.

Starting on the client side, we would use Kit to encode the data we want to send:

import {
  type Codec,
  getStructCodec,
  addCodecSizePrefix,
  getUtf8Codec,
  getU32Codec
} from "@solana/kit";

// Use composable codecs to build complex data structures.
type Person = { name: string; age: number };

// getPersonCodec is our own helper, built from Kit's primitives.
const getPersonCodec = (): Codec<Person> =>
    getStructCodec([
        // A UTF-8 string prefixed with its byte length as a u32.
        ['name', addCodecSizePrefix(getUtf8Codec(), getU32Codec())],
        // A plain u32.
        ['age', getU32Codec()],
    ]);

const personData: Person = { name: 'John', age: 42 };

A codec in this context is a composable object that knows how to serialize (encode) and deserialize (decode) a specific data type to and from a binary format.

Kit provides a set of fundamental codecs we can combine together to create byte layouts for our own data types.

Having both the data and the codec means we can encode and decode to and from our type on the client:

const personCodec = getPersonCodec();
const encodedPerson: Uint8Array = personCodec.encode(personData);
const decodedPerson: Person = personCodec.decode(encodedPerson);
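To see what such a codec produces, the sketch below writes the same byte layout by hand, without Kit: a little-endian u32 length prefix, the UTF-8 bytes of the name, then the age as a little-endian u32. The helpers `encodePerson` and `decodePerson` are hypothetical names used only for illustration, and the layout assumes the codecs' default little-endian byte order.

```typescript
type Person = { name: string; age: number };

// Hand-written equivalent of the struct layout (illustrative, not Kit's code):
// [u32 LE name length][UTF-8 name bytes][u32 LE age]
function encodePerson(person: Person): Uint8Array {
  const nameBytes = new TextEncoder().encode(person.name);
  const out = new Uint8Array(4 + nameBytes.length + 4);
  const view = new DataView(out.buffer);
  view.setUint32(0, nameBytes.length, true); // length prefix
  out.set(nameBytes, 4); // string contents
  view.setUint32(4 + nameBytes.length, person.age, true); // age
  return out;
}

function decodePerson(bytes: Uint8Array): Person {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const nameLength = view.getUint32(0, true);
  const name = new TextDecoder().decode(bytes.subarray(4, 4 + nameLength));
  const age = view.getUint32(4 + nameLength, true);
  return { name, age };
}

// Bytes in hex: 04 00 00 00 | 4a 6f 68 6e | 2a 00 00 00
const personBytes = encodePerson({ name: "John", age: 42 });
const roundTripped = decodePerson(personBytes);
```

This is the same layout Borsh uses for a struct with a String field followed by a u32 field, which is why a program can read the bytes the client wrote.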

On-chain, when we call a program, the program receives each instruction individually. The encoded data (our Uint8Array) arrives as the instruction_data byte slice of the program's entrypoint:

fn process_instruction(
    program_id: &Pubkey,
    accounts: &[AccountInfo],
    instruction_data: &[u8],
) -> ProgramResult

Given that the runtime has already parsed program_id and accounts into types the Rust program can understand (Pubkey and AccountInfo), the only thing left to parse is the instruction data, which arrives as a raw &[u8] byte slice.

This is where the program uses Borsh to deserialize the instruction data into the specific structs it needs to handle.
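As a rough illustration of the first parsing step, the sketch below mirrors in TypeScript what happens to the raw bytes before Borsh takes over. It assumes Anchor's default convention, in which instruction data starts with an 8-byte discriminator identifying the instruction, followed by the Borsh-encoded arguments. The name `splitInstructionData` is hypothetical, and the real on-chain logic is Rust code generated by Anchor.

```typescript
type SplitInstruction = {
  discriminator: Uint8Array; // identifies which instruction handler to run
  args: Uint8Array; // Borsh-encoded arguments for that handler
};

// Illustrative helper: assumes the default 8-byte discriminator prefix.
function splitInstructionData(data: Uint8Array): SplitInstruction {
  if (data.length < 8) {
    throw new Error("instruction data is shorter than the 8-byte discriminator");
  }
  return {
    discriminator: data.subarray(0, 8),
    args: data.subarray(8),
  };
}

// Example: 8 discriminator bytes followed by a u16 argument (513, little-endian).
const rawInstruction = new Uint8Array([1, 2, 3, 4, 5, 6, 7, 8, 1, 2]);
const split = splitInstructionData(rawInstruction);
```

Only the `args` portion is handed to the Borsh deserializer; the discriminator is used solely to pick the handler.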

Deserializing

On the client side, we generate a client that converts the native Rust types into their corresponding TypeScript types.

| Rust | TypeScript | Example |
| --- | --- | --- |
| `bool` | `boolean` | `true` |
| `u8/u16/u32/i8/i16/i32` | `number` | `99` |
| `u64/u128/i64/i128` | `anchor.BN` | `new anchor.BN(99)` |
| `f32/f64` | `number` | `1.0` |
| `String` | `string` | `"hello"` |
| `[T; N]` | `Array<T>` | `[1, 2, 3]` |
| `Vec<T>` | `Array<T>` | `[1, 2, 3]` |
| `Option<T>` | `T \| null \| undefined` | `null` or `42` (some) |

- Rust integers up to 32 bits wide (u8 through i32) map to the JavaScript number type.
- Larger integers (u64 and above) use Anchor's BN type, because a JavaScript number cannot represent every integer above 2^53 - 1 exactly.
- Rust's Option<T> maps to a TypeScript union type with null/undefined.
- Structs and enums become JavaScript objects.
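The precision limit behind the large-integer rule can be sketched as follows. A u64 is 8 little-endian bytes on the wire, and the hypothetical helper `readU64AsNumber` below reads it as two 32-bit halves and refuses values a JavaScript number cannot hold exactly. This is only an illustration of the limit, not how Anchor's client decodes u64 values.

```typescript
// Above 2^53, neighbouring integers collapse to the same number value.
const twoPow53 = Math.pow(2, 53);
const precisionLost = twoPow53 + 1 === twoPow53; // true

// Illustrative helper: read a little-endian u64 as a number, if it fits.
function readU64AsNumber(bytes: Uint8Array, offset: number): number {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const low = view.getUint32(offset, true);
  const high = view.getUint32(offset + 4, true);
  // high * 2^32 + low stays below 2^53 only while high < 2^21.
  if (high >= 0x200000) {
    throw new RangeError("u64 value exceeds 2^53 - 1; use a big-integer type");
  }
  return high * 0x100000000 + low;
}

const smallU64 = new Uint8Array([99, 0, 0, 0, 0, 0, 0, 0]);
const asNumber = readU64AsNumber(smallU64, 0); // 99
```

A value such as u64::MAX (eight 0xff bytes) makes the helper throw, which is the situation a big-integer type like BN is meant to handle.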

More complicated objects like structs become types themselves:

// Rust
struct MyStruct {
  val: u16
}

This would have the equivalent type in TypeScript:

// TypeScript
type MyStruct = {
  val: number;
}
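One consequence of this mapping is that the TypeScript type is wider than the Rust one: number accepts values a u16 cannot hold. The following sketch, with the hypothetical helpers `encodeMyStruct` and `decodeMyStruct`, shows the struct's two-byte little-endian layout together with the range check a client would need before serializing.

```typescript
type MyStruct = { val: number };

// Illustrative helper: serialize MyStruct as a single little-endian u16.
function encodeMyStruct(value: MyStruct): Uint8Array {
  const v = value.val;
  // number is wider than u16, so reject fractions, NaN, and out-of-range values.
  if (Math.floor(v) !== v || v < 0 || v > 0xffff) {
    throw new RangeError("val must be an integer in the u16 range 0..65535");
  }
  const out = new Uint8Array(2);
  new DataView(out.buffer).setUint16(0, v, true);
  return out;
}

// Illustrative helper: read the u16 back into a MyStruct object.
function decodeMyStruct(bytes: Uint8Array): MyStruct {
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  return { val: view.getUint16(0, true) };
}

const structBytes = encodeMyStruct({ val: 513 }); // bytes 01 02
const structBack = decodeMyStruct(structBytes); // { val: 513 }
```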