Solana accounts
Solana has 3 types of accounts:
- Accounts that store data (non-executable)
- Accounts that store programs (executable)
- Accounts that store native programs (executable)
Accounts in general are just buffers for arbitrary data which is stored inside their data field. So in all 3 cases, the data field is just a byte array, which is storing either program bytecode or arbitrary data.
struct Account {
    lamports: u64,
    owner: Pubkey,
    executable: bool,
    data: Vec<u8>,
    rent_epoch: u64,
}
An account is a claim to a specific amount of data storage on the blockchain. The maximum data size is 10 MiB. Each account pays for the privilege of storing data by holding lamports. An account whose balance covers 2 years of rent is rent exempt and never pays anything further.
The minimum overhead of an account is 128 bytes.
// https://github.com/anza-xyz/solana-sdk/blob/master/rent/src/lib.rs#L70C1-L70C47
pub const ACCOUNT_STORAGE_OVERHEAD: u64 = 128;
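To make the rent numbers concrete, here is a minimal sketch of the rent-exempt minimum as plain arithmetic. The overhead constant comes from the source above. The rate of 3,480 lamports per byte per year and the 2 year threshold are assumed defaults, and rent_exempt_minimum is a name made up for this example. On a real cluster you would ask the RPC node for the value instead.

```rust
/// Bytes of bookkeeping overhead charged for every account (from the source above).
pub const ACCOUNT_STORAGE_OVERHEAD: u64 = 128;
/// Assumed default rent rate, in lamports per byte per year.
pub const LAMPORTS_PER_BYTE_YEAR: u64 = 3_480;
/// Assumed default number of years of rent that makes an account exempt.
pub const EXEMPTION_THRESHOLD_YEARS: u64 = 2;

/// Lamports an account must hold to be rent exempt for `data_len` bytes of data.
pub fn rent_exempt_minimum(data_len: u64) -> u64 {
    (ACCOUNT_STORAGE_OVERHEAD + data_len) * LAMPORTS_PER_BYTE_YEAR * EXEMPTION_THRESHOLD_YEARS
}
```

With these assumptions an 82 byte account needs (128 + 82) * 3,480 * 2 = 1,461,600 lamports, and an account with no data at all still needs 890,880.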
When we create an account we tell the blockchain how much space it will need for storing the specific data inside the buffer.
This space is fixed at creation but can be adjusted later using realloc or, more recently, resize.
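As a rough mental model of what resizing does, here is a sketch on a plain Vec<u8>. It is not the real realloc or resize API. The two limits are assumptions based on the runtime's documented constants (10 MiB per account, 10 KiB of growth per instruction), and the real runtime measures growth against the size at the start of the instruction rather than the current length.

```rust
/// Assumed runtime limit on the size of one account's data.
pub const MAX_PERMITTED_DATA_LENGTH: usize = 10 * 1024 * 1024;
/// Assumed runtime limit on how much an account can grow in one instruction.
pub const MAX_PERMITTED_DATA_INCREASE: usize = 10 * 1024;

/// Resize an account's data buffer in place, zero-filling any new bytes.
pub fn resize_account_data(data: &mut Vec<u8>, new_len: usize) -> Result<(), String> {
    if new_len > MAX_PERMITTED_DATA_LENGTH {
        return Err(format!("{} bytes exceeds the per-account maximum", new_len));
    }
    if new_len > data.len() + MAX_PERMITTED_DATA_INCREASE {
        return Err("growth exceeds the per-instruction increase limit".to_string());
    }
    // Growing appends zeroed bytes; shrinking drops bytes from the end.
    data.resize(new_len, 0);
    Ok(())
}
```

Growing an account also raises its rent-exempt minimum, so a real resize usually comes with a lamport top-up.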
Reading accounts
All addresses on Solana uniquely identify an account.
We can access the data from an account by asking an RPC node to fetch it for us.
We provide the address of the account to the RPC node:
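import { createSolanaRpc } from "@solana/kit";
const rpc = createSolanaRpc("http://localhost:8899");
// `mint.address` is the address of the account we want to read.
// Ask for base64 so the data field comes back in the form shown below.
const accountInfo = await rpc.getAccountInfo(mint.address, { encoding: "base64" }).send();
console.log(accountInfo);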
Then it would be kind enough to return us some JSON:
{
context: {
apiVersion: "3.0.7",
slot: 420371843n,
},
value: {
data: [ "AQAAAPR/cf7PNwHaz9JAucJSQH2hzzb7qCflf/FavQxZ6INHAAAAAAAAAAAJAQAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAAA==",
"base64"
],
executable: false,
lamports: 1461600n,
owner: "TokenzQdBNbLqP5VEhdkAS6EPFLC1PHnBqCXEpPxuEb",
rentEpoch: 0n,
space: 82n,
},
}
The response includes the account's data field, which holds the arbitrary data in base64-encoded form, ready to be decoded by a custom decoder.
We need to decode it according to what we know about its data structure. From the outside these data types are opaque, so programs usually publish them through an IDL (interface definition language) file.
Using an IDL we can generate (or hand-write) code that decodes the data into something meaningful:
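import {
  addDecoderSizePrefix,
  decodeAccount,
  getStructDecoder,
  getU32Decoder,
  getUtf8Decoder,
  type Decoder,
} from "@solana/kit";

// Shape of the decoded data. It must match the on-chain layout field for field.
type PersonData = { name: string; age: number };

const personDecoder: Decoder<PersonData> = getStructDecoder([
  ["name", addDecoderSizePrefix(getUtf8Decoder(), getU32Decoder())],
  ["age", getU32Decoder()],
]);

// `account` is an encoded account fetched earlier, e.g. with fetchEncodedAccount.
const decodedAccount = decodeAccount(account, personDecoder);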
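To see what a decoder like this does under the hood, here is a hand-written Rust sketch of the same layout: a u32 length prefix, the name as UTF-8, then a u32 age. PersonData mirrors the TypeScript type above, and decode_person and read_u32_le are names invented for this example, not library functions.

```rust
/// Decoded form of the example person account.
#[derive(Debug, PartialEq)]
pub struct PersonData {
    pub name: String,
    pub age: u32,
}

/// Read a little-endian u32 at `offset`, or None if the buffer is too short.
fn read_u32_le(data: &[u8], offset: usize) -> Option<u32> {
    let b = data.get(offset..offset.checked_add(4)?)?;
    Some(u32::from_le_bytes([b[0], b[1], b[2], b[3]]))
}

/// Decode `[u32 name length][name as UTF-8][u32 age]` from raw account data.
pub fn decode_person(data: &[u8]) -> Option<PersonData> {
    let name_len = read_u32_le(data, 0)? as usize;
    let name_end = 4usize.checked_add(name_len)?;
    let name = std::str::from_utf8(data.get(4..name_end)?).ok()?.to_string();
    let age = read_u32_le(data, name_end)?;
    Some(PersonData { name, age })
}
```

Returning None on short or malformed input matters here, because account data comes from the network and should never be trusted to have the right length.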
Writing accounts
Anyone can read account data, but only an account's designated owner can modify it. This owner is always a program, usually the system program. To control who can act through this program, accounts usually have an authority: an address that the owning program expects to have signed.
This means that if we ask a program to modify one of the accounts it owns, it will only agree if the authority for that account signed the transaction.
All human accounts (like wallets) are owned by the system program. The system program can create more accounts and send lamports.
In practice an account is assigned a new owner once, when the system program hands it to the program that will manage it. This owner is always a program, with permission to control the flow of data and lamports.
When you sign a transaction using an account's private key, that account is marked as a signer by the runtime. Programs can use this information to implement ownership and authority functionality.
You might be curious what we actually write to these accounts. We said they can store arbitrary data but most of the time developers will be using some kind of standard serialization method like Borsh.
#[derive(BorshDeserialize, BorshSerialize, Debug)]
pub struct AddressInfo {
pub name: String,
pub house_number: u8,
pub street: String,
pub city: String
}
Borsh is a binary serialization format. It's how we go from raw bytes back into something Rust is going to understand.
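To show what ends up in the account, here is a sketch that writes the AddressInfo fields out by hand in the layout Borsh uses: fields in declaration order, strings as a u32 little-endian length followed by UTF-8 bytes, and a u8 as a single byte. It does not use the borsh crate. put_string and serialize_address_info are names invented for this example.

```rust
/// Same fields as the AddressInfo struct above.
pub struct AddressInfo {
    pub name: String,
    pub house_number: u8,
    pub street: String,
    pub city: String,
}

/// Append a string as a u32 little-endian length followed by its UTF-8 bytes.
fn put_string(out: &mut Vec<u8>, s: &str) {
    out.extend_from_slice(&(s.len() as u32).to_le_bytes());
    out.extend_from_slice(s.as_bytes());
}

/// Hand-written stand-in for Borsh serialization of AddressInfo.
pub fn serialize_address_info(info: &AddressInfo) -> Vec<u8> {
    let mut out = Vec::new();
    put_string(&mut out, &info.name);
    out.push(info.house_number);
    put_string(&mut out, &info.street);
    put_string(&mut out, &info.city);
    out
}
```

The length of the returned vector is also the space the account needs, which is why strings make account sizing harder than fixed-size fields.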
Creating an account
To create an account you first generate a private/public keypair. Then you need to register that account to the blockchain by invoking the create account instruction on the system program.
To get the keypair on the client we can use Kit:
import { generateKeyPair } from "@solana/keys";
const { privateKey, publicKey } = await generateKeyPair();
This can be confusing: how do we know this is a brand new keypair and not one already in use? We never communicated with the blockchain, so there is no way to check at this point.
It turns out that generating the same keypair as someone else is so unlikely that it is treated as impossible. You can explore https://keys.lol and try it yourself.
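To put a rough number on that claim, here is a back-of-the-envelope sketch using the birthday bound. It assumes about 2^252 distinct ed25519 public keys, which is an approximation, and both function names are made up for this example.

```rust
/// Approximate number of distinct ed25519 public keys (assumed to be about 2^252).
pub fn keyspace_size() -> f64 {
    2f64.powi(252)
}

/// Birthday-bound estimate of the chance that any two of `n` random keys collide.
pub fn collision_probability(n: f64) -> f64 {
    (n * n) / (2.0 * keyspace_size())
}
```

Even with a trillion keys generated (n = 1e12), the estimate comes out far below 1e-50, which is why nobody checks with the blockchain before using a fresh keypair.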
It is common to use the keys generated by the Solana CLI, usually stored in ~/.config/solana/id.json:
import fs from "fs";
import os from "os";
import path from "path";
import { createKeyPairFromBytes } from "@solana/keys";
// Node does not expand "~", so build the path from the home directory.
const keypairPath = path.join(os.homedir(), ".config", "solana", "id.json");
// Get bytes from local keypair file.
const keypairFile = fs.readFileSync(keypairPath);
const keypairBytes = new Uint8Array(JSON.parse(keypairFile.toString()));
// Create a CryptoKeyPair from the bytes.
const { privateKey, publicKey } = await createKeyPairFromBytes(keypairBytes);
Even though we generated a keypair, the account is not actually initialized on the blockchain.
When we want to create a new account on-chain we need to do two things:
- Create the account and allocate its space on-chain
- Initialize the account with its data, which is done by the owner
This owner is important: it decides who can mutate the data the account holds and who can send lamports (SOL) from the account.
So the next thing we need to do is authorize the system program to debit a funding account for the new account's rent, paying enough to cover 2 years so the account is rent exempt.
We do all this through some kind of client.
A client is anything that talks to the blockchain but originates off chain. Usually these are written in a higher level language like JavaScript, but there are plenty of Rust clients as well.
Here is what it looks like with Kit:
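import {
  address,
  appendTransactionMessageInstructions,
  createTransactionMessage,
  generateKeyPairSigner,
  getSignatureFromTransaction,
  pipe,
  sendAndConfirmTransactionFactory,
  setTransactionMessageFeePayerSigner,
  setTransactionMessageLifetimeUsingBlockhash,
  signTransactionMessageWithSigners,
} from "@solana/kit";
import { getCreateAccountInstruction } from "@solana-program/system";
import {
  getInitializeMintInstruction,
  getMintSize,
  TOKEN_PROGRAM_ADDRESS,
} from "@solana-program/token";

describe("Create account", () => {
  // createDefaultSolanaClient is a local test helper that returns { rpc, rpcSubscriptions }.
  const { rpc, rpcSubscriptions } = createDefaultSolanaClient();
  it("Creates the account", async () => {
    // Create signers. The payer must already hold enough lamports for rent and fees.
    const [payer, mint] = await Promise.all([generateKeyPairSigner(), generateKeyPairSigner()]);
    // Work out the space the mint needs and the lamports that make it rent exempt.
    const space = BigInt(getMintSize());
    const lamports = await rpc.getMinimumBalanceForRentExemption(space).send();
    const { value: latestBlockhash } = await rpc.getLatestBlockhash().send();
    // Create the instructions.
    const createAccount = getCreateAccountInstruction({
      payer, // <- TransactionSigner
      newAccount: mint, // <- TransactionSigner
      space,
      lamports,
      programAddress: TOKEN_PROGRAM_ADDRESS,
    });
    const initializeMint = getInitializeMintInstruction({
      mint: mint.address,
      mintAuthority: address("1234..5678"),
      decimals: 2,
    });
    // Create the transaction.
    const transactionMessage = pipe(
      createTransactionMessage({ version: 0 }),
      (tx) => setTransactionMessageFeePayerSigner(payer, tx), // <- TransactionSigner
      (tx) => setTransactionMessageLifetimeUsingBlockhash(latestBlockhash, tx),
      (tx) => appendTransactionMessageInstructions([createAccount, initializeMint], tx),
    );
    // Sign the transaction.
    const signedTransaction = await signTransactionMessageWithSigners(transactionMessage);
    // Create a send and confirm function from your RPC and RPC Subscriptions objects.
    const sendAndConfirm = sendAndConfirmTransactionFactory({ rpc, rpcSubscriptions });
    // The signature is known before sending and can be used to look the transaction up later.
    const transactionSignature = getSignatureFromTransaction(signedTransaction);
    await sendAndConfirm(signedTransaction, { commitment: "confirmed" });
  });
});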
Kit tends to be a lot more verbose than its predecessor, but it's usually expected that you combine these individual functions into your own more familiar utility functions.
Accounts are buffers of bytes
Like we said before, accounts are actually buffers. create_account is basically calloc on the blockchain: it allocates memory for an array of X objects of Y size and initializes every bit to zero.
There is no required structure for the account data. It is just an array of bytes of a specific size.
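The calloc comparison can be sketched with a plain Vec<u8> standing in for the account's data field. The layout here (a u64 counter in the first 8 bytes) is made up for this example, as are the three function names. Any meaning the bytes have comes from the program that reads and writes them.

```rust
/// Allocate account data the way account creation does: `space` bytes, all zero.
pub fn allocate_account_data(space: usize) -> Vec<u8> {
    vec![0u8; space]
}

/// Store a u64 counter in the first 8 bytes (panics if the buffer is shorter).
pub fn write_counter(data: &mut [u8], value: u64) {
    data[..8].copy_from_slice(&value.to_le_bytes());
}

/// Read the counter back from the first 8 bytes (panics if the buffer is shorter).
pub fn read_counter(data: &[u8]) -> u64 {
    let mut bytes = [0u8; 8];
    bytes.copy_from_slice(&data[..8]);
    u64::from_le_bytes(bytes)
}
```

A freshly allocated buffer reads back as a counter of 0, which is handy: zeroed data is often a valid "empty" state for a program's layout.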
A byte is a fixed-size chunk of bits (1's and 0's), in this case 8 bits. Data inside of accounts is stored as an array of these bytes.
let number: u32 = 42;
// Convert to little-endian byte array
let number_bytes = number.to_le_bytes();
This would give us an array of 4 bytes.
Why 4? Well it was a 32 bit number. Each byte is 8 bits.
So 8 * 4 = 32!
Here is what it would look like in bytes:
[0x2A, 0x00, 0x00, 0x00] (Little Endian)
Little endian means that the least significant byte is stored first.
Little endian: least significant byte first
0x1234 is stored as [0x34, 0x12]
Big endian: most significant byte first
0x1234 is stored as [0x12, 0x34]
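So our example from before looks like this in binary, written most significant byte first, the way we write numbers on paper:
42 as a `u32`:
00000000 00000000 00000000 00101010
In the little-endian byte array, the last group (00101010, which is 0x2A) is the byte that gets stored first.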
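The byte orders above can be checked with a few lines of standard-library Rust. bits and le_and_be are small helper names made up for this example.

```rust
/// Render each byte as 8 binary digits, separated by spaces.
pub fn bits(bytes: &[u8]) -> String {
    bytes
        .iter()
        .map(|b| format!("{:08b}", b))
        .collect::<Vec<String>>()
        .join(" ")
}

/// Return the little-endian and big-endian byte arrays for the same number.
pub fn le_and_be(n: u32) -> ([u8; 4], [u8; 4]) {
    (n.to_le_bytes(), n.to_be_bytes())
}
```

Calling le_and_be(42) gives [0x2A, 0, 0, 0] and [0, 0, 0, 0x2A]. Passing the big-endian array to bits produces the row of binary digits shown above, and u32::from_le_bytes turns the little-endian array back into 42.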
Account ownership and authority
Clients use private keys to sign transactions, which marks the corresponding accounts as signers during program execution. This signed state does not have any special semantics in the runtime. It's up to the program to give this signed status meaning.
For example, the SPL Token Program:
- Each token account is owned by the Token Program
- Only the Token Program can change its values
- In each token account, the Token Program stores a field for the address of the authority account which can spend these tokens
- When you transfer, you sign the transaction with the key that corresponds to the authority field
- The program checks during execution that the authority account is marked as a signer and allows the transfer if it is
- The runtime checks that the Token Program owns the token account before letting it change the data
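The checks in the list above can be sketched in plain Rust. Every type and name here (Address, AccountView, TokenAccount, check_transfer) is a simplified stand-in invented for this example, not the SPL Token Program's real code. In particular, the ownership check is done by the runtime on a real cluster and is only folded into the function here to keep everything in one place.

```rust
/// 32-byte address, standing in for a real public key type.
pub type Address = [u8; 32];

/// What a program can see about an account passed to it.
pub struct AccountView {
    pub key: Address,
    pub owner: Address,
    pub is_signer: bool,
}

/// The part of a token account's data that matters for a transfer.
pub struct TokenAccount {
    pub authority: Address,
    pub amount: u64,
}

/// Decide whether `amount` tokens may leave `source`.
pub fn check_transfer(
    program_id: &Address,
    source_view: &AccountView,
    source: &TokenAccount,
    authority: &AccountView,
    amount: u64,
) -> Result<(), &'static str> {
    if &source_view.owner != program_id {
        return Err("source account is not owned by this program");
    }
    if authority.key != source.authority {
        return Err("wrong authority");
    }
    if !authority.is_signer {
        return Err("authority did not sign");
    }
    if source.amount < amount {
        return Err("insufficient funds");
    }
    Ok(())
}
```

The program itself never looks at a signature. It only reads the is_signer flag the runtime set and compares addresses.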
Accounts can be created with an allocated size of 0, which means they store no data. This lets you use them purely for authority and ownership purposes.
Programs are just data stored in accounts. These accounts are flagged as executable and ownership is transferred to a BPF loader program.
PDA (Program Derived Addresses)
A PDA just looks like a public key (in Rust it even has the same Pubkey type, which is a bit of a lie), but it doesn't have a corresponding private key. Solana lets the program that derived the PDA "sign" for it during cross-program invocations using invoke_signed.
PDAs are usually used as a way for programs to own mutable accounts and easily sign for them during transactions.
If you want your program to be able to directly mutate an account's data, then it should be the owner.
Let's say you wanted to store a count for the number of users of your app. Normally someone would need to sign to change the data.
Instead of having a human sign, we can make a program the owner of the account. That way it can affect and make changes without really signing to prove its authority. Instead it has authority because it is the owner.
To do this we derive an address that can be found deterministically from a set of seeds:
Address = hash(["some_string", "another_string"], program_id)
The list of strings passed in first is the set of seeds.
These accounts, by design, don't have private keys. Instead, programs can use these seeds to algorithmically sign for transactions and modify data in accounts that are PDAs.
So PDAs are a special kind of account that a program can sign for without a private key. Instead, they can only be signed by a program. These provide authority and ownership capabilities for your programs.
They enable things like:
- Allowing programs to own tokens
- Allowing a token vault where only the program can withdraw from the vault
A PDA is created by generating an address that is not a valid ed25519 public key, meaning it is not on the curve. This means not every combination of seeds and program ID is usable. There are built-in functions that take a nonce (the bump), starting at 255, and keep decrementing it until the derived address falls off the curve.
To have a program transfer lamports to a user account, you can have the program sign for its own PDA account and use invoke_signed on the system program.
This is in contrast to how you would usually do CPI using just invoke with your data. When you sign with a PDA using invoke_signed you are using the PDA's seeds, including the bump, to sign.
A common pattern is to use addresses derived from a namespace and a user pubkey for efficient key/value mappings, keyed off the user.
A PDA is derived using:
- A program ID (the controlling smart contract).
- A set of seeds (arbitrary byte arrays).
- A bump (a single-byte value, 0–255).
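// The bump is passed as the last seed, as a one-byte slice.
let pda = Pubkey::create_program_address(&[b"my_seed", user_pubkey.as_ref(), &[bump]], &program_id)?;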
When you use the PDA to interact with an owned account you can directly borrow mutable access to the lamports of an account using try_borrow_mut_lamports instead of the normal CPI invocation.
**ctx.accounts.recipient.to_account_info().try_borrow_mut_lamports()? += amount;
Bumps
When you use seeds to derive an address, there is a chance that the result is a valid point on the ed25519 curve. An address that is on the curve could in theory have an associated private key, which would let someone sign for it.
So Solana tacks on an additional integer (the bump) to your seeds list to bump the address off of the ed25519 elliptic curve.
The tricky part is you have to manage this bump throughout the program, so it's good to stick it inside the account data.
Bumps are deterministic values used to find program-derived addresses (PDAs) without requiring a private key. They ensure that a PDA is valid (i.e., does not collide with an actual keypair) while allowing the same address to be derived consistently.
In practice, you can think of your seeds + bump as Solana's version of a hash dictionary:
[b"token", authority.key().as_ref()]
In this seed list we are using the string "token" and the authority of the transaction to create a unique pairing. This could get us the token address for this particular authority if we initialized an account with these seeds.
By knowing the seeds and the bump you can deterministically recreate the address without having to store the address anywhere. We can find any user's account by reusing the same seeds.
So if you wanted to say, store a big list of addresses you could use these seeds to quickly lookup the address you need without knowing it (storing it) ahead of time, which can be expensive and limiting on the blockchain.
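Finding these addresses from seeds is an iterative process: we start at a bump value of 255 and work our way down toward zero, stopping at the first bump that puts the address off the curve:
// The type annotation lets the fixed-size byte string and the pubkey slice share one slice type.
let seeds: &[&[u8]] = &[b"my_seed", user_pubkey.as_ref()];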
let (pda, bump) = Pubkey::find_program_address(seeds, &program_id);
- b"my_seed" is a static seed.
- user_pubkey ensures uniqueness per user.
- bump is found automatically and returned by the iterating function.
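To make the search loop visible, here is a toy sketch of what a function like find_program_address does. It is not the real derivation: the real one hashes with SHA-256 and tests the result against the ed25519 curve. This sketch swaps in a 64-bit FNV-1a hash and a made-up "on curve" rule (top bit clear) so it can run with only the standard library. Every function name is invented for this example.

```rust
/// Stand-in 64-bit FNV-1a hash over all parts, used only for illustration.
fn toy_hash(parts: &[&[u8]]) -> u64 {
    let mut h: u64 = 0xcbf29ce484222325;
    for part in parts {
        for byte in part.iter() {
            h ^= *byte as u64;
            h = h.wrapping_mul(0x100000001b3);
        }
    }
    h
}

/// Stand-in for the curve check: pretend a hash with its top bit clear is "on the curve".
fn toy_is_on_curve(candidate: u64) -> bool {
    candidate >> 63 == 0
}

/// Derive an address from seeds, one bump, and a program id. None means "on curve".
pub fn create_address_sketch(seeds: &[&[u8]], bump: u8, program_id: &[u8]) -> Option<u64> {
    let bump_seed = [bump];
    let mut parts: Vec<&[u8]> = seeds.to_vec();
    parts.push(&bump_seed);
    parts.push(program_id);
    let candidate = toy_hash(&parts);
    if toy_is_on_curve(candidate) {
        None
    } else {
        Some(candidate)
    }
}

/// Try bumps 255, 254, ... 0 and return the first address that is "off curve".
pub fn find_address_sketch(seeds: &[&[u8]], program_id: &[u8]) -> Option<(u64, u8)> {
    (0..=255u8)
        .rev()
        .find_map(|bump| create_address_sketch(seeds, bump, program_id).map(|addr| (addr, bump)))
}
```

The two functions mirror the split between searching for a bump and re-deriving an address from a known bump. Re-deriving does one hash, while the search may do several, which is the reason for storing the bump once it has been found.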
It can be tedious to derive these every time, so a useful pattern is to store the bump inside of the account you create. That way we do not need to search for it again every time we want to sign a transaction.
pub struct MyAccount {
pub bump: u8,
pub data: u64,
}
Storing the bump allows validation later when signing transactions in frameworks like Anchor.