Working with Tezos Smart Contracts as a software engineer

Oct 3, 2022

Tags: testing, tezos, smart contracts, blockchain, 4C

Recently I’ve been tasked with maintenance of the Cambridge Center for Carbon Credits (4C) Tezos smart contracts, taking them from a research project proof-of-concept to something ready to be deployed in production, and so I’ve had to quickly get up to speed on the details of what is, to me, a new domain. I did my blockchain homework when I started at 4C, just to understand what a blockchain is and how Tezos differentiates itself from others, but that was at a very high level. Now I need to understand how to code for it, and that’s what this post is going to cover: a sort of bootstrapping guide for Tezos smart contract development.

As with a lot of young technology ecosystems, the documentation on the web for Tezos development is either sparse, assumes insider knowledge, or is out of date (or is a mix of all three). I imagine these notes will similarly age poorly, but they will at least serve as a reference for me for the near future.

Background

What is a smart contract?

I’m not going to cover "what is a blockchain" or "what is Tezos" here - I’ve written a little about that in the past which you can read if you want that background level of information. I will quickly cover what a Tezos smart contract is though, as I personally found the term a bit nebulous when I started on this. I’ve no idea if other smart contract systems are the same; I’ve only played with the Tezos one to date.

As a technologist, it’s probably best to ignore both the "smart" and "contract" parts of the term, and instead think of them as small bits of server-side code of limited complexity with some associated storage. The code will have one or more API endpoints where you can call it to either read the storage (or values derived from that storage) or to cause the storage state to mutate.

One key difference when working on a smart contract rather than, say, a normal web service comes in terms of concurrency. In a regular web service you’d worry about two parallel calls to your service reading storage, mutating it, and then storing it (aka a race condition). With a smart contract that isn’t an issue, as the entire chain, despite being evaluated on many nodes, is ultimately a single-threaded system, in that all API calls will be evaluated to completion one at a time by each node, and only one node will "win" when it tries to write its results to the chain. Thus all your usual concerns about distributed systems kinda go out the window at the smart contract level, other than that your call may be rejected and you need to try again (and I’m not going to go into details here on this, as this is meant to be a tooling post).

Both the code and the storage are simple compared to what you'd write for your average web-service: the code has to be guaranteed to complete, so you don’t get loops or such, and your storage is limited to structs, sets, lists, and dictionaries.
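To make that mental model a bit more concrete, here is a minimal sketch in TypeScript (not Ligo, and not taken from the 4C contracts) of a contract as a pure function from the current storage and a call to the new storage. The TokenStorage and TokenCall types, the applyCall function, and the account names are all made up for illustration; the reduce at the end stands in for the chain applying calls strictly one at a time.

```typescript
// Hypothetical model of a contract: some storage plus a pure function over it.
type TokenStorage = { owner: string; balances: Map<string, number> };

type TokenCall =
  | { kind: "mint"; sender: string; to: string; amount: number }
  | { kind: "transfer"; sender: string; to: string; amount: number };

// Each call either returns new storage or throws, in which case the chain
// would reject the operation and the storage would stay as it was.
function applyCall(storage: TokenStorage, call: TokenCall): TokenStorage {
  const balances = new Map(storage.balances);
  const credit = (who: string, n: number): void => {
    balances.set(who, (balances.get(who) ?? 0) + n);
  };

  if (call.kind === "mint") {
    if (call.sender !== storage.owner) throw new Error("only the owner can mint");
    credit(call.to, call.amount);
  } else {
    const available = balances.get(call.sender) ?? 0;
    if (available < call.amount) throw new Error("insufficient balance");
    balances.set(call.sender, available - call.amount);
    credit(call.to, call.amount);
  }
  return { owner: storage.owner, balances };
}

// Calls are applied one after another, so there is no interleaving to reason
// about: each call sees exactly the storage left behind by the previous one.
const initialStorage: TokenStorage = { owner: "alice", balances: new Map() };
const pendingCalls: TokenCall[] = [
  { kind: "mint", sender: "alice", to: "bob", amount: 10 },
  { kind: "transfer", sender: "bob", to: "carol", amount: 4 },
];
const finalStorage = pendingCalls.reduce(applyCall, initialStorage);
```

The point of the sketch is only the shape: storage goes in, storage comes out, and a failed check means nothing changes.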

You’re also incentivised to keep both the code and storage simple, as both cost Tez (the Tezos chain currency) when operations on the blockchain take place. Just uploading the initial 4C smart contract, which is just a standard simple token contract, will cost about $1 or so based on the last time I looked at the conversion rates. So you have real money at play (given that most people will need to buy Tez to have any), and a poorly designed contract can become impractically expensive if you’re not careful.

So what can you do with a "smart contract" given it’s really just a very restricted bit of computation logic and storage? Mostly you’re using it to do controlled ledger updates. At 4C we're using it to create a publicly visible and publicly agreed upon state around who has bought carbon credits and who has "retired" them, as in they’ve matched the credit against some CO₂ they have generated, and thus the credit has been consumed. Additionally (and why 4C exists) this information is being stored on the chain with an unremovable reference to the data needed to scientifically evaluate the project that sunk the carbon in the first place. Thus we have some state that says who has generated credits to be bought, who has bought those credits, and when they’ve been retired, and API points to let us carry out those operations (again, if you don’t follow 4C you probably have lots of other questions about why, how, what of all this, but that’s not the point of this post).

For the purposes of 4C the problem of managing all this state and the operations around it has been split into two different contracts - one that manages recording of the generation of carbon credits, and another that manages a per-institution view of consumption of those credits. Which leads to the final thing about smart contracts: API endpoints are either invoked by external entities with a Tez wallet (like a human, or a program/script that a human has let access a wallet) or they can be invoked by other contracts. But don't think that contracts can spontaneously call other contracts: at the root of the chain of calls will be a wallet address - contracts otherwise remain inert. As an aside, contracts can also be assigned Tez currency, which they can store or assign to wallets when some other condition is met, but we don't do any of that in 4C so I'll not cover that further here.

Clear? Maybe? Then let’s press on.

Tezos Terminology crash course

Before I can get on with the interesting stuff, we need to do a dump of terms that crop up here that are probably confusing if you’ve never used Tezos before - they certainly caused me to scratch my head when I started.

Wallet, Address, Hash

Ultimately, whilst in the 4C context we’re trying to think of Tezos as a distributed public ledger to let us store things, it is also at the same time a form of cryptocurrency. I don’t think there’s any escape from this duality, as the incentive for people to maintain the ledger is the reward in said currency. Anyway, this means that your identity on Tezos is called a "wallet", but that is just a fancy name for a public/private key pair that represents you. You can generate wallets quite readily using the tezos-client command line tool:

$ tezos-client gen keys mytestwallet
$ tezos-client list known addresses
mytestwallet: tz1cyDKwRw1CAT1dw7B95eBPqUq456rW6Gsw (unencrypted sk known)

The name you give it is just an alias for you, that never gets stored anywhere. The tz1… is the hash of the key that is used to identify this wallet, and is referred to as the address. Behind it if you look in your ~/.tezos-client folder you’ll find both the public and private keys.

A wallet isn’t particularly useful unless you get some Tez associated with it, as every action on the Tezos blockchain will require some Tez to pay the costs for recording your transactions. Note that any contracts you instantiate on the Tezos blockchains will also get an address, which is how contracts can be used to trigger other contracts, or possess Tez.

I said that a "wallet" represents you, but again it's just a pair of cryptographic keys, so you can generate as many of them as you like (or can keep track of). In testing I end up with several wallets that I assign to the various roles of bits of the system I'm testing.

Faucet

This one confused me no end when I started: a faucet is how you can create test accounts on the publicly accessible Tezos test networks. It felt like a very stretched metaphor to me when I started out, as the faucet would pour out wallets with money in them, not something I've seen happen in real life (yet). Since I started though, that has changed: the faucet is now a thing you give the address of an existing wallet to, and it'll add money to that account.

(A hat tip to Nicolas Ochem for pointing out that by the time I published this post describing a faucet as something that dispensed wallets it had already changed to something that fills wallets).

One thing that confused me early on was thinking I needed wallets per network. Now, this might be good practice for your own sanity's sake, but on the client side there is no state stored, and the wallet as we've discussed is just a public/private key pair, to which there is a note on the ledger saying you have some amount of Tez (or access to a contract, etc.), and so you can actually use the same wallet on any of the test networks, and each will have its own per test network view of what's in that wallet. Again, I'm not advocating this as a recommended approach, but I just want to use it to flag that there's no state that you hold in your wallet on your end, it literally is just a set of keys, and everything else is on-chain.

Hangzhou, Ithaca, Jakarta, Kathmandu, etc.

One of the features of Tezos that differentiates it from other blockchains is that its implementation of how the blockchain works isn’t fixed: over time they will issue updates. The software that runs these has version numbers, but the protocols have names that are alphabetically increasing and named after ancient cities. As I write this the current protocol is Jakarta, and the work-in-progress version is called Kathmandu. This is a good way to try and work out how up to date a given tutorial you’re following is, as they’ll almost always refer to the protocol version at some point because they’ll point you to a docker container with that name or a test network. Speaking of which…

Mainnet, Ghostnet, Testnet, jakartanet, kathmandunet, etc.

We’ll focus a lot on local testing in this article, but you will probably at some point want to test your contract out on a live network that is not the main Tezos blockchain. The network of nodes that form the "real" Tezos blockchain is referred to as "mainnet", but there are a series of other test blockchains out there, with their associated networks of nodes running them, referred to as "testnets". There’s usually one running for the current protocol (so jakartanet), one for the next protocol release (kathmandunet), a long running one that you can use for longer tests (ghostnet), one that is reset daily (dailynet), and one that is reset weekly (mondaynet). Phew!

When you use any of these you need to find the associated faucet to get a wallet that’ll have Tez associated with it. My normal practice is to just get one of those, and then create other wallets for specific jobs I want and transfer some of the Tez over to those using the tezos-client command line tool.

$ tezos-client gen keys mytestwallet
$ tezos-client transfer 42 from faucetWallet to mytestwallet
$ tezos-client get balance for mytestwallet
42 tz

This saves me having to keep getting more tez from the faucet, and is also playing nice - there is a finite amount of currency the faucet can give out on a test network instance.

Indexers and explorers

A lot of people compare a blockchain to a database, which I don't think is a useful comparison, and I'd certainly suggest you don't think of it that way as a developer: whilst there is some storage of data going on here, and transactionality around operations on that data, you’ll quickly realise that on its own Tezos has no way to query the blockchain, because it is just a ledger. There is no active way to search the data on the ledger using just the client interface.

To be able to query the chain like you would a database you need a separate tool: referred to as an indexer or explorer (you'll find the two terms used interchangeably, but I'll stick with indexer). An indexer effectively takes the ledger state and shoves it into an actual database so you can search it (so perhaps we can say a blockchain is a database once you’ve put it into a database… :). Indexers will be continuously watching for the latest transactions so that they can keep relatively up to date, but you will see some lag between you committing an operation on the Tezos blockchain and it being available via an indexer for hopefully obvious reasons.
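As a toy illustration of that idea (and nothing like how tzkt or tzindex are actually implemented), the sketch below treats the ledger as an append-only array and the indexer as something that periodically copies new entries into a lookup table. The ChainOperation shape, the Indexer class, and the tz1…/KT1… strings are all invented placeholders rather than real addresses or real data structures.

```typescript
// Hypothetical, heavily simplified shape of an operation recorded on the chain.
type ChainOperation = { level: number; sender: string; target: string; entrypoint: string };

// The ledger is append-only; on its own you can only walk it in order.
const ledger: ChainOperation[] = [
  { level: 1, sender: "tz1Alice", target: "KT1Token", entrypoint: "mint" },
  { level: 2, sender: "tz1Bob", target: "KT1Token", entrypoint: "transfer" },
  { level: 3, sender: "tz1Alice", target: "KT1Other", entrypoint: "retire" },
];

class Indexer {
  private byTarget = new Map<string, ChainOperation[]>();
  private indexed = 0; // how many ledger entries have been processed so far

  // Copy anything not yet seen into the lookup table.
  sync(chain: ChainOperation[]): void {
    for (const op of chain.slice(this.indexed)) {
      const seen = this.byTarget.get(op.target) ?? [];
      seen.push(op);
      this.byTarget.set(op.target, seen);
    }
    this.indexed = chain.length;
  }

  // The kind of question the ledger alone cannot answer without a full scan.
  callsTo(target: string): ChainOperation[] {
    return this.byTarget.get(target) ?? [];
  }
}

const indexer = new Indexer();
indexer.sync(ledger);
const tokenCalls = indexer.callsTo("KT1Token").map((op) => op.entrypoint);
```

Anything appended to the ledger after the last sync is invisible to callsTo until sync runs again, which is the lag described above.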

There are several public indexers with APIs you can use, the most popular of which is probably tzstats. Tzstats also host indexers for the testnets, but do so with a lower quality of service than they do for mainnet, which is something to watch out for if whatever you're building runs on one of those testnets for internal development testing. If uptime of those is important to you, you probably want to run your own indexer - which you can do using either tzindex (which is what tzstats runs) or tzkt. I tend to use tzkt, as their open source version seems to track protocol updates better, which is important for testing.

How will I interact with my deployed smart contract?

Before we get into working on your smart contract, which will be the remainder of this post, one thing to consider is how you'll be interacting with your smart contract once it has been instantiated on the blockchain. If you're writing certain types of contract, for example one that deals with "tokens", then there are standard API calls you can have your contract support and then people can use existing websites designed to talk to Tezos token contracts to interact with them. But for anything new you're doing, you'll want to be able to call and query your contract from your own services running somewhere else.

Thankfully both the Tezos chain and the indexers such as tzindex and tzkt use simple REST based APIs. Though for Tezos itself I'd suggest you don't want to write your own code for that as you'll need to deal with all the cryptographic signing of requests, so I'd use a library for that. There are libraries for most common languages, though in a mixed state of repair, but for typescript we've been using taquito which is quite well featured and actively maintained.
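For the read-only side, a small sketch of what querying an indexer can look like is below. It assumes a tzkt-style API, where a contract's current storage is served at /v1/contracts/{address}/storage; the function names and the JsonFetcher type are my own inventions, and the HTTP client is passed in rather than hard-coded so the sketch doesn't depend on any particular library.

```typescript
// Anything that can GET a URL and hand back parsed JSON. Injected so that this
// sketch does not assume a specific HTTP client.
type JsonFetcher = (url: string) => Promise<unknown>;

// Build the URL for a contract's current storage on a tzkt-style indexer.
function storageUrl(indexerBase: string, contractAddress: string): string {
  const base = indexerBase.replace(/\/+$/, ""); // tolerate a trailing slash
  return `${base}/v1/contracts/${encodeURIComponent(contractAddress)}/storage`;
}

// Fetch the storage as whatever JSON the indexer returns. The result is left
// as `unknown` because its shape depends entirely on your contract.
async function readStorage(
  fetchJson: JsonFetcher,
  indexerBase: string,
  contractAddress: string
): Promise<unknown> {
  return fetchJson(storageUrl(indexerBase, contractAddress));
}
```

On a runtime with a built-in fetch, the fetcher could be as small as a function that calls fetch(url) and returns the parsed JSON body. Writes are a different matter, since they need signing, which is where a library like taquito earns its keep.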

How do I write a smart contract?

I’m not going to talk much about writing the coding part here, more all the bits that enable you to do that. Partly as I’ve come at the 4C project when the initial contract work was already done, but also because before you write contracts you need tooling in place, and that’s really what this post is about.

A smart contract in Tezos is compiled down to Michelson, which is the assembly language equivalent of the Tezos blockchain. Michelson has a bunch of properties that make it good for this, such as programs expressed in it being formally verifiable, which is desirable if you’re going to attach a bunch of real value to an instantiated smart contract. As with assembly language, you can write Michelson by hand if you like, but that would for most people be a very slow and tedious option, so there are a bunch of high level languages that people use to do that. The main two are SmartPy, which as the name implies is a Python interface for generating contracts, and then there’s Ligo, which is what we use at 4C. Ligo lets you use a bunch of different language styles: Pascal, Javascript, and a couple of flavours of ML. I say flavours because, as Michelson won’t let you express things like for loops, the Ligo languages don’t let you do that either, so they’re not full languages, but rather a similar syntax laid over the Michelson primitives.

At 4C we use CameLigo, which is based on OCaml (which also happens to be the language used to write Tezos itself and something the rest of the 4C team is familiar with, myself excluded). The ligo compiler (which also runs tests etc.) is "meant" to be consumed as a docker container, and if you’re not on Linux (or even if you are) then this is your best bet:

$ docker run --rm -v "$PWD":"$PWD" -w "$PWD" ligolang/ligo:0.50.0 [COMMAND]

But, as they suggest, you can alias that:

$ alias ligo='docker run --rm -v "$PWD":"$PWD" -w "$PWD" ligolang/ligo:0.50.0'

And then you can just do:

$ ligo [COMMAND]

Alternatively they do have static binaries for x86 Linux available as a tar.gz or as a Debian package, but it’s clear from the docs they want you to use the docker container.

There is no ligo REPL environment, which can make development frustrating as you get started but you can evaluate code snippets from the command line at least, which is good as you try to get to grips with this new language.

So, to write your contract you write some code that describes both the API behaviour and state in one of these languages, and then you compile it to Michelson thus:

$ ligo compile contract my_fancy_contract.mligo > my_fancy_contract.michelson

Assuming it compiles successfully, then the output will be the Michelson code that you can then instantiate (or "originate" in Tezos parlance) on the blockchain. But before we go anywhere near mainnet or even a testnet, we’re going to want to do some testing.

Testing your smart contract

Before you put your smart contract near a real Tezos blockchain you should write some tests. One of the most important things to understand about an instantiated smart contract is that it can’t be changed once published, so you want to be really sure that it does what you think it does. You can’t even remove it, as a blockchain only ever lets you add to it, you can never change things added to it in the past. This is why it’s important to consider your contract’s upgrade strategy before you ship (even if that’s just having an API call to your contract to set a bit that stops it accepting future calls), but that’s beyond the scope of this document.
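To illustrate the simplest version of that "set a bit" idea, here is a sketch in TypeScript rather than Ligo, and emphatically not the 4C contract: the storage carries a one-way flag, and once the owner sets it every later call is rejected. The CounterStorage and CounterCall types and the applyCounterCall function are hypothetical names made up for this example.

```typescript
// Hypothetical storage with a one-way "shut down" flag controlled by the owner.
type CounterStorage = { owner: string; shutDown: boolean; count: number };

type CounterCall =
  | { kind: "increment"; sender: string }
  | { kind: "shutdown"; sender: string };

function applyCounterCall(storage: CounterStorage, call: CounterCall): CounterStorage {
  // Once the flag is set every entrypoint fails, including shutdown itself,
  // so the flag can never be cleared again.
  if (storage.shutDown) throw new Error("contract has been shut down");

  if (call.kind === "shutdown") {
    if (call.sender !== storage.owner) throw new Error("only the owner can shut down");
    return { ...storage, shutDown: true };
  }
  return { ...storage, count: storage.count + 1 };
}

const live: CounterStorage = { owner: "alice", shutDown: false, count: 0 };
const afterUse = applyCounterCall(live, { kind: "increment", sender: "bob" });
const afterShutdown = applyCounterCall(afterUse, { kind: "shutdown", sender: "alice" });
```

The old contract and its storage stay on the chain forever; all the flag does is make sure nobody can keep using it by accident once you've moved to a replacement.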

Given all this, perhaps you can see why the designers made it so you can do formal proofs on Michelson if you’re so inclined. But I’m not a formal methods person, so in the near term I want to use the tools I already understand to get confidence in the smart contracts we have at 4C, which means adding lots of tests.

Unit testing

The ligo compiler comes with support for testing (as does, I believe, SmartPy, but I’ve not used that and so won’t be covering it here). I’ve used this to write a growing suite of unit tests to give us confidence that our contract does what we think it does, and in particular trying to cover all the negative use cases that will be less well exercised in the broader 4C manual testing as we spin up full stack prototypes: these negative cases are where we’re likely to get burnt if we’re not careful with permissions checking, testing for overflows, etc.

The ligo documentation for testing is okay, and I’m pleased that it’s covered right there in page one of their tutorial, but as a software engineer looking for the equivalent of pytest or XCTest I was a little lost as to how to build and run a proper test suite. But by reading through example code and Ligo's github repository I finally got the tools I wanted. I can’t take credit for any of this, it’s just bits I’ve scraped from others, but hopefully this set of things is enough for you to start writing unit tests.

A test suite in ligo is just a ligo file that contains a series of functions whose name starts with test, which will be evaluated when you call that file with:

$ ligo run test my_test_suite.mligo

Obviously the tests don’t want to depend on a chain instance somewhere, so there is a Test module in the ligo library that you can use to make calls that would normally go onto the chain. The most immediately useful of these are:

Test.reset_state - this lets you generate a set of wallets to use for your tests on the internal test chain
Test.nth_bootstrap_account - lets you get one of the wallets you generated above
Test.originate_from_file - loads your ligo contract onto the internal test chain
Test.cast_address - originate will return a chain address, but often you want a typed version of that to let you call your contract’s API, and so this call lets you do that
Test.to_entrypoint - lets you get an API entrypoint on the contract
Test.set_source - this sets a global that is used for chain operations, as you can’t specify that per call. This to me is a bit weak, as I’m likely to forget to do this in my tests being a weak human (which is why I need tests), but it is what it is, so be careful
Test.transfer_to_contract - lets you finally call that entry point. This will return a test_exec_result object which you can then use to reason about whether your call worked
Test.get_storage - lets you query the contract’s current storage, so you can look to see if your call did what you expected

As you can perhaps infer, invoking a method on a smart contract in your tests is a lot more detailed than just calling a method on an object in say a python test or such. To help deal with this I tend to have a file of common boilerplate calls to reset the chain, set up the default storage for my contract, instantiate some wallets, originate the contract etc. so that my test code doesn’t become un-readable, and I define some structs to make it easier to pass around the information you need per contract (address of wallet that controls the contract, address of contract, type casted version of address).

// tests/common.mligo

// The types "entrypoint" and "storage" will be loaded from my contract, but
// we rename them in case we load multiple contracts
#include "../src/mycontract.mligo"
type my_storage = storage
type my_entrypoint = entrypoint

type mycontract_info = {
	owner: address;
	contract_address: address;
	contract: (my_entrypoint, my_storage) typed_address;
}

let mycontract_bootstrap (accounts_needed : nat) : mycontract_info =
	// Reset the chain and take the first account for the contract owner
	let owner =
		let _ = Test.reset_state accounts_needed ([] : tez list) in
		Test.nth_bootstrap_account 0 in

	// ensure this wallet is used to set up the contract
	let _ : unit = Test.set_source owner in

	// set up the initial contract storage
	let my_contract_storage : my_storage = {
		owner = owner;
		count = 42n;
	} in

	// load the contract and get the typed address
	let (addr, _, _) =
		Test.originate_from_file "../src/mycontract.mligo" "main" []
			(Test.compile_value my_contract_storage) 0tez in
	let typed_contract : (my_entrypoint, my_storage) typed_address =
		Test.cast_address addr in

	// return value
	{ owner = owner; contract = typed_contract; contract_address = addr; }

Then at the start of each test I just invoke this to give me my preconditions without a lot of distracting boilerplate code:

#import "./common.mligo" "Common"

let test_something =
	let test_contract = Common.mycontract_bootstrap(3n) in
      …

To evaluate whether your contract works or not, there are mostly two tools you can use:

Test.failwith - this forces a test to end, and you can pass it a string to help you work out what was going on
assert - as you’d expect this takes an expression and blows up if it evaluates to false

There’s no standard library of the helpers you see in many test suites (test equals, test greater than, test is nil, etc.); you need to add those yourself, or at least borrow them from the sample code. I did a mix of both, in another common file.

And with this I now have a way to build up a suite of unit tests to start making me believe this contract does what I think it does.

Integration testing

Whilst the unit tests are my first port of call, I also know that they don’t tell me that I’ve made something deployable, and so for that I also run a set of integration tests that use docker containers to build up a sandbox test network and prod my contract that way. For this to work you’ll need at least three docker containers:

A Tezos sandbox - we use Flextesa
An indexer - as mentioned above we use tzkt
A place to run your tests

Whilst tzindex will run in a single container, tzkt actually uses three containers: one to run Postgres for storage, and then it splits up the crawler and query API into different services too, so I actually end up running five containers in this scenario, as per this mock docker-compose.yml:

version: "3.7"

services:
    tezossandbox:
        build:
            context: ci/flextesa
        command: kathmandubox start

    tzkt-db:
        restart: always
        image: postgres:13
        environment:
            POSTGRES_USER: ${POSTGRES_USER:-tzkt}
            POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-qwerty}
            POSTGRES_DB: ${POSTGRES_DB:-tzkt_db}

    tzkt-api:
        build:
            context: ci/tzkt
            dockerfile: Tzkt.Api/Dockerfile
        restart: on-failure:5
        depends_on:
            - tzkt-sync
        environment:
            ConnectionStrings__DefaultConnection: >
                host=tzkt-db;
                port=5432;
                database=${POSTGRES_DB:-tzkt_db};
                username=${POSTGRES_USER:-tzkt};
                password=${POSTGRES_PASSWORD:-qwerty};
            Kestrel__Endpoints__Http__Url: http://0.0.0.0:5000

    tzkt-sync:
        build:
            context: ci/tzkt
            dockerfile: Tzkt.Sync/Dockerfile
        restart: on-failure:5
        depends_on:
            - tezossandbox
            - tzkt-db
        environment:
            ConnectionStrings__DefaultConnection: >
                host=tzkt-db;
                port=5432;
                database=${POSTGRES_DB:-tzkt_db};
                username=${POSTGRES_USER:-tzkt};
                password=${POSTGRES_PASSWORD:-qwerty};
            TezosNode__Endpoint: http://tezossandbox:20000

    test:
        build:
            context: .
            dockerfile: Dockerfile.test
        environment:
            - TEZOS_CLIENT_UNSAFE_DISABLE_DISCLAIMER=yes
            - TEZOS_RPC_HOST=http://tezossandbox:20000
            - TEZOS_INDEX_HOST=http://tzkt-api:5000
        depends_on:
            - signatory  # service definition omitted from this example, see the notes below
            - tezossandbox
            - tzkt-api
            - tzkt-sync
        command: /app/integration_tests.sh

Three notes on this:

There's no faucet in this set-up: the sandbox comes with two well known test wallets which have all the tez in the sandbox chain on them. The sandbox readme describes these wallets, including their secret keys, so you can just transfer tez from those to your own test scenario wallets.
I actually run a sixth service, omitted here for sake of compactness of the example, which contains Signatory. Signatory lets you protect your deployed wallets using a Hardware Security Module or Azure/AWS Key Management Services, which is what you'll need to safely deploy things when going into production. Wallet security is beyond the scope of this document, but just please never use private keys stored in files on servers that face the public internet.
I’m building the images for sandbox and indexer myself because not all images on docker hub support both amd64 and aarch64, and because I’m testing on Kathmandu, the next protocol release at the time I write this, and so I’m not running ”latest” either.

For the integration tests themselves, those are written in typescript using taquito. Now, typescript isn’t my favourite language, but at 4C we need to write some code to let us query our contracts from multiple places, including the browser, and so typescript is one option where we can have a library of common non-ligo code.

In fact, this is a related tip: we built a command line wrapper for our contract in typescript as otherwise you’d need to use tezos-client and worry about constructing Michelson types at the command line. For example, contrast:

$ tezos-client transfer 0 from MyOwner to MyContract \
    --entrypoint dostuff \
    --arg '{Pair (Pair "tz1hcU2KvWpiFWRJH2WbK6QZZVkrpUGNLMXq" 1000) 42}' \
    --burn-cap 0.1

With my cli tool that I wrote (where the contract name is implicit if I find there’s only one in my config):

$ x4c dostuff MyOwner SomeOtherAddress 42 1000

As your storage types get more complex, tezos-client becomes very tedious to use, and so we have a command line wrapper for every contract call. Also tezos-client doesn’t talk to indexers, but our command line tool does, so I can also use it to query storage state.
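To give a flavour of what such a wrapper saves you from, here is a small sketch of turning typed arguments into the Michelson expression from the tezos-client example above. This is not the actual x4c code: the DoStuffArgs type and its field names are guesses made up for illustration (the post doesn't say what the two numbers mean), and in practice a library like taquito would do this encoding for you.

```typescript
// Hypothetical arguments for the `dostuff` entrypoint; the field names are
// invented, only the overall Pair structure comes from the example above.
type DoStuffArgs = { address: string; amount: number; tokenId: number };

// Michelson string literals are double-quoted, so escape backslashes and quotes.
function michelsonString(value: string): string {
  return `"${value.replace(/\\/g, "\\\\").replace(/"/g, '\\"')}"`;
}

// Refuse anything that is not a whole number rather than emit a bad literal.
function michelsonInt(value: number): string {
  if (!Number.isSafeInteger(value)) throw new Error(`not an integer: ${value}`);
  return value.toString();
}

// Produces the same shape as the hand-written --arg:
//   {Pair (Pair "<address>" <amount>) <tokenId>}
function doStuffArg(args: DoStuffArgs): string {
  const inner = `Pair ${michelsonString(args.address)} ${michelsonInt(args.amount)}`;
  return `{Pair (${inner}) ${michelsonInt(args.tokenId)}}`;
}

const encodedArg = doStuffArg({
  address: "tz1hcU2KvWpiFWRJH2WbK6QZZVkrpUGNLMXq",
  amount: 1000,
  tokenId: 42,
});
```

Having one function per entrypoint like this means a typo in the nesting of Pairs gets caught once, in code, rather than every time someone types the command by hand.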

Once you have these tools to make your life easier you can then just write simple integration scripts that run in the above docker environment, where I use a mix of tezos-client to bootstrap some test wallets, and my cli client to run operations and then also my cli client to check that the storage (via the indexer) is what I now expect.
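One wrinkle with checking storage via the indexer is the lag mentioned earlier: straight after an operation the indexer may not have caught up yet. A minimal sketch of how a test script might cope with that is to poll until the expected state shows up or give up after a few attempts. The waitFor helper and its parameters are hypothetical, not part of taquito or tzkt, and the reader function is whatever your own tooling uses to query the indexer.

```typescript
// Poll `read` until `isReady` accepts its result, pausing between attempts,
// and fail loudly if the expected state never shows up.
async function waitFor<T>(
  read: () => Promise<T>,
  isReady: (value: T) => boolean,
  attempts: number,
  delayMs: number
): Promise<T> {
  for (let attempt = 0; attempt < attempts; attempt++) {
    if (attempt > 0) {
      // Give the indexer some time to catch up before asking again.
      await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
    }
    const value = await read();
    if (isReady(value)) return value;
  }
  throw new Error(`condition not met after ${attempts} attempts`);
}
```

In an integration test the read function would fetch the contract storage from the indexer, and isReady would compare it against what the operation you just sent should have produced.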

Summary

This was a whirlwind tour of trying to work with Tezos as a new developer who doesn’t really know much about blockchain things beyond the basic concepts. Getting to this stage has taken me a while reading through source code and chatting to people on the Tezos Slack - though unfortunately that is currently invite only. I have found though that when I’ve raised issues people in the community have been fast to respond, which helps a lot, but does require you have an in. Hopefully this guide will help someone on their way, or just myself in a few months when I need to set up some new contracts for 4C :)

Tech notes by Michael Winston Dales