SCF #12
Award Completed
DYET (Do you even test?)

AI fuzzer that finds bugs and vulnerabilities in your Soroban smart contracts.

Awarded
Budget request: $15,000*
Website / Code

Project Stage

Development

Category

Soroban
Tools

Team size

1

Products & Services

There have been a lot of bugs and vulnerabilities in smart contracts on other chains, which have led to significant losses. Having an automatic and easy way of testing, or of writing tests, might give Soroban smart contracts an edge against the competition by minimising losses in the ecosystem and community.

We propose an AI fuzzer and test generator for Soroban smart contracts. It would be a model trained with reinforcement learning whose goal is to cover every code block in the fewest possible calls. Later, a detector and parser of the blockchain state would be needed to determine whether the state is valid, and whether certain accounts received funds they weren't supposed to.

We see many ways to explore this problem:

- Aim for a model that covers most of the lines/blocks of code in the fewest possible calls (a rough sketch of this coverage objective follows this list).

- Use the training process as fuzzing.

- Use user annotations and code parsers as an additional source of information for training and fuzzing. The most important information includes code branching and reading or writing to memory.
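
A minimal sketch of this coverage objective, in Python. Nothing here is Soroban-specific: `execute_call` stands in for running a contract call in a sandbox and reporting which code blocks it hit, and the per-call cost is an illustrative constant, not a value from the project.

```python
# Illustrative sketch of a coverage-based reward: a call earns points for every
# new block it reaches and pays a small fixed cost for being made at all.
from typing import Callable, Iterable, Set, Tuple

CALL_COST = 0.05  # small penalty per call, so fewer calls score higher


def coverage_reward(covered_before: Set[int], covered_after: Set[int]) -> float:
    """Reward = number of newly covered blocks, minus the per-call cost."""
    return len(covered_after - covered_before) - CALL_COST


def score_call_sequence(
    execute_call: Callable[[Tuple[int, int]], Iterable[int]],
    calls: Iterable[Tuple[int, int]],
) -> float:
    """Score a sequence of (function_index, argument) calls by coverage gained."""
    covered: Set[int] = set()
    total = 0.0
    for call in calls:
        hit = set(execute_call(call))  # blocks reached by this call
        total += coverage_reward(covered, covered | hit)
        covered |= hit
    return total
```

A model trained with reinforcement learning (or any other search procedure) would then try to maximise this episode score, which favours full coverage in as few calls as possible.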

Previous Project(s)
SCF #10
Award recipient
Vitreous
Transparent fundraising platform
Progress so far

We have explored reinforcement learning and GPT-4 with the goal of testing Soroban contracts.

Goals
To get there, we request a budget of $15,000* to:
Additional information

I’m also the founder of Vitreous, which received some funds from SCF #10, but I would like to work on this solo project in parallel with Vitreous.

I’ve already spent some time on this and was able to cover some trivial pieces of code with tests using a model trained with reinforcement learning, and I would like to expand this into a helpful tool for Soroban.

Update

The initial idea was to use reinforcement learning to train a fuzzer for Soroban smart contracts. However, with the release of GPT-4, we've been able to achieve much better results compared to training our own models. We also discovered that training by running a Soroban contract in a sandbox was extremely slow, and we haven't found a solution to this problem so far.

The trivial contract used to train a proof-of-concept fuzzer consists of n functions and 2n features; each function i needs to be called twice, with arguments equal to features[2i] and features[2i + 1]. This is designed to emulate code branching. We parameterized the number of functions and the maximum feature value to increase or decrease the difficulty of this simple game/puzzle. For example, if the contract is initialized with n=2 and features=[2, 5, 3, 0], then the calls required to maximize its coverage would be: (i=0, arg=2), (i=0, arg=5), (i=1, arg=3), (i=1, arg=0).
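
To make the branching rule concrete, here is a hypothetical Python model of this puzzle, in the spirit of the trivial contract written in Python that we use for RL training (see the first deliverable below); the class and method names are illustrative and not taken from the repository.

```python
# Hypothetical Python model of the trivial training contract described above.
# Function i has two branches, reached only when it is called with the
# arguments features[2*i] and features[2*i + 1] respectively.
from typing import List, Set, Tuple


class TrivialContract:
    def __init__(self, n: int, features: List[int]):
        assert len(features) == 2 * n
        self.n = n
        self.features = features
        self.covered: Set[Tuple[int, int]] = set()  # (function index, branch index)

    def call(self, i: int, arg: int) -> None:
        """Call function i with `arg` and record any branch this call exercises."""
        for branch in (0, 1):
            if arg == self.features[2 * i + branch]:
                self.covered.add((i, branch))

    def coverage(self) -> float:
        """Fraction of the 2n branches exercised so far."""
        return len(self.covered) / (2 * self.n)


# The worked example from the text: n=2, features=[2, 5, 3, 0].
contract = TrivialContract(n=2, features=[2, 5, 3, 0])
for i, arg in [(0, 2), (0, 5), (1, 3), (1, 0)]:
    contract.call(i, arg)
assert contract.coverage() == 1.0  # four calls reach full coverage
```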

GPT-4 was able to:

  1. Solve our training Soroban smart contract (https://github.com/matusv/DYET/blob/main/soroban/dyet-test/src/lib.rs), meaning it made the correct calls to maximize its coverage given the initial features.
  2. Generate tests for this smart contract (https://github.com/matusv/DYET/blob/main/soroban/dyet-test/src/test.rs).
  3. Refactor the code. The original unrefactored contract can be found here: https://github.com/matusv/DYET/blob/main/soroban/dyet-test/src/old_lib.rs

We believe that using GPT-4 is the way forward in this project. In order to build a product, we will need access to the GPT-4 API, which we don't have at the moment.
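
As a rough illustration of what that integration could look like, here is a minimal sketch using the OpenAI Python client (openai>=1.0 style) to ask GPT-4 for a test module; the prompt wording is an assumption for the example and is not the project's actual prompt or pipeline.

```python
# Hypothetical sketch: send a Soroban contract's source to GPT-4 and ask for a
# Rust test module aiming for full branch coverage. Requires OPENAI_API_KEY.
from pathlib import Path

from openai import OpenAI

client = OpenAI()

contract_source = Path("soroban/dyet-test/src/lib.rs").read_text()

response = client.chat.completions.create(
    model="gpt-4",
    messages=[
        {
            "role": "system",
            "content": (
                "You write Rust unit tests for Soroban smart contracts. "
                "Aim to execute every branch of the contract at least once."
            ),
        },
        {
            "role": "user",
            "content": f"Generate a #[cfg(test)] module for this contract:\n\n{contract_source}",
        },
    ],
)

print(response.choices[0].message.content)  # candidate test module, to be reviewed and run
```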

Deliverables
First Deliverable

Show, on trivial and handpicked Soroban smart contracts, that we can achieve significant block/line coverage by automatically generating inputs or tests with a reinforcement learning model.

Simply put, create a tool that makes sure that each block/line of a smart contract is run at least once.

Since reinforcement learning is really challenging, I also want to make it clear that the first deliverable won't be a functioning product but a proof of concept.


Reviewer instructions

We'll provide a Docker image or a Google Colab notebook, where it will be possible to retrain a model and check what tests and calls are being generated.

Update

For the first deliverable, we have prepared a Dockerised Jupyter notebook that uses RL training to find the correct calls to maximise coverage of the trivial contract written in Python. Since training by running the Soroban contract directly was extremely slow, we can't really show any results for that setup. You can view the notebook directly here: https://github.com/matusv/DYET/blob/main/poi.ipynb

On the other hand, GPT-4 was able to generate the correct calls for the trivial contract written in Rust. In addition to that, it also generated tests for the contract and refactored it.

Original contract: https://github.com/matusv/DYET/blob/main/soroban/dyet-test/src/old_lib.rs

Refactored contract: https://github.com/matusv/DYET/blob/main/soroban/dyet-test/src/lib.rs

Tests: https://github.com/matusv/DYET/blob/main/soroban/dyet-test/src/test.rs

Team

Matus Vojcik (matus#8231)

Data scientist

This project is the perfect intersection of my interests - AI and crypto.

GitHub / LinkedIn