This example shows you how to easily create (integration) test cases for your Burr project.
With Burr it is easy to foster a test-driven development mindset when developing actions, the building blocks of your Burr application.
Note: writing test cases for GenAI projects can be tricky. The same LLM API calls can result in different outputs. This means that "exact" equality tests may not work, and you'll need to resort to fuzzier tests like checking for the presence of certain words or phrases, or using LLMs to grade the output, etc. We aren't opinionated on how you do this, but in any case you'll need to write a test case to exercise things, and this is what we're showing you how to do here.
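For instance, a fuzzy test might assert that the required phrases appear somewhere in the LLM's output rather than comparing strings exactly. A minimal sketch (the test name, stubbed reply, and required phrases are all hypothetical, for illustration only):

```python
def contains_all(text: str, phrases: list[str]) -> bool:
    """Fuzzy check: does the output mention every required phrase?"""
    lowered = text.lower()
    return all(p.lower() in lowered for p in phrases)


def test_reply_mentions_refund_policy():
    # In a real test this would call your LLM-backed action;
    # here we stub the reply for illustration.
    reply = "Per our refund policy, you can return items within 30 days."
    assert contains_all(reply, ["refund policy", "30 days"])
```

A check like this stays stable across runs even when the model rephrases its answer, as long as the key facts remain present.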
The non-deterministic behavior of LLMs can affect how your application behaves, and it is easy to tweak a prompt and break a particular use case. It is therefore important to create test cases that ensure your application is robust and behaves as expected. Part of the struggle is creating test cases that are representative of your application's real-world behavior, so this example shows you how to create test cases from real traces -- Burr can make this process easier.
For step (5) above, use the `burr-test-case create` command, passing the project name, partition key, app ID, and sequence ID as arguments. This is the corresponding command:
```sh
%%sh
burr-test-case create \
  --project-name "SOME_NAME" \
  --partition-key "SOME_KEY" \
  --app-id "SOME_ID" \
  --sequence-id 0 \
  --target-file-name /tmp/test-case.json
```
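Once the captured state is serialized to JSON, a test can load it as a fixture and run an action's logic against it. The following is a minimal sketch under stated assumptions: the `input_state`/`expected_state` field names and the `capitalize_name` action are hypothetical stand-ins, not Burr's actual schema or API -- consult the generated file for the real structure.

```python
import json
import tempfile


def capitalize_name(state: dict) -> dict:
    # Stand-in for an action's core logic (illustrative only).
    return {**state, "name": state["name"].title()}


# Pretend this JSON came from `burr-test-case create`
# (field names here are hypothetical, not Burr's actual schema).
fixture = {
    "input_state": {"name": "ada lovelace"},
    "expected_state": {"name": "Ada Lovelace"},
}
with tempfile.NamedTemporaryFile("w", suffix=".json", delete=False) as f:
    json.dump(fixture, f)
    fixture_path = f.name


def test_capitalize_name():
    # Load the serialized trace and replay the action against it.
    with open(fixture_path) as fh:
        case = json.load(fh)
    result = capitalize_name(case["input_state"])
    assert result == case["expected_state"]
```

The point of this pattern is that the fixture is captured from a real trace rather than hand-written, so the test exercises state your application actually saw.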
See the notebook for example usage and a walkthrough of how things work. In `test_application.py` you'll find example tests for a simple action defined in `application.py`.
We see many more improvements here: