You can start a new experiment from the Experiment List view.

Set up an Experiment:

Step 1: Click the Start New button.

This takes you to the Start Experiment page, where you can configure your experiment, such as choosing the model and its parameters.

Step 2: Click the Run button after you complete the configuration.

There are two types of experiments: “Single Test” and “Side by Side Test”. A “Side by Side” test lets you compare two variants in a single experiment, while a “Single Test” is simpler and more efficient for testing a single change or variant.

  • How to run a Single Test:
    Fill in the required parameters in the “Test Config” section, then click the “Run” button (see the configuration sketch after this list).

  • How to run a Side by Side (SxS) Test:

    1. Toggle on the “Side by Side Test” switch.
    2. Configure the “Baseline Config” section: choose the AI model or prompt to compare against your Test Config.
    3. Click “Run” to start the SxS experiment.
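
To make the difference between the two modes concrete, the fields on this page map onto two experiment shapes roughly like the sketch below. This is a minimal illustration only: all field names (side_by_side, test_config, baseline_config, the model IDs, and so on) are hypothetical and meant to show the shape of each experiment type, not the product’s actual schema or API.

```python
# Hypothetical sketch of the two experiment shapes; every field name
# here is illustrative, not an actual API.

single_test = {
    "name": "prompt-v2-single",
    "side_by_side": False,             # the "Side by Side Test" toggle is off
    "test_config": {
        "model": "example-model-v1",   # hypothetical model ID
        "prompt": "Classify the sentiment of: {query}",
        "temperature": 0.2,
    },
}

sxs_test = {
    "name": "prompt-v2-vs-baseline",
    "side_by_side": True,              # the "Side by Side Test" toggle is on
    "test_config": {
        "model": "example-model-v2",
        "prompt": "Classify the sentiment of: {query}",
        "temperature": 0.2,
    },
    "baseline_config": {               # what the test variant is compared against
        "model": "example-model-v1",
        "prompt": "Classify the sentiment of: {query}",
        "temperature": 0.2,
    },
}
```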

View Experiment Detail:

After you start an experiment, you will be taken to the “Experiment Detail” page, where you can view the job’s running status and the test result for each query. A metrics summary table is shown under the Overview section after the experiment run is complete.

The metrics currently supported include Accuracy and F1 score for classification, as well as system metrics such as Latency and Cost.
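
For reference, Accuracy is the fraction of correct predictions, and F1 is the harmonic mean of precision and recall. The sketch below shows how the same summary could be reproduced offline from per-query results using scikit-learn; the per-query result format here is an assumption for illustration, not the product’s export schema.

```python
# Minimal sketch: recompute the summary metrics offline from per-query
# results. The (expected, predicted, latency, cost) tuple format is
# assumed for illustration only.
from sklearn.metrics import accuracy_score, f1_score

results = [
    # (expected_label, predicted_label, latency_seconds, cost_dollars)
    ("positive", "positive", 0.84, 0.0021),
    ("negative", "positive", 1.02, 0.0019),
    ("negative", "negative", 0.77, 0.0020),
]

y_true = [r[0] for r in results]
y_pred = [r[1] for r in results]

print("Accuracy:", accuracy_score(y_true, y_pred))
print("F1:", f1_score(y_true, y_pred, pos_label="positive"))
print("Avg latency (s):", sum(r[2] for r in results) / len(results))
print("Total cost ($):", sum(r[3] for r in results))
```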

We will support more metrics, including user-defined custom metrics, in future releases.

Clone Experiment:

On the Experiment Detail page, you can click the Clone button to rerun an experiment with the same configuration, or make small tweaks to it to start a new experiment.
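
Conceptually, cloning deep-copies the original configuration into a new draft that you can edit before running again. A minimal sketch, reusing the hypothetical config shape from the earlier illustration:

```python
import copy

# Hypothetical sketch: cloning amounts to deep-copying the original
# experiment configuration and tweaking it before rerunning.
original = {
    "name": "prompt-v2-single",
    "side_by_side": False,
    "test_config": {"model": "example-model-v1", "temperature": 0.2},
}

cloned = copy.deepcopy(original)
cloned["name"] = "prompt-v2-single-rerun"
cloned["test_config"]["temperature"] = 0.0  # one small tweak before rerunning
```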