Run an experiment to evaluate your fine-tuned models.
Experiments let you evaluate the performance of your fine-tunes on specific datasets, giving you insight into how well your models perform relative to other models.
You’ll need a dataset uploaded to Montelo to get started. Remember, an experiment is always associated with a dataset.
You’ll also need a `runner` and an `evaluator` function; a simple example of each follows below.
The `runner` function takes the datapoint’s input and returns the output. Here’s the most basic example of this:
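A minimal sketch in TypeScript, assuming a dataset whose input has a `question` field (that field name is purely illustrative, so adapt it to your dataset’s schema):

```typescript
// A minimal runner sketch. The input shape (a `question` field) is an
// assumption for illustration; it should mirror your dataset's schema.
type Input = { question: string };

const runner = async (input: Input) => {
  // Any logic works here: call an LLM, hit an external service, or just
  // compute something locally. Return the output you want evaluated.
  return { answer: `Echo: ${input.question}` };
};
```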
The `runner` can be any function. You could call an external service, an internal API, or even a really simple function like the one above.
The `evaluator` function takes the datapoint’s expected output and the output that the `runner` returned, and runs an evaluation on them. Again, you can run any evaluation you want!
Note that you must return an object, not a primitive, for evaluators. Here’s an example:
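A minimal sketch; the signature below (expected output first, then the runner’s output) is an assumption for illustration, so check the SDK reference for the exact parameter shape:

```typescript
// A minimal evaluator sketch. The argument order is an assumption for
// illustration. Note the return value is an object, not a primitive.
type Output = { answer: string };

const evaluator = async (expectedOutput: Output, actualOutput: Output) => {
  // Any scoring logic works; here we compute a simple exact-match flag.
  return { exactMatch: expectedOutput.answer === actualOutput.answer };
};
```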
Your evaluator can be any function! We’ll take the object that you return, compute statistics on it, and create charts for you.
Now that you have a `runner` and an `evaluator`, use the `createAndRun` function to create an experiment and run it.
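A sketch of what this could look like, assuming a `Montelo` client that exposes `createAndRun` under an `experiments` namespace; the package import, client construction, and field names are assumptions based on this page rather than confirmed SDK details:

```typescript
import { Montelo } from "montelo"; // package name is an assumption

const montelo = new Montelo();

// Create the experiment and run it, using the runner and evaluator
// defined above. The dataset slug and field names are illustrative.
await montelo.experiments.createAndRun({
  name: "My fine-tune vs. baseline",
  dataset: "my-dataset",
  runner,
  evaluator,
});
```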
And that’s all you need! Behind the scenes, Montelo will pull all the datapoints in the dataset and run the `runner` and `evaluator` on each datapoint. We’ll inject each datapoint’s input, metadata, and expected output into your `runner` and `evaluator`, and track the data that you return.
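Roughly, the shapes involved look like this (the type names and exact signatures below are illustrative assumptions, not the SDK’s exported types):

```typescript
// Illustrative sketch of the data Montelo injects per datapoint; these
// are not the SDK's actual exported types.
type DatapointInput = Record<string, unknown>;
type DatapointMetadata = Record<string, unknown>;

// The runner receives the input (plus metadata); the evaluator also
// receives the expected output and whatever the runner returned.
type Runner = (
  input: DatapointInput,
  metadata?: DatapointMetadata,
) => Promise<object>;

type Evaluator = (
  expectedOutput: object,
  actualOutput: object,
) => Promise<object>;
```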
You can pass in `options` to the `createAndRun` function:

- Concurrency: how many datapoints to run concurrently, to speed up the time it takes to finish the experiment.
- Test split only: run the experiment only on the test datapoints, skipping the train datapoints.
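For example, continuing the sketch above (the option keys `concurrency` and `onlyTestSplit` are assumptions for illustration; check the SDK reference for the real names):

```typescript
// Option keys below are illustrative assumptions, not confirmed SDK names.
await montelo.experiments.createAndRun({
  name: "My fine-tune vs. baseline",
  dataset: "my-dataset",
  runner,
  evaluator,
  options: {
    concurrency: 10,     // run 10 datapoints at a time
    onlyTestSplit: true, // skip the train datapoints
  },
});
```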