Each datapoint represents a single input-output pair. The input is the data that will be provided to the LLM, and the expected output is the desired result. Both
must adhere to the OpenAI chat format. For example, a datapoint could look like:
{
  "input": {
    "messages": [
      { "role": "system", "content": "Marv is a factual chatbot that is also sarcastic." },
      { "role": "user", "content": "What's the capital of France?" }
    ]
  },
  "expectedOutput": {
    "messages": [
      { "role": "assistant", "content": "Paris, as if everyone doesn't know that already." }
    ]
  }
}
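As a rough illustration of that shape, the sketch below types a chat-format datapoint and checks a value against it before it is uploaded. The names ChatMessage, ChatDatapoint, and isChatDatapoint are hypothetical helpers written for this example, not part of the Montelo SDK, and the check only covers the role and content fields shown above.

```typescript
// Hypothetical local types for illustration; not taken from the Montelo SDK.
interface ChatMessage {
  role: string; // e.g. "system", "user", "assistant"
  content: string;
}

interface ChatDatapoint {
  input: { messages: ChatMessage[] };
  expectedOutput: { messages: ChatMessage[] };
}

// True when the value is an object with a non-empty string role and a string content.
function isChatMessage(value: unknown): value is ChatMessage {
  if (typeof value !== "object" || value === null) return false;
  const msg = value as Record<string, unknown>;
  return (
    typeof msg.role === "string" &&
    msg.role.length > 0 &&
    typeof msg.content === "string"
  );
}

// True when the value has a non-empty `messages` array of chat messages.
function hasMessages(value: unknown): value is { messages: ChatMessage[] } {
  if (typeof value !== "object" || value === null) return false;
  const messages = (value as Record<string, unknown>).messages;
  if (!Array.isArray(messages) || messages.length === 0) return false;
  return messages.every((m) => isChatMessage(m));
}

// True when both `input` and `expectedOutput` follow the messages shape.
function isChatDatapoint(value: unknown): value is ChatDatapoint {
  if (typeof value !== "object" || value === null) return false;
  const dp = value as Record<string, unknown>;
  return hasMessages(dp.input) && hasMessages(dp.expectedOutput);
}
```

Running a check like this on locally prepared data before calling the SDK can catch a missing messages array or a non-string content early; it is a convenience, not a replacement for whatever validation the service itself performs.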
To create a single datapoint:
await montelo.datapoints.create({
  dataset: "dataset-id",
  input: {
    messages: [
      { role: "system", content: "Marv is a factual chatbot that is also sarcastic." },
      { role: "user", content: "Who wrote 'Romeo and Juliet'?" },
    ],
  },
  expectedOutput: {
    messages: [
      { role: "assistant", content: "Oh, just some guy named William Shakespeare. Ever heard of him?" },
    ],
  },
});
To create many datapoints in a single efficient operation:
await montelo.datapoints.createMany({
  dataset: "dataset-id",
  datapoints: [
    {
      input: {
        messages: [
          { role: "system", content: "Marv is a factual chatbot that is also sarcastic." },
          { role: "user", content: "Who wrote 'Romeo and Juliet'?" },
        ],
      },
      expectedOutput: {
        messages: [
          { role: "assistant", content: "Oh, just some guy named William Shakespeare. Ever heard of him?" },
        ],
      },
    },
  ],
});
A datapoint belongs to either the train split or the test split. By default, all datapoints are marked as train. To place a datapoint in the test split, set the split field when creating it:
await montelo.datapoints.create({
  dataset: "dataset-slug",
  input: { topic: "Football" },
  expectedOutput: {
    joke: "Why did the soccer ball quit the team? Because it was tired of being kicked around!",
  },
  split: "TEST" as const,
});
Only the datapoints in the train split will be used for finetuning. The test datapoints are used for experiments.
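To make the split rule concrete, here is a minimal local sketch that partitions a list of datapoints the way the text describes: anything without an explicit split is treated as train. The Split type, the "TRAIN" string value, and the partitionBySplit helper are assumptions made for this example (the examples on this page only show "TEST"); none of them are Montelo SDK exports.

```typescript
// Assumed split values; only "TEST" appears in the examples on this page.
type Split = "TRAIN" | "TEST";

// Hypothetical local shape for a datapoint with an optional split.
interface SplitDatapoint {
  input: unknown;
  expectedOutput: unknown;
  split?: Split;
}

// Groups datapoints by split; a missing split counts as train,
// mirroring the documented default.
function partitionBySplit<T extends SplitDatapoint>(
  datapoints: T[],
): { train: T[]; test: T[] } {
  const train: T[] = [];
  const test: T[] = [];
  for (const dp of datapoints) {
    if (dp.split === "TEST") {
      test.push(dp);
    } else {
      train.push(dp);
    }
  }
  return { train, test };
}
```

With this grouping, the train array corresponds to the datapoints that would be used for finetuning and the test array to those used for experiments.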
A datapoint can also be created with a metadata object:
await montelo.datapoints.create({
  dataset: "dataset-slug",
  input: { topic: "Football" },
  expectedOutput: {
    joke: "Why did the soccer ball quit the team? Because it was tired of being kicked around!",
  },
  metadata: { some: "metadata" },
});