Create New Evaluation
Click the Create Evaluation button to create a new evaluation template.
You will be taken to an empty evaluation titled Untitled Evaluation - {Creation Time}, where you can start adding details.
Click the title field, which initially shows the default name, and enter a meaningful name for your evaluation (e.g., "Financial Agent Evaluation").
Click outside the name field (move focus away from it) to apply the change.
One empty test case is created by default. Fill in its details:
Question: Input the query or task you want the AI agent to handle. For example, “What was the cumulative total return for Meta Platforms, Inc. at the end of 2022 compared to its peak value during the five-year period ending December 31, 2023?”
Expected Answer: Input the correct answer that the AI agent is expected to provide. For example, “The cumulative total return for Meta Platforms, Inc. at the end of 2022 was 90, which is significantly lower compared to its peak value of 275 at the end of 2023.”
The goal is to build a set of test cases that you can run again each time the AI agent is updated, so that you can be confident your configuration changes don't introduce regressions.
Click the Add Test Case button to add more questions and expected answers to your evaluation. Each click adds a new row with Question and Expected Answer columns, which you fill in the same way as the first test case.
Add as many test cases as needed.
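To make the structure of an evaluation concrete, here is a minimal sketch of the same idea expressed as data: a named evaluation holding question/expected-answer pairs that can be re-run against an agent. This is not the product's API; the names `EvalCase`, `Evaluation`, `run_evaluation`, and the exact-match comparison are all hypothetical and chosen only for illustration. How the product actually scores an answer against the expected answer is not described here.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class EvalCase:
    """One row in the evaluation: a question and its expected answer."""
    question: str
    expected_answer: str


@dataclass
class Evaluation:
    """A named collection of test cases (hypothetical, for illustration only)."""
    name: str
    test_cases: List[EvalCase] = field(default_factory=list)

    def add_test_case(self, question: str, expected_answer: str) -> None:
        # Mirrors clicking "Add Test Case" and filling in both columns.
        self.test_cases.append(EvalCase(question, expected_answer))


def run_evaluation(
    evaluation: Evaluation, agent: Callable[[str], str]
) -> List[Tuple[str, bool]]:
    """Run every test case through `agent` and report pass/fail per question.

    Assumption: answers are compared by case-insensitive exact match after
    trimming whitespace. A real evaluator may score answers differently.
    """
    results = []
    for case in evaluation.test_cases:
        answer = agent(case.question)
        passed = answer.strip().lower() == case.expected_answer.strip().lower()
        results.append((case.question, passed))
    return results


if __name__ == "__main__":
    evaluation = Evaluation(name="Financial Agent Evaluation")
    evaluation.add_test_case(
        question="What is the ticker symbol for Meta Platforms, Inc.?",
        expected_answer="META",
    )

    # Stand-in agent that always returns the same answer.
    def stub_agent(question: str) -> str:
        return "meta"

    for question, passed in run_evaluation(evaluation, stub_agent):
        print(("PASS" if passed else "FAIL"), question)
```

Keeping the test cases as plain question/expected-answer pairs is what makes them repeatable: the same set can be run again after every agent configuration change and the pass/fail results compared.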
After an evaluation is created, you can go back and edit it at any time by clicking its card in the evaluation list.
You can also delete an evaluation from the actions dropdown menu.