
# Evaluation

In Epsilla, AI agent evaluation is a continuous performance assessment framework for testing and improving the quality of AI agents' responses over time. The evaluation system runs predefined scenarios that simulate real-world interactions, so that AI agent builders and operations teams can monitor agent performance across a range of situations. The evaluation process uses large language models (LLMs) to compare the AI-generated responses against human-labeled answers and scores them on a set of metrics such as accuracy, relevance, and coverage.
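To make the scoring flow concrete, here is a minimal Python sketch of the idea described above. It is an illustration only and not Epsilla's API: `Scenario`, `run_evaluation`, and `toy_judge` are hypothetical names, and `toy_judge` is a simple word-overlap stand-in for the LLM that would actually compare each response with the human-labeled answer.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Callable, Dict, List

# Metric names taken from the text above; the 0.0-1.0 scale is an assumption.
METRICS = ("accuracy", "relevance", "coverage")


@dataclass
class Scenario:
    """One predefined test case: a question plus its human-labeled answer."""
    question: str
    reference_answer: str


# Hypothetical signatures: an agent maps a question to a response, and a
# judge maps (question, reference, response) to one score per metric.
Agent = Callable[[str], str]
Judge = Callable[[str, str, str], Dict[str, float]]


def toy_judge(question: str, reference: str, response: str) -> Dict[str, float]:
    """Stand-in for an LLM judge: scores by word overlap only."""
    ref = set(reference.lower().split())
    out = set(response.lower().split())
    overlap = len(ref & out)
    return {
        "accuracy": 1.0 if ref == out else 0.0,
        "relevance": overlap / len(out) if out else 0.0,
        "coverage": overlap / len(ref) if ref else 0.0,
    }


def run_evaluation(scenarios: List[Scenario], agent: Agent, judge: Judge) -> Dict[str, float]:
    """Run every scenario through the agent and average the judge's scores."""
    if not scenarios:
        raise ValueError("at least one scenario is required")
    rows = []
    for scenario in scenarios:
        response = agent(scenario.question)
        scores = judge(scenario.question, scenario.reference_answer, response)
        rows.append({m: float(scores[m]) for m in METRICS})
    return {m: mean(row[m] for row in rows) for m in METRICS}
```

In a real setup, the judge would be a function that prompts an LLM with the question, the human-labeled answer, and the agent's response, then parses the returned scores; the surrounding loop would stay the same.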

This approach is conceptually similar to Continuous Integration/Continuous Delivery (CI/CD) practices, where the goal is to test and improve the system iteratively. It combines human input and LLMs to provide ongoing feedback on the AI's performance, helping the agents continue to meet quality standards as they are updated and refined over time.
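The CI/CD analogy can be sketched as a regression gate: after each agent update, the new evaluation scores are compared with a stored baseline, and any metric that dropped is flagged. The function name `find_regressions`, the score dictionaries, and the `tolerance` value below are assumptions made for illustration; they are not part of Epsilla.

```python
from typing import Dict, List


def find_regressions(
    baseline: Dict[str, float],
    candidate: Dict[str, float],
    tolerance: float = 0.02,
) -> List[str]:
    """Return the metrics whose candidate score fell below baseline - tolerance.

    A metric missing from the candidate run is treated as a score of 0.0,
    so it is reported as a regression rather than silently ignored.
    """
    return sorted(
        metric
        for metric, base_score in baseline.items()
        if candidate.get(metric, 0.0) < base_score - tolerance
    )


# Example with hypothetical scores from two evaluation runs.
baseline_scores = {"accuracy": 0.90, "relevance": 0.80, "coverage": 0.75}
candidate_scores = {"accuracy": 0.85, "relevance": 0.80, "coverage": 0.78}
regressed = find_regressions(baseline_scores, candidate_scores)
```

A pipeline could block a release when the returned list is non-empty, in the same way a failing test blocks a merge.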

In the navigation bar, click the **Evaluations** tab.

<figure><img src="/files/q9HOW3zXFTL1vSq30zI6" alt="" width="253"><figcaption></figcaption></figure>

This opens the page where you can create and manage all your evaluations.

<figure><img src="/files/yAo4Ouu98rDHjXGJchEP" alt=""><figcaption></figcaption></figure>

