Programmatically Manage Knowledge Bases
This section introduces the REST APIs for managing knowledge bases. They let developers inspect, create, update, sync, and delete knowledge bases from automation and code.
Get Knowledge Base Details
Request:
curl -X GET https://etl.epsilla.com/api/v1/datasources/<project_id>/pipeline/<datasource_id> \
-H "X-API-Key: <Project-API-Key>"
The project_id and datasource_id identify your project and your knowledge base's data source. You can obtain both IDs from the knowledge base's URL.
To get the Project API Key, see Project Management.
Response:
{
"statusCode": 200,
"message": "Get pipeline successfully.",
"result": {
"name": "<knowledge_base_name>",
"resource_id": "<datasource_id>",
"updated_at": 1721318300666,
"created_at": 1721318285687,
"status": "Active",
"sync_status": "synchronized",
"last_synced": 1721318341060,
"creator_user_id": "user1@example.com",
"configs": { // Knowledge base data source configs
"source_type": "file",
"source_config": {},
"auto_sync": false,
"sync_schedule": "",
"loader_type": "auto", // Loading config
"chunk_config": { // Chunking config
"type": "sentence",
"chunk_overlap": 192,
"chunk_size": 1024
},
"embed_model": "openai/text-embedding-3-large", // Embedding config
"web_hook": "",
"meta_file_pattern": "",
"meta_fields": []
},
"files_status": [ // Data processing status
{
"entity_type": "FileSyncStatus",
"sync_status": "synchronized",
"last_synced": 1721318325641,
"name": "<FileName>"
}
]
}
}
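As a minimal sketch of consuming this response in Python, the helper below pulls out the commonly needed fields and lists any files that are not yet synchronized. It assumes the timestamp fields are Unix epoch milliseconds (the sample values are consistent with that, but the source does not state the unit). The names `ms_to_datetime` and `summarize_pipeline` are hypothetical and not part of any Epsilla SDK.

```python
from datetime import datetime, timezone


def ms_to_datetime(ms: int) -> datetime:
    """Convert epoch milliseconds (assumed unit) to an aware UTC datetime."""
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def summarize_pipeline(body: dict) -> dict:
    """Pull the most useful fields out of a 'get pipeline' response body."""
    result = body["result"]
    files = result.get("files_status", [])
    return {
        "name": result["name"],
        "status": result["status"],
        "sync_status": result["sync_status"],
        "last_synced": ms_to_datetime(result["last_synced"]),
        "embed_model": result["configs"]["embed_model"],
        # Files whose own sync_status is anything other than "synchronized"
        "pending_files": [
            f["name"] for f in files if f.get("sync_status") != "synchronized"
        ],
    }


# Trimmed copy of the sample response above
sample = {
    "statusCode": 200,
    "message": "Get pipeline successfully.",
    "result": {
        "name": "<knowledge_base_name>",
        "resource_id": "<datasource_id>",
        "status": "Active",
        "sync_status": "synchronized",
        "last_synced": 1721318341060,
        "configs": {"embed_model": "openai/text-embedding-3-large"},
        "files_status": [
            {
                "entity_type": "FileSyncStatus",
                "sync_status": "synchronized",
                "last_synced": 1721318325641,
                "name": "<FileName>",
            }
        ],
    },
}

summary = summarize_pipeline(sample)
print(summary["last_synced"].isoformat(), summary["pending_files"])
```

Keeping the parsing in a pure function makes it easy to check against a saved response without calling the API.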
Create a Knowledge Base
Request:
curl -X POST https://etl.epsilla.com/api/v1/datasources/<project_id>/create \
-H "X-API-Key: <Project-API-Key>" \
-d '{
"name": "<knowledge_base_name>",
"source_type": "s3",
"source_config": {"aws_key_id": "<secret_key>", "aws_access_key": "<access_key>", "bucket": "<bucket_name>", "prefix": "<object_prefix>"},
"loader_type": "auto",
"chunk_config": {"type": "sentence", "chunk_size": 1024, "chunk_overlap": 192},
"embed_model": "openai/text-embedding-3-large",
"auto_sync": false,
"web_hook": "<The full URL that will receive the loading job status update>"
}'
Response:
{
"statusCode": 200,
"message": "Success to create and execute datasource <datasource_id> for project <project_id>",
"result": {
"project_id": "<project_id>",
"datasource": "<datasource_id>", // Keep this ID for your records.
"status": "Created"
}
}
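The same create call can be sketched in Python using only the standard library. The function below builds the request without sending it, so the payload can be inspected first. `build_create_request` is a hypothetical helper name, and the `Content-Type: application/json` header is an assumption: the curl example does not set it, and the source does not say whether the API requires it.

```python
import json
import urllib.request

BASE_URL = "https://etl.epsilla.com/api/v1/datasources"


def build_create_request(
    project_id: str, api_key: str, config: dict
) -> urllib.request.Request:
    """Build (but do not send) the POST request that creates a knowledge base."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{project_id}/create",
        data=json.dumps(config).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            # Assumption: declare the body as JSON; the curl example omits this
            "Content-Type": "application/json",
        },
        method="POST",
    )


# Same fields and placeholders as the curl example above
s3_config = {
    "name": "<knowledge_base_name>",
    "source_type": "s3",
    "source_config": {
        "aws_key_id": "<secret_key>",
        "aws_access_key": "<access_key>",
        "bucket": "<bucket_name>",
        "prefix": "<object_prefix>",
    },
    "loader_type": "auto",
    "chunk_config": {"type": "sentence", "chunk_size": 1024, "chunk_overlap": 192},
    "embed_model": "openai/text-embedding-3-large",
    "auto_sync": False,
    "web_hook": "<The full URL that will receive the loading job status update>",
}

request = build_create_request("<project_id>", "<Project-API-Key>", s3_config)

# To send it (requires real IDs and a real key):
#   with urllib.request.urlopen(request) as resp:
#       body = json.load(resp)
#   datasource_id = body["result"]["datasource"]  # needed by the later calls
```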
Update Knowledge Base Configuration
Request:
curl -X PUT https://etl.epsilla.com/api/v1/datasources/<project_id>/pipeline/update/<datasource_id> \
-H "X-API-Key: <Project-API-Key>" \
-d '{"auto_sync": true, "sync_schedule": "15min"}' # Only provide the entries that need to be updated
Response:
{
"statusCode": 200,
"message": "Update pipeline successfully.",
"result": {
... // The knowledge base config
}
}
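Because the update call takes only the entries that change, a small wrapper can drop unset values before sending. This is a sketch under assumptions: `build_update_request` is a hypothetical name, the JSON `Content-Type` header is not shown in the curl example, and treating `None` as "leave unchanged" is a convention of this sketch rather than API behavior.

```python
import json
import urllib.request

BASE_URL = "https://etl.epsilla.com/api/v1/datasources"


def build_update_request(
    project_id: str, datasource_id: str, api_key: str, **changes
) -> urllib.request.Request:
    """Build the PUT request; entries passed as None are left out of the payload."""
    # Compare against None explicitly so that False and "" are still sent
    payload = {key: value for key, value in changes.items() if value is not None}
    if not payload:
        raise ValueError("no entries to update")
    return urllib.request.Request(
        url=f"{BASE_URL}/{project_id}/pipeline/update/{datasource_id}",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "X-API-Key": api_key,
            "Content-Type": "application/json",  # assumption, not in the curl example
        },
        method="PUT",
    )


# Equivalent to the curl example: enable auto sync on a 15-minute schedule
update_request = build_update_request(
    "<project_id>",
    "<datasource_id>",
    "<Project-API-Key>",
    auto_sync=True,
    sync_schedule="15min",
    web_hook=None,  # omitted from the payload
)
print(update_request.data.decode("utf-8"))
```

Filtering on `is not None` rather than truthiness matters here, since `auto_sync=False` is a legitimate update.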
Trigger a Knowledge Base Sync
Request:
curl -X POST https://etl.epsilla.com/api/v1/datasources/<project_id>/pipeline/<datasource_id> \
-H "X-API-Key: <Project-API-Key>"
Response:
{
"statusCode": 200,
"message": "Start to execute data pipeline successfully.",
"result": {
"task_id": "<data_processing_task_id>",
"pipeline_id": "<datasource_id>"
}
}
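A Python sketch of the same trigger call follows, together with a helper that extracts the task ID from the response. Both function names are hypothetical. Treating any `statusCode` other than 200 in the body as a failure is an assumption, since the source only shows successful responses.

```python
import urllib.request

BASE_URL = "https://etl.epsilla.com/api/v1/datasources"


def build_sync_request(
    project_id: str, datasource_id: str, api_key: str
) -> urllib.request.Request:
    """Build the body-less POST request that starts the data pipeline."""
    return urllib.request.Request(
        url=f"{BASE_URL}/{project_id}/pipeline/{datasource_id}",
        headers={"X-API-Key": api_key},
        method="POST",
    )


def extract_task_id(body: dict) -> str:
    """Return result.task_id, raising if the body does not report success."""
    # Assumption: a statusCode other than 200 in the body means the call failed
    if body.get("statusCode") != 200:
        raise RuntimeError(f"sync request failed: {body.get('message')}")
    return body["result"]["task_id"]


sync_request = build_sync_request(
    "<project_id>", "<datasource_id>", "<Project-API-Key>"
)

# Sample response copied from above
sample_response = {
    "statusCode": 200,
    "message": "Start to execute data pipeline successfully.",
    "result": {
        "task_id": "<data_processing_task_id>",
        "pipeline_id": "<datasource_id>",
    },
}
print(extract_task_id(sample_response))
```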
Check Sync Status
Request:
curl -X GET https://etl.epsilla.com/api/v1/datasources/<project_id>/pipeline/<datasource_id>/status \
-H "X-API-Key: <Project-API-Key>"
Response:
// In progress
{
"statusCode": 200,
"message": "Pipeline still have 1 un-successful sub_tasks.",
"result": {
"sub_task_stats": [
"SUCCESS",
"STARTED"
]
}
}
// Completed
{
"statusCode": 200,
"message": "Get pipeline status successfully.",
"result": {
"status": "synchronized"
}
}
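The two response shapes above can drive a simple polling loop. The sketch below treats a body with `result.status` equal to `"synchronized"` as complete and anything else as still running. The source does not document failure states, so the loop relies on a timeout rather than detecting errors. `is_sync_complete` and `wait_for_sync` are hypothetical names, and the status fetcher is passed in as a callable so the loop can be exercised without network access.

```python
import time
from typing import Callable


def is_sync_complete(body: dict) -> bool:
    """True only for the 'Completed' response shape shown above."""
    return body.get("result", {}).get("status") == "synchronized"


def wait_for_sync(
    fetch_status: Callable[[], dict],
    interval_s: float = 5.0,
    timeout_s: float = 600.0,
) -> dict:
    """Call fetch_status until the sync completes or timeout_s elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        body = fetch_status()
        if is_sync_complete(body):
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError("knowledge base did not finish syncing in time")
        time.sleep(interval_s)


# Simulated responses copied from the samples above, in place of real HTTP calls
in_progress = {
    "statusCode": 200,
    "message": "Pipeline still have 1 un-successful sub_tasks.",
    "result": {"sub_task_stats": ["SUCCESS", "STARTED"]},
}
completed = {
    "statusCode": 200,
    "message": "Get pipeline status successfully.",
    "result": {"status": "synchronized"},
}

responses = iter([in_progress, in_progress, completed])
final = wait_for_sync(lambda: next(responses), interval_s=0)
print(final["message"])
```

In real use, `fetch_status` would issue the GET request shown above and return the decoded JSON body.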
Delete a Knowledge Base
Request:
curl -X DELETE https://etl.epsilla.com/api/v1/datasources/<project_id>/pipeline/<datasource_id> \
-H "X-API-Key: <Project-API-Key>"
Response:
{
"statusCode": 200,
"message": "Delete data source successfully."
}