evaluation_runs
Creates or gets an evaluation_runs resource.
Overview
Name | evaluation_runs |
Type | Resource |
Id | digitalocean.genai.evaluation_runs |
Fields
The following fields are returned by SELECT queries:
- genai_get_evaluation_run
A successful response.
Name | Datatype | Description |
---|---|---|
created_by_user_id | string (uint64) | (example: 12345) |
agent_name | string | Agent name (example: example name) |
run_name | string | Run name. (example: example name) |
test_case_name | string | Test case name. (example: example name) |
agent_deleted | boolean | Whether agent is deleted |
agent_uuid | string | Agent UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
agent_version_hash | string | Version hash (example: example string) |
agent_workspace_uuid | string | Agent workspace uuid (example: 123e4567-e89b-12d3-a456-426614174000) |
created_by_user_email | string | (example: example@example.com) |
error_description | string | The error description (example: example string) |
evaluation_run_uuid | string | Evaluation run UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
evaluation_test_case_workspace_uuid | string | Evaluation test case workspace uuid (example: 123e4567-e89b-12d3-a456-426614174000) |
finished_at | string (date-time) | Run end time. (example: 2023-01-01T00:00:00Z) |
pass_status | boolean | The pass status of the evaluation run based on the star metric. |
queued_at | string (date-time) | Run queued time. (example: 2023-01-01T00:00:00Z) |
run_level_metric_results | array | |
star_metric_result | object | |
started_at | string (date-time) | Run start time. (example: 2023-01-01T00:00:00Z) |
status | string | Evaluation Run Statuses (default: EVALUATION_RUN_STATUS_UNSPECIFIED, example: EVALUATION_RUN_STATUS_UNSPECIFIED) |
test_case_description | string | Test case description. (example: example string) |
test_case_uuid | string | Test-case UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
test_case_version | integer (int64) | Test-case-version. |
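The three timestamp fields above can be combined to see how long a run waited in the queue and how long it took to execute. The following is a minimal sketch, assuming the row is available as a Python dict keyed by the field names in the table and that the timestamps use the format shown in the examples; `parse_ts` and `run_timings` are hypothetical helper names, not part of the provider.

```python
from datetime import datetime
from typing import Optional


def parse_ts(value: Optional[str]) -> Optional[datetime]:
    """Parse a timestamp such as 2023-01-01T00:00:00Z; empty values become None."""
    if not value:
        return None
    # datetime.fromisoformat() rejects a trailing "Z" before Python 3.11,
    # so it is rewritten as an explicit UTC offset first.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def run_timings(row: dict) -> dict:
    """Return queue wait and run duration in seconds (None if a timestamp is missing)."""
    queued = parse_ts(row.get("queued_at"))
    started = parse_ts(row.get("started_at"))
    finished = parse_ts(row.get("finished_at"))
    return {
        "queue_seconds": (started - queued).total_seconds() if queued and started else None,
        "run_seconds": (finished - started).total_seconds() if started and finished else None,
    }


# Hypothetical row; only the timestamp fields are needed here.
row = {
    "queued_at": "2023-01-01T00:00:00Z",
    "started_at": "2023-01-01T00:00:30Z",
    "finished_at": "2023-01-01T00:05:30Z",
}
timings = run_timings(row)
print(timings)
```

A run that is still queued or in progress may not have all three timestamps set, which is why the helper returns None instead of raising.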
Methods
The following methods are available for this resource:
Name | Accessible by | Required Params | Optional Params | Description |
---|---|---|---|---|
genai_get_evaluation_run | select | evaluation_run_uuid | | To retrieve information about an existing evaluation run, send a GET request to /v2/gen-ai/evaluation_runs/{evaluation_run_uuid}. |
genai_run_evaluation_test_case | insert | | | To run an evaluation test case, send a POST request to /v2/gen-ai/evaluation_runs. |
Parameters
Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.
Name | Datatype | Description |
---|---|---|
evaluation_run_uuid | string | Evaluation run UUID. (example: "123e4567-e89b-12d3-a456-426614174000") |
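Because evaluation_run_uuid ends up inside a quoted string in the WHERE clause, it can be worth validating the value before building the query text. The sketch below is one way to do that in Python, assuming the query is assembled client-side; `render_select` and the short `FIELDS` list are hypothetical and chosen only for illustration.

```python
import uuid

# A small subset of the fields listed for this resource.
FIELDS = ["status", "pass_status", "error_description"]


def render_select(evaluation_run_uuid: str, fields=FIELDS) -> str:
    """Build a SELECT statement for one evaluation run.

    uuid.UUID raises ValueError for anything that is not a well-formed UUID,
    so only hex digits and hyphens ever reach the quoted literal.
    """
    canonical = str(uuid.UUID(evaluation_run_uuid))
    columns = ",\n".join(fields)
    return (
        f"SELECT\n{columns}\n"
        "FROM digitalocean.genai.evaluation_runs\n"
        f"WHERE evaluation_run_uuid = '{canonical}';"
    )


query = render_select("123e4567-e89b-12d3-a456-426614174000")
print(query)
```

Passing a malformed value such as `"not-a-uuid"` raises ValueError before any query text is produced.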
SELECT examples
- genai_get_evaluation_run
To retrieve information about an existing evaluation run, send a GET request to /v2/gen-ai/evaluation_runs/{evaluation_run_uuid}.
SELECT
created_by_user_id,
agent_name,
run_name,
test_case_name,
agent_deleted,
agent_uuid,
agent_version_hash,
agent_workspace_uuid,
created_by_user_email,
error_description,
evaluation_run_uuid,
evaluation_test_case_workspace_uuid,
finished_at,
pass_status,
queued_at,
run_level_metric_results,
star_metric_result,
started_at,
status,
test_case_description,
test_case_uuid,
test_case_version
FROM digitalocean.genai.evaluation_runs
WHERE evaluation_run_uuid = '{{ evaluation_run_uuid }}'; -- required
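For comparison, the GET request that this method describes can be built directly with the Python standard library. This is only a sketch under stated assumptions: the base URL `https://api.digitalocean.com` and the Bearer-token `Authorization` header are not given on this page, and `build_get_request` is a hypothetical helper. The request is constructed but not sent.

```python
import urllib.request
import uuid

# Assumption: the public DigitalOcean API host; this page only gives the path.
API_BASE = "https://api.digitalocean.com"


def build_get_request(evaluation_run_uuid: str, token: str) -> urllib.request.Request:
    """Build (without sending) a GET request for a single evaluation run."""
    # Validate and normalise the UUID before placing it in the URL path.
    run_id = str(uuid.UUID(evaluation_run_uuid))
    req = urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/evaluation_runs/{run_id}",
        method="GET",
    )
    # Assumption: the API expects a personal access token as a Bearer token.
    req.add_header("Authorization", f"Bearer {token}")
    return req


req = build_get_request("123e4567-e89b-12d3-a456-426614174000", "example-token")
print(req.get_method(), req.full_url)
```

Passing the result to `urllib.request.urlopen(req)` would send the request; that step is left out so the example runs without network access or credentials.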
INSERT examples
- genai_run_evaluation_test_case
- Manifest
To run an evaluation test case, send a POST request to /v2/gen-ai/evaluation_runs.
INSERT INTO digitalocean.genai.evaluation_runs (
data__agent_uuids,
data__run_name,
data__test_case_uuid
)
SELECT
'{{ agent_uuids }}',
'{{ run_name }}',
'{{ test_case_uuid }}'
RETURNING
evaluation_run_uuids
;
# Description fields are for documentation purposes
- name: evaluation_runs
  props:
    - name: agent_uuids
      value: array
      description: >
        Agent UUIDs to run the test case against.
    - name: run_name
      value: string
      description: >
        The name of the run.
    - name: test_case_uuid
      value: string
      description: >
        Test-case UUID to run
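The three properties in the manifest correspond to the body of the POST request. The sketch below shows how such a body might be assembled in Python, assuming a JSON payload whose keys match the property names, the base URL `https://api.digitalocean.com`, and Bearer-token authentication; none of these are stated on this page, and `build_run_request` is a hypothetical helper. The request is constructed but not sent.

```python
import json
import urllib.request
from typing import Iterable

# Assumption: the public DigitalOcean API host; this page only gives the path.
API_BASE = "https://api.digitalocean.com"


def build_run_request(
    agent_uuids: Iterable[str],
    run_name: str,
    test_case_uuid: str,
    token: str,
) -> urllib.request.Request:
    """Build (without sending) a POST request that starts an evaluation run."""
    payload = {
        "agent_uuids": list(agent_uuids),  # array, per the manifest
        "run_name": run_name,              # string
        "test_case_uuid": test_case_uuid,  # string
    }
    req = urllib.request.Request(
        f"{API_BASE}/v2/gen-ai/evaluation_runs",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
    )
    req.add_header("Content-Type", "application/json")
    # Assumption: the API expects a personal access token as a Bearer token.
    req.add_header("Authorization", f"Bearer {token}")
    return req


post_req = build_run_request(
    ["123e4567-e89b-12d3-a456-426614174000"],
    "example name",
    "123e4567-e89b-12d3-a456-426614174000",
    "example-token",
)
print(post_req.get_method(), post_req.full_url)
print(post_req.data.decode("utf-8"))
```

The INSERT example returns evaluation_run_uuids, which suggests that each returned UUID can then be used as the evaluation_run_uuid parameter of genai_get_evaluation_run to check the status of that run.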