evaluation_test_cases
Creates, updates, gets or lists an evaluation_test_cases resource.
Overview
Name | evaluation_test_cases |
Type | Resource |
Id | digitalocean.genai.evaluation_test_cases |
Fields
The following fields are returned by SELECT queries:
- genai_list_evaluation_runs_by_test_case
- genai_get_evaluation_test_case
- genai_list_evaluation_test_cases_by_workspace
- genai_list_evaluation_test_cases
A successful response for genai_list_evaluation_runs_by_test_case.
Name | Datatype | Description |
---|---|---|
created_by_user_id | string (uint64) | (example: 12345) |
agent_name | string | Agent name (example: example name) |
run_name | string | Run name. (example: example name) |
test_case_name | string | Test case name. (example: example name) |
agent_deleted | boolean | Whether agent is deleted |
agent_uuid | string | Agent UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
agent_version_hash | string | Version hash (example: example string) |
agent_workspace_uuid | string | Agent workspace uuid (example: 123e4567-e89b-12d3-a456-426614174000) |
created_by_user_email | string | (example: example@example.com) |
error_description | string | The error description (example: example string) |
evaluation_run_uuid | string | Evaluation run UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
evaluation_test_case_workspace_uuid | string | Evaluation test case workspace uuid (example: 123e4567-e89b-12d3-a456-426614174000) |
finished_at | string (date-time) | Run end time. (example: 2023-01-01T00:00:00Z) |
pass_status | boolean | The pass status of the evaluation run based on the star metric. |
queued_at | string (date-time) | Run queued time. (example: 2023-01-01T00:00:00Z) |
run_level_metric_results | array | |
star_metric_result | object | |
started_at | string (date-time) | Run start time. (example: 2023-01-01T00:00:00Z) |
status | string | Evaluation Run Statuses (default: EVALUATION_RUN_STATUS_UNSPECIFIED, example: EVALUATION_RUN_STATUS_UNSPECIFIED) |
test_case_description | string | Test case description. (example: example string) |
test_case_uuid | string | Test-case UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
test_case_version | integer (int64) | Test-case-version. |
A successful response for genai_get_evaluation_test_case.
Name | Datatype | Description |
---|---|---|
name | string | (example: example name) |
created_by_user_id | string (uint64) | (example: 12345) |
updated_by_user_id | string (uint64) | (example: 12345) |
dataset_name | string | (example: example name) |
archived_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
created_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
created_by_user_email | string | (example: example@example.com) |
dataset | object | |
dataset_uuid | string | (example: 123e4567-e89b-12d3-a456-426614174000) |
description | string | (example: example string) |
latest_version_number_of_runs | integer (int32) | |
metrics | array | |
star_metric | object | |
test_case_uuid | string | (example: 123e4567-e89b-12d3-a456-426614174000) |
total_runs | integer (int32) | |
updated_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
updated_by_user_email | string | (example: example@example.com) |
version | integer (int64) | |
A successful response for genai_list_evaluation_test_cases_by_workspace.
Name | Datatype | Description |
---|---|---|
name | string | (example: example name) |
created_by_user_id | string (uint64) | (example: 12345) |
updated_by_user_id | string (uint64) | (example: 12345) |
dataset_name | string | (example: example name) |
archived_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
created_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
created_by_user_email | string | (example: example@example.com) |
dataset | object | |
dataset_uuid | string | (example: 123e4567-e89b-12d3-a456-426614174000) |
description | string | (example: example string) |
latest_version_number_of_runs | integer (int32) | |
metrics | array | |
star_metric | object | |
test_case_uuid | string | (example: 123e4567-e89b-12d3-a456-426614174000) |
total_runs | integer (int32) | |
updated_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
updated_by_user_email | string | (example: example@example.com) |
version | integer (int64) | |
A successful response for genai_list_evaluation_test_cases.
Name | Datatype | Description |
---|---|---|
name | string | (example: example name) |
created_by_user_id | string (uint64) | (example: 12345) |
updated_by_user_id | string (uint64) | (example: 12345) |
dataset_name | string | (example: example name) |
archived_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
created_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
created_by_user_email | string | (example: example@example.com) |
dataset | object | |
dataset_uuid | string | (example: 123e4567-e89b-12d3-a456-426614174000) |
description | string | (example: example string) |
latest_version_number_of_runs | integer (int32) | |
metrics | array | |
star_metric | object | |
test_case_uuid | string | (example: 123e4567-e89b-12d3-a456-426614174000) |
total_runs | integer (int32) | |
updated_at | string (date-time) | (example: 2023-01-01T00:00:00Z) |
updated_by_user_email | string | (example: example@example.com) |
version | integer (int64) | |
Methods
The following methods are available for this resource:
Name | Accessible by | Required Params | Optional Params | Description |
---|---|---|---|---|
genai_list_evaluation_runs_by_test_case | select | evaluation_test_case_uuid | evaluation_test_case_version | To list all evaluation runs by test case, send a GET request to /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs. |
genai_get_evaluation_test_case | select | test_case_uuid | evaluation_test_case_version | To retrieve information about an existing evaluation test case, send a GET request to /v2/gen-ai/evaluation_test_case/{test_case_uuid}. |
genai_list_evaluation_test_cases_by_workspace | select | workspace_uuid | | To list all evaluation test cases in a workspace, send a GET request to /v2/gen-ai/workspaces/{workspace_uuid}/evaluation_test_cases. |
genai_list_evaluation_test_cases | select | | | To list all evaluation test cases, send a GET request to /v2/gen-ai/evaluation_test_cases. |
genai_create_evaluation_test_case | insert | | | To create an evaluation test case, send a POST request to /v2/gen-ai/evaluation_test_cases. |
genai_update_evaluation_test_case | replace | test_case_uuid | | To update an evaluation test case, send a PUT request to /v2/gen-ai/evaluation_test_cases/{test_case_uuid}. |
Parameters
Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.
Name | Datatype | Description |
---|---|---|
evaluation_test_case_uuid | string | Evaluation test case UUID. (example: "123e4567-e89b-12d3-a456-426614174000") |
test_case_uuid | string | Test-case UUID to retrieve or update. (example: "123e4567-e89b-12d3-a456-426614174000") |
workspace_uuid | string | Workspace UUID. (example: "123e4567-e89b-12d3-a456-426614174000") |
evaluation_test_case_version | integer | Version of the test case. (example: 1) |
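The required and optional parameters listed in the Methods table can be checked before a query is sent. The following is a minimal sketch, not part of StackQL: `build_where` and `METHOD_PARAMS` are hypothetical names, and the parameter lists are copied from the Methods table above. It quotes every value, as the examples on this page do.

```python
# Hypothetical helper: method names and parameter lists come from the Methods
# table on this page; the function itself is not part of StackQL.
METHOD_PARAMS = {
    "genai_list_evaluation_runs_by_test_case": (
        ["evaluation_test_case_uuid"], ["evaluation_test_case_version"]),
    "genai_get_evaluation_test_case": (
        ["test_case_uuid"], ["evaluation_test_case_version"]),
    "genai_list_evaluation_test_cases_by_workspace": (["workspace_uuid"], []),
    "genai_list_evaluation_test_cases": ([], []),
}


def build_where(method, params):
    """Return a WHERE clause (or an empty string) for one SELECT method."""
    required, optional = METHOD_PARAMS[method]
    missing = [name for name in required if name not in params]
    if missing:
        raise ValueError(f"missing required parameter(s): {', '.join(missing)}")
    unknown = [name for name in params if name not in required + optional]
    if unknown:
        raise ValueError(f"unsupported parameter(s): {', '.join(unknown)}")
    conditions = []
    for name in required + optional:
        if name in params:
            # Double any single quote so the value stays inside the SQL literal.
            value = str(params[name]).replace("'", "''")
            conditions.append(f"{name} = '{value}'")
    return "WHERE " + " AND ".join(conditions) if conditions else ""


print(build_where(
    "genai_list_evaluation_runs_by_test_case",
    {"evaluation_test_case_uuid": "123e4567-e89b-12d3-a456-426614174000",
     "evaluation_test_case_version": 1},
))
```

Rejecting unknown parameter names early is a design choice of this sketch; it surfaces a mistyped parameter before the request reaches the API.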
SELECT examples
- genai_list_evaluation_runs_by_test_case
- genai_get_evaluation_test_case
- genai_list_evaluation_test_cases_by_workspace
- genai_list_evaluation_test_cases
To list all evaluation runs by test case, send a GET request to /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs.
SELECT
created_by_user_id,
agent_name,
run_name,
test_case_name,
agent_deleted,
agent_uuid,
agent_version_hash,
agent_workspace_uuid,
created_by_user_email,
error_description,
evaluation_run_uuid,
evaluation_test_case_workspace_uuid,
finished_at,
pass_status,
queued_at,
run_level_metric_results,
star_metric_result,
started_at,
status,
test_case_description,
test_case_uuid,
test_case_version
FROM digitalocean.genai.evaluation_test_cases
WHERE evaluation_test_case_uuid = '{{ evaluation_test_case_uuid }}' -- required
AND evaluation_test_case_version = '{{ evaluation_test_case_version }}';
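Rows returned by the query above carry the run timestamps and the pass status, so a short post-processing step can summarize them. The sketch below assumes the rows have already been decoded into Python dictionaries keyed by the column names, with pass_status as a boolean; the sample rows and the run identifiers are invented for illustration.

```python
from datetime import datetime


def parse_ts(value):
    """Parse a timestamp such as 2023-01-01T00:00:00Z; return None if empty."""
    if not value:
        return None
    # Older Python versions of fromisoformat() reject a trailing "Z".
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def summarize_runs(rows):
    """Count passing runs and compute the duration of each finished run."""
    durations = {}
    passed = 0
    for row in rows:
        started = parse_ts(row.get("started_at"))
        finished = parse_ts(row.get("finished_at"))
        if started and finished:
            seconds = (finished - started).total_seconds()
            durations[row["evaluation_run_uuid"]] = seconds
        if row.get("pass_status") is True:
            passed += 1
    return {"total": len(rows), "passed": passed, "duration_seconds": durations}


# Invented sample rows; real rows come from the SELECT statement above.
rows = [
    {"evaluation_run_uuid": "run-a", "pass_status": True,
     "started_at": "2023-01-01T00:00:00Z", "finished_at": "2023-01-01T00:05:30Z"},
    {"evaluation_run_uuid": "run-b", "pass_status": False,
     "started_at": "2023-01-01T00:10:00Z", "finished_at": None},
]
summary = summarize_runs(rows)
print(summary)
```

Runs without a finished_at value are counted in the total but left out of the duration map, since no duration can be computed for them.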
To retrieve information about an existing evaluation test case, send a GET request to /v2/gen-ai/evaluation_test_case/{test_case_uuid}.
SELECT
name,
created_by_user_id,
updated_by_user_id,
dataset_name,
archived_at,
created_at,
created_by_user_email,
dataset,
dataset_uuid,
description,
latest_version_number_of_runs,
metrics,
star_metric,
test_case_uuid,
total_runs,
updated_at,
updated_by_user_email,
version
FROM digitalocean.genai.evaluation_test_cases
WHERE test_case_uuid = '{{ test_case_uuid }}' -- required
AND evaluation_test_case_version = '{{ evaluation_test_case_version }}';
To list all evaluation test cases in a workspace, send a GET request to /v2/gen-ai/workspaces/{workspace_uuid}/evaluation_test_cases.
SELECT
name,
created_by_user_id,
updated_by_user_id,
dataset_name,
archived_at,
created_at,
created_by_user_email,
dataset,
dataset_uuid,
description,
latest_version_number_of_runs,
metrics,
star_metric,
test_case_uuid,
total_runs,
updated_at,
updated_by_user_email,
version
FROM digitalocean.genai.evaluation_test_cases
WHERE workspace_uuid = '{{ workspace_uuid }}'; -- required
To list all evaluation test cases, send a GET request to /v2/gen-ai/evaluation_test_cases.
SELECT
name,
created_by_user_id,
updated_by_user_id,
dataset_name,
archived_at,
created_at,
created_by_user_email,
dataset,
dataset_uuid,
description,
latest_version_number_of_runs,
metrics,
star_metric,
test_case_uuid,
total_runs,
updated_at,
updated_by_user_email,
version
FROM digitalocean.genai.evaluation_test_cases;
INSERT examples
- genai_create_evaluation_test_case
- Manifest
To create an evaluation test case, send a POST request to /v2/gen-ai/evaluation_test_cases.
INSERT INTO digitalocean.genai.evaluation_test_cases (
data__dataset_uuid,
data__description,
data__metrics,
data__name,
data__star_metric,
data__workspace_uuid
)
SELECT
'{{ dataset_uuid }}',
'{{ description }}',
'{{ metrics }}',
'{{ name }}',
'{{ star_metric }}',
'{{ workspace_uuid }}'
RETURNING
test_case_uuid
;
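The `{{ ... }}` placeholders in the statement above have to be replaced with real values before it runs. The following is a minimal sketch of one way to do that in Python. The `render` helper is hypothetical, the sample values are invented, and passing the array and object fields (metrics, star_metric) as JSON text is an assumption; this page only states their types.

```python
import json
import re

# The INSERT statement from this page, with its placeholders left in place.
INSERT_TEMPLATE = """INSERT INTO digitalocean.genai.evaluation_test_cases (
data__dataset_uuid,
data__description,
data__metrics,
data__name,
data__star_metric,
data__workspace_uuid
)
SELECT
'{{ dataset_uuid }}',
'{{ description }}',
'{{ metrics }}',
'{{ name }}',
'{{ star_metric }}',
'{{ workspace_uuid }}'
RETURNING
test_case_uuid
;"""

PLACEHOLDER = re.compile(r"\{\{\s*(\w+)\s*\}\}")


def render(template, values):
    """Replace each {{ name }} placeholder with the matching value."""
    def substitute(match):
        value = values[match.group(1)]  # KeyError if a placeholder has no value
        if isinstance(value, (list, dict)):
            # Assumption: array and object fields are passed as JSON text.
            value = json.dumps(value)
        # Double single quotes so the value stays inside the SQL literal.
        return str(value).replace("'", "''")
    return PLACEHOLDER.sub(substitute, template)


# Invented sample values; the shapes of metrics and star_metric are guesses.
statement = render(INSERT_TEMPLATE, {
    "dataset_uuid": "123e4567-e89b-12d3-a456-426614174000",
    "description": "Checks the agent's answers",
    "metrics": ["example-metric-id"],
    "name": "example name",
    "star_metric": {"name": "example metric"},
    "workspace_uuid": "123e4567-e89b-12d3-a456-426614174000",
})
print(statement)
```

The same helper applies to any statement on this page that uses `{{ ... }}` placeholders, since it only depends on the placeholder syntax.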
# Description fields are for documentation purposes
- name: evaluation_test_cases
  props:
    - name: dataset_uuid
      value: string
      description: >
        Dataset against which the test-case is executed.
    - name: description
      value: string
      description: >
        Description of the test case.
    - name: metrics
      value: array
      description: >
        Full metric list to use for evaluation test case.
    - name: name
      value: string
      description: >
        Name of the test case.
    - name: star_metric
      value: object
    - name: workspace_uuid
      value: string
      description: >
        The workspace uuid.
REPLACE examples
- genai_update_evaluation_test_case
To update an evaluation test case, send a PUT request to /v2/gen-ai/evaluation_test_cases/{test_case_uuid}.
REPLACE digitalocean.genai.evaluation_test_cases
SET
data__dataset_uuid = '{{ dataset_uuid }}',
data__description = '{{ description }}',
data__metrics = '{{ metrics }}',
data__name = '{{ name }}',
data__star_metric = '{{ star_metric }}',
data__test_case_uuid = '{{ test_case_uuid }}'
WHERE
test_case_uuid = '{{ test_case_uuid }}' --required
RETURNING
test_case_uuid,
version;