evaluation_test_cases

Creates, updates, gets or lists an evaluation_test_cases resource.

Overview

Name: evaluation_test_cases
Type: Resource
Id: digitalocean.genai.evaluation_test_cases

Fields

The following fields are returned by SELECT queries:

A successful response.

| Name | Datatype | Description |
|---|---|---|
| created_by_user_id | string (uint64) | (example: 12345) |
| agent_name | string | Agent name (example: example name) |
| run_name | string | Run name. (example: example name) |
| test_case_name | string | Test case name. (example: example name) |
| agent_deleted | boolean | Whether agent is deleted |
| agent_uuid | string | Agent UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
| agent_version_hash | string | Version hash (example: example string) |
| agent_workspace_uuid | string | Agent workspace uuid (example: 123e4567-e89b-12d3-a456-426614174000) |
| created_by_user_email | string | (example: example@example.com) |
| error_description | string | The error description (example: example string) |
| evaluation_run_uuid | string | Evaluation run UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
| evaluation_test_case_workspace_uuid | string | Evaluation test case workspace uuid (example: 123e4567-e89b-12d3-a456-426614174000) |
| finished_at | string (date-time) | Run end time. (example: 2023-01-01T00:00:00Z) |
| pass_status | boolean | The pass status of the evaluation run based on the star metric. |
| queued_at | string (date-time) | Run queued time. (example: 2023-01-01T00:00:00Z) |
| run_level_metric_results | array | |
| star_metric_result | object | |
| started_at | string (date-time) | Run start time. (example: 2023-01-01T00:00:00Z) |
| status | string | Evaluation Run Statuses (default: EVALUATION_RUN_STATUS_UNSPECIFIED, example: EVALUATION_RUN_STATUS_UNSPECIFIED) |
| test_case_description | string | Test case description. (example: example string) |
| test_case_uuid | string | Test-case UUID. (example: 123e4567-e89b-12d3-a456-426614174000) |
| test_case_version | integer (int64) | Test-case-version. |

Methods

The following methods are available for this resource:

| Name | Accessible by | Required Params | Optional Params | Description |
|---|---|---|---|---|
| genai_list_evaluation_runs_by_test_case | select | evaluation_test_case_uuid | evaluation_test_case_version | To list all evaluation runs by test case, send a GET request to /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs. |
| genai_get_evaluation_test_case | select | test_case_uuid | evaluation_test_case_version | To retrieve information about an existing evaluation test case, send a GET request to /v2/gen-ai/evaluation_test_case/{test_case_uuid}. |
| genai_list_evaluation_test_cases_by_workspace | select | workspace_uuid | | To list all evaluation test cases in a workspace, send a GET request to /v2/gen-ai/workspaces/{workspace_uuid}/evaluation_test_cases. |
| genai_list_evaluation_test_cases | select | | | To list all evaluation test cases, send a GET request to /v2/gen-ai/evaluation_test_cases. |
| genai_create_evaluation_test_case | insert | | | To create an evaluation test case, send a POST request to /v2/gen-ai/evaluation_test_cases. |
| genai_update_evaluation_test_case | replace | test_case_uuid | | To update an evaluation test case, send a PUT request to /v2/gen-ai/evaluation_test_cases/{test_case_uuid}. |

Parameters

Parameters can be passed in the WHERE clause of a query. Check the Methods section to see which parameters are required or optional for each operation.

| Name | Datatype | Description |
|---|---|---|
| evaluation_test_case_uuid | string | Evaluation test case UUID. (example: "123e4567-e89b-12d3-a456-426614174000") |
| test_case_uuid | string | Test-case UUID to update (example: "123e4567-e89b-12d3-a456-426614174000") |
| workspace_uuid | string | Workspace UUID. (example: "123e4567-e89b-12d3-a456-426614174000") |
| evaluation_test_case_version | integer | Version of the test case. (example: 1) |

SELECT examples

To list all evaluation runs by test case, send a GET request to /v2/gen-ai/evaluation_test_cases/{evaluation_test_case_uuid}/evaluation_runs.

SELECT
    created_by_user_id,
    agent_name,
    run_name,
    test_case_name,
    agent_deleted,
    agent_uuid,
    agent_version_hash,
    agent_workspace_uuid,
    created_by_user_email,
    error_description,
    evaluation_run_uuid,
    evaluation_test_case_workspace_uuid,
    finished_at,
    pass_status,
    queued_at,
    run_level_metric_results,
    star_metric_result,
    started_at,
    status,
    test_case_description,
    test_case_uuid,
    test_case_version
FROM digitalocean.genai.evaluation_test_cases
WHERE evaluation_test_case_uuid = '{{ evaluation_test_case_uuid }}' -- required
AND evaluation_test_case_version = '{{ evaluation_test_case_version }}'; -- optional
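
The Methods table lists three more select methods that have no example on this page. The queries below are a rough sketch of how they might look. They assume that StackQL chooses the method from the parameters supplied in the WHERE clause and that SELECT * is accepted for this resource. The columns each method returns are not documented here and may differ from the Fields table, so treat the result shapes as unverified.

```sql
-- Sketch only: method routing by WHERE parameters is an assumption,
-- based on the required/optional parameters in the Methods table.

-- genai_get_evaluation_test_case: a single test case by UUID
SELECT *
FROM digitalocean.genai.evaluation_test_cases
WHERE test_case_uuid = '{{ test_case_uuid }}' -- required
AND evaluation_test_case_version = '{{ evaluation_test_case_version }}'; -- optional

-- genai_list_evaluation_test_cases_by_workspace: test cases in one workspace
SELECT *
FROM digitalocean.genai.evaluation_test_cases
WHERE workspace_uuid = '{{ workspace_uuid }}'; -- required

-- genai_list_evaluation_test_cases: all test cases, no parameters
SELECT *
FROM digitalocean.genai.evaluation_test_cases;
```

Once the returned columns are known, replacing SELECT * with an explicit column list keeps the output stable if the provider adds fields later.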

INSERT examples

To create an evaluation test case, send a POST request to /v2/gen-ai/evaluation_test_cases.

INSERT INTO digitalocean.genai.evaluation_test_cases (
    data__dataset_uuid,
    data__description,
    data__metrics,
    data__name,
    data__star_metric,
    data__workspace_uuid
)
SELECT
    '{{ dataset_uuid }}',
    '{{ description }}',
    '{{ metrics }}',
    '{{ name }}',
    '{{ star_metric }}',
    '{{ workspace_uuid }}'
RETURNING
    test_case_uuid
;

REPLACE examples

To update an evaluation test case, send a PUT request to /v2/gen-ai/evaluation_test_cases/{test_case_uuid}.

REPLACE digitalocean.genai.evaluation_test_cases
SET
    data__dataset_uuid = '{{ dataset_uuid }}',
    data__description = '{{ description }}',
    data__metrics = '{{ metrics }}',
    data__name = '{{ name }}',
    data__star_metric = '{{ star_metric }}',
    data__test_case_uuid = '{{ test_case_uuid }}'
WHERE
    test_case_uuid = '{{ test_case_uuid }}' -- required
RETURNING
    test_case_uuid,
    version;