Smart Agent Benchmark Runs

Merged manual — this file documents two related pages in order: the benchmark runs list and the benchmark results list that opens from it.

In plain terms

A benchmark answers a practical question: "which model tier should I put this agent on?" It sends the same set of prompts to the agent across several tiers (Fast, Smart, Expert, …) and lays the results side by side — cost in credits, speed (latency), token usage, and the actual answers — so you can judge quality against cost.

It's a side-by-side comparison: run the same prompts on each tier and see which performs best for the price. The usual goal is to find the cheapest tier that still answers well enough for the job.

How to read it: the Runs list is one row per benchmark; drill into Results to compare each prompt across tiers. A higher tier that costs 3× the credits but gives the same answer as Smart is telling you to stay on Smart.
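As a rough illustration of that rule of thumb, the sketch below picks the cheapest tier among those a reviewer has already judged good enough. The tier names are the ones mentioned above; the credit figures, the `acceptable` set, and the function name `cheapest_acceptable_tier` are invented for the example and are not part of the product.

```python
def cheapest_acceptable_tier(credits_by_tier, acceptable):
    """Return the lowest-cost tier among those judged acceptable, or None."""
    # Keep only tiers whose answers a human reviewer marked as good enough.
    candidates = {t: c for t, c in credits_by_tier.items() if t in acceptable}
    if not candidates:
        return None
    # Pick the tier with the smallest credit cost.
    return min(candidates, key=candidates.get)


# Hypothetical per-tier credit totals for one benchmark run.
credits_by_tier = {"Fast": 1.0, "Smart": 2.0, "Expert": 6.0}
# Hypothetical review outcome: Fast was not good enough, Smart and Expert were.
acceptable = {"Smart", "Expert"}

choice = cheapest_acceptable_tier(credits_by_tier, acceptable)
print(choice)
```

Here Expert costs 3× the credits of Smart for answers judged equally acceptable, so the sketch settles on Smart. The quality judgement itself stays with the reader; the code only does the cost comparison.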

Smart Agent Benchmark Runs (list)

Page type: List
Source table: Benchmark Run
Object: Page 72778346 "SA Benchmark Runs QUA" (Page.72778346.SABenchmarkRuns.al)

This page lists every benchmark run that has been started. A benchmark run sends a named set of prompts to a Smart Agent across one or more model tiers so you can compare cost, latency, and response quality between tiers side by side. The page is read-only; runs are created by the benchmarking process itself, not directly on this page.

How to open it

Tell Me (Alt+Q) → search "Smart Agent Benchmark Runs".

Fields

List columns

Field	Type	Description
Run No.	Integer	The unique number identifying this benchmark run. Read-only — assigned automatically.
Source Agent Name	Text[100]	The name of the source agent that was benchmarked.
Prompt Set Name	Text[100]	The label identifying the prompt set used for this run.
Started At	DateTime	When the run was started.
Completed At	DateTime	When the run reached a terminal state.
Status	Enum	The current status of this benchmark run. Values: Not Started, Running, Completed, Failed, Cancelled.
Prompt Count	Integer	Number of prompts in the prompt set.
Tier Count	Integer	Number of model tiers exercised.
Total Credits	Decimal	Rolled-up credit cost across every result row. Blank when zero.
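How the count columns and Total Credits relate to the underlying result rows can be sketched as follows. This assumes a run that finished with one result row per prompt and tier combination; the row dictionaries, key names, and the `summarize` function are hypothetical stand-ins for data copied out of the pages, not the product's own code.

```python
from decimal import Decimal

# Hypothetical result rows for one run: 2 prompts x 2 tiers.
rows = [
    {"prompt_index": 0, "tier": "Fast", "credits": Decimal("0.5")},
    {"prompt_index": 0, "tier": "Smart", "credits": Decimal("1.5")},
    {"prompt_index": 1, "tier": "Fast", "credits": Decimal("0.25")},
    {"prompt_index": 1, "tier": "Smart", "credits": Decimal("1.25")},
]


def summarize(rows):
    """Derive (prompt count, tier count, total credits) from result rows."""
    prompt_count = len({r["prompt_index"] for r in rows})
    tier_count = len({r["tier"] for r in rows})
    # Roll the per-row credits up into a single total for the run.
    total_credits = sum((r["credits"] for r in rows), Decimal("0"))
    return prompt_count, tier_count, total_credits


summary = summarize(rows)
print(summary)
```

Under that assumption the number of result rows equals Prompt Count multiplied by Tier Count, which gives a quick sanity check when a total looks unexpectedly high or low.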

Actions

Action	What it does
View Results	Show every prompt / tier result for this run. Opens the Smart Agent Benchmark Results page filtered to this run.
Cancel Run	Mark this run as cancelled and remove any transient clones. Only visible to users who have execute permission on the benchmark runner.

Smart Agent Benchmark Results

Smart Agent Benchmark Results (list)

Page type: List
Source table: Benchmark Result
Object: Page 72778347 "SA Benchmark Results QUA" (Page.72778347.SABenchmarkResults.al)

Shows the individual result rows for a benchmark run — one row per prompt and model-tier combination. Use this page to compare latency, token usage, and credit cost across tiers for the same prompt, or to read the full agent response for any row. The page is read-only.

How to open it

View Results action on the Smart Agent Benchmark Runs list. Opens pre-filtered to the selected run.
Tell Me (Alt+Q) → search "Smart Agent Benchmark Results" (opens unfiltered).

Fields

List columns

Field	Type	Description
Prompt Index	Integer	Zero-based index of the prompt within the prompt set. Read-only.
Model Tier	Text[50]	The model tier this row was run against. Read-only.
Prompt	Text	First 200 characters of the prompt text sent to the agent. The full prompt is not shown on this page; View Full Response opens the agent's response, not the prompt. Read-only.
Latency (ms)	Integer	Wall-clock latency from submit to completion, in milliseconds. Blank when zero. Read-only.
Prompt Tokens	Integer	Input tokens reported by the backend. Blank when zero. Read-only.
Completion Tokens	Integer	Output tokens reported by the backend. Blank when zero. Read-only.
Credits	Decimal	Credits charged for this prompt / tier combination. Blank when zero. Read-only.
Status	Enum	Lifecycle status of this row. Values: Not Started, Running, Completed, Failed, Cancelled. Read-only.
Error Text	Text[2048]	Captured error message when the row failed. Read-only.
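Comparing tiers for the same prompt amounts to grouping the rows by Prompt Index. The sketch below shows one way to do that on rows copied out of the page. The dictionary keys, sample values, and the `by_prompt` function are assumptions made for illustration; `None` stands in for a blank cell and is treated as zero, following the "Blank when zero" note in the table.

```python
from collections import defaultdict

# Hypothetical result rows; None represents a blank cell on the page.
rows = [
    {"prompt_index": 0, "tier": "Fast", "latency_ms": 800, "credits": 0.5, "status": "Completed"},
    {"prompt_index": 0, "tier": "Smart", "latency_ms": 2100, "credits": 1.5, "status": "Completed"},
    {"prompt_index": 1, "tier": "Fast", "latency_ms": None, "credits": None, "status": "Failed"},
    {"prompt_index": 1, "tier": "Smart", "latency_ms": 1900, "credits": 1.25, "status": "Completed"},
]


def by_prompt(rows):
    """Group rows as {prompt_index: {tier: metrics}} for side-by-side reading."""
    table = defaultdict(dict)
    for r in rows:
        table[r["prompt_index"]][r["tier"]] = {
            "latency_ms": r["latency_ms"] or 0,  # blank cell -> 0
            "credits": r["credits"] or 0,        # blank cell -> 0
            "status": r["status"],
        }
    return dict(table)


comparison = by_prompt(rows)
for index in sorted(comparison):
    for tier, metrics in comparison[index].items():
        print(index, tier, metrics["status"], metrics["latency_ms"], metrics["credits"])
```

Keeping Status next to the numbers matters here: a zero for a failed row means nothing was measured, so it should not be read as a fast or cheap result.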

Actions

Action	What it does
View Full Response	Open the full agent response for this row in a read-only dialog. Displays a message if no response was captured.

Notes

Cancel Run on the benchmark runs list is only visible to users who hold execute permission on the benchmark runner. For all other users the action is hidden.
The Prompt column on the results page is truncated to 200 characters for display. Use View Full Response to read the complete agent reply; note that this action shows the response, not the full prompt. If you need the untruncated prompt text, it is stored in the underlying record and is not separately exposed in the UI.
Benchmark runs create temporary copies (clones) of the source agent — one per model tier. Cancel Run removes these clones in addition to marking the run as cancelled.
The Status values Not Started, Running, Completed, Failed, and Cancelled apply to both the run header and each individual result row independently.
