Smart Agent Benchmark Runs

Merged manual — this file documents two related pages in order: the benchmark runs list and the benchmark results list that opens from it.

In plain terms

A benchmark answers a practical question: "which model tier should I put this agent on?" It sends the same set of prompts to the agent across several tiers (Fast, Smart, Expert, …) and lays the results side by side — cost in credits, speed (latency), token usage, and the actual answers — so you can judge quality against cost.

The usual goal is to find the cheapest tier that still answers well enough for the job.

How to read it: the Runs list has one row per benchmark; drill into Results to compare each prompt across tiers. If a higher tier costs 3× the credits but gives the same answer as Smart, stay on Smart.
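The "cheapest tier that still answers well enough" rule can be sketched in a few lines of Python. This is only an illustration: the `TierSummary` type, the credit figures, and the `acceptable` flag are hypothetical. The benchmark does not score quality for you, so the flag stands for your own judgement after reading the answers.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class TierSummary:
    tier: str             # e.g. "Fast", "Smart", "Expert"
    total_credits: float  # credits summed over all prompts for this tier
    acceptable: bool      # your own verdict after reading the answers (assumption)


def cheapest_acceptable(summaries: list[TierSummary]) -> Optional[TierSummary]:
    """Return the lowest-cost tier judged good enough, or None if none qualifies."""
    candidates = [s for s in summaries if s.acceptable]
    if not candidates:
        return None
    return min(candidates, key=lambda s: s.total_credits)


# Made-up numbers: Expert costs 3x Smart but is no better for this job.
summaries = [
    TierSummary("Fast", 1.0, acceptable=False),
    TierSummary("Smart", 2.0, acceptable=True),
    TierSummary("Expert", 6.0, acceptable=True),
]

best = cheapest_acceptable(summaries)
print(best.tier if best else "no tier was good enough")
```

With these sample figures the sketch selects Smart, which matches the reading rule above.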


Smart Agent Benchmark Runs (list)

Page type: List
Source table: Benchmark Run
Object: Page 72778346 "SA Benchmark Runs QUA" (Page.72778346.SABenchmarkRuns.al)

This page lists every benchmark run that has been started. A benchmark run sends a named set of prompts to a Smart Agent across one or more model tiers so you can compare cost, latency, and response quality between tiers side by side. The page is read-only; runs are created by the benchmarking process itself, not directly on this page.

How to open it

  • Tell Me (Alt+Q) → search "Smart Agent Benchmark Runs".

Fields

List columns

Field | Type | Description
Run No. | Integer | The unique number identifying this benchmark run. Read-only; assigned automatically.
Source Agent Name | Text[100] | The name of the source agent that was benchmarked.
Prompt Set Name | Text[100] | The label identifying the prompt set used for this run.
Started At | DateTime | When the run was started.
Completed At | DateTime | When the run reached a terminal state.
Status | Enum | The current status of this benchmark run. Values: Not Started, Running, Completed, Failed, Cancelled.
Prompt Count | Integer | Number of prompts in the prompt set.
Tier Count | Integer | Number of model tiers exercised.
Total Credits | Decimal | Rolled-up credit cost across every result row. Blank when zero.
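As a rough illustration of what "rolled-up" and "blank when zero" mean for Total Credits, the sketch below sums the credits of a run's result rows and formats the total the way the column is described. The row layout (a dict with a `credits` key) and both function names are assumptions made for this example, not the page's actual implementation.

```python
from decimal import Decimal


def total_credits(rows: list[dict]) -> Decimal:
    """Sum the credits of every result row that belongs to one run."""
    return sum((row["credits"] for row in rows), Decimal("0"))


def blank_when_zero(value: Decimal) -> str:
    """Mimic a 'blank when zero' column: an empty string instead of 0."""
    return "" if value == 0 else str(value)


# Hypothetical result rows for a single run (values are made up).
rows = [
    {"prompt_index": 0, "model_tier": "Fast", "credits": Decimal("0.50")},
    {"prompt_index": 0, "model_tier": "Smart", "credits": Decimal("1.25")},
]

print(blank_when_zero(total_credits(rows)))
print(repr(blank_when_zero(total_credits([]))))
```

Using `Decimal` rather than `float` keeps the sum exact, which matters when many small credit amounts are added together.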

Actions

Action | What it does
View Results | Show every prompt / tier result for this run. Opens the Smart Agent Benchmark Results page filtered to this run.
Cancel Run | Mark this run as cancelled and remove any transient clones. Only visible to users who have execute permission on the benchmark runner.

Smart Agent Benchmark Results (list)

Page type: List
Source table: Benchmark Result
Object: Page 72778347 "SA Benchmark Results QUA" (Page.72778347.SABenchmarkResults.al)

Shows the individual result rows for a benchmark run — one row per prompt and model-tier combination. Use this page to compare latency, token usage, and credit cost across tiers for the same prompt, or to read the full agent response for any row. The page is read-only.

How to open it

  • View Results action on the Smart Agent Benchmark Runs list. Opens pre-filtered to the selected run.
  • Tell Me (Alt+Q) → search "Smart Agent Benchmark Results" (opens unfiltered).

Fields

List columns

Field | Type | Description
Prompt Index | Integer | Zero-based index of the prompt within the prompt set. Read-only.
Model Tier | Text[50] | The model tier this row was run against. Read-only.
Prompt | Text | First 200 characters of the prompt text sent to the agent. The full prompt is not shown in the UI; View Full Response shows the agent's reply, not the prompt. Read-only.
Latency (ms) | Integer | Wall-clock latency from submit to completion, in milliseconds. Blank when zero. Read-only.
Prompt Tokens | Integer | Input tokens reported by the backend. Blank when zero. Read-only.
Completion Tokens | Integer | Output tokens reported by the backend. Blank when zero. Read-only.
Credits | Decimal | Credits charged for this prompt / tier combination. Blank when zero. Read-only.
Status | Enum | Lifecycle status of this row. Values: Not Started, Running, Completed, Failed, Cancelled. Read-only.
Error Text | Text[2048] | Captured error message when the row failed. Read-only.
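Because the list holds one row per prompt and tier, comparing tiers for the same prompt means grouping rows by Prompt Index. The sketch below shows one way to do that with exported rows. The dict keys mirror the column names but are assumptions for this example, and the figures are made up.

```python
from collections import defaultdict


def pivot_by_prompt(rows: list[dict]) -> dict[int, dict[str, dict]]:
    """Group result rows as {prompt_index: {model_tier: row}}."""
    table: dict[int, dict[str, dict]] = defaultdict(dict)
    for row in rows:
        table[row["prompt_index"]][row["model_tier"]] = row
    return dict(table)


# Hypothetical result rows: one per prompt / tier combination.
rows = [
    {"prompt_index": 0, "model_tier": "Fast", "latency_ms": 800, "credits": 0.5},
    {"prompt_index": 0, "model_tier": "Smart", "latency_ms": 1500, "credits": 1.0},
    {"prompt_index": 1, "model_tier": "Fast", "latency_ms": 650, "credits": 0.25},
    {"prompt_index": 1, "model_tier": "Smart", "latency_ms": 1400, "credits": 1.0},
]

pivot = pivot_by_prompt(rows)
for prompt_index in sorted(pivot):
    for tier, row in pivot[prompt_index].items():
        print(prompt_index, tier, row["latency_ms"], row["credits"])
```

Each inner dict then holds the latency, token, and credit figures for one prompt across all tiers.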

Actions

Action | What it does
View Full Response | Open the full agent response for this row in a read-only dialog. Displays a message if no response was captured.

Notes

  • Cancel Run on the benchmark runs list is only visible to users who hold execute permission on the benchmark runner. For all other users the action is hidden.
  • The Prompt column on the results page shows only the first 200 characters of the prompt. View Full Response shows the agent's complete reply, not the full prompt. The untruncated prompt text is stored in the underlying record but is not separately exposed in the UI.
  • Benchmark runs create temporary copies (clones) of the source agent — one per model tier. Cancel Run removes these clones in addition to marking the run as cancelled.
  • The Status values Not Started, Running, Completed, Failed, and Cancelled apply to both the run header and each individual result row independently.
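The shared status values can be modelled as a small enum, sketched below. Treating Completed, Failed, and Cancelled as the terminal states is an assumption based on the Completed At description; the manual does not list them explicitly. The helper names are hypothetical.

```python
from collections import Counter
from enum import Enum


class Status(Enum):
    NOT_STARTED = "Not Started"
    RUNNING = "Running"
    COMPLETED = "Completed"
    FAILED = "Failed"
    CANCELLED = "Cancelled"


# Assumption: these three are the states after which nothing more happens.
TERMINAL_STATUSES = frozenset({Status.COMPLETED, Status.FAILED, Status.CANCELLED})


def is_terminal(status: Status) -> bool:
    """True when a run or row is in one of the assumed terminal states."""
    return status in TERMINAL_STATUSES


# The run header and each result row carry their own status.
run_status = Status.RUNNING
row_statuses = [Status.COMPLETED, Status.FAILED, Status.RUNNING, Status.NOT_STARTED]

counts = Counter(row_statuses)
print(is_terminal(run_status), counts[Status.FAILED])
```

Counting row statuses separately from the run status is a quick way to spot failed rows while the run itself is still in progress.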