Picking the right Smart Agents model tier for the job

Fast, Smart, Expert, Premium. Four tiers in the Smart Agent enum, with multipliers from 0.5x to 4x in the credit management module. When each one earns its keep, with the cost-vs-quality tradeoff in numbers from real BC pilots.

By Smart Agents Team

Smart Agents offers four model tiers because not every BC question deserves the same model. A routine "who are my top customers" question does not need the most powerful reasoning model. A cashflow forecast that has to weigh seasonality and unpaid invoices does. The default tier on a fresh agent is Smart for a good reason, but the right tier depends on the agent and you should not leave the decision to defaults if you care about the bill.

The tiers are declared in the AL enum Smart Agent AI Model Type QUA and stored in the AI Model Type field on each Smart Agent record. The runtime credit cost is computed in Smart Agent Credit Mgr. QUA, procedure estimate usage, lines 20 through 58. We will quote the multipliers from the source so you can verify them yourself.

THE FOUR TIERS

Fast (Silver) maps to our fastest model deployment in the QUALIA backend. The credit multiplier is 0.5 against the Smart baseline. It is the right tier for short, routine, structured queries: "list the top ten customers by revenue this quarter," "how many open POs do we have for vendor V0010," "what is the on-hand quantity for item ITM-1234 across all locations." It is also the right tier for high-volume background jobs like email-triage replies that follow a template.

Smart (Gold) maps to our standard model and is the baseline (multiplier 1.0). It is what you should leave most agents on by default. It handles multi-step lookups well, it follows system prompts reliably, and it produces clean structured tool-call payloads. About eight out of ten of our pilot tenants ran Smart for everything except their cashflow agent and their compliance agent.

Expert (Platinum) maps to our advanced model with a 2x multiplier. The right use case is reasoning over messy data: "why did margin drop in the East region," "summarize the changes in this 80-page contract," "reconcile this bank statement to G/L when three of the line items do not match." Expert is more patient than Smart with ambiguity. It is twice the cost.

Premium (Frontier) maps to our premium model with a 4x multiplier. We named it Premium not because we are proud of it but because we want users to feel the cost when they pick it. The right use case is genuinely novel reasoning that the cheaper tiers either get wrong or refuse to answer well: a multi-quarter cashflow forecast across companies, a compliance review against a 200-page regulation document, a written analysis that has to weigh trade-offs and produce a defensible recommendation. We see well under 5% of pilot traffic land here.

THE COST IN ACTUAL NUMBERS

The base rate at the Smart tier is one credit per ten thousand input tokens and one credit per five thousand output tokens. Output is twice as expensive as input because that is true everywhere; we did not invent that. There is also a small minimum credit charge per action, set in the estimate usage procedure on line 42.
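The rate and the multipliers combine into a one-line formula. The Python sketch below restates that arithmetic for back-of-the-envelope estimates; it is not the AL code from the credit manager. The names Tier and estimate_credits are made up for illustration, and because the post does not state the minimum charge, min_charge is a placeholder that defaults to zero. Applying the minimum after the multiplier is also an assumption.

```python
from enum import Enum


class Tier(Enum):
    """Credit multipliers as quoted in this post (Smart is the baseline)."""
    FAST = 0.5
    SMART = 1.0
    EXPERT = 2.0
    PREMIUM = 4.0


# Base rate at the Smart tier, as stated in the post.
INPUT_TOKENS_PER_CREDIT = 10_000
OUTPUT_TOKENS_PER_CREDIT = 5_000


def estimate_credits(input_tokens, output_tokens, tier, min_charge=0.0):
    """Estimate the credit cost of one action.

    min_charge is a placeholder: the real minimum lives in the AL source
    and is not quoted here, so it defaults to 0.0 (assumption).
    """
    base = (input_tokens / INPUT_TOKENS_PER_CREDIT
            + output_tokens / OUTPUT_TOKENS_PER_CREDIT)
    return max(base * tier.value, min_charge)


# A short lookup at the Smart tier: 1,800 input tokens, 600 output tokens.
print(f"{estimate_credits(1_800, 600, Tier.SMART):.2f} credits")
```

To use the sketch for budgeting, replace the token counts with averages from your own log and pass the real minimum charge once you have read it from the source.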

A short structured lookup, say "top ten customers by revenue this quarter," runs roughly 1,800 input tokens and 600 output tokens. Smart cost: 1,800/10,000 + 600/5,000 = 0.30 credits, about a third of a euro cent. Run it on Fast and the cost is 0.30 x 0.5 = 0.15 credits. Run it on Premium and you pay 0.30 x 4.0 = 1.20 credits, four times the price for an answer that the cheap tier would have returned correctly. Premium for a top-customers query is waste.

A messy reconciliation, say a bank statement with 60 lines and three mismatches, runs roughly 6,000 input tokens and 1,500 output tokens. Smart cost: 6,000/10,000 + 1,500/5,000 = 0.90 credits. Expert cost: 0.90 x 2.0 = 1.80 credits. Premium cost: 0.90 x 4.0 = 3.60 credits. Whether Expert is the right call depends on whether Smart actually produces a usable answer; in our pilots it gets reconciliation right about three out of four times, and Expert closes most of the remaining gap. Premium does not improve on Expert often enough to justify another 2x.

A cashflow forecast across three months: 12,000 input tokens, 3,500 output tokens. Smart cost: 12,000/10,000 + 3,500/5,000 = 1.9 credits. Premium cost: 1.9 x 4.0 = 7.6 credits, around eight cents. The forecast is the kind of multi-factor reasoning that genuinely benefits from the larger frontier model. We have seen Smart get cashflow forecasts plausibly wrong in ways that would have been embarrassing if a CFO had read them; we have not seen that with Premium. This is one of the few workflows where Premium pays for itself.
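To check the three worked examples side by side, here is a minimal Python sketch that recomputes them at every tier. The token counts and multipliers come from the text above; the dictionary layout and the function name credits_by_tier are illustrative only, and the minimum per-action charge is ignored.

```python
# Multipliers and base rate as quoted in this post.
MULTIPLIERS = {"Fast": 0.5, "Smart": 1.0, "Expert": 2.0, "Premium": 4.0}

# (input tokens, output tokens) for the three examples above.
SCENARIOS = {
    "top-ten customers lookup": (1_800, 600),
    "bank reconciliation": (6_000, 1_500),
    "cashflow forecast": (12_000, 3_500),
}


def credits_by_tier(input_tokens, output_tokens):
    """Return the estimated credit cost at each tier, rounded to 2 decimals."""
    base = input_tokens / 10_000 + output_tokens / 5_000
    return {tier: round(base * mult, 2) for tier, mult in MULTIPLIERS.items()}


for name, (tokens_in, tokens_out) in SCENARIOS.items():
    costs = credits_by_tier(tokens_in, tokens_out)
    line = ", ".join(f"{tier} {cost:.2f}" for tier, cost in costs.items())
    print(f"{name}: {line}")
```

The gap between tiers is a fixed ratio, so the absolute difference only becomes noticeable on long prompts or high call volumes.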

THE DECISION RULE

Use Fast for high-volume, structured, routine queries where the answer is obviously right or obviously wrong and the model just has to follow a template. Email triage replies, top-N reports, stock lookups, balance-and-aging summaries.

Use Smart for everything that involves more than one tool call, any multi-step lookup, or any user-facing draft (a customer email, a quote draft, a process documentation page). This is the default for a reason.

Use Expert when Smart visibly degrades. The litmus test is: have a power user run the same prompt three times on Smart and three times on Expert, and read the answers. If Expert is materially better, switch the agent. If Smart is fine, leave it.

Use Premium when the cost of a wrong answer dwarfs the credit cost of the right one. Cashflow forecasts that go to the board, compliance reviews that go to the auditor, contract reviews that go to legal. The break-even is high but it exists.
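The four rules can be written as a small function for use in a review checklist. This is a sketch, not product logic: the function name, the three yes/no inputs, and the order in which the rules are checked are all assumptions made for illustration.

```python
def suggest_tier(*, high_stakes, smart_visibly_degrades, routine_templated):
    """Suggest a tier from three yes/no judgments about an agent's workload.

    The precedence (Premium, then Expert, then Fast, else Smart) is an
    assumption of this sketch; the post states the rules, not their order.
    """
    if high_stakes:
        # A wrong answer costs far more than the credits: board, auditor, legal.
        return "Premium"
    if smart_visibly_degrades:
        # The side-by-side test showed Expert is materially better.
        return "Expert"
    if routine_templated:
        # High-volume, structured, obviously right or obviously wrong.
        return "Fast"
    # Multi-step lookups and user-facing drafts stay on the default.
    return "Smart"


print(suggest_tier(high_stakes=False, smart_visibly_degrades=False,
                   routine_templated=True))  # prints "Fast"
```

The inputs are judgments a person makes after reading real answers, so the function only records the decision. It does not replace the three-by-three comparison.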

WHERE TO SET IT

The per-agent default sits on the Smart Agent record. Open the Smart Agent card, pick the AI Model Type field, choose Fast, Smart, Expert, or Premium, and the next message routes to the appropriate backend deployment. There is no advance approval required: the change takes effect on the next message. The audit log records the model tier on every tool call so you can compare cost and outcome historically.

The single biggest cost-control move we see in pilots is to leave Smart as the tenant default and only promote agents to Expert or Premium when the cheaper tier produces visibly worse answers. That one decision typically halves the monthly credit burn relative to a tenant that defaults everything to Premium because it sounds more impressive. The math is in the related blog post on credits, and the model deployments are documented in the backend reference.

A REAL DECISION FROM A WHOLESALE PILOT

A wholesale distribution pilot ran for six weeks before we made any tier changes. They started with everything on Smart, including their cashflow and reconciliation agents, and burned through their first 500-credit pool in eight days. Reasonable burn for a pilot.

We analyzed the SA Tool Call Log and saw three patterns. Their email-triage agent (built from the Sales Invoice Manager template) was running 200 calls a day on Smart and producing identical-quality drafts to a Fast-tier test we ran in parallel. Switched it to Fast: cost dropped 50% on that workflow, no quality complaints.

Their Bank Reconciliation Agent (from template code BNK in the extension's templates, line 116) was getting matches right on roughly 75% of the messy lines on Smart, leaving the AR team to manually reconcile the rest. Promoted it to Expert: match rate went to 92%, and the time the AR team spent on the residual mismatches dropped enough that the 2x credit cost paid back in roughly two weeks of operator time.

Their custom cashflow agent stayed on Smart for two more weeks while we watched. The forecasts were reasonable but had a recurring tendency to under-weight one customer's seasonal pattern. Promoted to Premium for the monthly forecast run only (not for the daily quick-look variant). The forecasts improved noticeably; the additional credit cost was a few euros a month.

Net: by the end of week eight, the pilot's monthly credit burn had dropped roughly 30% relative to the all-Smart baseline, and the CFO was happier with the cashflow forecast. The right-tier-per-job decision is not a one-time configuration. It is a quarterly review of the audit log and a small amount of A/B comparison.
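For the quarterly review, one approach is to export the log and total the credits per agent and tier. The Python sketch below assumes a hypothetical export in which each row has agent, tier, and credits keys. The real SA Tool Call Log columns are not listed in this post, so adjust the keys to match your export. The sample rows are invented to show the shape of the data and are not pilot figures.

```python
from collections import defaultdict


def credits_by_agent_and_tier(rows):
    """Sum credits per (agent, tier) pair.

    Each row is a dict with hypothetical keys "agent", "tier", "credits".
    """
    totals = defaultdict(float)
    for row in rows:
        totals[(row["agent"], row["tier"])] += row["credits"]
    return dict(totals)


# Invented sample rows, for illustration only.
sample_rows = [
    {"agent": "email-triage", "tier": "Smart", "credits": 0.30},
    {"agent": "email-triage", "tier": "Fast", "credits": 0.15},
    {"agent": "email-triage", "tier": "Fast", "credits": 0.15},
    {"agent": "cashflow", "tier": "Premium", "credits": 7.60},
]

for (agent, tier), total in sorted(credits_by_agent_and_tier(sample_rows).items()):
    print(f"{agent:<14} {tier:<8} {total:.2f}")
```

The totals show where the credits go. Whether the answers were worth it still takes a person reading a sample of them.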

See /docs/user-guide for the chat workflow, and the FAQ entry on which model tiers are available for the per-tier specs in one place.