Dashboard Guide — Inference

The Hypervize dashboard is the primary control plane for inference products during the MVP.

Accessing the Dashboard

After logging in you will land on the main resource overview. The left sidebar contains:

Fleet Overview — High-level view of all your resources (inference blocks today)
Inference — The main hub for Elastic playground + Dedicated deployment
Keys — API key management

The Inference page is the most important screen for launch.

Use the Elastic playground to evaluate quality, latency, and pricing before committing to production traffic.

The right-hand side (or dedicated sub-tab) is where you configure a Dedicated deployment.

The live pricing estimate updates as you type the model ID, fetching Hugging Face (HF) metadata when possible.

After clicking Deploy you receive an endpt-... ID immediately. The dedicated capacity is provisioned asynchronously in the background.
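Because provisioning is asynchronous, a script that deploys an endpoint usually has to wait before sending traffic. The sketch below shows one way to poll until the endpoint leaves PROVISIONING. It is not the Hypervize API: `get_status` is a hypothetical callable you would supply (for example, a wrapper around whatever status lookup is available to you), `endpt-example` is a placeholder ID, and treating ONLINE and FAILED as the only terminal statuses is an assumption.

```python
import time
from typing import Callable

# Status names come from the dashboard. Treating ONLINE and FAILED as the
# terminal states of a fresh deployment is an assumption of this sketch.
TERMINAL_STATUSES = {"ONLINE", "FAILED"}


def wait_until_ready(
    endpoint_id: str,
    get_status: Callable[[str], str],  # hypothetical: returns the current status string
    timeout_s: float = 900.0,
    poll_interval_s: float = 5.0,
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll get_status(endpoint_id) until it returns a terminal status.

    Returns the terminal status, or raises TimeoutError if the endpoint is
    still in a non-terminal status once timeout_s has elapsed.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        status = get_status(endpoint_id)
        if status in TERMINAL_STATUSES:
            return status
        if time.monotonic() >= deadline:
            raise TimeoutError(f"{endpoint_id} still {status} after {timeout_s}s")
        sleep(poll_interval_s)
```

A caller would check the returned value and only route traffic when it is ONLINE. The `sleep` parameter exists so the loop can be exercised in tests without real delays.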

Click any of your dedicated endpoints from the overview or the list on the Inference page to open its detail page.

Each block shows:

Current status with color coding (ONLINE, PROVISIONING, SLEEPING, WAKING UP, FAILED, etc.). Scale-to-zero endpoints show SLEEPING when hibernated and WAKING UP when a request wakes them.
Hourly burn rate (base + addons)
Configuration summary
Tabs: Overview, Logs, Playground
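The hourly burn rate shown on each block is described as base plus addons. As a worked illustration of that arithmetic, the sketch below sums a base rate and a set of addon rates. All prices and addon names are made up for the example and are not Hypervize pricing.

```python
from decimal import Decimal


def hourly_burn_rate(base: Decimal, addons: dict[str, Decimal]) -> Decimal:
    """Return base + sum of addon rates, all expressed per hour."""
    return base + sum(addons.values(), Decimal("0"))


# Hypothetical figures, for illustration only.
example_rate = hourly_burn_rate(
    Decimal("2.50"),
    {"addon-a": Decimal("0.10"), "addon-b": Decimal("0.25")},
)
example_daily_cost = example_rate * 24  # cost if the block runs for a full day
```

With these inputs `example_rate` is 2.85 per hour and `example_daily_cost` is 68.40. `Decimal` is used instead of `float` so that currency amounts add up exactly.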

The embedded playground on the detail page is pre-bound to that specific dedicated endpoint — extremely useful for validation after provisioning.
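If you record the statuses you see on a detail page, a small transition table can flag surprising sequences. The sketch below encodes only the transitions this guide mentions (PROVISIONING to ONLINE, ONLINE to SLEEPING, SLEEPING to WAKING UP, WAKING UP to ONLINE). The FAILED edges are an assumption, and the real lifecycle may include statuses or transitions not listed here.

```python
# Transitions inferred from this guide. Edges into FAILED are assumptions;
# the platform may define additional statuses and transitions.
ALLOWED_TRANSITIONS: dict[str, set[str]] = {
    "PROVISIONING": {"ONLINE", "FAILED"},
    "ONLINE": {"SLEEPING", "FAILED"},
    "SLEEPING": {"WAKING UP"},
    "WAKING UP": {"ONLINE", "FAILED"},
    "FAILED": set(),
}


def unexpected_transitions(history: list[str]) -> list[tuple[str, str]]:
    """Return consecutive (old, new) status pairs not in ALLOWED_TRANSITIONS.

    Repeated observations of the same status are ignored.
    """
    problems = []
    for old, new in zip(history, history[1:]):
        if old != new and new not in ALLOWED_TRANSITIONS.get(old, set()):
            problems.append((old, new))
    return problems
```

An empty result means the observed sequence matches the lifecycle described in this guide. A non-empty result is a prompt to check the Logs tab, not proof of a fault.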

API key management is located under Settings → Keys (also reachable from the sidebar).

Remember: every account starts with one default inference key.

Fleet Overview shows aggregate burn rate, the number of active inference blocks, and quick links.

This will expand in future releases as more compute products are added, but for the inference MVP it focuses on your endpoints.
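To make the two headline numbers concrete, the sketch below computes an aggregate burn rate and an active-block count from a list of blocks. The `Block` type, the sample rates, and the choice of which statuses count as active are all assumptions for illustration. The dashboard may define "active" differently.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class Block:
    endpoint_id: str      # placeholder IDs in the example below
    status: str           # one of the statuses shown on the dashboard
    hourly_rate: Decimal  # current hourly burn rate of this block


# Assumption: these statuses count as "active" for the summary.
ACTIVE_STATUSES = {"ONLINE", "PROVISIONING", "WAKING UP"}


def fleet_summary(blocks: list[Block]) -> dict[str, object]:
    """Sum the current hourly rates and count blocks in an active status."""
    total = sum((b.hourly_rate for b in blocks), Decimal("0"))
    active = sum(1 for b in blocks if b.status in ACTIVE_STATUSES)
    return {"aggregate_hourly_burn": total, "active_blocks": active}


# Hypothetical fleet, for illustration only.
example_summary = fleet_summary([
    Block("endpt-a", "ONLINE", Decimal("2.85")),
    Block("endpt-b", "PROVISIONING", Decimal("1.20")),
    Block("endpt-c", "SLEEPING", Decimal("0")),
])
```

For this sample fleet the aggregate hourly burn is 4.05 and two blocks count as active.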

Use the playground heavily before sending production traffic.
Keep the dedicated detail page open after deploying — watch status change from PROVISIONING to ONLINE. For scale-to-zero, you'll also see transitions to SLEEPING and WAKING UP.
Create environment-specific keys (e.g., “prod-website”, “internal-agents”).
Check the Logs tab first when debugging a dedicated endpoint.
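For the environment-specific keys tip, one common pattern is to keep each key in its own environment variable and select it by label at startup. The sketch below assumes that pattern. The environment variable names are hypothetical and are not something the dashboard sets for you. The labels reuse the examples from the tip.

```python
import os
from typing import Mapping

# Hypothetical mapping from a key label (as you named it in the dashboard)
# to the environment variable where you chose to store its secret value.
KEY_ENV_VARS = {
    "prod-website": "HYPERVIZE_KEY_PROD_WEBSITE",
    "internal-agents": "HYPERVIZE_KEY_INTERNAL_AGENTS",
}


def load_api_key(label: str, environ: Mapping[str, str] = os.environ) -> str:
    """Return the API key stored for the given label.

    Raises ValueError for an unknown label and RuntimeError if the
    corresponding environment variable is unset or empty.
    """
    try:
        var = KEY_ENV_VARS[label]
    except KeyError:
        raise ValueError(f"unknown key label: {label!r}") from None
    key = environ.get(var, "")
    if not key:
        raise RuntimeError(f"{var} is not set")
    return key
```

Failing fast on a missing variable keeps a misconfigured service from silently falling back to another environment's key.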