
Model Hub Dashboard

Discover, download, and manage Large Language Models

Overview

The Model Hub provides a comprehensive interface for discovering, downloading, and managing Large Language Models (LLMs). It integrates with multiple sources including local Ollama (auto-managed), HuggingFace Hub, Google Vertex AI, and cloud APIs.

Supported Sources

Ollama, HuggingFace, Cloud APIs

Hardware Check

Pre-flight VRAM validation

Download Progress

Real-time tracking

Access URL

/models

Interface Layout

Local Models Tab

Browse and manage locally-hosted models via Ollama integration.

Features:

  • HuggingFace search for GGUF models
  • Installed Ollama inventory
  • Active download progress
  • Quantization selection

Example Interface:

┌─────────────────────────────────────┐
│ [Search HuggingFace...]        [🔍] │
├─────────────────────────────────────┤
│ [llama3:7b]      [Pull] [Available] │
│ [deepseek-r1:32b]   [✓]    [Active] │
│ [qwen3:72b]   [▼ 45%] [Downloading] │
└─────────────────────────────────────┘

API Catalog Tab

Discover and use models from cloud API providers.

Supported Providers:

  • Google (Gemini 2.5, Gemini 2.0)
  • OpenAI (GPT-4o, GPT-4o-mini)
  • Anthropic (Claude 3.5+)
  • OpenRouter (400+ models)

Provider Status:

✓ Google (Active)

✓ OpenAI (Active)

⚠ Anthropic (No Key)

Model Details Panel

When you select a model, a detailed sidebar appears showing model specifications and hardware compatibility.

Model Info

Name: llama-4-scout
Size: 8.9 GB
Quant: Q4_K_M
Context: 32K tokens

Pre-flight Check

VRAM OK — 78% utilization
Warning — 94% utilization

Available Quants

Q3, Q4, Q6, Q8

Suggested for your GPU: Q4_K_M
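The pre-flight check above can be sketched as a simple utilization estimate. The dashboard's actual formula is not documented, so the overhead factor below (covering KV cache, activations, and runtime buffers on top of the raw weights) is an assumption for illustration only.

```python
# Hypothetical sketch of the pre-flight VRAM check. The 1.2x overhead
# factor is an assumption, not the dashboard's actual formula.

def vram_utilization(model_size_gb: float, gpu_vram_gb: float,
                     overhead_factor: float = 1.2) -> float:
    """Estimate GPU VRAM utilization (%) for a quantized model."""
    return round(model_size_gb * overhead_factor / gpu_vram_gb * 100, 1)

def preflight(model_size_gb: float, gpu_vram_gb: float) -> str:
    pct = vram_utilization(model_size_gb, gpu_vram_gb)
    if pct <= 85:
        return f"VRAM OK: {pct:.0f}% utilization"
    if pct <= 100:
        return f"Warning: {pct:.0f}% utilization"
    return f"Insufficient VRAM: {pct:.0f}% required"

# The 8.9 GB Q4_K_M model from the panel above, on a hypothetical 16 GB GPU:
print(preflight(8.9, 16.0))
```

When the check reports a warning or insufficient VRAM, stepping down one quantization level (e.g. Q4_K_M to Q3) shrinks the model size and brings the estimate back under budget.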

Quick Start Guide

Step 1: Discover Models

  1. Navigate to /models
  2. Click the API Catalog tab to see cloud models
  3. Switch to Local Models tab for local/HuggingFace search
  4. Search for models like "deepseek-r1" or "llama-4"
Step 2: Check Hardware Compatibility

  1. Select any model from the list
  2. Click the Check VRAM button
  3. Review the pre-flight check results
  4. If VRAM is insufficient, try a lower quantization
Step 3: Download & Use

  1. Click Pull Model on the desired model
  2. Watch the progress bar track the download
  3. Once complete, go to the Chat page
  4. Click the Quick Model Swap FAB
  5. Select the newly downloaded model
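The real-time progress bar presumably consumes Ollama's streaming pull endpoint (`POST /api/pull`), which emits newline-delimited JSON status events. As an illustration, a single streamed line can be turned into a percentage like this; the field names (`status`, `total`, `completed`) follow Ollama's documented response shape, but treat this as a sketch rather than the dashboard's actual code.

```python
import json
from typing import Optional

# Sketch: convert one NDJSON status line from Ollama's streaming
# pull API into a download percentage.

def pull_progress(ndjson_line: str) -> Optional[float]:
    """Return the download percentage for one streamed status line,
    or None when the line carries no byte counts (e.g. a bare
    "verifying sha256 digest" status)."""
    event = json.loads(ndjson_line)
    total = event.get("total")
    completed = event.get("completed", 0)
    if not total:
        return None
    return round(completed / total * 100, 1)

# A hypothetical mid-download event, matching the 45% qwen3:72b row above:
line = '{"status":"pulling layers","total":8900000000,"completed":4005000000}'
print(pull_progress(line))  # 45.0
```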

Search Tips

Recommended Models

  • deepseek-coder - Coding models
  • qwen3.5 - Qwen models
  • llama-4 - Latest Llama
  • gemma-3 - Lightweight

Search Filters

  • Auto-filters GGUF format only
  • Excludes sharded models
  • Excludes vision models (mmproj)
  • Shows available quantizations
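The filter rules above amount to a predicate over repository filenames. This is a minimal sketch; the shard-suffix and `mmproj` naming patterns are assumptions based on common GGUF conventions, not the dashboard's actual implementation.

```python
import re

# Sketch of the search filters listed above, applied to repo filenames.
# Shard/mmproj patterns are assumed from common GGUF naming conventions.

SHARD_RE = re.compile(r"-\d{5}-of-\d{5}\.gguf$")

def is_searchable_gguf(filename: str) -> bool:
    name = filename.lower()
    if not name.endswith(".gguf"):
        return False              # GGUF format only
    if SHARD_RE.search(name):
        return False              # exclude sharded models
    if "mmproj" in name:
        return False              # exclude vision projector files
    return True

files = [
    "llama-4-scout.Q4_K_M.gguf",        # kept
    "qwen3-72b-00001-of-00002.gguf",    # sharded, excluded
    "mmproj-model-f16.gguf",            # vision projector, excluded
    "README.md",                        # not GGUF, excluded
]
print([f for f in files if is_searchable_gguf(f)])
```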

Ready to Configure?

Learn how to set up local LLMs and cloud API providers with detailed configuration guides.