🤖 LLM Gateway v2.0

Smart model routing with 5 available models

graph TB
    User[User Query] -->|smart routing| Gateway[LLM Gateway]
    Gateway -->|code keywords| Qwen[Qwen Coder 32B<br/>Code Specialist]
    Gateway -->|complex/long| Llama90[Llama 90B Vision<br/>Deep Analysis]
    Gateway -->|screenshot + errors| Kimi[Kimi K2.5<br/>Multimodal + Thinking]
    Gateway -->|fast vision| Llama11[Llama 11B Vision<br/>Quick Image]
    Gateway -->|simple queries| Ollama[Ollama Local<br/>qwen2.5:3b<br/>FREE]
    subgraph "NVIDIA API (50 calls/day)"
        Qwen
        Llama90
        Llama11
        Kimi
    end
    subgraph "Local (Unlimited)"
        Ollama
    end
    style Gateway fill:#667eea
    style Ollama fill:#10b981
    style Qwen fill:#f59e0b
    style Llama90 fill:#ef4444
    style Llama11 fill:#3b82f6
    style Kimi fill:#8b5cf6
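The routing in the diagram above can be sketched as a small rule-based classifier. This is a minimal sketch, not the gateway's actual code: the keyword list, the 400-character cutoff, the route labels, and the function name `pick_model` are all assumptions made for illustration. Only the five targets and their trigger descriptions come from the diagram.

```python
from typing import Optional

# Hypothetical keyword list; the gateway's real triggers are not documented here.
CODE_KEYWORDS = ("python", "javascript", "bash", "function", "debug", "traceback")
LONG_QUERY_CHARS = 400  # assumed cutoff for the "complex/long" edge


def pick_model(query: str, image_url: Optional[str] = None) -> str:
    """Return a route label for one of the five targets in the diagram."""
    text = query.lower()
    if image_url is not None:
        # An image plus error-related text goes to Kimi; other images take the fast path.
        if "error" in text:
            return "kimi-k2.5"
        return "llama-11b-vision"
    if any(word in text for word in CODE_KEYWORDS):
        return "qwen-coder-32b"
    if len(query) > LONG_QUERY_CHARS:
        return "llama-90b-vision"
    # Everything else stays on the free local model.
    return "ollama-local"
```

Checking the cheap local route last means every query that matches no rule costs nothing, which fits the "simple queries" edge in the diagram.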

🏠 Ollama Local (qwen2.5:3b)

FREE FAST

Best for: Fast simple queries

Speed: ⚡⚡⚡ 0.47s (warm)

Memory: 2.2GB GPU (persistent)

Cost: $0

  • Runs on Mac Mini
  • Model stays loaded in memory indefinitely (KEEP_ALIVE=-1)
  • Unlimited usage
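One way to keep the model resident is to pass `keep_alive` on each request to Ollama's HTTP API. The sketch below assumes the default local endpoint on port 11434 and a per-request setting; whether the gateway sets this per request or through an environment variable is not stated above. The helper names `build_request` and `ask_local` are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434/api/generate"  # Ollama's default local endpoint


def build_request(prompt: str, model: str = "qwen2.5:3b") -> urllib.request.Request:
    """Build (but do not send) a non-streaming generate request."""
    payload = {
        "model": model,
        "prompt": prompt,
        "stream": False,
        "keep_alive": -1,  # a negative value asks Ollama to keep the model loaded
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local(prompt: str) -> str:
    """Send the request and return the generated text (needs a running Ollama server)."""
    with urllib.request.urlopen(build_request(prompt), timeout=60) as resp:
        return json.loads(resp.read())["response"]
```

Keeping the model loaded is what makes a "warm" latency figure possible: the first call pays the load cost and later calls skip it.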

🧠 Kimi K2.5

NVIDIA MULTIMODAL

Best for: Screenshot debugging, reasoning

Speed: ⚡⚡ Fast

Features: Vision + text + thinking mode

  • Part of 50 daily NVIDIA calls
  • Excellent for error analysis

🦙💪 Llama 90B Vision

NVIDIA HUGE

Best for: Long documents, complex forms, deep analysis

Speed: ⚡ Medium

Size: 90 billion parameters!

  • Part of 50 daily NVIDIA calls
  • Maximum reasoning capability

🦙⚡ Llama 11B Vision

NVIDIA FAST

Best for: Quick image analysis

Speed: ⚡⚡ Fast

Multimodal: Images + text

  • Part of 50 daily NVIDIA calls
  • Balanced speed/quality

💻 Qwen Coder 32B

NVIDIA CODE SPECIALIST

Best for: Python, JavaScript, bash, debugging

Speed: ⚡⚡ Fast

Specialization: Code generation & review

  • Part of 50 daily NVIDIA calls
  • Auto-selected for code tasks
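Because the four NVIDIA-hosted models share one budget of 50 calls per day, something has to count calls and reset at midnight. The class below is a minimal in-memory sketch of such a counter. The name `DailyQuota` and its methods are hypothetical, and a real gateway would likely persist the count to disk so that it survives restarts.

```python
import datetime as dt
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DailyQuota:
    """In-memory counter for a shared daily call budget."""
    limit: int = 50  # the "50 calls/day" figure quoted for the NVIDIA API
    day: dt.date = field(default_factory=dt.date.today)
    used: int = 0

    def try_consume(self, today: Optional[dt.date] = None) -> bool:
        """Record one call if budget remains; return False when exhausted."""
        today = today or dt.date.today()
        if today != self.day:
            # A new day started: reset the counter.
            self.day, self.used = today, 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    @property
    def remaining(self) -> int:
        """Calls left for the day recorded at the last try_consume call."""
        return self.limit - self.used
```

A caller could fall back to the local Ollama model whenever `try_consume()` returns `False`. Whether the gateway actually does this is not stated here.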

🛠️ Essential CLI Tools

📧 Email - Himalaya

CLI email client via IMAP/SMTP

  • Account: gmail (configured)
  • List, read, send, search
  • No web browser needed

🎮 Google Workspace - gog

Gmail, Calendar, Drive, Sheets, Docs

  • OAuth browser flow
  • Currently needs auth setup
  • Full API access

📝 Notes - memo (Apple Notes)

Manage Apple Notes from CLI

  • Create, search, list
  • Native Mac integration
  • Instant access

✅ Tasks - Things 3

Manage Things 3 via CLI

  • Add tasks via URL scheme
  • Query local database
  • Search & list
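Adding a task through the URL scheme comes down to building a `things:///add` link with percent-encoded parameters. The sketch below shows only that URL construction. The helper name `things_add_url` is hypothetical, and how the CLI wrapper listed above builds its links is an assumption.

```python
from urllib.parse import quote, urlencode


def things_add_url(title: str, notes: str = "", when: str = "") -> str:
    """Build a Things 3 'add' URL; empty optional fields are left out."""
    params = {"title": title}
    if notes:
        params["notes"] = notes
    if when:
        params["when"] = when
    # quote_via=quote encodes spaces as %20 rather than '+'.
    return "things:///add?" + urlencode(params, quote_via=quote)
```

On macOS the resulting URL can be handed to the system, for example with `subprocess.run(["open", things_add_url("Buy milk", when="today")])`.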

🐙 GitHub - gh CLI

Official GitHub CLI

  • Issues, PRs, workflows
  • Authenticated
  • GraphQL API access

🧾 Summarize

Extract/summarize content

  • YouTube transcripts
  • PDFs, web pages
  • Multiple LLM backends

🔌 mcporter (MCP)

Model Context Protocol client

  • Call MCP servers/tools
  • HTTP + stdio transports
  • Extensible architecture
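Under either transport, an MCP tool call is a JSON-RPC 2.0 request with the method `tools/call`. The sketch below builds such a message as a newline-terminated line, as used on the stdio transport. It illustrates the protocol's message shape only, not mcporter's internals. The helper name and the tool `get_forecast` in the usage note are hypothetical, and a real session first needs the `initialize` handshake.

```python
import json
from itertools import count

_ids = count(1)  # simple source of unique request ids


def tool_call_message(tool: str, arguments: dict) -> str:
    """Serialize a JSON-RPC 2.0 'tools/call' request as one line."""
    msg = {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    }
    # The stdio transport delimits messages with newlines.
    return json.dumps(msg) + "\n"
```

For example, `tool_call_message("get_forecast", {"city": "Berlin"})` yields a single line that could be written to a server's stdin.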

💎 Gemini CLI

Google Gemini from terminal

  • One-shot Q&A
  • Summaries & generation
  • Used by research skill

💬 Messaging CLIs

iMessage, WhatsApp, Slack

  • imsg - iMessage/SMS
  • wacli - WhatsApp
  • slack - Slack API

🌤️ Weather

Current weather & forecasts

  • No API key required
  • wttr.in backend
  • Location-aware
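Since wttr.in needs no API key, a lookup is a plain HTTP GET. The sketch below assumes the one-line output format (`format=3`) and hypothetical helper names. Leaving the location empty lets wttr.in guess it from the caller's IP address, which is one way to get the location-aware behavior listed above.

```python
import urllib.request
from urllib.parse import quote


def wttr_url(location: str = "", fmt: str = "3") -> str:
    """Build a wttr.in URL; an empty location lets the service geolocate by IP."""
    return f"https://wttr.in/{quote(location)}?format={fmt}"


def fetch_weather(location: str = "") -> str:
    """Fetch the one-line weather summary (requires network access)."""
    with urllib.request.urlopen(wttr_url(location), timeout=10) as resp:
        return resp.read().decode("utf-8").strip()
```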

🎨 OpenAI Image Gen

DALL-E image generation

  • Batch generation
  • Gallery HTML output
  • Random prompt sampling

🎤 Whisper (STT)

Speech-to-text

  • Local CLI (free)
  • API version also available
  • High accuracy

📟 LLM Gateway Commands

Telegram Bot Commands

  • /ask <question> - Smart routing
  • /code <task> - Force Qwen Coder
  • /vision <url> - Force Llama 11B
  • /analyze <url> - Force Llama 90B
  • /screenshot <url> - Force Kimi
  • /think <question> - Deep reasoning
  • /usage - Check daily stats
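The bot commands above amount to a lookup table from command to forced model. The sketch below is one hedged way to parse an incoming message. The route labels and the function name `parse_command` are hypothetical. `/think` and `/usage` are left out because the list does not say which model `/think` uses, and `/usage` takes no argument.

```python
from typing import Optional, Tuple

# None means "no forced model": the query goes through smart routing.
COMMAND_ROUTES = {
    "/ask": None,
    "/code": "qwen-coder-32b",
    "/vision": "llama-11b-vision",
    "/analyze": "llama-90b-vision",
    "/screenshot": "kimi-k2.5",
}


def parse_command(message: str) -> Tuple[Optional[str], str]:
    """Split '/cmd rest of text' into (forced model or None, argument)."""
    command, _, argument = message.strip().partition(" ")
    if command not in COMMAND_ROUTES:
        raise ValueError(f"unknown command: {command}")
    return COMMAND_ROUTES[command], argument.strip()
```

For instance, `parse_command("/code reverse a list")` returns `("qwen-coder-32b", "reverse a list")`, while `/ask` returns `None` as the model so the router can decide.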

Direct CLI Usage

  • ~/dta/gateway/ask "query"
  • ~/dta/gateway/think-deep "problem"
  • ~/dta/gateway/analyze-screenshot "url"
  • ~/dta/gateway/llm-usage