Discussion
Talderigi: Curious how the semantic caching layer works. Are you embedding requests on the gateway side and doing a vector similarity lookup before proxying? And if so, how do you handle cache invalidation when the underlying model changes or gets updated?
giorgi_pro: Hey, contributor here. That's right, GoModel embeds requests and does a vector similarity lookup before proxying. Regarding cache invalidation, there is no "purging" involved – the model is part of the cache namespace (params_hash includes the LLM model, path, guardrails hash, etc.), so a model change simply misses the old entries. TTL takes care of the cleanup later.
tahosin: This is really useful. I've been building an AI platform (HOCKS AI) where I route different tasks to different providers — free OpenRouter models for chat/code gen, Gemini for vision tasks. The biggest pain point has been exactly what you describe: switching models without changing app code.

One thing I'd love to see is built-in cost tracking per model/route. When you're mixing free and paid models, knowing exactly where your spend goes is critical. Do you have plans for that in the dashboard?
indigodaddy: Any plans for AI provider subscription compatibility? Eg ChatGPT, GH Copilot etc ? (Ala opencode)
santiago-pl: This comment looks AI-generated. However, IIUC what you're asking for, it's already in the dashboard! Check the Usage page.