Servers and GPU Infrastructure
Minos operates a self-hosted Large Language Model (LLM) ecosystem built on models such as DeepSeek R3 and Mistral, each selected for task-specific trading needs. These models drive three core capabilities: conversational interaction, enabling real-time trader queries and strategy adjustments; complex reasoning, powering the Technical Analyst Agent’s signal generation and the Risk Manager’s VaR calculations; and advanced classification, supporting the Sentiment Agent’s bullish/bearish classifications at 92% accuracy.
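The classification step can be sketched as a constrained prompt plus a tolerant label parser. The prompt template, label set, and `parse_sentiment` helper below are illustrative assumptions, not Minos internals; the completion itself would come from the self-hosted model's inference endpoint.

```python
# Illustrative sketch: prompt construction and response parsing for a
# bullish/bearish sentiment classifier backed by a self-hosted LLM.
# The prompt wording, label set, and parsing rules are assumptions
# made for illustration, not the actual Minos implementation.

LABELS = ("bullish", "bearish", "neutral")

def build_prompt(headline: str) -> str:
    """Wrap a headline in a single-word classification prompt."""
    return (
        "Classify the market sentiment of the following headline as "
        "exactly one word: bullish, bearish, or neutral.\n"
        f"Headline: {headline}\nSentiment:"
    )

def parse_sentiment(model_output: str) -> str:
    """Map a raw model completion to one of the known labels,
    defaulting to 'neutral' when the output is unrecognized."""
    text = model_output.strip().lower()
    for label in LABELS:
        if text.startswith(label):
            return label
    return "neutral"
```

Constraining the model to a fixed label set keeps downstream signal handling deterministic even when the model's free-form output varies.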
Our infrastructure runs on RunPod’s scalable GPU resources within a secure Docker environment. This lets Minos dynamically scale GPU instances to absorb real-time market volatility, keeping decision latency under 3 seconds, and to optimize resource allocation, sustaining 85–95% GPU/CPU utilization and cutting cloud costs by 15–20%.
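A scaling policy that targets the utilization band above could look like the sketch below. The thresholds, step size, and instance bounds are hypothetical illustrations; a real deployment would drive these decisions through the GPU provider's API rather than this standalone function.

```python
# Illustrative autoscaling sketch targeting the 85-95% utilization band
# described above. Thresholds and scaling steps are hypothetical, not
# RunPod's autoscaler.

TARGET_LOW, TARGET_HIGH = 0.85, 0.95

def desired_instances(current: int, utilization: float,
                      min_n: int = 1, max_n: int = 16) -> int:
    """Return the instance count for the next interval: add capacity when
    utilization exceeds the band, shed it when utilization falls below."""
    if utilization > TARGET_HIGH:
        return min(current + 1, max_n)
    if utilization < TARGET_LOW and current > min_n:
        return current - 1
    return current
```

Holding utilization inside a band, rather than chasing a single set point, avoids oscillating between scale-up and scale-down during volatile load.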
For large-scale data tasks, such as processing transaction batches or news articles gathered by the Minos scraping system, we integrate hosted LLMs (e.g., O3 or Anthropic). This ensures efficient handling of extensive datasets, which is critical for the Fundamentals and Sentiment Agents.
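Grouping scraped documents into size-bounded batches before dispatching them to a hosted model can be sketched as follows. The character budget and batching strategy are illustrative assumptions, not the actual Minos pipeline.

```python
# Illustrative sketch: chunk a stream of scraped articles into batches
# under a rough character budget before each batch is sent to a hosted
# LLM API. The budget value and strategy are assumptions.

from typing import Iterable, Iterator

def batch_articles(articles: Iterable[str],
                   char_budget: int = 8000) -> Iterator[list[str]]:
    """Yield lists of articles whose combined length stays within budget;
    a single oversized article is emitted as its own batch."""
    batch: list[str] = []
    used = 0
    for article in articles:
        if batch and used + len(article) > char_budget:
            yield batch
            batch, used = [], 0
        batch.append(article)
        used += len(article)
    if batch:
        yield batch
```

A character budget is a crude proxy for a token budget, but it keeps each request within the model's context window without requiring a tokenizer in the scraping path.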