SubQ and Long Context Efficiency
SubQ claims the first fully subquadratic LLM tuned for 12 million token reasoning without hybrids or quality drop....
I should have applied to the GPT-5.5 party anyway. The RSVP form went live and I passed because...
xAI has a genuine strength, and it is not competing with Claude Opus 4.5 or GPT-5.2. It is...
OpenAI released GPT-5.4 mini and GPT-5.4 nano on March 17, 2026. These are smaller, faster variants of GPT-5.4...
Data centers are buying massive ship-derived and industrial engines to generate their own power on-site. This is not...
Cost creep is being used too loosely. A vendor raising list prices is not enough. The only definition...
OpenAI's leaked $100 Pro plan finally lets us compare ChatGPT Pro and Claude Max at the same price....
Gemini 3.1 Flash-Lite Preview dropped on March 3, 2026, positioned as a drop-in replacement for low-cost models. It...
If you are choosing between frontier models right now, the decision is rarely about raw intelligence. It is...
Nine cents for a 4-second AI video – that’s the headline that makes anyone who budgets for motion...
GPT-5.2 dropped on December 11, 2025. It is live in the API and ChatGPT right now. This isn’t...
Claude Opus 4.5 is the first time Anthropic’s top tier model actually looks practical for day-to-day work instead...
Sherlock Alpha and Sherlock Think Alpha just appeared on OpenRouter with almost no announcement. They look like xAI...
MiniMax M2 and GLM 4.6 stand out as two strong options for coding and agent tasks right now....
Claude Haiku 4.5 is available now. Frontload the conclusion: it delivers near-frontier coding quality while cutting cost and...
Workflows are still the default. If you need predictable output, fixed steps, and measurable cost, build a workflow....
There’s a chart making the rounds that says Lovable is dying. That chart shows total accounts shrinking and...
Grok 4 Fast is xAI’s newest multimodal model built for one thing: cost-efficient reasoning at scale. The headline...
Isaac 0.1 is a perceptive‑language open‑weight model at roughly 2B parameters that claims to match much larger systems...
Cerebras free tier limits explained: 1M tokens per day, API access, real rate limits, free plan details, and...
Google just made Veo 3 and Veo 3 Fast generally available. These video generation models now run on...
OpenAI just made a bold move against their own API business. Last week, they dropped GPT-5 at $10...
The AI economy in 2025 is a study in contrasts: raw AI token costs continue to plummet, yet...
The point: GPT-OSS-120B is fast and cheap, not strong. It's a clear case of sloptimization: shaping a model to...
Mistral AI just dropped its new model, Voxtral-Mini-3B-2507, and it’s a big deal. Part of their Voxtral series,...