The context window has been shattered: Subquadratic debuts a 12M token window
Subquadratic has launched a new AI architecture featuring a 12-million-token context window that outperforms GPT-5.5 on retrieval benchmarks.
May 5th, 2026 2:01pm by Frederic Lardinois
Every frontier model in 2026 advertises a context window of at least a million tokens, but almost none of them are actually great at making use of all of that information. On MRCR v2, the multi-reference retrieval benchmark labs report, the best model is GPT-5.5, which scores 74.0%. Others like Claude Opus 4.7 at 32.2% are far behind.
At this point, a million tokens seems to be the maximum for the context window that the major frontier labs are offering. One major reason for the million-token max is the same one that has shaped every transformer-based model since 2017: Attention cost scales quadratically with context length, so doubling the input quadruples the work. Essentially, RAG, agentic decomposition, hybrid model architectures, and every other workaround the industry has built are ways of making tradeoffs to get around this.
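To make the scaling concrete, here is a minimal sketch that counts the query-key comparisons dense attention performs. It assumes every token is scored against every other token (no causal masking or kernel-level optimizations), and the function name is illustrative rather than taken from any library.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Query-key scores computed by dense attention for one head in one layer."""
    return n_tokens * n_tokens


if __name__ == "__main__":
    # Doubling the input quadruples the pair count.
    for n in (1_000, 2_000, 1_000_000, 12_000_000):
        print(f"{n:>12,} tokens -> {dense_attention_pairs(n):,} pairs")
```

Under this simple count, going from 1 million to 12 million tokens multiplies the work by 144, which is why stretching a dense window that far is so costly.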
Subquadratic, a Miami-based startup, launched its first model on Tuesday and claims it can get around all of this with a model that handles a context window of 12 million tokens. What’s more, the company says it plans to offer a model with a 50-million-token context window soon.
The company, which has 11 Ph.D. researchers on staff, argues that its architecture, called Subquadratic Selective Attention (SSA), scales linearly in both compute and memory with respect to context length. The company says it runs 52 times faster than dense attention at a million tokens, hits 92.1% on needle-in-a-haystack retrieval at 12 million tokens — a context length no frontier model currently gets close to — and scores 83 on MRCR v2, beating OpenAI by nine points.
Those are large claims, and Subquadratic isn’t the first to try to tackle this problem. The benchmarks the company is releasing are impressive, including an 82.4% score on SWE-Bench Verified, which bests both Anthropic’s previous model, Opus 4.6, at 81.4% and Google’s Gemini 3.1 Pro at 80.6%. And it’s doing all of this at a significantly lower cost.
Subquadratic is making this model available through an API — which will feature a 12-million-token context window — as well as a coding agent (SubQ Code) and a deep research tool (SubQ Search).
The quadratic cost of attention is obviously not a new problem, and SSA is not the first attempt to solve it. The research line goes back nearly to the original transformer paper, and the overall pattern has remained consistent. Every approach has traded one necessary property to gain another, and none have been able to replace dense attention at the frontier scale.
One approach is fixed-pattern sparse attention. In models like Longformer, it achieves linear scaling by letting each token attend only to a sliding window of nearby tokens. It works when relevant information sits nearby and breaks when it does not.
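The sliding-window idea can be sketched in a few lines. This is a toy illustration, not Longformer's code: it assumes a symmetric window and leaves out the global-attention tokens Longformer also uses, and the function names and sizes are hypothetical.

```python
def sliding_window_positions(query_pos: int, seq_len: int, window: int) -> range:
    """Positions a token may attend to under a symmetric local window."""
    half = window // 2
    start = max(0, query_pos - half)
    stop = min(seq_len, query_pos + half + 1)
    return range(start, stop)


def total_scored_pairs(seq_len: int, window: int) -> int:
    """Total query-key pairs scored; grows linearly with seq_len for a fixed window."""
    return sum(len(sliding_window_positions(i, seq_len, window)) for i in range(seq_len))


if __name__ == "__main__":
    seq_len, window = 1_000, 64
    visible = sliding_window_positions(900, seq_len, window)
    print("nearby token 890 visible:", 890 in visible)
    print("distant token 10 visible:", 10 in visible)
    print("pairs scored:", total_scored_pairs(seq_len, window))
```

A fact stored at position 10 is simply invisible to a query at position 900, which is the failure case described above.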
State-space and related recurrent models like Mamba, Mamba-2, RWKV, and RetNet replace the all-pairs comparison with a recurrent state that compresses everything seen so far. The compression is lossy, however. Nvidia’s study at 8B scale found pure Mamba-2 lagged transformers on MMLU and phonebook lookup, with the gap closing only when attention was added back.
Hybrid architectures, as seen in Jamba, Kimi Linear, Qwen3-Next, and Nvidia’s Nemotron v3, are the pragmatic answer to this. They keep most layers efficient and retain a few dense attention layers for retrieval. But the economics are less favorable than they look. A hybrid that is three times cheaper at 32K tokens remains three times cheaper at 10M tokens, because the dense layers it retains still do O(n²) work.
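A toy cost model shows why the hybrid's advantage stays flat. The layer counts below are hypothetical, and the model assumes a dense layer costs n² and an efficient layer costs n, ignoring all constants; it is a sketch of the argument, not a measurement of any real architecture.

```python
def attention_cost(n_tokens: int, dense_layers: int, linear_layers: int) -> int:
    """Toy cost: n^2 per dense layer, n per linear-time layer (constants ignored)."""
    return dense_layers * n_tokens**2 + linear_layers * n_tokens


def hybrid_speedup(n_tokens: int, total_layers: int = 48, dense_layers: int = 16) -> float:
    """How many times cheaper the hybrid is than an all-dense stack."""
    all_dense = attention_cost(n_tokens, total_layers, 0)
    hybrid = attention_cost(n_tokens, dense_layers, total_layers - dense_layers)
    return all_dense / hybrid


def linear_speedup(n_tokens: int, total_layers: int = 48) -> float:
    """Same comparison for a stack that keeps no dense layers at all."""
    all_dense = attention_cost(n_tokens, total_layers, 0)
    fully_linear = attention_cost(n_tokens, 0, total_layers)
    return all_dense / fully_linear


if __name__ == "__main__":
    for n in (32_000, 10_000_000):
        print(f"{n:>10,} tokens: hybrid {hybrid_speedup(n):.2f}x, linear {linear_speedup(n):,.0f}x")
```

With a third of the layers kept dense, the hybrid's speedup sits near 3x at both lengths, while the fully linear stack's advantage keeps growing with the input.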
The most recent entries went in a different direction. Rather than trying to fix the pattern or compress the state, they learn which positions to attend to.
DeepSeek’s Native Sparse Attention won the ACL 2025 best paper award, for example. Its successor, DeepSeek Sparse Attention (DSA), is shipping in DeepSeek V3.2-Exp. DSA’s lightning indexer routes attention to a small subset of selected keys, and the attention over those keys is genuinely sparse. The indexer that picks them, however, has to score every query against every key, meaning the selection step is itself quadratic.
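The indexer trap can be illustrated with a simplified selector. This is not DeepSeek's implementation: the plain dot-product scorer, the function names, and the sizes are stand-ins chosen to show where the work goes.

```python
import heapq


def indexer_select(query: list[float], keys: list[list[float]], top_k: int) -> list[int]:
    """Score one query against every key, then keep the indices of the top_k scores."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    return heapq.nlargest(top_k, range(len(keys)), key=scores.__getitem__)


def count_work(n_tokens: int, top_k: int) -> tuple[int, int]:
    """Return (indexer scores, attention scores) for n_tokens queries."""
    indexer_scores = n_tokens * n_tokens  # every query still touches every key
    attention_scores = n_tokens * min(top_k, n_tokens)  # attention only sees selected keys
    return indexer_scores, attention_scores


if __name__ == "__main__":
    keys = [[0.1, 0.0], [0.9, 0.0], [0.5, 0.0]]
    print("selected keys:", indexer_select([1.0, 0.0], keys, top_k=2))
    print("work at 1,000 tokens:", count_work(1_000, top_k=10))
```

The attention step shrinks to n times k, but the selection step still computes n² scores, so the total remains quadratic even if each indexer score is cheap.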
Subquadratic CTO Alex Whedon tells The New Stack, “Sparse attention basically means instead of doing what transformers do, which is if you have 1,000 words, you look at every possible relationship between all 1,000 words, which is 1,000 squared combinations. You realize that only a portion of those actually matter and you only process the portion that matter.”
SSA’s pitch is that it does what DSA tried to do without the indexer trap. Selection is content-dependent. For any given query, the model picks which positions matter based on what the query and keys actually contain — and most importantly, the selection mechanism itself does not go quadratic.
“For prompt A, words one and six are going to be important to each other,” Whedon says. “For prompt B, maybe it’s words two and three. It’s different for every single input.”
According to Whedon, hybrids deliver “a scalar benefit,” but a pure subquadratic mechanism delivers a scaling-law advantage. In its benchmarks, SubQ reported a 7.2× speedup at 128K tokens and 52.2× at 1M.
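One way to read that distinction: if dense attention costs n² and the alternative costs n, the speedup itself should grow roughly in step with context length. The sketch below checks the two reported figures against that expectation. It assumes "128K" means 128,000 tokens and "1M" means 1,000,000, which the article does not specify, and it is only a consistency check on the company's numbers, not a verification of them.

```python
# Speedups over dense attention as reported by Subquadratic.
reported_speedup = {128_000: 7.2, 1_000_000: 52.2}

# Assumption: "128K" = 128,000 tokens and "1M" = 1,000,000 tokens.
short_ctx, long_ctx = 128_000, 1_000_000

context_growth = long_ctx / short_ctx
speedup_growth = reported_speedup[long_ctx] / reported_speedup[short_ctx]

if __name__ == "__main__":
    print(f"context grew {context_growth:.2f}x, speedup grew {speedup_growth:.2f}x")
```

Under those assumptions the context grows about 7.8 times and the reported speedup about 7.25 times, close to what a linear-versus-quadratic comparison would predict. A fixed "scalar benefit" would show the same speedup at both lengths.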
On RULER at 128K, SubQ scores 97.1 against Opus 4.6’s 94.8. On MRCR v2, SubQ’s reported 83 puts it nine points ahead of GPT-5.5’s 74.0%, the best score among the frontier labs.
On SWE-Bench Verified, SubQ reports 82.4%, edging out Opus 4.6’s 81.4% and Gemini 3.1 Pro’s 80.6%. At 12 million tokens, where no frontier model operates, SubQ holds 92.1% on a needle-in-a-haystack benchmark.
There are some caveats. Each model was run only once, according to the technical paper, because of high inference costs. The SWE-Bench margin is, as the paper acknowledges, “harness as much as model.” And the SubQ model is, by Whedon’s own description, “way smaller than the big labs.”
The company is launching two products in beta: an API that exposes the full 12M-token window and SubQ Code, a CLI agent built on the same model. Both run on neoclouds rather than the major hyperscalers — “they’re very expensive,” CEO Justin Dangel says.
The company is not open-sourcing weights but plans to offer training tools for enterprises to do their own post-training. The 50-million-token context window target is set for Q4.
There is a bit of a cautionary tale here, though. Magic.dev announced LTM-2-mini, a model with a 100M-token context window, in August 2024, with a claimed 1000× efficiency advantage. It raised over $500 million on its strength. As of early 2026, there is no public evidence of LTM-2-mini being used outside Magic.
Subquadratic has raised $29 million to date at a $500 million valuation from investors including former SoftBank Vision Fund partner Javier Villamizar and Tinder co-founder Justin Mateen. The company was previously called Aldea and worked on speech models before pivoting. The technical case is real. The category’s track record is the rest of the story.