
The context window has been shattered: Subquadratic debuts a 12M token window

Hacker News | gmays | 0 views | 61 min read
AI Engineering / AI Models / Emerging technologies

Subquadratic has launched a new AI architecture featuring a 12-million-token context window that outperforms GPT-5.5 on retrieval benchmarks.

May 5th, 2026 2:01pm

Every frontier model in 2026 advertises a context window of at least a million tokens, but almost none of them make good use of all that information. On MRCR v2, the multi-round coreference resolution benchmark that labs report, the best model is GPT-5.5, which scores 74.0%. Others, such as Claude Opus 4.7 at 32.2%, are far behind.

At this point, a million tokens seems to be the maximum for the context window that the major frontier labs are offering. One major reason for the million-token max is the same one that has shaped every transformer-based model since 2017: Attention cost scales quadratically with context length, so doubling the input quadruples the work. Essentially, RAG, agentic decomposition, hybrid model architectures, and every other workaround the industry has built are ways of making tradeoffs to get around this.
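
To make the quadratic claim concrete, here is a minimal Python sketch that counts query-key comparisons in dense attention. The function name `dense_attention_pairs` is illustrative only, and the count ignores constants such as the number of heads, layers, and the hidden size.

```python
def dense_attention_pairs(n_tokens: int) -> int:
    """Number of query-key comparisons when every token attends to every token."""
    return n_tokens * n_tokens


for n in (1_000, 2_000, 1_000_000, 12_000_000):
    print(f"{n:>12,} tokens -> {dense_attention_pairs(n):.3e} pairs")

# Doubling the input quadruples the work.
ratio = dense_attention_pairs(2_000) / dense_attention_pairs(1_000)
print(ratio)  # 4.0
```

Under this simple count, going from 1 million to 12 million tokens multiplies the pair count by 144, not 12.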

Subquadratic, a Miami-based startup, launched its first model on Tuesday and claims it can get around all of this with a model that handles a context window of 12 million tokens. What’s more, the company says it plans to offer a model with a 50-million-token context window soon.

The company, which has 11 Ph.D. researchers on staff, argues that its architecture, called Subquadratic Selective Attention (SSA), scales linearly in both compute and memory with respect to context length. The company says it runs 52 times faster than dense attention at a million tokens, hits 92.1% on needle-in-a-haystack retrieval at 12 million tokens — a context length no frontier model currently gets close to — and scores 83 on MRCR v2, beating OpenAI by nine points.

Those are large claims, and Subquadratic isn’t the first to try to tackle this problem. The benchmarks the company is releasing are impressive, including an 82.4% score on SWE-Bench Verified, which bests Anthropic’s previous model, Opus 4.6, at 81.42% and Google’s Gemini 3.1 Pro at 80.6%. And it’s doing all of this at a significantly lower cost.

Subquadratic is making this model available through an API — which will feature a 12-million-token context window — as well as a coding agent (SubQ Code) and a deep research tool (SubQ Search).

The quadratic cost of attention is obviously not a new problem, and SSA is not the first attempt to solve it. The research line goes back nearly to the original transformer paper, and the overall pattern has remained consistent. Every approach has traded one necessary property to gain another, and none have been able to replace dense attention at the frontier scale.

One approach is fixed-pattern sparse attention. Models like Longformer achieve linear scaling by letting each token attend only to a sliding window of nearby tokens. This works when relevant information sits nearby and breaks when it does not.
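
The sliding-window trade-off can be sketched in a few lines of Python. This is a simplified illustration, not Longformer's implementation: the function names and the window size of 64 are assumptions, and it ignores any other attention patterns a real model may combine with the local window.

```python
def sliding_window_pairs(n_tokens: int, window: int) -> int:
    """Count query-key pairs when token i attends only to positions
    within `window` steps on either side of i (clipped at the edges)."""
    total = 0
    for i in range(n_tokens):
        lo = max(0, i - window)
        hi = min(n_tokens - 1, i + window)
        total += hi - lo + 1
    return total


def can_attend(query_pos: int, key_pos: int, window: int) -> bool:
    """True if the key falls inside the query's local window."""
    return abs(query_pos - key_pos) <= window


# Work grows roughly linearly: doubling n roughly doubles the pair count.
print(sliding_window_pairs(1_000, 64), sliding_window_pairs(2_000, 64))

# Nearby information is reachable; distant information is not.
print(can_attend(5_000, 4_990, window=64))  # True
print(can_attend(5_000, 10, window=64))     # False
```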

State-space models like Mamba, Mamba-2, RWKV, and RetNet replace the all-pairs comparison with a recurrent state that compresses everything seen so far. The compression is lossy, however. Nvidia’s study at 8B scale found pure Mamba-2 lagged transformers on MMLU and phonebook lookup, with the gap closing only when attention was added back.
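
A toy example shows why a fixed-size recurrent state is lossy. The scalar recurrence below is a deliberately simplified stand-in chosen for illustration; it is not how Mamba or any of the named models actually computes its state, and `run_recurrence` and the decay value are assumptions.

```python
def run_recurrence(tokens, decay=0.5):
    """Fold a sequence into one number: h = decay * h + x at each step."""
    h = 0.0
    for x in tokens:
        h = decay * h + x
    return h


# The state is one float no matter how long the input is...
short_state = run_recurrence([1.0] * 10)
long_state = run_recurrence([1.0] * 10_000)

# ...but different histories can collapse to the same state,
# so the original tokens cannot always be recovered from it.
a = run_recurrence([2.0, 0.0])  # 0.5 * 2.0 + 0.0
b = run_recurrence([0.0, 1.0])  # 0.5 * 0.0 + 1.0
print(a, b, a == b)  # 1.0 1.0 True
```

Real models use much larger, learned states, but the same constraint applies: a state of fixed size cannot hold an arbitrarily long input exactly, which is one plausible reason exact-lookup tasks are hard for them.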

Hybrid architectures, as seen in Jamba, Kimi Linear, Qwen3-Next, and Nvidia’s Nemotron v3, are the pragmatic answer to this. They keep most layers efficient and retain a few dense attention layers for retrieval. But the economics are less favorable than they look. A hybrid that is three times cheaper at 32K tokens remains three times cheaper at 10M tokens, because the dense layers it retains still do O(n²) work. 
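
The constant-factor argument can be checked with a rough cost model. Everything here is an assumption made for illustration: the layer counts, the per-token cost of the efficient layers (`LINEAR_UNIT`), and the idea that cost is simply n² for a dense layer and proportional to n for an efficient one. None of these values describe a specific model.

```python
def dense_cost(n: int, layers: int) -> float:
    """All layers do all-pairs attention: layers * n^2."""
    return layers * n * n


def hybrid_cost(n: int, layers: int, dense_layers: int, linear_unit: float) -> float:
    """A few dense layers stay quadratic; the rest cost linear_unit * n each."""
    return dense_layers * n * n + (layers - dense_layers) * linear_unit * n


LAYERS, DENSE_LAYERS, LINEAR_UNIT = 48, 16, 256.0  # hypothetical values

for n in (32_000, 1_000_000, 10_000_000):
    saving = dense_cost(n, LAYERS) / hybrid_cost(n, LAYERS, DENSE_LAYERS, LINEAR_UNIT)
    print(f"{n:>11,} tokens: hybrid is {saving:.2f}x cheaper")

# The saving never exceeds LAYERS / DENSE_LAYERS (3x here), however long the input.
```

In this model the saving is about 2.95x at 32K tokens and approaches, but never passes, 3x at 10M tokens, because the retained dense layers dominate the total.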

The most recent entries went in a different direction. Rather than trying to fix the pattern or compress the state, they learn which positions to attend to. 

DeepSeek’s Native Sparse Attention won the ACL 2025 best paper award, for example. Its successor, DeepSeek Sparse Attention (DSA), is shipping in DeepSeek V3.2-Exp. DSA’s lightning indexer routes attention to a small subset of selected keys, and the attention over those keys is genuinely sparse. The indexer that picks them, however, has to score every query against every key, meaning the selection step is itself quadratic.
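
The counting argument behind the indexer problem can be sketched as follows. This is not DeepSeek's indexer: `select_top_k`, the dot-product scoring, and the made-up vectors are assumptions used only to show that scoring every query against every key costs n × n operations even when the attention that follows touches only n × k pairs. Causal masking is ignored for brevity.

```python
import heapq


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def select_top_k(queries, keys, k):
    """Score every query against every key, then keep the k best keys per query.

    Returns the selected key indices and the number of scores computed."""
    selected, scores_computed = [], 0
    for q in queries:
        scores = [dot(q, key) for key in keys]
        scores_computed += len(scores)
        best = heapq.nlargest(k, range(len(keys)), key=scores.__getitem__)
        selected.append(sorted(best))
    return selected, scores_computed


n, k = 8, 2
vectors = [[float(i % 3), float(i % 2)] for i in range(n)]  # made-up embeddings
selected, scoring_ops = select_top_k(vectors, vectors, k)
attention_ops = sum(len(s) for s in selected)

print(scoring_ops)    # 64 = n * n, quadratic in sequence length
print(attention_ops)  # 16 = n * k, linear in sequence length
```

A real indexer can make each score very cheap, which shrinks the constant, but the number of scores still grows with the square of the sequence length.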

Subquadratic CTO Alex Whedon tells The New Stack, “Sparse attention basically means instead of doing what transformers do, which is if you have 1,000 words, you look at every possible relationship between all 1,000 words, which is 1,000 squared combinations. You realize that only a portion of those actually matter and you only process the portion that matter.”

SSA’s pitch is that it does what DSA tried to do without the indexer trap. Selection is content-dependent. For any given query, the model picks which positions matter based on what the query and keys actually contain — and most importantly, the selection mechanism itself does not go quadratic. 

“For prompt A, words one and six are going to be important to each other,” Whedon says. “For prompt B, maybe it’s words two and three. It’s different for every single input.”
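
What "content-dependent" means can be shown with a toy selector. The vectors and the `top_positions` helper below are hypothetical, and the sketch scores every key for simplicity, so it shows only that the selected positions depend on the input. It does not show how SSA keeps the selection step subquadratic, which the article does not describe.

```python
def top_positions(query, keys, k):
    """Return the k key positions with the highest dot product against query."""
    scores = [sum(q * x for q, x in zip(query, key)) for key in keys]
    ranked = sorted(range(len(keys)), key=lambda i: scores[i], reverse=True)
    return sorted(ranked[:k])


# Six key vectors standing in for six words (values are made up).
keys = [
    [1.0, 0.0],
    [0.0, 1.0],
    [0.1, 0.9],
    [0.2, 0.1],
    [0.3, 0.0],
    [0.9, 0.1],
]

query_a = [1.0, 0.0]  # hypothetical query from "prompt A"
query_b = [0.0, 1.0]  # hypothetical query from "prompt B"

print(top_positions(query_a, keys, k=2))  # [0, 5]
print(top_positions(query_b, keys, k=2))  # [1, 2]
```

With zero-based indexing, the first query picks words one and six and the second picks words two and three, mirroring Whedon's example. A fixed pattern such as a sliding window would pick the same positions for both.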

According to Whedon, hybrids deliver “a scalar benefit,” but a pure subquadratic mechanism delivers a scaling-law advantage. SubQ reports a 7.2× speedup at 128K tokens and 52.2× at 1M in its benchmarks.
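
The two reported speedups allow a quick sanity check. If dense attention costs roughly n² and the alternative roughly n, the speedup should itself grow roughly in proportion to n. The sketch below assumes "128K" means 128 × 1024 tokens and "1M" means 1024 × 1024 tokens, which the article does not specify.

```python
speedup_128k = 7.2   # reported speedup over dense attention at 128K tokens
speedup_1m = 52.2    # reported speedup at 1M tokens

# Assumption: "128K" = 128 * 1024 and "1M" = 1024 * 1024 tokens.
length_ratio = (1024 * 1024) / (128 * 1024)

# If dense cost ~ n^2 and the alternative ~ n, speedup should scale ~ n.
observed_growth = speedup_1m / speedup_128k

print(length_ratio)               # 8.0
print(round(observed_growth, 2))  # 7.25
```

An 8x longer input with a 7.25x larger speedup is close to the linear growth the scaling-law claim predicts. A pure constant-factor improvement would show roughly the same speedup at both lengths.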

On RULER at 128K, SubQ scores 97.1 against Opus 4.6’s 94.8. On MRCR v2, its reported score of 83 is nine points ahead of GPT-5.5’s 74.0%, the best of the other frontier models.

On SWE-Bench Verified, SubQ reports 82.4%, edging out Opus 4.6’s 81.4%, and Gemini 3.1 Pro’s 80.6%. At 12 million tokens, where no frontier model operates, SubQ holds 92.1% on a needle-in-a-haystack benchmark.

There are some caveats. Each model was run only once, according to the technical paper, because of the high inference cost. The SWE-Bench margin is, as the paper acknowledges, “harness as much as model.” And the SubQ model is, by Whedon’s own description, “way smaller than the big labs.”
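
A single run gives no estimate of run-to-run variance, but a rough proxy for the noise in such a score is the binomial standard error from the finite number of benchmark tasks. The sketch below assumes SWE-Bench Verified contains 500 tasks, a figure that comes from outside this article, and treats each task as an independent pass or fail, which is a simplification.

```python
import math

N_TASKS = 500  # assumed size of SWE-Bench Verified; not stated in the article


def standard_error(accuracy: float, n_tasks: int) -> float:
    """Binomial standard error of an accuracy measured on n_tasks items."""
    return math.sqrt(accuracy * (1.0 - accuracy) / n_tasks)


subq, opus = 0.824, 0.814
margin = subq - opus
se = standard_error(subq, N_TASKS)

print(f"margin: {margin * 100:.1f} points")          # margin: 1.0 points
print(f"one standard error: {se * 100:.1f} points")  # one standard error: 1.7 points
```

Under these assumptions, the one-point lead over Opus 4.6 is smaller than one standard error, which is consistent with reading the SWE-Bench result as roughly a tie.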

The company is launching two products in beta: an API that exposes the full 12M-token window and SubQ Code, a CLI agent built on the same model. Both run on neoclouds rather than the major hyperscalers — “they’re very expensive,” CEO Justin Dangel says. 

The company is not open-sourcing weights but plans to offer training tools for enterprises to do their own post-training. The 50-million-token context window target is set for Q4.

There is a bit of a cautionary tale here, though. Magic.dev announced LTM-2-mini, a 100M-token context-window model, in August 2024, with a claimed 1000× efficiency advantage. The company raised over $500 million on the strength of it. As of early 2026, there is no public evidence of LTM-2-mini being used outside Magic.

Subquadratic has raised $29 million to date at a $500 million valuation from investors including former SoftBank Vision Fund partner Javier Villamizar and Tinder co-founder Justin Mateen. The company was previously called Aldea and worked on speech models before pivoting. The technical case is real. The category’s track record is the rest of the story.
