Show HN: Prompt-to-Excalidraw demo with Gemma 4 E2B in the browser (3.1GB)
Describe any diagram and Gemma 4 E2B generates it as an Excalidraw drawing, entirely in your browser. Desktop Chrome 134+ only.
The LLM outputs compact code (~50 tokens) instead of raw Excalidraw JSON (~5,000 tokens). The TurboQuant algorithm (polar + QJL) compresses the KV cache ~2.4× so longer conversations fit in GPU memory. Needs WebGPU subgroups (Safari/iOS not supported yet) and ~3 GB RAM (mobile browsers cap well below this).
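To make the token savings concrete, here is a minimal sketch of how a compact diagram language could be expanded into element objects on the client. The post does not document the demo's actual format, so the `box` / `->` syntax, the `Skeleton` type, the `expand` function, and the layout constants are all hypothetical, and the output is a simplified skeleton rather than the full Excalidraw element schema.

```typescript
// Hypothetical compact DSL, one statement per line (NOT the demo's real format):
//   box <id> "<label>"   declares a labeled rectangle
//   <from> -> <to>       declares an arrow between two declared boxes

interface Skeleton {
  type: "rectangle" | "arrow";
  id: string;
  x: number;
  y: number;
  width: number;
  height: number;
  label?: string;
  startId?: string;
  endId?: string;
}

// Assumed layout constants: boxes are placed left to right on one row.
const BOX_W = 160;
const BOX_H = 60;
const GAP = 80;

function expand(code: string): Skeleton[] {
  const boxes = new Map<string, Skeleton>();
  const arrows: Skeleton[] = [];

  for (const raw of code.split("\n")) {
    const line = raw.trim();
    if (line === "") continue;

    const boxMatch = /^box\s+(\w+)\s+"([^"]*)"$/.exec(line);
    if (boxMatch) {
      const [, id = "", label = ""] = boxMatch;
      const x = boxes.size * (BOX_W + GAP);
      boxes.set(id, { type: "rectangle", id, x, y: 0, width: BOX_W, height: BOX_H, label });
      continue;
    }

    const arrowMatch = /^(\w+)\s*->\s*(\w+)$/.exec(line);
    if (arrowMatch) {
      const [, fromId = "", toId = ""] = arrowMatch;
      const from = boxes.get(fromId);
      const to = boxes.get(toId);
      if (!from || !to) throw new Error(`unknown box in: ${line}`);
      // The arrow runs from the right edge of `from` to the left edge of `to`.
      const startX = from.x + from.width;
      arrows.push({
        type: "arrow",
        id: `${fromId}-${toId}`,
        x: startX,
        y: BOX_H / 2,
        width: to.x - startX,
        height: 0,
        startId: fromId,
        endId: toId,
      });
      continue;
    }

    throw new Error(`cannot parse: ${line}`);
  }

  return Array.from(boxes.values()).concat(arrows);
}

const elements = expand('box a "Client"\nbox b "Server"\na -> b');
console.log(JSON.stringify(elements, null, 2));
```

The model only has to emit the three short lines passed to `expand`. Coordinates, sizes, and ids are filled in deterministically by the client, which is where most of the raw JSON's token count would otherwise go.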
This demo reimplements the TurboQuant algorithm in WGSL compute shaders so it runs on the GPU at 30+ tok/s. The sibling turboquant-wasm npm package implements the same algorithm in WASM+SIMD for CPU-side vector search.
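For readers unfamiliar with the QJL part, the following is a rough CPU-side sketch of the general sign-sketch idea behind a quantized Johnson-Lindenstrauss transform: project a key with a random Gaussian matrix, keep only the sign bits plus the key's norm, and estimate query-key dot products from those bits. It is not the demo's WGSL code, it omits the polar stage entirely, and it is not the turboquant-wasm API. Every name here (`gaussianMatrix`, `sketchKey`, `estimateDot`) and every parameter value is an assumption made for illustration.

```typescript
// Deterministic PRNG (mulberry32) so the sketch is reproducible.
function mulberry32(seed: number): () => number {
  let a = seed >>> 0;
  return () => {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Row-major m x d matrix of standard normal samples (Box-Muller).
function gaussianMatrix(m: number, d: number, seed: number): Float64Array {
  const rand = mulberry32(seed);
  const out = new Float64Array(m * d);
  for (let i = 0; i < out.length; i++) {
    const u1 = 1 - rand(); // in (0, 1], avoids log(0)
    const u2 = rand();
    out[i] = Math.sqrt(-2 * Math.log(u1)) * Math.cos(2 * Math.PI * u2);
  }
  return out;
}

// Computes S * v for a row-major m x d matrix S.
function project(S: Float64Array, m: number, d: number, v: number[]): Float64Array {
  const out = new Float64Array(m);
  for (let r = 0; r < m; r++) {
    let acc = 0;
    for (let c = 0; c < d; c++) acc += S[r * d + c] * v[c];
    out[r] = acc;
  }
  return out;
}

interface KeySketch {
  bits: Uint8Array; // one sign bit per projection, packed 8 per byte
  norm: number;     // Euclidean norm of the original key
}

function sketchKey(S: Float64Array, m: number, d: number, key: number[]): KeySketch {
  const proj = project(S, m, d, key);
  const bits = new Uint8Array(Math.ceil(m / 8));
  for (let i = 0; i < m; i++) {
    if (proj[i] >= 0) bits[i >> 3] |= 1 << (i & 7);
  }
  const norm = Math.sqrt(key.reduce((s, x) => s + x * x, 0));
  return { bits, norm };
}

// Estimator: sqrt(pi/2) / m * ||key|| * sum_i (S q)_i * sign((S key)_i).
function estimateDot(S: Float64Array, m: number, d: number, query: number[], sk: KeySketch): number {
  const proj = project(S, m, d, query);
  let acc = 0;
  for (let i = 0; i < m; i++) {
    const sign = (sk.bits[i >> 3] >> (i & 7)) & 1 ? 1 : -1;
    acc += proj[i] * sign;
  }
  return (Math.sqrt(Math.PI / 2) / m) * sk.norm * acc;
}

const d = 8;
const m = 4096; // deliberately large so the estimate is tight in this toy example
const S = gaussianMatrix(m, d, 42);
const key = [0.5, -1, 0.25, 2, 0, 1, -0.5, 0.75];
const query = [1, 0.5, -0.25, 1, 0.5, 0, -1, 0.25];
const exact = key.reduce((s, x, i) => s + x * query[i], 0);
const sketch = sketchKey(S, m, d, key);
const approx = estimateDot(S, m, d, query, sketch);
console.log("exact:", exact, "estimate:", approx);
```

With 4,096 sign bits for an 8-dimensional key this toy setup does not compress anything; the large `m` is only there to show that the estimator converges on the exact dot product. A real KV-cache scheme would use far fewer bits per dimension and accept a noisier estimate.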