Google’s latest trick gets Gemma 4 running 3x faster right on your phone
TL;DR
- Google has introduced new assistant models, called “drafters,” that could significantly speed up Gemma 4.
- Drafters work by predicting several upcoming tokens for the main model, which can then process them together in a bigger batch instead of one at a time.
- This lets the main model use memory and compute more efficiently.
Google’s recently launched Gemma 4 edge AI models are designed specifically to run locally on consumer hardware. While that is favorable from a privacy standpoint, local models can easily hog resources and return results slowly, which limits their usefulness. Google is now offering a potential solution, which it claims can speed up Gemma 4 models by up to three times.
Google recently released Multi-Token Prediction (MTP) drafters for Gemma 4. These drafters are essentially smaller, assistive models that help the primary model by “predicting” several tokens ahead, so the primary model can check a whole batch of guesses at once rather than producing each token on its own. The smaller models also work alongside the main model to manage compute more effectively.
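To make the idea concrete, here is a minimal sketch of the general draft-and-verify pattern (often called speculative decoding) that drafter models are usually built around. It is not Google's implementation, and the article does not describe Gemma 4's internals: `target_next` and `draft_next` are hypothetical toy functions standing in for the large model and the drafter, and tokens are plain integers.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def target_next(seq: List[Token]) -> Token:
    """Hypothetical stand-in for the large model's greedy next token."""
    return (seq[-1] * 3 + len(seq)) % 11


def draft_next(seq: List[Token]) -> Token:
    """Hypothetical drafter: cheap, and wrong at every fourth position."""
    guess = (seq[-1] * 3 + len(seq)) % 11
    return guess if len(seq) % 4 else (guess + 1) % 11


def greedy_decode(model: NextToken, prompt: List[Token], n: int) -> List[Token]:
    """Baseline: one model call per generated token."""
    seq = list(prompt)
    for _ in range(n):
        seq.append(model(seq))
    return seq


def speculative_decode(
    target: NextToken, draft: NextToken, prompt: List[Token], n: int, k: int = 4
) -> Tuple[List[Token], int]:
    """Generate n tokens; return them with the number of verification passes."""
    seq = list(prompt)
    goal = len(prompt) + n
    passes = 0
    while len(seq) < goal:
        # 1. The drafter cheaply proposes k tokens in a row.
        proposal: List[Token] = []
        for _ in range(k):
            proposal.append(draft(seq + proposal))
        # 2. The target checks every proposed position. A real system would
        #    do this in one batched forward pass; the loop only simulates it.
        passes += 1
        accepted: List[Token] = []
        for tok in proposal:
            expected = target(seq + accepted)
            if tok != expected:
                accepted.append(expected)  # replace the first wrong guess
                break
            accepted.append(tok)
        else:
            accepted.append(target(seq + accepted))  # all correct: bonus token
        seq.extend(accepted)
    return seq[:goal], passes


if __name__ == "__main__":
    out, passes = speculative_decode(target_next, draft_next, [1, 2], n=12)
    print(out == greedy_decode(target_next, [1, 2], 12), passes)
```

Because every kept token is one the large model would have chosen anyway, the output matches ordinary decoding. Any speedup comes from needing fewer passes of the large model, and it shrinks when the drafter guesses wrong more often.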