tripplyons
on April 22, 2024
| on:
Lossless Acceleration of LLM via Adaptive N-Gram P...
They use a separate n-gram model to generate the proposed sequence, rather than extra heads on top of the main model. The process for verifying the proposed sequence appears to be the same.
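To make the draft-then-verify idea concrete, here is a minimal sketch, not the paper's implementation. It assumes a plain bigram lookup built from the tokens seen so far as the drafter (the paper's n-gram module is adaptive and more involved) and greedy decoding for verification. Every name here (`build_bigram_table`, `propose`, `verify`, `generate`, `toy_main_model`) is hypothetical, and the "main model" is a toy function standing in for an LLM.

```python
from typing import Callable, Dict, List

Token = int
# Stand-in for one greedy decoding step of the main model (an assumption).
NextTokenFn = Callable[[List[Token]], Token]


def build_bigram_table(tokens: List[Token]) -> Dict[Token, Token]:
    """Map each token to the token that most recently followed it."""
    table: Dict[Token, Token] = {}
    for prev, nxt in zip(tokens, tokens[1:]):
        table[prev] = nxt
    return table


def propose(table: Dict[Token, Token], context: List[Token], k: int) -> List[Token]:
    """Draft up to k tokens by chaining bigram lookups from the last token."""
    draft: List[Token] = []
    last = context[-1]
    for _ in range(k):
        if last not in table:
            break
        last = table[last]
        draft.append(last)
    return draft


def verify(main_next: NextTokenFn, context: List[Token], draft: List[Token]) -> List[Token]:
    """Keep the draft prefix the main model agrees with, plus one main-model token.

    A real system would score all draft positions in one batched forward pass;
    this sketch calls the model once per position for clarity.
    """
    accepted: List[Token] = []
    for tok in draft:
        expected = main_next(context + accepted)
        if tok != expected:
            accepted.append(expected)  # replace the first mismatch and stop
            return accepted
        accepted.append(tok)
    accepted.append(main_next(context + accepted))  # bonus token after a full match
    return accepted


def generate(main_next: NextTokenFn, prompt: List[Token], n_new: int, k: int = 4) -> List[Token]:
    """Generate n_new tokens; the prompt must be non-empty."""
    out = list(prompt)
    while len(out) - len(prompt) < n_new:
        draft = propose(build_bigram_table(out), out, k)
        out.extend(verify(main_next, out, draft))
    return out[: len(prompt) + n_new]


def greedy(main_next: NextTokenFn, prompt: List[Token], n_new: int) -> List[Token]:
    """Baseline: ordinary one-token-at-a-time greedy decoding."""
    out = list(prompt)
    for _ in range(n_new):
        out.append(main_next(out))
    return out


def toy_main_model(context: List[Token]) -> Token:
    """Hypothetical deterministic 'model' that cycles through tokens 0..4."""
    return (context[-1] + 1) % 5
```

Because every emitted token is either confirmed or supplied by the main model, `generate` returns the same sequence as `greedy`. That is the sense in which this kind of scheme is lossless: the drafter only affects how many tokens are settled per verification step.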