
I tried it against deepseek-r1-distill-llama-70b running on Groq (which is really fast) and it didn't get the right answer: https://gist.github.com/simonw/487c4c074cd6ad163dba061e1e594...

I ran it like this:

  llm -m groq/deepseek-r1-distill-llama-70b '
    H01 There are five huts.
    H02 The Scotsman lives in the purple hut.
    H03 The Welshman owns the parrot.
    H04 Kombucha is drunk in the scarlet hut.
    H05 The Romanian drinks butterscotch.
    H06 The scarlet hut is immediately to the right of the pink hut.
    H07 The Old Gold smoker owns scorpions.
    H08 Kools are smoked in the turquoise hut.
    H09 Red Bull is drunk in the middle hut.
    H10 The Brazilian lives in the first hut.
    H11 The man who smokes Chesterfields lives in the hut next to the man with the bear.
    H12 Kools are smoked in a hut next to the hut where the mule is kept.
    H13 The Lucky Strike smoker drinks rum.
    H14 The German smokes Parliaments.
    H15 The Brazilian lives next to the brown hut.
    Now,
    Q1 Who drinks water?
    Q2 Who owns the zebra?'
Using this plugin: https://github.com/angerman/llm-groq
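
For anyone who wants to check a model's answer independently, here is a minimal brute-force sketch in Python. It is not from the gists above, and it rests on two assumptions the prompt never states outright: that water is the fifth drink and the zebra is the fifth pet, and that "to the right" means a higher hut number. The function name `solve` and the variable names are mine.

```python
from itertools import permutations


def solve():
    """Return every (water drinker, zebra owner) pair consistent with H01-H15."""
    huts = range(5)          # H01: five huts, indexed 0..4 from left to right
    first, middle = 0, 2

    def right_of(a, b):      # a is immediately to the right of b
        return a - b == 1

    def next_to(a, b):       # a and b are adjacent huts
        return abs(a - b) == 1

    answers = []
    for purple, scarlet, pink, turquoise, brown in permutations(huts):
        if not right_of(scarlet, pink):                      # H06
            continue
        for scotsman, welshman, romanian, brazilian, german in permutations(huts):
            if scotsman != purple:                           # H02
                continue
            if brazilian != first:                           # H10
                continue
            if not next_to(brazilian, brown):                # H15
                continue
            # Assumption: water is the fifth, otherwise unnamed, drink.
            for kombucha, butterscotch, red_bull, rum, water in permutations(huts):
                if kombucha != scarlet:                      # H04
                    continue
                if romanian != butterscotch:                 # H05
                    continue
                if red_bull != middle:                       # H09
                    continue
                for old_gold, kools, chesterfields, lucky, parliaments in permutations(huts):
                    if kools != turquoise:                   # H08
                        continue
                    if lucky != rum:                         # H13
                        continue
                    if german != parliaments:                # H14
                        continue
                    # Assumption: the zebra is the fifth, otherwise unnamed, pet.
                    for parrot, scorpions, bear, mule, zebra in permutations(huts):
                        if welshman != parrot:               # H03
                            continue
                        if old_gold != scorpions:            # H07
                            continue
                        if not next_to(chesterfields, bear): # H11
                            continue
                        if not next_to(kools, mule):         # H12
                            continue
                        who = {
                            scotsman: "Scotsman",
                            welshman: "Welshman",
                            romanian: "Romanian",
                            brazilian: "Brazilian",
                            german: "German",
                        }
                        answers.append((who[water], who[zebra]))
    return answers


if __name__ == "__main__":
    for water_drinker, zebra_owner in solve():
        print(f"Q1 water: {water_drinker}; Q2 zebra: {zebra_owner}")
```

Under those assumptions the search finds a single assignment, in which the Brazilian drinks water and the German owns the zebra. The constraints are checked as early as each loop allows, so it finishes almost instantly instead of walking all 120^5 combinations.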


Full DeepSeek R1 - accessed through the DeepSeek API (their "deepseek-reasoner" model) - got the right answer: https://gist.github.com/simonw/f77be3bbc720e1314235d42593562...


Whatever model is behind chat.deepseek.com got it in 348 seconds.

It amazes me they don't time that thing out after, IDK, 5 minutes of computation time.



