What do you mean there is no such thing as R1-1.5b? DeepSeek released a distilled version based on a 1.5B Qwen model with the full name DeepSeek-R1-Distill-Qwen-1.5B; see section 3.2 on page 14 of their research article [0].
That might have been fine if it were just the same model at different sizes, but these are completely different models, and it's created confusion out of thin air for no reason other than Ollama being careless.
It hasn't been great. Applying for jobs in this market while dealing with post-MSc burnout has been tough. I finished an MSc in AI back in June, but there are so few AI-relevant jobs where I live that it's depressing. I've interviewed for data science and analyst roles, but barely get in the door.
I'm getting a good amount of interviews for developer and data engineering positions, but the competition is tough. Many positions have seen a 5x increase in candidates since the same time last year, according to my interviewers.
However, I'm hopefully getting an offer as a data platform engineer soon. The department leader has ranked me as their first choice, so unless the higher-ups complain... Knock on wood.
I have multiple questions regarding the methods of this test.
The biggest one is that, well... the test doesn't aim to show what GPT-4 can do and how well it does it, only whether the participant can guess the (possibly cherry-picked) answer the author decided on. In short, we don't know whether he sampled several answers and picked the most probable one (akin to consensus voting/self-consistency[1]), or asked each question once and took the first answer it gave.
Maybe GPT-4 gets a given question right 80% of the time, but he got unlucky? We don't know; the author doesn't tell us. The answers are generated ahead of time and are the same every time you go through the test.
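For what it's worth, the self-consistency idea mentioned above just means sampling multiple answers and keeping the majority one. A minimal sketch (the list of sampled answers here is made up for illustration; in practice each entry would come from a separate model call):

```python
from collections import Counter

def self_consistency(samples):
    """Majority vote over a list of sampled answers."""
    # most_common(1) returns [(answer, count)]; take the answer
    return Counter(samples).most_common(1)[0][0]

# Toy example: "A" wins the vote 3-2
print(self_consistency(["A", "B", "A", "A", "B"]))  # prints A
```

The point is that a quiz built on one pre-generated answer per question can't distinguish this kind of aggregated output from a single lucky (or unlucky) sample.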
The questions mostly have correct or incorrect answers, and where there is some leeway, the author provides a fairly detailed explanation of what they would consider correct in each case. Do you have some specific criticism of an answer that you believe the author gets wrong?
> only whether the participant can guess the (possibly cherry-picked) answer the author decided on
My understanding is that the quiz samples a new GPT-4 answer every time you use it. That's why you put a confidence rather than a 0%/100% answer. There's always a chance it'll fail by freak accident.
If you're basing this on the animation used when revealing the answer, that's a fake effect. The source code[0] shows that a typewriter effect plays out when you submit your answer to the question.
Also, the commentary on the answers refers to specific parts of the answers. For it to be as in-depth as it is, it would have to be either pre-written or itself generated by GPT on the fly. (And of course the latter wouldn't make sense given the nature of the quiz.)
Yeah, one of the problems with being too abstract is that it's open to many interpretations, most of them incorrect. Also, a few pages or paragraphs won't stick; I need examples. I'm not a computer; installing new ideas in my brain isn't like installing apps. My brain prefers storytelling to better grasp ideas and make them its own.
Agreed, but many self-help books waste their pages making dubious claims from half-sourced research, padding with useless anecdotes, or restating the problem in different ways. The best ones actually tell stories of their method being applied to a variety of problems.
Absolutely, that contextualization can be so useful even if some deem it fluffy. Also, no one is obligated to read every single word of a book; if you don't want the anecdotes and case studies, you can skip them! They are helpful for the rest of us.
Did you use AI to write this...? Because it does not follow from the post you're replying to.