
Realistically speaking, how long would it take for someone who has college-level linear algebra and multivariable calculus knowledge and rusty familiarity with ML (via Andrew Ng's MATLAB course) to learn the concepts behind LLMs and the state-of-the-art algorithms behind SDs and GPTs?

Would it even make sense if one's interest is not image or text generation?



Only a month or two for the basics, if you're willing to take some stochastic calculus in the SDs for granted.

Part of what's eluded public awareness so far is that these algorithms are simple.
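
To make "simple" concrete, here is a minimal sketch of scaled dot-product attention, the core operation inside GPT-style transformers, in plain Python. The function names are my own, and it leaves out masking, multiple heads, the learned projection matrices, and all of training, so treat it as an illustration rather than a real implementation.

```python
import math

def softmax(xs):
    # Subtract the max before exponentiating for numerical stability.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    # queries, keys: lists of vectors of the same length d.
    # values: one vector per key.
    d = len(keys[0])
    dim_v = len(values[0])
    out = []
    for q in queries:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [
            sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
            for k in keys
        ]
        # Turn the scores into weights that sum to 1.
        weights = softmax(scores)
        # The output is the weighted average of the value vectors.
        out.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(dim_v)
        ])
    return out
```

Each output row is just a weighted average of the value vectors, with the weights coming from how well the query matches each key.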


The algos are, but the architecture isn't.


For example, cf. ZionEX and Pathways.

AI accelerators are important.


A couple of months. Also, there is a world of difference between knowing the topic on a superficial level and training and evaluating a model yourself.


I mean, to learn all the details it'd probably take the equivalent work of doing a PhD plus a research fellowship or two. But I have the qualifications you cite, and I'm doing the fast.ai courses to get a working knowledge of things; that seems to take 4-8 weeks depending on your pace.


Not really. These algorithms are generally quite simple and you are not coming up with them from scratch.

It might take a PhD to build the JVM, but you don't need one to understand what it does (at least most parts of it).



