
Just saying: if you ask for the capital of an obscure country that it hasn't been trained on, you will not get the answer, so 15k examples will get you some general stuff only within those confines. Also, for coding you will need fairly complete documentation for it to ingest, plus enough examples of how the code is written.


15k is not the full training corpus. The model is trained on huge swaths of internet text; 15k is just the fine-tuning corpus that shows it how to follow instructions. Stuff like world capitals is already present in the model weights from being trained on tons of internet text.
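To make that concrete, the fine-tuning examples are basically prompt/response pairs. A minimal sketch of what a couple of such records might look like (field names and the `### Instruction:` template are hypothetical, loosely modeled on common instruction-tuning formats like Alpaca's):

```python
# Hypothetical instruction-tuning records: the ~15k examples teach the
# model the *format* of following instructions, not new world knowledge.
examples = [
    {
        "instruction": "Tell me the capital of Mongolia.",
        "response": "The capital of Mongolia is Ulaanbaatar.",
    },
    {
        "instruction": "Summarize this paragraph in one sentence.",
        "response": "A one-sentence summary of the given paragraph.",
    },
]

def to_training_text(record):
    # Concatenate one record into the single string the model is
    # fine-tuned to complete.
    return (
        f"### Instruction:\n{record['instruction']}\n"
        f"### Response:\n{record['response']}"
    )

print(to_training_text(examples[0]))
```

The facts (Ulaanbaatar being the capital) come from pretraining; the few thousand records only teach the question-then-answer shape.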

With the raw LLM, you can get the capital of Mongolia with the prompt "The capital of Mongolia is", i.e. text completion. The fine-tuning lets you get at that same information by asking questions or giving commands, e.g. "Tell me the capital of Mongolia."
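The two prompting styles can be sketched side by side; the chat-style wrapper below is illustrative of what instruct fine-tuning bakes in, not any specific model's actual template:

```python
# A raw LLM only does next-token prediction, so you phrase the query
# as a prefix for it to complete:
base_prompt = "The capital of Mongolia is"            # completion style

# An instruction-tuned model accepts a direct request:
instruct_prompt = "Tell me the capital of Mongolia."  # instruction style

def wrap_instruction(user_message):
    # Hypothetical chat template: fine-tuning teaches the model to
    # produce an answer after markers like these, so the same weights
    # respond to commands instead of just continuing text.
    return f"<|user|>\n{user_message}\n<|assistant|>\n"

print(wrap_instruction(instruct_prompt))
```

Either way the underlying knowledge is the same; fine-tuning just changes the interface from "continue this text" to "answer this request".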



