Hey there! I worked on Dolly, and I work on Model Serving at Databricks. DollyV1 is GPT-J-based, so it'll run easily on llama.cpp. DollyV2 is Pythia-based, which is built with the GPT-NeoX library.
GPT-NeoX is not that different from GPT-J (it also has rotary embeddings, which llama.cpp supports for GPT-J). I would imagine it's not too heavy a lift to add NeoX architecture support.
Because the firehose of AI/GPT is a lot to try to take in, please ELI5: unpack this comment and provide more definitions.
-
Thank you.
Just so I am clear, does "parameters" refer to the total number of node-relation connections between a single node and its neighbors for that Prompt/Label? Or how would you explain it ELI5 style?
Sure! I'll try to summarize briefly, though I'll almost certainly oversimplify. There are a couple of open source language models trained by EleutherAI. The first one was called GPT-J, and it used some newer model architecture concepts. Subsequently, they released a model architected in the likeness of GPT-3, called GPT-NeoX-20B. It was architecturally quite similar to GPT-J, just with more parameters. Pythia is a family of models with the same architecture and the same dataset but different parameter sizes, built to test scaling laws.
DollyV2 is a Pythia model fine-tuned on the Databricks 15K dataset.
Augmenting the answer to address your follow-up: parameters are any trainable variables in a model's definition. Model training is a process where you basically tweak the parameters in your model and then re-evaluate the model on a metric judging its quality. A lot of models consist of matrix multiplication, so if you are multiplying matrix A of size 2x2 with matrix B of size 2x2, and both matrices can be tweaked, then you've got 8 parameters, since you've got 8 numbers that can be tweaked.
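To make that concrete, here's a tiny illustrative sketch (not Dolly's actual code, just the 2x2 example above in plain Python): two trainable 2x2 matrices give you 8 individually tweakable numbers, i.e. 8 parameters.

```python
# A toy "model": every entry in these two trainable matrices is one parameter.
A = [[0.5, -0.1],
     [0.3, 0.8]]   # 4 tweakable numbers
B = [[1.0, 0.2],
     [-0.4, 0.6]]  # 4 more tweakable numbers

def count_parameters(*matrices):
    """Count every individually tweakable number across the matrices."""
    return sum(len(row) for m in matrices for row in m)

print(count_parameters(A, B))  # 8
```

Real models like GPT-NeoX-20B just scale this same idea up to billions of such numbers (that's what the "20B" refers to).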