is everything (for the most part) a Llama model? does everything fork Llama? is GGML part of Llama? what is the relation between Llama and model formats? is there an analogy, e.g. is GGML to Llama as JavaScript is to React? what is the difference between GPT4All models vs llama.cpp vs Ollama?
Everything (most LLMs and modern embedding models) is a transformer, so the architectures are very similar. Llama (2) is a Meta (Facebook) developed transformer plus the training they did on it.
Ggml is a "framework" like pytorch etc (for the purposes of this discussion) that lets you code up the architecture of a model, load in the weights that were trained, and run inference with it. Llama.cpp is a project that I'd describe as using ggml to implement some specific AI model architectures.
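To make the "code up the architecture, load the weights, run inference" part concrete, here's a toy sketch in Python. Ggml itself is a C tensor library and its real API looks nothing like this; the layer shapes and numbers below are made up purely for illustration:

```python
# Toy sketch of what an inference framework conceptually does.
# NOT ggml's actual API; all names and values are invented for illustration.

def matmul(weights, x):
    """Multiply a weight matrix (list of rows) by an input vector."""
    return [sum(w * v for w, v in zip(row, x)) for row in weights]

def relu(x):
    return [max(0.0, v) for v in x]

def forward(layers, x):
    """Run inference: push the input through each layer's trained weights."""
    for weights in layers:
        x = relu(matmul(weights, x))
    return x

# "Loading the weights" is just filling these matrices from a file on disk.
layers = [
    [[0.5, -0.2], [0.1, 0.8]],   # layer 1: 2x2 weight matrix
    [[1.0, 1.0]],                # layer 2: 1x2 weight matrix
]
print(forward(layers, [1.0, 2.0]))
```

The framework's job is to make exactly this (tensors, matmuls, loading weights) fast and portable; the model project's job (llama.cpp here) is to wire those pieces into a specific architecture.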
i am only dabbling in this space myself, so can't answer everything. all the formats i mentioned are for a quantized version of the original model. basically a lower resolution version, with the associated precision loss. e.g. original model weights are in f16, the gptq version is in int4. a big difference in size but often an acceptable loss of quality. using quants is basically a tradeoff between quality and "can i run it?".
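the core idea behind those quantized formats can be sketched in a few lines. this is just naive round-to-nearest with a single scale factor; real schemes like gptq are much smarter about where the rounding error lands, but the f16-in, int4-out tradeoff is the same:

```python
# Naive symmetric 4-bit quantization sketch (NOT the actual GPTQ algorithm):
# map float weights into integers in [-8, 7] plus one scale factor,
# and accept some rounding error when you reconstruct them.

def quantize_int4(weights):
    """Return 4-bit integer codes and the scale needed to decode them."""
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    return [v * scale for v in q]

weights = [0.12, -0.53, 0.91, 0.04, -0.77]   # pretend f16 weights
q, scale = quantize_int4(weights)
restored = dequantize(q, scale)
error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, round(error, 3))
```

each weight now needs 4 bits instead of 16, at the cost of that reconstruction error, which is exactly the quality/size tradeoff above.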
examples of original models are llama(2), mistral, xwin. they are not directly related to any quantized versions. quants are mostly done by third parties (e.g. thebloke[1]).
using a full model for inference requires pretty beefy hardware. most inference on consumer hardware is done with quantized versions for that reason.
GGML is a framework for running deep neural networks, mostly for inference. It's at the same level as PyTorch or TensorFlow. So I would say GGML is the browser in your JavaScript/React analogy.
llama.cpp is a project that uses GGML the framework under the hood, same authors. Some features were even developed in llama.cpp before being ported to GGML.
Ollama provides a user-friendly way to use llama models. No idea what it uses under the hood.
LLaMA was the model Facebook released under a non-commercial license back in February which was the first really capable openly available model. It drove a huge wave of research, and various projects were named after it (llama.cpp for example).
Llama 2 came out in July and allowed commercial usage.
But... there is an increasing number of models now that aren't actually related to Llama at all. Projects like llama.cpp and Ollama can often be used to run those too.
So "Llama" no longer reliably means "related to Facebook's LLaMA architecture".
Ollama seems to use a lot of the same pieces, wrapped in a really nice, easy-to-use package that handles the glue code a lot of us would wind up writing anyway. It's quickly become my personal preference.
It looks like it includes submodules from llama.cpp for GGML and GGUF.
there are modern products in this niche, and there is huge interest. the market certainly seems to be there (regium tried to defraud people out of almost a million dollars, i believe that was the kickstarter sum, before they got shut down).
there is squareoff [0], with new products currently in development (swap / neo)
then there was regium, an elaborate scam on kickstarter [1]
now there is phantom [2], which hopefully is not a scam. they at least posted some engineering details on hackaday [3]
squareoff has chess.com support, hopefully with lichess support coming (they are promising it, but it has not yet happened). phantom claims working lichess support and says chess.com support is in progress.
If it works I'll get two. Miss playing chess with my old roommate from college. Online works, but it would be too cool to have it sitting still on my desk only to get distracted from work by a piece moving.
If you are paranoid about something like this happening, just use https://www.qubes-os.org/. All USB devices are jailed in a non-networked VM by default.
In general, if what you do warrants that level of paranoia, qubes will help you massively.
since the qr code is just the totp seed, i simply print the seed in a huge font on a sheet of paper. the chance of enough degradation to make it illegible is pretty slim if it's stored correctly
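for the curious, regenerating codes from that printed seed really is all there is to it. here's a minimal totp sketch following rfc 6238, standard library only:

```python
# Minimal TOTP (RFC 6238) from a base32 seed: the printed seed alone
# is enough to regenerate codes forever.
import base64
import hashlib
import hmac
import struct
import time

def totp(seed_b32, t=None, digits=6, step=30):
    """Compute the TOTP code for a base32-encoded seed at unix time t."""
    key = base64.b32decode(seed_b32.upper())
    counter = int((time.time() if t is None else t) // step)
    mac = hmac.new(key, struct.pack(">Q", counter), hashlib.sha1).digest()
    offset = mac[-1] & 0x0F                       # dynamic truncation (RFC 4226)
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# RFC 6238 test vector: seed "12345678901234567890" (base32 below), t=59
print(totp("GEZDGNBVGY3TQOJQGEZDGNBVGY3TQOJQ", t=59, digits=8))  # 94287082
```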
> This means that OpenRA is not restricted by the technical limitations of the original closed-source games: it includes native support for modern operating systems and screen resolutions...
I think this is the most compelling reason - it opens up features that would be way harder to realize in the original engine.
It is a great question! Why does anybody play a specific game? For my kids gaming is a way to form a collective in-group narrative between a set of kids. That means that they all drift to the same games. My sons actually play different games together than the games they play to form a narrative with the bigger group. For me I play solo without talking about it with other adults too much. Many of the games are of the type I played with my friends when I was in the phase my kids are in now. Mostly 4x and RTS/TD. So my relaxation also has a nostalgic element. I don't grok Fortnite, but I do grok the cultural element in a few hundred million (!) people watching the same seasonal change. I wish I could have had gaming as a cultural element on that scale when I was young. Based on my preferences it beats the fads of the 90s by a very long stretch!
1) I used this [1] guide to get it working on Windows 10.
2) This [2] is a more detailed version with a lot of comments.
The bugs don't bother me that much, to be honest; but if you want to complete all the quests in a city, you should have a look at them.
The tools one uses for configuration like dgVoodoo generally work, but I guess anyone could slip in malicious code if they really wanted to.
One final tip, the Hero Editor is a great way to edit your characters if you want to turn a fire mage into a wind one or stuff like that. Since multiplayer is not common these days, I reckon it's not cheating as you've lost all the benefits of multiplayer anyway. (I think the tool is just in German, not sure.) [3]
In terms of the open frameworks, it's not my area of specialty, but there is Sacred ReBorn, and there is even a Diablo 2 mod to Sacred. I believe even dgVoodoo does edit some files to use newer (open) frameworks.
if you look at the graph, it all begins after 1971. that was the year the gold standard was finally abandoned for real.
i am not an expert and don't know all the important puzzle pieces, or even understand them. but looking at our financial systems, i think a big part of the reasons for the problems of our current form of capitalism is the unlinking of fiat currency from any form of backing.
I'm just trying to get my head around this too. I think the basic mechanics of it are:
1) Abolish gold standard, enabling government to print money
2) Government prints money which is captured by large corporations
3) 90% of the population still owns the same amount of money but it's now worth far less because the rest of the pie is having cash pumped into it.
We're seeing a current example of 2/3 with the Coronavirus government stimulus, which went directly into plutocrat pockets while the masses were handwaved away with a token gesture.
>> We're seeing a current example of 2/3 with the Coronavirus government stimulus, which went directly into plutocrat pockets while the masses were handwaved away with a token gesture.
This helps to maintain an illusion of scarcity. It keeps the productive working class desperate for fiat money. The reality is that there is no scarcity of fiat at the top echelons; this is made clear by the high valuation of cryptocurrency projects.
The top 1% has so much free money coming in that they can toss it away on some cryptocurrency projects with no effect at all on their lifestyles... And in doing so they also hedge their personal risk against the reality of an unsustainable fiat system of which they are currently beneficiaries.
Yeah in principle the money printing could have been used to devalue the assets of the wealthy and redistribute wealth, decreasing inequality. That didn't happen, presumably because corporations captured the wealth in the way that you describe.
Further, it is clear that some of those corporations (e.g. FAANG) are not the "original wealthy", which is often used as part of an argument suggesting that there is a lot of wealth mobility, but it seems incredibly limited to me, and the flow is hardening just as it did before.
trickle-down economics doesn't work the way the theories say it should; just look at quantitative easing and the ECB equivalent. all those cheap loans to banks never ended up in the real economy
This doesn't really stand up to the facts. Governments went off the gold standard well before 1971. By this logic the New Deal and WW2 government spending should've caused huge inequality through inflation.
It seems far more about the success of neoliberal policy allowing economic & political power to concentrate. The various ways that concentrated power then kept capturing more and more of the pie can't be reduced to a single simple narrative.
I don't think what you're saying here disagrees with my theory. You're describing the mechanics of the second half of part 2 ('which is captured by large corporations').
As government policy is restructured to concentrate more and more wealth at the top, the rest of the economy slowly becomes illiquid and the government has to keep printing money to keep the axles greased. It's unsustainable and we're seeing the endgame now.
Your original post put the blame for inequality on government intervention via money printing.
Govt Inflation -> Inequality.
If only we still had the gold standard then they couldn't cause inequality!
This theory however is contradicted by a bunch of historical data.
There was massive inequality on the gold standard pre-WW1. The New Deal was a huge govt intervention which reduced inequality.
My explanation is that the problem is not government interference in 'free' markets but inequality in power. Economic power via monopolies & weak labour bargaining position AND political power via lobbying, strong parties, gerrymandering.
This power inequality is then leveraged by the powerful to create wealth inequality. The exact mechanisms by which they corrupt the systems to capture that wealth, be it inflation or deflation or government handouts or M&A regulations or even slavery, don't matter. If the systems are controlled by this massive inequality in power, it will find a way to corrupt the rules in power's favour.
So arguing for any particular economic policy is less important than reforming the voting systems, the tax system, the lobbying system and ownership of the media.
What do you back it with though? If the global economy were still linked to gold, the price of gold would be astronomical and cause similar problems to oil. Any country that happened to sit on a gold reserve would suddenly be a potential world power.
Although, if I remember correctly, the price of gold was actually regulated while it was linked to the dollar, and after the gold standard was removed the price per ounce went up fast. That sort of control feels like a fiat currency anyway.
Changing the mechanics of the economy will not fix inequality; it is born from corruption in the ruling class. The more they are held to account, the more equal society will be.
The real problem now is that more and more power lies with entities that are detached from government, and the people's vote (such as it is) has even less effect on those global entities.
quantized model formats:
- GGML: used with llama.cpp, outdated; support has been dropped or will be soon. cpu+gpu inference
- GGUF: "new version" of the GGML file format, used with llama.cpp. cpu+gpu inference. offers 2-8bit quantization
- GPTQ: pure gpu inference, used with AutoGPTQ, exllama, exllamav2, offers only 4 bit quantization
- EXL2: pure gpu inference, used with exllamav2, offers 2-8bit quantization
here[1] is a nice overview of VRAM usage vs perplexity of different quant levels (with the example of a 70b model in exl2 format)
[1] https://old.reddit.com/r/LocalLLaMA/comments/178tzps/updated...
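for a rough sense of why the bit width matters so much for "can i run it?", the weight-memory math alone looks like this (weights only; the kv cache and activations add more on top, and the fractional bpw values are just example exl2-style levels):

```python
# Back-of-the-envelope VRAM math for quantized model weights.
# Assumption: weights only; real usage needs extra memory for the
# KV cache and activations.

def weight_gb(params_billion, bits_per_weight):
    """Approximate size in GB of the weights alone."""
    return params_billion * 1e9 * bits_per_weight / 8 / 1e9

# a 70b model at different quant levels:
for bits in (16, 8, 4.65, 2.4):
    print(f"{bits:>5} bpw: ~{weight_gb(70, bits):.0f} GB")
```

at f16 a 70b model is ~140 GB of weights, which is why consumer setups lean on 2-5 bpw quants instead.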