Two more recent trends have blossomed. First, organizations are making ‘open weights’ versions of LLMs, in which the weights and biases produced by training a model are publicly available, so that users can download and run them locally, if they have the computing power. Second, technology firms are making scaled-down versions that can be run on consumer hardware — and that rival the performance of older, larger models. Researchers might use such tools to save money, protect the confidentiality of patients or corporations, or ensure reproducibility… As computers get faster and models become more efficient, people will increasingly have AIs running on their laptops or mobile devices for all but the most intensive needs. Scientists will finally have AI assistants at their fingertips — but the actual algorithms, not just remote access to them.
The article’s list of small open-weights models includes Meta’s Llama, Google DeepMind’s Gemma, Alibaba’s Qwen, Apple’s DCLM, Mistral’s NeMo, and OLMo from the Allen Institute for AI. And then there’s Microsoft:
Although the California tech firm OpenAI hasn’t open-weighted its current GPT models, its partner Microsoft in Redmond, Washington, has been on a spree, releasing the small language models Phi-1, Phi-1.5 and Phi-2 in 2023, then four versions of Phi-3 and three versions of Phi-3.5 this year. The Phi-3 and Phi-3.5 models have between 3.8 billion and 14 billion active parameters, and two models (Phi-3-vision and Phi-3.5-vision) handle images. By some benchmarks, even the smallest Phi model outperforms OpenAI’s GPT-3.5 Turbo from 2023, rumoured to have 20 billion parameters… Microsoft used LLMs to write millions of short stories and textbooks in which one thing builds on another. The result of training on this text, says Sébastien Bubeck, Microsoft’s vice-president for generative AI, is a model that fits on a mobile phone but has the power of the initial 2022 version of ChatGPT. “If you are able to craft a data set that is very rich in those reasoning tokens, then the signal will be much richer,” he says…
Sharon Machlis, a former editor at the website InfoWorld who lives in Framingham, Massachusetts, wrote a guide to using LLMs locally that covers a dozen options.
The bioinformatician notes another benefit: you don’t have to worry about the company updating its models (and thereby changing their outputs). “In most of science, you want things that are reproducible. And it’s always a worry if you’re not in control of the reproducibility of what you’re generating.”
And finally, the article reminds readers that “Researchers can build on these tools to create custom applications…”
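As a minimal sketch of what such a custom application could look like — assuming Ollama, one of the local-LLM runners covered in guides like Machlis’s, is installed and serving its default HTTP API on localhost — a short script can send prompts to a model running entirely on your own machine. The model name `phi3` and the prompt are illustrative placeholders:

```python
import json
from urllib import request

# Ollama serves a simple HTTP API on localhost by default; the endpoint
# and field names below follow its /api/generate schema.
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_generate_request(model: str, prompt: str) -> dict:
    """Build the JSON payload for a single, non-streaming completion."""
    return {"model": model, "prompt": prompt, "stream": False}

def query_local_llm(model: str, prompt: str) -> str:
    """Send a prompt to a locally running model and return its reply."""
    payload = json.dumps(build_generate_request(model, prompt)).encode()
    req = request.Request(
        OLLAMA_URL,
        data=payload,
        headers={"Content-Type": "application/json"},
    )
    with request.urlopen(req) as resp:
        return json.loads(resp.read())["response"]

if __name__ == "__main__":
    # Requires `ollama pull phi3` and a running Ollama server.
    print(query_local_llm("phi3", "Summarize this paper in one sentence."))
```

Because everything stays on the local machine, no text leaves your network — the confidentiality and reproducibility benefits the article describes — and the same script works offline once the model weights are downloaded.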
Whichever approach you choose, local LLMs should soon be good enough for most applications, says Stephen Hood, who heads open-source AI at the tech firm Mozilla in San Francisco. “The rate of progress on those over the past year has been astounding,” he says. As for what those applications might be, that’s for users to decide. “Don’t be afraid to get your hands dirty,” Zakka says. “You might be pleasantly surprised by the results.”
Read more of this story at Slashdot.