You Can Run Generative AI Locally on Your Computer
You may have already tested generative AI engines such as ChatGPT and Google Bard . But while accessing these tools in the cloud is very popular, you can also install them locally on your computer. There are some real benefits to this: it’s certainly more private, and you won’t get any warnings that the AI is overloaded or unavailable. Plus, it’s just cool.
To get started, you’ll need a program to run the AI and a large language model (or LLM) to generate the responses. These LLMs are the basis of AI text generators. GPT-4 is the latest version that powers ChatGPT, and Google has now supplanted Gemini as the new and improved LLM powered by Google Bard.
If you’ve never heard the term LLM before, you clearly haven’t read our complete artificial intelligence glossary . They require a certain level of scientific and mathematical knowledge to fully understand, but essentially LLMs are trained on huge amounts of sample data and learn to recognize relationships between words and sentences (i.e. which words typically follow each other).
Simply put, LLMs are supercharged auto-leveling motors. They don’t really “know” anything, but they understand how words should fit together to sound natural and make sense. At a high enough level, it starts to look like you’re talking to a real person. There’s more to it, but you get the idea.
When it comes to running your own LLM programs, you don’t have to be a huge company or research organization to get access to them: there are several publicly available programs, including one released by Meta called LLaMa ; others were developed by researchers and volunteers. The general idea is that public LLM programs will help stimulate innovation and increase transparency.
For the purposes of this tutorial, we’re going to use LM Studio to show you how to install LLM locally. This is one of the best options for working (although there are many others). It is free to use and you can set it up on Windows, macOS and Linux systems.
How to set up a local LLM
The first step is to download LM Studio from the official website, taking into account the minimum system requirements: LLM work is quite demanding, so you will need a fairly powerful computer for this. Windows or Linux PCs that support AVX2 (usually on newer machines) and Apple Silicon Macs running macOS 13.6 or later will work, with at least 16GB of RAM recommended. On a PC, at least 6 GB of video memory is also recommended.
Once you’ve launched the software, you’ll need to find an LLM to download and use—there’s not much you can do without it. Part of the appeal of LM Studio is that it recommends “new and noteworthy” LLMs on the first screen of the app, so if you have no idea which LLM you need, you can select it here.
You’ll find that LLMs vary in size, complexity, data sources, purpose, and speed: there’s no right or wrong answer for which one to use, but there’s plenty of information on sites like Reddit and Hugging Face . if you want to do a little research. As you’d expect, LLM files can be several gigabytes in size, so you may want to do some background reading while you wait for it to download.
If you see an LLM you like on your home screen, simply click “Download.” Otherwise, you can run a search or paste the URL into the box at the top. You will be able to see the size of each LLM to estimate the loading time, as well as the date of the last update. You can also filter the results to see the models that have been downloaded the most.
You can install as many LLMs as you like (as long as you have space), but if you have at least one on your system, they will appear in the My Models panel. (Click the folder icon on the left to get there.) From here, you can view information about each model you have installed, check for updates, and uninstall models.
To start prompts, open the AI Chat panel using the speech bubble icon on the left. Select the model you want to use at the top, then enter your query in the user message box at the bottom and press Enter. The kind of output you get will be familiar if you’ve used an LLM like ChatGPT before.
On the right side, you can control various LLM-related settings, including how longer responses are processed and how much processing is offloaded to your system’s GPU. There is also a field for “pre-request”. For example, you can instruct the LLM to always respond in a certain tone or language style.
Click the New Chat button on the left if you want to start a new conversation, and your previous chats are saved at the bottom in case you need to return to them. Whenever generation of a particular response completes, you are given the option to take a screenshot, copy the text, or regenerate another response from the same request.
That’s all! You are already working with local LLMs. There are all sorts of possibilities you can explore in terms of LLM development, and tips if you want to dig deeper, but the basics aren’t hard to understand, and LM Studio makes the setup process super easy, even if you’re a complete beginner. .