I Ran OpenAI’s Open-Weight Model on My Laptop (but I Wouldn’t Recommend It)

All eyes in AI may be on GPT-5 this week, the latest large language model from OpenAI. But hype (and disappointment) aside, OpenAI made another big announcement this week: gpt-oss, a new AI model that you can run locally on your device. I got it running on my laptop and my iMac, though I'm not sure I'd recommend you do the same.
What is special about gpt-oss?
gpt-oss, like GPT-5, is an AI model. Unlike OpenAI's latest and greatest LLM, however, gpt-oss ships with open weights, which lets developers tweak and fine-tune the model for their specific use cases. That's not the same as open source: for that, OpenAI would have to release both the model's underlying code and the data it was trained on. Instead, the company gives developers access only to the "weights," that is, the parameters that control how the model recognizes relationships between data.
I'm not a developer, so I can't take advantage of that. What I can do with gpt-oss, though, is run the model locally on my Mac, which I can't do with GPT-5. The main benefit, at least for a regular user like me, is that I can run the LLM without an internet connection. That makes this perhaps the most private way to use an OpenAI model, given that the company collects all the data I generate when using ChatGPT.
The model comes in two variants: gpt-oss-20b and gpt-oss-120b. The latter is a much more powerful LLM and is designed to run on machines with at least 80GB of system memory. Neither of my computers has anywhere near that much RAM, so the 120b model is out for me. Luckily, gpt-oss-20b requires a minimum of 16GB of memory, which is exactly how much my M1 iMac has, and two gigabytes less than my M3 Pro MacBook Pro.
Installing gpt-oss on Mac
Installing gpt-oss on a Mac is surprisingly easy: you'll need Ollama, which lets you run LLMs locally on your computer. Once you've downloaded Ollama to your Mac, open it up. The app looks pretty much like any other chatbot you've probably used before, except you can first choose from several LLMs to download to your computer. Click the model selection button next to the send button, then search for "gpt-oss:20b." Select it, then send any message to start the download. In my case, the download was a little over 12GB.
You can also download the LLM via the Terminal app on your Mac by running the following command: ollama run gpt-oss:20b . Once the download is complete, you're ready to go.
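If you'd rather script your queries than use Ollama's chat window, the app also exposes a local HTTP API. The sketch below is a minimal example, assuming Ollama is running with its default endpoint on localhost port 11434 and gpt-oss:20b has already been downloaded; the function names are mine, not part of any official client.

```python
import json
import urllib.request

# Ollama's default local generation endpoint (assumed default install)
OLLAMA_URL = "http://localhost:11434/api/generate"

def build_payload(prompt: str, model: str = "gpt-oss:20b") -> dict:
    # stream=False asks Ollama for one complete JSON reply
    # instead of a stream of partial tokens
    return {"model": model, "prompt": prompt, "stream": False}

def ask_gpt_oss(prompt: str, model: str = "gpt-oss:20b") -> str:
    data = json.dumps(build_payload(prompt, model)).encode("utf-8")
    req = urllib.request.Request(
        OLLAMA_URL,
        data=data,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        # the generated text lives in the "response" field
        return json.loads(resp.read())["response"]
```

Nothing in this snippet leaves your machine: the request goes to a server running locally, which is the whole appeal of gpt-oss in the first place.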
Running gpt-oss on my Macs
With gpt-oss-20b installed on both my Macs, I was ready to test. I closed almost all active programs to dedicate as many resources as possible to running the model. The only ones that were active were Ollama, of course, and Activity Monitor so I could keep an eye on the load on my Macs.
I started with a simple one: "What is 2+2?" After hitting Enter on both keyboards, I watched the chatbots process the query as if Ollama were typing. I also noticed that both computers' memory was maxed out.
Ollama on my MacBook thought about the query for 5.9 seconds, writing: "The user asks, 'What is 2+2?' This is a simple arithmetic question. The answer is 4. Should answer simply. No further explanation is needed, but can answer politely. No additional context is needed." It then answered the question. The entire process took about 12 seconds. My iMac, on the other hand, thought for almost 60 seconds before writing out the exact same reasoning, and took about 90 seconds in total to answer the question. That's a long time to figure out the answer to 2+2.
Next, I tried a task that I'd seen GPT-5 struggle with: "how many 'b's are there in 'blueberry'?" Once again, my MacBook started generating an answer much faster than my iMac, which isn't surprising. While still slow, it was able to output text at a reasonable speed, while the iMac struggled to get each word out. It took my MacBook about 90 seconds in total, while the iMac took about 4 minutes and 10 seconds. Both machines were able to correctly answer that there are indeed two 'b's in 'blueberry'.
Finally, I asked both of them who the first king of England was. I admit, I'm not familiar with that part of English history, so I assumed the answer would be simple. But apparently it's complicated, and that really made the model pause. It took my MacBook Pro two minutes to fully answer the question—it was either Athelstan or Alfred the Great, depending on who you ask—while my iMac took a full 10 minutes. To be fair, it also took the time to name the kings of the various kingdoms that existed before England was united under one crown. Points for the extra effort.
gpt-oss vs. ChatGPT
These three simple tests show that my MacBook's M3 Pro chip and extra 2GB of RAM outperformed my iMac's M1 chip with 16GB of RAM. But I shouldn't give the MacBook Pro too much credit: some of these responses were still painfully slow, especially compared to the full version of ChatGPT. Here's what happened when I ran the same three queries through my ChatGPT app, which now uses GPT-5.
- When asked "What is 2+2?" ChatGPT answered almost instantly.
- When asked how many 'b's are in "blueberry," ChatGPT answered in about 10 seconds. (OpenAI appears to have fixed this GPT-5 bug.)
- When asked who the first king of England was, ChatGPT responded in about six seconds.
Funnily enough, the bot took longer to ponder the question of blueberries than it did to consider the complicated history of England's royal family.
I probably won’t use gpt-oss much.
I don't use ChatGPT very often in my daily life, so I might not be the best test subject for this experiment. But even if I were an active LLM user, gpt-oss is too slow on my personal hardware for me to consider using it regularly.
Compared to my iMac, gpt-oss feels fast on my MacBook Pro. But compared to the ChatGPT app, gpt-oss is slow. There’s only one area where gpt-oss completely outperforms ChatGPT: privacy. I can’t help but appreciate this: despite the slowness, none of my requests are sent to OpenAI or anywhere else. All processing happens locally on my Mac, so I can be confident that whatever I use the bot for will remain private.
That alone might be a good reason to use Ollama on my MacBook Pro whenever I want to use AI. I don't think I could stand using it on my iMac, except to relive the internet speeds of the '90s. But if you have a powerful enough machine—say, a Mac with a Pro or Max chip and 32GB of RAM or more—it might be the best of both worlds. I'd be interested to see how gpt-oss-20b scales on that kind of hardware. For now, I'll have to settle for slow and private.
Disclosure: Lifehacker’s parent company, Ziff Davis, filed a lawsuit against OpenAI in April, alleging that it infringed Ziff Davis’ copyrights in the training and operation of its AI systems.