You Can Now Run Your Own ChatGPT on Your Nvidia GPU

You’ve probably noticed that generative AI tools like Google Gemini and ChatGPT are making their way into most of the technologies we use every day. These tools are based on large language models (LLMs): networks trained on massive amounts of human-generated data so they can produce realistic text, images, or video.

However, you don’t need a cloud application to access these LLMs – you can also run them on your own computer. You can take advantage of everything these models offer while offline, and you won’t have to hand over your prompts and conversations to companies like Google or OpenAI.

Nvidia has now launched its own local LLM app, called Chat with RTX, that leverages the capabilities of its RTX 30 and RTX 40 series graphics cards. If you have one of these GPUs, you can install a generative AI chatbot directly on your computer and tailor it to your needs.

Where to begin

Before you begin, make sure you have the latest drivers for your Nvidia GPU (the GeForce Experience app on your computer can help with this), then head to the Chat with RTX download page. To run the tool, you’ll need Windows 11, a GeForce RTX 30 or 40 series GPU (or another RTX Ampere or Ada Generation GPU) with at least 8GB of VRAM, and at least 16GB of RAM.
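If you want to check your hardware before committing to the download, you can query your GPU and system memory with a short script. This is a minimal sketch, not part of Chat with RTX: it assumes the nvidia-smi utility (installed alongside Nvidia’s drivers) is on your PATH and that the third-party psutil package is installed.

    import subprocess
    import psutil

    # Query the GPU name and total VRAM via nvidia-smi (ships with Nvidia's drivers).
    gpu_info = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True, text=True, check=True,
    ).stdout.strip()
    print(f"GPU: {gpu_info}")  # e.g. "NVIDIA GeForce RTX 4070, 12282 MiB"

    # Chat with RTX wants at least 8GB of VRAM and 16GB of system RAM.
    ram_gb = psutil.virtual_memory().total / 1024**3
    status = "OK" if ram_gb >= 16 else "below the 16GB minimum"
    print(f"System RAM: {ram_gb:.1f} GB ({status})")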

Keep in mind that Nvidia is labeling this as a “demo app,” which we assume means it’s not in fully finished form (so you may encounter some bugs and glitches). It’s also a massive download, coming in at around 35GB, as it includes a couple of LLMs – something to consider if you’re limited in terms of disk space or internet speed. The installer also takes some time to work through all of its tasks.

Chat with RTX is launched from your PC.

Eventually, you should find the Chat with RTX app added to the Start menu. Launch it, and after a few seconds of processing, the program’s interface will appear in your default web browser. In the top-left corner, you can select the open source AI model you want to use: Mistral or Llama 2. After that, you can start sending prompts as if you were using ChatGPT or Google Gemini.

If you’ve used an LLM before, you’ll know what these generative AI engines are capable of: helping you compose emails, documents, text messages, and so on; simplifying complex topics; or answering questions you might otherwise run a web search for (for example, “what’s a good game for a party of four adults?”).

The application also leaves a command prompt window open.

Standard AI chatbot rules apply: be as specific and detailed as possible, and be careful about putting too much trust in the answers you get (especially since this is a “demo”). Chat with RTX can’t look up current information on the internet, so it’s not much good for answers that need to be up to date, but it will always try to give you a response based on the mass of online text it was trained on.

At the bottom of the interface, you have a button to generate a new response to the last prompt (if you’re not entirely happy with the current one), an undo button (to roll back to the previous prompt), and a delete chat button, which will clear your conversation history so you can start over. There’s currently no way to export responses other than copying and pasting the text.

Add your own data and YouTube videos

Even in this early form, Chat with RTX has several useful features, one of which is the ability to base its answers on documents you provide: perhaps a week’s worth of research notes, a series of reports you need to analyze, or even an entire piece of fan fiction you’ve written. Under the Dataset heading, select Folder Path, then point the program to the folder containing the documents you want to use.

The app will scan the folder you pointed it to (this may take a minute or two if it contains a lot of files), and then you can start typing your queries. The bot will scan the text for suitable answers, then name the files it used at the end of each response. You can ask for a summary, check facts, or have the bot generate new text based on the documents you supplied.
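Under the hood, this kind of feature is typically built on retrieval-augmented generation (RAG): the app indexes your files, pulls out the passages most relevant to your question, and hands them to the LLM alongside your prompt. The sketch below illustrates only the retrieval step, with a naive word-overlap score standing in for the vector embeddings a real system would use; the folder name and query are placeholders, and none of this is Nvidia’s actual code.

    from pathlib import Path

    def score(query: str, text: str) -> int:
        # Naive relevance score: count how many query words appear in the text.
        words = set(query.lower().split())
        return sum(word in text.lower() for word in words)

    def retrieve(query: str, folder: str, top_k: int = 3) -> list[tuple[str, int]]:
        # Rank the .txt files in a folder by their overlap with the query.
        results = [
            (path.name, score(query, path.read_text(errors="ignore")))
            for path in Path(folder).glob("*.txt")
        ]
        results.sort(key=lambda pair: pair[1], reverse=True)
        return results[:top_k]

    # The top-scoring files become context for the LLM, which is also why
    # the bot can name its source files at the end of each answer.
    for name, hits in retrieve("what does the report say about revenue?", "./my_docs"):
        print(name, hits)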

Chat with RTX can also work with your files.

Again, this is an early version of a technology that’s known to be less than 100 percent accurate – not yet something you want to base important decisions on. However, it’s fun to play around with an LLM that can work with the documents you give it, whether that’s interview transcripts or volumes of poetry.

Speaking of transcripts, Chat with RTX can also analyze YouTube videos and answer questions about them, via the transcripts associated with the clips. (Based on our testing, the program can automatically generate transcripts for videos that don’t already have them.) This even works with YouTube playlists, so you can run the program on a whole series of clips at once.

You can also point the program to a YouTube video.

Select YouTube URL as your dataset, then paste the video or playlist URL into the box below. If you’re working with a playlist, you can specify how many of its videos you want to include in the box on the right. Finally, click the download button on the far right, and Chat with RTX will download and process the transcripts, ready for any queries you may have.
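If you’re curious how transcripts can be fetched programmatically, the snippet below uses the third-party youtube-transcript-api package (pip install youtube-transcript-api) to pull a video’s captions. This is an independent illustration, not the mechanism Chat with RTX uses internally; it assumes the package’s classic get_transcript interface, and the video ID is a placeholder.

    from youtube_transcript_api import YouTubeTranscriptApi

    # "dQw4w9WgXcQ" is a placeholder video ID; swap in your own.
    segments = YouTubeTranscriptApi.get_transcript("dQw4w9WgXcQ")

    # Each segment is a dict with "text", "start", and "duration" keys.
    transcript = " ".join(segment["text"] for segment in segments)
    print(transcript[:500])  # print the first 500 characters as a sanity check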

As with the document-scanning feature, this can be useful for getting summaries or pulling out key pieces of information, but in our time with it, this was the buggiest part of the program (a program that, to be fair, is labeled as version 0.2). The app sometimes lost track of which video we were asking about, but if you need quick answers about long videos and playlists you don’t have time to watch, Chat with RTX can be a useful tool.
