Don’t Trust a Government Chatbot to Give You Good Advice

New York’s AI chatbot MyCity is off to a rocky start. The city government introduced the technology five months ago in an effort to help residents interested in doing business in the Big Apple find useful information.

While the bot will happily answer your questions with what looks like a legitimate response at first glance, The Markup’s investigation found that the bot gets things wrong, and often. For example, when asked whether an employer can take a portion of its employees’ tips, the bot answers in the affirmative, even though the law says bosses cannot take tips from their workers. When asked whether buildings have to accept Section 8 vouchers, the bot answers no, even though landlords cannot discriminate based on a prospective tenant’s source of income. When asked whether you can make your store cashless, the bot essentially says go ahead, telling users, “There are no rules in New York that require businesses to accept cash as a form of payment,” which is simply wrong: cashless establishments have been banned in New York since early 2020.

To the city’s credit, the site does warn users not to rely on the chatbot’s responses as a substitute for professional advice, and to verify any statements via the links it provides. The problem is that some answers contain no links at all, which makes it even harder to check whether what the bot says is true. Which raises the question: who is this technology actually for?

AI has a tendency to hallucinate

This story won’t shock anyone who follows the latest developments in artificial intelligence: it turns out that chatbots sometimes just make things up. The phenomenon is called hallucination: AI models trained to respond to user queries will confidently produce an answer based on their training data, whether or not that answer is true. Because these networks are so complex, it’s difficult to know exactly when or why a bot will decide to spin a particular piece of fiction in response to your question, but it happens often.

It’s actually not New York City’s fault that its chatbot hallucinates that you can deprive your employees of tips: the bot runs on Microsoft Azure AI, a widely used artificial intelligence platform that companies like AT&T, Reddit, and Volkswagen also rely on for various services. The city likely paid for access to Microsoft’s AI technology to power its chatbot in an honest attempt to help New Yorkers interested in starting a business, only to discover that the bot was hallucinating wildly wrong answers to important questions.

When will the hallucinations stop?

It’s possible that these unpleasant situations will soon be behind us: Microsoft has a new safety system that can detect and protect customers from the darker side of artificial intelligence. In addition to tools that help block attackers from using your AI as a malicious tool and that assess potential security vulnerabilities within AI platforms, Microsoft is introducing a feature called Groundedness Detection, which can monitor for potential hallucinations and intervene when necessary. (An “ungrounded” response is, in effect, another term for a hallucination.)
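Groundedness Detection ships as part of Microsoft’s Azure AI Content Safety service, so in practice a check is just an HTTP call that hands the service the chatbot’s answer along with the source documents that answer is supposed to be grounded in. The sketch below is only an illustration of that flow: the endpoint path, API version, and payload/response field names are assumptions drawn from preview documentation and may not match the current API.

```python
# Minimal sketch of a groundedness check against Azure AI Content Safety.
# Assumptions: endpoint path, api-version, and field names are taken from
# preview docs and may have changed; treat this as illustrative, not canonical.
import os
import requests

ENDPOINT = os.environ["CONTENT_SAFETY_ENDPOINT"]  # e.g. https://<resource>.cognitiveservices.azure.com
API_KEY = os.environ["CONTENT_SAFETY_KEY"]

def check_groundedness(question: str, answer: str, sources: list[str]) -> dict:
    """Ask the service whether `answer` is supported by `sources` (hypothetical payload shape)."""
    url = f"{ENDPOINT}/contentsafety/text:detectGroundedness?api-version=2024-02-15-preview"
    payload = {
        "domain": "Generic",
        "task": "QnA",
        "qna": {"query": question},
        "text": answer,               # the chatbot's claim to verify
        "groundingSources": sources,  # the documents the claim should be grounded in
        "reasoning": False,
    }
    resp = requests.post(url, json=payload, headers={"Ocp-Apim-Subscription-Key": API_KEY})
    resp.raise_for_status()
    # The response is expected to flag ungrounded content, e.g. "ungroundedDetected".
    return resp.json()

if __name__ == "__main__":
    result = check_groundedness(
        question="Can I make my store cashless in New York City?",
        answer="There are no rules in New York that require businesses to accept cash.",
        sources=["Since 2020, New York City requires most stores and restaurants to accept cash."],
    )
    print(result)
```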

When Microsoft’s system detects a possible hallucination, it lets customers compare the current version of the AI with the version that existed before the feature was deployed; highlight the hallucinated statement and either fact-check it or perform “knowledge base editing,” which presumably lets you edit the underlying grounding data to fix the problem; rewrite the hallucinated statement before it is sent to the user; or evaluate the quality of synthetic training data before using it to generate new synthetic data.

Microsoft’s new system runs on a separate language model performing Natural Language Inference (NLI), which continuously evaluates the AI’s claims against the source data. Of course, since the fact-checking system is itself an LLM, couldn’t the NLI model hallucinate its own analysis? (Probably! Just kidding, just kidding. Sort of.)
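NLI itself is not a Microsoft invention; it’s a standard NLP task: a model reads a “premise” (the source material) and a “hypothesis” (the claim) and scores whether the premise entails, contradicts, or is neutral toward the claim. As a rough, self-contained illustration of the idea, and emphatically not Microsoft’s actual system, here is what that check looks like with an off-the-shelf open-source NLI model:

```python
# Illustration only: uses the public roberta-large-mnli checkpoint, not Microsoft's
# groundedness model. Requires: pip install transformers torch
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

MODEL_NAME = "roberta-large-mnli"
tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)
model.eval()

def nli_check(source_text: str, claim: str) -> dict[str, float]:
    """Score how a claim relates to its source: contradiction, neutral, or entailment."""
    inputs = tokenizer(source_text, claim, return_tensors="pt", truncation=True)
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits.softmax(dim=-1)[0]
    # Label order for roberta-large-mnli: 0=contradiction, 1=neutral, 2=entailment
    return {
        "contradiction": probs[0].item(),
        "neutral": probs[1].item(),
        "entailment": probs[2].item(),
    }

# The source the answer should be grounded in (paraphrased for the example):
source = "Since 2020, New York City requires most stores and restaurants to accept cash."
# The chatbot's claim:
claim = "There are no rules in New York that require businesses to accept cash."

scores = nli_check(source, claim)
if scores["contradiction"] > max(scores["entailment"], scores["neutral"]):
    print("Flag: the claim contradicts the source material.", scores)
else:
    print("Claim appears consistent with the source.", scores)
```

Microsoft’s in-house model is presumably far more capable than this toy example, but the underlying principle is the same: check each generated claim against the grounding sources before it reaches the user.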

This could mean that organizations like New York City, which use Azure AI in their products, could have an LLM fighting hallucinations in real time. Perhaps when the MyCity chatbot tries to claim that you can run a cashless business in New York, the NLI model will quickly correct the statement, and what you see as the end user will be the real, accurate answer.

Microsoft only just released this new software, so it’s not yet clear how well it will work. For now, if you’re a New Yorker, or anyone relying on a government chatbot for answers to legitimate questions, take those answers with a hefty grain of salt. I don’t think “the MyCity chatbot said I could!” is going to hold up in court.
