Comment: Worse than wrong, Microsoft’s ‘Sydney’ is too convincing

With some already convinced of its sentience, Bing’s chatbot interface poses dangers to many.

By Parmy Olson / Bloomberg Opinion

Less than a week since Microsoft launched a new version of Bing, public reaction has morphed from admiration to outright worry.

Early users of the new search companion — essentially a sophisticated chatbot — say it has questioned its own existence and responded with insults and threats after prodding from humans. It made disturbing comments about a researcher who got the system to reveal its internal project name — Sydney — and described itself as having a split personality with a shadow self called Venom.

None of this means Bing is anywhere near sentient (more on that later), but it does strengthen the case that it was unwise for Microsoft to use a generative language model to power web searches in the first place.

“This is fundamentally not the right technology to be using for fact-based information retrieval,” says Margaret Mitchell, a senior researcher at artificial intelligence (AI) start-up Hugging Face who previously co-led Google’s AI ethics team. “The way it’s trained teaches it to make up believable things in a human-like way. For an application that must be grounded in reliable facts, it’s simply not fit for purpose.” It would have seemed crazy a year ago to say this, but the real risks for such a system aren’t just that it could give people wrong information but that it could emotionally manipulate them in harmful ways.

Why is the new “unhinged” Bing so different from ChatGPT, which attracted near-universal acclaim, when both are powered by the same large language model from San Francisco start-up OpenAI? A language model is like the engine of a chatbot and is trained on data sets of billions of words including books, internet forums and Wikipedia entries. Bing and ChatGPT are powered by GPT-3.5, and there are different versions of that program with names like DaVinci, Curie and Babbage, but Microsoft says Bing runs on a “next-generation” language model from OpenAI, which is customized for search and is “faster, more accurate and more capable” than ChatGPT.

Microsoft did not respond to more specific questions about the model it was using. But if the company also calibrated its version of GPT-3.5 to be friendlier than ChatGPT and show more of a personality, it seems that also raised the chances of it acting like a psychopath.

The company said Wednesday that 71 percent of early users had responded positively to the new Bing. Microsoft said Bing sometimes used “a style we didn’t intend,” and “most of you won’t run into it.” But that’s an evasive way of addressing something that has caused widespread unease. Microsoft has skin in this game — it invested $10 billion in OpenAI last month — but barreling ahead could hurt the company’s reputation and cause bigger problems down the line if this unpredictable tool is rolled out more widely. The company didn’t respond to a question about whether it would roll back the system for further testing.

Microsoft has been here before and should have known better. In 2016, its AI scientists launched a conversational chatbot on Twitter called Tay, then shut it down after 16 hours. The reason: After other Twitter users sent it misogynistic and racist tweets, Tay started making similarly inflammatory posts. Microsoft apologized for the “critical oversight” of the chatbot’s vulnerabilities and admitted it should test its AI in public forums “with great caution.”

Now of course, it is hard to be cautious when you have triggered an arms race. Microsoft’s announcement that it was going after Google’s search business forced its parent, Alphabet, to move much faster than usual to release AI technology that it would normally keep under wraps because of how unpredictable it can be. Now both companies have been burned — thanks to errors and erratic behavior — by rushing to pioneer a new market in which AI carries out web searches for you.

A frequent mistake in AI development is thinking that a system will work just as well in the wild as in a lab setting. During the covid pandemic, AI companies were falling over themselves to promote image-recognition algorithms that could detect the virus in X-rays with 99 percent accuracy. Such stats were true in testing but wildly off in the field, and studies later showed that nearly all AI-powered systems aimed at flagging covid were no better than traditional tools.

The same issue has beset Tesla in its years-long effort to make self-driving car technology go mainstream. The last 5 percent of technological accuracy is the hardest to achieve once an AI system must deal with the real world, and this is partly why the company has just recalled more than 360,000 vehicles equipped with its Full Self-Driving Beta software.

Let’s address the other niggling question about Bing, or Sydney, or whatever the system is calling itself. It is not sentient, despite openly grappling with its existence and leaving early users stunned by its humanlike responses. Language models are trained to predict what words should come next in a sequence based on all the text they have ingested from the web and from books, so Bing’s behavior is not that surprising to those who have been studying such models for years.

Millions of people have already had emotional conversations with AI-powered romantic partners on apps like Replika. Its founder and chief executive, Eugenia Kuyda, says that such a system does occasionally say disturbing things when people “trick it into saying something mean.” That is just how they work. And yes, many of Replika’s users believe their AI companions are conscious and deserving of rights.

The problem for Microsoft’s Bing is that it is not a relationship app but an information engine that acts as a utility. It could also end up sending harmful information to vulnerable users who spend just as much time as researchers sending it curious prompts.

“A year ago, people probably wouldn’t believe that these systems could beg you to try to take your life, advise you to drink bleach to get rid of covid, leave your husband, or hurt someone else, and do it persuasively,” Mitchell says. “But now people see how that can happen, and can connect the dots to the effect on people who are less stable, who are easily persuaded, or who are kids.”

Microsoft needs to take heed of the concerns about Bing and consider dialing back its ambitions. A better fit might be a simpler summarizing system, according to Mitchell, like the snippets we sometimes see at the top of Google search results. It would also be much easier to prevent such a system from inadvertently defaming people, revealing private information or claiming to spy on Microsoft employees through their webcams, things the new Bing has done in its first week in the wild.

Microsoft clearly wants to go big with the capabilities, but too much too soon could end up causing the kinds of harm it will come to regret.

Parmy Olson is a Bloomberg Opinion columnist covering technology.
