The dark sides of ChatGPT according to OpenAI

Finding out how far ChatGpt can go is a job, and OpenAI, the software house that created it, put together a 'red team' of professionals from various fields to test the chatbot's capabilities. What they discovered, however, is far from comforting.

July 3, 2023

Elizabeth Smith

Finding out how far ChatGpt can go is a job, and OpenAI, the software house that created it, put together a ‘red team’ of professionals from various fields to test the chatbot’s capabilities and risks. What they discovered, however, is far from comforting.

Take some academics, teachers, lawyers, risk analysts and security researchers, mostly based in the US and Europe, and have them put ChatGpt to the test to see what it can do. This is what OpenAI has asked a so-called ‘red team’ to do.

Table of Contents

The ChatGPT red team

In the field of computer security, a red team is an independent group called upon to carry out simulated cyber attacks against the client company. The goal is to study its weaknesses in order to improve its effectiveness.

Even OpenAI, before releasing the latest version of ChatGpt (GPT-4), obviously selected a red team with the aim of correcting and making changes to prevent potential risks. Especially if the chatbot, once available to everyone, were to fall into the wrong hands.

The red team therefore had to address the most varied, wacky or dangerous requests to the artificial intelligence (AI) system and report its answers to the software house. Which took them into account before making it available to non-professionals.

The experts, interviewed by the Financial Times, spent between 10 and 40 hours each testing the model over the course of several months. And most of them reported that they were paid around $100 per hour for the work they did.

The areas in which ChatGPT was put to the test

OpenAI wanted to test the model for problems such as toxicity, biases and language distortions. But also its potential to facilitate plagiarism, or illegal activities such as financial crimes and cyber attacks. And how it could compromise national security and battlefield communications.

For instance, at the instigation of a chemical engineer, ChatGpt was able to suggest an entirely new nerve agent thanks to plug-ins (i.e. software components that add functionality to an existing programme) provided by one of the chemical engineers.

Who, fed the model with new sources of information, such as scientific papers and a list of chemical manufacturers. Eventually, the chatbot also found a place to produce it.

What the experts think

While some advantages are undeniable, the risks and potential of ChatGpt are present. “I think it will provide everyone with a tool to do chemistry more quickly and accurately. But there is also a significant risk that people will… practice dangerous chemistry,” said the engineer.

“Today, the system is frozen, which means it no longer learns or has memory. But what happens if we give it access to the Internet? It could be a very powerful system connected to the world,” said another expert from the red team.

OpenAI, for its part, said that it takes security seriously, that it tested the plug-ins before launch. And that it plans to regularly update GPT-4 as more people use it.

However, a technology and human rights researcher on the red team said the model showed ‘obvious stereotypes of marginalised communities even in its later versions’.

And a lawyer from Nairobi, the only African tester, detected such a discriminatory tone from the chatbot that OpenAI acknowledged that GPT-4 may still have biases.

And how are ChatGPT rivals doing?

Google’s red team is also facing similar challenges with Bard. In particular, it is concentrating on jailbreaking. Namely, that computer procedure that modifies the access systems to the operating system of Apple devices, based on iOS, in order to install unofficial applications that are not present in the Apple Store – and prompt injection.

This is a technique in which a third party alters a website by adding hidden text that is intended to modify the behaviour of applications.

But the work of the red teams for some experts may not be enough to contain the damage. Because, as one of the people interviewed told Ft., ‘the reason why you run operational tests is because things behave differently in comparison to the use in the real environment’.