As we’ve all plunged into our AI future over the past few months, it’s easy to become comfortable around ChatGPT, OpenAI’s artificial intelligence chatbot. We’ve quickly progressed from simple questions and ideas to feeding it detailed information to solve our business, organizational, and even personal problems. That’s all good until your data suddenly ends up in the wrong hands.
Samsung’s Confidential Data Leak
The most notable recent data leak involves Samsung, the South Korean multinational manufacturing conglomerate. Initially, Samsung blocked access to ChatGPT on company hardware and devices. Access was allowed starting March 11th, and within weeks, the company experienced three confidential data leaks. From Gizmodo:
Local Korean media reports say a Samsung employee copied the source code from a faulty semiconductor database into ChatGPT and asked it to help them find a fix. In a separate case, an employee shared confidential code to try and find a fix for defective equipment. Another employee reportedly submitted an entire meeting to the chatbot and asked it to create meeting minutes. After learning about the leaks, Samsung tried to control the damage by putting in place an “emergency measure” limiting each employee’s prompt to ChatGPT to 1024 bytes.
Three data leaks may not sound like a lot, but as Tom’s Hardware notes, it’s troubling that they occurred within just twenty days of Samsung granting access. As you’d expect, Samsung has declined to comment on the incidents beyond describing them as an “internal matter.”
Several other companies have recently restricted the use of ChatGPT over data privacy concerns:
In recent weeks, both Amazon and Walmart have reportedly issued notices warning employees about sharing sensitive information with the AI model. Others, like Verizon and J.P. Morgan Chase, have blocked the tool for employees altogether.
Axios has some additional helpful – well, you might see them as disturbing – details about how ChatGPT is being used:
2.3% of workers have put confidential company information into ChatGPT, according to a recent report from Cyberhaven.
A Fishbowl survey last month found that nearly 70% of employees who are using ChatGPT at work haven’t told their bosses about it.
The Challenges With ChatGPT
ChatGPT has posed numerous challenges for education, businesses, and other organizations, but data confidentiality is a particularly tough one. The more specific you are with your prompts, the better the quality of ChatGPT’s response. And that can cause problems when you ask it to summarize internal corporate meetings or, worse, to refine proprietary code. At the moment, it’s unclear whether Samsung has suffered any long-term damage from the data leaks. But the company warned that it might again block access to ChatGPT. Once that data is submitted to ChatGPT, you are trusting OpenAI to safeguard your confidentiality – not a good position for any corporation.
Blocking access to ChatGPT is not the best solution; it only hands an advantage to competitors who may have less to lose. Instead, ComputerWorld suggests that large organizations with proprietary data may need to take Apple’s approach of transferring as little data as possible:
That, of course, is why Apple’s general approach to AI makes so much sense. While there are exceptions (including at one time the egregious sharing of data for grading within Siri), Apple’s tack is to try to make intelligent solutions that require very little data to run. The argument is that if Apple and its systems have no insight into a user’s information, then that data remains confidential, private, and secure. . . .
. . . . The future of such tools seems, then, likely to be based on taking such solutions as much as possible out of the cloud and onto the device. That’s certainly the direction of travel Microsoft seems to be signalling as it works to weave OpenAI tools within Azure and other products.
That appears to be Samsung’s response: it is implementing “protective measures” to prevent future data leaks and developing its own in-house AI platform. This may be the only practical solution for companies in highly competitive sectors. Smaller companies that can’t afford this approach may need to rely on third-party providers offering on-site large language models (LLMs) that can guarantee data privacy. There’s a huge potential market here, especially for custom-built LLMs – a project most small businesses lack the resources to undertake on their own.
Don’t Put Too Much Faith in OpenAI
OpenAI offers no procedure for individuals to check whether the company stores their personal information, or to request its deletion. These are guaranteed rights under the European Union’s General Data Protection Regulation (GDPR) – although whether ChatGPT complies with GDPR requirements is still under debate.
This “right to be forgotten” is particularly important in cases where the information is inaccurate or misleading, which seems to be a regular occurrence with ChatGPT.
OpenAI’s work on developing an easy-to-use generative AI platform is groundbreaking, but its focus is on AI development, not maintaining privacy safeguards. That is one reason Italy instituted a temporary ban on ChatGPT, and other EU countries may soon follow with similar prohibitions. In addition, there have been reports of ChatGPT displaying information from other users’ sessions. Whatever you put into ChatGPT may be used by OpenAI to improve its services.
Of course, you could always ask ChatGPT about the best ways to safeguard your and your organization’s privacy while using it. It will happily inform you that it doesn’t collect personal or business data. And indeed, OpenAI changed its terms of service in early March to no longer use customer inputs to train its AI models. But the lesson from the Samsung incident is that you should treat ChatGPT like anything else on the web: if you put confidential information there, there will always be a possibility that it ends up in someone else’s hands.
ChatGPT may be the most helpful friend and colleague you’ve ever had. Just treat it like a friend who tends to be a little too chatty with everyone else.
Emory Craig is a writer, speaker, and consultant specializing in virtual reality (VR) and artificial intelligence (AI) with a rich background in art, new media, and higher education. A sought-after speaker at international conferences, he shares his unique insights on innovation and collaborates with universities, nonprofits, businesses, and international organizations to develop transformative initiatives in XR, AI, and digital ethics. Passionate about harnessing the potential of cutting-edge technologies, he explores the ethical ramifications of blending the real with the virtual, sparking meaningful conversations about the future of human experience in an increasingly interconnected world.