Data Privacy Day: What does it mean in 2024?

24 January 2024 | Harry Borovick

This Sunday is Data Privacy Day, marking “an international effort to create awareness about the importance of respecting privacy, safeguarding data, and enabling trust.” With consistent advances and increasing use of AI globally, questions and concerns surrounding regulations and, subsequently, compliance are growing. But in such a rapidly changing landscape, lawmakers and regulators are struggling to stay ahead. With so many new suggestions, guidelines and rules being thrown around, I thought it would be helpful to answer some basic questions about AI and compliance in today’s landscape.

How does AI use data?

Generalist AI models like ChatGPT are trained on vast amounts of data from across the internet. We’re talking hundreds of billions of datapoints here! It’s during this training on huge datasets that the model learns the patterns and the grammatical and semantic structures present across text. Broadly speaking, the more data that is fed into the model, the better it becomes at predicting how words are connected with each other.
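To make the idea of "predicting how words are connected" a little more concrete, here is a deliberately tiny sketch in Python. It is not how ChatGPT works: real models use neural networks trained on far larger datasets. It simply counts which word tends to follow which in a handful of made-up sentences, then uses those counts to guess the next word. The function names and the toy sentences are my own illustrative assumptions.

```python
from collections import Counter, defaultdict


def train_bigram_counts(corpus):
    """Count, for each word, which words follow it in the corpus."""
    follows = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        # Pair every word with the word that comes directly after it.
        for current, nxt in zip(words, words[1:]):
            follows[current][nxt] += 1
    return follows


def predict_next(follows, word):
    """Return the most frequently observed follower of `word`, or None."""
    candidates = follows.get(word.lower())
    if not candidates:
        return None
    return candidates.most_common(1)[0][0]


# Hypothetical training data: three short, invented sentences.
toy_corpus = [
    "the contract is signed",
    "the contract is binding",
    "the contract is signed today",
]

model = train_bigram_counts(toy_corpus)
print(predict_next(model, "is"))  # prints "signed" (seen twice, vs "binding" once)
```

Even in this toy version, the prediction is only as good as the text the counts were built from, which is why the question of what data goes into a model matters so much.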

Why do people want to regulate AI’s use of data?

Since the release of ChatGPT over a year ago now, AI has been thrust into the mainstream consciousness like never before. With these models becoming more intelligent and more present in people’s personal and professional lives, it’s unsurprising that questions surrounding the safety and potential consequences of these technologies are being raised.

You may have heard that the New York Times has filed a lawsuit against OpenAI and Microsoft. It is accusing them of copyright infringement, alleging that the companies used millions of its articles to train AI models. Whilst it’s true that AI solutions like ChatGPT are trained on information and data scraped from across the internet, the companies have defended themselves with the argument that the material is public and the technology isn’t reproducing it in its entirety. Nonetheless, the saga has certainly added further pressure on governments to introduce stricter legislation on AI.

What’s the drawback of regulation?

It’s a fine line to tread. Of course, we need to protect everyone’s privacy and ensure the safe use of AI, but applying overly precautionary principles could stifle innovation, barring the gates of AI before it reaches its full potential. Just think about the societal benefits AI has already brought – from revolutionising diagnoses through data-driven medicine to improving access to education and aiding in the fight against climate change. We need to make sure that any regulation introduced doesn’t prevent more advances in this kind of life-changing technology.

It’s unsurprising that regulators are hesitant to be too strict too soon. Restricting the use of data would severely limit the potential of generative AI models, giving a competitive advantage to players not subject to such heavy regulation.

How have governments responded to compliance demands?

As I said earlier, it’s a very hard balance to strike, so it’s unsurprising that various governments seem to be taking different approaches. The contrast is particularly stark when comparing the EU and the US…

The EU recently revealed its ‘AI Act’, which attempts to regulate the technology by dividing AI systems into categories based on risk. Whilst this likely won’t be coming into force imminently, it has notable implications, since businesses around the world looking to trade in the EU would need to comply with its rules across all markets.

Of course, here in the UK there’s an election on the horizon, so it will be interesting to see the various attitudes politicians take towards AI. So far, Labour appears to be in favour of heavier regulation, although Starmer himself has acknowledged the opportunities AI presents and the need to embrace innovation. Time will tell!

Meanwhile, the US is taking a much more decentralised, sector-specific approach to regulation, with an emphasis on incentives, rather than constraints. With the US home to Big Tech players such as OpenAI, Google and Microsoft, it makes sense that the government is keen to sustain the economic momentum around AI.

What’s Luminance’s approach to compliance?

Luminance is in a different position to generalist models such as ChatGPT, since our AI is proprietary and has been developed through a process called fine-tuning. This is achieved through exposing the model to vast amounts of domain-specific training data. Luminance’s pioneering ‘legal-grade’ AI has been informed by over 150 million verified legal documents. So, these kinds of specialist models selectively curate data rather than scraping any information from across the entire internet.

Whilst this alleviates certain privacy concerns, that’s not to say we don’t take privacy very seriously. ‘Legal-grade’ AI automatically works in a more regulated space – as you can imagine, lawyers are extremely concerned with security and data privacy! That’s why Luminance takes a ground-up information security approach, making sure to always be transparent with our customers and treating privacy as a top concern. Practically, this ranges from having all the latest “on paper” certifications and processes, both in cybersecurity and data privacy, to ensuring our customers’ data is secured in segregated environments. This avoids the biggest risk of using generalist systems: data mixing or data use in the models without clear customer consent.