Demystifying Luminance’s unsupervised machine learning

9 April 2021 | Luke Taylor, Subject Matter Expert, Luminance

Artificial Intelligence (AI) has been disrupting the way we work at an accelerating pace for quite some time now. Platforms like Luminance are bringing real value to organisations, enabling lawyers to analyse their datasets rapidly, confident that they are fully apprised of the contents. This is all made possible by Luminance’s powerful combination of supervised and unsupervised machine learning. Let’s take a look at Luminance’s unsupervised machine learning, whereby the technology forms an understanding of the dataset without human input.

Lawyers today are inundated by data

The explosion in enterprise data has meant that lawyers are now simply unable to digest the volumes of data required for a complete M&A due diligence transaction or in preparation for upcoming litigation. Indeed, it is now not uncommon for lawyers to be reviewing thousands, if not tens of thousands, of documents, often within very tight timeframes. And in a profession where time is of the essence and lawyers must respond quickly and dynamically to legal issues, reviewing datasets in their entirety can often be an unfeasible task. Human error is difficult to avoid with datasets this large and complex, and resorting to risky ‘sampling’ methods, where only a small portion of the dataset is reviewed, only increases the chance of missing something crucial.

As a result, an increasing number of lawyers are turning to advanced technology such as AI to gain maximum insight into their dataset, with recent research by Luminance finding that over 80% of lawyers believe AI and machine learning tools are critical to the future of the industry.

Nonetheless, a lot of so-called ‘AI’ tools out there are in reality still relying on legacy approaches. Rules-based systems mean that before the lawyer can even get up and running on the platform, the system needs weeks of machine training and rules configuration, at significant time and cost. Further, the inflexibility of these systems means that the technology is only able to surface the results that it is pre-programmed to find, and therefore cannot identify the hidden risks that may be critical to a review.

Luminance and the true AI approach

Luminance does not require any lengthy set-up or intensive machine training. Deployed via the cloud and using a powerful blend of unsupervised and supervised machine learning, Luminance is able to immediately read and form an understanding of legal data.

With unsupervised machine learning, there are no predefined labels and the system therefore does not need machine ‘training’. Instead, the machine actively interacts with the dataset and looks for patterns across this information. The technology is then able to identify the underlying patterns within the data in the form of similarities or anomalies. It is this aspect of Luminance’s unsupervised machine learning which powers its ability to automatically identify and classify similar documents and clauses, and to surface key datapoints like dates, parties and currencies.
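To make the idea of finding similarities and anomalies without labels more concrete, here is a minimal Python sketch. It is a toy illustration only and makes no claim about how Luminance itself works: the word-count vectors, the cosine similarity measure, the 0.5 threshold, the function names and the sample clauses are all assumptions chosen for the example.

```python
import math
import re
from collections import Counter


def vectorise(text):
    """Turn a text into a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two count vectors (0 = no shared words)."""
    dot = sum(a[word] * b[word] for word in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def group_and_flag(texts, threshold=0.5):
    """Group texts by similarity with no labels supplied.

    Each text joins the first group whose first member it resembles;
    otherwise it starts a new group. Texts left alone in a group are
    reported as anomalies. The threshold is an arbitrary assumption.
    """
    vectors = [vectorise(t) for t in texts]
    groups = []  # each group is a list of indices into `texts`
    for i, vec in enumerate(vectors):
        for group in groups:
            if cosine(vec, vectors[group[0]]) >= threshold:
                group.append(i)
                break
        else:
            groups.append([i])
    anomalies = [g[0] for g in groups if len(g) == 1]
    return groups, anomalies


# Hypothetical clauses, invented for this example.
clauses = [
    "The tenant shall pay rent monthly in advance.",
    "The tenant shall pay the rent quarterly in advance.",
    "Either party may terminate this agreement on thirty days notice.",
    "Either party may terminate this agreement on sixty days written notice.",
    "Neither party is liable for failure caused by epidemic or government order.",
]

groups, anomalies = group_and_flag(clauses)
print(groups)     # indices of clauses that were grouped together
print(anomalies)  # indices of clauses that resemble nothing else
```

In this toy dataset the two rent clauses and the two termination clauses fall into groups of their own, while the final clause matches nothing else and is flagged. Nobody told the program what a rent clause or a termination clause is; the grouping comes purely from patterns in the text.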

Expose the hidden risks in the dataset with Luminance’s unsupervised machine learning

Luminance’s unique ability to surface insight without reference to pre-conceived human notions or labels also allows it to uncover the hidden risks or ‘smoking guns’ in the dataset. These are sometimes called ‘unknown unknowns’: risks that the reviewer did not know were in the dataset and thus did not think to search for. A key example of this in practice was when the world’s largest law firm, Dentons, used Luminance to conduct a Covid-19-focused review of a set of real estate agreements. With Luminance, on day one, the team discovered a crucial yet unexpected force majeure clause which nullified the agreements in their entirety. Indeed, the clause was buried amongst clauses totally unrelated to force majeure or termination of contract and would have been easily missed using manual review methods.

Unsupervised machine learning is also critical when lawyers start to delve deeper into the dataset. For instance, when previewing documents, Luminance can automatically suggest to users similarly worded documentation from within the dataset. Let’s take the example of the boutique Indian law firm, PSL, who used Luminance for an arbitration review. The team needed to comb through over 110GB of data (180,000 documents) to put together the statements of claim for their client, who believed that the respondent had breached a certain agreement. During their review, PSL found an example of a respondent breaching an agreement and Luminance was able to instantly surface all other conceptually similar results. This critical discovery not only expedited the review by 75%, but also provided PSL with the utmost confidence that nothing had been missed.
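The general idea of surfacing results that resemble an example a reviewer has just found can be sketched in a few lines of Python. This is a simplified illustration, not a description of Luminance’s technology: plain word overlap (Jaccard similarity) stands in for conceptual similarity, and the function names, document names and texts are all hypothetical.

```python
import re


def tokens(text):
    """Set of lower-cased words in a text."""
    return set(re.findall(r"[a-z]+", text.lower()))


def jaccard(a, b):
    """Share of words two texts have in common (0 = none, 1 = identical sets)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def most_similar(example, documents, top_n=2):
    """Rank documents by word overlap with the example; return the best names."""
    example_tokens = tokens(example)
    scored = [
        (jaccard(example_tokens, tokens(text)), name)
        for name, text in documents.items()
    ]
    # Highest score first; ties broken alphabetically by name.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [name for score, name in scored[:top_n] if score > 0]


# Hypothetical passage the reviewer has just found.
example = "The respondent failed to deliver the goods by the agreed date."

# Hypothetical documents elsewhere in the dataset.
documents = {
    "email_014": "Respondent again failed to deliver goods by the agreed date.",
    "memo_203": "Lunch menu for the quarterly staff meeting.",
    "letter_077": "The respondent did not deliver the goods on the agreed date.",
}

print(most_similar(example, documents))
```

Here the email and the letter rank above the unrelated memo. A production system would need a far richer notion of similarity than shared words, but the workflow is the same: the reviewer supplies one example and the system ranks everything else against it.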

The supervised and unsupervised machine learning combination is what makes Luminance so powerful

In recent years, there has been a lot of talk about whether unsupervised machine learning is needed in a review, and even misguided claims that lawyers may lose oversight by relying on unsupervised machine learning techniques. That is why Luminance relies on the powerful combination of supervised and unsupervised machine learning in a review: unsupervised machine learning provides lawyers with immediate insight into their dataset, while supervised machine learning, where the machine learns from the interactions of the reviewer, enables lawyers to steer the direction of the review, resulting in substantial time savings and value.