Using NLP to Rapidly Analyze Legal Documents: The Class Action Lawsuit against Sam Bankman-Fried

By: Christopher Sanfilippo

In this article, we demonstrate how natural language processing (NLP), a branch of machine learning, can rapidly derive insights from an unstructured PDF document and its supporting content (online press coverage, Twitter, etc.). We analyze the recent class action lawsuit filed against Sam Bankman-Fried. Our NLP engine processes the document, extracts key attributes of the case, and lets a human quickly query the document in context and discover themes of interest.

SBF, as he is commonly known, was the founder and CEO of FTX, a Bahamas-based cryptocurrency trading platform. He is now at the center of the scandal surrounding the solvency crisis and subsequent collapse of FTX. He is accused of, among other things, fraud and embezzlement.

The steps taken to analyze this document using our machine learning platform, Quantum Sim, were as follows.

We downloaded the publicly available filing and uploaded it into Quantum Sim.

Quantum Sim automatically discovered supporting content that is referenced in the document.

Then the training process began. For this 41-page document plus hundreds of megabytes of supporting content, training took under an hour on a laptop with mid-range processing power.
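How Quantum Sim discovers supporting content is not described in this article. As a rough illustration of the idea only, one simple approach is to pull the URLs out of the extracted document text so that the linked material can be collected. The function name, the regular expression, and the sample text below are all assumptions made for this sketch.

```python
import re

# Assumed pattern: an http(s) URL runs until whitespace or a closing bracket.
URL_PATTERN = re.compile(r"https?://[^\s<>)\]]+")

def extract_referenced_urls(text):
    """Return the unique URLs found in text, in order of first appearance."""
    seen = set()
    urls = []
    for match in URL_PATTERN.finditer(text):
        # Drop punctuation that belongs to the sentence, not the URL.
        url = match.group(0).rstrip(".,;:")
        if url not in seen:
            seen.add(url)
            urls.append(url)
    return urls

# Invented sample text, standing in for text extracted from a PDF.
sample = (
    "See https://example.com/press-release. "
    "Also cited: (https://example.org/post/123), "
    "and again https://example.com/press-release."
)

for url in extract_referenced_urls(sample):
    print(url)
```

A real pipeline would also need to fetch each link, handle citations that are not URLs, and filter out irrelevant pages; none of that is attempted here.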

The output is a model trained on the combined content, consisting of the original document and the discovered supporting content. How does Quantum Sim make sense of the content in a meaningful way? Leveraging transfer learning, it starts from pre-trained models and adapts them to better contextualize content for a specific task. In this case, the specific task is parsing legal text.

The model contextualizes the words and sentences—in a manner akin to how a human would—which allows the user to query the target document in plain English.

Once we have our trained model, we can ask it questions. The image to the right shows some example queries.


Let’s take a closer look at the second example, “celebrities endorsed FTX.” The exact phrase does not appear in the document, but when we search for it, Quantum Sim understands what we are looking for and finds passages that are likely to reveal which celebrities endorsed FTX. Quantum Sim also returns a “similarity score,” which essentially tells us how certain the algorithm is that the passage it found is relevant to our query.
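The article does not say how Quantum Sim computes its similarity score. A common way to score a passage against a query in semantic search is the cosine similarity between their embedding vectors, and the sketch below assumes that approach. The three-dimensional vectors and passage names are invented for illustration; in a real system they would come from a trained language model.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # a zero vector has no direction to compare
    return dot / (norm_a * norm_b)

def rank_passages(query_vec, passage_vecs):
    """Return (score, passage_id) pairs, highest score first."""
    scored = [
        (cosine_similarity(query_vec, vec), passage_id)
        for passage_id, vec in passage_vecs.items()
    ]
    return sorted(scored, reverse=True)

# Hypothetical embeddings: made-up numbers, not model output.
query_vec = [0.9, 0.8, 0.1]  # stands in for "celebrities endorsed FTX"
passage_vecs = {
    "brand_ambassador_passage": [0.8, 0.9, 0.2],
    "jurisdiction_passage": [0.1, 0.2, 0.9],
}

for score, passage_id in rank_passages(query_vec, passage_vecs):
    print(f"{score:.3f}  {passage_id}")
```

Because the score depends on the direction of the vectors rather than on shared words, a passage can rank highly even when it contains none of the query's exact terms.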

In the above figure on the left, we see sentences (keyed to locations in the document) that Quantum Sim believes are relevant to our query. If we click on one, we are taken to the part of the document where that sentence occurs:

The above image is taken directly from the filing. Quantum Sim connected “celebrity” to “star quarterback and businesswoman” and “ambassador” to “endorse,” and so returned this section of the filing as likely relevant to our query. A quick scan of the passage picked out by Quantum Sim confirms for us, the human readers, that the individuals named did endorse FTX and are in fact defendants in the lawsuit.

Quantum Sim can also help reveal hidden relationships that may be of interest. The GIFs graphically represent keywords the engine has picked out.

On the left, we see that Quantum Sim has picked out several terms related to FTX, such as “platform,” “defendants,” and “deceptive.” When we click on “deceptive,” we see the keywords that the engine believes are closely related to that adjective. These could reveal interesting connections that a human might not have picked up on, and thus prompt further queries. On the right, we see that relationships between other keywords can be accessed as well.
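How the engine decides that two keywords are related is not specified here. One simple stand-in, shown purely as a sketch, is to count how often two keywords appear in the same sentence. The function names are hypothetical, and the sentences are invented for the example rather than quoted from the filing.

```python
import re
from collections import Counter
from itertools import combinations

def cooccurrence_counts(sentences, keywords):
    """Count how many sentences contain each pair of keywords."""
    keyword_set = {k.lower() for k in keywords}
    counts = Counter()
    for sentence in sentences:
        tokens = set(re.findall(r"[a-z]+", sentence.lower()))
        present = sorted(tokens & keyword_set)  # sorted so pairs are consistent
        for pair in combinations(present, 2):
            counts[pair] += 1
    return counts

def related_keywords(counts, keyword):
    """List keywords that co-occur with `keyword`, most frequent first."""
    keyword = keyword.lower()
    related = Counter()
    for (first, second), n in counts.items():
        if first == keyword:
            related[second] += n
        elif second == keyword:
            related[first] += n
    return related.most_common()

# Invented sentences for illustration only.
sentences = [
    "The platform was promoted through deceptive advertising.",
    "Defendants promoted the platform to retail customers.",
    "The deceptive statements concerned the platform.",
]
keywords = ["platform", "defendants", "deceptive"]

counts = cooccurrence_counts(sentences, keywords)
print(related_keywords(counts, "deceptive"))
```

Clicking a keyword in an interface like the one described would then correspond to calling `related_keywords` for that term. An embedding-based engine would likely capture looser associations than raw co-occurrence does.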

This whole process demonstrates how NLP helps us rapidly analyze a legal document and extract cogent information from it. Quantum Sim saves considerable manual effort by automatically collecting supporting material, combining it with the target document, and presenting a consolidated view that augments a user’s ability to extract meaningful insights. In a business context, a platform like this can substantially increase decision velocity when faced with thousands of pages of text.

In short, Quantum Sim enables a user to filter the noise and focus on the signal.

About Forum Systems

Forum Systems is a leader in intelligent API gateway technology, deep data analytics, and cloud technologies. Forum technology, used by some of the largest global companies for building intelligent business workflows, is certified and secure. Along with industry-leading performance, interoperability, and security, Forum Systems takes pride in its customer-driven innovation and simplified user experience.