What is The Atticus Project?
The Atticus Project is a non-profit organization whose mission is to harness the power of AI to accelerate accurate and efficient contract review. The Atticus Project started as a grass-roots movement by experienced lawyers in public companies and leading law firms aiming to achieve high-quality, low-cost, accurate and timely contract review using AI. It was officially incorporated as a California non-profit public benefit corporation in January 2020.

What problem is Atticus solving?
In many corporate transactions, the time and cost of conducting due diligence are significant because the contract review process is conducted by lawyers manually searching through thousands of documents to find what are, in essence, a few needles in a haystack. Employing the power of AI is the solution to this problem.

What are the benefits of using AI in contract reviews?
There are a number of significant benefits to using AI for contract reviews, including (1) faster, cheaper, and more accurate reviews, (2) faster data-driven business decisions, and (3) freeing up lawyers to do more substantive work.

Is Atticus trying to replace human attorneys?
No. Leveraging AI in contract review will free up attorneys to do more substantive work.

Is Atticus affiliated with any for-profit companies or law firms?
No. The Atticus Project is an independent non-profit organization not affiliated with other companies or law firms.

What is the Atticus Dataset?
The Atticus Dataset is the first open-sourced training dataset of legal contracts specifically built for AI training purposes. The dataset and related resources will include:

  • Atticus labels for Commercial Contracts, Shareholder/Employee Agreements (e.g., option/employment/voting/non-compete), Loan Agreements and Leases

  • 800+ labelled EDGAR contracts with 15,000+ unique labels

  • Atticus College training videos and playbook on how and what to label

  • 80+ samples for each hard-to-find clause (MFN, exclusivity, minimum commitment, etc.)

The Atticus Labels reflect the collective know-how of experienced lawyers. Each contract in the Atticus Dataset is human-labelled with multiple levels of quality control and approved by highly sophisticated and experienced lawyers.

What will be the end result of the Atticus Dataset?
The Atticus Dataset will be an ongoing effort by the legal community to provide AI and data researchers around the world with an easily accessible legal contract database. By the end of 2021, The Atticus Project will:

  • Achieve an F-score of 0.80+ for the 10 most critical clauses (MFN, Change in Control, Exclusivity, IP Ownership Assignment, etc.) by partnering with the open-source AI community

  • Release Atticus Dataset v. 2.0 as open source, with 2,000+ labelled contracts

  • Expand the Atticus Dataset to cover all industries (pharmaceutical, energy, transportation, communication, health care, etc.)

  • Expand the Atticus Dataset to cover use cases beyond corporate transaction due diligence (contractual and regulatory compliance, etc.)


What is a labelled dataset?
A labelled dataset, in this context, means a large volume of legal contracts with relevant provisions tagged or highlighted in each contract. We call these tagged or highlighted provisions "labels".
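As a rough illustration only, one labelled contract could be represented as the contract text plus a list of tagged spans. The class and field names below (`Label`, `LabelledContract`, `category`, `start`, `end`) are hypothetical and chosen for this sketch; they are not the actual format of the Atticus Dataset.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Label:
    """A tagged provision: a category plus character offsets into the text."""
    category: str  # e.g. "Exclusivity" (hypothetical category name)
    start: int     # index of the first character of the provision
    end: int       # index one past the last character of the provision


@dataclass
class LabelledContract:
    """A contract together with the provisions a reviewer has tagged in it."""
    title: str
    text: str
    labels: List[Label] = field(default_factory=list)

    def spans(self, category: str) -> List[str]:
        """Return the text of every provision tagged with the given category."""
        return [
            self.text[label.start:label.end]
            for label in self.labels
            if label.category == category
        ]


# Illustrative sample text, not taken from any real contract.
clause = "Supplier shall sell the Products exclusively to Buyer."
text = clause + " This Agreement is governed by California law."
start = text.index(clause)

contract = LabelledContract(
    title="Sample Supply Agreement",
    text=text,
    labels=[Label("Exclusivity", start, start + len(clause))],
)
exclusivity_spans = contract.spans("Exclusivity")
```

Storing offsets rather than copies of the text keeps each label tied to its exact location in the contract, which is what "tagged or highlighted" suggests.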

Will Atticus charge for the Atticus Dataset?
No. The Atticus Dataset will be open-source and free to the public.

What is open source?
Open source, in this context, means data that is free to use, reuse or distribute for public or commercial purposes, subject to the requirement of Attribution or ShareAlike. The Atticus Contract Labels, for example, are licensed under the Creative Commons Attribution-ShareAlike 4.0 International License.

What is F-score?
The F-score is a common metric used to measure a model’s accuracy. It gives a more realistic picture of a model’s performance than precision or recall alone because it combines both; in its most common form, the F1 score, it is their harmonic mean. Precision, or the positive predictive value, refers to the fraction of retrieved instances that are relevant. Recall, also known as sensitivity, refers to the fraction of all relevant instances that were retrieved.
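The definitions above can be sketched in a few lines of Python. This is a minimal example that assumes the balanced F1 variant (the text does not say which F-score variant is meant) and uses made-up clause IDs; the function name `precision_recall_f1` is ours.

```python
from typing import Set, Tuple


def precision_recall_f1(retrieved: Set[int], relevant: Set[int]) -> Tuple[float, float, float]:
    """Compute precision, recall and F1 for a set of retrieved items.

    retrieved: items the model flagged (e.g. clause IDs it labelled)
    relevant:  items that should have been flagged (the human labels)
    """
    true_positives = len(retrieved & relevant)

    # Precision: share of retrieved items that are actually relevant.
    precision = true_positives / len(retrieved) if retrieved else 0.0
    # Recall: share of relevant items that were retrieved.
    recall = true_positives / len(relevant) if relevant else 0.0

    # F1: harmonic mean of precision and recall (0 when both are 0).
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical example: the model flags clauses 1-4, the lawyers marked 2-6.
precision, recall, f1 = precision_recall_f1({1, 2, 3, 4}, {2, 3, 4, 5, 6})
print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# prints "precision=0.75 recall=0.60 f1=0.67"
```

In this example the model finds three of the five relevant clauses and raises one false alarm, so neither precision nor recall alone tells the whole story; the F1 score of about 0.67 summarizes both.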

Why open source?
Historically, large-scale open-sourced training datasets, such as ImageNet, have enabled AI development that surpassed human review accuracy. Open source is how we can leverage the full power of the AI community to solve legal problems.

Who can benefit from the Atticus Dataset?
Anyone who wants to increase the accuracy and efficiency of contract review can benefit from the Atticus Dataset. The AI community can use the Atticus Dataset to develop and train machine learning models to solve pressing legal problems. Companies and law firms can leverage the Atticus Dataset to conduct more accurate and efficient contract review that satisfies their unique needs.

Where did the name The Atticus Project come from?
The name was inspired by Atticus Finch, a lawyer from Harper Lee’s To Kill a Mockingbird. Atticus’s courage, integrity and perseverance inspired many of us to go into the practice of law. He continues to inspire us to take on difficult challenges.

Is The Atticus Project trying to build or sell an AI product?
No. The Atticus Project aims to build an open-source training dataset for the public. Namely, we are building a playbook for the AI to learn from.

Why is Atticus uniquely positioned to contribute to the accuracy and efficiency of contract review using AI?
The key to success in AI development in contract review is a high-quality, large-scale and consistently labelled contract dataset. No single law firm or company has the expertise, the bandwidth or the resources to do this alone. A task of this magnitude requires the effort of the entire legal industry working together. The Atticus Project is an independent non-profit organization that unites the legal community to take on this task.

How can you help?

We are looking for your support of our mission, including:

  • Informing us of contract review needs in your organizations and industries

  • Commenting on the existing Atticus Labels

  • Spreading the word about the Atticus Dataset among the AI community