Contract Understanding Atticus Dataset (CUAD)

A dataset of legal contracts with rich expert annotations

 

Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of 13,000+ labels in 510 commercial legal contracts. The contracts were manually labeled under the supervision of experienced lawyers to identify 41 types of legal clauses that are considered important in contract review in connection with corporate transactions, such as mergers and acquisitions.

 

CUAD is curated and maintained by The Atticus Project, Inc. to support NLP research and development in legal contract review.

Read the full CUAD v1 announcement here!

• 13,000+ labels

• 510 contracts

• 41 categories of clauses

 

Dataset

Version 1

CUAD v1

README/Datasheet

Download here.
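
As a rough illustration of how labels, contracts, and clause categories relate, the sketch below tallies labeled spans per category. It assumes a SQuAD-style JSON layout (contracts under "data", one question per clause category, labeled spans under "answers") and a question id of the form "<contract>__<category>". These are assumptions for illustration, as are the toy record and the helper name count_labels; check the README/Datasheet for the actual schema before relying on it.

```python
from collections import Counter

# Hypothetical toy record in an assumed SQuAD-style layout.
# It is NOT taken from CUAD; it only mimics the assumed structure.
sample = {
    "data": [
        {
            "title": "EXAMPLE_AGREEMENT",
            "paragraphs": [
                {
                    "context": "This Agreement is governed by the laws of the State of New York.",
                    "qas": [
                        {
                            "id": "EXAMPLE_AGREEMENT__Governing Law",
                            "answers": [
                                {"text": "laws of the State of New York", "answer_start": 34}
                            ],
                        },
                        # A category with no labeled span in this contract.
                        {"id": "EXAMPLE_AGREEMENT__Non-Compete", "answers": []},
                    ],
                }
            ],
        }
    ]
}


def count_labels(dataset):
    """Return a Counter mapping clause category to its number of labeled spans."""
    counts = Counter()
    for contract in dataset["data"]:
        for paragraph in contract["paragraphs"]:
            for qa in paragraph["qas"]:
                # Assumed id format: "<contract title>__<category>".
                category = qa["id"].split("__", 1)[1]
                counts[category] += len(qa["answers"])
    return counts


counts = count_labels(sample)
for category, n in counts.items():
    print(f"{category}: {n}")
```

To run this against the downloaded data, replace sample with the result of json.load on the dataset file, provided the file follows the layout assumed here.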

License

CC BY 4.0

 

Publication

Check out the performance results publication on arXiv here.

 

Check out the code for replicating the results and the trained model here.

Contributors

 

Attorney Advisors

Wei Chen

John Brockland

Kevin Chen

Jacky Fink

Spencer P. Goodson

Justin Haan

Alex Haskell

Kari Krusmark

Jenny Lin

Jonas Marson

Benjamin Petersen

Alexander Kwonji Rosenberg

William R. Sawyers

Brittany Schmeltz

Max Scott

Zhu Zhu

Law Student Leaders

John Batoha

Daisy Beckner

Lovina Consunji

Gina Diaz

Chris Gronseth

Calvin Hannagan

Joseph Kroon

Sheetal Sharma Saran

 

Law Student Contributors

Scott Aronin

Bryan Burgoon

Jigar Desai

Imani Haynes

Philip Katz

Jeongsoo Kim

Margaret Lynch

Allison Melville

Felix Mendez-Burgos

Nicole Mirkazemi

David Myers

Emily Rissberger

Behrang Seraj

Sarahginy Valcin

Technical Advisors & Contributors

Dan Hendrycks

Collin Burns

Spencer Ball

Anya Chen

The use of CUAD, Atticus Labels and other information provided by The Atticus Project is subject to our privacy policy and disclaimer.