A dataset of legal contracts with rich expert annotations
Contract Understanding Atticus Dataset (CUAD) v1 is a corpus of 13,000+ labels across 510 commercial legal contracts. The contracts were manually labeled under the supervision of experienced lawyers to identify 41 types of legal clauses that are considered important in contract review in connection with corporate transactions, such as mergers and acquisitions.
CUAD is curated and maintained by The Atticus Project, Inc. to support NLP research and development in legal contract review.
Read the full CUAD v1 announcement here!
• 13,000+ labels
• 510 contracts
• 41 categories of clauses
Dataset
Publication
Our paper on CUAD was accepted to NeurIPS 2021, the 35th Conference on Neural Information Processing Systems (Datasets and Benchmarks Track)!
Check out the code for replicating the results and the trained model here.
Contributors
Attorney Advisors
Wei Chen
John Brockland
Kevin Chen
Jacky Fink
Spencer P. Goodson
Justin Haan
Alex Haskell
Kari Krusmark
Jenny Lin
Jonas Marson
Benjamin Petersen
Alexander Kwonji Rosenberg
William R. Sawyers
Brittany Schmeltz
Max Scott
Zhu Zhu
Law Student Leaders
John Batoha
Daisy Beckner
Lovina Consunji
Gina Diaz
Chris Gronseth
Calvin Hannagan
Joseph Kroon
Sheetal Sharma Saran
Law Student Contributors
Scott Aronin
Bryan Burgoon
Jigar Desai
Imani Haynes
Philip Katz
Jeongsoo Kim
Margaret Lynch
Allison Melville
Felix Mendez-Burgos
Nicole Mirkazemi
David Myers
Emily Rissberger
Behrang Seraj
Sarahginy Valcin
Technical Advisors & Contributors
Dan Hendrycks
Collin Burns
Spencer Ball
Anya Chen
The use of CUAD, Atticus Labels and other information provided by The Atticus Project is subject to our privacy policy and disclaimer.