Dataset 03

Atticus Clause Retrieval Dataset (ACORD)

126,000+ query-clause pairs | 114 queries | Fully expert-annotated | CC BY 4.0

An expert-annotated clause retrieval dataset

The Atticus Clause Retrieval Dataset (ACORD) is a corpus of commercial contract clauses containing over 126,000 query-clause pairs across 114 queries. Expert attorneys rated each pair on a scale of 1 to 5 stars.
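As a rough illustration of the structure described above, the sketch below models a rated query-clause pair and groups clauses under their query, highest rating first. The `RatedPair` class, its field names, the `clauses_by_query` helper, and the sample rows are all hypothetical and chosen for illustration; they are not the dataset's actual schema or contents, and it is only an assumption here that 5 stars marks the best match.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class RatedPair:
    """One query-clause pair with a star rating (hypothetical schema)."""
    query: str
    clause: str
    stars: int  # assumed scale: 1 (lowest) to 5 (highest)

    def __post_init__(self):
        # Reject ratings outside the 1-5 star range.
        if not 1 <= self.stars <= 5:
            raise ValueError(f"stars must be in 1..5, got {self.stars}")


def clauses_by_query(pairs, min_stars=1):
    """Group pairs by query, best-rated first, keeping ratings >= min_stars."""
    grouped = defaultdict(list)
    for pair in pairs:
        if pair.stars >= min_stars:
            grouped[pair.query].append(pair)
    return {
        query: sorted(items, key=lambda p: p.stars, reverse=True)
        for query, items in grouped.items()
    }


# Made-up rows for illustration only; these are not taken from ACORD.
pairs = [
    RatedPair("cap on liability", "example clause A", 5),
    RatedPair("cap on liability", "example clause B", 2),
    RatedPair("cap on liability", "example clause C", 1),
    RatedPair("indemnification for IP claims", "example clause D", 4),
]

ranked = clauses_by_query(pairs, min_stars=2)
```

Passing `min_stars=2` keeps only pairs rated 2 stars or higher, which is one plausible way to work with a subset of the ratings.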

ACORD is the first expert-annotated dataset built to help AI find the right legal contract clauses. It includes real lawyer-written queries and thousands of rated clauses, making it easier to draft and review complex contract provisions such as Limitation of Liability and Indemnification.

Dataset

Supplementary: 2–5 Star Clause Pairs, Queries (short, medium, long formats)

License: CC BY 4.0

Publication

Accepted at ACL 2025, the 63rd Annual Meeting of the Association for Computational Linguistics.

"ACORD: An Expert-Annotated Retrieval Dataset for Legal Contract Drafting"

Contributors

Attorney Annotators

Wei Chen

Yuji Sun

Tao Zhang

Benjamin Hendrick

Stacey Phillip

Alexander Kwonji Rosenberg

Michelle Sonu

Chris Herbst

Hannah Kang

Andy Song

Tim Evans

Ji-Hyun Park

Dataset Leads & Students

Yuyang Sun

Sarah Harrell

Adam Shankman

Lyla Sax

Jerry Jiang

Tarunya Dharmarajan

Liam Percer

Penelope Chung

Kevin Chen

AI Researchers & Technical

Steven Wang

Andreas Plesner

Maksim Zubkov

Kexin Fan

Max Emanuel

Evan Wang

Anya Chen
