A screen grab depicting the user stories where C2PA standards could be useful. Text reads:

User Stories Organized by Content Lifecycle Model Steps

1. Content and Records Creation
1.1 How can C2PA document tools, including AI tools, used for descriptive metadata generation?
1.2 How can C2PA identify and potentially limit access to datasets for AI model training based on if the dataset content was human-created, AI created or a mix of both?
1.3 How can C2PA identify and potentially limit access to content for AI model training based on Copyright status?
1.4 How can C2PA be used to document the creation process, including digital tools, for new content to establish the baseline provenance data?
1.5 How can C2PA be leveraged to establish the provenance of open access publications governed by CC BY 4.0 license variations?
1.6 How can C2PA be incorporated into audio digitization preservation processes to document provenance and continuity?

2. Maintenance and Use
2.1 How can C2PA be used to document stewardship of specific instances of content?
2.2 What role can C2PA play in documenting image accuracy of historical maps?

3. Accessioning and Processing
3.1 How can C2PA be leveraged in the acquisition process if incoming content already has C2PA?
3.2 What role can C2PA play for documenting images of artifacts for determining origination?

4. Preservation
4.1 How can C2PA enable provenance chains for any non-C2PA files?
4.2 How can C2PA relate or document derivative files with their authentic original or preservation/access copies of the same content item?

5. Access
5.1 How can C2PA be applied to “certify” the accuracy and completeness of copies of records generated for online access and in response to research requests?
5.2 How can C2PA be used to ensure the authenticity of redacted records?
A screen grab of the user stories developed by the C2PA G+LAM working group which explore how C2PA could be used in digital preservation.

New Community of Practice for Exploring Content Provenance and Authenticity in the Age of AI

Today’s post is from Abbey Potter and Isabel Brador of the Digital Strategy Directorate and Kate Murray of the Digital Collections Management & Services Division here at the Library of Congress.


Since January 2025, a new Library of Congress working group has been exploring ways to bring responsible AI together with digital preservation through an emerging standard known as C2PA (Coalition for Content Provenance and Authenticity).

The Library’s internal AI Working Group formed the C2PA for G+LAM (Government plus Libraries, Archives and Museums) group, which brings together colleagues from cultural heritage organizations and government to explore how C2PA could be implemented in these sectors. In brief, C2PA is a cooperative effort led by industry partners to create an open technical standard providing publishers, creators and consumers the ability to trace the origin of different types of media, including data created or impacted by artificial intelligence (AI). The C2PA specification has been fast-tracked as an ISO standard. Interest in and engagement with the C2PA concepts for authenticity have grown exponentially in recent months in diverse domains, including photojournalism, online videos on streaming platforms, social media platforms including LinkedIn and Meta products, cybersecurity efforts and international broadcasters. C2PA is everywhere you look.

The C2PA for G+LAM Community Group serves as a gathering place to debate if and how C2PA could be useful in digital preservation workflows, where documentation of digital content creation, history, relationships and impacting events is paramount. The growing adoption of C2PA in industry – in cameras, scanners, internet browsers, streaming services and more – brings it well into the sphere of organizations that collect, preserve and provide access to content.

C2PA provenance data can be useful as a means to store information about the history of a file, but some collections already contain this data on acquisition. Digital preservation communities need to explore how this data can be created, maintained, and utilized in downstream workflows, as well as incorporated into essential tools (see, for example, the request for implementation in the popular open source tool BWF MetaEdit used for embedding, validating and exporting metadata in Broadcast WAVE Format [BWF] files).

Leonard Rosenthal, chair of the C2PA Technical Working Group, recognizes the symbiotic relationship between the goals of C2PA and those of the digital preservation community. He told the Library’s C2PA team that “C2PA development is actively evolving with a new version of the specification published in May 2025. The time is now for community feedback and engagement to help steer the work in ways useful for the digital preservation community at large. This is part of a larger effort to explore and document Responsible AI for cultural heritage, memory and government institutions through shared, consistent, transparent and verifiable protocols.”

To explore these ideas at the macro level, the C2PA for G+LAM Community Group has hosted quarterly meetings to build awareness and share experiences about the need for content and provenance documentation, and has kicked off several more focused action teams. Deliverables from the group so far include a set of diverse user stories and scenarios, as well as work on developing action assertion models that would be useful for digital preservation. Also planned is a “C2PA for Digital Preservation” primer to describe, at a high level, how C2PA can support digital preservation and how it intersects with existing trusted resources.

If you have any questions about the Library’s work in the C2PA community or would like to share how your institution has been engaging with the C2PA or content authenticity and provenance more broadly, send an email to [email protected].
