Artificial intelligence at the service of digital archives: reconstituting archival aggregations and enriching metadata schemas

Authors

DOI:

https://doi.org/10.6092/issn.2532-8816/21243

Keywords:

artificial intelligence, archives, case-files, document aggregations, metadata, archival bond, classification, indexing

Abstract

This paper presents the results of a study conducted as part of the international InterPARES Trust AI project, entitled "The role of AI in identifying or reconstituting archival aggregations of digital records and enriching metadata schemas". The overall objective of the study is to investigate whether artificial intelligence can support the creation (or re-creation) of archival aggregations, and so address a problem that arises in many situations: the presence of unaggregated, unordered or de-contextualised documents in both the current and semi-current phases of the archive. In public administrations and companies alike, documents are often neither classified nor aggregated, or aggregations are formed but incorrectly. Furthermore, metadata, which are necessary to guarantee authenticity, reliability and searchability, are often not correctly identified and associated with documents. As a result, the organisation’s archive is not properly formed, which is a severe problem because it leads to an uncontrolled accumulation of documents that are unordered, misplaced and therefore difficult to find. Despite the progress of information technology in supporting document management activities, current software products provide only very limited support for this need. Artificial intelligence techniques, however, seem to promise great progress in this field. The study addresses these issues by seeking answers to research questions such as the following: can artificial intelligence tools help to create document aggregations when these have never been formed, or to re-create them when they were formed but have since been lost? Can they provide valuable help in identifying metadata and associating them with the related documents?

Published

2025-08-07

How to Cite

Allegrezza, S. (2025). Artificial intelligence at the service of digital archives: reconstituting archival aggregations and enriching metadata schemas. Umanistica Digitale, 9(20), 643–660. https://doi.org/10.6092/issn.2532-8816/21243