Within the Museo Virtuale della Musica BellinInRete project, a corpus of letters written by the renowned composer Vincenzo Bellini (1801-1835) from Catania will be encoded and made publicly available. This contribution illustrates the part of the project concerned with implementing the prototype, currently under development, for the metadata and text encoding, indexing and visualization of Bellini’s correspondence. The encoding scheme has been defined according to the latest guidelines of the Text Encoding Initiative (TEI) and has been instantiated on a sample of letters. In parallel, a first environment has been implemented by customizing two open source tools: Edition Visualization Technology and the Omega Scholarly platform. The main objective of the digital edition is to engage the general public with the cultural heritage held by the Belliniano Civic Museum of Catania. This wide access to Bellini’s correspondence has been designed to preserve the scholarly transcriptions of the letters edited by Seminara in her recent critical edition. The digital edition of the corpus handles the correspondence metadata by means of the TEI correspDesc tagset. Finally, Bellini's letters will be accessible via the Web platform and integrated into a forthcoming interactive multimedia tour hosted at the museum.

Within the Museo Virtuale della Musica BellinInRete project, a corpus of manuscript letters by the renowned Catanese composer Vincenzo Bellini (1801-1835) will be encoded and made accessible. This contribution aims to illustrate in detail the part of the project concerning the development of the prototype for the metadata description, encoding, management, indexing and visualization of Bellini's letters. To date, the encoding scheme has been defined following the latest guidelines of the Text Encoding Initiative and instantiated on a representative sample of letters. In parallel, a first access environment has been built by customizing two open source software tools: Edition Visualization Technology and Omega Scholarly platform. The digital edition of the epistolary corpus responds, first of all, to the need to make this part of the cultural heritage held at the Museo Civico Belliniano of Catania accessible to the general public, without giving up the scholarly rigor of the transcription of the letters found in the recent critical edition edited by Seminara. The digital edition of the epistolary corpus pays particular attention to the management of the correspondence metadata, encoded by means of the TEI correspDesc tagset. The edition of Bellini's letters will be both available via the Web and integrated into a multimedia and interactive museum tour being set up at the Museo Belliniano.

1. Introduction

This paper illustrates the BellinInRete digital correspondence project, which concerns the implementation of a digital prototype based on a corpus of letters. It involves the text encoding, metadata recording, data indexing and visual rendering of a collection of letters written by the renowned composer Vincenzo Bellini (1801-1835) from Catania.

BellinInRete digital correspondence is part of the Museo Virtuale della Musica BellinInRete project (hereafter BellinInRete), which aims to bring lasting change to the use and enhancement of the Belliniano Civic Museum of Catania through a new exhibition design. This challenging objective will be pursued by adopting digital methods and, at the same time, by developing a new way of communicating tangible and intangible cultural heritage. The new museum layout narrates the main periods of Vincenzo Bellini's life by means of audiovisual technologies within distinctive scenographic settings that evoke the different theater environments of the early nineteenth century (orchestra, scene, foyer, stage), in which the visitor is immersed with the help of the "life in 4 acts" story of the Maestro. Arias and other excerpts from the Maestro's works will be played in the museum rooms. The Museum will present both Bellini's material documents and the intangible cultural heritage. Indeed, the digital edition of the letters aims to render both the tangible legacy and what cannot be touched, namely the theoretical heritage, the knowledge, and the aesthetic heritage produced over time. Inside the museum rooms, a screen will be set up so that the public can engage with the digital edition of the letters.

One of the most ambitious aims of the BellinInRete project is to promote the varied ecosystem of Italian opera, making it accessible and understandable even to a non-specialist audience. Bellini's legacy consists of heterogeneous resources from the archival, bibliographic and museum fields, with particular regard to the music domain. It includes autograph scores, booklets, handwritten letters, museum objects, sound and audiovisual reproductions, photographs and an extensive library.

Above all, this project involves analyzing and organizing the Museum's heritage according to the most recent and accredited data and metadata standards and specifications for museums, libraries and archives. Once digitized, this heritage will form a large knowledge base accessible through a multi-channel digital museum that illustrates the content of the documents.

The whole BellinInRete project rests on a multidisciplinary foundation, drawing on complementary competences from different institutions: the University of Catania, the Institute for Cognitive Sciences and Technologies (ISTC) of the CNR of Catania and the Institute for Computational Linguistics "A. Zampolli" (ILC) of the CNR of Pisa. Experts in musicology, archival science, philology, art history, history of theater and entertainment, communication processes and technologies, knowledge management technologies and methodologies, and computer science and information technology have been involved.

A prototype covering a selection of the Maestro's correspondence kept at the Museum will be ready to coincide with the new scenographic installation, scheduled for summer 2020.

The main objective of the digital edition is to engage the general public with the cultural heritage held by the Belliniano Civic Museum of Catania. This wide access to Bellini's correspondence has been designed to preserve the scholarly transcriptions of the letters edited by Graziella Seminara in her most recent critical edition. The encoding scheme has been defined by the CNR partners according to the latest guidelines of the Text Encoding Initiative and has been instantiated on a sample of letters to evaluate the feasibility of the entire process. The main purpose of the scheme is to encode the correspondence metadata by means of the correspDesc tagset. Alongside the transcription and encoding of the text, the university and museum partners have analyzed standard techniques for archival cataloging and bibliographic description of the correspondence, taking into account adherence to well-known schemes (e.g., the UNIMARC scheme for recording bibliographic items, since the digital catalog has been recorded by means of the SBN platform).

In this scenario, the good practices of the Semantic Web, by which information in the documents is linked to the most authoritative public datasets (such as VIAF, GeoNames and RISM), were considered appropriate to meet the hermeneutical requirements of BellinInRete digital correspondence.

Finally, the digital edition combines the scanned facsimile image of each of Bellini’s letters with the corresponding electronic transcription, carried out according to conservative criteria that respect the form of the original text.

The paper is structured as follows. Section 2 gives an overview of the Bellini corpus preserved at the Belliniano Civic Museum of Catania, and section 3 briefly surveys related initiatives, focusing on inspiring works on digital archives of letters and music documents. Section 4 describes the objectives and methodology of the project and analyzes the work conducted so far to produce the digital edition of Bellini’s correspondence. It examines in detail the entire process, from the description of the physical documents to the use of the digital edition through a Web-based scholarly environment. In particular, section 4.1 illustrates the metadata encoding activities along with related issues. The digitization work for the acquisition of the facsimile images is then described in section 4.2. The design choices and the schema definition for the digital representation and encoding of the letters are discussed in section 4.3. The Web scholarly platform prototype, instantiated ad hoc for the analysis and dissemination of the digital edition, is illustrated in section 4.4. The paper ends with some notes on the further developments planned as next steps of BellinInRete.

2. The Bellini corpus preserved at the museum

Currently, 41 autograph letters by Vincenzo Bellini are kept in the archive of the Belliniano Civic Museum of Catania. The initial nucleus of this collection originates from the donation made by maestro Ascanio Bazan, great-grandson of the famous composer, in July 1930, when the museum collections were set up; other letters were acquired later. However, in the absence of an inventory of the archival assets owned by the Belliniano Civic Museum, it is impossible to determine the exact acquisition date of each document. Only the information contained in some historical catalogs makes it possible to establish that a certain number of these letters were already in the Museum at the time the catalogs were written. On this basis it can be stated that in 1935, the year of publication of the second edition of the catalog compiled by Benedetto Condorelli, the collection already contained 25 items. Similarly, the examination of another inventory shows that at least seven more letters were acquired after the publication of the Condorelli catalog; however, the information that can be drawn from this document, written in 1968, is not always reliable, because its record descriptions are neither accurate nor detailed. Finally, a small number of Bellini's letters were acquired in 1998, when the Municipality of Catania purchased an epistolary collection, put up for sale by Christie’s, that had belonged to Giovanni Battista Perucchini (1784-1869), a Venetian nobleman and amateur. This collection is now kept in the Belliniano Civic Museum, and it is entirely cataloged in the "Fondo Perucchini".

Bellini's autograph letters kept in the Belliniano Civic Museum of Catania span the whole life of the musician: from the "Supplica" sent by the young Vincenzo to Stefano Notarbartolo in May 1819 to the draft of the letter addressed to Giovanni Ricordi, dated September 3, 1835.

Nonetheless, their distribution is extremely irregular, as is the origin of the documentary material: a fair number of letters date back to the years 1830-32 and come largely from private collections; a second, much larger group of letters dates from Bellini's stay in Paris in the last three years of his life (from 1833 to 1835) and was passed on by Rossini to the composer's family after his death. This explains the peculiar nature of this last group: they are mostly drafts and sketches of letters that Bellini generally prepared for high-ranking recipients, for whom he felt the need for more carefully controlled and stylistically elevated writing.

3. Related work

The digital edition of Vincenzo Bellini's correspondence kept at the Belliniano Civic Museum of Catania, as a part of BellinInRete, is an initiative of great cultural importance and with a clear innovative contribution, since, currently, there are no similar projects in Italy related to opera composers of the nineteenth century.

The approach followed is similar to that adopted successfully in other initiatives, where epistolary corpora have been encoded following the TEI guidelines. In particular, a TEI Special Interest Group (Correspondence SIG) worked from 2008 to 2014 on the development of the correspDesc tagset, towards the definition of a standard encoding schema for correspondence data. The correspSearch web service makes use of this resource and makes it possible to index letters by sender, recipient, date, place of sending and place of receipt through the Correspondence Metadata Interchange Format (CMIF).

Among the projects closest to the BellinInRete objectives we mention, first of all, the German project WeGa, which involves the digital edition of the works and writings of the musician Carl Maria von Weber. In the Digital Archive of Letters in Flanders (DALF) project, a corpus of 1500 letters taken from the Flemish literary magazine "Van Nu en Straks" was digitized. Many of the markup elements produced within the DALF project were subsequently reused for the digital edition of the Van Gogh letters project. "In Mozart's Words" is a project that aims to collect, preserve and enhance the correspondence of the musician Wolfgang Amadeus Mozart. Thanks to an accurate digital encoding of the content of the letters, this project makes it possible to discover and explore the experiences of the Austrian composer’s life via the Web.

The recent online archive of Casa Ricordi does not seem to follow the scholarly best practices established by the digital humanities (DH) community. However, this initiative brings remarkable benefits, considering the vast and valuable heritage now freely accessible via the Web, as well as the transcription of a broad collection of letters, including some of those written by Bellini.

The ILC participated in the Clavius on the Web project, whose main purpose was to digitize, process and share online a corpus of letters sent to the Jesuit mathematician Christophorus Clavius. Similarly to BellinInRete, the project was carried out with a twin-track approach: digitizing, encoding and annotating the letters and, at the same time, developing an interface to make the letters and their content publicly available on the Web. The letters were transcribed and encoded using TEI-XML P5; the texts were then tokenized and annotated at the linguistic, terminological and semantic levels. All the information was made explicit, dereferenced with canonical URNs and shared as Linked Open Data (LOD). From a visualization perspective, three HTML5 interfaces were developed to give the user access to all the aforementioned data: the first to browse the manuscript, the second to visualize all the annotations and the third for storytelling purposes. Although the Proust's drafts prototype project does not focus on the digital edition of letters, it deserves a mention for its interesting solutions for encoding genetic editions. Finally, with regard to cataloging and linking contents in the Semantic Web environment, the RISM project is worth noting: it is considered an authority catalog for the musical domain and has developed a LOD archive of a broad collection of musical sources, accessible through a dedicated SPARQL endpoint.

4. Objectives of the project and methodology

Although the main objective of the project is to produce, preserve and make available the digital edition of Bellini's correspondence, some elements of innovation distinguish this initiative from similar projects. Besides being published online, the digital edition will also be available within the Belliniano Museum, thus allowing visitors to "touch by hand" the original documents for a more immersive experience.

The experimental use of the correspDesc tagset of the TEI header, the use of the LOD paradigm in the annotation of the letters (see 4.3) and the implementation of a prototype environment (see 4.4) constitute further elements of innovation.

Finally, as expected from a project of this nature, the texts that will be made available and shared online will make Vincenzo Bellini known to a wider audience of non-experts.

In agreement with the domain experts, the initial phase focused on the digital edition of Bellini's letters and on cataloging and metadata encoding. Producing this edition involves working on two technological fronts in parallel: on the one hand, the TEI encoding of textual phenomena and metadata; on the other hand, the description of the relevant entities by means of formal vocabularies defined with Semantic Web technologies.

4.1. Metadata encoding

The Management of the Museum has made available an inventory of the manuscripts (compiled in the 2000s), some lists of documentary materials in the library, and lists of some objects kept in the Museum. These lists do not adhere to specific conventions, and the materials are therefore not present in national or international catalogs (SBN, SAN, Internet Culturale, etc.). In drawing up a systematic catalog, functional to making the Museum's heritage available on the Web, the cataloging rules for musical and documentary material underlying the SBN platform were taken into consideration: uniform title for the scores, catalog manuals, description of the manuscripts. Where none was present, a unique identification code with its specific signature has been created for each document. The other elements of Bellini's heritage have been described with metadata according to the catalog sheets of the Istituto Centrale per il Catalogo e la Documentazione. In this way it will be possible to exploit the sharing protocols provided by the Online Public Access Catalog (OPAC, Z39.50), which are useful for encoding the access information appropriately and for indexing the documents to be cataloged.

Metadata are currently being published in national catalogs, and the creation of an ontology covering the contents of Bellini's heritage is planned. It will be built by extracting the concepts of particular interest found in the letters (names of persons, places, etc.) and in other documents, and by referring to standard metadata schemes such as RDA, FRBRoo and FOAF. Moreover, with a view to the reuse and interoperability of Semantic Web resources, an accurate analysis of existing ontologies, vocabularies and thesauri has been carried out, with particular attention to the music domain; the above-mentioned RISM project has therefore been taken as the reference for the musical domain, and the VIAF and GeoNames resources for the other concepts.

All the letters and documents that have been reorganized and cataloged are currently divided into different sections, each of which has been assigned an identification code. In line with the criterion adopted in the most recent critical edition of Bellini's correspondence, the letters have been organized in chronological order according to the stated or presumed date of each letter, and the abbreviation LL has been chosen for the letters attributable to Vincenzo Bellini and to his family of origin.

In the current exhibition of the Museum, the autograph letters of the Catanese composer are all displayed in the showcases of the birth house, with the exception of one letter. The abbreviation LL1 has been assigned to the group of Vincenzo Bellini's autograph letters, which comprises 41 units. The 85 letters addressed to the composer from Catania were instead collected under the abbreviation LL2. An additional letter was also found, whose location was unknown at the time of the survey carried out by Seminara. Several letters dating from the period after the death of Vincenzo Bellini have also been found. These are letters, or more frequently drafts of letters, addressed to the father or to other members of the family. Furthermore, two other categories have been devised, LL3 and LL4 (with 39 and 48 units respectively), which have been organized according to the criteria previously described. A small group of 14 letters has been gathered under the abbreviation LL5 and consists of letters sent by people not belonging to the Bellini family and addressed to different recipients.

The signature of an individual document (e.g. LL1.4) identifies the group to which the document belongs, followed, after a dot, by a progressive serial number. In the quite frequent cases where the same unit contains more than one letter, the same signature has been assigned to both letters, with the addition of I and II to show that it is a "multiple" document. When the signatures assigned to the individual units were entered in SBN, they were prefixed with an alphanumeric code, assigned by SBN, that identifies the institution where the document is located. For the Belliniano Civic Museum of Catania this string is CT0031, so, for example, the first letter of Bellini is marked as CT0031_LL1.1, which represents its unique "identification code".
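The signature scheme just described can be sketched in a few lines of Python. This is only an illustration and not the project's own tooling: the function and class names are hypothetical, and the hyphen used to attach the I/II suffix of "multiple" documents is an assumption, since the text does not say how the suffix is written.

```python
import re
from typing import NamedTuple, Optional

# SBN institution code of the Belliniano Civic Museum, as stated in the text.
INSTITUTION_CODE = "CT0031"


class Signature(NamedTuple):
    institution: str      # e.g. "CT0031"
    group: str            # e.g. "LL1"
    number: int           # progressive serial number within the group
    part: Optional[str]   # "I" or "II" for a "multiple" document, else None


# Assumption: a hyphen separates the I/II suffix from the serial number.
_CODE = re.compile(
    r"^(?P<inst>[A-Z]{2}\d{4})_(?P<group>LL[1-5])\.(?P<num>\d+)(?:-(?P<part>II?))?$"
)


def build_code(group: str, number: int, part: Optional[str] = None,
               institution: str = INSTITUTION_CODE) -> str:
    """Compose an identification code such as CT0031_LL1.1."""
    code = f"{institution}_{group}.{number}"
    return f"{code}-{part}" if part else code


def parse_code(code: str) -> Signature:
    """Split an identification code back into its components."""
    match = _CODE.match(code)
    if match is None:
        raise ValueError(f"not a valid identification code: {code!r}")
    return Signature(match["inst"], match["group"], int(match["num"]), match["part"])


print(build_code("LL1", 1))
print(parse_code("CT0031_LL1.16"))
```

Keeping the institution code separate from the signature mirrors the text, where the signature is assigned locally and the prefix is added only on entry into SBN.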

The elements recorded in the cataloging phase describe the main characteristics of each document. They concern not only the sender, recipient, dating, etc., but also the current location of the cataloged resource, its physical characteristics, such as the format, and the language in which the text is written.

4.2. Digitization of the Bellini correspondence

Most of the paper documentation held at the Museum (mostly handwritten) was reproduced in digital format. For the letters written by and addressed to Bellini, about 600 images were taken in three different formats (TIFF master at 600 dpi with a color depth of 24-bit RGB, compressed JPEG at 300 dpi and compressed JPEG at 100 dpi). The related descriptive metadata, encoded in MAG format, has been attached to each set of images. The scanning was carried out in the Museum's rooms to protect the extremely delicate and precious original documents. To this end, a portable high-definition color planetary/robotic scanner was installed, equipped with a special plane to minimize deformation effects and with cold neon lighting. The images will be cropped, color-balanced and processed in order to offer a digital facsimile that corresponds to the original as closely as possible.

4.3. Letters encoding

After the metadata encoding and the digital acquisition of Bellini’s correspondence, the documents were analyzed to identify the textual and metatextual information to be marked up. As indicated above, the digital edition of Bellini’s letters is based on data already published in Seminara's critical edition.

Within the BellinInRete project, the edition is conceived as an image-based edition, and therefore provides the facsimile reproduction of the original document in parallel with the electronic transcription. The textual and structural phenomena have been encoded in TEI XML format in the electronic document.

In particular, this edition takes into account some textual phenomena that were omitted from the printed edition for editorial reasons. Moreover, some information was added to the encoded texts by exploiting linking mechanisms both to external resources and to internal files.

The added value of this digital edition is not limited to the metatextual data associated with the digital transcription of the letters. Indeed, the user is able to view the digital reproduction of the autographs, coming into contact with Bellini's emotional charge (evident in the anxious character of his handwriting), with his morphological and lexical uncertainties (visible in the large number of self-corrections), but also with his adherence to the "rules" of nineteenth-century epistolary grammar.

One example of these rules is the so-called "dar la linea", namely the amount of space left between the introductory formula and the body of the letter, and between the conventional closing formulas and the final signature, which depended on the hierarchical relationship between the writer and the addressee. This aspect of Bellini's epistolary writing appears to be closely related to the multiple stylistic levels (confidential or formal) consciously adopted by the musician according to his correspondents.

The letters often bear later annotations added in other hands by catalogers. These metadata were managed with the TEI handDesc element.

The analysis phase produced an encoding scheme suitable to record and process data related to the selected corpus. The encoding provides scholars with advanced navigation features that can describe historical-critical information, displaying, next to the transcribed text, the critical apparatus drawn up by the editor.

The defined XML schema adopts two important components of the TEI Header: the correspDesc and the msDesc elements. The first, placed in the profileDesc element, is suited to recording the correspondence metadata; the second, defined in the TEI "manuscript description" module, is used for the physical description of the letters. Furthermore, the msDesc element makes it possible to structure all the characteristic aspects of the manuscript, encoding the descriptive, codicological, structural and content data.

The encoding work was carried out on a representative sample of letters. For example, the manuscript letter that Vincenzo Bellini sent from Puteaux to Carlo Pepoli, dated 26 June 1834, has been encoded with the signature LL1.16. This document is the 16th of Bellini's letters kept at the museum, in chronological order.

The current signatures and collocations of the letters, defined in the cataloging phase, have been appropriately encoded by using the msIdentifier element.
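As an illustration of how a signature might be carried by msIdentifier, the following Python sketch builds a minimal element with the standard library. It is a hypothetical example, not the project's actual encoding: the choice of child elements (settlement, repository, idno), the repository wording and the `type="signature"` attribute value are all assumptions.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
ET.register_namespace("", TEI_NS)  # serialize TEI as the default namespace


def tei(tag: str) -> str:
    """Return the namespace-qualified name of a TEI element."""
    return f"{{{TEI_NS}}}{tag}"


def ms_identifier(signature: str, sbn_code: str = "CT0031") -> ET.Element:
    """Build a minimal msIdentifier for one letter (illustrative values)."""
    ms_id = ET.Element(tei("msIdentifier"))
    ET.SubElement(ms_id, tei("settlement")).text = "Catania"
    ET.SubElement(ms_id, tei("repository")).text = "Museo Civico Belliniano"
    # Assumption: the SBN-prefixed code is stored in a single idno element.
    idno = ET.SubElement(ms_id, tei("idno"), {"type": "signature"})
    idno.text = f"{sbn_code}_{signature}"
    return ms_id


element = ms_identifier("LL1.16")
print(ET.tostring(element, encoding="unicode"))
```

In a full TEI header this element would sit inside msDesc, alongside the physical description of the letter.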

Specific attention was given to the extratextual aspects related to the physical layout of the letters. In general, the letters also served as their own envelopes: they are folded on themselves, and postmarks and wax seals are sometimes affixed to them (see, for example, the enveloping of letter LL1.4). The TEI msDesc element provides all the descriptors needed to encode the information regarding preservation conditions, the type of paper, the description of the stamps and seals, the size and folding of the folio, the collation, and other codicological and archival metadata.

Enveloping of letter LL1.4.


The adoption of the correspDesc element has allowed us to enrich the encoding of the logical structure of the document (sender, recipient, date of sending, date of receipt, communication context, closing of letters). The correspDesc tagset for letter LL1.16 is given as an example.

In addition to the traditional encoding of the structural hierarchies of the document and the logical hierarchies of the text, the schema also makes it possible to represent other important information related to the content of the letters (through the front, body and back elements of the text node).

The correspDesc module relative to letter LL1.16.
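A minimal sketch of what such a correspDesc block could look like, and of how it can be read programmatically, is given below. The sender, place, date and recipient come from the description of letter LL1.16 in the text; the exact markup is an assumption based on the general TEI correspDesc/correspAction model and is not a copy of the project's file. The function name `summarize` is hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Illustrative fragment: not the project's actual encoding of LL1.16.
SAMPLE = """
<correspDesc xmlns="http://www.tei-c.org/ns/1.0">
  <correspAction type="sent">
    <persName>Vincenzo Bellini</persName>
    <placeName>Puteaux</placeName>
    <date when="1834-06-26"/>
  </correspAction>
  <correspAction type="received">
    <persName>Carlo Pepoli</persName>
  </correspAction>
</correspDesc>
"""


def summarize(xml_text: str) -> dict:
    """Collect person, place and date for each correspAction, keyed by type."""
    root = ET.fromstring(xml_text.strip())
    summary = {}
    for action in root.findall("tei:correspAction", NS):
        date = action.find("tei:date", NS)
        summary[action.get("type")] = {
            "person": action.findtext("tei:persName", default=None, namespaces=NS),
            "place": action.findtext("tei:placeName", default=None, namespaces=NS),
            "date": date.get("when") if date is not None else None,
        }
    return summary


print(summarize(SAMPLE))
```

Because every letter exposes the same small set of fields, a summary of this kind is also the sort of data that a sender/recipient/date/place index relies on.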


Given the way in which the document is folded and enveloped, the front element contains information about both the addressee and the description of the stamps, as well as about other regions of interest on the selected folio (hotspots). The body element contains, instead, the text encoding of the letter, which is carried out line by line, taking into account alignments, headlines, newlines, spelling phenomena (such as abbreviations, underlining, deletions, etc.), the addressee (encoded with the opener element), as well as the introductory and farewell formulas, and the final salutation with the author's signature (closer). In particular, the introductory and farewell formulas have been encoded with the salute element, since they reflect formal stylistic choices of the maestro, as mentioned above. Finally, the back element contains bibliographic references to other critical editions of the letter, as well as editorial notes and further information extracted from the printed edition. Notes generally serve to contextualize what is expressed in the text; however, no new notes have been created when linking to an item encoded in other lists is sufficient to explain the selected textual fragment, as explained below.

Introductory formula of letter LL1.16 encoded with salute tag.
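The opener/salute/closer structure described above can be illustrated with a small sketch. The fragment uses bracketed placeholders instead of Bellini's actual wording, and its layout (a single div with opener, p and closer) is an assumption about how the body might be organized; only the element names come from the text and from TEI. The function names are hypothetical.

```python
import xml.etree.ElementTree as ET

NS = {"tei": "http://www.tei-c.org/ns/1.0"}

# Placeholder content only: the real transcription is not reproduced here.
SAMPLE = """
<body xmlns="http://www.tei-c.org/ns/1.0">
  <div type="letter">
    <opener>
      <salute>[introductory formula]</salute>
    </opener>
    <p><lb n="1"/>[first line of the letter]<lb n="2"/>[second line of the letter]</p>
    <closer>
      <salute>[farewell formula]</salute>
      <signed>[signature]</signed>
    </closer>
  </div>
</body>
"""


def formulas(xml_text: str) -> dict:
    """Return the salute texts found in the opener and in the closer."""
    root = ET.fromstring(xml_text.strip())
    return {
        part: [s.text for s in root.findall(f".//tei:{part}/tei:salute", NS)]
        for part in ("opener", "closer")
    }


def line_count(xml_text: str) -> int:
    """Count the line beginnings (lb) used for line-by-line transcription."""
    root = ET.fromstring(xml_text.strip())
    return len(root.findall(".//tei:lb", NS))


print(formulas(SAMPLE), line_count(SAMPLE))
```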


Indeed, the museum context and the educational purpose underlying this digital edition have led us to provide a set of lists of named entities, explicitly cited in the text or implicitly referenced. These lists are accessible while viewing the letters, through inline pop-ups and references to authoritative online datasets. A list of people (TEI-ListPerson) has been created so that it is always available and independent of the contexts in which the names appear. This list includes the addressees and the persons mentioned or implicitly referenced in the text. Most of the information for each entry has been taken from the notes of the source critical edition. A short biographical note and the social role (count, baron, etc.) have been defined; the role in the musical context (librettist, tenore, etc.) has been encoded using the syntax of the Music Encoding Initiative (MEI) guidelines. External references to the authoritative RISM and Treccani (Biographical Dictionary) repertoires have been added as well.
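The way a name in the transcription can point to a single entry in the list of people may be sketched as follows. Everything in the fragments is illustrative: the `xml:id` value, the use of a TEI occupation element for the musical role (the text says the project follows MEI syntax for this), and the example.org URI standing in for an authority record are assumptions, and the function names are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Illustrative entry; the URI is a placeholder, not a real authority record.
LIST_PERSON = """
<listPerson xmlns="http://www.tei-c.org/ns/1.0">
  <person xml:id="pepoli">
    <persName>Carlo Pepoli</persName>
    <occupation>librettist</occupation>
    <idno type="URI">https://example.org/authority/pepoli</idno>
  </person>
</listPerson>
"""

# A line of transcription with a pointer to the entry above (placeholder text).
LETTER_LINE = """
<p xmlns="http://www.tei-c.org/ns/1.0">[...] <persName ref="#pepoli">Pepoli</persName> [...]</p>
"""


def index_people(list_xml: str) -> dict:
    """Map each xml:id to the name, role and authority URI of the person."""
    root = ET.fromstring(list_xml.strip())
    people = {}
    for person in root.iter(f"{TEI}person"):
        people[person.get(XML_ID)] = {
            "name": person.findtext(f"{TEI}persName"),
            "role": person.findtext(f"{TEI}occupation"),
            "uri": person.findtext(f"{TEI}idno"),
        }
    return people


def resolve_mentions(letter_xml: str, people: dict) -> list:
    """Pair each name as written in the text with its canonical form."""
    root = ET.fromstring(letter_xml.strip())
    pairs = []
    for mention in root.iter(f"{TEI}persName"):
        ref = mention.get("ref", "")
        if ref.startswith("#") and ref[1:] in people:
            pairs.append((mention.text, people[ref[1:]]["name"]))
    return pairs


people = index_people(LIST_PERSON)
print(resolve_mentions(LETTER_LINE, people))
```

Keeping the entry in one place and referring to it by identifier is what allows the list to stay independent of the contexts in which a name appears.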

Given the popularizing nature of the edition, it has been decided to create the list TEI-ListOrganization, which includes the theaters mentioned in the text and the theaters where the opera premieres took place. This list has been created according to the Treccani repertoire by using the orgName and listOrg elements.

The list of places (TEI-ListPlace) contains the places where people were born and died, the places mentioned in the text and the cities where the opera premieres took place. An external reference to the GeoNames dataset is always added, as well as a reference to the Treccani dictionary.

TEI-ListTerm includes both the words explained in the notes of the reference edition and all the occurrences of terms belonging specifically to the musical domain (duetto, sortita, atto, etc.) and to the context of musical performances (tenore, spartito, libretto, etc.), taken from the book edited by Della Seta.

In this digital edition the internal and external bibliography is managed with two different lists: TEI-ListWork and TEI-ListBibl. The first concerns the musical compositions (opera, musica da camera, etc.) by Bellini and by other authors that are mentioned, directly or indirectly, in the letters and considered significant for the life and works of the composer. Since Bellini refers, in most cases, to his works in progress, each term in the list is conceptually associated with Work or Expression according to the FRBR criteria, and not with Manifestation or Item. In TEI-ListWork we also recorded a brief descriptive note extracted from Seminara's notes and supplemented with information on the first performance (theater, place and date), as well as references to the Treccani and Wikipedia pages. Following the MEI guidelines, the elements that identify the intellectual content in the musical context, such as composer and librettist, were also considered, as in the bibliographic entry for the opera Pirata. TEI-ListBibl, on the other hand, contains the bibliographical references to critical editions of letters and to repertoires.

Bibliographic entry of Pirata Opera with composer and librettist elements of MEI.


As mentioned supra, the edition presents the digital reproduction of the manuscript in parallel with the electronic transcription, synchronized line by line. To support the production of the digital edition and its wide exploitation, all the phenomena observed in the regions of interest of the facsimile image are encoded and made available for further machine processing; for example, the image areas produced by the folding of the folio have been thoroughly encoded. Specifically, we used the standard mechanism of the TEI facsimile module, which comprises the surface, graphic, and zone elements, and defined each region of interest through a zone element and its points attribute. The stamp zones, the address zones, and the zones containing second and third hands have been identified and recorded. In the facsimile images of letter LL1.16, we encoded zones for text lines, for the retro-recto and retro-verso without content, and for the addressee, second hand, third hand, and stamps. The regions of interest encoded in the fronte-recto folio are shown in , as well as the line-based highlighting within the address area. Thanks to this encoding of the image zones, text-image linking can be managed at different levels of granularity, and the edition can be enriched with notes and references that start from the facsimile of the digital document. For image retrieval, the API defined by the International Image Interoperability Framework (IIIF) was used ( ).
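The zone-based encoding described above can be illustrated with a minimal Python sketch. It parses a small, invented TEI facsimile fragment and derives a bounding box for each zone from its points attribute. The identifiers, zone types, image file name, and coordinates are illustrative assumptions and do not reproduce the actual encoding of LL1.16.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_ID = "{http://www.w3.org/XML/1998/namespace}id"

# Invented fragment: ids, types, file name and coordinates are placeholders.
SAMPLE = """
<facsimile xmlns="http://www.tei-c.org/ns/1.0">
  <surface xml:id="surf_example">
    <graphic url="example_page.jpg"/>
    <zone xml:id="zone_address" type="address"
          points="100,200 500,200 500,320 100,320"/>
    <zone xml:id="zone_stamp" type="stamp"
          points="620,80 700,60 760,130 680,170"/>
  </surface>
</facsimile>
"""


def parse_points(points):
    """Turn a TEI points value ("x1,y1 x2,y2 ...") into (x, y) tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def bounding_box(polygon):
    """Return (min_x, min_y, max_x, max_y) for a list of (x, y) points."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return (min(xs), min(ys), max(xs), max(ys))


def zones_by_id(tei_xml):
    """Map each zone's xml:id to its type, polygon and bounding box."""
    root = ET.fromstring(tei_xml)
    zones = {}
    for zone in root.iter(f"{{{TEI_NS}}}zone"):
        polygon = parse_points(zone.get("points"))
        zones[zone.get(XML_ID)] = {
            "type": zone.get("type"),
            "polygon": polygon,
            "bbox": bounding_box(polygon),
        }
    return zones


ZONES = zones_by_id(SAMPLE)
```

A viewer could use such bounding boxes to highlight a region, or to request only that part of the image from an image server.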

In this way, detailed browsing of the content and context of the correspondence has been achieved by linking the most relevant information (e.g. named entities, catalogs, bibliography) to authoritative online datasets. Specifically, domain-specific datasets such as RISM have been preferred to more general-purpose ones such as DBpedia.

Fronte-recto of letter LL1.16 with address, second hand, third hand and stamps.


4.4. The digital scholarly environment

One of the most innovative aspects of the BellinInRete digital correspondence is the work carried out on the production, indexing, and visualization of the digital edition. To this end, a Web prototype ( ) has been implemented to process Bellini’s manuscripts according to the specifications and encoding choices introduced supra.

As described in the previous sections, the digitization of Bellini’s letters started by organizing the catalog into an electronic database and by acquiring facsimile scans of the original manuscripts, as is typically done in projects of this kind. The second phase involved the transcription and digital encoding of the original documents in the selected collection. Although these steps produce the digital representation of the letters, the Bellini corpus cannot be considered truly available until the entire corpus is published through some kind of Internet service ( ). In particular, both scholars and interested users are nowadays used to consulting their sources in a Web browser and in open access ( ). In light of the above, we decided to publish the digital edition of Bellini’s correspondence by developing a Web environment based on two existing systems: 1) Omega Web services for scholarly texts ( ); 2) Edition Visualization Technology (EVT) for the Web presentation of scholarly texts ( ). At the end of the development process, the two systems (described infra) will cooperate in an integrated environment for digital scholarly editing, suitable for producing, exploiting, and disseminating Bellini’s correspondence ( ).

Omega has been conceived as a modular digital environment for textual scholarship, based on a microkernel architecture and on a collection of Domain Specific Application Programming Interfaces (DS-API). The development of Omega, carried out by a group of researchers and technologists working at the ILC in Pisa, seeks to promote critical text enquiry by means of computational tools able to model and handle textual, linguistic, and semantic analyses efficiently. To reach this challenging objective, the digital environment adopts the Object-Oriented Analysis and Design (OOAD) paradigm and Semantic Web technologies, and it involves textual experts in the development team, as recommended by Domain-Driven Design principles and patterns. The project thus takes into account both the requirements of textual scholars and the needs of software developers. Indeed, the Omega microkernel provides developers with the basic functionalities needed to load, index, annotate, and query the domain entities for text representation and processing. These entities are reusable text abstractions composed of a set of minimal elements, such as the source element, the annotation element, and the locus element, and of a set of operations allowed upon these elements. In this way, the digital environment can be given a flexible architecture by stacking several layers of functionality one on top of the other, with a layer of RESTful services at the top.
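To make the idea of minimal text abstractions more concrete, the following Python sketch models a source, a locus, and an annotation, together with load, annotate, and query operations. All class and method names are hypothetical and chosen for illustration only; this is not Omega's actual API, and the sample text is invented.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Source:
    """A text to be processed (hypothetical abstraction)."""
    source_id: str
    content: str


@dataclass(frozen=True)
class Locus:
    """A character span [start, end) inside a source."""
    source_id: str
    start: int
    end: int


@dataclass(frozen=True)
class Annotation:
    """A value attached to a locus on a named layer."""
    locus: Locus
    layer: str
    value: str


@dataclass
class TextStore:
    """Tiny in-memory stand-in for load / annotate / query operations."""
    sources: dict = field(default_factory=dict)
    annotations: list = field(default_factory=list)

    def load(self, source):
        self.sources[source.source_id] = source

    def annotate(self, source_id, start, end, layer, value):
        source = self.sources[source_id]
        if not 0 <= start < end <= len(source.content):
            raise ValueError("locus outside the source text")
        annotation = Annotation(Locus(source_id, start, end), layer, value)
        self.annotations.append(annotation)
        return annotation

    def text_of(self, locus):
        return self.sources[locus.source_id].content[locus.start:locus.end]

    def query(self, layer):
        """Return (annotated text, value) pairs for one annotation layer."""
        return [(self.text_of(a.locus), a.value)
                for a in self.annotations if a.layer == layer]


store = TextStore()
store.load(Source("example", "la sortita del tenore"))  # invented text
store.annotate("example", 3, 10, "term", "sortita")
store.annotate("example", 15, 21, "term", "tenore")
```

Keeping the locus separate from the annotation means that several annotation layers can point at the same span without duplicating the text, which is one way to obtain the layered design described above.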

The well-known EVT tool aims at publishing digital editions encoded according to the TEI-XML guidelines. Although the principal objective of this tool is to help textual scholars in their digital studies by providing a lightweight and compact Web environment for creating digital editions, the project team has been developing special views and use cases that address the needs of the general public, so that non-specialists can better navigate and understand the content of the corpus. EVT has been entirely conceived and implemented with a client-side architecture and comes with a user-friendly interface that offers different edition levels (facsimile, diplomatic, interpretative, and critical). Currently, EVT is freely available in two versions (EVT1 and EVT2), which differ both in software architecture and in functionality. EVT2 has a modern Web design and is designed to be more flexible than EVT1; the latter, although its core parser is based only on XSLT 2.0, still implements more features than the newer version. The EVT team is working hard to close this gap and achieve feature parity between the two versions. The EVT project provides a fitting starting point for the Bellini prototype, as it can suitably manage TEI-encoded texts equipped with the corresponding facsimile images. Furthermore, the EVT instance for the BellinInRete digital correspondence provides facilities specifically designed for non-expert users, such as explanatory notes, biographical descriptions of the people cited, explanatory notes about places, events, and organizations (named entities), as well as exegetic comments on domain terminology, e.g. words like duetto or atto (see and ). Indeed, EVT will implement different views in Italian suitable for generic museum visitors.

To date, the ongoing Web prototype of the digital edition is publicly available and hosted on the CNR servers. shows the current state of the environment considering only one letter identified by the LL1.16 signature.

Specifically, once the TEI-encoded document has been sent from the back-end (server side, the Omega sub-system) to the front-end (client side, the EVT sub-system) through an asynchronous call to a RESTful service, the whole presentation and analysis of the letters takes place in the browser. The features provided by the EVT tool have been customized to meet the requirements of the BellinInRete project. Thanks to this customization work, the second version of EVT has been equipped with a high-performance image viewer with zooming and tiling capabilities, obtained by integrating the OpenSeadragon JavaScript library. Moreover, other functionalities have been developed and integrated into the environment, which now supports text-image linking, hotspots, and search. These enhancements have also reduced the gap between the two current versions of EVT, bringing them to substantial feature parity.
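Tiled viewing of this kind typically relies on requesting image regions from a server. As a hedged illustration, the sketch below builds request URLs that follow the IIIF Image API pattern (identifier, region, size, rotation, quality, format). The server address and the image identifier are hypothetical placeholders, not the project's actual endpoint, and the function name is our own.

```python
from urllib.parse import quote


def iiif_image_url(base_url, identifier, region=None, width=None,
                   rotation=0, quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL.

    region: None for the whole image, or an (x, y, w, h) tuple in pixels.
    width:  None for the maximum available size, or a pixel width
            (the height is then scaled proportionally by the server).
    """
    region_part = "full" if region is None else ",".join(str(v) for v in region)
    # "max" is accepted by Image API 2.1 and 3.0 for the largest size.
    size_part = "max" if width is None else f"{width},"
    return "/".join([
        base_url.rstrip("/"),
        quote(identifier, safe=""),  # identifiers must be URL-encoded
        region_part,
        size_part,
        str(rotation),
        f"{quality}.{fmt}",
    ])


# Hypothetical server and identifier, for illustration only.
BASE = "https://iiif.example.org/image"
FULL_PAGE = iiif_image_url(BASE, "LL1.16_1r")
ADDRESS_DETAIL = iiif_image_url(BASE, "LL1.16_1r",
                                region=(100, 200, 400, 120), width=800)
```

The region tuple could come directly from the bounding box of an encoded zone, which is how a hotspot or a highlighted text line can be turned into a detail view of the facsimile.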

Current prototype concerning the Bellini’s Correspondence Digital Edition: EVT.


, and illustrate in detail the prototype implemented so far. Through the Web platform it is currently possible to visualize the correspondence description and the envelope information, encoded, as described in the previous sections, by means of the correspDesc tagset within the profileDesc of the TEI header and of the front part of the text element, respectively. Furthermore, a dedicated panel displays the entire description of both the primary source and the BellinInRete digital correspondence. These data have been encoded through the msDesc tagset and the projectDesc element of the header.
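As a minimal sketch of how correspondence metadata encoded with correspDesc can be read by a program, the following Python snippet extracts sender and recipient information from a small TEI fragment. The fragment is invented: apart from the sender's name, the person, places, and date are placeholders and do not describe any actual letter of the corpus.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Invented fragment: recipient, places and date are placeholders.
SAMPLE = """
<correspDesc xmlns="http://www.tei-c.org/ns/1.0">
  <correspAction type="sent">
    <persName>Vincenzo Bellini</persName>
    <placeName>Example City A</placeName>
    <date when="1830-01-01"/>
  </correspAction>
  <correspAction type="received">
    <persName>Example Recipient</persName>
    <placeName>Example City B</placeName>
  </correspAction>
</correspDesc>
"""


def correspondence_summary(xml_text):
    """Map each correspAction type to its person, place and date."""
    root = ET.fromstring(xml_text)
    summary = {}
    for action in root.iter(f"{{{TEI_NS}}}correspAction"):
        date = action.find(f"{{{TEI_NS}}}date")
        summary[action.get("type")] = {
            "person": action.findtext(f"{{{TEI_NS}}}persName"),
            "place": action.findtext(f"{{{TEI_NS}}}placeName"),
            "when": date.get("when") if date is not None else None,
        }
    return summary


SUMMARY = correspondence_summary(SAMPLE)
```

A structure of this kind is what a front-end panel would need in order to show who wrote to whom, from where, and when.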

Manuscript description (left hand box) and the envelope information (right hand box).


The digital environment also provides features to handle different kinds of lists, such as named entities (persons, places, organizations), bibliography (performance works, books, etc.), the collection of pages, the collection of documents, and others. Moreover, as a result of the work conducted within BellinInRete, it is now possible to link a high-resolution facsimile image to the corresponding textual transcription, with zooming, rotating, and panning functionalities.

Finally, the digital edition of Bellini’s correspondence also provides critical notes alongside the main text, the highlighting of parallel lines in the text and in the corresponding image, the handling of hotspots on the selected image zones, and a search engine that makes it possible to query for text passages containing specific words or characters ( ).

The Graphical User Interface (GUI) in presents the description of the source manuscript in the left-hand box and, alternatively, in the right-hand box, the envelope information written on the fronte-verso surface of the original document. The screenshot also shows the XML encoding of the place name Paris, the transcription of the stamp data, and the text added by later hands (highlighted in green).

The screenshot in shows the main layout of the GUI, from which scholars and interested users can activate several interactive Web functionalities. For example, in the left-hand box the Web environment provides users with a high-performance image viewer which, thanks to its tiling capabilities, can handle very high-resolution scans. Moreover, the GUI allows scholars to synchronize text and image at line granularity, to search for keywords in context (KWIC) thanks to a new search engine developed from scratch ( ), and to open a detailed description of each encoded named entity. A particularly useful feature is the chance to access the information encoded in the documents every time a term occurs in the text, whereas a reader of a traditional printed edition has to look for the first occurrence of a lemma to retrieve the corresponding information.
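The keyword-in-context idea can be sketched in a few lines of Python. The function below is a simplified illustration and not the search engine integrated in EVT; the tokenization rule, the window size, and the sample sentence (which is invented, not quoted from a letter) are all assumptions.

```python
import re


def kwic(text, keyword, window=3):
    """Return (left context, match, right context) for each occurrence.

    Matching is case-insensitive and works on whole word tokens;
    `window` is the number of tokens shown on each side.
    """
    tokens = re.findall(r"\w+", text)
    hits = []
    for i, token in enumerate(tokens):
        if token.lower() == keyword.lower():
            left = " ".join(tokens[max(0, i - window):i])
            right = " ".join(tokens[i + 1:i + 1 + window])
            hits.append((left, token, right))
    return hits


# Invented sentence used only to exercise the function.
SAMPLE = "Il duetto del primo atto piace molto e il Duetto finale ancora"
HITS = kwic(SAMPLE, "duetto", window=2)
```

Each hit keeps the matched token in its original spelling, so the concordance lines can be shown exactly as they appear in the transcription.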

Text-Image linking feature and Name entities description box.


Finally, illustrates some other useful features for automatically handling the hotspots encoded on the images and for showing the notes that accompany the main text. The project leaders have paid a great deal of attention to encoding as many extratextual descriptions as possible: by clicking on the highlighted terms, end users obtain references to the people mentioned by Bellini, to opera institutions, to places, and to the numerous technical "lemmas" that appear in the letters. Particularly interesting is the case of the so-called voci (description entries), which largely concern the morphology of Italian opera (aria, duetto, cabaletta) and are thus easily accessible also to readers not familiar with the vocabulary used by nineteenth-century Italian opera composers.

Image hotspots and text notes.


To ease the maintenance and long-term preservation of both the digital edition and the project resources, open source technologies and well-established standard formats have been adopted, which considerably reduces the risk of digital obsolescence. Special attention has also been paid to compliance with the FAIR data principles (Findability, Accessibility, Interoperability, and Reusability), in order to support information management and interchange along with data processing tasks. Moreover, we plan to publish the digital edition under the CLARIN-IT umbrella, in order to take advantage of a pan-European infrastructure devoted to maintaining digital resources in the digital humanities field.

Ultimately, the Museum will be in charge of the accounting matters related to the preservation and updating of the digital edition and software packages.

5. Conclusions and work in progress

This paper has described the current state of the digital edition of Vincenzo Bellini’s correspondence. The research illustrated here is part of the Museo Virtuale della Musica BellinInRete project. In particular, this work has dealt with the part of the project concerning the corpus of Bellini’s letters, discussing the text encoding scheme and the editorial choices adopted, with special reference to the structure of the epistolary grammar, the encoding of named entities, and the parallel presentation of the transcribed text and the corresponding scanned images. The two subsystems composing the digital environment (i.e., the Omega platform and the EVT Web tool) have also been examined.

With regard to the creation of the digital catalog, we analyzed the common vocabularies that are most useful for building authoritative datasets that can easily be integrated in a Linked Open Data perspective.

The activities involved in the digitization of the letters pave the way for numerous enhancements. First and foremost, we plan to complete the digitization and text encoding of all the selected materials, including the "minute" (draft manuscripts) that Bellini used to write before the "formal" fair copy. These documents are of great philological interest, as they show the author’s interventions: additions, deletions, reconsiderations, and other morphological and stylistic reshapings made by the composer.

Concerning terminology, and with a view to reusability, the Lesmu lexicon will be taken into account ( ). Lesmu is a computational Italian lexicon of music literature developed with the contribution of ILC in the early 2000s. Although the technology on which it is based is outdated, the musicologists working on BellinInRete have found Lesmu a valuable resource, since it allows searching over 22,500 lexicographical entries comprising more than three million and six thousand indexed words. In , for example, a screenshot of the old Lesmu interface shows the lexical entry sortita, which appears in the example letter LL1.16 of the corpus.

The lexical entry “sortita” in Lesmu computational lexicon.


Among future works, we would like to renew the Lesmu resource by reimplementing it with state-of-the-art technologies and lexical models compliant with the Semantic Web, such as lemon ( ). As for the interface to the lexicon, we will adapt LexO ( ), a Web-based collaborative editor developed by ILC, to the needs of the Bellini lexicon. At its current stage of development, LexO allows a lexicographer (or terminologist) to create a resource consisting of one or more lexica, each composed of lexical entries with a lemma, a number of forms (with morphological traits), and one or more lexical senses. Each sense can also be linked to a concept of an ontology represented in the OWL language.
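The structure of such a lexical resource (lexicon, entries, forms with morphological traits, senses linked to ontology concepts) can be sketched with plain Python data classes. This is an illustrative model under our own assumptions, not the data model of LexO or lemon; the class names, the placeholder definition, and the concept IRI are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Form:
    written: str
    traits: dict = field(default_factory=dict)  # e.g. {"number": "plural"}


@dataclass
class Sense:
    definition: str
    concept_iri: Optional[str] = None  # link to an ontology concept, if any


@dataclass
class LexicalEntry:
    lemma: str
    forms: list = field(default_factory=list)
    senses: list = field(default_factory=list)


@dataclass
class Lexicon:
    name: str
    entries: dict = field(default_factory=dict)

    def add(self, entry):
        self.entries[entry.lemma] = entry

    def lemma_of(self, written_form):
        """Return the lemma whose forms include `written_form`, if any."""
        wanted = written_form.lower()
        for entry in self.entries.values():
            if wanted == entry.lemma.lower() or any(
                    f.written.lower() == wanted for f in entry.forms):
                return entry.lemma
        return None


lexicon = Lexicon("bellini-example")
lexicon.add(LexicalEntry(
    lemma="duetto",
    forms=[Form("duetto", {"number": "singular"}),
           Form("duetti", {"number": "plural"})],
    # Placeholder definition and hypothetical IRI, for illustration only.
    senses=[Sense("definition to be supplied by the lexicographer",
                  "https://example.org/ontology#Duet")],
))
```

Looking up an inflected form such as the plural then returns the lemma under which the entry is filed.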

The link between the terminological resource and the corresponding occurrences within the selected corpus will be made possible by lemmatization. We plan to develop a module that processes the words of the Bellini corpus in order to encode, semi-automatically, the canonical form (lemma) of each term within the TEI-encoded document itself. The encoding may be done, for example, with the term element of the TEI schema. To support this linguistic and textual analysis, we will provide a KWIC (Key Word In Context) functionality for classification purposes, which suggests the words or phrases most likely to be encoded as terms, along with their concordances across all the letters. Natural Language Processing (NLP) approaches will be adopted for this task. Once the lemmatization of candidate terms is available, an informative box will show the description of the highlighted words while browsing the letters ( ).
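As a rough sketch of what the automatic part of this encoding could look like, the following Python function wraps known word forms in term elements that point to their lemma. It uses a simple dictionary lookup rather than a real NLP lemmatizer, omits the TEI namespace for brevity, and assumes (hypothetically) that each lemma is used as the identifier of an entry in the term list; the sample sentence is invented.

```python
import re
import xml.etree.ElementTree as ET


def tag_terms(text, form_to_lemma):
    """Build a <p> element in which known forms are wrapped in <term>.

    form_to_lemma maps lower-cased word forms to lemma identifiers;
    the ref value "#lemma" is an assumed pointer into a term list.
    """
    paragraph = ET.Element("p")
    paragraph.text = ""
    last_term = None
    # re.split with a capturing group keeps both words and separators.
    for piece in re.split(r"(\w+)", text):
        lemma = form_to_lemma.get(piece.lower())
        if lemma is not None:
            last_term = ET.SubElement(paragraph, "term", {"ref": f"#{lemma}"})
            last_term.text = piece
            last_term.tail = ""
        elif last_term is None:
            paragraph.text += piece      # text before the first term
        else:
            last_term.tail += piece      # text following the latest term
    return paragraph


FORMS = {"duetto": "duetto", "duetti": "duetto", "sortita": "sortita"}
TAGGED = ET.tostring(tag_terms("Il duetto e la sortita", FORMS),
                     encoding="unicode")
```

In a semi-automatic workflow, the output of such a step would be proposed to the editors for confirmation rather than written directly into the edition.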

As far as the development of the digital scholarly platform is concerned, new features, corresponding to several use cases, are planned for the next steps of the BellinInRete digital correspondence. In particular, two new modules are scheduled for integration into the Web application. The first is conceived to manage the whole corpus, encompassing the entire collection of letters, by handling the facsimile images of each folio. This functionality will be realized by combining the features already available in the image viewer with the IIIF Presentation API. In this way, a general feature useful both to our project and to the EVT Web application at large will be added. The second module is conceived to handle the collation of the texts and, consequently, the word alignment between the letters and the manuscript drafts written by Bellini. This module will exploit the Canonical Text Services (CTS) and the alignment component implemented in the back-end (Omega), and it will integrate the EVT functionality for visualizing parallel loci and the corresponding texts.


This research project is partially funded by the Patto per Catania under the Fondo Sviluppo e Coesione 2014-2020: Piano per il Mezzogiorno.


  1. Abrate, Matteo, Angelo Mario Del Grosso, Emiliano Giovannetti, Angelica Lo Duca, Damiana Luzzi, Lorenzo Mancini, Andrea Marchetti, Irene Pedretti, and Silvia Piccini. 2014. ‘Sharing Cultural Heritage: The Clavius on the Web Project’. In Proceedings of the 9th International Conference on Language Resources and Evaluation, 627–34. Reykjavik, Iceland: European Language Resources Association (ELRA).

  2. Bellandi, Andrea, Emiliano Giovannetti, and Silvia Piccini. 2018. ‘Collaborative Editing of Lexical and Termino-Ontological Resources: A Quick Introduction to LexO’. In XVIII EURALEX International Congress Lexicography in Global Contexts - Book of Abstracts, edited by Jaka Čibej, Vojko Gorjanc, Iztok Kosem, and Simon Krek, 23–27. Ljubljana, Slovenia: Znanstvena založba Filozofske fakultete Univerze v Ljubljani / Ljubljana University Press, Faculty of Arts.

  3. Bellini, Vincenzo. 2017. Carteggio. Edited by Graziella Seminara. Firenze: Olschki.

  4. Blackwell, C., and N. Smith. 2014. ‘The Homer Multitext: Technically’. Website page.

  5. Bozzi, Andrea. 2014. ‘Computer-Assisted Scholarly Editing of Manuscript Sources’. In New Publication Cultures in the Humanities: Exploring the Paradigm Shift, edited by Peter Dávidházi, 99–116. Amsterdam: Amsterdam University Press.

  6. Bozzi, Andrea. 2006. ‘Edizione Elettronica e Filologia Computazionale’. In Fondamenti Di Critica Testuale, edited by Alfredo Stussi, 207–32. Manuali. Bologna: Il Mulino.

  7. Burnard, Lou. 2015. ‘TEI P5: Guidelines for Electronic Text Encoding and Interchange - Version 2.9.1’.

  8. Cacioli, G. 2019. ‘Filologia e Information Retrieval: Progettazione e Sviluppo Di Un Motore Di Ricerca per EVT’. Master’s Thesis, Università degli Studi di Pisa.

  9. Capizzi, Erica. 2018. ‘Il Fondo Perucchini Del Museo Civico Belliniano Di Catania’. In Un Nobile Veneziano in Europa. Teatro e Musica Nelle Carte Di Giovanni Battista Perucchini, edited by Maria Rosa De Luca, Graziella Seminara, and C. Steffann, 159–67. Lucca: LIM.

  10. Condorelli, Benedetto, ed. 1935. Il Museo Belliniano. Catalogo Storico - Iconografico. Catania: Spampinato e Sgroi.

  11. Condorelli, Benedetto. 2017. Il Museo Belliniano. Catalogo Storico Iconografico. 1st ed. Catania, 1930.

  12. Del Grosso, A., D. Albanesi, E. Giovannetti, and S. Marchi. 2016. ‘Defining the Core Entities of an Environment for Textual Processing in Literary Computing’. In Digital Humanities 2016: Conference Abstracts, 771–75. Kraków: Jagiellonian University & Pedagogical University.

  13. Del Grosso, Angelo, Emiliano Giovannetti, and Simone Marchi. ‘Il Modello a Microkernel Di Omega Nello Sviluppo Di Strumenti per Lo Studio Dei Testi: Dagli ADT Alle API’. In AIUCD 2017 - Book of Abstracts, edited by Fabio Ciotti and Gianfranco Crupi, 79–85. Quaderni Di Umanistica Digitale.

  14. Del Grosso, Angelo Mario, Andrea Bellandi, Emiliano Giovannetti, Simone Marchi, and Ouafae Nahli. 2018. ‘Scanning Is Just the Beginning: Exploiting Text and Language Technologies to Enhance the Value of Historical Manuscripts’. In Proceedings of the IEEE 5th International Congress on Information Science and Technology (CiSt), 214–19.

  15. Del Grosso, Angelo, Daria Spampinato, Salvatore Cristofaro, Maria Rosa De Luca, Emiliano Giovannetti, Simone Marchi, and Graziella Seminara. 2018. ‘Le Lettere Di Bellini: Dalla Carta al Web’. In AIUCD 2018 - Book of Abstracts, edited by Daria Spampinato, 60–64. Quaderni Di Umanistica Digitale.

  16. Della Seta, Fabrizio, ed. 2010. Le Parole Del Teatro Musicale. Roma: Carocci.

  17. McCrae, John, Dennis Spohr, and Philipp Cimiano. 2011. ‘Linking Lexical Resources and Ontologies on the Semantic Web with Lemon’. In The Semantic Web: Research and Applications, edited by Grigoris Antoniou, Marko Grobelnik, Elena Simperl, Bijan Parsia, Dimitris Plexousakis, Pieter De Leenheer, and Jeff Pan, 245–59. Lecture Notes in Computer Science. Springer Berlin Heidelberg.

  18. Neri, Carmelo, ed. 1998. Guida Illustrata Del Museo Civico Belliniano Di Catania. Catania: Maimone Editore.

  19. Neuber, Frederike. 2016. ‘CorrespSearch’. Variants. The Journal of the European Society for Textual Scholarship, no. 12–13: 284–85.

  20. Nicolodi, Fiamma, and Paolo Trovato. 2007. Lessico Della Letteratura Musicale Italiana 1490-1950. Firenze: Franco Cesati Editore.

  21. Pastura, Francesco, ed. 1968. Inventario generale del Museo Belliniano. Unpublished.

  22. Stadler, Peter, Marcel Illetschko, and Sabine Seifert. 2016. ‘Towards a Model for Encoding Correspondence in the TEI: Developing and Implementing <correspDesc>’. Journal of the Text Encoding Initiative 9.

  23. Rosselli Del Turco, R., and C. Di Pietro. 2016. ‘Between Innovation and Conservation: The Narrow Path of UI Design for the DSE.’ In Digital Scholarly Editions as Interfaces. Institute for Documentology and Scholarly Editing (IDE), 133-163.

  24. Vavliakis, Konstantinos N., Georgios Th Karagiannis, and Pericles A. Mitkas. 2012. ‘Semantic Web in Cultural Heritage After 2020’. In Proceedings of the Workshop ‘What Will the Semantic Web Look like 10 Years from Now?’ Boston, MA.

  25. Vecchio, G., and U. Martelli, eds. 1968. Inventario Cimeli e libri del Museo Belliniano nel 1968. Unpublished.

  26. Wilkinson, Mark D., Michel Dumontier, IJsbrand Jan Aalbersberg, Gabrielle Appleton, Myles Axton, Arie Baak, Niklas Blomberg, et al. 2016. ‘The FAIR Guiding Principles for Scientific Data Management and Stewardship’. Scientific Data 3.

  27. Ying, William, and James Shulman. 2015. ‘Bottled or Tap? A Map for Integrating International Image Interoperability Framework (IIIF) into Shared Shelf and Artstor’. D-Lib Magazine 21, 7/8.

All URLs were last accessed on 12 September 2019.

The present paper is an extension of the abstract presented at the AIUCD2018 Conference .

The TEI guidelines are available at the following Web address:

UNIversal MAchine Readable cataloging

Virtual International Authority File,

Répertoire International des Sources Musicales,

In the Supplica, the young Vincenzo Bellini (not yet eighteen years old) asked Stefano Notarbartolo, Duke of Sammartino, Intendant of the Valley of the city of Etna, to grant him a scholarship in order to follow the course of composition at the Royal College of Music of San Sebastiano in Naples.

In this draft, Bellini, 20 days before his premature death, communicated to his publisher that he had discovered the person responsible for placing an unauthorized copy of the Puritani score in the theatrical market.

It is worth mentioning the publication, between the 1930s and the 1990s, of three catalogs illustrating the legacy owned by the Museum: precious repertoires which, however, do not meet any cataloging standard; see and .

Servizio Bibliotecario Nazionale,

Resource Description and Access,

Functional Requirements for Bibliographic Records object oriented,

Friend Of A Friend,

This regards the letter that Giuseppe Alvaro Paternò sent to Vincenzo Bellini (9 July 1826, Catania), in , p. 71 n. 6.

In addition to the correspondence, a number of documents, on display or in the archive, were cataloged: these are mainly documents produced by the Municipality of Catania or budget notes sent to Vincenzo Bellini, inventories of his legacy, and legal documents. The Bellini Civic Museum also holds the Fondo Perucchini, which consists of a considerable number of letters (those addressed to the Venetian musician alone number over 700) and also includes some musical pages, handwritten and printed documents, and other materials.

Metadati Amministrativi Gestionali (MAG) is an Italian standard for collecting metadata related to digital objects, according to international standards.

A first version of the TEI-encoded document is available at the prototype link

As already mentioned, one of the main objectives of the BellinInRete project (and of the digital edition) is to make the materials fully available not only to experts, but also to occasional museum visitors such as school children. It is therefore extremely important to encode explicitly all the information that is implicit in the text, in an immediate and sometimes simplified way.

The MEI defines guidelines to encode musical documents in a machine-readable form. MEI closely mirrors the work done by text scholars in TEI. The schema is available at

EVT was born in the context of the Digital Vercelli Book project. It is hosted at the following web address:

The digital edition is available in a demo installation at the following web address:

OpenSeadragon is a JavaScript library, compliant with the IIIF specifications as regards the Image API, hosted in an open GitHub repository. The project web address is available at the following link:

These features are existing features in the first version of EVT, but not yet available in the second version of the tool.

The Italian Clarin Web site is at the following URL: