Maia: an Open Collaborative Platform for Text Annotation, E-Lexicography, and Lexical Linking

DOI:

https://doi.org/10.6092/issn.2532-8816/19705

Keywords:

text annotation, e-lexicography, lexical linking, computational lexicon, linguistic resources, Linguistic Linked Open Data, OntoLex-Lemon, Maia

Abstract

Although open tools for manual text annotation and for the creation of lexical resources have been available for some years, no integrated tool currently supports, within a single environment, annotating a text corpus, building a computational lexicon, and linking linguistic annotations to lexical elements. For this reason, we have developed Maia, an open, collaborative web tool for text annotation, e-lexicography, and lexical linking, based on Semantic Web and Linked Open Data technologies and designed primarily for digital humanists. This article presents the first version of Maia, describing its functionality and outlining its software architecture and development prospects.

Published

2024-10-16

How to Cite

Giovannetti, E., Albanesi, D., Bellandi, A., Marchi, S., Papini, M., & Sciolette, F. (2024). Maia: an Open Collaborative Platform for Text Annotation, E-Lexicography, and Lexical Linking. Umanistica Digitale, 8(18), 27–52. https://doi.org/10.6092/issn.2532-8816/19705
