Lexicographic Resources in the Semantic Web: Models, Tools, and Case Studies

Authors

  • Andrea Bellandi, Istituto di Linguistica Computazionale "A. Zampolli"

DOI:

https://doi.org/10.60923/issn.2532-8816/22199

Keywords:

E-Lexicography, Dictionary, Linguistic Resources, Linguistic Linked Data, Lexicog, LexO-server

Abstract

With the advent of Semantic Web technologies and the Linked Data paradigm, digital lexicography has undergone a significant transformation, evolving from the mere digitization of printed dictionaries to more advanced forms of linguistic data representation and interconnection. Within this context, the Lexicography Module (Lexicog) emerges as a data model developed by the W3C OntoLex community group, designed to support the modeling of interoperable lexicographic resources in accordance with the FAIR principles. This paper provides a review of the Lexicog model, assessing its suitability for the practical needs of lexicographers in dictionary compilation and demonstrating its application through three real-world case studies involving different types of dictionaries. Furthermore, it presents a suite of open-source software tools developed to facilitate the construction and use of computational dictionaries. The goal is to promote open, sustainable digital lexicographic practices aligned with Semantic Web standards.

Published

2025-10-23

How to Cite

Bellandi, A. (2025). Lexicographic Resources in the Semantic Web: Models, Tools, and Case Studies. Umanistica Digitale, 9(21), 107–140. https://doi.org/10.60923/issn.2532-8816/22199

Section

Articles