Applying text mining methods to build a domain ontology from definitions
DOI:
https://doi.org/10.6092/issn.2532-8816/19284
Keywords:
terminology, definition, domain ontology, knowledge-rich information, domain corpus, terminological dictionary
Abstract
This article describes a text-mining approach applied to a domain corpus (cork), within a theoretical framework that recognizes the dual dimension of terminology, with the aim of developing a terminological dictionary and linking it to an ontology. It discusses (i) the specificity of the domain; (ii) lexical markers; (iii) automatic processing of the corpus with Sketch Engine; (iv) the representation of lexical networks with CmapTools; and (v) the representation of the conceptual system with Protégé. The purpose of the ontology is to support, from a logical standpoint, the consistency and quality of the natural-language definitions contained in the terminological resource.
License
Copyright (c) 2024 Margarida Ramos, Rute Costa
This work is licensed under a Creative Commons Attribution 4.0 International License.