Wordnet Annotated Corpora
Citations
- bul Svetla Koeva, Svetlozara Leseva, Ekaterina Tarpomanova, Borislav Rizov, Tsvetana Dimitrova, Hristina Kukova. Bulgarian Sense Annotated Corpus – Results and Achievements. In Tadić, M., Dimitrova-Vulchanova, M. and Koeva, S. (eds.): Proceedings of the 7th International Conference of Formal Approaches to South Slavic and Balkan Languages (FASSBL-7), 4-6 October 2010, Dubrovnik, Croatia, pp. 41-48. ISBN 978-953-55375-2-6.
- baq Eneko Agirre, Izaskun Aldezabal, Jone Etxeberria, Eli Izagirre, Karmele Mendizabal, Eli Pociello, and Mikel Quintian. 2006. Improving the Basque WordNet by corpus annotation. In Proceedings of the Third International WordNet Conference, pages 287-290.
- dut Vossen P., Görög, A., Laan, F., Van Gompel, M., Izquierdo, R., Van den Bosch, A. (2011). DutchSemCor: building a semantically annotated corpus for Dutch. In: Proceedings of Electronic Lexicography in the 21st century: New Applications for new users (eLEX2011), Bled, Slovenia, November 10-12, 2011.
- eng George A. Miller, Martin Chodorow, Shari Landes, Claudia Leacock, and Robert G. Thomas. 1994. Using a semantic concordance for sense identification. In Proceedings of the ARPA Human Language Technology Workshop, pages 240-243; George A. Miller, Claudia Leacock, Randee Tengi, and Ross T. Bunker. (1993). “A Semantic Concordance.” In: Proceedings of the 3rd DARPA Workshop on Human Language Technology.
- eng Ide, N. (2012). MultiMASC: An Open Linguistic Infrastructure for Language Research. Proceedings of the Fifth Workshop on Building and Using Comparable Corpora, held in conjunction with LREC 2012, Istanbul.
- eng Benjamin Snyder and Martha Palmer (2005) The English All-Words Task, in Proceedings of the Third International Workshop on the Evaluation of Systems for the Semantic Analysis of Text (SENSEVAL-3), 2004
- ger Verena Henrich, Erhard Hinrichs, and Tatiana Vodolazova: WebCAGe — A Web-Harvested Corpus Annotated with GermaNet Senses. In Proceedings of the 13th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2012), Avignon, France, April 2012, pp. 387-396.
- ita Luisa Bentivogli, Emanuele Pianta and Marcello Ranieri. MultiSemCor: an English Italian aligned corpus with a shared inventory of senses. In Proceedings of the Meaning Workshop 2005, Trento, Italy, February 3-4, 2005, p. 90; Luisa Bentivogli and Emanuele Pianta. 2005. Exploiting parallel texts in the creation of multilingual semantically annotated resources: the MultiSemCor corpus. Natural Language Engineering, 11(3):247–261.
- ita Simonetta Montemagni, Francesco Barsotti, Marco Battista, Nicoletta Calzolari, Ornella Corazzari, Alessandro Lenci, Antonio Zampolli, Francesca Fanciulli, Maria Massetani, Remo Raffaelli, Roberto Basili, Maria Teresa Pazienza, Dario Saracino, Fabio Zanzotto, Nadia Mana, Fabio Pianesi, Rodolfo Delmonte, 2003. “Building the Italian Syntactic-Semantic Treebank”, in Anne Abeillé (ed.), Building and using Parsed Corpora, Language and Speech series, Kluwer, Dordrecht, pp. 189-210; Stefano Dei Rossi, Giulia Di Pietro, Maria Simi. Evalita 2011: Description and Results of the SuperSense Tagging Task.
- jpn Francis Bond, Timothy Baldwin, Richard Fothergill and Kiyotaka Uchimoto (2012) Japanese SemCor: A Sense-tagged Corpus of Japanese. In The 6th International Conference of the Global WordNet Association (GWC-2012), Matsue.
- rum Monica Lupu, Diana Trandabat and Maria Husarciuc. A Romanian SemCor aligned to the English and Italian MultiSemCor. In 1st ROMANCE FrameNet Workshop at EUROLAN 2005 Summer School, Proceedings, pages 20–27, Cluj-Napoca, Romania, July 2005.
- spa Castellón I., Climent S., Coll-Florit M., Lloberes M. and Rigau G. Semantic Hand-Tagging of the SenSem Corpus Using Spanish WordNet Senses. Proceedings of the 6th Global WordNet Conference (GWC’12), Matsue, Japan. January, 2012.
References
- Both lexical and function words were subject to annotation.
- 282,503 words were tagged manually by two annotators, 400,000+ by at least one annotator, and millions automatically.
- According to Bentivogli and Pianta (2005), 23.4% of the Italian words still need to be tagged. Given that the 92,820 tagged words make up the remaining 76.6%, the number of taggable words can be estimated at about 121,175.
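The estimate in the last note can be checked with a short calculation. The following is only a sketch of that arithmetic; the function name `estimate_taggable` is an illustrative assumption and does not come from the cited papers.

```python
def estimate_taggable(tagged_count: int, untagged_share: float) -> int:
    """Estimate the total number of taggable words.

    tagged_count: words already tagged (92,820 in the note above).
    untagged_share: fraction still to be tagged (0.234 per Bentivogli and Pianta, 2005).
    """
    tagged_share = 1.0 - untagged_share  # 0.766 for the figures above
    # The tagged words are tagged_share of the total, so divide to recover the total.
    return round(tagged_count / tagged_share)


if __name__ == "__main__":
    print(estimate_taggable(92_820, 0.234))  # prints 121175
```

Rounding to the nearest whole word reproduces the 121,175 figure given in the note.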