Morphology based automatic acquisition of large-coverage lexica
CLÉMENT, Lionel
Laboratoire Bordelais de Recherche en Informatique [LaBRI]
Linguistic signs, grammar and meaning: computational logic for natural language [SIGNES]
Language
en
Conference paper
This item was published in
LREC 04, 2004, Lisbon, p. 1841-1844
English Abstract
In this article, we introduce a new technique for constructing wide-coverage morphological lexica from large corpora and morphological knowledge, with an application to French. It relies on the idea that the existence of a hypothetical lemma can be guessed if several different words found in the corpus are best interpreted as morphological variants of this lemma. We first validated our technique by extracting verbs and adjectives from a general French corpus of 25 million words. Compared with other lexical resources available for French, our results are very satisfying, since we cover many words, often derived words, that are not always present in other lexica. Applying our algorithm to the acquisition of domain-specific adjectives from a botanical corpus also gave very good results, thus demonstrating its usability for extracting domain-specific lexica. Moreover, it is generalizable to any language with a substantial morphology.
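The core idea of the abstract, that a lemma is plausible when several distinct corpus words can be read as its inflected forms, can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the two toy inflection classes in `INFLECTION_CLASSES`, the function names `candidate_lemmas` and `guess_lemmas`, and the `min_forms` threshold are hypothetical and are not taken from the paper. In particular, the sketch replaces the "best interpreted" criterion with a simple count of attested forms.

```python
from collections import defaultdict

# Hypothetical toy inflection classes: lemma suffix plus possible form suffixes.
# These are illustrative assumptions, not the paper's morphological description.
INFLECTION_CLASSES = {
    "verb-er": ("er", ["e", "es", "ons", "ez", "ent", "er", "é"]),
    "adj-eux": ("eux", ["eux", "euse", "euses"]),
}


def candidate_lemmas(word):
    """Yield (lemma, class_name) hypotheses that could have produced `word`."""
    for class_name, (lemma_suffix, form_suffixes) in INFLECTION_CLASSES.items():
        for suffix in form_suffixes:
            # Require a non-empty stem before the inflectional suffix.
            if word.endswith(suffix) and len(word) > len(suffix):
                stem = word[: len(word) - len(suffix)]
                yield stem + lemma_suffix, class_name


def guess_lemmas(corpus_words, min_forms=2):
    """Keep hypotheses supported by at least `min_forms` distinct corpus words."""
    support = defaultdict(set)
    for word in set(corpus_words):
        for hypothesis in candidate_lemmas(word):
            support[hypothesis].add(word)
    return {h: forms for h, forms in support.items() if len(forms) >= min_forms}


if __name__ == "__main__":
    words = ["chante", "chantons", "chantez", "nerveux", "nerveuse", "table"]
    for (lemma, class_name), forms in sorted(guess_lemmas(words).items()):
        print(lemma, class_name, sorted(forms))
```

On this toy word list, hypotheses such as "tabler" or "nerveuser" are each supported by a single form and are discarded, whereas "chanter" and "nerveux" are each supported by several distinct forms. A realistic system would presumably need a full morphological description and a way to rank competing hypotheses rather than a fixed threshold.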
Origin
Hal imported