Impact of corpus diversity and complexity on NER performance

Tatyana Shmanina, Ingrid Zukerman, Antonio Jimeno Yepes, Lawrence Cavedon, Karin Verspoor

    Research output: Chapter in Book/Report/Conference proceeding › Conference Paper › Research › peer-review

    2 Citations (Scopus)


    We describe a cross-corpora evaluation of disease mention recognition for two annotated biomedical corpora: the Human Variome Project Corpus and the Arizona Disease Corpus. Our analysis of the performance of a state-of-the-art NER tool in terms of the characteristics and annotation schema of these corpora shows that these factors significantly affect performance.

    Original language: English
    Title of host publication: Australasian Language Technology Association Workshop 2013 - Proceedings of the Workshop (ALTA)
    Editors: Sarvnaz Karimi, Karin Verspoor
    Place of Publication: Stroudsburg PA USA
    Publisher: Association for Computational Linguistics (ACL)
    Number of pages: 5
    Publication status: Published - 2013
    Event: Australasian Language Technology Association Workshop 2013 - Queensland University of Technology, Brisbane, Australia
    Duration: 4 Dec 2013 to 6 Dec 2013
    Conference number: 11th (Proceedings)


    Conference: Australasian Language Technology Association Workshop 2013
    Abbreviated title: ALTAW 2013