Topic-specific link analysis using independent components for information retrieval

Wray Buntine, Jaakko Löfström, Sami Perttu, Kimmo Valtonen

Research output: Chapter in Book/Report/Conference proceeding; Conference Paper; Research; peer-review

Abstract

There has been mixed success in applying semantic component analysis (LSA, PLSA, discrete PCA, etc.) to information retrieval. Previous experiments have shown that high-fidelity language models do not imply good-quality retrieval. Here we combine link analysis with discrete PCA (a semantic component method) to develop an auxiliary score for information retrieval, which is used to post-filter documents retrieved via regular Tf.Idf methods. For this, we use a topic-specific version of link analysis based on topics developed automatically via discrete PCA methods. To evaluate the resulting topic- and link-based scoring, a demonstration has been built using Wikipedia, the public domain encyclopedia on the web.
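
The abstract does not give the scoring formula, so the following is only a rough sketch of the general idea, not the authors' method. It assumes a personalized-PageRank style of link analysis in which the teleport distribution is proportional to each document's weight for one topic (such weights would come from a component model such as discrete PCA, which is not implemented here), and it then blends the resulting link score with Tf.Idf scores to reorder retrieved documents. The function names, the damping factor, and the blending weight `mix` are all hypothetical.

```python
def topic_pagerank(links, topic_weight, damping=0.85, iters=50):
    """Personalized PageRank for one topic (an assumed stand-in technique).

    links: dict mapping a document id to the list of ids it links to.
    topic_weight: dict mapping every document id to a non-negative weight
                  for the topic (assumed to come from a component model).
    """
    docs = sorted(topic_weight)
    total = sum(topic_weight.values())
    # Teleport distribution: random jumps favour on-topic documents.
    teleport = {d: topic_weight[d] / total for d in docs}
    rank = dict(teleport)
    for _ in range(iters):
        nxt = {d: (1.0 - damping) * teleport[d] for d in docs}
        for d in docs:
            out = [t for t in links.get(d, []) if t in teleport]
            if out:
                share = damping * rank[d] / len(out)
                for t in out:
                    nxt[t] += share
            else:
                # Dangling document: spread its mass via the teleport distribution.
                for t in docs:
                    nxt[t] += damping * rank[d] * teleport[t]
        rank = nxt
    return rank


def rerank(tfidf_hits, link_score, mix=0.5):
    """Reorder (doc, tfidf_score) hits by a blend of max-normalised scores."""
    max_t = max(s for _, s in tfidf_hits) or 1.0
    max_l = max(link_score.get(d, 0.0) for d, _ in tfidf_hits) or 1.0
    blended = [
        (d, (1.0 - mix) * s / max_t + mix * link_score.get(d, 0.0) / max_l)
        for d, s in tfidf_hits
    ]
    return sorted(blended, key=lambda pair: pair[1], reverse=True)
```

In this sketch a query would first be run through an ordinary Tf.Idf index, and `rerank` would then be applied to the returned hits using the link scores for whichever topics the query is judged to involve; how topics are chosen for a query is left open here.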

Original language: English
Title of host publication: AAAI Workshop - Technical Report
Pages: 47-52
Number of pages: 6
Volume: WS-05-07
Publication status: Published - 1 Dec 2005
Event: AAAI-05 Workshop - Pittsburgh, PA, United States of America
Duration: 10 Jul 2005 to 10 Jul 2005

Conference

Conference: AAAI-05 Workshop
Country: United States of America
City: Pittsburgh, PA
Period: 10/07/05 to 10/07/05
