Product centric web page segmentation and localization

John Cuzzola, Dragan Gašević, Ebrahim Bagheri

Research output: Chapter in Book/Report/Conference proceeding, Conference Paper, Research

Abstract

The Internet is home to an ever-increasing array of goods and services available to the general consumer. These products are often discovered through search engines whose focus is on document retrieval rather than product procurement. The demand for details of specific products, as opposed to just documents containing such information, has resulted in an influx of product collection databases, deal aggregation services, mobile apps, Twitter feeds and other just-in-time methods for rapidly finding and indexing sale events and notifying shoppers of them. This demand has led us to develop intelligent Web crawler technology aimed at this specific category of information retrieval. In this paper, we demonstrate our solution for Web page categorization, segmentation and localization, which identifies Web pages with shopping deals and automatically extracts specifics from the identified pages. Our work is supported by empirical data on its effectiveness. A screencast demonstration is also available online at http://youtu.be/HHPme6AJuCk.

Original language: English
Title of host publication: Proceedings of the 4th Canadian Semantic Web Symposium (CSWS 2013)
Editors: Rene Witte, Christopher J.O. Baker, Greg Butler, Michel Dumontier
Place of Publication: Aachen, Germany
Publisher: Rheinisch-Westfaelische Technische Hochschule Aachen
Pages: 29-32
Number of pages: 4
Publication status: Published - 2013
Externally published: Yes
Event: Canadian Semantic Web Symposium 2013 - Concordia University, Montreal, Canada
Duration: 10 Jul 2013 to 10 Jul 2013
Conference number: 4th
http://www.semanticsoftware.info/event/4th-canadian-semantic-web-symposium-csws-2013-montréal-canada

Conference

Conference: Canadian Semantic Web Symposium 2013
Abbreviated title: CSWS 2013
Country/Territory: Canada
City: Montreal
Period: 10/07/13 to 10/07/13
Other: The event is part of the Semantic Trilogy 2013 featuring:

International Conference on Biomedical Ontologies (ICBO 2013)
Canadian Semantic Web Symposium (CSWS 2013)
Data Integration in the Life Sciences (DILS 2013)

Keywords

  • Classification
  • Deals
  • Localization
  • Natural language processing
  • Products
  • Search
  • Segmentation
  • Web crawling
