Discovering and Learning Semantic Models of Online Sources for Information Integration
José-Luis Ambite
Bora Gazen
Craig A. Knoblock
Kristina Lerman
Thomas Russ
Abstract
Much work in Information Integration and the Semantic Web assumes that
rich semantic models of sources exist. In practice, there is a
tremendous amount of data on the Web, but it is typically hard to
find, has little or no explicit structure, and rarely comes with
any semantic description. We describe an integrated
end-to-end system that can automatically discover web sources, invoke
and extract the data from them, and build their semantic models. We
describe the challenges in integrating the component technologies into a
unified approach to discovering, extracting and modeling new online
sources. We evaluate the integrated system in three different domains
and demonstrate that it can automatically discover and model new data
sources.
In Proceedings of the IJCAI Workshop on Information Integration on the Web, 2009.
The full paper is available in PDF (6pp).