
Information Extraction from Biomedical Text

Jerry R. Hobbs
Artificial Intelligence Center
SRI International
Menlo Park, California 94025
[email protected]

Abstract:

Information extraction is the process of scanning text for information relevant to some interest, including extracting entities, relations, and events. It requires deeper analysis than keyword searches, but its aims fall short of the very hard and long-term problem of full text understanding. Information extraction represents a midpoint on this spectrum, where the aim is to capture structured information without sacrificing feasibility.

One of the key ideas in this technology is to separate processing into several stages, in cascaded finite-state transducers. The earlier stages recognize smaller linguistic objects and work in a largely domain-independent fashion. The later stages take these linguistic objects as input and find domain-dependent patterns among them.
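
The staged design can be illustrated with a toy two-stage cascade. This is only a minimal sketch and not the system described here: the word lists, the function names (chunk, extract_events, extract), and the event labels are all hypothetical, and a simple left-to-right scan stands in for a real finite-state transducer. The first stage groups tokens into noun groups and verb groups without any domain knowledge; the second stage looks for a domain-specific pattern over those groups.

```python
import re

# Stage 1 lexicon (domain-independent). These tiny word lists are a
# made-up stand-in for a real tagger and grammar.
DETERMINERS = {"the", "a", "an"}
VERBS = {"inhibits", "activates", "binds", "is", "was"}

# Stage 2 lexicon (domain-dependent): verbs that signal an interaction.
# The event labels are hypothetical.
INTERACTION_VERBS = {
    "inhibits": "inhibition",
    "activates": "activation",
    "binds": "binding",
}


def chunk(sentence):
    """Stage 1: scan tokens left to right, emitting (label, text) chunks."""
    tokens = re.findall(r"[A-Za-z0-9-]+", sentence)
    chunks, noun_buf = [], []

    def flush():
        # Close the noun group collected so far, if any.
        if noun_buf:
            chunks.append(("NG", " ".join(noun_buf)))
            noun_buf.clear()

    for tok in tokens:
        low = tok.lower()
        if low in VERBS:
            flush()
            chunks.append(("VG", low))
        elif low in DETERMINERS:
            flush()  # a determiner starts a new noun group and is dropped
        else:
            noun_buf.append(tok)
    flush()
    return chunks


def extract_events(chunks):
    """Stage 2: match the pattern NG VG NG where VG is an interaction verb."""
    events = []
    for i in range(len(chunks) - 2):
        (t1, agent), (t2, verb), (t3, target) = chunks[i:i + 3]
        if (t1, t2, t3) == ("NG", "VG", "NG") and verb in INTERACTION_VERBS:
            events.append(
                {"type": INTERACTION_VERBS[verb], "agent": agent, "target": target}
            )
    return events


def extract(sentence):
    """Run the cascade: stage 1 output is the input to stage 2."""
    return extract_events(chunk(sentence))
```

For example, extract("The protein p53 activates the p21 gene.") first produces the chunks NG "protein p53", VG "activates", NG "p21 gene", and the second stage then turns them into a single activation event. Moving to another domain in this sketch would mean replacing only the second-stage verb table and pattern.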

There are now initial efforts to apply this technology to biomedical text. In other domains, the technology plateaued at about 60% recall and precision. Even if applications to biomedical text do no better than this, they could still prove to be of immense help to curatorial activities.
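
To make the 60% figure concrete, the sketch below computes recall and precision using their standard definitions. The counts are invented purely for illustration and are not results from any system: they assume a reference key containing 10 facts and a system that extracts 10 facts, of which 6 are correct.

```python
def precision_recall(num_correct, num_extracted, num_in_key):
    """Standard definitions:
    precision = correct extractions / all extractions made by the system
    recall    = correct extractions / all facts in the reference key
    """
    precision = num_correct / num_extracted if num_extracted else 0.0
    recall = num_correct / num_in_key if num_in_key else 0.0
    return precision, recall


# Hypothetical counts, chosen only to illustrate the arithmetic.
precision, recall = precision_recall(num_correct=6, num_extracted=10, num_in_key=10)
```

With these assumed counts, both values come out to 0.6: the system finds 6 of every 10 facts present in the text, and 6 of every 10 facts it reports are right.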





Jerry Hobbs 2004-02-24