Using Machine Reading to Aid Cancer Understanding and Treatment

Friday, October 27, 2017, 11:00 am - 12:00 pm PDT
11th floor large conference room
This event is open to the public.
AI Seminar
Mihai Surdeanu
In the first part of the talk, I will describe a natural language processing (NLP) approach that captures a system-scale, mechanistic understanding of cellular processes through automated, large-scale reading of scientific literature. At the core of this approach are compact semantic grammars that capture mentions of biological entities (e.g., genes, proteins, protein families, simple chemicals), events that operate over these entities (e.g., biochemical reactions), and nested events that operate over other events (e.g., catalyses). This grammar-based approach departs from recent trends in NLP such as deep learning, but I will argue that it is a better direction for cross-disciplinary projects such as this one.
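To make the three layers described above concrete (entities, events over entities, and nested events over other events), here is a minimal Python sketch. It is only an illustration under stated assumptions: the regular expressions, the tiny `PROTEINS` lexicon, and the `Entity`/`Event`/`read` names are hypothetical stand-ins, not the grammar formalism or the system presented in the talk.

```python
import re
from dataclasses import dataclass
from typing import List, Optional

# Hypothetical toy lexicon; a real reader would rely on curated resources.
PROTEINS = {"RAF", "MEK", "ERK", "AKT"}

@dataclass(frozen=True)
class Entity:
    text: str
    label: str

@dataclass(frozen=True)
class Event:
    label: str
    controller: Optional[object]  # an Entity, or None for simple events
    theme: object                 # an Entity, or another Event when nested

# Rule for a simple event: "phosphorylation of <protein>".
PHOS = re.compile(r"phosphorylation of (?P<theme>[A-Z][A-Z0-9]+)")
# Rule for a nested event: "<protein> catalyzes the <simple event>".
CATALYSIS = re.compile(
    r"(?P<controller>[A-Z][A-Z0-9]+) catalyzes the "
    r"(?P<event>phosphorylation of [A-Z][A-Z0-9]+)"
)

def as_entity(token: str) -> Optional[Entity]:
    """Return a Protein entity if the token is in the toy lexicon."""
    return Entity(token, "Protein") if token in PROTEINS else None

def read(sentence: str) -> List[object]:
    """Extract entities, then simple events, then nested events."""
    mentions: List[object] = []
    # Layer 1: entity mentions.
    for token in re.findall(r"[A-Z][A-Z0-9]+", sentence):
        entity = as_entity(token)
        if entity is not None:
            mentions.append(entity)
    # Layer 2: simple events over entities, indexed by character span.
    events_by_span = {}
    for match in PHOS.finditer(sentence):
        theme = as_entity(match.group("theme"))
        if theme is not None:
            event = Event("Phosphorylation", None, theme)
            events_by_span[match.span()] = event
            mentions.append(event)
    # Layer 3: nested events whose theme is an event found in layer 2.
    for match in CATALYSIS.finditer(sentence):
        controller = as_entity(match.group("controller"))
        inner = events_by_span.get(match.span("event"))
        if controller is not None and inner is not None:
            mentions.append(Event("Catalysis", controller, inner))
    return mentions

for mention in read("RAF catalyzes the phosphorylation of MEK"):
    print(mention)
```

On this sentence the sketch yields two protein entities, one phosphorylation event, and one catalysis event whose theme is that phosphorylation event, which is the nesting the abstract refers to.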
I will show that the proposed approach performs machine reading at an accuracy comparable to that of human domain experts, but at much higher throughput, and, more importantly, that this automatically derived knowledge substantially improves the inference capacity of existing biological data analysis algorithms. Using this knowledge, we were able to identify a large number of previously unidentified but highly statistically significant, mutually exclusively altered signaling modules in several cancers, which led to novel biological hypotheses within the corresponding cancer context.
In the second part of this talk, I will introduce a focused reading approach that guides machine reading toward the biomedical literature that should be read to answer a biomedical query as efficiently as possible. I will introduce a family of algorithms for focused reading, including an intuitive, strong baseline and a second approach that uses a reinforcement learning (RL) framework to learn when to explore (widen the search) or exploit (narrow it). I will demonstrate that the RL approach is capable of answering more queries than the baseline while being more efficient, i.e., reading fewer documents.
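The explore/exploit trade-off above can be illustrated with a small Python sketch of a focused reading loop. Everything here is an assumption made for illustration: the toy corpus, the `focused_reading` function, and the two hand-written policies are hypothetical, and no learning takes place. In the RL setting described in the talk, the policy would be learned rather than fixed.

```python
from collections import defaultdict, deque

def reachable(graph, start):
    """Return all nodes connected to `start` in an undirected graph."""
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for neighbor in graph[node]:
            if neighbor not in seen:
                seen.add(neighbor)
                queue.append(neighbor)
    return seen

def focused_reading(corpus, source, target, policy, max_steps=10):
    """Read documents until `source` and `target` are connected.

    `corpus` maps a document id to the (entity, entity) relations that a
    machine reader is assumed to extract from it. Returns a pair
    (query_answered, number_of_documents_read).
    """
    graph = defaultdict(set)
    read = set()
    for step in range(max_steps):
        side_a = reachable(graph, source)
        if target in side_a:
            return True, len(read)
        side_b = reachable(graph, target)
        action = policy(step)
        batch = []
        for doc_id, edges in corpus.items():
            if doc_id in read:
                continue
            mentioned = {entity for edge in edges for entity in edge}
            hits_a = bool(mentioned & side_a)
            hits_b = bool(mentioned & side_b)
            # Exploit narrows the search (both sides must be mentioned);
            # explore widens it (either side is enough).
            if action == "exploit":
                wanted = hits_a and hits_b
            else:
                wanted = hits_a or hits_b
            if wanted:
                batch.append(doc_id)
        for doc_id in batch:
            read.add(doc_id)
            for u, v in corpus[doc_id]:
                graph[u].add(v)
                graph[v].add(u)
    return target in reachable(graph, source), len(read)

def always_explore(step):
    return "explore"

def exploit_first(step):
    # Hand-written stand-in for a learned policy: alternate, exploit first.
    return "exploit" if step % 2 == 0 else "explore"

corpus = {
    "d1": [("A", "X"), ("X", "B")],
    "d2": [("A", "Y")],
    "d3": [("Y", "Z")],
    "d4": [("B", "W")],
}
print(focused_reading(corpus, "A", "B", always_explore))
print(focused_reading(corpus, "A", "B", exploit_first))
```

On this toy corpus both policies answer the query, but the exploit-first policy reads one document while the always-explore policy reads three. Deciding which action to take in which state is the choice a learned policy would make.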
Mihai Surdeanu is an Associate Professor in the Computer Science department at the University of Arizona. Dr. Surdeanu earned a Ph.D. degree in Computer Science from Southern Methodist University, Dallas, TX, in 2001. He has 15+ years of experience in building systems driven by natural language processing (NLP) and machine learning. His experience spans both academia (Stanford University, University of Arizona) and industry (Yahoo! Research and two NLP-centric startups). During his career, he has published more than 80 peer-reviewed articles, including four articles that were among the top three most cited articles at their respective venues. He was a leader or member of teams that ranked in the top three at seven highly competitive international evaluations of end-user NLP systems such as question answering and information extraction. His work has been funded by several government organizations (DARPA, NIH), as well as private foundations (the Allen Institute for Artificial Intelligence, the Bill & Melinda Gates Foundation).
Dr. Surdeanu's current work focuses on using machine reading to extract structure from free text and using this structure to construct causal models that can be used to understand, explain, and predict hypotheses for precision medicine.  