Negation and speculation are complex expressive linguistic phenomena that have been extensively studied from a theoretical perspective. They modify the meaning of the phrases in their scope.
The amount of negative and speculative information in biomedical texts should not be underestimated. For example, 13.45% of sentences in the abstracts section of the BioScope corpus and 13.76% of sentences in the full papers section contain negations. The percentages of sentences with hedge cues in the abstracts and full papers sections of the BioScope corpus are 17.70% and 19.44% respectively.
In addition, professionals need efficient tools for accessing the vast databases of scientific articles and clinical information available and then analysing the text in greater depth. This analysis should include negation and speculation detection because these linguistic phenomena are used in this domain to express impressions, hypothesised explanations of experimental results, or negative findings.
This lecture is motivated by the fact that negation and speculation detection is an emerging topic that has attracted the attention of many researchers. In recent years, several challenges and shared tasks have included the extraction of these language forms.
The lecture therefore aims to define negation and speculation from a Natural Language Processing perspective, to explain the need for processing these phenomena, to summarise existing research on processing negation and speculation, to provide a list of resources and tools, and to speculate about future developments in this research area. An advantage of this lecture is that it will not only provide an overview of the state of the art in negation and speculation detection, but will also introduce newly developed data sets and scripts.
The participants will find out about a number of rule-based tools that automatically identify negation and will discover how they differ from machine-learning approaches. Then, they will annotate some biomedical documents with negation and speculation information.
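To give a flavour of what a rule-based approach looks like, here is a minimal sketch in Python. It is not one of the tools covered in the lecture: the cue list, the fixed-size scope window and the function names are all assumptions made for illustration, and real systems use much larger lexicons and more careful scope rules.

```python
import re

# Hypothetical, deliberately tiny cue list (an assumption for illustration).
NEGATION_CUES = {"no", "not", "without", "never", "cannot"}

# Assumed heuristic: the scope is the next few tokens after the cue.
SCOPE_WINDOW = 4


def tokenize(sentence):
    """Split a sentence into lowercase word tokens, keeping hyphenated words whole."""
    return re.findall(r"[a-z0-9]+(?:-[a-z0-9]+)*", sentence.lower())


def find_negations(sentence):
    """Return a list of (cue, scope_tokens) pairs, one per negation cue found."""
    tokens = tokenize(sentence)
    results = []
    for i, token in enumerate(tokens):
        if token in NEGATION_CUES:
            scope = tokens[i + 1 : i + 1 + SCOPE_WINDOW]
            results.append((token, scope))
    return results


def contains_negation(sentence):
    """True if at least one negation cue from the list occurs in the sentence."""
    return bool(find_negations(sentence))


if __name__ == "__main__":
    for cue, scope in find_negations("The patient has no evidence of pneumonia."):
        print(cue, scope)
```

A fixed window like this will often over- or under-extend the scope, which is one reason machine-learning approaches trained on annotated corpora are also studied.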
Suggested reading for students on negation and speculation detection:
- Morante, Roser, and Caroline Sporleder. "Modality and negation: An introduction to the special issue." Computational Linguistics 38.2 (2012): 223-260.
- Cruz Díaz, Noa Patricia. "Negation and speculation detection in medical and review texts." (2014).
A good introduction to NLP in general would be:
- Nadkarni, Prakash M., Lucila Ohno-Machado, and Wendy W. Chapman. "Natural language processing: an introduction." Journal of the American Medical Informatics Association 18.5 (2011): 544-551.
Noa Cruz is a postdoctoral researcher at the Group of Research and Innovation in Biomedical Informatics, Biomedical Engineering and Health Economy of the Virgen del Rocio University Hospital (Spain), where she works on processing and extracting clinical knowledge from the information provided by the Electronic Health Record Ecosystem of the Andalusian Public Health Provider, one of the largest in the world. She obtained her PhD in Computer Science at the University of Huelva (2014), and her PhD dissertation was awarded the SEPLN (Spanish Association for Natural Language Processing) 2014 award for the Best Research in Natural Language Processing. Her research experience has focused on negation and speculation detection, intelligent access systems to biomedical information, classification based on machine learning, text mining, opinion mining and sentiment analysis. She has been a visiting researcher at the Research Institute of Information and Language Processing (Wolverhampton, UK), the Research Group for Language Technology (Oslo, Norway) and the Department of Biomedical Informatics (Utah, USA). She has received several awards, including the 10C Student Award 2009, in honour of the comprehensive training and skills acquired at the University of Huelva, and a Google grant. You can find out more about Noa Cruz on her LinkedIn profile.