| |
| Contents |
| - Introduction |
| - Negation Annotator |
| - NegationAnnotator.xml |
| - updating negex patterns |
| - Status Annotator |
| |
| ############ |
| Introduction |
| ############ |
| |
| The context annotator provides a mechanism for examining the context of existing annotations, finding |
| events of interest in the context, and acting on those events in some way. The negation and status |
| annotators both take advantage of this infrastructure by examining the context of named entities |
| (e.g. disorders and findings) to see if they should be considered as negated (e.g. "no chest pain") |
| or if their status should be modified (e.g. in "history of myocardial infarction", the named entity should have status "history of"). |
| |
| In fact, the "negation annotator" is really just the context annotator configured to deal with negations. |
| |
| Similarly, the "status annotator" is the context annotator configured to identify the status of named entities. |
| |
| To better understand the context annotator code, start by reading the Javadoc for the class |
| org.apache.ctakes.necontexts.ContextAnnotator. It provides a good conceptual overview of how the code works. |
| |
| ################## |
| Negation Annotator |
| ################## |
| |
| What follows is an explanation of how negation is performed using the context annotator. |
| |
| The negation detection annotator takes a pattern-based approach (no maxent models are required or used) that uses |
| finite state machines and is loosely based on the popular NegEx algorithm introduced by Wendy Chapman. |
| |
| %%%%%%%%%%%%%%%%%%%%% |
| NegationAnnotator.xml |
| |
| We will start by examining the descriptor file desc/NegationAnnotator.xml. Open this file with the Component |
| Descriptor Editor. Select the first tab labeled "Overview" and observe that the analysis engine that is |
| specified is org.apache.ctakes.necontexts.ContextAnnotator. There is no "negation annotator" analysis |
| engine - we simply configure the ContextAnnotator for the task. Next select the tab labeled "Parameter Settings". |
| |
| We will discuss each setting in turn: |
| |
| - MaxLeftScopeSize = 10 |
| The maximum number of annotations that will make up the left-hand side context is ten. |
| Increase or decrease this parameter setting to increase or decrease the left-hand side context. |
| - MaxRightScopeSize = 10 |
| The maximum number of annotations that will make up the right-hand side context is ten. |
| Increase or decrease this parameter setting to increase or decrease the right-hand side context. |
| - ScopeOrder = LEFT, RIGHT |
| The context annotator will look for signs of negation on the left-hand side of a named entity first and then |
| the right-hand side. |
| - ContextAnalyzerClass = org.apache.ctakes.necontexts.negation.NegationContextAnalyzer |
| The context analyzer looks at the context (e.g. the 10 words on the left or right of the named entity) and |
| determines if the named entity should be negated. If it should, then the negation context analyzer will |
| generate a context hit to be consumed by the context hit consumer (see below). |
| - ContextHitConsumerClass = org.apache.ctakes.necontexts.negation.NegationContextHitConsumer |
| The context hit consumer handles context hits generated by the context analyzer. In this case, the negation |
| context hit consumer simply sets the certainty of a named entity to -1, which indicates that it has been negated. |
| - WindowAnnotationClass = edu.mayo.bmi.uima.common.type.Sentence |
| When the context annotator collects the context annotations for a named entity, it will not look beyond the |
| boundaries of the sentence that the named entity is found in. |
| - FocusAnnotationClass = edu.mayo.bmi.uima.common.type.NamedEntity |
| The negation annotator is concerned with negating named entities and thus the focus annotation type |
| (the annotations for which a context is generated and examined) specifies named entities. |
| - ContextAnnotationClass = edu.mayo.bmi.uima.common.type.BaseToken |
| The context of the named entities is a list of tokens - this is what the context analyzer is going to examine. |
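| |
| The scope settings above can be pictured with a small sketch. It is not the ContextAnnotator code: the class |
| ScopeSketch and its methods are hypothetical, tokens are plain strings, and the list passed in stands for the |
| tokens of the window (the sentence), so the scope can never extend past the sentence boundaries. |

```java
import java.util.Arrays;
import java.util.List;

public class ScopeSketch {
    // Up to maxSize tokens immediately to the left of the focus annotation.
    // focusBegin is the index of the first token of the focus.
    public static List<String> leftScope(List<String> sentence, int focusBegin, int maxSize) {
        int from = Math.max(0, focusBegin - maxSize);
        return sentence.subList(from, focusBegin);
    }

    // Up to maxSize tokens immediately to the right of the focus annotation.
    // focusEnd is the index just past the last token of the focus.
    public static List<String> rightScope(List<String> sentence, int focusEnd, int maxSize) {
        int to = Math.min(sentence.size(), focusEnd + maxSize);
        return sentence.subList(focusEnd, to);
    }

    public static void main(String[] args) {
        List<String> sentence = Arrays.asList("patient", "reports", "no", "chest", "pain", "today");
        // The focus "chest pain" covers token indexes 3 and 4, so focusBegin = 3 and focusEnd = 5.
        System.out.println(leftScope(sentence, 3, 10));  // [patient, reports, no]
        System.out.println(rightScope(sentence, 5, 10)); // [today]
        System.out.println(leftScope(sentence, 3, 2));   // [reports, no]
    }
}
```

| With ScopeOrder = LEFT, RIGHT, the left-hand list would be handed to the context analyzer first, then the right-hand list. |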
| |
| So, the work of negating a named entity is done in two steps: |
| 1) the NegationContextAnalyzer finds negations |
| 2) the NegationContextHitConsumer updates the certainty of the NamedEntities. |
| |
| The former is a fairly lightweight wrapper around another class that holds all of the negation pattern-finding |
| logic: org.apache.ctakes.core.fsm.machine.NegationFSM. If you want to change how negation patterns are matched, |
| you have to do it in that class. |
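| |
| The division of labor between analyzer and consumer can be sketched as follows. Everything here is a simplified, |
| hypothetical stand-in: the class NegationFlowSketch, its Entity type, the four-word lexicon, and the default |
| certainty of 0 are assumptions made for illustration, and a plain word lookup replaces the finite state machine. |

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class NegationFlowSketch {
    // Hypothetical stand-in for a named entity annotation.
    public static class Entity {
        public final String text;
        public int certainty = 0; // placeholder default used by this sketch only
        public Entity(String text) { this.text = text; }
    }

    // Toy negation lexicon; the real patterns are hard-coded in NegationFSM.
    private static final Set<String> NEGATION_WORDS =
            new HashSet<>(Arrays.asList("no", "not", "denies", "without"));

    // Analyzer role: report whether the context contains a negation cue (a "context hit").
    public static boolean analyze(List<String> contextTokens) {
        for (String token : contextTokens) {
            if (NEGATION_WORDS.contains(token.toLowerCase())) {
                return true;
            }
        }
        return false;
    }

    // Consumer role: mark the entity as negated by setting its certainty to -1.
    public static void consumeHit(Entity entity) {
        entity.certainty = -1;
    }

    // Glue: run the analyzer on the context and pass any hit to the consumer.
    public static Entity annotate(String entityText, List<String> contextTokens) {
        Entity entity = new Entity(entityText);
        if (analyze(contextTokens)) {
            consumeHit(entity);
        }
        return entity;
    }

    public static void main(String[] args) {
        Entity negated = annotate("chest pain", Arrays.asList("patient", "reports", "no"));
        Entity plain = annotate("chest pain", Arrays.asList("patient", "reports"));
        System.out.println(negated.certainty); // -1
        System.out.println(plain.certainty);   // 0
    }
}
```

| Keeping the two roles separate is what lets the same infrastructure be reconfigured for other tasks by swapping in a different analyzer and consumer. |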
| |
| %%%%%%%%%%%%%%%%%%%%%%% |
| updating negex patterns |
| |
| Updating the negation detection patterns involves either 1) trial-and-error experimentation or 2) understanding how |
| the NegationFSM works. The rules, patterns, and words that identify negation are hard-coded into the class |
| org.apache.ctakes.core.fsm.machine.NegationFSM, which is found in the core project. I would suggest starting off with the |
| trial-and-error approach. For example, if you wanted to add "impossible" to the lexicon of negation words, |
| you could try adding it to the _negAdjectivesSet and test the behavior. |
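| |
| A trial-and-error check of this kind might look like the sketch below. It does not touch NegationFSM: the class |
| LexiconTrialSketch, the method hasNegationCue, and the starting lexicon entry are hypothetical, and a plain set |
| lookup stands in for the state machine. The point is only the before/after comparison on a fixed test phrase. |

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class LexiconTrialSketch {
    // True if any context token appears in the given lexicon (far simpler than the real FSM).
    public static boolean hasNegationCue(Set<String> lexicon, List<String> contextTokens) {
        for (String token : contextTokens) {
            if (lexicon.contains(token.toLowerCase())) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Hypothetical stand-in for the adjective lexicon; the starting entry is illustrative only.
        Set<String> negAdjectives = new HashSet<>(Arrays.asList("unable"));
        // Right-hand context of the entity in "pneumonia is impossible".
        List<String> context = Arrays.asList("is", "impossible");

        System.out.println(hasNegationCue(negAdjectives, context)); // false
        negAdjectives.add("impossible");
        System.out.println(hasNegationCue(negAdjectives, context)); // true
    }
}
```

| In practice the same comparison would be made by running the negation annotator on a few test sentences before and after the change. |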
| |
| |
| ################ |
| Status Annotator |
| ################ |
| |
| The way the status annotator works mirrors very closely how the negation annotator works. |
| You are encouraged to read the above section, examine the parameter settings given for desc/StatusAnnotator.xml, |
| and look at org.apache.ctakes.core.fsm.machine.StatusIndicatorFSM. |
| |
| |