| Contents |
| - Introduction |
| - Running the Drug NER pipeline |
| - AggregateTAE.xml for CDA documents conforming to the provided DTD. |
| - AggregatePlaintextProcessor.xml for plaintext documents. |
| - Fixes |
| - Version 1.2.1 |
| |
| ############ |
| Introduction |
| ############ |
| |
| This project adds the ability to identify attributes of drug mentions such as Dosage, Frequency, Frequency Unit, |
| Route and Strength from either plaintext or CDA documents. It also provides the ability to specify which sections |
| of a note contain drugs in a list format versus drug mentions within the narrative of the note. This allows for |
| customized processing done on different sections and generally improves the quality of the annotations. |
| This project utilizes various cTAKES components and hence requires cTAKES to be installed prior to using this component. |
| |
| |
| ############################################################################ |
| Running the ctakes-clinical-pipeline |
| ############################################################################ |
| |
| This project (or PEAR file), the "Drug NER", relies on other projects/PEAR files such as |
| 'ctakes-clinical-pipeline', 'context dependent tokenizer', 'core', 'dictionary lookup', |
| 'LVG' and 'NE contexts'. |
| |
| |
| The pipeline can process two types of documents: |
| - plaintext files |
| - Clinical Document Architecture (CDA) XML files that conform to the DTD provided |
| |
| |
| %%%%%%%%%%%%%%%%%%%%%%%%% |
| AggregateTAE.xml for CDA documents conforming to the provided DTD. |
| |
| The file desc/analysis_engine/AggregateTAE.xml is the aggregate |
| analysis engine to use to run the entire pipeline, including the |
| CdaCasInitializer analysis engine, which reads CDA documents that conform |
| to the DTD provided and creates Segment annotations based on the sections |
| within the CDA document. |
| |
| Open this file using the Component Descriptor Editor as described in the tutorial. |
| Click on the tab labeled "Aggregate" to observe that the Component Engine Flow (pipeline) |
| defined by this descriptor includes CdaCasInitializer as the first component. |
| Observe that part of speech tagging (POSTagger) comes before chunking (Chunker), etc. |
| |
| Click on the tab labeled "Parameter Settings" to view the parameters set in this |
| descriptor. In the default implementation, 'medicationRelatedSection' is *not* set |
| (for the Mayo Corpus it is generally set to 20104, 20133, 20147). If this parameter is |
| left blank, all sections are treated as narrative sections; if any of those sections |
| contain drugs in list format, the accuracy of identifying drug mentions and their |
| attributes may not be acceptable. It is recommended to specify the ids of sections |
| that contain drugs in list format if such sections are available. |
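| |
| As a rough illustration, the list-format section ids could be entered in the |
| configurationParameterSettings block of DrugMentionAnnotator.xml along the lines below. |
| This is a sketch, not the shipped descriptor: it assumes 'medicationRelatedSection' is |
| declared as a multi-valued String parameter, and the ids are the Mayo-specific values |
| mentioned above. Substitute the ids produced by your own Segment Annotator. |
| |
| ```xml |
| <!-- Sketch only: assumes medicationRelatedSection is a multi-valued String parameter --> |
| <nameValuePair> |
|   <name>medicationRelatedSection</name> |
|   <value> |
|     <array> |
|       <!-- Mayo Corpus section ids; replace with your own --> |
|       <string>20104</string> |
|       <string>20133</string> |
|       <string>20147</string> |
|     </array> |
|   </value> |
| </nameValuePair> |
| ``` |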
| |
| Another parameter related to 'medicationRelatedSection' is 'sectionOverrideSet'. |
| This parameter specifies the section ids for which the DrugLookupWindow annotation spans |
| the complete text of the section. In the default implementation, 'sectionOverrideSet' is |
| also *not* set (for the Mayo Corpus it is generally set to 20104, 20133, 20147). The same |
| caveat applies: if this parameter is left blank, all sections are treated as narrative |
| sections, and if any of those sections contain drugs in list format, the accuracy of |
| identifying drug mentions and their attributes may not be acceptable. It is recommended |
| to specify the ids of sections that contain drugs in list format if such sections are available. |
| |
| If you are not planning to use CDA documents as input, but rather plain text documents, and you |
| prefer that the entire document's contents be handled as lists rather than narrative, then |
| 'SIMPLE_SEGMENT' can be entered into the 'medicationRelatedSection' or 'sectionOverrideSet' |
| parameter (see the tutorial for additional information on adding the 'SIMPLE_SEGMENT' to the |
| Component Engine Flow (pipeline)). |
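| |
| For plaintext input handled entirely as list format, the setting might look like the |
| following sketch. It assumes 'sectionOverrideSet' is a multi-valued String parameter in |
| DrugCNP2LookupWindow.xml and that the single Segment annotation carries the id |
| 'SIMPLE_SEGMENT'. An analogous entry would apply to 'medicationRelatedSection'. |
| |
| ```xml |
| <!-- Sketch only: treats the whole plaintext document as one list-format section --> |
| <nameValuePair> |
|   <name>sectionOverrideSet</name> |
|   <value> |
|     <array> |
|       <!-- Assumed to match the id of the document-wide Segment annotation --> |
|       <string>SIMPLE_SEGMENT</string> |
|     </array> |
|   </value> |
| </nameValuePair> |
| ``` |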
| |
| The parameters are: |
| DrugMentionAnnotator.xml |
| - medicationRelatedSection - IDs of sections generated by your Segment Annotator where |
| drug mentions appear in a list format. |
| |
| DrugCNP2LookupWindow.xml |
| - sectionOverrideSet - IDs of sections (or segments) where the complete section will be treated |
| as a DrugLookupWindow, which is designed to process medications or drugs in |
| 'list format'. |
| |
| %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%% |
| AggregatePlaintextProcessor.xml for plaintext documents. |
| |
| The file desc/analysis_engine/AggregatePlaintextProcessor.xml is the aggregate |
| analysis engine to use to run the entire pipeline, including the |
| SimpleSegmentAnnotator analysis engine, which creates a Segment annotation that |
| wraps the entire plaintext document. Other annotators in the pipeline require |
| at least one Segment annotation. |
| |
| Click on the tab labeled "Parameter Settings" to view the parameters set in this |
| descriptor. The 'medicationRelatedSection' is set to 20104, 20133, 20147. These |
| are section ids specific to Mayo's CDA documents. These section ids must be changed |
| to match the ids generated by your Segment Annotator. |
| |
| The parameters are: |
| - SegmentID - the identifier or name to assign to the Segment annotation |
| - medicationRelatedSection - IDs of sections generated by your Segment Annotator where |
| drug mentions appear in a list format. |
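| |
| The relationship between the two parameters can be sketched as follows. This is an |
| illustration under assumptions, not the shipped descriptor: it assumes 'SegmentID' is a |
| single-valued String parameter and 'medicationRelatedSection' a multi-valued one, and |
| the id 'SIMPLE_SEGMENT' is used only as an example value. The point is that the id |
| assigned to the Segment annotation is the id to list as a medication-related section |
| when the whole document should be processed as list format. |
| |
| ```xml |
| <!-- Sketch only: id assigned to the single Segment wrapping the plaintext document --> |
| <nameValuePair> |
|   <name>SegmentID</name> |
|   <value> |
|     <string>SIMPLE_SEGMENT</string> |
|   </value> |
| </nameValuePair> |
| |
| <!-- Sketch only: the same id listed as a list-format (medication) section --> |
| <nameValuePair> |
|   <name>medicationRelatedSection</name> |
|   <value> |
|     <array> |
|       <string>SIMPLE_SEGMENT</string> |
|     </array> |
|   </value> |
| </nameValuePair> |
| ``` |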
| |
| ############ |
| Fixes |
| ############ |
| |
| Version 1.2.1 |
| - Fix problem where drug mentions were not aligned correctly with the named entity parent. |
| - Fix issue where the drug change statuses increase/decrease/change did not correctly create a noChange mention and/or were assigned incorrect attributes. |