Named Entity Recognition, Extraction, and Linking in German Legal Contracts
The continuously growing volume of digitized legal documents contains semantic knowledge that is highly relevant to their readers. Because these documents are mostly available only as unstructured data, computer systems cannot readily process that knowledge. We address this business need by implementing a software component that enables semantic analysis and structuring of legal contracts. To this end, different approaches to Named Entity Recognition are incorporated into an Apache UIMA pipeline. An evaluation of the developed system on German legal data demonstrates the applicability of these approaches.
Table of contents
- 1. Introduction
- 2. Related Work
- 3. Conceptual Overview
- 4. Recognizing Named Entities and Linking towards Semantic Models
- 4.1. Named Entity Recognition Pipelines
- 4.1.1. GermaNER
- 4.1.2. DBpedia Spotlight
- 4.1.3. Templated Named Entity Recognition
- 4.2. Named Entity Disambiguation
- 5. Evaluation
- 5.1. Evaluation Method
- 5.2. Data Set
- 5.3. Assessment
- 5.3.1. Which implemented NER pipeline performs best?
- 5.3.2. Which NE type is recognized best?
- 5.3.3. Which NE type is recognized worst?
- 6. Conclusion and Outlook
- 6.1. Conclusion
- 6.2. Limitations and Future Work
- 7. References