MDParser - a tool for dependency parsing in SMILA 0.7

MDParser (multilingual dependency parser) is a data-driven system that can parse text in any language for which training data is available. The parser can create both unlabeled and labeled dependency structures. The number of possible relation types depends on the granularity of the training data. It currently supports two "standard" output formats, Stanford and CoNLL-X, and computes dependency relations for German and English.
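To make the CoNLL-X output format and the labeled/unlabeled distinction concrete, here is a minimal Python sketch that serializes one parsed sentence into the ten tab-separated CoNLL-X columns. It is not MDParser's own code or API: the `Token` type, the `to_conllx` function, and the example tags and relation labels are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Optional


class Token(NamedTuple):
    """Hypothetical container for one parsed token (not an MDParser type)."""
    form: str
    postag: str
    head: int               # 1-based index of the head token, 0 for the root
    deprel: Optional[str]   # relation label, or None for an unlabeled structure


def to_conllx(sentence: List[Token]) -> str:
    """Serialize one sentence as CoNLL-X: ten tab-separated columns per token."""
    lines = []
    for i, tok in enumerate(sentence, start=1):
        columns = [
            str(i),             # ID
            tok.form,           # FORM
            "_",                # LEMMA (not provided in this sketch)
            tok.postag,         # CPOSTAG
            tok.postag,         # POSTAG
            "_",                # FEATS
            str(tok.head),      # HEAD
            tok.deprel or "_",  # DEPREL ("_" when the structure is unlabeled)
            "_",                # PHEAD
            "_",                # PDEPREL
        ]
        lines.append("\t".join(columns))
    # A blank line terminates the sentence.
    return "\n".join(lines) + "\n\n"


# Illustrative input; the tags and labels are assumed, not MDParser output.
example = [
    Token("MDParser", "NNP", 2, "nsubj"),
    Token("parses", "VBZ", 0, "root"),
    Token("text", "NN", 2, "dobj"),
]
print(to_conllx(example), end="")
```

Passing `None` as `deprel` leaves the DEPREL column as `_`, which corresponds to an unlabeled dependency structure in which only the HEAD column carries information.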

The system's models are based on various features extracted from the words of the sentence, including word forms and part-of-speech tags. To process previously unannotated text, MDParser therefore also includes some preprocessing components:

a sentence splitter, since the parser constructs a dependency structure for individual sentences
a tokenizer, in order to recognise the elements between which the dependency relations will be built
a part-of-speech tagger, in order to determine the part-of-speech tags, which are among the most important factors in constructing the dependency structure.
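The order of these preprocessing steps can be sketched as a small pipeline. This is a toy illustration only, not MDParser's implementation: the regular expressions, the `TOY_LEXICON` lookup table, and all function names are assumptions, and a real tagger would use a trained model rather than a dictionary.

```python
import re
from typing import Dict, List, Tuple


def split_sentences(text: str) -> List[str]:
    # Naive rule: a sentence ends at '.', '!' or '?' followed by whitespace.
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence: str) -> List[str]:
    # A token is a run of word characters or a single punctuation mark.
    return re.findall(r"\w+|[^\w\s]", sentence)


# Hypothetical mini-lexicon; unknown words fall back to the tag "NN".
TOY_LEXICON: Dict[str, str] = {
    "the": "DT", "parser": "NN", "is": "VBZ", "fast": "JJ", ".": ".",
}


def tag(tokens: List[str]) -> List[Tuple[str, str]]:
    # Lookup tagger standing in for a trained part-of-speech tagger.
    return [(t, TOY_LEXICON.get(t.lower(), "NN")) for t in tokens]


def preprocess(text: str) -> List[List[Tuple[str, str]]]:
    # Sentence splitting, then tokenization, then tagging, per sentence.
    return [tag(tokenize(s)) for s in split_sentences(text)]


for sentence in preprocess("The parser is fast. It works!"):
    print(sentence)
```

Each inner list holds the (word form, tag) pairs for one sentence, which is the kind of input a dependency parser would then consume sentence by sentence.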

MDParser is an especially fast system (about 10 sentences per second), which makes it particularly suitable for processing very large amounts of data. It can therefore be used as part of larger applications that need dependency structures. MDParser has already been tested on several languages, including German and English. It currently achieves quite competitive results (86% to 88%), considering that it is based on a fast linear classification approach and a deterministic parsing strategy.

Categories: EclipseRT Target Platform Components

Tags: SMILA, dependency parsing, linguistic analysis, dependency relations

Additional Details

Organization Name: DFKI GmbH

Development Status: Production/Stable

Date Created: Wednesday, June 8, 2011 - 11:40

License: Commercial

Date Updated: Tuesday, June 14, 2011 - 06:03

Submitted by: Bogdan Sacaleanu

Date	Ranking	Clickthroughs
May 2024	0/0	2
April 2024	0/0	10
March 2024	0/0	11
February 2024	0/0	5
January 2024	0/0	6
December 2023	0/0	9
November 2023	0/0	14
October 2023	0/0	10
September 2023	0/0	9
August 2023	0/0	4
July 2023	0/0	3
June 2023	0/0	2
