The LangID tool automatically identifies the language of any text provided as input. It is based on an n-gram approach to language identification and is therefore very fast. It can distinguish among 26 languages:
Catalan, Croatian, Czech, Danish, Dutch, English, Esperanto, Estonian, Finnish, French, German, Hungarian, Icelandic, Indonesian, Italian, Latvian, Lithuanian, Malay, Norwegian, Portuguese, Romanian, Serbian, Slovak, Slovenian, Spanish, Swedish
The Language Identifier can also learn profiles for new languages from a collection of language-specific documents. It can detect the language of a document based on either its first 30 words or its whole content. Detection precision lies between 98% and 99.5%, depending on the profile size. The latency of the component is about 6 ms when only the first 30 words of a document are considered.
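The listing does not describe how LangID works internally, so the following is only a minimal sketch of one common n-gram technique (rank-ordered character n-gram profiles compared with an "out-of-place" distance), not the tool's actual implementation. All names here (`build_profile`, `detect`, `SAMPLES`), the profile size of 300 and the tiny training texts are assumptions made for illustration; only the options of learning a profile from sample documents and of looking at the first 30 words come from the description above.

```python
from collections import Counter


def ngrams(text, n_max=3):
    """Yield character n-grams (length 1..n_max) of each word, padded with '_'."""
    for word in text.lower().split():
        padded = f"_{word}_"
        for n in range(1, n_max + 1):
            for i in range(len(padded) - n + 1):
                yield padded[i:i + n]


def build_profile(text, size=300):
    """Learn a profile: the `size` most frequent n-grams mapped to their rank."""
    counts = Counter(ngrams(text))
    # Most frequent first; ties are broken alphabetically so ranks are stable.
    ranked = sorted(counts, key=lambda g: (-counts[g], g))[:size]
    return {gram: rank for rank, gram in enumerate(ranked)}


def out_of_place(doc_profile, lang_profile, max_penalty):
    """Sum of rank differences; n-grams missing from the language cost max_penalty."""
    return sum(
        abs(rank - lang_profile[gram]) if gram in lang_profile else max_penalty
        for gram, rank in doc_profile.items()
    )


def detect(text, profiles, max_words=30, size=300):
    """Return the language whose profile is closest to the text.

    max_words=30 looks only at the first 30 words; max_words=None uses
    the whole content.
    """
    if max_words is not None:
        text = " ".join(text.split()[:max_words])
    doc_profile = build_profile(text, size)
    return min(profiles, key=lambda lang: out_of_place(doc_profile, profiles[lang], size))


# Hypothetical, deliberately tiny training texts; real profiles would be
# learned from a much larger collection of language-specific documents.
SAMPLES = {
    "en": "the quick brown fox jumps over the lazy dog and this is a short "
          "sample of english text that is used for the language profile",
    "de": "der schnelle braune fuchs springt über den faulen hund und dies ist "
          "ein kurzes beispiel für deutschen text der für das sprachprofil benutzt wird",
}
profiles = {lang: build_profile(text) for lang, text in SAMPLES.items()}

print(detect("this is a short text in the english language", profiles))
```

With training data this small the result is only indicative. The sketch also shows why the profile size matters for precision: a larger `size` keeps more of the rarer n-grams that separate closely related languages, at the cost of a larger profile to compare against.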
Categories: EclipseRT Target Platform Components
Tags: SMILA, language identification
Additional Details
Organization Name: DFKI GmbH
Development Status: Production/Stable
Date Created: Wednesday, June 8, 2011 - 11:33
License: Commercial
Date Updated: Tuesday, January 10, 2012 - 05:21
Submitted by: Bogdan Sacaleanu
Date | Ranking | Installs | Clickthroughs |
---|---|---|---|
September 2024 | 0/0 | 0 | 3 |
August 2024 | 0/0 | 0 | 18 |
July 2024 | 0/0 | 0 | 19 |
June 2024 | 0/0 | 0 | 15 |
May 2024 | 0/0 | 0 | 9 |
April 2024 | 0/0 | 0 | 12 |
March 2024 | 0/0 | 0 | 9 |
February 2024 | 0/0 | 0 | 7 |
January 2024 | 0/0 | 0 | 16 |
December 2023 | 0/0 | 0 | 4 |
November 2023 | 0/0 | 0 | 7 |
October 2023 | 0/0 | 0 | 10 |
Reviews
Thanks, how do I download it?
Submitted by Hugo Alberto M… on Fri, 11/30/2012 - 09:49