Corpora Processing Software
corpora, processing, indexing
A set of programs for processing large text corpora. The programs convert data from HTML to vertical text, annotate it at several levels, and index it in MG4J and Elastic.
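As an illustration of the first step only, here is a minimal Python sketch of an HTML to vertical text conversion. It is not taken from these programs: the function name `html_to_vertical`, the tokenization rule, and the choice to keep only `<p>` as a structural tag are all assumptions. "Vertical text" is taken here to mean one token per line, with structural markup on lines of its own.

```python
import re
from html.parser import HTMLParser

# Assumed tokenization: runs of word characters, or single punctuation marks.
TOKEN_RE = re.compile(r"\w+|[^\w\s]")


class VerticalConverter(HTMLParser):
    """Collects one token per line; <p> boundaries are kept as structural tags."""

    def __init__(self):
        super().__init__()
        self.lines = []
        self._skip = 0  # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "p":
            self.lines.append("<p>")

    def handle_endtag(self, tag):
        if tag in ("script", "style"):
            self._skip = max(0, self._skip - 1)
        elif tag == "p":
            self.lines.append("</p>")

    def handle_data(self, data):
        # Ignore script and style bodies; tokenize everything else.
        if self._skip:
            return
        self.lines.extend(TOKEN_RE.findall(data))


def html_to_vertical(html):
    """Hypothetical helper: return the vertical-text form of an HTML string."""
    parser = VerticalConverter()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.lines)


if __name__ == "__main__":
    print(html_to_vertical("<p>Hello, world!</p>"))
```

Later annotation levels would typically add tab-separated columns (for example lemma or part-of-speech tag) to each token line; that layout is likewise an assumption and is not stated in this description.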
Distributed under the Apache License, Version 2.0: http://www.apache.org/licenses/LICENSE-2.0.txt