*** TrainRI ***
Usage: TrainRI ()
: file or folder to recursively Random Index files in
: filename (without extension) to save the RandomIndex to
: properties file to read settings from (optional)
Usage: moj.lang.FrequencyList <file> (<minimum token length>)
<file> : file to build frequency counts on
<minimum token length> : minimum length for a token for it to be counted
Usage: moj.lang.se.DecompounderConnection <FILE|WORD|DEMO> <file|word>
<FILE|WORD|DEMO> : decompound the words in a file, decompound a given word, or print demo output
<file|word> : file to decompound the words in, or word to decompound
Usage: moj.lang.se.GranskaConnection <TOKENIZE|LEMMATIZE|LEMMATAG|POSTAG|PARSE|INFLECT|DEMO> <file|word> (<word class>)
<TOKENIZE|LEMMATIZE|LEMMATAG|POSTAG|PARSE|INFLECT|DEMO> : keywords denoting desired function/output, or demo output
<file|word> : file to tag words in, or word to inflect
<word class> : inflection paradigm (if omitted, forms are generated for all PoS)
Usage: moj.lang.StandardDeviationList <file> (<minimum token length>)
<file> : file to build standard deviations on
<minimum token length> : minimum length for a token for it to be counted
Usage: moj.lang.StopList <file> (<properties>)
<file> : file to remove stopwords from
(<properties>) : properties file
The properties file can contain any of the following items:
stoplist_file = <stopword file>
shortest_word = <length in characters>
longest_word = <length in characters>
minimum_words_per_file = <length in words>
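The four keys above follow the standard Java properties syntax, so they can be read with java.util.Properties. The sketch below is only an illustration of that: the class name StopListSettings, the fallback values, and the sample values (stopwords.txt, 3, 25, 50) are assumptions, not the tool's actual loader or defaults.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for the four documented settings.
class StopListSettings {
    final String stoplistFile;
    final int shortestWord;
    final int longestWord;
    final int minimumWordsPerFile;

    StopListSettings(Properties p) {
        // Fallbacks are placeholders chosen for this sketch,
        // not the defaults used by moj.lang.StopList.
        stoplistFile = p.getProperty("stoplist_file", "").trim();
        shortestWord = intProp(p, "shortest_word", 1);
        longestWord = intProp(p, "longest_word", Integer.MAX_VALUE);
        minimumWordsPerFile = intProp(p, "minimum_words_per_file", 0);
    }

    // Reads an integer property, using the fallback when the key is absent.
    private static int intProp(Properties p, String key, int fallback) {
        String raw = p.getProperty(key);
        return raw == null ? fallback : Integer.parseInt(raw.trim());
    }

    // Parses properties-file text; a real tool would read from a file instead.
    static StopListSettings parse(String text) {
        Properties p = new Properties();
        try {
            p.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new StopListSettings(p);
    }

    public static void main(String[] args) {
        // Illustrative values only.
        String example = String.join("\n",
                "stoplist_file = stopwords.txt",
                "shortest_word = 3",
                "longest_word = 25",
                "minimum_words_per_file = 50");
        StopListSettings s = parse(example);
        System.out.println(s.stoplistFile + " " + s.shortestWord + " "
                + s.longestWord + " " + s.minimumWordsPerFile);
    }
}
```

Because every key is optional ("any of the following items"), the sketch treats a missing key as a fallback rather than an error.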
Calculates the weighting based on the distance to the current label (the focus word), as follows:
weight = 1 / (distance to focus word)
Content words are additionally given a higher weight.
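A minimal sketch of this weighting scheme, assuming distance is measured in token positions and that the extra weight for content words is a simple multiplier. The class name DistanceWeighting, the method signature, and the factor 2.0 are hypothetical; the source does not say how much higher the content-word weight is or how content words are detected.

```java
// Hypothetical illustration of inverse-distance weighting with a content-word boost.
class DistanceWeighting {
    // Assumed multiplier for content words; the real value is not given in the source.
    static final double CONTENT_WORD_BOOST = 2.0;

    // Returns 1/distance, multiplied by the boost when the context word is a content word.
    static double weight(int focusIndex, int contextIndex, boolean isContentWord) {
        int distance = Math.abs(contextIndex - focusIndex);
        if (distance == 0) {
            throw new IllegalArgumentException("context word must differ from the focus word");
        }
        double w = 1.0 / distance;
        return isContentWord ? w * CONTENT_WORD_BOOST : w;
    }

    public static void main(String[] args) {
        String[] tokens = {"the", "quick", "fox", "jumps", "over"};
        // Content-word flags are supplied by hand here; a real system
        // would use a stoplist or a PoS tagger to decide.
        boolean[] content = {false, true, true, true, false};
        int focus = 2; // "fox"
        for (int i = 0; i < tokens.length; i++) {
            if (i == focus) continue;
            System.out.println(tokens[i] + " -> " + weight(focus, i, content[i]));
        }
    }
}
```

With this choice, a content word two positions away gets the same weight as a function word directly adjacent to the focus word, which shows how strongly the assumed factor affects the result.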