java.lang.Object
  moj.ri.RandomLabel
public class RandomLabel
A RandomLabel consists of a word (or term), a term frequency count, a document frequency count, a randomly initialised label and a contextually updated (weighted window) context vector. This container is used by RandomIndex to store the index terms and their corresponding frequency and context data.
Constructor Summary | |
---|---|
RandomLabel()
Creates an empty RandomLabel with a dimensionality of 0 (zero). |
|
RandomLabel(java.lang.String word)
Defaults the dimensionality to 1800 and the randomness to 8, i.e. |
|
RandomLabel(java.lang.String word,
int dimensionality,
int randomDegree,
int seed)
To construct a new RandomLabel object we need the word that is to be 'labelled', the length of the label (i.e. |
|
RandomLabel(java.lang.String word,
java.lang.String metadata,
long termFrequency,
int docFrequency,
int[] positivePositions,
int[] negativePositions,
float[] context)
Construct new RandomLabel from existing data. |
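The four-argument constructor describes a label of length dimensionality whose randomDegree non-zero entries are kept as positive and negative positions and are reproducible for a given word and seed. A minimal sketch of how such a sparse label could be generated follows. The class name SparseLabelSketch, the even split between +1 and -1 entries, and the way the seed is mixed with the word's hash code are assumptions made for illustration; none of them is confirmed by this page.

```java
import java.util.Arrays;
import java.util.Random;

// Hypothetical stand-in for illustration; not the moj.ri.RandomLabel implementation.
final class SparseLabelSketch {
    final int dimensionality;
    final int[] positivePositions; // indices that hold +1
    final int[] negativePositions; // indices that hold -1

    SparseLabelSketch(String word, int dimensionality, int randomDegree, int seed) {
        this.dimensionality = dimensionality;

        // Clamp to the dimensionality and round down to an even count (assumption).
        int nonZeros = Math.min(randomDegree, dimensionality);
        nonZeros -= nonZeros % 2;

        // Assumed seeding: mix the caller's seed with the word's hash so that the
        // same (word, seed) pair always yields the same label.
        Random rng = new Random(((long) seed << 32) ^ word.hashCode());

        // Partial Fisher-Yates shuffle: choose nonZeros distinct positions.
        int[] positions = new int[dimensionality];
        for (int i = 0; i < dimensionality; i++) {
            positions[i] = i;
        }
        for (int i = 0; i < nonZeros; i++) {
            int j = i + rng.nextInt(dimensionality - i);
            int tmp = positions[i];
            positions[i] = positions[j];
            positions[j] = tmp;
        }

        // Assumed split: first half of the chosen positions is +1, second half is -1.
        positivePositions = Arrays.copyOfRange(positions, 0, nonZeros / 2);
        negativePositions = Arrays.copyOfRange(positions, nonZeros / 2, nonZeros);
    }

    // Expand the sparse form into a dense vector of -1, 0 and +1.
    float[] toDense() {
        float[] dense = new float[dimensionality];
        for (int p : positivePositions) {
            dense[p] = 1f;
        }
        for (int p : negativePositions) {
            dense[p] = -1f;
        }
        return dense;
    }
}
```

With the defaults mentioned for the single-argument constructor (dimensionality 1800, randomness 8), this sketch would produce four positive and four negative positions.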
Method Summary | |
---|---|
int |
compareTo(java.lang.Object label)
Compares this RandomLabel with the specified RandomLabel for order on the basis of term frequency. |
float |
cosineSim(RandomLabel label)
Calculate the cosine similarity between this RandomLabel and the given RandomLabel. |
float[] |
getContext()
Get the context vector associated to the word. |
int |
getDimensionality()
Get the dimensionality of the random label associated to the word. |
int |
getDocumentFrequency()
Get the document frequency for the current word. |
java.lang.String |
getMetaData()
Get String based meta data for the RandomLabel. |
int[] |
getNegativePositions()
Get the negative positions in the random label associated to the word. |
int[] |
getPositivePositions()
Get the positive positions in the random label associated to the word. |
long |
getTermFrequency()
Get the term frequency for the current word. |
java.lang.String |
getWord()
Get the word which the RandomLabel is associated to. |
int |
incrementDocumentFrequency()
Increment the document frequency for the current word. |
long |
incrementTermFrequency()
Increment the term frequency for the current word. |
void |
prune()
Prunes the RandomLabel by setting the context vector to null. |
java.lang.String |
setMetaData(java.lang.String metadata)
Set String based meta data for the RandomLabel. |
java.lang.String |
toString()
String representation of the RandomLabel. |
boolean |
updateContext(RandomLabel[] leftContext,
RandomLabel[] rightContext,
WeightingScheme weightingScheme)
Update this RandomLabel's context with the weighted labels of the RandomLabels in left and right context using the supplied weighting scheme (as defined by a visiting object weightingScheme). |
boolean |
validState()
Tells whether the RandomLabel is in a valid state or not. |
Methods inherited from class java.lang.Object |
---|
equals, getClass, hashCode, notify, notifyAll, wait, wait, wait |
Constructor Detail |
---|
public RandomLabel(java.lang.String word, int dimensionality, int randomDegree, int seed)
Parameters:
word - the word which the label is to be associated to.
dimensionality - the length of the label to be associated to the word.
randomDegree - the 'degree' of randomness initially applied to the label, given as the total number of non-zeros in the initial label (even number). Should not be greater than dimensionality; if it is, it defaults to dimensionality.
seed - a seed for the local random generator. This seed, in combination with word, makes it very likely that the created random label is "unique" yet reproducible.

public RandomLabel(java.lang.String word)
Parameters:
word - the word which the label is to be associated to.

public RandomLabel()
Creates an empty RandomLabel with a dimensionality of 0 (zero).

public RandomLabel(java.lang.String word, java.lang.String metadata, long termFrequency, int docFrequency, int[] positivePositions, int[] negativePositions, float[] context)
Construct new RandomLabel from existing data.
Parameters:
word - the word which the label is to be associated to.
termFrequency - the term frequency for the associated word.
docFrequency - the document frequency for the associated word.
positivePositions - the positions in the RandomLabel that should hold "1".
negativePositions - the positions in the RandomLabel that should hold "-1".
context - the context vector to be associated to the word (i.e. an array representing the co-occurrence "coloring").

Method Detail |
---|
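public java.lang.String getWord()
Get the word which the RandomLabel is associated to.

public java.lang.String getMetaData()
Get String based meta data for the RandomLabel. This can for example be used to store part of speech tags or distributional features.
Returns:
String based meta data for the RandomLabel.

public java.lang.String setMetaData(java.lang.String metadata)
Set String based meta data for the RandomLabel. This can for example be used to store part of speech tags or distributional features.
Returns:
String based meta data for the RandomLabel.

public int[] getNegativePositions()
Get the negative positions in the random label associated to the word.

public int[] getPositivePositions()
Get the positive positions in the random label associated to the word.

public float[] getContext()
Get the context vector associated to the word. Returns null if the label has been pruned.

public void prune()
Prunes the RandomLabel by setting the context vector to null.
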
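public int getDimensionality()
Get the dimensionality of the random label associated to the word.

public long getTermFrequency()
Get the term frequency for the current word.

public long incrementTermFrequency()
Increment the term frequency for the current word.

public int getDocumentFrequency()
Get the document frequency for the current word.

public int incrementDocumentFrequency()
Increment the document frequency for the current word.

public boolean validState()
Tells whether the RandomLabel is in a valid state or not.
Returns:
true if in a valid state, otherwise false.
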
public boolean updateContext(RandomLabel[] leftContext, RandomLabel[] rightContext, WeightingScheme weightingScheme)
Update this RandomLabel's context with the weighted labels of the RandomLabels in left and right context using the supplied weighting scheme (as defined by a visiting object weightingScheme). The RandomLabel of the word nearest to this one is the first element of each context array (the context window), the second nearest is the second element, and so on (note: this applies to both rightContext and leftContext).
All RandomLabels in leftContext and rightContext must have the same dimensionality. However, leftContext and rightContext themselves (i.e. the context window) do not have to be of equal length (i.e. you can have an unbalanced context window).
Parameters:
leftContext - an array of RandomLabels where the first element represents the word closest to the left of the word whose label is being updated, the second element represents the word second closest to the left, and so on. No slot may be empty (null) as this will cause a NullPointerException.
rightContext - same as for leftContext but for the right side.
weightingScheme - a visiting object that contains the methods for calculating the weights for the left and right contexts respectively, based upon distance to the current label.
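
The update described above amounts to adding each neighbour's label to the context vector, scaled by a weight that depends on the side and the distance. The sketch below illustrates that idea with dense float arrays. WeightingSketch and its two methods are hypothetical, since the methods of the real WeightingScheme are not listed on this page, and the inverse-distance weighting is only an example choice.

```java
// Hypothetical interface; the real WeightingScheme's methods are not shown here.
interface WeightingSketch {
    float leftWeight(int distance);  // distance 1 = nearest word on the left
    float rightWeight(int distance); // distance 1 = nearest word on the right
}

// Example scheme (assumption): weight falls off as 1 / distance on both sides.
final class InverseDistanceSketch implements WeightingSketch {
    public float leftWeight(int distance) {
        return 1.0f / distance;
    }

    public float rightWeight(int distance) {
        return 1.0f / distance;
    }
}

final class ContextUpdateSketch {
    // Adds weight * label to the context for every neighbour.
    // Index 0 of each window is the nearest word; the windows may differ in length.
    static void update(float[] context, float[][] leftLabels, float[][] rightLabels,
                       WeightingSketch scheme) {
        for (int i = 0; i < leftLabels.length; i++) {
            addScaled(context, leftLabels[i], scheme.leftWeight(i + 1));
        }
        for (int i = 0; i < rightLabels.length; i++) {
            addScaled(context, rightLabels[i], scheme.rightWeight(i + 1));
        }
    }

    private static void addScaled(float[] target, float[] label, float weight) {
        if (label.length != target.length) {
            throw new IllegalArgumentException("all labels must share one dimensionality");
        }
        for (int d = 0; d < target.length; d++) {
            target[d] += weight * label[d];
        }
    }
}
```

Here the left window has two entries and the right window could have one, which mirrors the unbalanced context window the description allows.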

public float cosineSim(RandomLabel label)
Calculate the cosine similarity between this RandomLabel and the given RandomLabel.
Parameters:
label - the RandomLabel that is to be compared with this RandomLabel.
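
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. The sketch below computes it over two float arrays, presumably what cosineSim does with the context vectors, although this page does not say which vectors are compared. The class name CosineSketch and the choice to return 0 for a zero vector are assumptions.

```java
// Hypothetical helper for illustration; not the moj.ri implementation.
final class CosineSketch {
    static float cosineSim(float[] a, float[] b) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vectors must have the same dimensionality");
        }
        double dot = 0.0;
        double normA = 0.0;
        double normB = 0.0;
        for (int i = 0; i < a.length; i++) {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        // Assumption: similarity with an all-zero vector is reported as 0.
        if (normA == 0.0 || normB == 0.0) {
            return 0f;
        }
        return (float) (dot / (Math.sqrt(normA) * Math.sqrt(normB)));
    }
}
```

The result lies between -1 (opposite direction) and 1 (same direction); orthogonal vectors give 0.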
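
public int compareTo(java.lang.Object label) throws java.lang.ClassCastException
Compares this RandomLabel with the specified RandomLabel for order on the basis of term frequency.
Specified by:
compareTo in interface java.lang.Comparable<java.lang.Object>
Parameters:
label - the RandomLabel to be compared.
Throws:
java.lang.ClassCastException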
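
Ordering by term frequency means that sorting a collection of labels ranks the terms by how often they occur. The sketch below shows the idea with a hypothetical TermSketch class. It uses a typed Comparable<TermSketch> rather than Comparable<Object>, and it assumes ascending order; this page does not state the direction used by RandomLabel.

```java
import java.util.Arrays;

// Hypothetical stand-in for illustration; not the moj.ri.RandomLabel implementation.
final class TermSketch implements Comparable<TermSketch> {
    final String word;
    final long termFrequency;

    TermSketch(String word, long termFrequency) {
        this.word = word;
        this.termFrequency = termFrequency;
    }

    // Assumed direction: lower term frequency sorts first.
    @Override
    public int compareTo(TermSketch other) {
        return Long.compare(termFrequency, other.termFrequency);
    }

    // Returns the words of the given terms sorted by term frequency.
    static String[] wordsByFrequency(TermSketch[] terms) {
        TermSketch[] sorted = terms.clone();
        Arrays.sort(sorted);
        String[] words = new String[sorted.length];
        for (int i = 0; i < sorted.length; i++) {
            words[i] = sorted[i].word;
        }
        return words;
    }
}
```

Long.compare is used instead of subtracting the counts, which avoids overflow when the frequencies are large long values.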

public java.lang.String toString()
String representation of the RandomLabel.
Overrides:
toString in class java.lang.Object