- java.lang.Object
-
- org.apache.lucene.analysis.ja.Token
-
public class Token extends java.lang.Object
Analyzed token with morphological data from its dictionary.
-
-
Field Summary
Fields
Modifier and Type                 Field
private Dictionary                dictionary
private int                       length
private int                       offset
private int                       position
private int                       positionLength
private char[]                    surfaceForm
private JapaneseTokenizer.Type    type
private int                       wordId
-
Constructor Summary
Constructors
Constructor
Token(int wordId, char[] surfaceForm, int offset, int length, JapaneseTokenizer.Type type, int position, Dictionary dictionary)
-
Method Summary
All Methods (all are concrete instance methods)
Modifier and Type         Method                                   Description
java.lang.String          getBaseForm()
java.lang.String          getInflectionForm()
java.lang.String          getInflectionType()
int                       getLength()
int                       getOffset()
java.lang.String          getPartOfSpeech()
int                       getPosition()                            Gets the index of this token in the input text.
int                       getPositionLength()                      Gets the length (in tokens) of this token.
java.lang.String          getPronunciation()
java.lang.String          getReading()
char[]                    getSurfaceForm()
java.lang.String          getSurfaceFormString()
JapaneseTokenizer.Type    getType()                                Returns the type of this token.
boolean                   isKnown()                                Returns true if this token is a known word.
boolean                   isUnknown()                              Returns true if this token is an unknown word.
boolean                   isUser()                                 Returns true if this token is defined in the user dictionary.
void                      setPositionLength(int positionLength)    Sets the position length (in tokens) of this token.
java.lang.String          toString()
-
-
-
Field Detail
-
dictionary
private final Dictionary dictionary
-
wordId
private final int wordId
-
surfaceForm
private final char[] surfaceForm
-
offset
private final int offset
-
length
private final int length
-
position
private final int position
-
positionLength
private int positionLength
-
type
private final JapaneseTokenizer.Type type
-
-
Constructor Detail
-
Token
public Token(int wordId, char[] surfaceForm, int offset, int length, JapaneseTokenizer.Type type, int position, Dictionary dictionary)
-
-
Method Detail
-
toString
public java.lang.String toString()
- Overrides:
- toString in class java.lang.Object
-
getSurfaceForm
public char[] getSurfaceForm()
- Returns:
- surfaceForm
-
getOffset
public int getOffset()
- Returns:
- offset into surfaceForm
-
getLength
public int getLength()
- Returns:
- length of surfaceForm
-
getSurfaceFormString
public java.lang.String getSurfaceFormString()
- Returns:
- surfaceForm as a String
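The accessors getSurfaceForm(), getOffset() and getLength() appear to describe a slice of a character buffer rather than a standalone string. The following is a minimal sketch of that relationship, assuming getSurfaceFormString() is equivalent to copying `length` characters starting at `offset` out of the array. SurfaceFormSketch and surfaceFormString are hypothetical names used for illustration only; they are not part of Lucene.

```java
class SurfaceFormSketch {
    // Hypothetical helper (not a Lucene method): copies `length` chars,
    // starting at `offset`, out of the surfaceForm buffer.
    static String surfaceFormString(char[] surfaceForm, int offset, int length) {
        return new String(surfaceForm, offset, length);
    }

    public static void main(String[] args) {
        // Assumption: one buffer holds the text of several tokens.
        char[] buffer = "東京都に住む".toCharArray();
        System.out.println(surfaceFormString(buffer, 0, 3)); // prints 東京都
        System.out.println(surfaceFormString(buffer, 3, 1)); // prints に
    }
}
```

If this assumption holds, callers that read getSurfaceForm() directly should always apply getOffset() and getLength() instead of treating the whole array as the token text.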
-
getReading
public java.lang.String getReading()
- Returns:
- the reading, or null if the token has no reading
-
getPronunciation
public java.lang.String getPronunciation()
- Returns:
- the pronunciation, or null if the token has no pronunciation
-
getPartOfSpeech
public java.lang.String getPartOfSpeech()
- Returns:
- part of speech.
-
getInflectionType
public java.lang.String getInflectionType()
- Returns:
- inflection type or null
-
getInflectionForm
public java.lang.String getInflectionForm()
- Returns:
- inflection form or null
-
getBaseForm
public java.lang.String getBaseForm()
- Returns:
- base form or null if token is not inflected
-
getType
public JapaneseTokenizer.Type getType()
Returns the type of this token.
- Returns:
- token type, not null
-
isKnown
public boolean isKnown()
Returns true if this token is a known word.
- Returns:
- true if this token is in the standard dictionary, false otherwise
-
isUnknown
public boolean isUnknown()
Returns true if this token is an unknown word.
- Returns:
- true if this token is an unknown word, false otherwise
-
isUser
public boolean isUser()
Returns true if this token is defined in the user dictionary.
- Returns:
- true if this token is in the user dictionary, false otherwise
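The three predicates isKnown(), isUnknown() and isUser() read as mutually exclusive views of the value returned by getType(). A minimal sketch of that mapping follows. The local Type enum is a stand-in that mirrors the KNOWN, UNKNOWN and USER constants of JapaneseTokenizer.Type, and describe is a hypothetical helper, so treat this as an illustration rather than the class's actual implementation.

```java
class TokenTypeSketch {
    // Stand-in for JapaneseTokenizer.Type so the example compiles without Lucene.
    enum Type { KNOWN, UNKNOWN, USER }

    // Hypothetical helper: names the source a token of the given type came from.
    static String describe(Type type) {
        switch (type) {
            case KNOWN:   return "standard dictionary";  // isKnown() would be true
            case UNKNOWN: return "unknown word";         // isUnknown() would be true
            case USER:    return "user dictionary";      // isUser() would be true
            default:      throw new IllegalArgumentException("unexpected type: " + type);
        }
    }

    public static void main(String[] args) {
        for (Type t : Type.values()) {
            System.out.println(t + " -> " + describe(t));
        }
    }
}
```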
-
getPosition
public int getPosition()
Gets the index of this token in the input text.
- Returns:
- position of token
-
setPositionLength
public void setPositionLength(int positionLength)
Sets the position length (in tokens) of this token. For normal tokens this is 1; for compound tokens it is greater than 1.
-
getPositionLength
public int getPositionLength()
Gets the length (in tokens) of this token. For normal tokens this is 1; for compound tokens it is greater than 1.
- Returns:
- position length of token
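To make the position length concrete: a token at position p with position length n spans positions p through p + n - 1, so the next token after it starts at p + n. The sketch below uses illustrative values for a compound such as 関西国際空港 covering three shorter tokens. These numbers are assumptions chosen for the example, not output captured from the tokenizer, and endPosition is a hypothetical helper.

```java
class PositionLengthSketch {
    // Hypothetical helper: the first position after a token that starts at
    // `position` and spans `positionLength` token positions.
    static int endPosition(int position, int positionLength) {
        return position + positionLength;
    }

    public static void main(String[] args) {
        // Normal token: position length 1, so it ends one position later.
        System.out.println(endPosition(0, 1)); // prints 1
        // Assumed compound token spanning three positions (e.g. 関西 / 国際 / 空港).
        System.out.println(endPosition(0, 3)); // prints 3
    }
}
```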
-
-