Package org.apache.lucene.analysis.core
Class WhitespaceTokenizerFactory
java.lang.Object
  org.apache.lucene.analysis.AbstractAnalysisFactory
    org.apache.lucene.analysis.TokenizerFactory
      org.apache.lucene.analysis.core.WhitespaceTokenizerFactory
public class WhitespaceTokenizerFactory extends TokenizerFactory
Factory for WhitespaceTokenizer.

  <fieldType name="text_ws" class="solr.TextField" positionIncrementGap="100">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory" rule="unicode" maxTokenLen="256"/>
    </analyzer>
  </fieldType>
Options:
- rule: either "java" for WhitespaceTokenizer or "unicode" for UnicodeWhitespaceTokenizer
- maxTokenLen: maximum token length; should be greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT (1024*1024). This rarely needs to be changed; if it is not set, CharTokenizer::DEFAULT_MAX_TOKEN_LEN is used.
- Since:
- 3.1
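The practical difference between the two rule values can be illustrated with plain JDK calls. The following is a minimal sketch, not Lucene's implementation: the class WhitespaceRuleSketch and its methods are hypothetical names, "java" is modelled with Character.isWhitespace, "unicode" is modelled with a hand-written list of the code points that carry the Unicode White_Space property, and cutting an over-long run once it reaches maxTokenLen chars is an assumption about the intended behavior.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; this is not Lucene code.
class WhitespaceRuleSketch {

    // Stand-in for rule="java": defer to the JDK's own whitespace definition.
    static boolean isJavaWhitespace(int cp) {
        return Character.isWhitespace(cp);
    }

    // Stand-in for rule="unicode": code points with the Unicode White_Space property.
    static boolean isUnicodeWhitespace(int cp) {
        return (cp >= 0x0009 && cp <= 0x000D) || cp == 0x0020 || cp == 0x0085
                || cp == 0x00A0 || cp == 0x1680 || (cp >= 0x2000 && cp <= 0x200A)
                || cp == 0x2028 || cp == 0x2029 || cp == 0x202F || cp == 0x205F
                || cp == 0x3000;
    }

    // Splits text into tokens according to the chosen rule.
    static List<String> split(String text, String rule, int maxTokenLen) {
        if (maxTokenLen <= 0) {
            throw new IllegalArgumentException("maxTokenLen must be greater than 0");
        }
        boolean unicode;
        if ("unicode".equals(rule)) {
            unicode = true;
        } else if ("java".equals(rule)) {
            unicode = false;
        } else {
            throw new IllegalArgumentException("unknown rule: " + rule);
        }
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp);
            boolean separator = unicode ? isUnicodeWhitespace(cp) : isJavaWhitespace(cp);
            if (!separator) {
                current.appendCodePoint(cp);
            }
            // Emit on a separator, or (assumption) once the run reaches maxTokenLen chars.
            if ((separator || current.length() >= maxTokenLen) && current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    public static void main(String[] args) {
        String text = "a\u00A0b c"; // the second character is NO-BREAK SPACE (U+00A0)
        System.out.println(split(text, "java", 256).size());    // prints 2
        System.out.println(split(text, "unicode", 256).size()); // prints 3
    }
}
```

In this sketch the no-break space stays inside a token under "java", because Character.isWhitespace returns false for it, while "unicode" treats it as a separator.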
Field Summary
Fields
Modifier and Type                                       Field          Description
private int                                             maxTokenLen
static java.lang.String                                 NAME           SPI name
private java.lang.String                                rule
static java.lang.String                                 RULE_JAVA
private static java.util.Collection<java.lang.String>   RULE_NAMES
static java.lang.String                                 RULE_UNICODE
-
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors
Constructor                                                                          Description
WhitespaceTokenizerFactory()                                                         Default ctor for compatibility with SPI
WhitespaceTokenizerFactory(java.util.Map<java.lang.String,java.lang.String> args)    Creates a new WhitespaceTokenizerFactory
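The Map-based constructor receives the options as string key/value pairs. The sketch below shows one way such a constructor could read and validate the two documented options; it is a hypothetical stand-in and does not reproduce the Lucene source. FactoryArgsSketch is an invented name, the default rule of "java" and the default length of 255 are assumptions (this page only names the default constant), the bounds check follows the wording "greater than 0 and less than MAX_TOKEN_LENGTH_LIMIT", and rejecting leftover keys is a design choice of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for illustration; not the Lucene class.
class FactoryArgsSketch {
    static final int MAX_TOKEN_LENGTH_LIMIT = 1024 * 1024; // limit quoted in the class description
    static final int ASSUMED_DEFAULT_MAX_TOKEN_LEN = 255;  // assumption, not stated on this page

    final String rule;
    final int maxTokenLen;

    FactoryArgsSketch(Map<String, String> args) {
        // Work on a copy so the caller's map is left untouched.
        Map<String, String> remaining = new HashMap<>(args);

        String ruleArg = remaining.remove("rule");
        rule = (ruleArg == null) ? "java" : ruleArg; // assumed default
        if (!rule.equals("java") && !rule.equals("unicode")) {
            throw new IllegalArgumentException("rule must be \"java\" or \"unicode\", got: " + rule);
        }

        String lenArg = remaining.remove("maxTokenLen");
        maxTokenLen = (lenArg == null) ? ASSUMED_DEFAULT_MAX_TOKEN_LEN : Integer.parseInt(lenArg);
        if (maxTokenLen <= 0 || maxTokenLen >= MAX_TOKEN_LENGTH_LIMIT) {
            throw new IllegalArgumentException(
                "maxTokenLen must be greater than 0 and less than " + MAX_TOKEN_LENGTH_LIMIT);
        }

        // Sketch choice: anything left over is treated as a configuration mistake.
        if (!remaining.isEmpty()) {
            throw new IllegalArgumentException("Unknown parameters: " + remaining);
        }
    }

    public static void main(String[] argv) {
        Map<String, String> args = new HashMap<>();
        args.put("rule", "unicode");
        args.put("maxTokenLen", "256");
        FactoryArgsSketch factory = new FactoryArgsSketch(args);
        System.out.println(factory.rule + " " + factory.maxTokenLen); // prints "unicode 256"
    }
}
```

Validating in the constructor means a misconfigured field type fails when the factory is built rather than when the first document is analyzed.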
-
Method Summary
Modifier and Type    Method                              Description
Tokenizer            create(AttributeFactory factory)    Creates a TokenStream of the specified input using the given AttributeFactory
Methods inherited from class org.apache.lucene.analysis.TokenizerFactory
availableTokenizers, create, findSPIName, forName, lookupClass, reloadTokenizers
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final java.lang.String NAME
SPI name
- See Also:
- Constant Field Values
-
RULE_JAVA
public static final java.lang.String RULE_JAVA
- See Also:
- Constant Field Values
-
RULE_UNICODE
public static final java.lang.String RULE_UNICODE
- See Also:
- Constant Field Values
-
RULE_NAMES
private static final java.util.Collection<java.lang.String> RULE_NAMES
-
rule
private final java.lang.String rule
-
maxTokenLen
private final int maxTokenLen
-
-
Method Detail
-
create
public Tokenizer create(AttributeFactory factory)
Description copied from class: TokenizerFactory
Creates a TokenStream of the specified input using the given AttributeFactory
- Specified by:
- create in class TokenizerFactory