Class PatternTypingFilterFactory
- java.lang.Object
  - org.apache.lucene.analysis.AbstractAnalysisFactory
    - org.apache.lucene.analysis.TokenFilterFactory
      - org.apache.lucene.analysis.pattern.PatternTypingFilterFactory
- All Implemented Interfaces:
ResourceLoaderAware
public class PatternTypingFilterFactory extends TokenFilterFactory implements ResourceLoaderAware
Provides a filter that will analyze tokens with the analyzer from an arbitrary field type. By itself this filter is not very useful. Normally it is combined with a filter that reacts to types or flags.
<fieldType name="text_taf" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="com.example.PatternTypingFilter" patternFile="patterns.txt"/>
    <filter class="solr.TokenAnalyzerFilter" asType="text_en" preserveType="true"/>
    <filter class="solr.TypeAsSynonymFilterFactory" prefix="__TAS__" ignore="word,<ALPHANUM>,<NUM>,<SOUTHEAST_ASIAN>,<IDEOGRAPHIC>,<HIRAGANA>,<KATAKANA>,<HANGUL>,<EMOJI>"/>
  </analyzer>
</fieldType>
Note that a configuration such as the one above may interfere with multi-word synonyms.
The patterns file has the format:
(flags) (pattern) ::: (replacement)
Therefore, to set the first 2 flag bits on an original token matching 401k or 401(k), and to add a type of 'legal2_401_k' whenever either one is encountered, one would use:
3 (\d+)\(?([a-z])\)? ::: legal2_$1_$2
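To see how the example rule arrives at the type 'legal2_401_k', the following minimal sketch applies the same pattern and replacement with plain java.util.regex. The class name TypeTemplateSketch and the method typeFor are hypothetical, and the requirement that the whole token match is an assumption made for illustration, not a statement about the filter's internals.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TypeTemplateSketch {
    // Pattern and replacement copied from the example rule in the text.
    static final Pattern PATTERN = Pattern.compile("(\\d+)\\(?([a-z])\\)?");
    static final String TEMPLATE = "legal2_$1_$2";

    /**
     * Returns the type name derived from the token, or null when the token
     * does not match. Assumption: the entire token has to match the pattern.
     */
    public static String typeFor(String token) {
        Matcher m = PATTERN.matcher(token);
        if (!m.matches()) {
            return null;
        }
        // $1 and $2 in the template are filled from the capture groups.
        return m.replaceFirst(TEMPLATE);
    }

    public static void main(String[] args) {
        for (String token : new String[] {"401k", "401(k)", "hello"}) {
            System.out.println(token + " -> " + typeFor(token));
        }
    }
}
```

Both 401k and 401(k) yield legal2_401_k because the parentheses are optional in the pattern and sit outside the capture groups; a token such as hello yields no type in this sketch.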
Note that the number indicating the flag bits to set must have no leading spaces, must be followed by a single space, and must be 0 if no flags should be set. The flags number should not contain commas or a decimal point. Lines whose first character is # are ignored as comments. Does not support producing a synonym textually identical to the original term.
- Since:
- 8.8
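The patterns-file rules described above (a leading flags number, a single space, the pattern, the ' ::: ' separator, the replacement, and '#' comment lines) can be sketched as a small stand-alone parser. This is only an illustration of the documented line format: the names PatternRuleLineSketch, Rule, and parse are hypothetical, skipping empty lines is an assumption, and this is not the code that inform(ResourceLoader) runs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

public class PatternRuleLineSketch {
    static final String SEPARATOR = " ::: ";

    /** Hypothetical holder for one parsed line; not Lucene's PatternTypingRule. */
    public static final class Rule {
        public final int flags;
        public final Pattern pattern;
        public final String replacement;

        Rule(int flags, Pattern pattern, String replacement) {
            this.flags = flags;
            this.pattern = pattern;
            this.replacement = replacement;
        }
    }

    public static List<Rule> parse(List<String> lines) {
        List<Rule> rules = new ArrayList<>();
        for (String line : lines) {
            // Lines starting with '#' are comments; skipping empty lines is an assumption.
            if (line.isEmpty() || line.charAt(0) == '#') {
                continue;
            }
            int firstSpace = line.indexOf(' ');
            int sep = line.indexOf(SEPARATOR);
            if (firstSpace <= 0 || sep <= firstSpace) {
                throw new IllegalArgumentException("Malformed rule: " + line);
            }
            // parseInt rejects commas and decimal points in the flags number.
            int flags = Integer.parseInt(line.substring(0, firstSpace));
            String regex = line.substring(firstSpace + 1, sep);
            String replacement = line.substring(sep + SEPARATOR.length());
            rules.add(new Rule(flags, Pattern.compile(regex), replacement));
        }
        return rules;
    }

    public static void main(String[] args) {
        List<String> lines = new ArrayList<>();
        lines.add("# retirement plan identifiers");
        lines.add("3 (\\d+)\\(?([a-z])\\)? ::: legal2_$1_$2");
        for (Rule r : parse(lines)) {
            // A flags value of 3 is binary 11, i.e. the first two flag bits.
            System.out.println(r.flags + " | " + r.pattern + " | " + r.replacement);
        }
    }
}
```

A flags value of 3 has bits 0 and 1 set, which is what "the first 2 flag bits" refers to in the example rule; a value of 0 sets none.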
Field Summary
Fields
Modifier and Type | Field | Description
static java.lang.String | NAME | SPI name
private java.lang.String | patternFile |
private PatternTypingFilter.PatternTypingRule[] | rules |
-
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors
Constructor | Description
PatternTypingFilterFactory() | Default ctor for compatibility with SPI
PatternTypingFilterFactory(java.util.Map<java.lang.String,java.lang.String> args) | Creates a new PatternTypingFilterFactory
-
Method Summary
Modifier and Type | Method | Description
TokenStream | create(TokenStream input) | Transform the specified input TokenStream
void | inform(ResourceLoader loader) | Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final java.lang.String NAME
SPI name
- See Also:
- Constant Field Values
-
patternFile
private final java.lang.String patternFile
-
rules
private PatternTypingFilter.PatternTypingRule[] rules
-
-
Method Detail
-
inform
public void inform(ResourceLoader loader) throws java.io.IOException
Description copied from interface: ResourceLoaderAware
Initializes this component with the provided ResourceLoader (used for loading classes, files, etc).
- Specified by:
inform in interface ResourceLoaderAware
- Throws:
java.io.IOException
-
create
public TokenStream create(TokenStream input)
Description copied from class: TokenFilterFactory
Transform the specified input TokenStream
- Specified by:
create in class TokenFilterFactory