Class ProtectedTermFilterFactory
java.lang.Object
  org.apache.lucene.analysis.AbstractAnalysisFactory
    org.apache.lucene.analysis.TokenFilterFactory
      org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory
        org.apache.lucene.analysis.miscellaneous.ProtectedTermFilterFactory
All Implemented Interfaces:
ResourceLoaderAware
public class ProtectedTermFilterFactory extends ConditionalTokenFilterFactory
Factory for a ProtectedTermFilter.
CustomAnalyzer example:
    Analyzer ana = CustomAnalyzer.builder()
        .withTokenizer("standard")
        .when("protectedterm", "ignoreCase", "true", "protected", "protectedTerms.txt")
          .addTokenFilter("truncate", "prefixLength", "4")
          .addTokenFilter("lowercase")
        .endwhen()
        .build();
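The effect of the example above can be sketched without Lucene on the classpath. ProtectedTermFilter applies its wrapped filters only to terms that are not in the protected set, so the following is a minimal plain-Java simulation of that condition. The class ProtectedTermSketch and its methods are hypothetical names introduced for illustration; they are not part of the Lucene API, and real analysis operates on a TokenStream rather than a list of strings.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Set;
import java.util.function.UnaryOperator;

/** Plain-Java simulation of the protected-term condition; not the Lucene implementation. */
public class ProtectedTermSketch {

    /** Applies the wrapped step to every term that is not in the protected set. */
    public static List<String> apply(List<String> terms, Set<String> protectedTerms,
                                     boolean ignoreCase, UnaryOperator<String> wrapped) {
        List<String> out = new ArrayList<>();
        for (String term : terms) {
            // With ignoreCase, "LUCENE" matches a protected entry "lucene".
            boolean isProtected = ignoreCase
                ? protectedTerms.stream().anyMatch(p -> p.equalsIgnoreCase(term))
                : protectedTerms.contains(term);
            out.add(isProtected ? term : wrapped.apply(term));
        }
        return out;
    }

    /** Hypothetical stand-in for truncate (prefixLength=4) followed by lowercase. */
    public static String truncateThenLowercase(String term) {
        String truncated = term.length() > 4 ? term.substring(0, 4) : term;
        return truncated.toLowerCase(Locale.ROOT);
    }

    public static void main(String[] args) {
        List<String> result = apply(
            List.of("Searching", "LUCENE", "Index"),
            Set.of("lucene"),   // assumed content of protectedTerms.txt
            true,
            ProtectedTermSketch::truncateThenLowercase);
        System.out.println(result); // prints [sear, LUCENE, inde]
    }
}
```

The protected term passes through unchanged, while the other terms are truncated and lowercased in the order the wrapped filters were declared.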
Solr example, in which conditional filters are specified via the wrappedFilters parameter (a comma-separated list of case-insensitive TokenFilter SPI names) and conditional filter args are specified via filterName.argName parameters:

    <fieldType name="reverse_lower_with_exceptions" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
                wrappedFilters="truncate,lowercase" truncate.prefixLength="4" />
      </analyzer>
    </fieldType>
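How filterName.argName parameters map onto the wrapped filters can be illustrated with a short sketch. This is not the factory's actual code: WrappedFilterArgsSketch and group are hypothetical names, and the separator '.' is assumed from the filterName.argName form shown above (the class exposes it as FILTER_ARG_SEPARATOR).

```java
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

/** Illustrative grouping of filterName.argName parameters; not the Lucene implementation. */
public class WrappedFilterArgsSketch {

    // Assumed separator, taken from the "filterName.argName" form in the documentation.
    static final char ARG_SEPARATOR = '.';

    /** Returns one argument map per wrapped filter, in declaration order. */
    public static Map<String, Map<String, String>> group(String wrappedFilters,
                                                         Map<String, String> args) {
        Map<String, Map<String, String>> perFilter = new LinkedHashMap<>();
        for (String name : wrappedFilters.split(",")) {
            // Filter names are case-insensitive, so normalise them before use as keys.
            perFilter.put(name.trim().toLowerCase(Locale.ROOT), new LinkedHashMap<>());
        }
        for (Map.Entry<String, String> e : args.entrySet()) {
            int sep = e.getKey().indexOf(ARG_SEPARATOR);
            if (sep <= 0) {
                continue; // not a filterName.argName parameter
            }
            String filter = e.getKey().substring(0, sep).toLowerCase(Locale.ROOT);
            Map<String, String> filterArgs = perFilter.get(filter);
            if (filterArgs != null) {
                filterArgs.put(e.getKey().substring(sep + 1), e.getValue());
            }
        }
        return perFilter;
    }

    public static void main(String[] unused) {
        Map<String, String> args = new LinkedHashMap<>();
        args.put("ignoreCase", "true");
        args.put("protected", "protectedTerms.txt");
        args.put("truncate.prefixLength", "4");
        System.out.println(group("truncate,lowercase", args));
    }
}
```

For the Solr configuration above, this yields an argument map containing prefixLength=4 for truncate and an empty map for lowercase.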
When using the wrappedFilters parameter, each filter name must be unique, so if you need to specify the same filter more than once, you must add case-insensitive unique '-id' suffixes (note that the '-id' suffix is stripped prior to SPI lookup), e.g.:

    <fieldType name="double_synonym_with_exceptions" class="solr.TextField" positionIncrementGap="100">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.ProtectedTermFilterFactory" ignoreCase="true" protected="protectedTerms.txt"
                wrappedFilters="synonymgraph-A,synonymgraph-B"
                synonymgraph-A.synonyms="synonyms-1.txt"
                synonymgraph-B.synonyms="synonyms-2.txt"/>
      </analyzer>
    </fieldType>
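The uniqueness rule and the suffix stripping described above can be sketched as follows. FilterNameIdSketch is a hypothetical class, not the factory's code. The sketch assumes that the id suffix starts at the first '-' in the name (the class exposes the separator as FILTER_NAME_ID_SEPARATOR) and that a duplicate name is rejected with an IllegalArgumentException.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

/** Illustrative handling of '-id' suffixes in wrappedFilters; not the Lucene implementation. */
public class FilterNameIdSketch {

    // Assumed separator between the SPI name and the id suffix.
    static final char ID_SEPARATOR = '-';

    /** Strips the '-id' suffix, leaving the name used for SPI lookup. */
    public static String spiName(String wrappedName) {
        int sep = wrappedName.indexOf(ID_SEPARATOR);
        return sep < 0 ? wrappedName : wrappedName.substring(0, sep);
    }

    /** Checks case-insensitive uniqueness of the full names, then returns the SPI names. */
    public static List<String> spiNames(String wrappedFilters) {
        Set<String> seen = new HashSet<>();
        List<String> names = new ArrayList<>();
        for (String raw : wrappedFilters.split(",")) {
            String name = raw.trim();
            if (!seen.add(name.toLowerCase(Locale.ROOT))) {
                throw new IllegalArgumentException("Duplicate wrapped filter name: " + name);
            }
            names.add(spiName(name));
        }
        return names;
    }

    public static void main(String[] args) {
        System.out.println(spiNames("synonymgraph-A,synonymgraph-B")); // prints [synonymgraph, synonymgraph]
    }
}
```

Because the suffixes are compared case-insensitively, a list such as "synonymgraph-a,synonymgraph-A" is rejected in this sketch even though the two names differ in case.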
See related CustomAnalyzer.Builder.whenTerm(Predicate)
Since: 7.4.0
Field Summary
Fields (modifier and type, then field name):
static char FILTER_ARG_SEPARATOR
static char FILTER_NAME_ID_SEPARATOR
private boolean ignoreCase
static java.lang.String NAME
static java.lang.String PROTECTED_TERMS
private CharArraySet protectedTerms
private java.lang.String termFiles
private java.lang.String wrappedFilters
-
Fields inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
LUCENE_MATCH_VERSION_PARAM, luceneMatchVersion
-
-
Constructor Summary
Constructors:
ProtectedTermFilterFactory()
    Default ctor for compatibility with SPI
ProtectedTermFilterFactory(java.util.Map<java.lang.String,java.lang.String> args)
-
Method Summary
Methods (modifier and type, then method; description indented below where present):
protected ConditionalTokenFilter create(TokenStream input, java.util.function.Function<TokenStream,TokenStream> inner)
    Modify the incoming TokenStream with a ConditionalTokenFilter
void doInform(ResourceLoader loader)
    Initialises this component with the corresponding ResourceLoader
CharArraySet getProtectedTerms()
private void handleWrappedFilterArgs(java.util.Map<java.lang.String,java.lang.String> args)
boolean isIgnoreCase()
private void populateInnerFilters(java.util.LinkedHashMap<java.lang.String,java.util.Map<java.lang.String,java.lang.String>> wrappedFilterArgs)
-
Methods inherited from class org.apache.lucene.analysis.miscellaneous.ConditionalTokenFilterFactory
create, inform, setInnerFilters
-
Methods inherited from class org.apache.lucene.analysis.TokenFilterFactory
availableTokenFilters, findSPIName, forName, lookupClass, normalize, reloadTokenFilters
-
Methods inherited from class org.apache.lucene.analysis.AbstractAnalysisFactory
defaultCtorException, get, get, get, get, get, getBoolean, getChar, getClassArg, getFloat, getInt, getLines, getLuceneMatchVersion, getOriginalArgs, getPattern, getSet, getSnowballWordSet, getWordSet, isExplicitLuceneMatchVersion, require, require, require, requireBoolean, requireChar, requireFloat, requireInt, setExplicitLuceneMatchVersion, splitAt, splitFileNames
Field Detail
-
NAME
public static final java.lang.String NAME
- See Also:
- Constant Field Values
-
PROTECTED_TERMS
public static final java.lang.String PROTECTED_TERMS
- See Also:
- Constant Field Values
-
FILTER_ARG_SEPARATOR
public static final char FILTER_ARG_SEPARATOR
- See Also:
- Constant Field Values
-
FILTER_NAME_ID_SEPARATOR
public static final char FILTER_NAME_ID_SEPARATOR
- See Also:
- Constant Field Values
-
termFiles
private final java.lang.String termFiles
-
ignoreCase
private final boolean ignoreCase
-
wrappedFilters
private final java.lang.String wrappedFilters
-
protectedTerms
private CharArraySet protectedTerms
-
-
Method Detail
-
handleWrappedFilterArgs
private void handleWrappedFilterArgs(java.util.Map<java.lang.String,java.lang.String> args)
-
populateInnerFilters
private void populateInnerFilters(java.util.LinkedHashMap<java.lang.String,java.util.Map<java.lang.String,java.lang.String>> wrappedFilterArgs)
-
isIgnoreCase
public boolean isIgnoreCase()
-
getProtectedTerms
public CharArraySet getProtectedTerms()
-
create
protected ConditionalTokenFilter create(TokenStream input, java.util.function.Function<TokenStream,TokenStream> inner)
Description copied from class: ConditionalTokenFilterFactory
Modify the incoming TokenStream with a ConditionalTokenFilter
Specified by: create in class ConditionalTokenFilterFactory
-
doInform
public void doInform(ResourceLoader loader) throws java.io.IOException
Description copied from class: ConditionalTokenFilterFactory
Initialises this component with the corresponding ResourceLoader
Overrides: doInform in class ConditionalTokenFilterFactory
Throws: java.io.IOException