- java.lang.Object
-
- org.apache.lucene.search.Query
-
- org.apache.lucene.sandbox.search.PhraseWildcardQuery
-
public class PhraseWildcardQuery extends Query
A generalized version of PhraseQuery, built with one or more MultiTermQuery that provide term expansions for multi-terms (one of the expanded terms must match).
Its main advantage is to control the total number of expansions across all MultiTermQuery and across all segments.
Use the PhraseWildcardQuery.Builder to build a PhraseWildcardQuery.
This query is similar to MultiPhraseQuery, but it handles, controls and optimizes the multi-term expansions.
This query is equivalent to building an ordered SpanNearQuery with a list of SpanTermQuery and SpanMultiTermQueryWrapper, but it optimizes the multi-term expansions and the segment accesses. It first resolves the single-terms, stopping early if any of them does not match. Then it expands each multi-term sequentially, stopping immediately if one does not match. It detects the segments that do not match and skips them for the next expansions. This often avoids expanding the other multi-terms on some or even all segments. Finally, it controls the total number of expansions.
Immutable.
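The resolution order described above can be illustrated with a toy model in plain Java. This is only a sketch under stated assumptions, not Lucene's implementation: each segment is modelled as a set of terms, a phrase term ending in `*` stands in for a prefix multi-term, positions and slop are ignored, and the names `PhraseResolutionSketch` and `resolve` are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

/** Toy model with hypothetical names; it does not use or mirror Lucene classes. */
class PhraseResolutionSketch {

  /**
   * Returns the number of expansions collected, or -1 if the phrase cannot match.
   * A phrase term ending with '*' is treated as a prefix multi-term (an assumption
   * of this sketch); any other phrase term is a single term.
   */
  static int resolve(List<String> phraseTerms, List<Set<String>> segments, int maxExpansions) {
    // Segments that may still contain a match.
    List<Set<String>> candidates = new ArrayList<>(segments);

    // Step 1: resolve the single terms first and stop early if one matches nowhere.
    for (String term : phraseTerms) {
      if (term.endsWith("*")) continue;
      candidates.removeIf(segment -> !segment.contains(term));
      if (candidates.isEmpty()) return -1;
    }

    // Step 2: expand the multi-terms one at a time, sharing a single budget.
    int used = 0;
    for (String term : phraseTerms) {
      if (!term.endsWith("*")) continue;
      String prefix = term.substring(0, term.length() - 1);
      Set<String> expansions = new TreeSet<>();
      List<Set<String>> stillMatching = new ArrayList<>();
      for (Set<String> segment : candidates) {
        boolean matched = false;
        for (String t : segment) {
          if (!t.startsWith(prefix)) continue;
          boolean withinBudget = used + expansions.size() < maxExpansions;
          if (expansions.contains(t) || withinBudget) {
            expansions.add(t);
            matched = true;
          }
        }
        // Segments without a collected expansion are skipped for the next multi-terms.
        if (matched) stillMatching.add(segment);
      }
      if (expansions.isEmpty()) return -1; // stop immediately: this multi-term matches nothing
      used += expansions.size();
      candidates = stillMatching;
    }
    return used;
  }
}
```

In this model, a phrase such as `quick bro*` only expands `bro*` on segments that contain `quick`, and the count of distinct expansions never exceeds `maxExpansions`.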
-
-
Nested Class Summary
Modifier and Type | Class | Description
static class | PhraseWildcardQuery.Builder | Builds a PhraseWildcardQuery.
protected static class | PhraseWildcardQuery.MultiTerm | Phrase term with expansions.
protected static class | PhraseWildcardQuery.PhraseTerm | All PhraseWildcardQuery.PhraseTerm are light and immutable.
protected class | PhraseWildcardQuery.SegmentTermsSizeComparator | Compares segments based on the number of terms they contain.
protected static class | PhraseWildcardQuery.SingleTerm | Phrase term with no expansion.
static class | PhraseWildcardQuery.TermBytesTermState | Holds a pair of term bytes - term state.
protected static class | PhraseWildcardQuery.TermData |
protected static class | PhraseWildcardQuery.TermsData | Holds the TermState and TermStatistics for all the matched and collected Term, for all phrase terms, for all segments.
static class | PhraseWildcardQuery.TermStats | Accumulates the doc freq and total term freq.
protected static class | PhraseWildcardQuery.TestCounters | Test counters incremented when assertions are enabled.
-
Field Summary
Modifier and Type | Field | Description
protected java.lang.String | field |
protected int | maxMultiTermExpansions |
protected static Query | NO_MATCH_QUERY |
protected java.util.List<PhraseWildcardQuery.PhraseTerm> | phraseTerms |
protected boolean | segmentOptimizationEnabled |
protected int | slop |
-
Constructor Summary
Modifier | Constructor | Description
protected | PhraseWildcardQuery(java.lang.String field, java.util.List<PhraseWildcardQuery.PhraseTerm> phraseTerms, int slop, int maxMultiTermExpansions, boolean segmentOptimizationEnabled) |
-
Method Summary
Modifier and Type | Method | Description
protected void | checkTermsHavePositions(Terms terms) |
protected int | collectMultiTermData(PhraseWildcardQuery.MultiTerm multiTerm, IndexSearcher searcher, java.util.List<LeafReaderContext> segments, int remainingMultiTerms, int maxExpansionsForTerm, PhraseWildcardQuery.TermsData termsData) | Collects the TermState and TermStatistics for a multi-term with expansion.
protected java.util.List<PhraseWildcardQuery.TermBytesTermState> | collectMultiTermDataForSegment(PhraseWildcardQuery.MultiTerm multiTerm, LeafReaderContext leafReaderContext, int remainingExpansions, MutableValueBool shouldStopSegmentIteration, java.util.Map<BytesRef,PhraseWildcardQuery.TermStats> termStatsMap) | Collects the TermState list and TermStatistics for a multi-term on a specific index segment.
protected void | collectMultiTermStats(IndexSearcher searcher, java.util.Map<BytesRef,PhraseWildcardQuery.TermStats> termStatsMap, PhraseWildcardQuery.TermsData termsData, PhraseWildcardQuery.TermData termData) | Collect the term stats across all segments.
protected int | collectSingleTermData(PhraseWildcardQuery.SingleTerm singleTerm, IndexSearcher searcher, java.util.List<LeafReaderContext> segments, PhraseWildcardQuery.TermsData termsData) | Collects the TermState and TermStatistics for a single-term without expansion.
(package private) PhraseWeight | createPhraseWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost, PhraseWildcardQuery.TermsData termsData) |
protected PhraseWildcardQuery.TermsData | createTermsData(int numSegments) | Creates new PhraseWildcardQuery.TermsData.
protected TermsEnum | createTermsEnum(PhraseWildcardQuery.MultiTerm multiTerm, LeafReaderContext leafReaderContext) | Creates the TermsEnum for the given PhraseWildcardQuery.MultiTerm and segment.
protected java.util.Map<BytesRef,PhraseWildcardQuery.TermStats> | createTermStatsMap(PhraseWildcardQuery.MultiTerm multiTerm) | Creates a PhraseWildcardQuery.TermStats map for a PhraseWildcardQuery.MultiTerm.
Weight | createWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost) | Expert: Constructs an appropriate Weight implementation for this query.
protected Weight | earlyStopWeight() |
boolean | equals(java.lang.Object o) | Override and implement query instance equivalence properly in a subclass.
java.lang.String | getField() |
int | hashCode() | Override and implement query hash code properly in a subclass.
protected Weight | noMatchWeight() |
Query | rewrite(IndexSearcher indexSearcher) | Expert: called to re-write queries into primitive queries.
protected boolean | shouldOptimizeSegments() |
java.lang.String | toString(java.lang.String omittedField) | Prints a query to a string, with field assumed to be the default field and omitted.
void | visit(QueryVisitor visitor) | Recurse through the query tree, visiting any child queries.
-
Methods inherited from class org.apache.lucene.search.Query
classHash, rewrite, sameClassAs, toString
-
-
-
-
Field Detail
-
NO_MATCH_QUERY
protected static final Query NO_MATCH_QUERY
-
field
protected final java.lang.String field
-
phraseTerms
protected final java.util.List<PhraseWildcardQuery.PhraseTerm> phraseTerms
-
slop
protected final int slop
-
maxMultiTermExpansions
protected final int maxMultiTermExpansions
-
segmentOptimizationEnabled
protected final boolean segmentOptimizationEnabled
-
-
Constructor Detail
-
PhraseWildcardQuery
protected PhraseWildcardQuery(java.lang.String field, java.util.List<PhraseWildcardQuery.PhraseTerm> phraseTerms, int slop, int maxMultiTermExpansions, boolean segmentOptimizationEnabled)
-
-
Method Detail
-
getField
public java.lang.String getField()
-
rewrite
public Query rewrite(IndexSearcher indexSearcher) throws java.io.IOException
Description copied from class: Query
Expert: called to re-write queries into primitive queries. For example, a PrefixQuery will be rewritten into a BooleanQuery that consists of TermQuerys.
Callers are expected to call rewrite multiple times if necessary, until the rewritten query is the same as the original query.
The rewrite process may be able to make use of IndexSearcher's executor and be executed in parallel if the executor is provided.
However, if any of the intermediary queries do not satisfy the new API, parallel rewrite is not possible for any subsequent sub-queries. To take advantage of this API, the entire query tree must override this method.
- Overrides:
rewrite in class Query
- Throws:
java.io.IOException
- See Also:
IndexSearcher.rewrite(Query)
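The contract of calling rewrite repeatedly until the query stops changing amounts to a fixed-point loop. The following stand-alone sketch illustrates the idea in plain Java without any Lucene types: the rewrite step is modelled as a `UnaryOperator`, equality is checked with `equals`, and `RewriteLoopSketch` and `rewriteToFixedPoint` are hypothetical names chosen for this example.

```java
import java.util.function.UnaryOperator;

/** Hypothetical illustration of "rewrite until unchanged"; not Lucene code. */
class RewriteLoopSketch {

  /** Applies rewriteStep until it returns a value equal to its input. */
  static <Q> Q rewriteToFixedPoint(Q query, UnaryOperator<Q> rewriteStep) {
    Q current = query;
    Q rewritten = rewriteStep.apply(current);
    while (!rewritten.equals(current)) {
      // The query changed, so it may be rewritable again.
      current = rewritten;
      rewritten = rewriteStep.apply(current);
    }
    return current;
  }
}
```

With a step that strips one `wrap(...)` layer per call, the string `wrap(wrap(term))` is reduced to `term` after two changing rewrites and one confirming call.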
-
visit
public void visit(QueryVisitor visitor)
Description copied from class: Query
Recurse through the query tree, visiting any child queries.
-
createWeight
public Weight createWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost) throws java.io.IOException
Description copied from class: Query
Expert: Constructs an appropriate Weight implementation for this query.
Only implemented by primitive queries, which re-write to themselves.
- Overrides:
createWeight in class Query
- Parameters:
scoreMode - How the produced scorers will be consumed.
boost - The boost that is propagated by the parent queries.
- Throws:
java.io.IOException
-
createTermsData
protected PhraseWildcardQuery.TermsData createTermsData(int numSegments)
Creates new PhraseWildcardQuery.TermsData.
-
earlyStopWeight
protected Weight earlyStopWeight()
-
noMatchWeight
protected Weight noMatchWeight()
-
createPhraseWeight
PhraseWeight createPhraseWeight(IndexSearcher searcher, ScoreMode scoreMode, float boost, PhraseWildcardQuery.TermsData termsData) throws java.io.IOException
- Throws:
java.io.IOException
-
equals
public boolean equals(java.lang.Object o)
Description copied from class: Query
Override and implement query instance equivalence properly in a subclass. This is required so that QueryCache works properly.
Typically a query will be equal to another only if it's an instance of the same class and its document-filtering properties are identical to those of the other instance. Utility methods are provided for certain repetitive code.
- Specified by:
equals in class Query
- See Also:
Query.sameClassAs(Object), Query.classHash()
-
hashCode
public int hashCode()
Description copied from class: Query
Override and implement query hash code properly in a subclass. This is required so that QueryCache works properly.
- Specified by:
hashCode in class Query
- See Also:
Query.equals(Object)
-
toString
public final java.lang.String toString(java.lang.String omittedField)
Description copied from class: Query
Prints a query to a string, with field assumed to be the default field and omitted.
-
collectSingleTermData
protected int collectSingleTermData(PhraseWildcardQuery.SingleTerm singleTerm, IndexSearcher searcher, java.util.List<LeafReaderContext> segments, PhraseWildcardQuery.TermsData termsData) throws java.io.IOException
Collects the TermState and TermStatistics for a single-term without expansion.
- Parameters:
termsData - receives the collected data.
- Throws:
java.io.IOException
-
collectMultiTermData
protected int collectMultiTermData(PhraseWildcardQuery.MultiTerm multiTerm, IndexSearcher searcher, java.util.List<LeafReaderContext> segments, int remainingMultiTerms, int maxExpansionsForTerm, PhraseWildcardQuery.TermsData termsData) throws java.io.IOException
Collects the TermState and TermStatistics for a multi-term with expansion.
- Parameters:
remainingMultiTerms - the number of remaining multi-terms to process, including the current one, excluding the multi-terms already processed.
termsData - receives the collected data.
- Throws:
java.io.IOException
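This page does not state how an expansion budget is derived from remainingMultiTerms. As a purely hypothetical illustration of why a count that includes the current multi-term is useful, the sketch below gives each multi-term an equal share of whatever budget is left, so that expansions unused by one multi-term remain available to the next. The class `ExpansionBudgetSketch`, its methods, and the equal-share policy are all assumptions of this example, not Lucene's behavior.

```java
/** Hypothetical budget-sharing policy; not taken from Lucene. */
class ExpansionBudgetSketch {

  /** Equal share of the remaining budget for the current multi-term (assumed policy). */
  static int shareForCurrent(int remainingExpansions, int remainingMultiTerms) {
    if (remainingMultiTerms <= 0) {
      throw new IllegalArgumentException("remainingMultiTerms must include the current one");
    }
    // Ceiling division, so a small non-zero budget is not rounded down to zero.
    return (remainingExpansions + remainingMultiTerms - 1) / remainingMultiTerms;
  }

  /**
   * Walks the multi-terms in order. matchesAvailable[i] is how many terms
   * multi-term i could expand to; the result holds the expansions granted to each.
   */
  static int[] allocate(int maxExpansions, int[] matchesAvailable) {
    int[] granted = new int[matchesAvailable.length];
    int remaining = maxExpansions;
    for (int i = 0; i < matchesAvailable.length; i++) {
      int remainingMultiTerms = matchesAvailable.length - i; // includes the current one
      int share = shareForCurrent(remaining, remainingMultiTerms);
      granted[i] = Math.min(share, matchesAvailable[i]);
      remaining -= granted[i]; // unused budget carries over to later multi-terms
    }
    return granted;
  }
}
```

Under this assumed policy, a first multi-term that needs fewer expansions than its share leaves the difference for the multi-terms that follow, and the total granted never exceeds the overall maximum.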
-
shouldOptimizeSegments
protected boolean shouldOptimizeSegments()
-
createTermStatsMap
protected java.util.Map<BytesRef,PhraseWildcardQuery.TermStats> createTermStatsMap(PhraseWildcardQuery.MultiTerm multiTerm)
Creates a PhraseWildcardQuery.TermStats map for a PhraseWildcardQuery.MultiTerm.
-
collectMultiTermDataForSegment
protected java.util.List<PhraseWildcardQuery.TermBytesTermState> collectMultiTermDataForSegment(PhraseWildcardQuery.MultiTerm multiTerm, LeafReaderContext leafReaderContext, int remainingExpansions, MutableValueBool shouldStopSegmentIteration, java.util.Map<BytesRef,PhraseWildcardQuery.TermStats> termStatsMap) throws java.io.IOException
Collects the TermState list and TermStatistics for a multi-term on a specific index segment.
- Parameters:
remainingExpansions - the number of remaining expansions allowed for the segment.
shouldStopSegmentIteration - to be set to true to stop the segment iteration calling this method repeatedly.
termStatsMap - receives the collected PhraseWildcardQuery.TermStats across all segments.
- Throws:
java.io.IOException
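Because the same term can appear in several segments, a map that is shared across per-segment calls lets the statistics add up term by term. The sketch below shows that accumulation pattern in plain Java. It is an illustration only: `TermStatsSketch`, `Stats` and `collect` are hypothetical stand-ins, terms are plain strings rather than BytesRef, and the real PhraseWildcardQuery.TermStats may differ.

```java
import java.util.Map;

/** Hypothetical stand-in for accumulating per-term statistics across segments. */
class TermStatsSketch {

  /** Doc freq and total term freq for one term, summed over segments. */
  static final class Stats {
    int docFreq;
    long totalTermFreq;

    void accumulate(int docFreq, long totalTermFreq) {
      this.docFreq += docFreq;
      this.totalTermFreq += totalTermFreq;
    }
  }

  /** Adds one segment's statistics for a term into the shared map. */
  static void collect(Map<String, Stats> statsMap, String term, int docFreq, long totalTermFreq) {
    // Create the entry on first sight of the term, then add this segment's counts.
    statsMap.computeIfAbsent(term, t -> new Stats()).accumulate(docFreq, totalTermFreq);
  }
}
```

Calling `collect` once per segment and per matched term with the same map, for example a `HashMap`, yields one entry per distinct term holding its index-wide totals.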
-
createTermsEnum
protected TermsEnum createTermsEnum(PhraseWildcardQuery.MultiTerm multiTerm, LeafReaderContext leafReaderContext) throws java.io.IOException
Creates the TermsEnum for the given PhraseWildcardQuery.MultiTerm and segment.
- Returns:
null if there is no term for this query field in the segment.
- Throws:
java.io.IOException
-
collectMultiTermStats
protected void collectMultiTermStats(IndexSearcher searcher, java.util.Map<BytesRef,PhraseWildcardQuery.TermStats> termStatsMap, PhraseWildcardQuery.TermsData termsData, PhraseWildcardQuery.TermData termData) throws java.io.IOException
Collect the term stats across all segments.
- Parameters:
termStatsMap - input map of already collected PhraseWildcardQuery.TermStats.
termsData - receives the TermStatistics computed for all PhraseWildcardQuery.TermStats.
termData - receives all the collected Term.
- Throws:
java.io.IOException
-
checkTermsHavePositions
protected void checkTermsHavePositions(Terms terms)
-
-