Module org.apache.lucene.core
Class Lucene99ScalarQuantizedVectorsWriter
- java.lang.Object
-
- org.apache.lucene.codecs.hnsw.FlatVectorsWriter
-
- org.apache.lucene.codecs.lucene99.Lucene99ScalarQuantizedVectorsWriter
-
- All Implemented Interfaces:
java.io.Closeable, java.lang.AutoCloseable, Accountable
public final class Lucene99ScalarQuantizedVectorsWriter extends FlatVectorsWriter
Writes quantized vector values and metadata to index segments.
-
-
Nested Class Summary
Nested Classes
Modifier and Type, Class, Description
(package private) static class
Lucene99ScalarQuantizedVectorsWriter.FieldWriter
(package private) static class
Lucene99ScalarQuantizedVectorsWriter.FloatVectorWrapper
(package private) static class
Lucene99ScalarQuantizedVectorsWriter.MergedQuantizedVectorValues
Returns a merged view over all the segments' QuantizedByteVectorValues.
(package private) static class
Lucene99ScalarQuantizedVectorsWriter.OffsetCorrectedQuantizedByteVectorValues
(package private) static class
Lucene99ScalarQuantizedVectorsWriter.QuantizedByteVectorValueSub
(package private) static class
Lucene99ScalarQuantizedVectorsWriter.QuantizedFloatVectorValues
(package private) static class
Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier
-
Field Summary
Fields
Modifier and Type, Field, Description
private byte
bits
private boolean
compress
private java.lang.Float
confidenceInterval
private java.util.List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter>
fields
private boolean
finished
private IndexOutput
meta
private static float
QUANTILE_RECOMPUTE_LIMIT
private IndexOutput
quantizedVectorData
private FlatVectorsWriter
rawVectorDelegate
private static float
REQUANTIZATION_LIMIT
private SegmentWriteState
segmentWriteState
private static long
SHALLOW_RAM_BYTES_USED
private int
version
-
Fields inherited from class org.apache.lucene.codecs.hnsw.FlatVectorsWriter
vectorsScorer
-
Fields inherited from interface org.apache.lucene.util.Accountable
NULL_ACCOUNTABLE
-
-
Constructor Summary
Constructors
Modifier, Constructor, Description
private
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, int version, java.lang.Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, java.lang.Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, java.lang.Float confidenceInterval, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer)
-
Method Summary
Modifier and Type, Method, Description
FlatFieldVectorsWriter<?>
addField(FieldInfo fieldInfo, KnnFieldVectorsWriter<?> indexWriter)
Add a new field for indexing, allowing the user to provide a writer that the flat vectors writer can delegate to if additional indexing logic is required.
(package private) static ScalarQuantizer
buildScalarQuantizer(FloatVectorValues floatVectorValues, int numVectors, VectorSimilarityFunction vectorSimilarityFunction, java.lang.Float confidenceInterval, byte bits)
void
close()
void
finish()
Called once at the end, before close.
void
flush(int maxDoc, Sorter.DocMap sortMap)
Flush all buffered data to disk.
private static QuantizedVectorsReader
getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, java.lang.String fieldName)
private static ScalarQuantizer
getQuantizedState(KnnVectorsReader vectorsReader, java.lang.String fieldName)
static ScalarQuantizer
mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, java.lang.Float confidenceInterval, byte bits)
Merges the quantiles of the segments and recalculates the quantiles if necessary.
void
mergeOneField(FieldInfo fieldInfo, MergeState mergeState)
Write the field for merging.
CloseableRandomVectorScorerSupplier
mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState)
Write the field for merging, providing a scorer over the newly merged flat vectors.
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier
mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState)
(package private) static ScalarQuantizer
mergeQuantiles(java.util.List<ScalarQuantizer> quantizationStates, IntArrayList segmentSizes, byte bits)
long
ramBytesUsed()
Return the memory usage of this object in bytes.
(package private) static boolean
shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, java.util.List<ScalarQuantizer> quantizationStates)
Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
(package private) static boolean
shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state.
private void
writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc)
private void
writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, java.lang.Float confidenceInterval, byte bits, boolean compress, java.lang.Float lowerQuantile, java.lang.Float upperQuantile, DocsWithFieldSet docsWithField)
static DocsWithFieldSet
writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues, byte bits, boolean compress)
Writes the vector values to the output and returns the set of documents that contain vectors.
private void
writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData)
private void
writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap)
private void
writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap)
-
Methods inherited from class org.apache.lucene.codecs.hnsw.FlatVectorsWriter
getFlatVectorScorer
-
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
-
Methods inherited from interface org.apache.lucene.util.Accountable
getChildResources
-
-
-
-
Field Detail
-
SHALLOW_RAM_BYTES_USED
private static final long SHALLOW_RAM_BYTES_USED
-
QUANTILE_RECOMPUTE_LIMIT
private static final float QUANTILE_RECOMPUTE_LIMIT
- See Also:
- Constant Field Values
-
REQUANTIZATION_LIMIT
private static final float REQUANTIZATION_LIMIT
- See Also:
- Constant Field Values
-
segmentWriteState
private final SegmentWriteState segmentWriteState
-
fields
private final java.util.List<Lucene99ScalarQuantizedVectorsWriter.FieldWriter> fields
-
meta
private final IndexOutput meta
-
quantizedVectorData
private final IndexOutput quantizedVectorData
-
confidenceInterval
private final java.lang.Float confidenceInterval
-
rawVectorDelegate
private final FlatVectorsWriter rawVectorDelegate
-
bits
private final byte bits
-
compress
private final boolean compress
-
version
private final int version
-
finished
private boolean finished
-
-
Constructor Detail
-
Lucene99ScalarQuantizedVectorsWriter
public Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, java.lang.Float confidenceInterval, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws java.io.IOException
- Throws:
java.io.IOException
-
Lucene99ScalarQuantizedVectorsWriter
public Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, java.lang.Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws java.io.IOException
- Throws:
java.io.IOException
-
Lucene99ScalarQuantizedVectorsWriter
private Lucene99ScalarQuantizedVectorsWriter(SegmentWriteState state, int version, java.lang.Float confidenceInterval, byte bits, boolean compress, FlatVectorsWriter rawVectorDelegate, FlatVectorsScorer scorer) throws java.io.IOException
- Throws:
java.io.IOException
-
-
Method Detail
-
addField
public FlatFieldVectorsWriter<?> addField(FieldInfo fieldInfo, KnnFieldVectorsWriter<?> indexWriter) throws java.io.IOException
Description copied from class: FlatVectorsWriter
Add a new field for indexing, allowing the user to provide a writer that the flat vectors writer can delegate to if additional indexing logic is required.
- Specified by:
addField
in class FlatVectorsWriter
- Parameters:
fieldInfo
- fieldInfo of the field to add
indexWriter
- the writer to delegate to, can be null
- Returns:
- a writer for the field
- Throws:
java.io.IOException
- if an I/O error occurs when adding the field
-
mergeOneField
public void mergeOneField(FieldInfo fieldInfo, MergeState mergeState) throws java.io.IOException
Description copied from class: FlatVectorsWriter
Write the field for merging.
- Overrides:
mergeOneField
in class FlatVectorsWriter
- Throws:
java.io.IOException
-
mergeOneFieldToIndex
public CloseableRandomVectorScorerSupplier mergeOneFieldToIndex(FieldInfo fieldInfo, MergeState mergeState) throws java.io.IOException
Description copied from class: FlatVectorsWriter
Write the field for merging, providing a scorer over the newly merged flat vectors. This way any additional merging logic can be implemented by the user of this class.
- Specified by:
mergeOneFieldToIndex
in class FlatVectorsWriter
- Parameters:
fieldInfo
- fieldInfo of the field to merge
mergeState
- mergeState of the segments to merge
- Returns:
- a scorer over the newly merged flat vectors, which should be closed as it holds a temporary file handle to read over the newly merged vectors
- Throws:
java.io.IOException
- if an I/O error occurs when merging
-
flush
public void flush(int maxDoc, Sorter.DocMap sortMap) throws java.io.IOException
Description copied from class: FlatVectorsWriter
Flush all buffered data to disk.
- Specified by:
flush
in class FlatVectorsWriter
- Throws:
java.io.IOException
-
finish
public void finish() throws java.io.IOException
Description copied from class: FlatVectorsWriter
Called once at the end, before close.
- Specified by:
finish
in class FlatVectorsWriter
- Throws:
java.io.IOException
-
ramBytesUsed
public long ramBytesUsed()
Description copied from interface: Accountable
Return the memory usage of this object in bytes. Negative values are illegal.
-
writeField
private void writeField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc) throws java.io.IOException
- Throws:
java.io.IOException
-
writeMeta
private void writeMeta(FieldInfo field, int maxDoc, long vectorDataOffset, long vectorDataLength, java.lang.Float confidenceInterval, byte bits, boolean compress, java.lang.Float lowerQuantile, java.lang.Float upperQuantile, DocsWithFieldSet docsWithField) throws java.io.IOException
- Throws:
java.io.IOException
-
writeQuantizedVectors
private void writeQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData) throws java.io.IOException
- Throws:
java.io.IOException
-
writeSortingField
private void writeSortingField(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int maxDoc, Sorter.DocMap sortMap) throws java.io.IOException
- Throws:
java.io.IOException
-
writeSortedQuantizedVectors
private void writeSortedQuantizedVectors(Lucene99ScalarQuantizedVectorsWriter.FieldWriter fieldData, int[] ordMap) throws java.io.IOException
- Throws:
java.io.IOException
-
mergeOneFieldToIndex
private Lucene99ScalarQuantizedVectorsWriter.ScalarQuantizedCloseableRandomVectorScorerSupplier mergeOneFieldToIndex(SegmentWriteState segmentWriteState, FieldInfo fieldInfo, MergeState mergeState, ScalarQuantizer mergedQuantizationState) throws java.io.IOException
- Throws:
java.io.IOException
-
mergeQuantiles
static ScalarQuantizer mergeQuantiles(java.util.List<ScalarQuantizer> quantizationStates, IntArrayList segmentSizes, byte bits)
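This page gives no description for mergeQuantiles. As a rough illustration only, one plausible way to combine per-segment quantiles is an average weighted by segment size. The sketch below is standalone and does not use Lucene's classes: the class name, the `mergeWeighted` method, and the `{lower, upper}` array layout are all hypothetical, and the actual implementation may differ.

```java
public class MergeQuantilesSketch {

    /**
     * Hypothetical helper: merges per-segment {lower, upper} quantile pairs
     * into one pair, weighting each segment by its vector count.
     * Returns null when there is nothing meaningful to merge.
     */
    static float[] mergeWeighted(float[][] segmentQuantiles, int[] segmentSizes) {
        if (segmentQuantiles.length == 0 || segmentQuantiles.length != segmentSizes.length) {
            return null;
        }
        double lower = 0;
        double upper = 0;
        long total = 0;
        for (int i = 0; i < segmentQuantiles.length; i++) {
            // Larger segments pull the merged quantiles toward their own values.
            lower += (double) segmentQuantiles[i][0] * segmentSizes[i];
            upper += (double) segmentQuantiles[i][1] * segmentSizes[i];
            total += segmentSizes[i];
        }
        if (total == 0) {
            return null;
        }
        return new float[] {(float) (lower / total), (float) (upper / total)};
    }

    public static void main(String[] args) {
        float[] merged = mergeWeighted(
            new float[][] {{-1.0f, 1.0f}, {-3.0f, 3.0f}},
            new int[] {300, 100});
        System.out.println(merged[0] + " " + merged[1]); // prints "-1.5 1.5"
    }
}
```

A weighted average is cheap because it needs only the stored quantiles and segment sizes, not the raw vectors; the drift checks documented below exist for the cases where such an approximation is not good enough.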
-
shouldRecomputeQuantiles
static boolean shouldRecomputeQuantiles(ScalarQuantizer mergedQuantizationState, java.util.List<ScalarQuantizer> quantizationStates)
Returns true if the quantiles of the merged state are too far from the quantiles of the individual states.
- Parameters:
mergedQuantizationState
- The merged quantization state
quantizationStates
- The quantization states of the individual segments
- Returns:
- true if the quantiles should be recomputed
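The description does not say how far is "too far". A minimal sketch of one such check, assuming the tolerance is a fixed fraction of the merged quantile range: the divisor `RECOMPUTE_DIVISOR`, the class and method names, and the `{lower, upper}` array layout are hypothetical, and the value of QUANTILE_RECOMPUTE_LIMIT is not given on this page.

```java
public class RecomputeCheckSketch {

    // Hypothetical: a segment may deviate by at most 1/32 of the merged range.
    static final float RECOMPUTE_DIVISOR = 32f;

    /**
     * Returns true if any segment's lower or upper quantile differs from the
     * merged value by more than the tolerance. Pairs are {lower, upper}.
     */
    static boolean tooFar(float[] merged, float[][] segments) {
        float limit = (merged[1] - merged[0]) / RECOMPUTE_DIVISOR;
        for (float[] segment : segments) {
            if (Math.abs(segment[0] - merged[0]) > limit
                || Math.abs(segment[1] - merged[1]) > limit) {
                return true;
            }
        }
        return false;
    }
}
```

Scaling the tolerance by the merged range keeps the check independent of the magnitude of the vector components.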
-
getQuantizedKnnVectorsReader
private static QuantizedVectorsReader getQuantizedKnnVectorsReader(KnnVectorsReader vectorsReader, java.lang.String fieldName)
-
getQuantizedState
private static ScalarQuantizer getQuantizedState(KnnVectorsReader vectorsReader, java.lang.String fieldName)
-
mergeAndRecalculateQuantiles
public static ScalarQuantizer mergeAndRecalculateQuantiles(MergeState mergeState, FieldInfo fieldInfo, java.lang.Float confidenceInterval, byte bits) throws java.io.IOException
Merges the quantiles of the segments and recalculates the quantiles if necessary.
- Parameters:
mergeState
- The merge state
fieldInfo
- The field info
confidenceInterval
- The confidence interval
bits
- The number of bits
- Returns:
- The merged quantiles
- Throws:
java.io.IOException
- If there is a low-level I/O error
-
buildScalarQuantizer
static ScalarQuantizer buildScalarQuantizer(FloatVectorValues floatVectorValues, int numVectors, VectorSimilarityFunction vectorSimilarityFunction, java.lang.Float confidenceInterval, byte bits) throws java.io.IOException
- Throws:
java.io.IOException
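buildScalarQuantizer carries no description here. For orientation, scalar quantization in general maps each float component onto a small integer range bounded by a lower and an upper quantile, clamping outliers. The sketch below shows that general idea only; it is not Lucene's ScalarQuantizer, and the class name, method name, and the restriction to at most 7 bits (so results fit a signed byte) are assumptions made for the example.

```java
public class ScalarQuantizeSketch {

    /**
     * Hypothetical helper: maps each component of the vector to an integer in
     * [0, 2^bits - 1], where lower maps to 0 and upper maps to the maximum.
     * Components outside [lower, upper] are clamped first.
     */
    static byte[] quantize(float[] vector, float lower, float upper, int bits) {
        if (bits < 1 || bits > 7) {
            throw new IllegalArgumentException("this sketch supports 1 to 7 bits");
        }
        if (upper <= lower) {
            throw new IllegalArgumentException("upper must be greater than lower");
        }
        int maxBucket = (1 << bits) - 1;
        float scale = maxBucket / (upper - lower);
        byte[] quantized = new byte[vector.length];
        for (int i = 0; i < vector.length; i++) {
            float clamped = Math.max(lower, Math.min(upper, vector[i]));
            quantized[i] = (byte) Math.round((clamped - lower) * scale);
        }
        return quantized;
    }
}
```

Because the mapping depends directly on the two quantiles, a change in either one can move a float into a different bucket, which is why the merge path documented on this page checks how far the quantiles have drifted.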
-
shouldRequantize
static boolean shouldRequantize(ScalarQuantizer existingQuantiles, ScalarQuantizer newQuantiles)
Returns true if the quantiles of the new quantization state are too far from the quantiles of the existing quantization state. This would imply that floating point values would slightly shift quantization buckets.
- Parameters:
existingQuantiles
- The existing quantiles for a segment
newQuantiles
- The new quantiles for a segment, could be merged, or fully re-calculated
- Returns:
- true if the floating point values should be requantized
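Since the concern is values shifting between quantization buckets, a natural tolerance is some fraction of one bucket's width. The sketch below assumes exactly that; `REQUANTIZE_FRACTION`, `BUCKETS`, the class and method names, and the `{lower, upper}` array layout are hypothetical, and the value of REQUANTIZATION_LIMIT is not given on this page.

```java
public class RequantizeCheckSketch {

    // Hypothetical: tolerate drift of up to 20% of one bucket's width.
    static final float REQUANTIZE_FRACTION = 0.2f;
    // Hypothetical bucket count used to estimate the bucket width.
    static final float BUCKETS = 128f;

    /**
     * Returns true if either quantile moved by more than the tolerance,
     * measured against the new quantile range. Pairs are {lower, upper}.
     */
    static boolean shouldRequantize(float[] existing, float[] updated) {
        float tolerance = REQUANTIZE_FRACTION * (updated[1] - updated[0]) / BUCKETS;
        return Math.abs(existing[1] - updated[1]) > tolerance
            || Math.abs(existing[0] - updated[0]) > tolerance;
    }
}
```

When a check like this returns false, the bytes already stored for a segment can plausibly be reused as they are; when it returns true, the original floats need to be quantized again against the new quantiles.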
-
writeQuantizedVectorData
public static DocsWithFieldSet writeQuantizedVectorData(IndexOutput output, QuantizedByteVectorValues quantizedByteVectorValues, byte bits, boolean compress) throws java.io.IOException
Writes the vector values to the output and returns the set of documents that contain vectors.
- Throws:
java.io.IOException
-
close
public void close() throws java.io.IOException
- Throws:
java.io.IOException
-
-