Class Lucene99ScalarQuantizedVectorsFormat


  • public class Lucene99ScalarQuantizedVectorsFormat
    extends FlatVectorsFormat
    Format supporting vector quantization, storage, and retrieval
    • Field Detail

      • QUANTIZED_VECTOR_COMPONENT

        public static final java.lang.String QUANTIZED_VECTOR_COMPONENT
        See Also:
        Constant Field Values
      • VECTOR_DATA_CODEC_NAME

        static final java.lang.String VECTOR_DATA_CODEC_NAME
        See Also:
        Constant Field Values
      • VECTOR_DATA_EXTENSION

        static final java.lang.String VECTOR_DATA_EXTENSION
        See Also:
        Constant Field Values
      • MINIMUM_CONFIDENCE_INTERVAL

        private static final float MINIMUM_CONFIDENCE_INTERVAL
        The minimum confidence interval
        See Also:
        Constant Field Values
      • MAXIMUM_CONFIDENCE_INTERVAL

        private static final float MAXIMUM_CONFIDENCE_INTERVAL
        The maximum confidence interval
        See Also:
        Constant Field Values
      • DYNAMIC_CONFIDENCE_INTERVAL

        public static final float DYNAMIC_CONFIDENCE_INTERVAL
        Confidence interval value that requests dynamically determined quantiles (see the `confidenceInterval` constructor parameter)
        See Also:
        Constant Field Values
      • confidenceInterval

        final java.lang.Float confidenceInterval
        Controls the confidence interval used to scalar quantize the vectors. The default value is calculated as `1-1/(vector_dimensions + 1)`.
      • bits

        final byte bits
      • compress

        final boolean compress
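
        The default described for `confidenceInterval` can be illustrated with a small standalone sketch. It only evaluates the documented formula `1-1/(vector_dimensions + 1)`; the class and method names (`ConfidenceIntervalSketch`, `defaultConfidenceInterval`) are hypothetical and are not part of Lucene, and the input check is an assumption added for the example.

```java
// Illustrative only: evaluates the documented default formula, not Lucene's internal code.
final class ConfidenceIntervalSketch {

    // Hypothetical helper: returns 1 - 1/(vectorDimensions + 1) as a float.
    static float defaultConfidenceInterval(int vectorDimensions) {
        if (vectorDimensions <= 0) {
            // Assumption for this sketch: a vector has at least one dimension.
            throw new IllegalArgumentException("vectorDimensions must be positive: " + vectorDimensions);
        }
        return 1f - 1f / (vectorDimensions + 1);
    }

    public static void main(String[] args) {
        // The interval approaches 1 as the dimension count grows.
        for (int dims : new int[] {3, 128, 768}) {
            System.out.println(dims + " dimensions -> " + defaultConfidenceInterval(dims));
        }
    }
}
```

        With this formula, higher-dimensional vectors get a default interval closer to 1, so a smaller fraction of component values falls outside the quantization range.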
    • Constructor Detail

      • Lucene99ScalarQuantizedVectorsFormat

        public Lucene99ScalarQuantizedVectorsFormat()
        Constructs a format using the default quantization parameters.
      • Lucene99ScalarQuantizedVectorsFormat

        public Lucene99ScalarQuantizedVectorsFormat(java.lang.Float confidenceInterval,
                                                    int bits,
                                                    boolean compress)
        Constructs a format using the given quantization parameters.
        Parameters:
        confidenceInterval - the confidence interval used to scalar quantize the vectors. When `null`, it is calculated from the vector dimension. When `0`, the quantiles are determined dynamically by sampling many confidence intervals and selecting the most accurate pair.
        bits - the number of bits to use for scalar quantization (must be between 1 and 8, inclusive)
        compress - whether to compress the vectors. If true, vectors quantized with 4 or fewer bits are compressed into a single byte. If false, the vectors are stored as is. This is a trade-off between memory usage and speed.
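
        To make the `compress` trade-off concrete, the sketch below packs pairs of 4-bit values into single bytes and unpacks them again. It rests on an assumption: that "compressed into a single byte" means two quantized values share one byte. The class `NibblePackingSketch`, its methods, and the high-nibble-first layout are hypothetical and do not describe Lucene's actual on-disk format.

```java
// Illustrative only: one possible way to store two 4-bit values per byte.
final class NibblePackingSketch {

    // Packs values in the range 0..15 two per byte, first value in the high nibble.
    static byte[] pack(byte[] values) {
        byte[] packed = new byte[(values.length + 1) / 2];
        for (int i = 0; i < values.length; i++) {
            int v = values[i];
            if (v < 0 || v > 15) {
                throw new IllegalArgumentException("value out of 4-bit range: " + v);
            }
            if ((i & 1) == 0) {
                packed[i / 2] |= (byte) (v << 4); // even index: high nibble
            } else {
                packed[i / 2] |= (byte) v;        // odd index: low nibble
            }
        }
        return packed;
    }

    // Reverses pack(); length is the original number of values.
    static byte[] unpack(byte[] packed, int length) {
        byte[] values = new byte[length];
        for (int i = 0; i < length; i++) {
            int b = packed[i / 2] & 0xFF; // treat the byte as unsigned
            values[i] = (byte) ((i & 1) == 0 ? b >>> 4 : b & 0x0F);
        }
        return values;
    }

    public static void main(String[] args) {
        byte[] quantized = {1, 2, 3, 15};
        byte[] packed = pack(quantized);
        System.out.println("stored bytes: " + packed.length + " instead of " + quantized.length);
        System.out.println(java.util.Arrays.toString(unpack(packed, quantized.length)));
    }
}
```

        In this sketch the packed form needs about half the bytes, while every read pays for a shift and a mask, which is the kind of memory-versus-speed trade-off the parameter description refers to.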