Class ScalarQuantizer


  • public class ScalarQuantizer
    extends java.lang.Object
    Quantizes float vectors into `int8` byte values. This is a lossy transformation. Scalar quantization first calculates the quantiles of the float vector values, using the configured confidence interval. The resulting [minQuantile, maxQuantile] range is then used to scale the values into the range [0, 127], and each scaled value is bucketed into the nearest byte value.

    How Scalar Quantization Works

    The basic mathematical equations behind this are fairly straightforward and are based on min/max normalization. Given a float vector `v` and a confidence interval `q`, we can calculate the quantiles of the vector values, [minQuantile, maxQuantile]. The conversions between float and byte values are then:

       byte = (float - minQuantile) * 127/(maxQuantile - minQuantile)
       float = (maxQuantile - minQuantile)/127 * byte + minQuantile
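
    The two conversions above can be sketched in Java as follows. This is a minimal illustration, not the code of this class: the class and method names are hypothetical, and clamping values that fall outside [minQuantile, maxQuantile] is an assumption made here so that outliers stay within [0, 127].

```java
// Illustrative sketch only; names are hypothetical and this is not the Lucene implementation.
final class ScalarQuantizationSketch {

    // byte = (float - minQuantile) * 127 / (maxQuantile - minQuantile), rounded to the nearest byte.
    static byte quantize(float value, float minQuantile, float maxQuantile) {
        // Assumption: values outside the quantile range are clamped to its boundaries.
        float clamped = Math.max(minQuantile, Math.min(maxQuantile, value));
        float scale = 127f / (maxQuantile - minQuantile);
        return (byte) Math.round((clamped - minQuantile) * scale);
    }

    // float = (maxQuantile - minQuantile) / 127 * byte + minQuantile
    static float dequantize(byte quantized, float minQuantile, float maxQuantile) {
        float alpha = (maxQuantile - minQuantile) / 127f;
        return alpha * quantized + minQuantile;
    }

    public static void main(String[] args) {
        float min = -1f;
        float max = 1f;
        float original = 0.3f;
        byte quantized = quantize(original, min, max);
        float restored = dequantize(quantized, min, max);
        // The restored value differs from the original by at most half a bucket width.
        System.out.println(original + " -> " + quantized + " -> " + restored);
    }
}
```

    Because the transformation is lossy, the round trip recovers the original only to within half of one bucket, that is `(maxQuantile - minQuantile) / 127 / 2`, for values inside the quantile range.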
     

    Consequently, to multiply two float values together (e.g. as part of a dot_product), we can do the following:

       float1 * float2 ~= (byte1 * (maxQuantile - minQuantile)/127 + minQuantile) * (byte2 * (maxQuantile - minQuantile)/127 + minQuantile)
       float1 * float2 ~= (byte1 * byte2 * (maxQuantile - minQuantile)^2)/(127^2) + (byte1 * minQuantile * (maxQuantile - minQuantile)/127) + (byte2 * minQuantile * (maxQuantile - minQuantile)/127) + minQuantile^2
       let alpha = (maxQuantile - minQuantile)/127
       float1 * float2 ~= (byte1 * byte2 * alpha^2) + (byte1 * minQuantile * alpha) + (byte2 * minQuantile * alpha) + minQuantile^2
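
    Summed over all dimensions, the expansion above gives an approximate dot product that needs only the integer dot product of the bytes plus the sum of each byte vector. The sketch below illustrates this under stated assumptions: the class and method names are hypothetical, both vectors are assumed to share the same quantiles, and it is not presented as the scoring code used by this class.

```java
// Illustrative sketch only; names are hypothetical and this is not the Lucene implementation.
final class QuantizedDotProductSketch {

    // Approximates the float dot product of two vectors from their quantized bytes.
    // Assumption: both vectors were quantized with the same minQuantile and maxQuantile.
    static float approximateDotProduct(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        float alpha = (maxQuantile - minQuantile) / 127f;
        int byteDot = 0; // sum of byte1 * byte2
        int sumA = 0;    // sum of byte1
        int sumB = 0;    // sum of byte2
        for (int i = 0; i < a.length; i++) {
            byteDot += a[i] * b[i];
            sumA += a[i];
            sumB += b[i];
        }
        // Per dimension: byte1*byte2*alpha^2 + byte1*minQuantile*alpha + byte2*minQuantile*alpha + minQuantile^2
        return alpha * alpha * byteDot
            + alpha * minQuantile * sumA
            + alpha * minQuantile * sumB
            + a.length * minQuantile * minQuantile;
    }

    public static void main(String[] args) {
        byte[] a = {0, 127};
        byte[] b = {127, 127};
        // With quantiles [-1, 1] these bytes stand for (-1, 1) and (1, 1), whose dot product is 0.
        System.out.println(approximateDotProduct(a, b, -1f, 1f));
    }
}
```

    Only the `byte1 * byte2` term depends on both vectors at once; the remaining terms depend on a single vector or on the quantiles alone, so they could in principle be computed once per vector rather than once per comparison.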
     

    The expansion for square distance is much simpler:

      square_distance = (float1 - float2)^2
      (float1 - float2)^2 ~= (byte1 * alpha + minQuantile - byte2 * alpha - minQuantile)^2
      = (alpha*byte1 + minQuantile)^2 + (alpha*byte2 + minQuantile)^2 - 2*(alpha*byte1 + minQuantile)(alpha*byte2 + minQuantile)
      this can be simplified to:
      = alpha^2 (byte1 - byte2)^2
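
    Since the minQuantile terms cancel, the square distance over a whole vector reduces to the integer square distance of the bytes scaled by alpha^2. A minimal sketch, assuming hypothetical class and method names and that both vectors share the same quantiles:

```java
// Illustrative sketch only; names are hypothetical and this is not the Lucene implementation.
final class QuantizedSquareDistanceSketch {

    // Approximates the float square distance from quantized bytes: alpha^2 * sum((byte1 - byte2)^2).
    // Assumption: both vectors were quantized with the same minQuantile and maxQuantile.
    static float approximateSquareDistance(byte[] a, byte[] b, float minQuantile, float maxQuantile) {
        if (a.length != b.length) {
            throw new IllegalArgumentException("vector dimensions differ");
        }
        float alpha = (maxQuantile - minQuantile) / 127f;
        int byteSquareDistance = 0;
        for (int i = 0; i < a.length; i++) {
            int diff = a[i] - b[i];
            byteSquareDistance += diff * diff;
        }
        return alpha * alpha * byteSquareDistance;
    }

    public static void main(String[] args) {
        byte[] a = {0};
        byte[] b = {127};
        // With quantiles [-1, 1] these bytes stand for -1 and 1, whose square distance is 4.
        System.out.println(approximateSquareDistance(a, b, -1f, 1f));
    }
}
```

    Unlike the dot product, no per-vector term survives the simplification, so this form needs nothing beyond the bytes and alpha.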
     
    • Field Detail

      • SCALAR_QUANTIZATION_SAMPLE_SIZE

        public static final int SCALAR_QUANTIZATION_SAMPLE_SIZE
        See Also:
        Constant Field Values
      • alpha

        private final float alpha
      • scale

        private final float scale
      • bits

        private final byte bits
      • minQuantile

        private final float minQuantile
      • maxQuantile

        private final float maxQuantile
      • random

        private static final java.util.Random random
    • Constructor Detail

      • ScalarQuantizer

        public ScalarQuantizer(float minQuantile,
                               float maxQuantile,
                               byte bits)
        Parameters:
        minQuantile - the lower quantile of the distribution
        maxQuantile - the upper quantile of the distribution
        bits - the number of bits to use for quantization
    • Method Detail

      • quantize

        public float quantize(float[] src,
                              byte[] dest,
                              VectorSimilarityFunction similarityFunction)
        Quantize a float vector into a byte vector
        Parameters:
        src - the source vector
        dest - the destination vector
        similarityFunction - the similarity function used to calculate the quantile
        Returns:
        the corrective offset that needs to be applied to the score
      • quantizeFloat

        private float quantizeFloat(float v,
                                    byte[] dest,
                                    int destIndex)
      • recalculateCorrectiveOffset

        public float recalculateCorrectiveOffset(byte[] quantizedVector,
                                                 ScalarQuantizer oldQuantizer,
                                                 VectorSimilarityFunction similarityFunction)
        Recalculate the old score corrective value given the new, current quantiles
        Parameters:
        quantizedVector - the old vector
        oldQuantizer - the old quantizer
        similarityFunction - the similarity function used to calculate the quantile
        Returns:
        the new offset
      • deQuantize

        void deQuantize(byte[] src,
                        float[] dest)
        Dequantize a byte vector into a float vector
        Parameters:
        src - the source vector
        dest - the destination vector
      • getLowerQuantile

        public float getLowerQuantile()
      • getUpperQuantile

        public float getUpperQuantile()
      • getConstantMultiplier

        public float getConstantMultiplier()
      • getBits

        public byte getBits()
      • toString

        public java.lang.String toString()
        Overrides:
        toString in class java.lang.Object
      • reservoirSampleIndices

        private static int[] reservoirSampleIndices(int numFloatVecs,
                                                    int sampleSize)
      • fromVectors

        public static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues,
                                                  float confidenceInterval,
                                                  int totalVectorCount,
                                                  byte bits)
                                           throws java.io.IOException
        Reads the float vector values and calculates the quantiles. If the number of float vectors is less than SCALAR_QUANTIZATION_SAMPLE_SIZE, all the values are read and the quantiles are calculated from them. If the number of float vectors is greater than SCALAR_QUANTIZATION_SAMPLE_SIZE, a random sample of SCALAR_QUANTIZATION_SAMPLE_SIZE vectors is read and the quantiles are calculated from that sample.
        Parameters:
        floatVectorValues - the float vector values from which to calculate the quantiles
        confidenceInterval - the confidence interval used to calculate the quantiles
        totalVectorCount - the total number of live float vectors in the index. This is needed to account for deleted documents when calculating the quantiles.
        bits - the number of bits to use for quantization
        Returns:
        A new ScalarQuantizer instance
        Throws:
        java.io.IOException - if there is an error reading the float vector values
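
        The random sampling described above can be illustrated with classic reservoir sampling over vector indices. This is only a sketch of that general technique: the class name, method name, and the explicit `Random` parameter are hypothetical, and nothing here is claimed to match the private `reservoirSampleIndices` method of this class.

```java
// Illustrative sketch only; names are hypothetical and this is not the Lucene implementation.
final class ReservoirSamplingSketch {

    // Picks min(total, sampleSize) distinct indices from [0, total) uniformly at random
    // (reservoir sampling, "Algorithm R") and returns them in ascending order.
    static int[] sampleIndices(int total, int sampleSize, java.util.Random random) {
        int k = Math.min(total, sampleSize);
        int[] reservoir = new int[k];
        // Fill the reservoir with the first k indices.
        for (int i = 0; i < k; i++) {
            reservoir[i] = i;
        }
        // Each later index i replaces a random slot with probability k / (i + 1).
        for (int i = k; i < total; i++) {
            int j = random.nextInt(i + 1);
            if (j < k) {
                reservoir[j] = i;
            }
        }
        // Ascending order lets the caller read the sampled vectors in a single forward pass.
        java.util.Arrays.sort(reservoir);
        return reservoir;
    }

    public static void main(String[] args) {
        int[] sample = sampleIndices(1000, 10, new java.util.Random(42));
        System.out.println(java.util.Arrays.toString(sample));
    }
}
```

        When the total is not larger than the sample size, the sketch returns every index, which corresponds to the case where all the values are read.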
      • fromVectors

        static ScalarQuantizer fromVectors(FloatVectorValues floatVectorValues,
                                           float confidenceInterval,
                                           int totalVectorCount,
                                           byte bits,
                                           int quantizationSampleSize)
                                    throws java.io.IOException
        Throws:
        java.io.IOException
      • extractQuantiles

        private static void extractQuantiles(float[] confidenceIntervals,
                                             float[] quantileGatheringScratch,
                                             double[] upperSum,
                                             double[] lowerSum)
      • gatherSample

        private static void gatherSample(FloatVectorValues floatVectorValues,
                                         float[] quantileGatheringScratch,
                                         java.util.List<float[]> sampledDocs,
                                         int i)
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • findNearestNeighbors

        private static java.util.List<ScalarQuantizer.ScoreDocsAndScoreVariance> findNearestNeighbors(java.util.List<float[]> vectors,
                                                                                                      VectorSimilarityFunction similarityFunction)
        Parameters:
        vectors - The vectors among which to find the nearest neighbors of each vector
        similarityFunction - The similarity function to use
        Returns:
        The top 10 nearest neighbors for each vector from the vectors list
      • getUpperAndLowerQuantile

        static float[] getUpperAndLowerQuantile(float[] arr,
                                                float confidenceInterval)
        Takes an array of floats, sorted or not, and returns a minimum and maximum value. The minimum resides on the `(1 - confidenceInterval)/2` percentile and the maximum on the `1 - (1 - confidenceInterval)/2` percentile. Example: providing floats `[0..100]` and asking for a confidence interval of `0.9` (that is, 90%) will return `5` and `95`.
        Parameters:
        arr - array of floats
        confidenceInterval - the configured confidence interval
        Returns:
        lower and upper quantile values
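
        The `[0..100]` example above can be reproduced with a simple sort-based sketch. It is a minimal illustration under stated assumptions: the class and method names are hypothetical, sorting a copy is a simplification chosen for clarity, and rounding to the nearest index is an assumption rather than a documented detail of this method.

```java
// Illustrative sketch only; names are hypothetical and this is not the Lucene implementation.
final class QuantileBoundsSketch {

    // Returns {lower, upper} such that roughly a confidenceInterval fraction of the
    // values lies between them, trimming (1 - confidenceInterval) / 2 from each tail.
    static float[] quantileBounds(float[] values, float confidenceInterval) {
        if (values.length == 0) {
            throw new IllegalArgumentException("values must not be empty");
        }
        // Sort a copy so the caller's array is left untouched.
        float[] sorted = values.clone();
        java.util.Arrays.sort(sorted);
        // Assumption: the number of values trimmed per tail is rounded to the nearest integer.
        int lowerIndex = (int) (sorted.length * (1.0 - confidenceInterval) / 2.0 + 0.5);
        // Keep the lower index from crossing the middle of the array.
        lowerIndex = Math.min(lowerIndex, (sorted.length - 1) / 2);
        int upperIndex = sorted.length - 1 - lowerIndex;
        return new float[] {sorted[lowerIndex], sorted[upperIndex]};
    }

    public static void main(String[] args) {
        float[] values = new float[101];
        for (int i = 0; i <= 100; i++) {
            values[i] = i;
        }
        float[] bounds = quantileBounds(values, 0.9f);
        System.out.println(bounds[0] + " " + bounds[1]); // prints "5.0 95.0"
    }
}
```

        A full sort costs O(n log n); a selection algorithm could find the two order statistics in linear time, which matters when the sample is large.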