Releases: apache/datasketches-java

0.13.3 May 8, 2019: Fix Theta Direct Union Bug

09 May 00:01

This release fixes a nasty bug that occurred when merging estimating sketches into a Direct Union.
Three lines of code were accidentally deleted between 0.13.0 and 0.13.1.
This bug existed only in releases 0.13.1 and 0.13.2.

0.13.2 Apr 25, 2019: Frequent Distinct Tuples Sketch

26 Apr 02:21

0.13.1 Apr 2, 2019: Fix Direct DoublesUnion Quantiles Bug

02 Apr 20:00

  • Bug fix for Quantiles Sketches

    • Environment: Using DoublesUnion in Direct (off-heap) mode.
    • Symptom 1: quantiles are out-of-order: q(0.99) < q(0.98)
    • Symptom 2: garbage values amongst otherwise normal quantile values: q(0.99) = 100, q(0.98) = 1E100, q(0.97) = 90.
  • Bug fix for Theta Sketches

    • Environment: using Union in Direct (off-heap) mode
    • Symptom: getEstimate() returns NaN.
      It requires an unusual set of circumstances to actually observe this.
  • Logic change for Theta Sketches

    • Empty sketches do not affect unions and can be ignored
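A minimal sketch of a caller-side sanity check for the two quantiles symptoms above. It assumes the quantile values have already been retrieved into an array ordered by ascending rank; the class and method names (`QuantileSanityCheck`, `firstViolation`) are hypothetical and not part of the library.

```java
import java.util.OptionalInt;

// Hypothetical helper, not part of the library: scans quantile values that
// were queried at ascending ranks and reports the first suspicious entry.
final class QuantileSanityCheck {
    private QuantileSanityCheck() {}

    /**
     * Returns the index of the first value that is NaN or smaller than its
     * predecessor, or an empty OptionalInt if the array is non-decreasing.
     */
    static OptionalInt firstViolation(double[] quantiles) {
        for (int i = 0; i < quantiles.length; i++) {
            if (Double.isNaN(quantiles[i])) {
                return OptionalInt.of(i);
            }
            if (i > 0 && quantiles[i] < quantiles[i - 1]) {
                return OptionalInt.of(i);
            }
        }
        return OptionalInt.empty();
    }
}
```

With the values from Symptom 2 in rank order (90, 1E100, 100), the check flags the entry that follows the garbage value, because 100 is smaller than 1E100. A garbage value that happens to preserve ordering would not be caught by this check.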

0.13.0 Mar 14, 2019: Added new CPC Sketch

14 Mar 21:04

  • Added new CPC Sketch. This new sketch has better accuracy per unit of stored space than the HLL sketch.
  • Added a high-performance thread-safe version of Theta UpdateSketch for use in applications that require very high throughput
  • Added API calls for easier understanding of error in the Frequent Items sketches
  • Added more general ceiling and floor powers of X functions to sketches.Util.*
  • Optimized serialization of single item KLL sketches
  • Minor changes to HLL API: getIterator() is renamed to iterator(), following the Java convention.
  • Added xxHash() and faster version of MurmurHash3 (v2).
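To illustrate what "ceiling and floor powers of X" means, here is a minimal standalone sketch. The class and method names (`PowerBounds`, `ceilingPowerOf`, `floorPowerOf`) are hypothetical and are not the signatures in sketches.Util; the sketch also assumes a base greater than 1 and an input of at least 1.

```java
// Hypothetical illustration, not the library's implementation.
final class PowerBounds {
    private PowerBounds() {}

    /** Smallest base^k (integer k >= 0) that is >= n. */
    static double ceilingPowerOf(double base, double n) {
        checkArgs(base, n);
        double p = 1.0;
        while (p < n) {
            p *= base;
        }
        return p;
    }

    /** Largest base^k (integer k >= 0) that is <= n. */
    static double floorPowerOf(double base, double n) {
        checkArgs(base, n);
        double p = 1.0;
        while (p * base <= n) {
            p *= base;
        }
        return p;
    }

    // Assumption of this sketch: base > 1 and n >= 1, so the loops terminate
    // and a non-negative exponent always exists.
    private static void checkArgs(double base, double n) {
        if (!(base > 1.0) || !(n >= 1.0)) {
            throw new IllegalArgumentException("requires base > 1 and n >= 1");
        }
    }
}
```

For example, with base 2 and n = 10 the ceiling power is 16 and the floor power is 8; when n is itself an exact power of the base, both functions return n.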

0.12.0 Aug 7, 2018: Update POM to Memory 0.12.0, improves performance.

08 Aug 01:42

  • Updated to Memory 0.12.0, which will improve performance
  • Fixed handling of min and max values in KLL sketch merge
  • Minor API changes

0.11.1 Apr 20, 2018: Quantiles, KLL, Tuple, Fixes & Improvements

20 Apr 22:22

  • Quantiles sketch
    • fixed issue #195
    • added DoublesUnion.heapify() and DoublesUnion.wrap() methods
    • deprecated DoublesUnionBuilder.heapify() and DoublesUnionBuilder.wrap() methods
  • KLL sketch
    • methods to obtain rank error for both single-sided and double-sided queries
    • methods to compute parameter k given a target rank error
    • Javadoc improvements
  • Tuple sketch
    • added Filter

0.11.0 Mar 15, 2018: KLL quantiles sketch, tuple sketch API change and more

16 Mar 02:20

  • New KLL sketch: KllFloatsSketch:
    • This is a new quantiles sketch with better accuracy per stored bit than the original quantiles DoublesSketch. If you choose K for the KLL sketch to match the accuracy of a DoublesSketch, K will be larger, but the required space will be much smaller. This sketch is specifically tuned for the smallest possible space usage (near the theoretical optimum) and uses floats rather than doubles. On update, the new KLL sketch is a little faster than the original DoublesSketch, but it may be slower on merge. Unlike the DoublesSketch, the KLL sketch currently has no generic version and no off-heap capability. Refer to the Javadocs for a link to the KLL theoretical paper.
  • Tuple:
    • generic sketch API change
      • removed the convention that required static methods with a certain signature; these methods are now based on a more visible API
      • added SummaryDeserializer
      • The need to serialize factories has been removed
      • removed getSummaries() method - use iterator instead
  • Theta:
    • added new SingleItemSketch - fast way to create sketches with a single input item
  • Original quantiles sketch enhancements:
    • added getRank() - faster than getCDF() with one split point
    • empty sketch returns null from getQuantiles(), getPMF() and getCDF()
    • empty sketch returns NaN from getQuantile(), getMinValue() and getMaxValue()
    • added Kolmogorov-Smirnov statistic between two quantiles sketches
    • fixed sorting using comparator in generic ItemsSketch

0.10.3 Oct 26, 2017: Theta backward compatibility

27 Oct 00:21

Theta sketch: As part of the resize factor serialization fix in version 0.10.2, a validation check was added that made it impossible to deserialize an UpdateSketch or Union serialized using sketches-core-0.8.4 and above. This release addresses the issue.

0.10.2 Oct 20, 2017: Theta, HLL bug fixes

20 Oct 22:17

  • Theta:
    • Fixed bug in HeapUpdateSketch.toByteArray() that did not set the resize factor
    • Added getFamily() to all Set Operations. Any user-defined subclasses of SetOperations will need to implement this method.
  • HLL:
    • Fixed HLL Union conversion to HLL_4 bug
    • Made isSameResource() public

0.10.1 Sep 7, 2017: HLL Sketch Extended for Off-Heap Operation

08 Sep 01:41

  • This release extends the prior HLL release 0.10.0 to also allow the HLL sketch to operate off-heap, leveraging the new Memory package (located in the DataSketches/Memory repository). This capability is critical for large systems that must manage millions of sketches as updatable fields located in off-heap (native) memory. The other sketches in the library that support off-heap operation are the Theta sketch and the Quantiles sketch.