bullet-db
diff --git a/docs/about/releases.md b/docs/about/releases.md
Lines changed: 2 additions & 0 deletions
diff --git a/docs/quick-start.md b/docs/quick-start.md
Lines changed: 5 additions & 5 deletions
diff --git a/docs/ui/usage.md b/docs/ui/usage.md
Lines changed: 89 additions & 10 deletions
diff --git a/docs/video/approx-count-distinct-2.mp4 b/docs/video/approx-count-distinct-2.mp4
-787 KB
diff --git a/docs/video/approx-count-distinct-3.mp4 b/docs/video/approx-count-distinct-3.mp4
659 KB
diff --git a/docs/video/approx-quantile.mp4 b/docs/video/approx-quantile.mp4
438 KB
diff --git a/docs/video/approx-top-k.mp4 b/docs/video/approx-top-k.mp4
354 KB
diff --git a/docs/video/exact-count-distinct-2.mp4 b/docs/video/exact-count-distinct-2.mp4
-251 KB
diff --git a/docs/video/exact-count-distinct-3.mp4 b/docs/video/exact-count-distinct-3.mp4
526 KB
diff --git a/docs/video/exact-top-k.mp4 b/docs/video/exact-top-k.mp4
292 KB
@@ -67,6 +67,8 @@ The Bullet UI that lets you build, run, save and visualize results from Bullet.
 
 |    Date      |                                      Release                                           | Highlights |
 | ------------ | -------------------------------------------------------------------------------------- | ---------- |
+| 2016-05-03   | [**0.2.2**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.2) | Fixes maxlength of the input for points |
+| 2016-05-02   | [**0.2.1**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.1) | Fixes a bug with a dependency that broke sorting the Filters |
 | 2016-05-01   | [**0.2.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.0) | Release for Top K and Distribution. Supports Bullet Storm 0.4.2+ |
 | 2016-02-21   | [**0.1.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.1.0) | The first release with support for all features included in Bullet Storm 0.2.1+ |
 
 
@@ -6,7 +6,7 @@ By the following the steps in this section, you will:
 
   * Setup the Bullet topology using a custom spout on [bullet-storm-0.4.2](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.2)
   * Setup the [Web Service](ws/setup.md) talking to the topology and serving a schema for your UI using [bullet-service-0.0.1](https://github.com/yahoo/bullet-service/releases/tag/bullet-service-0.0.1)
-  * Setup the [UI](ui/setup.md) talking to the Web Service using [bullet-ui-0.2.0](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.0)
+  * Setup the [UI](ui/setup.md) talking to the Web Service using [bullet-ui-0.2.2](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.2)
 
 **Prerequisites**
 
@@ -128,7 +128,7 @@ cp $BULLET_EXAMPLES/storm/* $BULLET_HOME/backend/storm
 
     ```bullet.query.aggregation.top.k.sketch.entries: 1024``` 0.75 times this number is the number of unique items for which counts can be done exactly. Approximates after.
 
-    ```bullet.query.aggregation.distribution.max.points: 100``` The maximum number of points you can generate, use or provide for a Distribution aggregation.
+    ```bullet.query.aggregation.distribution.max.points: 200``` The maximum number of points you can generate, use or provide for a Distribution aggregation.
 
 !!! note "Want to tweak the example topology code?"
 
@@ -207,8 +207,8 @@ nvm use v6.9.4
 
 ```bash
 cd $BULLET_HOME/ui
-curl -LO https://github.com/yahoo/bullet-ui/releases/download/v0.1.0/bullet-ui-v0.1.0.tar.gz
-tar -xzf bullet-ui-v0.1.0.tar.gz
+curl -LO https://github.com/yahoo/bullet-ui/releases/download/v0.2.2/bullet-ui-v0.2.2.tar.gz
+tar -xzf bullet-ui-v0.2.2.tar.gz
 cp $BULLET_EXAMPLES/ui/env-settings.json config/
 ```
 
@@ -231,7 +231,7 @@ Visit [http://localhost:8800](http://localhost:8800) to query your topology with
 If you were following the [Quicker Start](#quicker-start) or if you don't want to manually bring down everything, you can run:
 
 ```bash
-curl -sLo- https://raw.githubusercontent.com/yahoo/bullet-docs/v0.2.0/examples/install-all.sh | bash -s cleanup
+curl -sLo- https://raw.githubusercontent.com/yahoo/bullet-docs/v0.3.0/examples/install-all.sh | bash -s cleanup
 ```
 
 If you were performing the steps yourself, you can also manually cleanup **all the components and all the downloads** using:
 
@@ -15,7 +15,7 @@ The schema you [plug into the UI](setup.md#configuration) is shown here so the u
 **Example: The landing and schema pages**
 
 <video controls autoplay loop>
-  <source src="../../video/schema.mp4" type="video/mp4">
+  <source src="../../video/schema-2.mp4" type="video/mp4">
   Your browser does not support the video tag.
 </video>
 
@@ -36,11 +36,11 @@ You can also download the results in JSON, CSV or flattened CSV (fields inside m
 **Example: Picking a random record from the stream**
 
 <video controls autoplay loop>
-  <source src="../../video/first-query.mp4" type="video/mp4">
+  <source src="../../video/first-query-2.mp4" type="video/mp4">
   Your browser does not support the video tag.
 </video>
 
-!!! note "__receive_timestamp"
+!!! note "receive_timestamp"
 
     This was enabled as part of the configuration for the example backend. This was the timestamp when Bullet first saw this record. If you have timestamps in your data (as this example does), you will be able to tell exactly when your data was received by Bullet. This coupled with the timestamps in the Result Metadata for when your query was submitted and terminated, you will be able to tell why or why not a particular record was or was not seen in Bullet.
 
@@ -53,7 +53,7 @@ The Output Data section lets you aggregate or choose to see raw data records. Yo
 **Example: Finding and picking out fields from events that have probability > 0.5**
 
 <video controls autoplay loop>
-  <source src="../../video/filter-project.mp4" type="video/mp4">
+  <source src="../../video/filter-project-2.mp4" type="video/mp4">
   Your browser does not support the video tag
 </video>
 
@@ -70,7 +70,7 @@ The querybuilder is also type aware. The operations you can perform change based
 **Example: Finding and picking out the first and second events in each period that also have probability > 0.5**
 
 <video controls autoplay loop>
-  <source src="../../video/query-building.mp4" type="video/mp4">
+  <source src="../../video/query-building-2.mp4" type="video/mp4">
   Your browser does not support the video tag.
 </video>
 
@@ -89,7 +89,7 @@ You can also optionally rename the result.
 **Example: Counting unique UUIDs for 20s**
 
 <video controls autoplay loop>
-  <source src="../../video/exact-count-distinct-2.mp4" type="video/mp4">
+  <source src="../../video/exact-count-distinct-3.mp4" type="video/mp4">
   Your browser does not support the video tag.
 </video>
 
@@ -112,13 +112,13 @@ When the result is approximate, it is shown as a decimal value. The Result Metad
 **Example: Counting unique UUIDs for 200s**
 
 <video controls autoplay loop>
-  <source src="../../video/approx-count-distinct-2.mp4" type="video/mp4">
+  <source src="../../video/approx-count-distinct-3.mp4" type="video/mp4">
   Your browser does not support the video tag.
 </video>
 
 !!! note "So why is the approximate count what it is?"
 
-    The backend should have produced ```20 * 200000/101``` or ```39603``` tuples with unique uuids. Due to the synthetic nature of the data generation and the building delays mentioned above, we estimated that we should subtract about 20 tuples for every 10 s the query runs. Since this query ran for ```200 s```, this makes the actual uuids generated to be at best ```39603 - (200/10) * 20``` or ```39203```. The result from Bullet was ```38886```, which is an error of ```~0.8 %```. The real error is probably about a *third* of that because we assumed the delay between periods to be 1 ms. It is more on the order of 2 or 3 ms, which makes the number of uuids actually generated even less.
+    The backend should have produced ```20 * 200000/101``` or ```39603``` tuples with unique uuids. Due to the synthetic nature of the data generation and the building delays mentioned above, we estimated that we should subtract about 20 tuples for every 10 s the query runs. Since this query ran for ```200 s```, the number of uuids actually generated is at best ```39603 - (200/10) * 20``` or ```39203```. The result from Bullet was ```39069```, which is an error of ```~0.3 %```. The real error is probably less than that because we assumed the delay between periods to be 1 ms to get the ```39203``` number. The delay is probably slightly larger, which makes the number of uuids actually generated lower and closer to Bullet's result.
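
The arithmetic in this note can be checked with a short sketch. The variable names are illustrative and not part of Bullet; the numbers are the ones quoted in the note.

```python
# Numbers quoted in the note; variable names are illustrative only.
expected = 20 * 200000 // 101            # tuples with unique uuids, rounded down
adjusted = expected - (200 // 10) * 20   # subtract ~20 tuples per 10 s of a 200 s query
observed = 39069                         # count reported by Bullet

relative_error = (adjusted - observed) / adjusted
print(expected, adjusted, f"{relative_error:.2%}")  # 39603 39203 0.34%
```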
 
 ##  Group all
 
@@ -129,7 +129,7 @@ When choosing the Grouped Data option, you can choose to add fields to group by.
 The metrics you apply on fields are all numeric presently. If you apply a metric on a non-numeric field, Bullet will try to **type-cast** your field into number and if it's not possible, the result will be ```null```. The result will also be ```null``` if the field was not present or no data matched your filters.
 
 <video controls autoplay loop>
-  <source src="../../video/group-all-error-duplicating-2.mp4" type="video/mp4">
+  <source src="../../video/group-all-error-2.mp4" type="video/mp4">
   Your browser does not support the video tag.
 </video>
 
@@ -156,7 +156,7 @@ In this example, we group by ```tuple_number```. Recall that this is the number
 
 !!! note "What happens if I group by uuid?"
 
-    Try it out! Nothing bad should happen. If the number of unique group values exceeds the [maximum configured](../quick-start.md#setting-up-the-example-bullet-topology) (we used 1024 for this example), you will receive a *uniform sample* across your unique group values. The results for your metrics however, are **not sampled**. It is the groups that are sampled on. This means that is **no** guarantee of order if you were expecting the *most popular* groups or similar. We are working on adding a ```TOP K``` query that can support these kinds of use-cases.
+    Try it out! Nothing bad should happen. If the number of unique group values exceeds the [maximum configured](../quick-start.md#setting-up-the-example-bullet-topology) (we used 1024 for this example), you will receive a *uniform sample* across your unique group values. The results for your metrics, however, are **not sampled**. It is the groups that are sampled. This means that there is **no** guarantee of order if you were expecting the *most popular* groups or similar. You should use the Top K query in that scenario.
 
 !!! note "Why no Count Distinct after Grouping"
 
@@ -167,3 +167,82 @@ In this example, we group by ```tuple_number```. Recall that this is the number
     Good job, eagle eyes! Unfortunately, whenever we group on fields, those fields become strings under the current implementation. Rather than convert them back at the end, we have currently decided to leave it as is. This means that in your results, if you try and sort by a grouped field, it will perform a lexicographical sort even if it was originally a number.
 
     However, this also means that you can actually group by any field - including non primitives such as maps and lists! The field will be converted to a string and that string will be used as the field's representation for uniqueness and grouping purposes.
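
A minimal sketch of what the lexicographical sort means in practice. The values below are made up, and converting back to numbers on the client side is only one possible workaround, not something the UI does for you.

```python
# Hypothetical grouped values as they would appear in a result: strings, not numbers.
tuple_numbers = ["10", "9", "2", "100"]

lexicographic = sorted(tuple_numbers)      # compares character by character
numeric = sorted(tuple_numbers, key=int)   # convert back to numbers before comparing

print(lexicographic)  # ['10', '100', '2', '9']
print(numeric)        # ['2', '9', '10', '100']
```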
+
+## Distributions
+
+In this example, we find distributions of the ```duration```  field. This field is generated randomly from 0 to 10,049, with a tendency to have values that are closer to 0 than 10,049. Let's see if this is true. Note that since this field has random values, the results you see per query are the values generated during that query's duration.
+
+The distribution type of output data requires you to pick a type of distribution: ```Quantiles```, ```Frequencies``` or ```Cumulative Frequencies```. ```Quantiles``` lets you get various percentiles (e.g. 25th, 99th) of your numeric field. ```Frequencies``` lets you break up the range of values of your field into intervals and get a count of how many values fell into each interval. ```Cumulative Frequencies``` does the same as ```Frequencies``` but each interval includes the counts of all the intervals prior to it. Both ```Frequencies``` and ```Cumulative Frequencies``` also give you a probability of how likely a value is to fall into the interval.
+
+All the distributions require you to specify some numeric points. For ```Quantiles```, these points are between 0 and 1 and each value denotes the percentile you are looking for (0.25 for the 25th percentile, 0.99 for the 99th, etc.). For ```Frequencies``` and ```Cumulative Frequencies```, the points are between the minimum and maximum value of your field and every 2 contiguous points create an interval. However, the first interval always runs from *-&infin;* to the first point and the last interval always runs from your last point to *+&infin;*.
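
The way points turn into intervals for ```Frequencies``` and ```Cumulative Frequencies``` can be sketched as follows. This is an illustration, not Bullet's implementation: the function names and sample values are made up, and the choice of which side of an interval is inclusive is an assumption.

```python
import math
from bisect import bisect_right
from itertools import accumulate


def make_intervals(points):
    """Build (low, high) intervals from points, open-ended at both extremes."""
    edges = [-math.inf] + sorted(points) + [math.inf]
    return list(zip(edges[:-1], edges[1:]))


def frequencies(values, points):
    """Count how many values fall into each interval.

    Assumption: a value equal to a point is counted in the interval that
    starts at that point. Bullet's own boundary rule may differ.
    """
    pts = sorted(points)
    counts = [0] * (len(pts) + 1)
    for value in values:
        counts[bisect_right(pts, value)] += 1
    return counts


durations = [5, 50, 500, 5000, 9000]   # made-up sample values
points = [100, 1000]                   # made-up points

intervals = make_intervals(points)
counts = frequencies(durations, points)
cumulative_counts = list(accumulate(counts))          # each interval includes all prior ones
probabilities = [c / len(durations) for c in counts]  # share of values per interval

for interval, count, running in zip(intervals, counts, cumulative_counts):
    print(interval, count, running)
```

Two points produce three intervals, because the open-ended intervals at both extremes are always added.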
+
+You can read much more about this in the UI help by clicking the ```Need more help?``` link.
+
+### Exact
+
+**Example: Finding the various percentiles of duration**
+
+This example shows all 3 ways of specifying points and shows *exact* distribution results for the ```duration``` field.
+
+<video controls autoplay loop>
+  <source src="../../video/quantiles-all-point-formats.mp4" type="video/mp4">
+  Your browser does not support the video tag.
+</video>
+
+---
+
+**Example: Finding some frequency counts of duration values in an interval**
+
+The last example showed that the 90th percentile of ```duration``` was around 4000. This example gets some frequencies in various intervals.
+
+<video controls autoplay loop>
+  <source src="../../video/frequency-distribution.mp4" type="video/mp4">
+  Your browser does not support the video tag.
+</video>
+
+Try out ```Cumulative Frequencies``` yourself to see what it does!
+
+### Approximate
+
+This next example shows how an approximate distribution result looks.
+
+**Example: Approximate quantile distribution**
+
+<video controls autoplay loop>
+  <source src="../../video/approx-quantile.mp4" type="video/mp4">
+  Your browser does not support the video tag.
+</video>
+
+!!! note "Normalized Rank Error"
+
+    To understand what this means, refer to the [explanation here](../ws/examples.md#normalized-rank-error). You can also refer to the help in the Result Metadata section.
+
+!!! note "Wouldn't it be nice to graph these?"
+
+    This is in the works! We plan to add pivoting and graphing as a general option in the results pages. Feel free to follow [the issue here](https://github.com/yahoo/bullet-ui/issues/24).
+
+## Top K
+
+Top K lets you get the most *frequent items* or the *heavy hitters* for the values in a set of fields.
+
+### Exact
+
+This example gets the Top 3 most popular ```type``` values (there are only 6 but this illustrates the idea).
+
+<video controls autoplay loop>
+  <source src="../../video/exact-top-k.mp4" type="video/mp4">
+  Your browser does not support the video tag.
+</video>
+
+### Approximate
+
+By adding ```duration``` into the fields, the number of unique values for ```(type, duration)``` is increased. However, because ```duration``` has a tendency to have low values, we will have some *frequent items*. The counts are now estimated. We ask for the top 300 results but we also say that they should have a count of at least 20. This restricts the overall number of results to 12.
+
+<video controls autoplay loop>
+  <source src="../../video/approx-top-k.mp4" type="video/mp4">
+  Your browser does not support the video tag.
+</video>
+
+!!! note "Maximum Count Error"
+
+    The ```maximum_count_error``` value for the query above was ```3```. This means that the difference between the upper bound and the lower bound of each count estimate is ```3```. Bullet returns the upper bound as the estimate so subtracting ```3``` from each count gives you the lower bound of the count. Note that some counts are closer to each other than the count error. For instance, ```(quux, 1)``` and ```(baz, 0)``` have counts ```67``` and ```66``` but their true counts could be from ```64 to 67``` and ```63 to 66``` respectively. This means that ```(baz, 0)``` could well be the most frequent item for this query.
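
The bounds in this note can be reproduced with a small sketch. The dictionary layout and the helper name are illustrative; only the counts and the error value come from the note.

```python
maximum_count_error = 3
# Estimates returned by Bullet are upper bounds (values from the note).
estimates = {("quux", 1): 67, ("baz", 0): 66}

# The lower bound is the estimate minus the maximum count error.
bounds = {item: (count - maximum_count_error, count) for item, count in estimates.items()}


def could_outrank(a, b):
    """True if the true count of a could exceed the true count of b."""
    return bounds[a][1] > bounds[b][0]


print(bounds)                                   # {('quux', 1): (64, 67), ('baz', 0): (63, 66)}
print(could_outrank(("baz", 0), ("quux", 1)))   # True
```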