Commit a9412dc

Adding UI usage videos
1 parent 0366420 commit a9412dc

23 files changed

+96
-15
lines changed

docs/about/releases.md

Lines changed: 2 additions & 0 deletions
@@ -67,6 +67,8 @@ The Bullet UI that lets you build, run, save and visualize results from Bullet.
6767

6868
| Date | Release | Highlights |
6969
| ------------ | -------------------------------------------------------------------------------------- | ---------- |
70+
| 2016-05-03 | [**0.2.2**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.2) | Fixes maxlength of the input for points |
71+
| 2016-05-02 | [**0.2.1**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.1) | Fixes a bug with a dependency that broke sorting the Filters |
7072
| 2016-05-01 | [**0.2.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.0) | Release for Top K and Distribution. Supports Bullet Storm 0.4.2+ |
7173
| 2016-02-21 | [**0.1.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.1.0) | The first release with support for all features included in Bullet Storm 0.2.1+ |
7274

docs/quick-start.md

Lines changed: 5 additions & 5 deletions
@@ -6,7 +6,7 @@ By the following the steps in this section, you will:
66

77
* Setup the Bullet topology using a custom spout on [bullet-storm-0.4.2](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.2)
88
* Setup the [Web Service](ws/setup.md) talking to the topology and serving a schema for your UI using [bullet-service-0.0.1](https://github.com/yahoo/bullet-service/releases/tag/bullet-service-0.0.1)
9-
* Setup the [UI](ui/setup.md) talking to the Web Service using [bullet-ui-0.2.0](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.0)
9+
* Setup the [UI](ui/setup.md) talking to the Web Service using [bullet-ui-0.2.2](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.2)
1010

1111
**Prerequisites**
1212

@@ -128,7 +128,7 @@ cp $BULLET_EXAMPLES/storm/* $BULLET_HOME/backend/storm
128128

129129
```bullet.query.aggregation.top.k.sketch.entries: 1024``` 0.75 times this number is the number of unique items for which counts can be done exactly. Approximates after.
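As a quick check of the 0.75 rule above, the sketch below works out how many unique items can be counted exactly for a given setting. The function and constant names are hypothetical and only illustrate the arithmetic; they are not part of Bullet.

```python
# Illustrative arithmetic only; these names are not Bullet APIs.
EXACT_FRACTION = 0.75  # fraction of sketch entries that can be counted exactly


def exact_capacity(sketch_entries: int) -> int:
    """Number of unique items that can be counted exactly before approximation starts."""
    return int(sketch_entries * EXACT_FRACTION)


# bullet.query.aggregation.top.k.sketch.entries: 1024
print(exact_capacity(1024))  # 768
```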
130130

131-
```bullet.query.aggregation.distribution.max.points: 100``` The maximum number of points you can generate, use or provide for a Distribution aggregation.
131+
```bullet.query.aggregation.distribution.max.points: 200``` The maximum number of points you can generate, use or provide for a Distribution aggregation.
132132

133133
!!! note "Want to tweak the example topology code?"
134134

@@ -207,8 +207,8 @@ nvm use v6.9.4
207207

208208
```bash
209209
cd $BULLET_HOME/ui
210-
curl -LO https://github.com/yahoo/bullet-ui/releases/download/v0.1.0/bullet-ui-v0.1.0.tar.gz
211-
tar -xzf bullet-ui-v0.1.0.tar.gz
210+
curl -LO https://github.com/yahoo/bullet-ui/releases/download/v0.2.2/bullet-ui-v0.2.2.tar.gz
211+
tar -xzf bullet-ui-v0.2.2.tar.gz
212212
cp $BULLET_EXAMPLES/ui/env-settings.json config/
213213
```
214214

@@ -231,7 +231,7 @@ Visit [http://localhost:8800](http://localhost:8800) to query your topology with
231231
If you were following the [Quicker Start](#quicker-start) or if you don't want to manually bring down everything, you can run:
232232

233233
```bash
234-
curl -sLo- https://raw.githubusercontent.com/yahoo/bullet-docs/v0.2.0/examples/install-all.sh | bash -s cleanup
234+
curl -sLo- https://raw.githubusercontent.com/yahoo/bullet-docs/v0.3.0/examples/install-all.sh | bash -s cleanup
235235
```
236236

237237
If you were performing the steps yourself, you can also manually cleanup **all the components and all the downloads** using:

docs/ui/usage.md

Lines changed: 89 additions & 10 deletions
@@ -15,7 +15,7 @@ The schema you [plug into the UI](setup.md#configuration) is shown here so the u
1515
**Example: The landing and schema pages**
1616

1717
<video controls autoplay loop>
18-
<source src="../../video/schema.mp4" type="video/mp4">
18+
<source src="../../video/schema-2.mp4" type="video/mp4">
1919
Your browser does not support the video tag.
2020
</video>
2121

@@ -36,11 +36,11 @@ You can also download the results in JSON, CSV or flattened CSV (fields inside m
3636
**Example: Picking a random record from the stream**
3737

3838
<video controls autoplay loop>
39-
<source src="../../video/first-query.mp4" type="video/mp4">
39+
<source src="../../video/first-query-2.mp4" type="video/mp4">
4040
Your browser does not support the video tag.
4141
</video>
4242

43-
!!! note "__receive_timestamp"
43+
!!! note "receive_timestamp"
4444

4545
This was enabled as part of the configuration for the example backend. This is the timestamp when Bullet first saw this record. If you have timestamps in your data (as this example does), you will be able to tell exactly when your data was received by Bullet. Coupled with the timestamps in the Result Metadata for when your query was submitted and terminated, this lets you tell why a particular record was or was not seen in Bullet.
4646

@@ -53,7 +53,7 @@ The Output Data section lets you aggregate or choose to see raw data records. Yo
5353
**Example: Finding and picking out fields from events that have probability > 0.5**
5454

5555
<video controls autoplay loop>
56-
<source src="../../video/filter-project.mp4" type="video/mp4">
56+
<source src="../../video/filter-project-2.mp4" type="video/mp4">
5757
Your browser does not support the video tag
5858
</video>
5959

@@ -70,7 +70,7 @@ The querybuilder is also type aware. The operations you can perform change based
7070
**Example: Finding and picking out the first and second events in each period that also have probability > 0.5**
7171

7272
<video controls autoplay loop>
73-
<source src="../../video/query-building.mp4" type="video/mp4">
73+
<source src="../../video/query-building-2.mp4" type="video/mp4">
7474
Your browser does not support the video tag.
7575
</video>
7676

@@ -89,7 +89,7 @@ You can also optionally rename the result.
8989
**Example: Counting unique UUIDs for 20s**
9090

9191
<video controls autoplay loop>
92-
<source src="../../video/exact-count-distinct-2.mp4" type="video/mp4">
92+
<source src="../../video/exact-count-distinct-3.mp4" type="video/mp4">
9393
Your browser does not support the video tag.
9494
</video>
9595

@@ -112,13 +112,13 @@ When the result is approximate, it is shown as a decimal value. The Result Metad
112112
**Example: Counting unique UUIDs for 200s**
113113

114114
<video controls autoplay loop>
115-
<source src="../../video/approx-count-distinct-2.mp4" type="video/mp4">
115+
<source src="../../video/approx-count-distinct-3.mp4" type="video/mp4">
116116
Your browser does not support the video tag.
117117
</video>
118118

119119
!!! note "So why is the approximate count what it is?"
120120

121-
The backend should have produced ```20 * 200000/101``` or ```39603``` tuples with unique uuids. Due to the synthetic nature of the data generation and the building delays mentioned above, we estimated that we should subtract about 20 tuples for every 10 s the query runs. Since this query ran for ```200 s```, this makes the actual uuids generated to be at best ```39603 - (200/10) * 20``` or ```39203```. The result from Bullet was ```38886```, which is an error of ```~0.8 %```. The real error is probably about a *third* of that because we assumed the delay between periods to be 1 ms. It is more on the order of 2 or 3 ms, which makes the number of uuids actually generated even less.
121+
The backend should have produced ```20 * 200000/101``` or ```39603``` tuples with unique uuids. Due to the synthetic nature of the data generation and the building delays mentioned above, we estimated that we should subtract about 20 tuples for every 10 s the query runs. Since this query ran for ```200 s```, this puts the number of uuids actually generated at ```39603 - (200/10) * 20``` or ```39203``` at best. The result from Bullet was ```39069```, which is an error of ```~0.3 %```. The real error is probably less than that because we assumed the delay between periods to be 1 ms to get the ```39203``` number. The delay is probably slightly larger, which makes the number of uuids actually generated lower and closer to the result from Bullet.
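The arithmetic in this note can be reproduced with a few lines of Python. This is only a sketch of the calculation described above; the variable names are made up for illustration and the 20-tuples-per-10-s correction is the estimate stated in the note.

```python
# Reproduces the back-of-the-envelope estimate from the note above.
QUERY_DURATION_S = 200

# 20 * 200000 / 101, truncated to whole tuples
expected_uuids = 20 * 200000 // 101

# Subtract about 20 tuples for every 10 s the query ran
adjusted_uuids = expected_uuids - (QUERY_DURATION_S // 10) * 20

# The approximate count that Bullet returned for this query
bullet_result = 39069

error_pct = (adjusted_uuids - bullet_result) / adjusted_uuids * 100

print(expected_uuids, adjusted_uuids, round(error_pct, 1))
```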
122122

123123
## Group all
124124

@@ -129,7 +129,7 @@ When choosing the Grouped Data option, you can choose to add fields to group by.
129129
The metrics you apply on fields are all numeric presently. If you apply a metric on a non-numeric field, Bullet will try to **type-cast** your field into number and if it's not possible, the result will be ```null```. The result will also be ```null``` if the field was not present or no data matched your filters.
130130

131131
<video controls autoplay loop>
132-
<source src="../../video/group-all-error-duplicating-2.mp4" type="video/mp4">
132+
<source src="../../video/group-all-error-2.mp4" type="video/mp4">
133133
Your browser does not support the video tag.
134134
</video>
135135

@@ -156,7 +156,7 @@ In this example, we group by ```tuple_number```. Recall that this is the number
156156

157157
!!! note "What happens if I group by uuid?"
158158

159-
Try it out! Nothing bad should happen. If the number of unique group values exceeds the [maximum configured](../quick-start.md#setting-up-the-example-bullet-topology) (we used 1024 for this example), you will receive a *uniform sample* across your unique group values. The results for your metrics however, are **not sampled**. It is the groups that are sampled on. This means that is **no** guarantee of order if you were expecting the *most popular* groups or similar. We are working on adding a ```TOP K``` query that can support these kinds of use-cases.
159+
Try it out! Nothing bad should happen. If the number of unique group values exceeds the [maximum configured](../quick-start.md#setting-up-the-example-bullet-topology) (we used 1024 for this example), you will receive a *uniform sample* across your unique group values. The results for your metrics, however, are **not sampled**. It is the groups that are sampled. This means that there is **no** guarantee of order if you were expecting the *most popular* groups or similar. You should use the Top K query in that scenario.
160160

161161
!!! note "Why no Count Distinct after Grouping"
162162

@@ -167,3 +167,82 @@ In this example, we group by ```tuple_number```. Recall that this is the number
167167
Good job, eagle eyes! Unfortunately, whenever we group on fields, those fields become strings under the current implementation. Rather than convert them back at the end, we have currently decided to leave it as is. This means that in your results, if you try and sort by a grouped field, it will perform a lexicographical sort even if it was originally a number.
168168

169169
However, this also means that you can actually group by any field - including non primitives such as maps and lists! The field will be converted to a string and that string will be used as the field's representation for uniqueness and grouping purposes.
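To see why a lexicographical sort on a grouped numeric field can look surprising, consider this small Python sketch. The values are made up for illustration; the point is only that strings compare character by character, so you need to convert back to numbers if you want numeric order.

```python
# Grouped fields come back as strings, so a plain sort compares them
# character by character rather than by numeric value.
grouped_values = ["10", "9", "100", "2"]

lexicographic = sorted(grouped_values)
numeric = sorted(grouped_values, key=int)  # convert back to numbers to sort numerically

print(lexicographic)  # ['10', '100', '2', '9']
print(numeric)        # ['2', '9', '10', '100']
```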
170+
171+
## Distributions
172+
173+
In this example, we find distributions of the ```duration``` field. This field is generated randomly from 0 to 10,049, with a tendency to have values that are closer to 0 than 10,049. Let's see if this is true. Note that since this field has random values, the results you see per query are the values generated during that query's duration.
174+
175+
The distribution type of output data requires you to pick a type of distribution: ```Quantiles```, ```Frequencies``` or ```Cumulative Frequencies```. ```Quantiles``` lets you get various percentiles (e.g. 25th, 99th) of your numeric field. ```Frequencies``` lets you break up the range of values of your field into intervals and get a count of how many values fell into each interval. ```Cumulative Frequencies``` does the same as ```Frequencies``` but each interval includes the counts of all the intervals prior to it. Both ```Frequencies``` and ```Cumulative Frequencies``` also give you a probability of how likely a value is to fall into the interval.
176+
177+
All the distributions require you to specify some numeric points. For ```Quantiles```, these points are between 0 and 1 and the value denotes the percentile you are looking for (0.25 for the 25th percentile, 0.99 for the 99th, etc.). For ```Frequencies``` and ```Cumulative Frequencies```, the points are between the minimum and maximum value of your field and every 2 contiguous points create an interval. However, the first interval always extends from *-&infin;* to the first point and the last interval always extends from your last point to *+&infin;*.
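The way points turn into intervals can be sketched in plain Python. This is not how Bullet computes distributions; it is an exact, in-memory illustration with made-up sample values, and it assumes each interval includes its start point and excludes its end point.

```python
from bisect import bisect_right
from itertools import accumulate


def frequencies(values, points):
    """Count values per interval: (-inf, p0), [p0, p1), ..., [p_last, +inf)."""
    counts = [0] * (len(points) + 1)  # n points create n + 1 intervals
    for value in values:
        counts[bisect_right(points, value)] += 1
    return counts


def cumulative_frequencies(counts):
    """Each interval also includes the counts of all intervals before it."""
    return list(accumulate(counts))


def probabilities(counts):
    """Share of all values that fell into each interval."""
    total = sum(counts)
    return [count / total for count in counts] if total else [0.0] * len(counts)


durations = [5, 120, 480, 950, 2300, 4100, 9800]  # made-up sample values
points = [500, 1000, 5000]                        # sorted, ascending

counts = frequencies(durations, points)
print(counts)                          # [3, 1, 2, 1]
print(cumulative_frequencies(counts))  # [3, 4, 6, 7]
print(probabilities(counts))
```

Three points produce four intervals here because of the two open-ended intervals at either end.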
178+
179+
You can read much more about this in the UI help by clicking the ```Need more help?``` link.
180+
181+
### Exact
182+
183+
**Example: Finding the various percentiles of duration**
184+
185+
This example shows all 3 ways of specifying points and shows *exact* distribution results for the ```duration``` field.
186+
187+
<video controls autoplay loop>
188+
<source src="../../video/quantiles-all-point-formats.mp4" type="video/mp4">
189+
Your browser does not support the video tag.
190+
</video>
191+
192+
---
193+
194+
**Example: Finding some frequency counts of duration values in an interval**
195+
196+
The last example showed that the 90th percentile of ```duration``` was around 4000. This example gets some frequencies in various intervals.
197+
198+
<video controls autoplay loop>
199+
<source src="../../video/frequency-distribution.mp4" type="video/mp4">
200+
Your browser does not support the video tag.
201+
</video>
202+
203+
Try it out and see for yourself what ```Cumulative Frequencies``` does!
204+
205+
### Approximate
206+
207+
This next example shows how an approximate distribution result looks.
208+
209+
**Example: Approximate quantile distribution**
210+
211+
<video controls autoplay loop>
212+
<source src="../../video/approx-quantile.mp4" type="video/mp4">
213+
Your browser does not support the video tag.
214+
</video>
215+
216+
!!! note "Normalized Rank Error"
217+
218+
To understand what this means, refer to the [explanation here](../ws/examples.md#normalized-rank-error). You can also refer to the help in the Result Metadata section.
219+
220+
!!! note "Wouldn't it be nice to graph these?"
221+
222+
This is in the works! We plan to add pivoting and graphing as a general option in the results pages. Feel free to follow [the issue here](https://github.com/yahoo/bullet-ui/issues/24).
223+
224+
## Top K
225+
226+
Top K lets you get the most *frequent items* or the *heavy hitters* for the values in a set of fields.
227+
228+
### Exact
229+
230+
This example gets the Top 3 most popular ```type``` values (there are only 6 but this illustrates the idea).
231+
232+
<video controls autoplay loop>
233+
<source src="../../video/exact-top-k.mp4" type="video/mp4">
234+
Your browser does not support the video tag.
235+
</video>
236+
237+
### Approximate
238+
239+
By adding ```duration``` into the fields, the number of unique values for ```(type, duration)``` is increased. However, because ```duration``` has a tendency to have low values, we will have some *frequent items*. The counts are now estimated. We ask for the top 300 results but we also say that they should have a count of at least 20. This restricts the overall number of results to 12.
240+
241+
<video controls autoplay loop>
242+
<source src="../../video/approx-top-k.mp4" type="video/mp4">
243+
Your browser does not support the video tag.
244+
</video>
245+
246+
!!! note "Maximum Count Error"
247+
248+
The ```maximum_count_error``` value for the query above was ```3```. This means that the difference between the upper bound and the lower bound of each count estimate is ```3```. Bullet returns the upper bound as the estimate so subtracting ```3``` from each count gives you the lower bound of the count. Note that some counts are closer to each other than the count error. For instance, ```(quux, 1)``` and ```(baz, 0)``` have counts ```67``` and ```66``` but their true counts could be from ```64 to 67``` and ```63 to 66``` respectively. This means that ```(baz, 0)``` could well be the most frequent item for this query.
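The bounds described in this note can be worked out with a short Python sketch. The helper names are hypothetical; the counts and the error of ```3``` are the ones quoted above.

```python
def count_bounds(estimate: int, max_count_error: int) -> tuple:
    """Bullet returns the upper bound, so the lower bound is estimate - error."""
    return (estimate - max_count_error, estimate)


def could_outrank(lower_ranked_estimate: int, higher_ranked_estimate: int, max_count_error: int) -> bool:
    """True if the lower-ranked item's upper bound reaches the higher-ranked item's lower bound."""
    higher_lower_bound, _ = count_bounds(higher_ranked_estimate, max_count_error)
    return lower_ranked_estimate >= higher_lower_bound


MAX_COUNT_ERROR = 3

print(count_bounds(67, MAX_COUNT_ERROR))       # (quux, 1): (64, 67)
print(count_bounds(66, MAX_COUNT_ERROR))       # (baz, 0):  (63, 66)
print(could_outrank(66, 67, MAX_COUNT_ERROR))  # True: the ranges overlap
```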
-787 KB
Binary file not shown.
659 KB
Binary file not shown.

docs/video/approx-quantile.mp4

438 KB
Binary file not shown.

docs/video/approx-top-k.mp4

354 KB
Binary file not shown.

docs/video/exact-count-distinct-2.mp4

-251 KB
Binary file not shown.

docs/video/exact-count-distinct-3.mp4

526 KB
Binary file not shown.

docs/video/exact-top-k.mp4

292 KB
Binary file not shown.
