Commit 4669dac

Updating to latest versions and updating quickstart
1 parent a9412dc commit 4669dac

7 files changed, with 85 additions and 61 deletions

docs/about/releases.md

Lines changed: 6 additions & 4 deletions
@@ -19,7 +19,7 @@ certain CPU and memory related settings specific to RAS in its configuration. Th
 | **Storm-1.0+ Repository** | [https://github.com/yahoo/bullet-storm](https://github.com/yahoo/bullet-storm) |
 | **Storm-0.10- Repository** | [https://github.com/yahoo/bullet-storm/tree/storm-0.10](https://github.com/yahoo/bullet-storm/tree/storm-0.10) |
 | **Issues** | [https://github.com/yahoo/bullet-storm/issues](https://github.com/yahoo/bullet-storm/issues) |
-| **Last Tag** | [![Latest tag](https://img.shields.io/github/release/yahoo/bullet-storm.svg)](https://github.com/yahoo/bullet-storm/releases/latest) |
+| **Last Tag** | [![Latest tag](https://img.shields.io/github/release/yahoo/bullet-storm/all.svg)](https://github.com/yahoo/bullet-storm/tags) |
 | **Latest Artifact** | [![Download](https://api.bintray.com/packages/yahoo/maven/bullet-storm/images/download.svg)](https://bintray.com/yahoo/maven/bullet-storm/_latestVersion) |

 ### Releases
@@ -43,7 +43,7 @@ The Web Service implementation that can serve a static schema from a file and ta
 | ------------------- | --------------- |
 | **Repository** | [https://github.com/yahoo/bullet-service](https://github.com/yahoo/bullet-service) |
 | **Issues** | [https://github.com/yahoo/bullet-service/issues](https://github.com/yahoo/bullet-service/issues) |
-| **Last Tag** | [![Latest tag](https://img.shields.io/github/release/yahoo/bullet-service.svg)](https://github.com/yahoo/bullet-service/releases/latest) |
+| **Last Tag** | [![Latest tag](https://img.shields.io/github/release/yahoo/bullet-service/all.svg)](https://github.com/yahoo/bullet-service/tags) |
 | **Latest Artifact** | [![Download](https://api.bintray.com/packages/yahoo/maven/bullet-service/images/download.svg)](https://bintray.com/yahoo/maven/bullet-service/_latestVersion) |

 ### Releases
@@ -60,13 +60,15 @@ The Bullet UI that lets you build, run, save and visualize results from Bullet.
 | ------------------- | --------------- |
 | **Repository** | [https://github.com/yahoo/bullet-ui](https://github.com/yahoo/bullet-ui) |
 | **Issues** | [https://github.com/yahoo/bullet-ui/issues](https://github.com/yahoo/bullet-ui/issues) |
-| **Last Tag** | [![GitHub release](https://img.shields.io/github/release/yahoo/bullet-ui.svg)](https://github.com/yahoo/bullet-ui/releases/latest) |
+| **Last Tag** | [![GitHub release](https://img.shields.io/github/tag/yahoo/bullet-ui.svg)](https://github.com/yahoo/bullet-ui/tags) |
 | **Latest Artifact** | [![GitHub release](https://img.shields.io/github/release/yahoo/bullet-ui.svg)](https://github.com/yahoo/bullet-ui/releases/latest) |

 ### Releases

 | Date | Release | Highlights |
 | ------------ | -------------------------------------------------------------------------------------- | ---------- |
+| 2016-05-12 | [**0.3.1**](https://github.com/yahoo/bullet-ui/releases/tag/v0.3.1) | Adds styles to the Pivot table. Fixes some minor UI interactions |
+| 2016-05-10 | [**0.3.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.3.0) | Adds Charting and Pivoting support. Migrations enhanced. Support for overriding nested default settings |
 | 2016-05-03 | [**0.2.2**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.2) | Fixes maxlength of the input for points |
 | 2016-05-02 | [**0.2.1**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.1) | Fixes a bug with a dependency that broke sorting the Filters |
 | 2016-05-01 | [**0.2.0**](https://github.com/yahoo/bullet-ui/releases/tag/v0.2.0) | Release for Top K and Distribution. Supports Bullet Storm 0.4.2+ |
@@ -80,7 +82,7 @@ The AVRO container that you need to convert your data into to be consumed by Bul
 | ------------------- | --------------- |
 | **Repository** | [https://github.com/yahoo/bullet-record](https://github.com/yahoo/bullet-record) |
 | **Issues** | [https://github.com/yahoo/bullet-record/issues](https://github.com/yahoo/bullet-record/issues) |
-| **Last Tag** | [![Latest tag](https://img.shields.io/github/release/yahoo/bullet-record.svg)](https://github.com/yahoo/bullet-record/releases/latest) |
+| **Last Tag** | [![Latest tag](https://img.shields.io/github/release/yahoo/bullet-record/all.svg)](https://github.com/yahoo/bullet-record/tags) |
 | **Latest Artifact** | [![Download](https://api.bintray.com/packages/yahoo/maven/bullet-record/images/download.svg)](https://bintray.com/yahoo/maven/bullet-record/_latestVersion) |

 ### Releases

docs/backend/performance.md

Lines changed: 23 additions & 23 deletions
@@ -28,7 +28,7 @@ The rest of this document assumes that you are familiar with [Storm](http://stor

 ## How was this tested?

-All tests run here were using [Bullet-Storm 0.3.1](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.3.1). The intent is to test just the Storm piece without going through the Web Service or the UI. The DRPC REST endpoint provided by Storm lets us do just that.
+All tests run here were using [Bullet-Storm 0.4.2](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.4.2). The intent is to test just the Storm piece without going through the Web Service or the UI. The DRPC REST endpoint provided by Storm lets us do just that.

 Using the pluggable metrics interface in Bullet on Storm, various worker level metrics such as CPU time, Heap usage, GC times and types, were captured and sent to a Yahoo in-house monitoring service for time-slicing and graphing. The graphs shown in the tests below use this service. See [0.3.0](https://github.com/yahoo/bullet-storm/releases/tag/bullet-storm-0.3.0) for details on how to plug in your own metrics collection.

@@ -60,7 +60,7 @@ Here is the default configuration we used to launch the basic instance of Bullet
 bullet.topology.metrics.enable: true
 bullet.topology.metrics.built.in.enable: true
 bullet.topology.metrics.built.in.emit.interval.mapping:
-  bullet_active_rules: 5
+  bullet_active_queries: 5
   default: 60
 bullet.topology.metrics.classes:
   - "package.containing.our.custom.class.pushing.metrics"
@@ -85,14 +85,15 @@ bullet.topology.join.bolt.cpu.load: 50.0
 bullet.topology.join.bolt.memory.on.heap.load: 384.0
 bullet.topology.join.bolt.memory.off.heap.load: 192.0
 bullet.topology.join.bolt.error.tick.timeout: 3
-bullet.topology.join.bolt.rule.tick.timeout: 3
+bullet.topology.join.bolt.query.tick.timeout: 3
 bullet.topology.tick.interval.secs: 1
-bullet.rule.default.duration: 30000
-bullet.rule.max.duration: 540000
-bullet.rule.aggregation.max.size: 512
-bullet.rule.aggregation.raw.max.size: 500
+bullet.query.default.duration: 30000
+bullet.query.max.duration: 540000
+bullet.query.aggregation.max.size: 512
+bullet.query.aggregation.raw.max.size: 500
+bullet.query.aggregation.distribution.max.points: 200
 ```
-Any setting not listed here default to the defaults in [bullet_defaults.yaml](https://github.com/yahoo/bullet-storm/blob/bullet-storm-0.3.1/src/main/resources/bullet_defaults.yaml). In particular, **metadata collection** and **timestamp injection** is enabled. ```RAW``` type queries also micro-batch by size 1 (in other words, do not micro-batch).
+Any setting not listed here default to the defaults in [bullet_defaults.yaml](https://github.com/yahoo/bullet-storm/blob/bullet-storm-0.4.2/src/main/resources/bullet_defaults.yaml). In particular, **metadata collection** and **timestamp injection** is enabled. ```RAW``` type queries also micro-batch by size 1 (in other words, do not micro-batch).

 The topology was also launched (command-line args to Storm) with the following Storm settings:

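Review note: this hunk and the previous one rename every `rule` setting to `query`, along with the `bullet_active_rules` metric. Below is a minimal sketch of a one-off migration for an existing settings file, assuming the settings have already been loaded into a flat Python dict (for example by a YAML parser). `migrate_settings` and the lookup tables are hypothetical helpers, not part of Bullet. The name pairs are copied from the removed and added lines above; whether the old names are still accepted by the newer release is not something this diff shows.

```python
# Old-name -> new-name pairs, copied from the removed/added lines in this diff.
RENAMED_KEYS = {
    "bullet.topology.join.bolt.rule.tick.timeout": "bullet.topology.join.bolt.query.tick.timeout",
    "bullet.rule.default.duration": "bullet.query.default.duration",
    "bullet.rule.max.duration": "bullet.query.max.duration",
    "bullet.rule.aggregation.max.size": "bullet.query.aggregation.max.size",
    "bullet.rule.aggregation.raw.max.size": "bullet.query.aggregation.raw.max.size",
}

# Metric names that appear as keys inside the emit-interval mapping.
RENAMED_METRICS = {"bullet_active_rules": "bullet_active_queries"}
INTERVAL_MAPPING_KEY = "bullet.topology.metrics.built.in.emit.interval.mapping"


def migrate_settings(settings):
    """Return a copy of `settings` with the renamed keys applied.

    Hypothetical helper: unknown keys and all values pass through unchanged.
    """
    migrated = {}
    for key, value in settings.items():
        if key == INTERVAL_MAPPING_KEY and isinstance(value, dict):
            # Rename metric names nested under the interval mapping.
            value = {RENAMED_METRICS.get(name, name): secs for name, secs in value.items()}
        migrated[RENAMED_KEYS.get(key, key)] = value
    return migrated


if __name__ == "__main__":
    old_settings = {
        "bullet.topology.tick.interval.secs": 1,
        "bullet.rule.default.duration": 30000,
        INTERVAL_MAPPING_KEY: {"bullet_active_rules": 5, "default": 60},
    }
    for key, value in migrate_settings(old_settings).items():
        print(f"{key}: {value}")
```

The helper builds a new dict rather than mutating its input, so the original settings stay available for comparison. The newly added `bullet.query.aggregation.distribution.max.points` setting has no old counterpart, so it is left out of the table.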
@@ -163,48 +164,48 @@ The following table shows the timestamps averaged by running **100** of these qu

 | Timestamp | Delay (ms) |
 | :-------------- | ---------: |
-| Kafka Received | -705.79 |
-| Bullet Received | -1.01 |
+| Kafka Received | -710.75 |
+| Bullet Received | -2.16 |
 | Query Received | 0 |
-| Query Finished | 1.74 |
+| Query Finished | 1.66 |

-The Bullet Received timestamp above is negative because the Filter bolt received the query and emitted an arbitrary record ```1.01 ms``` before the Join bolt received the query. The data was submitted into Kafka about ```705.79 ms``` before the query was received by Bullet and that difference is the processing time of Kafka and the time for our spouts to read the data into Bullet.
+The Bullet Received timestamp above is negative because the Filter bolt received the query and emitted an arbitrary record ```2.16 ms``` before the Join bolt received the query. The data was submitted into Kafka about ```710.75 ms``` before the query was received by Bullet and that difference is the processing time of Kafka and the time for our spouts to read the data into Bullet.

 ### Conclusion

-Bullet adds a delay of ```1.74 ms``` to just pull out a record. This result shows that this is the fastest Bullet can be. It cannot return data any faster than this for meaningful queries.
+Bullet adds a delay of a few ms - ```1.66 ms``` in the test above - to just pull out a record. This result shows that this is the fastest Bullet can be. It cannot return data any faster than this for meaningful queries.

 ## Test 2: Measuring the time to find a record

 This test runs with the [standard configuration](#configuration) above.

 The [last test](#test-1-measuring-the-inherent-latency-of-bullet) attempted to measure how long Bullet takes to pick out a record. Here we will measure how long it takes to find a record *that we generate*. This is the average of running **100** queries across a time interval of 30 minutes trying to filter for a record with a single unique value in a field [similar to this query](../ws/examples.md#simple-filtering).

-Since this query actually requires us to be looking at the values in the data, we should also mention that the average data volume across this test was: ```Data: 164,000 MPS and 107 MB/s```
+Since this query actually requires us to be looking at the values in the data, we should also mention that the average data volume across this test was: ```Data: 163,000 MPS and 105 MB/s```

 ### Result

 <div class="mostly-numeric-table"></div>

 | Timestamp | Delay (ms) |
 | :-------------- | ---------: |
-| Kafka Received | 519.25 |
-| Bullet Received | 1161.43 |
+| Kafka Received | 465.81 |
+| Bullet Received | 1072.84 |
 | Query Received | 0 |
-| Query Finished | 1165.96 |
+| Query Finished | 1077.85 |


-The record was emitted into Kafka ```519.25 ms``` after the query was received. The delay is the time it takes for the generated record to flow through our network and into Kafka.
+The record was emitted into Kafka ```465.81 ms``` after the query was received. The delay is the time it takes for the generated record to flow through our network and into Kafka.

-It is difficult to isolate how long of the ```1161.43 ms - 519.25 ms = 642.18 ms``` was spent in Kafka and how long was spent reading the record and sending it to the Filter bolt from our spout. However, we can look this time as a whole and include it into the time to get data into Bullet.
+It is difficult to isolate how long of the ```1072.84 ms - 465.81 ms = 607.03 ms``` was spent in Kafka and how long was spent reading the record and sending it to the Filter bolt from our spout. However, we can look this time as a whole and include it into the time to get data into Bullet.

 ### Conclusion

-We see that Bullet took on average ```1165.96 ms - 1161.43 ms = 4.53 ms``` from the time it saw the record in the Filter bolt to finishing up the query and returning it.
+We see that Bullet took on average ```1077.85 ms - 1072.84 ms = 5.01 ms``` from the time it saw the record in the Filter bolt to finishing up the query and returning it.

 !!! note "So, Bullet takes ~5 ms to find a record?"

-    No, not really. Remember that we are only including the time from which the record was matched in the Filter bolt to when it was sent out from the Join bolt. We can only conclude that the true delay is less than ```1165.96 ms - 519.25 ms = 646.71 ms``` because that is the difference in time from when the record was emitted into Kafka and when it was emitted out of Bullet. It is less than that because a part of that time is Kafka accepting the record and making it available for consumption. Nevertheless, finding a single record in data stream of ```164,000 mps``` in about half a second with about [5 machines](#resource-utilization) is not bad at all!
+    No, not really. Remember that we are only including the time from which the record was matched in the Filter bolt to when it was sent out from the Join bolt. We can only conclude that the true delay is less than ```1077.85 ms - 465.81 ms = 612.04 ms``` because that is the difference in time from when the record was emitted into Kafka and when it was emitted out of Bullet. It is less than that because a part of that time is Kafka accepting the record and making it available for consumption. Nevertheless, finding a single record in data stream of ```163,000 mps``` in about half a second with about [5 machines](#resource-utilization) is not bad at all!

 ## Test 3: Measuring the maximum number of parallel ```RAW``` queries

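Review note: the updated Test 2 prose quotes three differences derived from the new table rows. Below is a small sketch that recomputes them, using `Decimal` to avoid binary floating-point noise in the subtraction. The dictionary layout and the `derived_intervals` name are illustrative only; the numbers are copied from the added table rows in this hunk.

```python
from decimal import Decimal

# Average delays (ms) relative to "Query Received", from the added Test 2 rows.
TEST_2_DELAYS_MS = {
    "Kafka Received": Decimal("465.81"),
    "Bullet Received": Decimal("1072.84"),
    "Query Received": Decimal("0"),
    "Query Finished": Decimal("1077.85"),
}


def derived_intervals(delays):
    """Return the three differences that the prose derives from the table, in ms."""
    return {
        # Time in Kafka plus the spout reading the record into the Filter bolt.
        "kafka_to_bullet": delays["Bullet Received"] - delays["Kafka Received"],
        # Time from the Filter bolt seeing the record to the query finishing.
        "bullet_to_finish": delays["Query Finished"] - delays["Bullet Received"],
        # Upper bound on the true delay quoted in the note.
        "kafka_to_finish": delays["Query Finished"] - delays["Kafka Received"],
    }


intervals = derived_intervals(TEST_2_DELAYS_MS)
for name, value in intervals.items():
    print(f"{name}: {value} ms")
```

The three values agree with the figures quoted in the added lines (`607.03`, `5.01` and `612.04`), so the prose and the table in the new version are consistent with each other.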
@@ -237,7 +238,7 @@ We will run a certain number of these queries then generate a record matching th
 This script takes in a single numeric argument to run a number of queries in parallel (you may have to use ```ulimit``` to change maximum user processes if you specify a large number). It runs till you kill it performing the following:

 1. It generates a provided number of the [query above](#query) and runs them in parallel against a randomly chosen DRPC server
-2. It generates data for the query
+2. It generates data for the query
 3. It waits out the rest of the time and uses jq to validate that all the generated data was found

 Here is a version of the script with the specifics to our data generation and Storm topology removed:
@@ -423,4 +424,3 @@ With this change in heap usage, we could get to ```735``` of these queries simul
 !!! note "735 is a hard limit then?"

     We are currently discussing this with the Storm folks to perhaps switch DRPC to a non-blocking implementation. Also, depending on if and how Bullet is implemented on other Stream processors, an alternative to DRPC may be required anyway - such as using a Pub/Sub queue like Kafka to deliver queries and retrieve results from Bullet. Stay tuned for updates!
-