Commit ba8021a

Merge pull request #2351 from Kobzol/unify-docs
Update job queue documentation
2 parents bc7d280 + 7c6d1e1 commit ba8021a

7 files changed

+208
-172
lines changed

collector/README.md

Lines changed: 3 additions & 1 deletion
@@ -1,7 +1,7 @@
11
# Rust Compiler Performance Benchmarking and Profiling
22

33
Hardware and software details of the machine that executes the CI details can be found
4-
[here](../docs/perf-runner.md). A glossary of relevant terms can be found
4+
[here](../docs/deployment.md). A glossary of relevant terms can be found
55
[here](../docs/glossary.md).
66

77
## The benchmarks
@@ -34,6 +34,8 @@ This crate is only compatible with OpenSSL 1.0.1, 1.0.2, and 1.1.0, or LibreSSL
3434
aborting due to this version mismatch.
3535
```
3636

37+
For benchmarking using `perf`, you will also need to set `/proc/sys/kernel/perf_event_paranoid` to `-1`.
38+
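
The `perf_event_paranoid` requirement added in this hunk can be checked before starting a `perf`-based benchmark. The following is a minimal Python sketch, not part of rustc-perf: the helper names are hypothetical, and it only reads the setting (changing it requires root, for example via `sudo sysctl -w kernel.perf_event_paranoid=-1`).

```python
from pathlib import Path

# Path and required value (-1) are the ones named in the README text.
PARANOID_PATH = Path("/proc/sys/kernel/perf_event_paranoid")
REQUIRED_LEVEL = -1


def paranoid_level_ok(raw: str) -> bool:
    """Return True if the raw file contents hold the required level."""
    return int(raw.strip()) == REQUIRED_LEVEL


def check_perf_access(path: Path = PARANOID_PATH) -> bool:
    """Hypothetical helper: read the setting and report whether it matches."""
    try:
        return paranoid_level_ok(path.read_text())
    except (OSError, ValueError):
        # File missing (e.g. non-Linux system) or contents not an integer.
        return False


if __name__ == "__main__":
    print("perf access configured:", check_perf_access())
```
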
3739
## Benchmarking
3840

3941
This section is about benchmarking rustc, i.e. measuring its performance on the

database/schema.md

Lines changed: 7 additions & 15 deletions
@@ -238,17 +238,9 @@ Columns:
238238
* **job_id** (`INTEGER`): A nullable job_id which, if it exists it will inform
239239
us as to which job this error is part of.
240240

241-
## New benchmarking design
242-
We are currently implementing a new design for dispatching benchmarks to collector(s) and storing
243-
them in the database. It will support new use-cases, like backfilling of new benchmarks into a parent
244-
commit and primarily benchmarking with multiple collectors (and multiple hardware architectures) in
245-
parallel.
246-
247-
The tables below are a part of the new scheme.
248-
249241
### benchmark_request
250242

251-
Represents a single request for performing a benchmark collection. Each request can be one of three types:
243+
Represents a single request for performing a benchmark run. Each request can be one of three types:
252244

253245
* Master: benchmark a merged master commit
254246
* Release: benchmark a published stable or beta compiler toolchain
@@ -297,15 +289,13 @@ Columns:
297289

298290
### job_queue
299291

300-
This table stores ephemeral benchmark jobs, which specifically tell the
301-
collector which benchmarks it should execute. The jobs will be kept in the
302-
table for ~30 days after being completed, so that we can quickly figure out
303-
what master parent jobs we need to backfill when handling try builds.
292+
This table stores benchmark jobs, which specifically tell the
293+
collector which benchmarks it should execute.
304294

305295
Columns:
306296

307-
* **id** (`bigint` / `serial`): Primary*key identifier for the job row;
308-
auto*increments with each new job.
297+
* **id** (`bigint` / `serial`): Primary key identifier for the job row;
298+
autoincrements with each new job.
309299
* **request_tag** (`text`): References the parent benchmark request that
310300
spawned this job.
311301
* **target** (`text NOT NULL`): Hardware/ISA the benchmarks must run on
@@ -325,3 +315,5 @@ Columns:
325315
`success`, or `failure`.
326316
* **retry** (`int NOT NULL`): Number of times the job has been re*queued after
327317
a failure; 0 on the first attempt.
318+
* **kind** (`text NOT NULL`): What benchmark suite should be executed in the job (`compiletime`, `runtime` or `rustc`).
319+
* **is_optional** (`boolean NOT NULL`): Whether a request should wait for this job to finish before it will become completed.
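
As an illustration of the `job_queue` columns visible in this diff, here is a small Python sketch of one row. It is an assumption-laden model, not code from rustc-perf: the class name `JobRow` is hypothetical, only the columns shown in these hunks are included (the table has more), and the `kind` check uses the three suite names listed in the schema text.

```python
from dataclasses import dataclass

# Suite names listed in the schema for the `kind` column.
JOB_KINDS = {"compiletime", "runtime", "rustc"}


@dataclass(frozen=True)
class JobRow:
    """Illustrative subset of a `job_queue` row (not the full table)."""

    id: int            # bigint / serial primary key
    request_tag: str   # parent benchmark request that spawned the job
    target: str        # text NOT NULL
    retry: int         # int NOT NULL, 0 on the first attempt
    kind: str          # text NOT NULL, one of JOB_KINDS
    is_optional: bool  # boolean NOT NULL

    def __post_init__(self) -> None:
        # Mirror the constraints stated in the column descriptions.
        if self.kind not in JOB_KINDS:
            raise ValueError(f"unknown job kind: {self.kind!r}")
        if self.retry < 0:
            raise ValueError("retry must be non-negative")
```

For example, `JobRow(id=1, request_tag="abc", target="x86_64-unknown-linux-gnu", retry=0, kind="runtime", is_optional=False)` constructs a valid row, while an unknown `kind` raises `ValueError`. The tag and target values here are placeholders.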

docs/README.md

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
1+
# rustc-perf documentation
2+
3+
- [Glossary of useful terms](./glossary.md)
4+
- [Database schema](../database/schema.md)
5+
- [How rustc-perf is deployed](./deployment.md)
6+
- [How the distributed job queue works](./job-queue.md)
7+
- [How we compare benchmarks results](./comparison-analysis.md)

docs/deployment.md (renamed from docs/perf-runner.md)

Lines changed: 18 additions & 2 deletions
@@ -1,7 +1,23 @@
1-
# Benchmarking machine
2-
The machine that actually executes the benchmarks is the `AX-42` server running on [Hetzner](https://www.hetzner.com/dedicated-rootserver/). It has the following configuration.
1+
# Deployment
2+
3+
The machines that actually execute the benchmarks ("collectors") are dedicated machines running on [Hetzner](https://www.hetzner.com/dedicated-rootserver/). The [web server](http://perf.rust-lang.org/) runs on [ECS](https://github.com/rust-lang/infra-team/blob/HEAD/service-catalog/rustc-perf/README.md).
4+
5+
## Debugging
6+
This section documents what to do in case benchmarking doesn't work or something is stuck. The status of the collectors can be found on the [status page](https://perf.rust-lang.org/status.html). In particular, it shows the last heartbeat of each collector. If that date is very old (>1 hour), then something bad has probably happened with the collector.
7+
8+
You can SSH into the machines directly and examine what is going on there. The currently active machines have the following domain names:
9+
10+
- `rustc-perf-one.infra.rust-lang.org`
11+
- `rustc-perf-two.infra.rust-lang.org`
12+
13+
The benchmarking process runs as a systemd service called `collector`. You can start/stop/inspect it using the usual commands:
14+
- Start/restart/stop: `sudo systemctl start/restart/stop collector.service`
15+
- See logs: `sudo journalctl --utc -n 10000 -u collector -f`
16+
17+
The user account under which the benchmarks execute is called `collector`. You can switch to it using `su` and examine the `/home/collector/rustc-perf` checkout, from which the benchmarks are executed.
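
The heartbeat rule from the Debugging section (a last heartbeat older than 1 hour suggests a problem) can be expressed as a small check. This is a hedged Python sketch: the function name is hypothetical, and how the heartbeat timestamp is obtained from the status page is not covered here.

```python
from datetime import datetime, timedelta, timezone

# Threshold taken from the text: a heartbeat older than 1 hour is suspicious.
MAX_HEARTBEAT_AGE = timedelta(hours=1)


def collector_looks_stuck(last_heartbeat: datetime, now: datetime) -> bool:
    """Return True if the last heartbeat is older than the allowed age."""
    return now - last_heartbeat > MAX_HEARTBEAT_AGE


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    # Example: a heartbeat from two hours ago counts as stale.
    print(collector_looks_stuck(now - timedelta(hours=2), now))
```
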
318

419
## Hardware
20+
- The collectors run on `AX-42` Hetzner server instances.
521
- 8-core AMD Ryzen 7 PRO 8700GE with HyperThreading (16 hardware threads total)
622
<details>
723
<summary>Output of `lscpu`</summary>

docs/glossary.md

Lines changed: 19 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -25,6 +25,9 @@ The following is a glossary of domain specific terminology. Although benchmarks
2525
- `incr-patched`: incremental compilation is used, with a full incremental cache and some code changes made.
2626
* **backend**: the codegen backend used for compiling Rust code.
2727
- `llvm`: the default codegen backend
28+
- `cranelift`: experimental backend designed for quicker non-optimized builds
29+
* **target**: compilation target for which the benchmark is compiled.
30+
- `x86_64-unknown-linux-gnu`: the default x64 Linux target
2831
* **category**: a high-level group of benchmarks. Currently, there are three categories, primary (mostly real-world crates), secondary (mostly stress tests), and stable (old real-world crates, only used for the dashboard).
2932
* **artifact type**: describes what kind of artifact does the benchmark build. Either `library` or `binary`.
3033

@@ -41,15 +44,15 @@ The following is a glossary of domain specific terminology. Although benchmarks
4144
## Testing
4245

4346
* **test case**: a combination of parameters that describe the measurement of a single (compile-time or runtime) benchmark - a single `test`
44-
- For compile-time benchmarks, it is a combination of a benchmark, a profile, and a scenario.
45-
- For runtime benchmarks, it is currently only the benchmark name.
47+
- For compile-time benchmarks, it is a combination of a benchmark, a profile, a scenario, a codegen backend and a target.
48+
- For runtime benchmarks, it is a combination of a benchmark and a target.
4649
* **test**: the act of running an artifact under a test case. Each test is composed of many iterations.
4750
* **test iteration**: a single iteration that makes up a test. Note: we currently normally run 3 test iterations for each test.
48-
* **test result**: the result of the collection of all statistics from running a test. Currently, the minimum value of a statistic from all the test iterations is used for analysis calculations and the website.
49-
* **statistic**: a single measured value of a metric in a test result
51+
* **test result**: the set of all gathered statistics from running a test. Currently, the minimum value of a statistic from all the test iterations is used for analysis calculations and the website.
52+
* **statistic**: a single measured value of a metric in a test iteration
5053
* **statistic description**: the combination of a metric and a test case which describes a statistic.
5154
* **statistic series**: statistics for the same statistic description over time.
52-
* **run**: a set of tests for all currently available test cases measured on a given artifact.
55+
* **run**: a set of tests for all currently available test cases measured on a given artifact.
5356

5457
## Analysis
5558

@@ -60,7 +63,17 @@ The following is a glossary of domain specific terminology. Although benchmarks
6063
* **relevant test result comparison**: a test result comparison can be significant but still not be relevant (i.e., worth paying attention to). Relevance is a factor of the test result comparison's significance and magnitude. Comparisons are considered relevant if they are significant and have at least a small magnitude .
6164
* **test result comparison magnitude**: how "large" the delta is between the two test result's under comparison. This is determined by the average of two factors: the absolute size of the change (i.e., a change of 5% is larger than a change of 1%) and the amount above the significance threshold (i.e., a change that is 5x the significance threshold is larger than a change 1.5x the significance threshold).
6265

63-
## Other
66+
## Job queue
67+
68+
These terms are related to the [job queue system](./job-queue.md) that distributes benchmarking jobs across available collectors.
69+
70+
- **benchmark request**: a request for benchmarking a *run* on a given *artifact*. It is created either from a try build on a PR or automatically from merged master/release *artifacts*.
71+
- **collector**: a machine that performs benchmarks.
72+
- **benchmark set**: a subset of a compile/runtime/bootstrap benchmark suite that is executed by a collector in a single job.
73+
- **job**: a high-level "work item" that defines a set of *test cases* that should be benchmarked on a specific collector.
74+
- **job queue**: a queue of *jobs*.
75+
76+
## Other
6477

6578
* **bootstrap**: the process of building the compiler from a previous version of the compiler
6679
* **compiler query**: a query used inside the [compiler query system](https://rustc-dev-guide.rust-lang.org/overview.html#queries).
