
Commit 1c27656

Marcus Müller committed
cpu_hog: don't use non-reentrant rand() in threads, do smth with result
Previously, stress -c did a terrible job at actually loading the CPU; it was idle most of the time:

```
$> perf stat stress -c 16 -t 5
stress: info: [526148] dispatching hogs: 16 cpu, 0 io, 0 vm, 0 hdd
stress: info: [526148] successful run completed in 5s

 Performance counter stats for 'stress -c 16 -t 5':

         79,580.45 msec task-clock:u                 #   15.910 CPUs utilized
                 0      context-switches:u           #    0.000 /sec
                 0      cpu-migrations:u             #    0.000 /sec
               309      page-faults:u                #    3.883 /sec
   418,716,815,425      cycles:u                     #    5.262 GHz
   262,176,845,042      stalled-cycles-frontend:u    #   62.61% frontend cycles idle
   617,055,840,870      instructions:u               #    1.47  insn per cycle
                                                     #    0.42  stalled cycles per insn
   175,186,890,751      branches:u                   #    2.201 G/sec
       269,450,686      branch-misses:u              #    0.15% of all branches

       5.001799550 seconds time elapsed

      79.463002000 seconds user
       0.007854000 seconds sys
```

This means that in more than half of the cycles, the CPU frontend couldn't do anything. Why?

A `perf record -g` trace of the same invocation tells us that the CPU is spending > 99% of its time in __random, waiting for an integer comparison that involves a data load. No surprise there: rand() relies on global state that needs to get synchronized.

With this percentage in mind, it's not so bad that the result of sqrt never got used.

This commit changes both:

- stores the result of sqrt in a volatile double
- to stay portable, and use a very small-state algorithm for pseudo-random number generation, we just inline xoroshiro128+ [1], which is under a MIT-0 style "dedication to public domain" license.

We still don't "spin on sqrt()", because floating point sqrt is very very fast on modern desktop/server CPUs; but at least we actually make the CPU do its rounds.

[1] https://prng.di.unimi.it/

Signed-off-by: Marcus Müller <mueller_foss_stress@baseband.digital>
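The derived figures on the right-hand side of the perf output follow from the raw counters by plain division. The sketch below recomputes them in C as a cross-check; the struct and function names are hypothetical and only the counter values are taken from the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Raw counters copied from the perf stat output in the commit message. */
struct perf_counters {
    double   task_clock_msec;   /* summed over all 16 workers */
    double   elapsed_sec;       /* wall-clock time */
    uint64_t cycles;
    uint64_t stalled_frontend;
    uint64_t instructions;
};

static const struct perf_counters before_fix = {
    79580.45, 5.001799550,
    418716815425ULL, 262176845042ULL, 617055840870ULL
};

/* CPU time divided by wall-clock time: how many CPUs were kept busy. */
static double cpus_utilized(const struct perf_counters *c)
{
    assert(c->elapsed_sec > 0.0);
    return c->task_clock_msec / (c->elapsed_sec * 1000.0);
}

/* Share of cycles in which the frontend delivered nothing. */
static double frontend_idle_fraction(const struct perf_counters *c)
{
    assert(c->cycles > 0);
    return (double)c->stalled_frontend / (double)c->cycles;
}

/* Instructions retired per cycle. */
static double insn_per_cycle(const struct perf_counters *c)
{
    assert(c->cycles > 0);
    return (double)c->instructions / (double)c->cycles;
}

/* Frontend-stalled cycles per retired instruction. */
static double stalled_per_insn(const struct perf_counters *c)
{
    assert(c->instructions > 0);
    return (double)c->stalled_frontend / (double)c->instructions;
}
```

Called with `&before_fix`, these reproduce the reported values to the printed precision: about 15.91 CPUs utilized, 62.61% frontend cycles idle, 1.47 instructions per cycle and 0.42 stalled cycles per instruction. All 16 workers were scheduled the whole time, so the "idle" in the message refers to stalled cycles, not to unscheduled CPUs.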
1 parent f725109 commit 1c27656

1 file changed: +43 -3 lines changed


src/stress.c

Lines changed: 43 additions & 3 deletions
@@ -3,6 +3,7 @@
  * Copyright 2001-2010 Amos Waterland <apw@rossby.metr.ou.edu>
  * Copyright 2021-2023 Joao Eriberto Mota Filho <eriberto@eriberto.pro.br>
  * Copyright 2023 Vratislav Bendel <vbendel@redhat.com>
+ * Copyright 2025 Marcus Müller <mueller_foss_stress@baseband.digital>
  *
  * This program is free software; you can redistribute it and/or
  * modify it under the terms of the GNU General Public License
@@ -23,6 +24,7 @@
 #include <errno.h>
 #include <libgen.h>
 #include <math.h>
+#include <stdint.h>
 #include <stdio.h>
 #include <stdlib.h>
 #include <string.h>
@@ -479,9 +481,47 @@ void worker_init(void)
 int
 hogcpu (void)
 {
-  while (1)
-    sqrt (rand ());
-
+  /* initial state for PRNG */
+  static uint32_t s[] = {
+    4 /* chosen by fair dice roll */,
+    42 /* the answer */,
+    3 /* pi */,
+    0xCAFECA4E /* because who doesn't like cake? */
+  };
+  while (1) {
+    /* START xoroshirp128plus:
+       Written in 2018 by David Blackman and Sebastiano Vigna (vigna@acm.org)
+
+       To the extent possible under law, the author has dedicated all copyright
+       and related and neighboring rights to this software to the public domain
+       worldwide.
+
+       Permission to use, copy, modify, and/or distribute this software for any
+       purpose with or without fee is hereby granted.
+
+       THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
+       WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
+       MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
+       ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
+       WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
+       ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF OR
+       IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.
+    */
+
+    const uint32_t result = s[0] + s[3];
+    const uint32_t t = s[1] << 9;
+
+    s[2] ^= s[0];
+    s[3] ^= s[1];
+    s[1] ^= s[2];
+    s[0] ^= s[3];
+
+    s[2] ^= t;
+
+    s[3] = (s[3] << 11) | (s[3]>> (32-11));
+    /* END xoroshirp128plus */
+    volatile double value = sqrt ((double)(result));
+  }
   return 0;
 }
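For readers who want to check the generator in isolation, the update step from the hunk above can be pulled out into a function that works on caller-owned state. This is a minimal sketch, not part of the commit: `rotl32` and `prng_next` are hypothetical names. The constants used in the diff (four 32-bit words, shift by 9, rotate by 11) correspond to the generator published as xoshiro128+ on the page the commit message links to.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate a 32-bit word left by k bits (0 < k < 32). */
static inline uint32_t rotl32(uint32_t x, int k)
{
    return (x << k) | (x >> (32 - k));
}

/* One generator step, same arithmetic as the loop body in the diff.
   The state lives with the caller, so nothing is shared between
   workers and no locking is involved. Hypothetical helper. */
static uint32_t prng_next(uint32_t s[4])
{
    /* An all-zero state would stay zero forever. */
    assert((s[0] | s[1] | s[2] | s[3]) != 0);

    const uint32_t result = s[0] + s[3];
    const uint32_t t = s[1] << 9;

    s[2] ^= s[0];
    s[3] ^= s[1];
    s[1] ^= s[2];
    s[0] ^= s[3];

    s[2] ^= t;

    s[3] = rotl32(s[3], 11);

    return result;
}
```

Seeded with the commit's initial state `{4, 42, 3, 0xCAFECA4E}`, the first value returned is `0xCAFECA52`, the wrapped sum of the first and last state words. A worker would then pass each value to `sqrt` and store the result in a `volatile double`, as the commit does, so the compiler cannot drop the computation.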
