Skip to content
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion buildSrc/build.gradle.kts
Original file line number Diff line number Diff line change
Expand Up @@ -23,7 +23,7 @@ repositories {
dependencies {
//TODO Allow pulling from Versions.kt
implementation("com.gradleup.shadow", "shadow-gradle-plugin", "8.3.6")

implementation("me.champeau.jmh", "me.champeau.jmh.gradle.plugin", "0.7.3")
implementation("io.papermc.paperweight.userdev", "io.papermc.paperweight.userdev.gradle.plugin", "2.0.0-beta.17")
implementation("org.ow2.asm", "asm", "9.8")
implementation("org.ow2.asm", "asm-tree", "9.8")
Expand Down
6 changes: 6 additions & 0 deletions buildSrc/src/main/kotlin/BenchmarkingConfig.kt
Original file line number Diff line number Diff line change
@@ -0,0 +1,6 @@
import org.gradle.api.Project
import org.gradle.kotlin.dsl.apply

fun Project.configureBenchmarking() {
apply(plugin = "me.champeau.jmh")
}
Comment on lines +1 to +6
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I mentioned it a while ago that it might be good to add some jmh benchmarks, I made a gradle config for that, I forget if it was just identical to this or if I had smth else as well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

2 changes: 2 additions & 0 deletions common/addons/chunk-generator-noise-3d/build.gradle.kts
Original file line number Diff line number Diff line change
Expand Up @@ -3,3 +3,5 @@ version = version("1.2.1")
dependencies {
compileOnlyApi(project(":common:addons:manifest-addon-loader"))
}

configureBenchmarking()
Original file line number Diff line number Diff line change
@@ -0,0 +1,22 @@
package com.dfsek.terra.addons.chunkgenerator.generation.math.interpolation;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.annotations.Warmup;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
@Fork(1)
@Warmup(iterations = 2, time = 1)
@Measurement(iterations = 2, time = 5)
Comment on lines +13 to +14
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably good to do more iterations for a bit longer than that.

I usually see 1 warmup iteration for 5 seconds & 5 measurement iterations for 5 seconds, for a total of 60 seconds.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

public class Interpolator3Benchmark {
private final Interpolator3 interpolator = new Interpolator3(0, 1, 0, 1, 0, 1, 0, 1);

@Benchmark
public void benchmarkInterpolator3(Blackhole blackhole) {
blackhole.consume(interpolator.trilerp(0.5, 0.75, 0.5));
}
}
Comment on lines +15 to +22
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this benchmark should probably use random values to stop the jvm from

perhaps pre-populate an array with random values and step through that for each invocation of the benchmark to avoid the overhead of the random number generator, since this is rather low level.

which architectures & jvms has this been tested on? low level optimizations like this are extremely finicky.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1

This file was deleted.

Original file line number Diff line number Diff line change
Expand Up @@ -7,16 +7,21 @@

package com.dfsek.terra.addons.chunkgenerator.generation.math.interpolation;


import com.dfsek.seismic.math.numericanalysis.interpolation.InterpolationFunctions;
import com.dfsek.seismic.math.arithmetic.ArithmeticFunctions;


/**
* Class for bilinear interpolation of values arranged on a unit square.
*/
public class Interpolator3 {
private final Interpolator bottom;
private final Interpolator top;
private final double _000;
private final double _001;
private final double _010;
private final double _011;
private final double _100;
private final double _101;
private final double _110;
private final double _111;

/**
* Constructs an interpolator with given values as vertices of a unit cube.
Expand All @@ -33,12 +38,25 @@ public Interpolator3(double _000, double _100,
double _010, double _110,
double _001, double _101,
double _011, double _111) {
this.top = new Interpolator(_000, _010, _001, _011);
this.bottom = new Interpolator(_100, _110, _101, _111);
this._000 = _000;
this._001 = _001;
this._010 = _010;
this._011 = _011;
this._100 = _100;
this._101 = _101;
this._110 = _110;
this._111 = _111;
}

//TODO this system is not very good, replace it wholesale
public double trilerp(double x, double y, double z) {
return InterpolationFunctions.lerp(top.bilerp(y, z), bottom.bilerp(y, z), x);
double a = ArithmeticFunctions.fma(2, x, -1);
double b = ArithmeticFunctions.fma(2, y, -1);
double g = ArithmeticFunctions.fma(2, z, -1);

// using explicit fma here somehow makes this slower
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you share the benchmarks for this?

in fact, it would be good if you could share all the benchmarks you did for different changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1, that honestly should not be the case as the jvm should just optimize fma into a*b + c on non supported platforms

return ((1 - g) * (1 - b) * (1 - a) * _000 + (1 - g) * (1 - b) * (1 + a) * _100 +
(1 - g) * (1 + b) * (1 - a) * _010 + (1 - g) * (1 + b) * (1 + a) * _110 +
(1 + g) * (1 - b) * (1 - a) * _001 + (1 + g) * (1 - b) * (1 + a) * _101 +
(1 + g) * (1 + b) * (1 - a) * _011 + (1 + g) * (1 + b) * (1 + a) * _111) / 8;
}
}
Original file line number Diff line number Diff line change
Expand Up @@ -59,7 +59,7 @@ private void sendProgressBar(int current, int max) {
}

public void preloadWorldAndMeasure() {
int radius = 12;
int radius = 32;
int chunksLoading = (radius * 2 + 1) * (radius * 2 + 1);
AtomicInteger chunksLeft = new AtomicInteger(chunksLoading);

Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@

public class GeneratedChunkCache {
private static final Logger log = LoggerFactory.getLogger(GeneratedChunkCache.class);
private final LoadingCache<Pair<Integer, Integer>, CachedChunk> cache;
private final LoadingCache<Long, CachedChunk> cache;
private final DimensionType dimensionType;
private final ChunkGenerator generator;
private final ServerWorld world;
Expand All @@ -29,7 +29,7 @@ public GeneratedChunkCache(DimensionType dimensionType, ChunkGenerator generator
this.cache = Caffeine.newBuilder()
.maximumSize(128)
.recordStats()
.build((Pair<Integer, Integer> key) -> generateChunk(key.getLeft(), key.getRight()));
.build((Long key) -> generateChunk(unpackX(key), unpackZ(key)));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if it would be worthwhile to have our own kind of cache which uses is backed by a fastutil Long2ObjectMaps and avoids boxing the primitives.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have something in the works for this for layered, out of scope for this PR but good idea

}

private CachedChunk generateChunk(int x, int z) {
Expand All @@ -50,6 +50,18 @@ public void displayStats() {
}

public CachedChunk at(int x, int z) {
return cache.get(Pair.of(x, z));
return cache.get(pack(x, z));
}

private long pack(final int x, final int z) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we already have a function somewhere which packs two integers into a long

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is from like the 3.0 days, I wrote it.

return ((long) x & 0xFFFFFFFFL) << 32 | (z & 0xFFFFFFFFL);
}

private int unpackX(long packed) {
return (int) (packed >> 32);
}

private int unpackZ(long packed) {
return (int) packed;
}
}