RWMC #936

Przemog1 · 2025-09-25T15:24:11Z

No description provided.

devshgraphicsprogramming · 2025-10-17T07:37:57Z

@Przemog1 the RWMC stuff should be neatly packaged up and go into nbl/builtin/hlsl/rwmc

devshgraphicsprogramming · 2025-10-29T12:43:36Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+    }
+
+    // most of this code is stolen from https://cg.ivd.kit.edu/publications/2018/rwmc/tool/split.cpp
+    void addSample(uint32_t sampleIndex, float32_t3 sample)


sample is a reserved keyword, use _sample (see the highlight

Also why float32t_3 here and not CascadeLayerType

devshgraphicsprogramming · 2025-10-29T12:44:17Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+    uint32_t size;
+    uint32_t start;
+    uint32_t base;
+};
+
+template<typename CascadeLayerType, uint32_t CascadeSize>
+struct CascadeEntry
+{
+    CascadeLayerType data[CascadeSize];
+};
+
+template<typename CascadeLayerType, uint32_t CascadeSize>
+struct CascadeAccumulator
+{
+    using output_storage_type = CascadeEntry<CascadeLayerType, CascadeSize>;
+    using initialization_data = CascadeSettings;
+    output_storage_type accumulation;
+    uint32_t cascadeSampleCounter[CascadeSize];
+    CascadeSettings cascadeSettings;


CascadeSize (which should be called CascadeCount) is already a template parameter, why is there a CascadeSettings::size ?

Also separate the aliases from member declarations (even better put member declarations at the bottom of struct) so its easier to read

devshgraphicsprogramming · 2025-10-29T15:10:45Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+#include "nbl/builtin/hlsl/cpp_compat.hlsl"
+#include <nbl/builtin/hlsl/vector_utils/vector_traits.hlsl>
+#include <nbl/builtin/hlsl/colorspace/encodeCIEXYZ.hlsl>


use <> for HLSL includes in the Nabla headers (prioritizes looking in builtin DLL / most important search paths before doing anything local)

devshgraphicsprogramming · 2025-10-29T15:11:45Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+    uint32_t cascadeSampleCounter[CascadeSize];
+    CascadeSettings cascadeSettings;
+
+    void initialize(in CascadeSettings settings)


static create maybe better as a factory?

Also use NBL_CONST_REF_ARG get out of habit of using in/out/inout in HLSL

idk i like initialize better as it indicates that CascadeAccumulator has to be initialized. static create just looks like plain constructor and we have no option to disable default constructor forcing user to use this one.

devshgraphicsprogramming · 2025-10-29T15:12:59Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+    {
+        for (int i = 0; i < CascadeSize; ++i)
+        {
+            accumulation.data[i] = (CascadeLayerType)0.0f;


devshgraphicsprogramming · 2025-10-29T15:13:29Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+    uint32_t base;
+};
+
+template<typename CascadeLayerType, uint32_t CascadeSize>


require the LayerType just be a vector

btw if you want we can nest this struct inside the CascadeAccumulator because nobody will ever use it in an SSBO or Texture as is

devshgraphicsprogramming · 2025-10-29T15:14:14Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+        cascadeSettings.size = settings.size;
+        cascadeSettings.start = settings.start;
+        cascadeSettings.base = settings.base;


you don't need to assign member by member, struct to struct assignment works just fine

devshgraphicsprogramming · 2025-10-29T15:16:04Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+
+        uint32_t higherCascadeIndex = lowerCascadeIndex + 1u;
+
+        const uint32_t sampleCount = sampleIndex + 1u;


take sampleCount from the outside already, instead of sampleIndex

i don't get it

devshgraphicsprogramming · 2025-10-29T15:16:43Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+        cascadeSettings.base = settings.base;
+    }
+
+    typename vector_traits<CascadeLayerType>::scalar_type getLuma(NBL_CONST_REF_ARG(CascadeLayerType) col)


make a using alias for the typename vector_traits<CascadeLayerType>::scalar_type

devshgraphicsprogramming · 2025-10-29T15:18:55Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+    uint32_t size;
+    uint32_t start;
+    uint32_t base;


you need completely different members, see the old GLSL code

Also put the structs in separate headers (so easier to share with C++ without sharing the accumulator and resolve):

splatting parameters

resolve parameters
etc.

devshgraphicsprogramming · 2025-10-29T15:22:03Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+        float lowerScale = cascadeSettings.start;
+        float upperScale = lowerScale * cascadeSettings.base;
+
+        const float luma = getLuma(sample);


instead of float use the alias https://github.com/Devsh-Graphics-Programming/Nabla/pull/936/files#r2473742745 for vector_traits<CascadeLayerType>::scalar_type

devshgraphicsprogramming · 2025-10-29T15:23:37Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+        uint32_t lowerCascadeIndex = 0u;
+        while (!(luma < upperScale) && lowerCascadeIndex < cascadeSettings.size - 2)
+        {
+            lowerScale = upperScale;
+            upperScale *= cascadeSettings.base;
+            ++lowerCascadeIndex;
+        }


you can get the floating point cascade index by dealing with logarithms

clamp((log2(luma) - log2(Start))*1.f/log2(Base), 0 , CascadeSize-1)

this is why I say that the Splatting Parameters need to be precomputing different things, like log2_start, and rcp_log2_base

devshgraphicsprogramming · 2025-10-29T15:43:57Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+        uint32_t lowerCascadeIndex = 0u;
+        while (!(luma < upperScale) && lowerCascadeIndex < cascadeSettings.size - 2)
+        {
+            lowerScale = upperScale;
+            upperScale *= cascadeSettings.base;
+            ++lowerCascadeIndex;
+        }
+
+        float lowerCascadeLevelWeight;
+        float higherCascadeLevelWeight;
+
+        if (luma <= lowerScale)
+            lowerCascadeLevelWeight = 1.0f;
+        else if (luma < upperScale)
+            lowerCascadeLevelWeight = max(0.0f, (lowerScale / luma - lowerScale / upperScale) / (1.0f - lowerScale / upperScale));
+        else // Inf, NaN ...
+            lowerCascadeLevelWeight = 0.0f;


you can get the floating point cascade index by dealing with logarithms

const scalar_t log2Luma= log2<scalar_t>(luma); const scalar_t cascade = max(log2Luma*1.f/log2(Base) - log2(Start)/log2(Base), 0, CascadeCount-1); const scalar_t clampedCascade = clamp(cascade, 0, CascadeCount-1); // c<=0 -> 0, c>=Count-1 -> Count-1 uint16_t lowerCascadeIndex = floor<scalar_t>(clampedCascade); // 0 whenever clamped or `cascade` is integer (when `clampedCascade` is integer) scalar_t higherCascadeWeight = cascade-floor<scalar_t>(clampedCascade); // never 0 thanks to magic of `1-fract(x)` scalar_t lowerCascadeWeight = scalar_type(1)-higherCascadeWeight; // handle super bright sample case if (cascade>CascadeCount-1) lowerCascadeWeight = exp2(log2(Start)+log2(Base)*(CascadeCount-1)-log2Luma); // only deal with splatting the higher cascade if you're not out of bound // when out of bounds negatively, we use lower cascade as the only cascade (follow the math above) if (clampedCascade==cascade) // equivalent condition is `higherCascadeWeight!=0` { const uint16_t higherCascadeIndex = lowerCascadeIndex+1; // do stuff for higher cascade }

this is why I say that the Splatting Parameters need to be precomputing different things, like logBaseStart, rcpLog2Base, and log2CascadeEnd

devshgraphicsprogramming · 2025-10-29T15:51:57Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+    using output_storage_type = CascadeEntry<CascadeLayerType, CascadeSize>;
+    using initialization_data = CascadeSettings;
+    output_storage_type accumulation;
+    uint32_t cascadeSampleCounter[CascadeSize];


stick sample count into the CascadeEntry

devshgraphicsprogramming · 2025-10-29T16:01:42Z

include/nbl/builtin/hlsl/rwmc/CascadeAccumulator.hlsl

+        accumulation.data[lowerCascadeIndex] += (sample * lowerCascadeLevelWeight - (sampleCount - (cascadeSampleCounter[lowerCascadeIndex])) * accumulation.data[lowerCascadeIndex]) * reciprocalSampleCount;
+        accumulation.data[higherCascadeIndex] += (sample * higherCascadeLevelWeight - (sampleCount - (cascadeSampleCounter[higherCascadeIndex])) * accumulation.data[higherCascadeIndex]) * reciprocalSampleCount;
+        cascadeSampleCounter[lowerCascadeIndex] = sampleCount;
+        cascadeSampleCounter[higherCascadeIndex] = sampleCount;


this is really long in a line and reads badly, could be turned into a method you call per cascade.

Also remove the extra () around cascadeSampleCounter[...]

devshgraphicsprogramming · 2025-10-29T16:02:41Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+namespace impl
+{
+
+struct CascadeSample
+{
+	float32_t3 centerValue;
+	float normalizedCenterLuma;
+	float normalizedNeighbourhoodAverageLuma;
+};
+
+// TODO: figure out what values should pixels outside have, 0.0f is incorrect
+float32_t3 sampleCascadeTexel(int32_t2 currentCoord, int32_t2 offset, in RWTexture2DArray<float32_t4> cascade, uint32_t cascadeIndex)
+{
+	const int32_t2 texelCoord = currentCoord + offset;
+	if (any(texelCoord < int32_t2(0, 0)))
+		return float32_t3(0.0f, 0.0f, 0.0f);
+
+	float32_t4 output = cascade.Load(int32_t3(texelCoord, int32_t(cascadeIndex)));
+	return float32_t3(output.r, output.g, output.b);
+}
+
+float32_t calcLuma(in float32_t3 col)
+{
+	return hlsl::dot<float32_t3>(hlsl::transpose(colorspace::scRGBtoXYZ)[1], col);
+}
+
+CascadeSample SampleCascade(in int32_t2 coord, in RWTexture2DArray<float32_t4> cascade, in uint cascadeIndex, in float reciprocalBaseI)
+{
+	float32_t3 neighbourhood[9];
+	neighbourhood[0] = sampleCascadeTexel(coord, int32_t2(-1, -1), cascade, cascadeIndex);
+	neighbourhood[1] = sampleCascadeTexel(coord, int32_t2(0, -1), cascade, cascadeIndex);
+	neighbourhood[2] = sampleCascadeTexel(coord, int32_t2(1, -1), cascade, cascadeIndex);
+	neighbourhood[3] = sampleCascadeTexel(coord, int32_t2(-1, 0), cascade, cascadeIndex);
+	neighbourhood[4] = sampleCascadeTexel(coord, int32_t2(0, 0), cascade, cascadeIndex);
+	neighbourhood[5] = sampleCascadeTexel(coord, int32_t2(1, 0), cascade, cascadeIndex);
+	neighbourhood[6] = sampleCascadeTexel(coord, int32_t2(-1, 1), cascade, cascadeIndex);
+	neighbourhood[7] = sampleCascadeTexel(coord, int32_t2(0, 1), cascade, cascadeIndex);
+	neighbourhood[8] = sampleCascadeTexel(coord, int32_t2(1, 1), cascade, cascadeIndex);
+
+	// numerical robustness
+	float32_t3 excl_hood_sum = ((neighbourhood[0] + neighbourhood[1]) + (neighbourhood[2] + neighbourhood[3])) +
+		((neighbourhood[5] + neighbourhood[6]) + (neighbourhood[7] + neighbourhood[8]));
+
+	CascadeSample retval;
+	retval.centerValue = neighbourhood[4];
+	retval.normalizedNeighbourhoodAverageLuma = retval.normalizedCenterLuma = calcLuma(neighbourhood[4]) * reciprocalBaseI;
+	retval.normalizedNeighbourhoodAverageLuma = (calcLuma(excl_hood_sum) * reciprocalBaseI + retval.normalizedNeighbourhoodAverageLuma) / 9.f;
+	return retval;
+}
+
+} // namespace impl


instead of using an impl namespace, roll this up into a struct

could roll it up into a ResolveAccessorAdaptor (and also make a ResolveAccessor concept which is LoadableImage + the extra methods like calcLuma and the get with an offset for a particular cascade which does bounds checking)

general note, you should allow using any vector<scalar,N> for the color values (the default adaptor can be hardcoded for vector<scalar,3>)

devshgraphicsprogramming · 2025-10-29T16:03:07Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+#ifndef _NBL_BUILTIN_HLSL_RWMC_RWMC_HLSL_INCLUDED_
+#define _NBL_BUILTIN_HLSL_RWMC_RWMC_HLSL_INCLUDED_


this header very clearly deals with Resolving, so call it resolve.hlsl not rwmc.hlsl

devshgraphicsprogramming · 2025-10-29T16:07:38Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+// TODO: figure out what values should pixels outside have, 0.0f is incorrect
+float32_t3 sampleCascadeTexel(int32_t2 currentCoord, int32_t2 offset, in RWTexture2DArray<float32_t4> cascade, uint32_t cascadeIndex)
+{
+	const int32_t2 texelCoord = currentCoord + offset;
+	if (any(texelCoord < int32_t2(0, 0)))
+		return float32_t3(0.0f, 0.0f, 0.0f);
+
+	float32_t4 output = cascade.Load(int32_t3(texelCoord, int32_t(cascadeIndex)));
+	return float32_t3(output.r, output.g, output.b);
+}


no weird input textures, make this an Accessor Adaptor (does out of bound checks) which requires concepts::accessors::LoadableImage

btw nothe that the names of the concepts need fixing https://github.com/Devsh-Graphics-Programming/Nabla/blob/aa5d6cb90b383e68b52c6abe69c6efe96281c7dd/include/nbl/builtin/hlsl/concepts/accessors/loadable_image.hlsl

I copy pasted and they have Storable in the name for this reason, should be Loadable

The rwmc::ResolveAccessorAdaptor should fullfill LoadableImage itself

devshgraphicsprogramming · 2025-10-29T16:11:54Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+CascadeSample SampleCascade(in int32_t2 coord, in RWTexture2DArray<float32_t4> cascade, in uint cascadeIndex, in float reciprocalBaseI)
+{
+	float32_t3 neighbourhood[9];
+	neighbourhood[0] = sampleCascadeTexel(coord, int32_t2(-1, -1), cascade, cascadeIndex);
+	neighbourhood[1] = sampleCascadeTexel(coord, int32_t2(0, -1), cascade, cascadeIndex);
+	neighbourhood[2] = sampleCascadeTexel(coord, int32_t2(1, -1), cascade, cascadeIndex);
+	neighbourhood[3] = sampleCascadeTexel(coord, int32_t2(-1, 0), cascade, cascadeIndex);
+	neighbourhood[4] = sampleCascadeTexel(coord, int32_t2(0, 0), cascade, cascadeIndex);
+	neighbourhood[5] = sampleCascadeTexel(coord, int32_t2(1, 0), cascade, cascadeIndex);
+	neighbourhood[6] = sampleCascadeTexel(coord, int32_t2(-1, 1), cascade, cascadeIndex);
+	neighbourhood[7] = sampleCascadeTexel(coord, int32_t2(0, 1), cascade, cascadeIndex);
+	neighbourhood[8] = sampleCascadeTexel(coord, int32_t2(1, 1), cascade, cascadeIndex);


~~SampleCascade should be a method of your ResolveAccessor (default impl and concept)~~
Actually should be a pseudo-private method of a Resolve Struct

sampleCascadeTexel can be a pseudo private __sampleCascadeTexel in the ResolveAccessorAdaptor (the default impl)

devshgraphicsprogramming · 2025-10-29T16:14:29Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+	if (any(texelCoord < int32_t2(0, 0)))
+		return float32_t3(0.0f, 0.0f, 0.0f);


this doesn't check for being out of bounds with texelCoord+1, you're only protecting against texelCoord-1

devshgraphicsprogramming · 2025-10-29T16:16:12Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+	float normalizedNeighbourhoodAverageLuma;
+};
+
+// TODO: figure out what values should pixels outside have, 0.0f is incorrect


if you use accessors, you make it user-configurable, all that you can write is that MIRROR and CLAMP will falsely make similar neighbour pixels reducing RWMC effect on the border, while CLAMP_TO_BORDER (returning 0) will boost the RWMC effect because samples think they have more 0 neighbours

You usually want vignetting, so what you're doing now (CLAMP_TO_BORDER with black border) is probably desirable

wirte it all in a comment

devshgraphicsprogramming · 2025-10-29T16:18:16Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+
+} // namespace impl
+
+struct ReweightingParameters


rename to ResolveParameters

also divide into two structs

ResolveConstantParams - things that can't change during rendering, like:

last Cascade Index

reciprocalBase

and ResolveRuntimeParams which are things you can either change with imgui slider or changes with accumulation

initialEmin

rcplN

rcpKappa

so ResolvePAramters is two of those structs plus any free floating dependent variables like colorReliabilityFactor and NoverKappa

devshgraphicsprogramming · 2025-10-29T16:23:28Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+float32_t3 reweight(in ReweightingParameters params, in RWTexture2DArray<float32_t4> cascade, in int32_t2 coord)
+{


make this into a Resolve struct templated on the Accessor you'll use instead of RWTexture2DArray (which really should only be read only)

you Resolve::create from the ResolveParameters and you have an operator()(NBL_REF_ARG(CascadeAccessor) acc, const int32_t2 coord) which does the reweighting

devshgraphicsprogramming · 2025-10-29T16:24:15Z

include/nbl/builtin/hlsl/rwmc/rwmc.hlsl

+	float reciprocalBaseI = 1.f;
+	impl::CascadeSample curr = impl::SampleCascade(coord, cascade, 0u, reciprocalBaseI);
+
+	float32_t3 accumulation = float32_t3(0.0f, 0.0f, 0.0f);


would be good to template everything on the vector type and its scalar

devshgraphicsprogramming · 2025-10-29T16:25:09Z

src/nbl/builtin/CMakeLists.txt

+LIST_BUILTIN_RESOURCE(NBL_RESOURCES_TO_EMBED "hlsl/rwmc/rwmc.hlsl")
+LIST_BUILTIN_RESOURCE(NBL_RESOURCES_TO_EMBED "hlsl/rwmc/CascadeAccumulator.hlsl")


need to separate parameter structs passed and initialized between c++ and HLSL from the HLSL algorithms for splatting and resolve, so more headers

Chop it up!

Updated examples

19ba15d

Przemog1 changed the title ~~Updated examples~~ RWMC Oct 18, 2025

Przemog1 added 2 commits October 18, 2025 19:12

Created RWMC files

7ec3f39

Updated examples

4dcf56a

devshgraphicsprogramming reviewed Oct 29, 2025

View reviewed changes


		uint32_t higherCascadeIndex = lowerCascadeIndex + 1u;

		const uint32_t sampleCount = sampleIndex + 1u;

		#ifndef _NBL_BUILTIN_HLSL_RWMC_RWMC_HLSL_INCLUDED_
		#define _NBL_BUILTIN_HLSL_RWMC_RWMC_HLSL_INCLUDED_

		if (any(texelCoord < int32_t2(0, 0)))
		return float32_t3(0.0f, 0.0f, 0.0f);

		float32_t3 reweight(in ReweightingParameters params, in RWTexture2DArray<float32_t4> cascade, in int32_t2 coord)
		{

		LIST_BUILTIN_RESOURCE(NBL_RESOURCES_TO_EMBED "hlsl/rwmc/rwmc.hlsl")
		LIST_BUILTIN_RESOURCE(NBL_RESOURCES_TO_EMBED "hlsl/rwmc/CascadeAccumulator.hlsl")

RWMC #936

Are you sure you want to change the base?

RWMC #936

Conversation

Przemog1 commented Sep 25, 2025

Uh oh!

devshgraphicsprogramming commented Oct 17, 2025

Uh oh!

devshgraphicsprogramming Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

devshgraphicsprogramming Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

devshgraphicsprogramming Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

devshgraphicsprogramming Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

devshgraphicsprogramming Oct 29, 2025 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

devshgraphicsprogramming Oct 29, 2025 •

edited

Loading

devshgraphicsprogramming Oct 29, 2025 •

edited

Loading

devshgraphicsprogramming Oct 29, 2025 •

edited

Loading

devshgraphicsprogramming Oct 29, 2025 •

edited

Loading

devshgraphicsprogramming Oct 29, 2025 •

edited

Loading