-
Notifications
You must be signed in to change notification settings - Fork 67
RWMC #936
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: hlsl_path_tracer_example
Are you sure you want to change the base?
RWMC #936
Conversation
|
@Przemog1 the RWMC stuff should be neatly packaged up and go into |
| } | ||
|
|
||
| // most of this code is stolen from https://cg.ivd.kit.edu/publications/2018/rwmc/tool/split.cpp | ||
| void addSample(uint32_t sampleIndex, float32_t3 sample) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sample is a reserved keyword, use _sample (see the highlight
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also why float32t_3 here and not CascadeLayerType
| uint32_t size; | ||
| uint32_t start; | ||
| uint32_t base; | ||
| }; | ||
|
|
||
| template<typename CascadeLayerType, uint32_t CascadeSize> | ||
| struct CascadeEntry | ||
| { | ||
| CascadeLayerType data[CascadeSize]; | ||
| }; | ||
|
|
||
| template<typename CascadeLayerType, uint32_t CascadeSize> | ||
| struct CascadeAccumulator | ||
| { | ||
| using output_storage_type = CascadeEntry<CascadeLayerType, CascadeSize>; | ||
| using initialization_data = CascadeSettings; | ||
| output_storage_type accumulation; | ||
| uint32_t cascadeSampleCounter[CascadeSize]; | ||
| CascadeSettings cascadeSettings; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
CascadeSize (which should be called CascadeCount) is already a template parameter, why is there a CascadeSettings::size ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also separate the aliases from member declarations (even better put member declarations at the bottom of struct) so its easier to read
| #include "nbl/builtin/hlsl/cpp_compat.hlsl" | ||
| #include <nbl/builtin/hlsl/vector_utils/vector_traits.hlsl> | ||
| #include <nbl/builtin/hlsl/colorspace/encodeCIEXYZ.hlsl> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use <> for HLSL includes in the Nabla headers (prioritizes looking in builtin DLL / most important search paths before doing anything local)
| uint32_t cascadeSampleCounter[CascadeSize]; | ||
| CascadeSettings cascadeSettings; | ||
|
|
||
| void initialize(in CascadeSettings settings) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
static create maybe better as a factory?
Also use NBL_CONST_REF_ARG get out of habit of using in/out/inout in HLSL
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
idk i like initialize better as it indicates that CascadeAccumulator has to be initialized. static create just looks like plain constructor and we have no option to disable default constructor forcing user to use this one.
| { | ||
| for (int i = 0; i < CascadeSize; ++i) | ||
| { | ||
| accumulation.data[i] = (CascadeLayerType)0.0f; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
promote
| uint32_t base; | ||
| }; | ||
|
|
||
| template<typename CascadeLayerType, uint32_t CascadeSize> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
require the LayerType just be a vector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
btw if you want we can nest this struct inside the CascadeAccumulator because nobody will ever use it in an SSBO or Texture as is
| cascadeSettings.size = settings.size; | ||
| cascadeSettings.start = settings.start; | ||
| cascadeSettings.base = settings.base; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you don't need to assign member by member, struct to struct assignment works just fine
|
|
||
| uint32_t higherCascadeIndex = lowerCascadeIndex + 1u; | ||
|
|
||
| const uint32_t sampleCount = sampleIndex + 1u; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
take sampleCount from the outside already, instead of sampleIndex
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i don't get it
| cascadeSettings.base = settings.base; | ||
| } | ||
|
|
||
| typename vector_traits<CascadeLayerType>::scalar_type getLuma(NBL_CONST_REF_ARG(CascadeLayerType) col) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make a using alias for the typename vector_traits<CascadeLayerType>::scalar_type
| uint32_t size; | ||
| uint32_t start; | ||
| uint32_t base; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you need completely different members, see the old GLSL code
Also put the structs in separate headers (so easier to share with C++ without sharing the accumulator and resolve):
- splatting parameters
- resolve parameters
etc.
| float lowerScale = cascadeSettings.start; | ||
| float upperScale = lowerScale * cascadeSettings.base; | ||
|
|
||
| const float luma = getLuma(sample); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of float use the alias https://github.com/Devsh-Graphics-Programming/Nabla/pull/936/files#r2473742745 for vector_traits<CascadeLayerType>::scalar_type
| uint32_t lowerCascadeIndex = 0u; | ||
| while (!(luma < upperScale) && lowerCascadeIndex < cascadeSettings.size - 2) | ||
| { | ||
| lowerScale = upperScale; | ||
| upperScale *= cascadeSettings.base; | ||
| ++lowerCascadeIndex; | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can get the floating point cascade index by dealing with logarithms
clamp((log2(luma) - log2(Start))*1.f/log2(Base), 0 , CascadeSize-1)this is why I say that the Splatting Parameters need to be precomputing different things, like log2_start, and rcp_log2_base
| uint32_t lowerCascadeIndex = 0u; | ||
| while (!(luma < upperScale) && lowerCascadeIndex < cascadeSettings.size - 2) | ||
| { | ||
| lowerScale = upperScale; | ||
| upperScale *= cascadeSettings.base; | ||
| ++lowerCascadeIndex; | ||
| } | ||
|
|
||
| float lowerCascadeLevelWeight; | ||
| float higherCascadeLevelWeight; | ||
|
|
||
| if (luma <= lowerScale) | ||
| lowerCascadeLevelWeight = 1.0f; | ||
| else if (luma < upperScale) | ||
| lowerCascadeLevelWeight = max(0.0f, (lowerScale / luma - lowerScale / upperScale) / (1.0f - lowerScale / upperScale)); | ||
| else // Inf, NaN ... | ||
| lowerCascadeLevelWeight = 0.0f; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you can get the floating point cascade index by dealing with logarithms
const scalar_t log2Luma= log2<scalar_t>(luma);
const scalar_t cascade = max(log2Luma*1.f/log2(Base) - log2(Start)/log2(Base), 0, CascadeCount-1);
const scalar_t clampedCascade = clamp(cascade, 0, CascadeCount-1);
// c<=0 -> 0, c>=Count-1 -> Count-1
uint16_t lowerCascadeIndex = floor<scalar_t>(clampedCascade);
// 0 whenever clamped or `cascade` is integer (when `clampedCascade` is integer)
scalar_t higherCascadeWeight = cascade-floor<scalar_t>(clampedCascade);
// never 0 thanks to magic of `1-fract(x)`
scalar_t lowerCascadeWeight = scalar_type(1)-higherCascadeWeight;
// handle super bright sample case
if (cascade>CascadeCount-1)
lowerCascadeWeight = exp2(log2(Start)+log2(Base)*(CascadeCount-1)-log2Luma);
// only deal with splatting the higher cascade if you're not out of bound
// when out of bounds negatively, we use lower cascade as the only cascade (follow the math above)
if (clampedCascade==cascade) // equivalent condition is `higherCascadeWeight!=0`
{
const uint16_t higherCascadeIndex = lowerCascadeIndex+1;
// do stuff for higher cascade
}this is why I say that the Splatting Parameters need to be precomputing different things, like logBaseStart, rcpLog2Base, and log2CascadeEnd
| using output_storage_type = CascadeEntry<CascadeLayerType, CascadeSize>; | ||
| using initialization_data = CascadeSettings; | ||
| output_storage_type accumulation; | ||
| uint32_t cascadeSampleCounter[CascadeSize]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stick sample count into the CascadeEntry
| accumulation.data[lowerCascadeIndex] += (sample * lowerCascadeLevelWeight - (sampleCount - (cascadeSampleCounter[lowerCascadeIndex])) * accumulation.data[lowerCascadeIndex]) * reciprocalSampleCount; | ||
| accumulation.data[higherCascadeIndex] += (sample * higherCascadeLevelWeight - (sampleCount - (cascadeSampleCounter[higherCascadeIndex])) * accumulation.data[higherCascadeIndex]) * reciprocalSampleCount; | ||
| cascadeSampleCounter[lowerCascadeIndex] = sampleCount; | ||
| cascadeSampleCounter[higherCascadeIndex] = sampleCount; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is really long in a line and reads badly, could be turned into a method you call per cascade.
Also remove the extra () around cascadeSampleCounter[...]
| namespace impl | ||
| { | ||
|
|
||
| struct CascadeSample | ||
| { | ||
| float32_t3 centerValue; | ||
| float normalizedCenterLuma; | ||
| float normalizedNeighbourhoodAverageLuma; | ||
| }; | ||
|
|
||
| // TODO: figure out what values should pixels outside have, 0.0f is incorrect | ||
| float32_t3 sampleCascadeTexel(int32_t2 currentCoord, int32_t2 offset, in RWTexture2DArray<float32_t4> cascade, uint32_t cascadeIndex) | ||
| { | ||
| const int32_t2 texelCoord = currentCoord + offset; | ||
| if (any(texelCoord < int32_t2(0, 0))) | ||
| return float32_t3(0.0f, 0.0f, 0.0f); | ||
|
|
||
| float32_t4 output = cascade.Load(int32_t3(texelCoord, int32_t(cascadeIndex))); | ||
| return float32_t3(output.r, output.g, output.b); | ||
| } | ||
|
|
||
| float32_t calcLuma(in float32_t3 col) | ||
| { | ||
| return hlsl::dot<float32_t3>(hlsl::transpose(colorspace::scRGBtoXYZ)[1], col); | ||
| } | ||
|
|
||
| CascadeSample SampleCascade(in int32_t2 coord, in RWTexture2DArray<float32_t4> cascade, in uint cascadeIndex, in float reciprocalBaseI) | ||
| { | ||
| float32_t3 neighbourhood[9]; | ||
| neighbourhood[0] = sampleCascadeTexel(coord, int32_t2(-1, -1), cascade, cascadeIndex); | ||
| neighbourhood[1] = sampleCascadeTexel(coord, int32_t2(0, -1), cascade, cascadeIndex); | ||
| neighbourhood[2] = sampleCascadeTexel(coord, int32_t2(1, -1), cascade, cascadeIndex); | ||
| neighbourhood[3] = sampleCascadeTexel(coord, int32_t2(-1, 0), cascade, cascadeIndex); | ||
| neighbourhood[4] = sampleCascadeTexel(coord, int32_t2(0, 0), cascade, cascadeIndex); | ||
| neighbourhood[5] = sampleCascadeTexel(coord, int32_t2(1, 0), cascade, cascadeIndex); | ||
| neighbourhood[6] = sampleCascadeTexel(coord, int32_t2(-1, 1), cascade, cascadeIndex); | ||
| neighbourhood[7] = sampleCascadeTexel(coord, int32_t2(0, 1), cascade, cascadeIndex); | ||
| neighbourhood[8] = sampleCascadeTexel(coord, int32_t2(1, 1), cascade, cascadeIndex); | ||
|
|
||
| // numerical robustness | ||
| float32_t3 excl_hood_sum = ((neighbourhood[0] + neighbourhood[1]) + (neighbourhood[2] + neighbourhood[3])) + | ||
| ((neighbourhood[5] + neighbourhood[6]) + (neighbourhood[7] + neighbourhood[8])); | ||
|
|
||
| CascadeSample retval; | ||
| retval.centerValue = neighbourhood[4]; | ||
| retval.normalizedNeighbourhoodAverageLuma = retval.normalizedCenterLuma = calcLuma(neighbourhood[4]) * reciprocalBaseI; | ||
| retval.normalizedNeighbourhoodAverageLuma = (calcLuma(excl_hood_sum) * reciprocalBaseI + retval.normalizedNeighbourhoodAverageLuma) / 9.f; | ||
| return retval; | ||
| } | ||
|
|
||
| } // namespace impl |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
instead of using an impl namespace, roll this up into a struct
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could roll it up into a ResolveAccessorAdaptor (and also make a ResolveAccessor concept which is LoadableImage + the extra methods like calcLuma and the get with an offset for a particular cascade which does bounds checking)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
general note, you should allow using any vector<scalar,N> for the color values (the default adaptor can be hardcoded for vector<scalar,3>)
| #ifndef _NBL_BUILTIN_HLSL_RWMC_RWMC_HLSL_INCLUDED_ | ||
| #define _NBL_BUILTIN_HLSL_RWMC_RWMC_HLSL_INCLUDED_ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this header very clearly deals with Resolving, so call it resolve.hlsl not rwmc.hlsl
| // TODO: figure out what values should pixels outside have, 0.0f is incorrect | ||
| float32_t3 sampleCascadeTexel(int32_t2 currentCoord, int32_t2 offset, in RWTexture2DArray<float32_t4> cascade, uint32_t cascadeIndex) | ||
| { | ||
| const int32_t2 texelCoord = currentCoord + offset; | ||
| if (any(texelCoord < int32_t2(0, 0))) | ||
| return float32_t3(0.0f, 0.0f, 0.0f); | ||
|
|
||
| float32_t4 output = cascade.Load(int32_t3(texelCoord, int32_t(cascadeIndex))); | ||
| return float32_t3(output.r, output.g, output.b); | ||
| } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no weird input textures, make this an Accessor Adaptor (does out of bound checks) which requires concepts::accessors::LoadableImage
btw nothe that the names of the concepts need fixing https://github.com/Devsh-Graphics-Programming/Nabla/blob/aa5d6cb90b383e68b52c6abe69c6efe96281c7dd/include/nbl/builtin/hlsl/concepts/accessors/loadable_image.hlsl
I copy pasted and they have Storable in the name for this reason, should be Loadable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rwmc::ResolveAccessorAdaptor should fullfill LoadableImage itself
| CascadeSample SampleCascade(in int32_t2 coord, in RWTexture2DArray<float32_t4> cascade, in uint cascadeIndex, in float reciprocalBaseI) | ||
| { | ||
| float32_t3 neighbourhood[9]; | ||
| neighbourhood[0] = sampleCascadeTexel(coord, int32_t2(-1, -1), cascade, cascadeIndex); | ||
| neighbourhood[1] = sampleCascadeTexel(coord, int32_t2(0, -1), cascade, cascadeIndex); | ||
| neighbourhood[2] = sampleCascadeTexel(coord, int32_t2(1, -1), cascade, cascadeIndex); | ||
| neighbourhood[3] = sampleCascadeTexel(coord, int32_t2(-1, 0), cascade, cascadeIndex); | ||
| neighbourhood[4] = sampleCascadeTexel(coord, int32_t2(0, 0), cascade, cascadeIndex); | ||
| neighbourhood[5] = sampleCascadeTexel(coord, int32_t2(1, 0), cascade, cascadeIndex); | ||
| neighbourhood[6] = sampleCascadeTexel(coord, int32_t2(-1, 1), cascade, cascadeIndex); | ||
| neighbourhood[7] = sampleCascadeTexel(coord, int32_t2(0, 1), cascade, cascadeIndex); | ||
| neighbourhood[8] = sampleCascadeTexel(coord, int32_t2(1, 1), cascade, cascadeIndex); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SampleCascade should be a method of your ResolveAccessor (default impl and concept)
Actually should be a pseudo-private method of a Resolve Struct
sampleCascadeTexel can be a pseudo private __sampleCascadeTexel in the ResolveAccessorAdaptor (the default impl)
| if (any(texelCoord < int32_t2(0, 0))) | ||
| return float32_t3(0.0f, 0.0f, 0.0f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this doesn't check for being out of bounds with texelCoord+1, you're only protecting against texelCoord-1
| float normalizedNeighbourhoodAverageLuma; | ||
| }; | ||
|
|
||
| // TODO: figure out what values should pixels outside have, 0.0f is incorrect |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you use accessors, you make it user-configurable, all that you can write is that MIRROR and CLAMP will falsely make similar neighbour pixels reducing RWMC effect on the border, while CLAMP_TO_BORDER (returning 0) will boost the RWMC effect because samples think they have more 0 neighbours
You usually want vignetting, so what you're doing now (CLAMP_TO_BORDER with black border) is probably desirable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
wirte it all in a comment
|
|
||
| } // namespace impl | ||
|
|
||
| struct ReweightingParameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rename to ResolveParameters
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
also divide into two structs
ResolveConstantParams - things that can't change during rendering, like:
- last Cascade Index
- reciprocalBase
and ResolveRuntimeParams which are things you can either change with imgui slider or changes with accumulation
- initialEmin
- rcplN
- rcpKappa
so ResolvePAramters is two of those structs plus any free floating dependent variables like colorReliabilityFactor and NoverKappa
| float32_t3 reweight(in ReweightingParameters params, in RWTexture2DArray<float32_t4> cascade, in int32_t2 coord) | ||
| { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make this into a Resolve struct templated on the Accessor you'll use instead of RWTexture2DArray (which really should only be read only)
you Resolve::create from the ResolveParameters and you have an operator()(NBL_REF_ARG(CascadeAccessor) acc, const int32_t2 coord) which does the reweighting
| float reciprocalBaseI = 1.f; | ||
| impl::CascadeSample curr = impl::SampleCascade(coord, cascade, 0u, reciprocalBaseI); | ||
|
|
||
| float32_t3 accumulation = float32_t3(0.0f, 0.0f, 0.0f); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
would be good to template everything on the vector type and its scalar
| LIST_BUILTIN_RESOURCE(NBL_RESOURCES_TO_EMBED "hlsl/rwmc/rwmc.hlsl") | ||
| LIST_BUILTIN_RESOURCE(NBL_RESOURCES_TO_EMBED "hlsl/rwmc/CascadeAccumulator.hlsl") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
need to separate parameter structs passed and initialized between c++ and HLSL from the HLSL algorithms for splatting and resolve, so more headers
Chop it up!
No description provided.