Skip to content

Conversation

itay-space
Copy link

@itay-space itay-space commented Aug 27, 2025

Description

This PR is the first step in breaking down the point detector tally project into smaller, reviewable pieces, following the large PR #3109
and the paper Development and benchmarking of a point detector in OpenMC.

As suggested during the earlier discussion (thanks @GuySten, @gridley, @shimwell !), the natural starting point is to add the ability to correctly calculate PDFs on the C++ side.

Accordingly, this PR introduces the get_pdf implementations for the distribution classes. This is essentially a clean cut from the original large PR, with the goal of keeping the scope focused and review manageable.

We plan to follow up with tests and additional pieces of the point detector work in subsequent PRs.

Checklist

  • I have performed a self-review of my own code
  • I have run clang-format (version 15) on any C++ source files (if applicable)
  • I have followed the style guidelines for Python source files (if applicable)
  • I have made corresponding changes to the documentation (if applicable)
  • I have added tests that prove my fix is effective or that my feature works (if applicable)

@itay-space itay-space changed the title Implement get_pdf functionality [Point Detector] Add distribution get_pdf functionality Aug 27, 2025
Copy link
Contributor

@GuySten GuySten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work @itay-space!

My review focus more on the code structure side of things.
I did not check the maths.

itay-space and others added 6 commits August 28, 2025 08:32
Co-authored-by: GuySten <62616591+GuySten@users.noreply.github.com>
Co-authored-by: GuySten <62616591+GuySten@users.noreply.github.com>
Co-authored-by: GuySten <62616591+GuySten@users.noreply.github.com>
Co-authored-by: GuySten <62616591+GuySten@users.noreply.github.com>
Co-authored-by: GuySten <62616591+GuySten@users.noreply.github.com>
Co-authored-by: GuySten <62616591+GuySten@users.noreply.github.com>
Co-authored-by: GuySten <62616591+GuySten@users.noreply.github.com>
@itay-space
Copy link
Author

@GuySten I saw the changes you made, and I have a few conclusions. You’re right that the PDF functions themselves don’t need a seed. But there’s an important caveat: we do need the outgoing energy that was sampled in the original transport step, because the PDF of mu depends on it (see, for example, the Kalbach–Mann distribution).

The distributions where you implemented angular_pdf and conditional_sample_energy are only the special cases where they are independent.

@GuySten
Copy link
Contributor

GuySten commented Sep 4, 2025

@itay-space, I am still working on the implementation for more distributions. Some of them are harder than others.
Regarding the kalbach-mann distribution I think it is theoretically possible to calculate the energy independent pdf but is sure mathematically complicated.
I want more time to think about it.

@GuySten
Copy link
Contributor

GuySten commented Sep 4, 2025

@itay-space, after looking at your comment and further thinking I have reached the following conclusions:

Theoretically the angular_pdf for every distribution is defined and computable but it may be hard to calculate it.
A trick that we can do and should do especially in the case of the kalbach mann distribution is to sample the pdf and the outgoing energy simultaneously so the get_pdf method in those cases should be named sample_energy_and_pdf and not split into two functions.

Theoretically I could try to derive the exact mathematical expressions for the angular_pdf but it is a headache.

@itay-space
Copy link
Author

@GuySten Right, that’s exactly why the seed was used - to sample the outgoing energy. If we can somehow share the outgoing energy that was sampled in the sample function with the get_pdf function, that would be great. But if not, then I suggest we go back to the original implementation with the seed.

@GuySten
Copy link
Contributor

GuySten commented Sep 4, 2025

@GuySten Right, that’s exactly why the seed was used - to sample the outgoing energy. If we can somehow share the outgoing energy that was sampled in the sample function with the get_pdf function, that would be great. But if not, then I suggest we go back to the original implementation with the seed.

The fact that we need the seed to sample the outgoing energy is not that surprising.
The surprising fact (to me) is that we use the seed to return not the real angular pdf but a stochastic unbiased estimate of the angular pdf.

Because in theory the exact angular pdf does not depend on the outgoing energy at all!

@GuySten
Copy link
Contributor

GuySten commented Sep 4, 2025

If all the tests will pass and nobody will object I am planning to merge this PR at the start of next week.

Now the names we use should not clash with the source biasing PR.

@eepeterson, @itay-space, @j-fletcher, if you have further comments you have until start of next week.

@GuySten GuySten added the Merging Soon PR will be merged in < 24 hrs if no further comments are made. label Sep 4, 2025
@eepeterson
Copy link
Contributor

I do have additional comments, thanks for your patience.

  1. I fear the name sample_energy_and_pdf is now less clear than it was before.
  2. Fundamentally we are just dealing with probability distributions here - there are univariate and multivariate distributions and they can be correlated or independent. Regardless, we need to be able to do two things with them for the features under discussion: we need to sample from them and evaluate them. That's why I believe this is the appropriate public interface to expose for both clarity and extensibility. To go back to the surface analogy, a Surface is an object that has an equation that we evaluate to find the value of the equation describing the surface. The surface and the equation that describes it are effectively synonymous. I'd argue that the PDF or PMF are likewise synonymous with the Distribution to warrant this terminology.
  3. For the point detector tally we shouldn't have to sample an energy in the direction of the detector as long as we have the distribution of energies given the virtual particle scatters towards the detector. You can then integrate this spectrum with the attenuation factor that is also energy dependent and the detector response function to get the tally contribution and further reduce the variance. This way we could avoid some of the ambiguity in these functions as well as the duplication of code. My proposal here would be to have a polymorphic sample function for joint probability distributions with an enum class that functions as a case switch to determine whether to sample both E and mu, or mu given E, or E given mu, etc.
  4. It would be good to add some unit tests to make sure functionality is working. As it stands, tests are passing because nothing new or different is being tested.

@GuySten
Copy link
Contributor

GuySten commented Sep 5, 2025

I do have additional comments, thanks for your patience.

  1. I fear the name sample_energy_and_pdf is now less clear than it was before.
  2. Fundamentally we are just dealing with probability distributions here - there are univariate and multivariate distributions and they can be correlated or independent. Regardless, we need to be able to do two things with them for the features under discussion: we need to sample from them and evaluate them. That's why I believe this is the appropriate public interface to expose for both clarity and extensibility. To go back to the surface analogy, a Surface is an object that has an equation that we evaluate to find the value of the equation describing the surface. The surface and the equation that describes it are effectively synonymous. I'd argue that the PDF or PMF are likewise synonymous with the Distribution to warrant this terminology.
  3. For the point detector tally we shouldn't have to sample an energy in the direction of the detector as long as we have the distribution of energies given the virtual particle scatters towards the detector. You can then integrate this spectrum with the attenuation factor that is also energy dependent and the detector response function to get the tally contribution and further reduce the variance. This way we could avoid some of the ambiguity in these functions as well as the duplication of code. My proposal here would be to have a polymorphic sample function for joint probability distributions with an enum class that functions as a case switch to determine whether to sample both E and mu, or mu given E, or E given mu, etc.
  4. It would be good to add some unit tests to make sure functionality is working. As it stands, tests are passing because nothing new or different is being tested.
  1. I am open to suggestions. Please bear in mind that the name should reflect what the current implementation do.
  2. If you are insistent on evaluate for what I called pdf I do not care enough to argue.
  3. If we want to integrate numerically the spectrum with the attenuation factor then we actually need to sample an energy from the spectrum and multiply it with the attenuation factor which is energy dependent (unless you want to do quadrature based approach on an energy grid which might be possible but IMO out of scope for this PR). I also thought that having sample function that sample E given mu etc is a good idea. But then I did not manage to code it for the more complicated distributions (like kalbach mann). If you can do it I agree that the sample_energy_and_pdf might not be needed anymore. Can we agree that we can merge it as is and If someone will manage to do it we will change it later?
  4. Currently we only test that the program can be built and is formatted correctly.
    More tests will have to be added in subsequent PRs before this functionality is exposed to users.

@itay-space
Copy link
Author

itay-space commented Sep 5, 2025

@GuySten @eepeterson
Thanks for the feedback! I’m fine with evaluate instead of pdf. In my opinion, for joint distributions the best approach is to get E_out from the original sample and somehow share that with the evaluate/pdf function. If we don’t share it, then yes, we’d need to resample E_out inside evaluate. Just to illustrate: in a one-collision case, if the originally sampled E_out kinematically prevents the particle from reaching the detector, then using the original sample keeps the virtual particle blocked as it should be. Resampling inside evaluate might yield an E_out that allows it to reach, which I think is less consistent (though both converge in the limit of large histories). I suggest we merge this as a baseline and refine the interface + add proper unit tests in follow-up PRs.

Copy link
Contributor

@paulromano paulromano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Putting a placeholder review as I'd like to take a look at this before we merge.

@j-fletcher
Copy link
Contributor

Hello all, apologies for the delay here.

  1. I am inclined to agree with @eepeterson that evaluate fits better with the functionality we're trying to add here. Looking at the univariate Tabular distribution for example, there are already methods called integral and cdf, with the difference being that cdf returns the "running total," so to speak, while integral simply returns the total probability under the curve of the PDF. In keeping with this point of view, pdf would in my mind return a vector containing each probability p(x) used to construct the distribution, which we're already able to access via the .p attribute. What we're after instead is PDF evaluated at a specific point, thus evaluate.
  2. I also agree with @GuySten that I think returning fatal error if evaluate is not implemented would be fine. I'm open to persuasion, however.

@GuySten
Copy link
Contributor

GuySten commented Sep 24, 2025

@itay-space, I noticed that the calculation of the uncollided contribution to the point detector tally is using the probability density function of emitting a particle in a specific direction.

I think that we should extend this PR (or maybe add another PR) that will add evaluate function for the classes that inherit from UnitSphereDistribution.

I propose the signature: double evaluate(Direction u) to match the signature of the rest of the 1d distribution functions.

@GuySten GuySten removed the Merging Soon PR will be merged in < 24 hrs if no further comments are made. label Sep 24, 2025
@itay-space
Copy link
Author

@GuySten , Thanks for the suggestion! These classes already rely on two independent 1D distributions (polar and azimuthal), and the evaluate methods are implemented at that level. The sphere distributions just combine those underlying 1D PDFs, so the functionality is already covered.

@GuySten
Copy link
Contributor

GuySten commented Sep 25, 2025

@itay-space, the conversion is not trivial because of the reference direction, so I think that adding this functionality is worth while.
If you do not object I can add the relevant functionality myself.

@itay-space
Copy link
Author

@GuySten I’m not sure why this would be different from the other cases (I can’t see it right now), but sure, go ahead and add the functionality. I have no objections. Thanks!

@GuySten
Copy link
Contributor

GuySten commented Sep 28, 2025

I am waiting for #3582 to be merged to add the functionality I talked about.
Without that PR it is unclear what is the correct mapping between a point on the unit sphere and the mu, phi angles to calculate the pdf for.

@GuySten
Copy link
Contributor

GuySten commented Oct 14, 2025

@paulromano, I am reminding you to look at this PR when you have some time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants