_FillValue to missing conversion fails due to precision loss from external tools #281

@Atreyu-94

Description

Many scientific workflows preprocess NetCDF files with external command-line tools such as the NetCDF Operators (NCO), for example ncks for subsetting.

A subtle but critical issue arises when these tools process files containing floating-point data. Operations like subsetting can cause a promotion/demotion cycle (e.g., float -> double -> float), which can introduce tiny precision errors. As a result, the numerical value of the data points corresponding to the fill value may no longer be bit-for-bit identical to the _FillValue attribute stored in the metadata.

When NCDatasets.jl loads such a file, the automatic conversion from _FillValue to missing fails, presumably because it relies on an exact equality check (==); I have not confirmed this in the source. The result is confusing: the loaded array has the type Array{Union{Missing, T}} but contains no missing values, even though it is full of what should be considered fill values.
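The symptom can be reproduced without any file, as a minimal sketch in plain Julia. The array below only stands in for raw data read from a NetCDF variable. The fill value and the one-bit drift are made-up values for illustration, and nothing here calls NCDatasets.jl itself.

```julia
# Stand-in for the _FillValue attribute stored in the file metadata
# (hypothetical value chosen for illustration).
fill_attr = 9.96921f36

# Stand-in for a fill entry that drifted by one unit in the last place
# after external processing.
drifted = nextfloat(fill_attr)

# Stand-in for the raw data of a variable: two valid values, two fill entries.
raw = Float32[1.5, drifted, 2.5, drifted]

# Exact comparison, as the issue suspects the loader performs.
exact_mask = raw .== fill_attr

# Approximate comparison with isapprox and its default relative tolerance.
approx_mask = isapprox.(raw, fill_attr)

# Build arrays with the Union{Missing, T} element type and apply each mask.
masked_exact = Vector{Union{Missing,Float32}}(raw)
masked_exact[exact_mask] .= missing

masked_approx = Vector{Union{Missing,Float32}}(raw)
masked_approx[approx_mask] .= missing

println("missing with ==       : ", count(ismissing, masked_exact))
println("missing with isapprox : ", count(ismissing, masked_approx))
```

With the exact mask the array keeps its Union{Missing, Float32} element type but holds no missing entries, which matches the behaviour described above.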

Proposed Solution
Instead of using strict equality (==), the library could use an approximate comparison (isapprox).

This would correctly identify and mask fill values even when minor precision discrepancies exist. To maintain transparency, a Logging warning could be emitted the first time this approximate match is triggered for a variable, informing the user that a precision mismatch was detected and handled automatically.
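A rough sketch of what such a check could look like follows. The function name mask_fillvalue, its keyword arguments, and the default tolerance are all assumptions made for illustration; this is not the NCDatasets.jl implementation or API.

```julia
# Hypothetical helper, not part of NCDatasets.jl: mask entries that match the
# fill value approximately and warn when the match was not exact.
function mask_fillvalue(raw::AbstractArray{T}, fillvalue::T;
                        varname = "variable",
                        rtol = sqrt(eps(T))) where {T<:AbstractFloat}
    exact  = raw .== fillvalue
    approx = isapprox.(raw, fillvalue; rtol = rtol)

    # Warn once per call (that is, once per variable load) if any entry
    # matched only approximately.
    if any(approx .& .!exact)
        @warn "Values in $varname match _FillValue only approximately; masking them as missing."
    end

    out = Array{Union{Missing,T}}(raw)
    out[approx] .= missing
    return out
end

# Usage with made-up data: the second entry is one bit away from the fill value.
data   = Float32[1, nextfloat(2f0), 3]
masked = mask_fillvalue(data, 2f0; varname = "temperature")
```

The tolerance is the main design question. The default used here, sqrt(eps(T)), is the same default isapprox applies and is loose enough that valid data very close to the fill value would also be masked, so a tighter or user-configurable rtol may be preferable. A NaN fill value would need a separate isnan check, since neither == nor isapprox treats NaN as equal to NaN by default.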

This change would significantly improve the user experience, as the library would "just work" as expected without requiring users to debug subtle floating-point issues and implement manual workarounds.

Thank you for considering this suggestion and for your work on this great package! :)
