Handling of misaligned arrays #523

@jakelishman

There are a few methods that produce references to items in Numpy arrays, such as PyArrayMethods::get and PyArrayMethods::as_slice. Rust references are required to point to aligned data, while Numpy arrays aren't required to be aligned (though they usually are in practice, unless you try hard). When passed a misaligned array, a reference-returning method in rust-numpy currently invokes UB in safe code by attempting to create the reference.

I'm not sure how practical it is to handle misaligned arrays in all cases. One place where it would be convenient, however, is for zero-length arrays in as_slice. With pickle protocol 5 and its PickleBuffer objects now the default pickling method from Python 3.14+, Numpy arrays round-trip through pickle using them; for an empty array, this causes the unpickled array to be produced by (effectively) numpy.frombuffer(bytearray()). Since bytearray() is concerned with bytes, its buffer is only guaranteed to be 1-byte aligned. I observed Python 3.14 on Linux x86-64 reliably putting the bytearray() singleton buffer at an odd memory address (see python/cpython#140557), which was causing array-extraction failures in my library's custom pickling handling whenever the object happened to be empty.

For example:

use numpy::{PyArray1, PyArrayMethods};
use pyo3::prelude::*;
use pyo3::types::IntoPyDict;

fn main() -> PyResult<()> {
    Python::initialize();
    Python::attach(|py| {
        let array = PyArray1::<u16>::zeros(py, 4, false);
        let view = py
            .eval(
                c"array.view('u1')[1:-1].view('u2')",
                None,
                Some(&[("array", array)].into_py_dict(py)?),
            )?
            .cast_into::<PyArray1<u16>>()?;
        assert_eq!(view.readonly().as_slice()?, &[0, 0, 0]);
        Ok(())
    })
}

will panic in debug mode in the as_slice call without returning an error variant:

thread 'main' (30956354) panicked at /Users/jake/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/numpy-0.27.1/src/array.rs:748:16:
unsafe precondition(s) violated: slice::from_raw_parts requires the pointer to be aligned and non-null, and the total size of the slice not to exceed `isize::MAX`

This indicates a bug in the program. This Undefined Behavior check is optional, and cannot be relied on for safety.
stack backtrace:
   0: __rustc::rust_begin_unwind
             at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/std/src/panicking.rs:698:5
   1: core::panicking::panic_nounwind_fmt::runtime
             at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/core/src/panicking.rs:122:22
   2: core::panicking::panic_nounwind_fmt
             at /rustc/ded5c06cf21d2b93bffd5d884aa6e96934ee4234/library/core/src/intrinsics/mod.rs:2435:9
   3: core::slice::raw::from_raw_parts::precondition_check
             at /Users/jake/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ub_checks.rs:73:21
   4: core::slice::raw::from_raw_parts
             at /Users/jake/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ub_checks.rs:78:17
   5: numpy::array::PyArrayMethods::as_slice
             at /Users/jake/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/numpy-0.27.1/src/array.rs:748:16
   6: numpy::borrow::PyReadonlyArray<T,D>::as_slice
             at /Users/jake/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/numpy-0.27.1/src/borrow/mod.rs:273:29
   7: rust_numpy_test::main::{{closure}}
             at ./src/main.rs:16:36
   8: pyo3::marker::Python::attach
             at /Users/jake/.cargo/registry/src/index.crates.io-1949cf8c6b5b557f/pyo3-0.27.2/src/marker.rs:426:9
   9: rust_numpy_test::main
             at ./src/main.rs:7:5
  10: core::ops::function::FnOnce::call_once
             at /Users/jake/.rustup/toolchains/stable-aarch64-apple-darwin/lib/rustlib/src/rust/library/core/src/ops/function.rs:250:5

I would be interested in:

  • PyArrayMethods::as_slice returning a safe Err variant if the array is misaligned, not just if it's non-contiguous (though this would potentially be an API break to change the return type)
  • PyArrayMethods::as_slice safely returning &[] if the length is zero, regardless of the alignment of the internal pointer of the array
  • any discussion on how to handle data alignment concerns elsewhere in the library

I can easily prepare a patch that addresses the first two points.
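
To make the first two points concrete, here is a rough sketch of the checks I have in mind, written against raw pointers only so that it stands alone. The names checked_slice and SliceError are hypothetical and are not part of rust-numpy. A real patch would also keep the existing contiguity check and would have to fit into the crate's existing error types.

```rust
use std::mem::align_of;
use std::slice;

/// Hypothetical error type for this sketch; not a rust-numpy type.
#[derive(Debug, PartialEq)]
enum SliceError {
    Null,
    Misaligned,
}

/// Hypothetical helper: build a slice from a raw data pointer, refusing
/// pointers that would make `slice::from_raw_parts` undefined behaviour.
///
/// # Safety
/// If `len > 0`, `ptr` must point to `len` initialised, readable `T`s that
/// stay valid and are not mutated for the lifetime `'a`.
unsafe fn checked_slice<'a, T>(ptr: *const T, len: usize) -> Result<&'a [T], SliceError> {
    // An empty slice never reads through the pointer, so return a dangling
    // but well-aligned empty slice regardless of what `ptr` looks like.
    if len == 0 {
        return Ok(&[]);
    }
    if ptr.is_null() {
        return Err(SliceError::Null);
    }
    // References must be aligned for `T`; report misalignment as an error
    // instead of constructing the reference.
    if (ptr as usize) % align_of::<T>() != 0 {
        return Err(SliceError::Misaligned);
    }
    // The pointer is non-null and aligned; the caller guarantees the rest.
    Ok(unsafe { slice::from_raw_parts(ptr, len) })
}
```

With a u16 pointer offset by one byte, as in the example above, this returns Err(SliceError::Misaligned) for a non-zero length and an empty slice for a zero length.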
