@kalenedrael
Contributor

@kalenedrael kalenedrael commented Nov 13, 2025

  • Add batch_traits that unifies traits for scalar, batch, and batch_bool.
  • Treat batch_bool as its own mask type.
  • Add casts for scalar types.
  • Add select() for batch_bool.
  • Simplify abs implementation.

- Add `batch_traits` that unifies traits for scalar, `batch`, and
  `batch_bool`.
- Treat `batch_bool` as its own mask type.
- `batch_cast` can now be used on `batch_bool` (because why not?)
- Add casts for scalar types
- Simplify `abs` implementation.
@kalenedrael
Contributor Author

FWIW, the goal is to make it easier to write generic code that works on scalar and batch types. For instance, this enables something like:

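template <typename VecF, typename VecI, typename VecB>
VecF DoStuff(VecF fv, VecI iv, VecB cond) {
  using F = typename batch_traits<VecF>::scalar_type;
  // Mask computed on the integer input.
  auto cond_i = iv > 42;
  // Combine both masks, then pick the sum where they hold and zero elsewhere.
  return select(batch_cast<F>(cond_i) & cond, fv + batch_cast<F>(iv), 0.0);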
}

which can be used as DoStuff(float, int, bool) or DoStuff(batch<float>, batch<int>, batch_bool<float>).
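To make the idea concrete without depending on the library, here is a minimal, self-contained sketch of a traits class that describes scalars and batches the same way. Everything in it (`toy_batch`, `toy_mask`, `toy_traits`, `toy_select`, `clamp_negative_to_zero`) is a hypothetical stand-in invented for illustration; it is not the `batch_traits` from this PR and does not use any real SIMD instructions.

```cpp
#include <cassert>  // only needed for assert-based checks by the caller
#include <cstddef>

// Toy stand-ins, NOT the library's real types: a fixed-width batch and its mask.
template <typename T, std::size_t N>
struct toy_batch { T lane[N]; };

template <typename T, std::size_t N>
struct toy_mask { bool lane[N]; };

// Hypothetical traits. Primary template: a scalar is its own scalar_type
// and uses plain bool as its mask.
template <typename V>
struct toy_traits {
    using scalar_type = V;
    using mask_type = bool;
};

// Specialization: a batch exposes its element type and its own mask type.
template <typename T, std::size_t N>
struct toy_traits<toy_batch<T, N>> {
    using scalar_type = T;
    using mask_type = toy_mask<T, N>;
};

// Scalar select.
template <typename T>
T toy_select(bool cond, T a, T b) { return cond ? a : b; }

// Lane-wise select for the toy batch.
template <typename T, std::size_t N>
toy_batch<T, N> toy_select(toy_mask<T, N> cond, toy_batch<T, N> a, toy_batch<T, N> b) {
    toy_batch<T, N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out.lane[i] = cond.lane[i] ? a.lane[i] : b.lane[i];
    return out;
}

// Generic code written once against the traits: zero out flagged lanes.
template <typename V>
V clamp_negative_to_zero(V x, typename toy_traits<V>::mask_type is_negative) {
    return toy_select(is_negative, V{}, x);
}
```

With these definitions, `clamp_negative_to_zero(-2.0f, true)` and a call taking a `toy_batch<float, 4>` plus a `toy_mask<float, 4>` both go through the same function template; only the traits lookup differs.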

Unifying batch_bool with batch enables code like this to work for either type (at least it would, if select supported batch_bool, which I will add):

template <typename Vec>
auto LoadSelect(bool *b, Vec x, Vec y) {
  using VecB = typename batch_traits<Vec>::mask_type;
  return select(VecB::load_unaligned(b), x, y);
}

Tangentially related: I wonder if, in general, batch_bool should be parameterized on vector size instead of the underlying type. It's a bit strange to say that a boolean can only be used on a particular scalar type. Unfortunately, at least on x86, there is a distinction between integer and floating point vector types, and on many older processors there is a performance penalty for crossing domains. This can basically be ignored on any newer processor, but it's something to keep in mind.
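As a rough illustration of the size-parameterized idea, the sketch below keys the mask only on the lane count, so a comparison on integers can directly condition a select on floats. The names `lane_mask`, `lanes`, `greater_than`, and `select_lanes` are hypothetical and exist only for this example. It uses plain arrays, so it says nothing about the integer versus floating-point register domains mentioned above.

```cpp
#include <array>
#include <cassert>  // only needed for assert-based checks by the caller
#include <cstddef>

// Hypothetical mask keyed on lane count N only, not on the element type.
template <std::size_t N>
struct lane_mask { std::array<bool, N> lane; };

// Toy batch: just N values of type T.
template <typename T, std::size_t N>
using lanes = std::array<T, N>;

// A comparison on any element type yields the same mask type for a given N.
template <typename T, std::size_t N>
lane_mask<N> greater_than(const lanes<T, N>& v, T threshold) {
    lane_mask<N> m{};
    for (std::size_t i = 0; i < N; ++i)
        m.lane[i] = v[i] > threshold;
    return m;
}

// Select accepts any element type as long as the lane counts match.
template <typename T, std::size_t N>
lanes<T, N> select_lanes(const lane_mask<N>& m, const lanes<T, N>& a, const lanes<T, N>& b) {
    lanes<T, N> out{};
    for (std::size_t i = 0; i < N; ++i)
        out[i] = m.lane[i] ? a[i] : b[i];
    return out;
}
```

Here the mask from `greater_than` on a `lanes<int, 4>` can be passed straight to `select_lanes` on two `lanes<float, 4>` values, with no cast in between.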

@serge-sans-paille
Contributor

  • Add batch_traits that unifies traits for scalar, batch, and batch_bool.

I like that!

* `batch_cast` can now be used on `batch_bool` (because why not?)

I'm not quite sure about this one. We currently have a conversion operator from batch_bool to batch, and within that setup I find it confusing to have a batch_cast on a batch_bool returning a batch_bool.

@kalenedrael
Contributor Author

True. Do you think it's better to allow implicit conversion between batch_bool of the same size?

@serge-sans-paille
Contributor

True. Do you think it's better to allow implicit conversion between batch_bool of the same size?

I'm not a big fan of this either :-/
Just because explicit is better than implicit.

@kalenedrael
Contributor Author

I'll remove it from this PR; I think the other features are enough for what I want at the moment.

The problem I'm trying to solve is that batch_bool_cast is fundamentally nonsensical. There's no conceptual reason the result of an integer comparison shouldn't be directly usable for conditioning float expressions, for instance. Unfortunately I don't have a good solution yet.

@kalenedrael kalenedrael changed the title from "Add batch_traits and some missing functions" to "Unify scalar, batch, and batch_bool in some places" Nov 19, 2025
@kalenedrael
Contributor Author

Ok, select for batch_bool has been added. This is ready now; please let me know if you have any concerns.

template <class A, class T, bool... Values>
XSIMD_INLINE batch_bool<T, A> select(batch_bool_constant<T, A, Values...> const&, batch_bool<T, A> const& true_br, batch_bool<T, A> const& false_br, requires_arch<common>)
{
return select<A>(batch_bool<T, A> { Values... }, true_br, false_br, A {});
Contributor

This seems sub-optimal in the case where we don't have access to a fast select, see https://godbolt.org/z/3GqaxTsbd
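For context, one common way to select between two boolean masks without a dedicated blend instruction is the bitwise identity `(c & t) | (~c & f)`. The sketch below shows it on a hypothetical 8-lane mask packed into one byte (`mask8` and `select_mask` are names made up for this example). It is only an illustration of the general technique, under the assumption that masks are all-or-nothing per lane; it does not reproduce the linked Compiler Explorer example.

```cpp
#include <cassert>  // only needed for assert-based checks by the caller
#include <cstdint>

// Hypothetical packed mask: each bit is one boolean lane.
using mask8 = std::uint8_t;

// Bitwise select on masks: take t where c is set, f where it is clear.
// Operands are promoted to int, so the result is cast back to 8 bits.
constexpr mask8 select_mask(mask8 c, mask8 t, mask8 f) {
    return static_cast<mask8>((c & t) | (~c & f));
}
```

Because the function is `constexpr`, a condition known at compile time lets the compiler fold the whole expression, which is the kind of simplification a constant mask makes possible.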
