@bettio bettio commented Oct 16, 2025

This PR introduces support for big integers in AtomVM, allowing arithmetic and bitwise operations on integers up to 256 bits (a 256-bit magnitude plus a separate sign bit). This significantly extends AtomVM's numeric capabilities beyond the previous 64-bit limitation.

Key Changes

Core Big Integer Support

  • Implemented a new big integer representation using boxed terms with sign bit
  • Added comprehensive big integer arithmetic through the new intn module
  • term_is_any_integer() now returns true for big integers
  • Boxed integers now utilize the sign bit for efficient sign representation

Arithmetic Operations

  • All arithmetic operations (+, -, *, div, rem, abs, neg) now support integers up to 256-bit
  • All bitwise operations (band, bor, bxor, bnot, bsl, bsr) now support integers up to 256-bit
  • Float conversion functions now handle big integer conversions in both directions

Serialization Support

  • Added big integer support in binary_to_term/1 and term_to_binary/1,2
  • External term format now encodes/decodes big integers as SMALL_BIG_EXT

JIT Enhancements

  • Added JIT support for big integer encoding
  • Implemented big integer constant support in opcodes (JIT and Emu)

Breaking Changes

Overflow Checking

  • bsl (bit shift left) now checks for overflow. This should not affect existing code, since integers were previously limited to 64 bits, but make sure values are masked before large left shifts, e.g. (16#FFFF band 16#F) bsl 252
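
The overflow rule for bsl can be sketched in C as a check on bit lengths: the shift is safe when the bits already in use plus the shift amount stay within the limit. `sketch_bsl_fits` is a hypothetical helper, not a function from this PR, and a 64-bit magnitude stands in for a full big integer.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: reports whether (magnitude << shift) still fits in
   max_bits bits. A 64-bit magnitude stands in for a full bigint here. */
static bool sketch_bsl_fits(uint64_t magnitude, unsigned shift, unsigned max_bits)
{
    /* Count the bits needed to represent the magnitude. */
    unsigned used_bits = 0;
    while (magnitude != 0) {
        used_bits++;
        magnitude >>= 1;
    }
    if (used_bits == 0) {
        return true; /* zero shifted by any amount is still zero */
    }
    return used_bits <= max_bits && shift <= max_bits - used_bits;
}
```

With a 256-bit limit, 16#F bsl 252 fits (4 + 252 bits), while 16#FFFF bsl 252 does not (16 + 252 bits), which is why the masking in the example above matters.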

Error Handling

  • binary_to_integer/1 no longer accepts binaries with whitespace or prefixes like <<"0xFF">> or <<" 123">>
  • binary_to_integer and list_to_integer now raise badarg instead of overflow when parsing integers exceeding 256 bits. Update error handling code accordingly

Bug Fixes

  • Fixed list_to_integer bug with integers close to INT64_MAX

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

bettio added 30 commits April 10, 2025 23:22
Clang and GCC both allow providing hints to the compiler; they offer similar
constructs with different syntax (builtin vs. attribute).

Add a portable macro for it.

Signed-off-by: Davide Bettio <davide@uninstall.it>
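
A minimal sketch of such a portable macro, assuming the hint in question is an "assume" style hint (the commit message does not name it). `HINT_ASSUME` and `shift_right_checked` are hypothetical names used only for illustration.

```c
/* HINT_ASSUME is a hypothetical name; the macro added by the commit may differ. */
#if defined(__clang__)
    /* Clang exposes the hint as a builtin. */
    #define HINT_ASSUME(cond) __builtin_assume(cond)
#elif defined(__GNUC__) && (__GNUC__ >= 13)
    /* GCC 13 and later expose it as a statement attribute. */
    #define HINT_ASSUME(cond) __attribute__((assume(cond)))
#else
    /* Other compilers: the hint expands to a no-op. */
    #define HINT_ASSUME(cond) ((void) 0)
#endif

/* Example use: tell the optimizer the shift amount is always in range. */
static inline unsigned shift_right_checked(unsigned value, unsigned shift)
{
    HINT_ASSUME(shift < 32);
    return value >> shift;
}
```

Clang is tested first because it also defines `__GNUC__`.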
Add functions for converting integers to a string, that are better
suited for our usage.

These new functions likely perform better than lltoa, since they
don't rely on helpers for 64 bit division. Compiler-optimization-friendly
functions for base 10 and 16 are also provided: the compiler is able to
optimize n / k, when k is a known constant, by replacing it with a
multiplication.

Note that these new functions will write characters without C string
terminator.

Signed-off-by: Davide Bettio <davide@uninstall.it>
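
As a rough illustration of the idea (not the actual `int*_write_to_ascii_buf` code, whose signatures are not shown here), a base 10 writer with a constant divisor and no string terminator could look like the following. `sketch_u64_write_base10` is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the base 10 digits of n so that they end right
   before buf_end, and returns how many characters were written.
   No '\0' terminator is appended; the caller tracks the length. */
static size_t sketch_u64_write_base10(uint64_t n, char *buf_end)
{
    char *pos = buf_end;
    do {
        /* The divisor is a compile-time constant, so the compiler is free to
           replace the division with a multiplication. */
        *--pos = (char) ('0' + (n % 10));
        n /= 10;
    } while (n != 0);
    return (size_t) (buf_end - pos);
}
```

A 20-byte buffer is enough for any 64-bit unsigned value; the digits occupy the last `len` bytes of the buffer, where `len` is the returned count.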
Refactor it in order to use the new `int*_write_to_ascii_buf` functions,
to make it easier to support big integers and to share code across the
to_binary and to_list functions.

Also remove `lltoa` function that is super slow: it relies on 64 bit
division that in most embedded architectures requires a helper function.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Refactor it in order to use new `int64_parse_ascii_buf` function.
Unlike strtoll the newly introduced function rejects binaries such as
"0xFF", so it behaves like OTP.

Also it doesn't require copying binaries to \0 terminated bufs.

Signed-off-by: Davide Bettio <davide@uninstall.it>
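
The stricter behavior can be sketched as follows, assuming base 10 only. This is not the real `int64_parse_ascii_buf` (its signature is not shown here); `sketch_parse_int64_strict` is a hypothetical stand-in that works on a length-delimited buffer, so no '\0' terminated copy is needed.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strict parser: optional single sign followed by digits only. */
static bool sketch_parse_int64_strict(const char *buf, size_t len, int64_t *out)
{
    size_t pos = 0;
    bool negative = false;
    if (len > 0 && (buf[0] == '-' || buf[0] == '+')) {
        negative = (buf[0] == '-');
        pos = 1;
    }
    if (pos == len) {
        return false; /* empty input, or a sign without digits */
    }
    /* The magnitude of INT64_MIN is 2^63, so accumulate as unsigned. */
    uint64_t limit = negative ? (uint64_t) INT64_MAX + 1 : (uint64_t) INT64_MAX;
    uint64_t acc = 0;
    for (; pos < len; pos++) {
        char c = buf[pos];
        if (c < '0' || c > '9') {
            return false; /* rejects whitespace, "0x" prefixes, trailing junk */
        }
        uint64_t digit = (uint64_t) (c - '0');
        if (acc > (limit - digit) / 10) {
            return false; /* acc * 10 + digit would exceed the limit */
        }
        acc = acc * 10 + digit;
    }
    if (negative) {
        /* Negate without overflowing, even when acc is 2^63. */
        *out = (acc == 0) ? 0 : -(int64_t) (acc - 1) - 1;
    } else {
        *out = (int64_t) acc;
    }
    return true;
}
```

In this sketch both "0xFF" and " 123" fail at the first non-digit character, matching the rejection described above.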
Helpers (`mul/div/add/sub_boxed_helper`) have been refactored in order to
prepare for the bigint implementation.

Signed-off-by: Davide Bettio <davide@uninstall.it>
`intn.c` contains functions for manipulating bigints (array of n
digits):
- `intn_mulmns`, `intn_divmnu`, and `nlz` are from Hacker's Delight
- Other functions such as `intn_addmns` are original work

This version is an attempt with numbers in two's complement, so division
required a wrapper for calling `divmnu` using an absolute value (this
specific function has not been broadly tested yet).

Given functions are limited to a maximum size for inputs and outputs
that is defined in `INTN_MAX_IN_LEN` and `INTN_MAX_RES_LEN`.
That's the reason it is called intn and not bigint.

Signed-off-by: Davide Bettio <davide@uninstall.it>
warning: suggest parentheses around ‘-’ inside ‘>>’ [-Wparentheses]

eg:
```
vn[i] = (v[i] << s) | (v[i - 1] >> 16 - s);
```

Signed-off-by: Davide Bettio <davide@uninstall.it>
The `nlz` function is used by the `divmnu` function.
Use the compiler builtin when available instead of the C implementation.

Signed-off-by: Davide Bettio <davide@uninstall.it>
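
A minimal sketch of this pattern, assuming a 32-bit `unsigned int` and the GCC/Clang `__builtin_clz` builtin. `sketch_nlz32` is a hypothetical name, and the fallback is a plain binary search in the spirit of Hacker's Delight rather than a copy of the code in `intn.c`.

```c
#include <stdint.h>

/* Number of leading zero bits in a 32-bit value. */
static inline int sketch_nlz32(uint32_t x)
{
    if (x == 0) {
        return 32; /* __builtin_clz is undefined for 0, so handle it first */
    }
#if defined(__GNUC__) || defined(__clang__)
    /* Assumes unsigned int is 32 bits wide. */
    return __builtin_clz(x);
#else
    /* Portable fallback: narrow down the highest set bit by halving. */
    int n = 0;
    if (x <= 0x0000FFFFu) { n += 16; x <<= 16; }
    if (x <= 0x00FFFFFFu) { n += 8; x <<= 8; }
    if (x <= 0x0FFFFFFFu) { n += 4; x <<= 4; }
    if (x <= 0x3FFFFFFFu) { n += 2; x <<= 2; }
    if (x <= 0x7FFFFFFFu) { n += 1; }
    return n;
#endif
}
```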
This function is a first round of integration with intn bigint
implementation.

- Allow printing big integers with `erlang:display/1` and in general with
`term_display` functions.
- Allow converting big integers to binaries and lists using
`erlang:integer_to_binary/1` and `erlang:integer_to_list/1`.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Implement a first arithmetic operation that uses `intn_mulmns` in order to
validate the whole approach.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Add first iteration on bigint tests, starting with tests for `erlang:*/2`
and `integer_to_binary/2`.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Just use `intn_parse` function.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Negative boxed integers have 3rd bit set (b1s00).
Also introduce new defines:
- TERM_BOXED_NEGATIVE_INTEGER
- TERM_BOXED_INTEGER_SIGN_BIT
- TERM_BOXED_INTEGER_SIGN_BIT_POS

Signed-off-by: Davide Bettio <davide@uninstall.it>
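
The bit pattern b1s00 can be read as follows. The `SKETCH_` names and values below are derived only from that pattern and are not the actual definitions in `term.h`, which may differ.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical constants derived from the b1s00 pattern: s is the 3rd bit. */
#define SKETCH_SIGN_BIT_POS 2u
#define SKETCH_SIGN_BIT (1u << SKETCH_SIGN_BIT_POS)
#define SKETCH_POSITIVE_INTEGER_TAG 0x8u /* b1000: s = 0 */
#define SKETCH_NEGATIVE_INTEGER_TAG (SKETCH_POSITIVE_INTEGER_TAG | SKETCH_SIGN_BIT) /* b1100: s = 1 */

/* The sign is read from the header tag alone, without touching the payload. */
static inline bool sketch_boxed_integer_is_negative(uintptr_t header_tag)
{
    return (header_tag & SKETCH_SIGN_BIT) != 0;
}
```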
Add functions for checking whether a term is a positive integer, a
non-negative integer, and so on. Function names are inspired by Erlang
typespecs (such as non_neg_integer).

Signed-off-by: Davide Bettio <davide@uninstall.it>
Start moving existing code to predicates such as
`term_is_any_non_neg_integer(t)`.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Some operations in two's complement turn out to be quite complex: they
require more code, and also more stack space for storing the absolute value.
Hence using a dedicated sign bit (as Erlang does) turns out to be an easier and
more pragmatic approach.

This approach makes it possible to keep the sign bit outside the numeric
payload, so the supported range is -(2^256 - 1)..+(2^256 - 1).
They might be called int257, but it would be quite confusing.

Sign bit is stored in boxed header, outside of the numeric payload.

Also add a valgrind suppression file, in order to ignore a bogus warning
about overlapping memory in memcpy when executing memmove (which allows
overlapping memory).

Signed-off-by: Davide Bettio <davide@uninstall.it>
On 32-bit systems, use `make_maybe_boxed_int64` in `neg_boxed_helper`
since `-(INT32_MAX + 1)` is `INT32_MIN` that fits into a 32-bit boxed
integer.
Before this change `make_boxed_int64` was used, making a 64-bit boxed
integer for an int32 value.

The new `term_compare` implementation will check size and sign metadata before
performing any actual comparison, so all values must be in their "minimal
canonical form".

Signed-off-by: Davide Bettio <davide@uninstall.it>
Refactor term_compare to use metadata such as size and sign before
performing any integer comparison (that might be expensive for big
integers).

Perform digit by digit comparison for big integers only when size and
sign are equal.

Signed-off-by: Davide Bettio <davide@uninstall.it>
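
The comparison order described above (sign first, then size, then digits) can be sketched on a simplified sign-magnitude structure. `sketch_bigint` and `sketch_bigint_compare` are hypothetical and do not mirror the real term layout; the sketch assumes canonical values with no leading zero digits, which is what makes the size shortcut valid.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sign-magnitude integer, least significant digit first. */
typedef struct
{
    bool negative;
    size_t len; /* digits in use, with no leading zero digits */
    const uint32_t *digits;
} sketch_bigint;

/* Returns a negative, zero, or positive value, like memcmp. */
static int sketch_bigint_compare(const sketch_bigint *a, const sketch_bigint *b)
{
    /* 1. Different signs decide immediately. */
    if (a->negative != b->negative) {
        return a->negative ? -1 : 1;
    }
    int magnitude_order = 0;
    if (a->len != b->len) {
        /* 2. With canonical values, more digits means a larger magnitude. */
        magnitude_order = (a->len < b->len) ? -1 : 1;
    } else {
        /* 3. Only now compare digit by digit, most significant first. */
        for (size_t i = a->len; i-- > 0;) {
            if (a->digits[i] != b->digits[i]) {
                magnitude_order = (a->digits[i] < b->digits[i]) ? -1 : 1;
                break;
            }
        }
    }
    /* For negative numbers the larger magnitude is the smaller value. */
    return a->negative ? -magnitude_order : magnitude_order;
}
```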
Add functions that do not rely on undefined behavior for converting
unsigned to signed negative integers (and vice versa), for checking whether a
conversion overflows, and for conditional negation.

Start using newly introduced utilities in both intn and externalterm
(an old macro is removed).

Signed-off-by: Davide Bettio <davide@uninstall.it>
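
The kind of utility described above can be sketched as follows. The `sketch_` functions are hypothetical and are not the ones added by the commit; they only show how to avoid evaluating `-INT64_MIN`, which is undefined behavior in C.

```c
#include <stdbool.h>
#include <stdint.h>

/* Absolute value as unsigned: convert first, then negate in unsigned
   arithmetic, which wraps instead of overflowing. */
static inline uint64_t sketch_int64_unsigned_abs(int64_t value)
{
    return (value < 0) ? (uint64_t) 0 - (uint64_t) value : (uint64_t) value;
}

/* A magnitude can become a negative int64 only if it is at most 2^63. */
static inline bool sketch_magnitude_fits_negative_int64(uint64_t magnitude)
{
    return magnitude <= (uint64_t) INT64_MAX + 1;
}

/* Caller must have checked the magnitude with the function above. */
static inline int64_t sketch_magnitude_to_negative_int64(uint64_t magnitude)
{
    /* (magnitude - 1) always fits in int64, so no step overflows. */
    return (magnitude == 0) ? 0 : -(int64_t) (magnitude - 1) - 1;
}
```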
Refactor term_conv_to_float in order to use intn_to_double function.
Also make sure in opcodesswitch that term_conv_to_float() returns a
finite value.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Add helper function float_to_integer_helper, that checks the float result
of functions such as floor, round, trunc, etc... instead of the arguments in
advance.

Furthermore, better upper and lower limits for safe double to int64 conversion
have been found.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Allow functions such as trunc, round, etc... to return a big integer,
when a number above 2^63 or below -2^63 is given.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Test for overflows and for not-yet-overflowed values, when converting
from float to big int.
Also test comparison between floats and big ints.

Signed-off-by: Davide Bettio <davide@uninstall.it>
This function takes n bytes, in either big- or little-endian format,
signed or unsigned, and converts them into an intn integer.

Signed-off-by: Davide Bettio <davide@uninstall.it>
This function is required in any place a new bigint term needs to be
created.
Also rename it to `term_intn_to_term_size`.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Add big integer support to SMALL_BIG_EXT parsing (that is, integers that
are >= 8 bytes and <= 32 bytes).

Signed-off-by: Davide Bettio <davide@uninstall.it>
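
For reference, a SMALL_BIG_EXT term in the external term format is the tag byte 110, a one-byte length n, a one-byte sign (0 for positive, 1 for negative), and n magnitude bytes in little-endian order. The decoder below is a minimal sketch of reading that layout into eight 32-bit digits; `sketch_decode_small_big` is a hypothetical function, not the parser in `externalterm.c`.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_SMALL_BIG_EXT 110
#define SKETCH_MAX_BYTES 32 /* 256 bits */

/* Decodes buf into 8 little-endian 32-bit digits plus a sign flag. */
static bool sketch_decode_small_big(
    const uint8_t *buf, size_t len, uint32_t digits[8], bool *negative)
{
    if (len < 3 || buf[0] != SKETCH_SMALL_BIG_EXT) {
        return false;
    }
    size_t n = buf[1];
    if (n > SKETCH_MAX_BYTES || len - 3 < n) {
        return false; /* too large for 256 bits, or truncated input */
    }
    *negative = (buf[2] != 0);
    for (size_t i = 0; i < 8; i++) {
        digits[i] = 0;
    }
    /* Magnitude bytes are little-endian: byte i carries bits 8*i .. 8*i+7. */
    for (size_t i = 0; i < n; i++) {
        digits[i / 4] |= (uint32_t) buf[3 + i] << (8 * (i % 4));
    }
    return true;
}
```

For example, the nine magnitude bytes `0,0,0,0,0,0,0,0,1` with sign byte 1 decode to -(2^64): only the third 32-bit digit is set.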
Allow literals bigger than 64 bit, such as:
16#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF

As a side note, literals bigger than (2^256 - 1) are encoded as external
terms.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Add functions useful for writing a big integer back to a buffer, as a
little/big-endian integer.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Allow calling term_to_binary for serializing big integers.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Add bigint (256 bit + sign) initial scaffolding

This PR introduces support for signed 256 bit integers.

Briefly:
- Add code for generic big integer handling (`intn.c`), limited to 256 bit for
the sake of simplicity, but it might be expanded to generic big integers with
some additional work
- Refactored arithmetic BIF helpers
- Refactored `integer_to_binary`/`_to_list` in order to both reduce code
duplication and make it simpler to add big integer support
- Reimplemented `binary_to_integer` to make it compliant with OTP (binaries
such as `<<"0xCAFE">>` or `<<"  42">>` must be rejected).
- Replaced `lltoa` with more performant functions that do not rely on slow
helpers, especially for base 10 and 16
- Added bigint support to '*' operator
- Boxed integers can be either positive or negative; also added predicates for
checking signed type
- Added support for big literals
- Added support for big integers to `binary_to_term` and `term_to_binary`

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
bettio added 19 commits October 26, 2025 17:15
Remove valgrind-suppressions.sup.license.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Support unnormalized intn

Fix documentation about functions accepting big integers in not-normalized form,
and extend intn_to_double to accept them.

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Remove unused file

Remove valgrind-suppressions.sup.license.

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
This function name is going to be used from term.h. Also since it is a
static helper, put verb first.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Replace duplicated code with new functions in term.h.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Use `term_initialize_bigint` instead of `term_intn_data` + `intn_copy`

Signed-off-by: Davide Bettio <davide@uninstall.it>
`term_create_uninitialized_intn` -> `term_create_uninitialized_bigint`
`term_intn_to_term_size` -> `term_bigint_size_requirements`

Also add doxygen documentation.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Rename it to BOXED_BIGINT_HEAP_SIZE, and clarify that it must be always
used, in order to have the suitable size for allocating space for the bigint
term with its boxed header.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Use better names than bn1, bn2 (such as big1, big2), etc...

Signed-off-by: Davide Bettio <davide@uninstall.it>
Move bitwise helpers before all arithmetic and bitwise functions.

Signed-off-by: Davide Bettio <davide@uninstall.it>
When sign parameter is non-null, always set it to a known value.
This avoids annoying bugs.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Implement everything needed to run JIT compiler (that uses bigint
pattern matching) on AtomVM.

Some bigint handling parts are not yet implemented, such as =:= binary
pattern matching operation.

Also unsigned 64-bits pattern matching is not yet fixed.

Signed-off-by: Davide Bettio <davide@uninstall.it>
_Static_assert is not compatible with C++, that uses static_assert
instead.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Handle big integers also in `skip_compact_term`.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Doc and improve bigint term funcs

Add more bigint-related functions to `term.h`, in order to avoid code duplication.
Also document existing ones, and use the term `bigint` consistently.

The codebase uses two related but distinct terms:

- _intn_ refers to the multi-precision integer implementation (the low-level
arithmetic library that operates on arrays of digits)
- _bigint_ refers to the term type in AtomVM's type system (boxed integers
larger than int64)

This separation allows the bigint term interface to remain stable even if the
underlying multi-precision implementation changes. Functions in term.h use
"bigint" because they work with terms, while intn.h contains the actual
arithmetic implementation.

Continuation of #1930

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
bif.c bigint cleanup

Move functions & rename variables (to understandable names).

Continuation of #1933

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
jit.erl: add missing skip_compact_term for big integers

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Fix static_assert in header

Continuation of #1933

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Minimal bigint pattern matching

Add everything needed to allow JIT compiler (with big integers support) to run on AtomVM.

Continuation of #1933

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
@bettio bettio marked this pull request as ready for review November 1, 2025 09:27
*/
static inline bool int32_is_negative(int32_t i32)
{
    return ((uint32_t) i32) >> 31;
Collaborator

I'm not sure of the usefulness of this, and this is equivalent to:

return i32 < 0;

which arguably is more readable.

Collaborator Author

I found that this produced better assembly while testing it with Compiler Explorer; I cannot remember on which target it worked better.

Collaborator

And without optimization options, the plain comparison with 0 is better because the compilers don't inline.

Collaborator Author

Fixed.

*/
static inline bool int64_is_negative(int64_t i64)
{
    return ((uint64_t) i64) >> 63;
Collaborator

return i64 < 0;

and like int32_is_negative I fail to see why we would want this function.

Collaborator Author

Fixed.

It was an optimization that I cannot reproduce anymore; discard it.

Signed-off-by: Davide Bettio <davide@uninstall.it>
int64_safe_unsigned_abs_set_flag can be replaced with
`int64_safe_unsigned_abs` and `(n < 0)`.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Remove useless optimization

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
@bettio bettio requested a review from pguyot November 2, 2025 09:02
Collaborator

@pguyot pguyot left a comment

All unused functions should be used or removed in a continuation PR.

@bettio bettio merged commit 58768ab into main Nov 2, 2025
209 of 211 checks passed
@bettio bettio deleted the feature/bigint branch November 2, 2025 12:27