256-bit big integer support #1906
Clang and GCC allow providing hints to the compiler; they have similar constructs but different syntax (builtin vs. attribute). Add a portable macro for it. Signed-off-by: Davide Bettio <davide@uninstall.it>
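As an illustration of the builtin vs. attribute difference, here is a minimal sketch of a portable "assume" hint. It assumes the hint in question is an assumption hint (`__builtin_assume` in Clang, the `assume` statement attribute in GCC 13 and later); the macro name `HINT_ASSUME` and the function `low_digit` are hypothetical and are not taken from this PR.

```c
#include <assert.h>
#include <stdint.h>

/* HINT_ASSUME is a hypothetical name used only for this sketch. */
#if defined(__has_builtin)
#if __has_builtin(__builtin_assume)
#define HINT_ASSUME(x) __builtin_assume(x) /* Clang: builtin syntax */
#endif
#endif

#if !defined(HINT_ASSUME) && defined(__has_attribute)
#if __has_attribute(assume)
#define HINT_ASSUME(x) __attribute__((assume(x))) /* GCC 13+: attribute syntax */
#endif
#endif

#ifndef HINT_ASSUME
#define HINT_ASSUME(x) ((void) 0) /* other compilers: no hint at all */
#endif

/* Hypothetical caller: the hint tells the optimizer that base is never 0 or 1. */
static uint32_t low_digit(uint32_t value, uint32_t base)
{
    assert(base >= 2);
    HINT_ASSUME(base >= 2);
    return value % base;
}
```

The hint never changes the result for valid inputs; it only gives the optimizer extra information, so the fallback to a no-op is safe.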
Add functions for converting integers to a string that are better suited to our usage. These new functions likely perform better than lltoa, since they don't rely on helpers for 64-bit division. Compiler-optimization-friendly functions for base 10 and 16 are also provided: the compiler is able to optimize n / k, when k is a known constant, by replacing it with a multiplication. Note that these new functions write characters without a C string terminator. Signed-off-by: Davide Bettio <davide@uninstall.it>
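The idea can be sketched as follows. This is not the AtomVM implementation: `u32_write_base10` is a hypothetical helper that writes right-aligned digits into a caller-provided buffer, divides only by the constant 10, and writes no terminator.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: writes the base-10 digits of value right-aligned at the
 * end of buf (no '\0' is written) and returns how many digits were written. */
static size_t u32_write_base10(uint32_t value, char *buf, size_t buf_len)
{
    assert(buf_len >= 10); /* UINT32_MAX has 10 decimal digits */
    char *end = buf + buf_len;
    char *p = end;
    do {
        /* Division by a known constant: the compiler can emit a multiplication. */
        uint32_t q = value / 10;
        *--p = (char) ('0' + (value - q * 10));
        value = q;
    } while (value != 0);
    return (size_t) (end - p);
}
```

The digits start at `buf + buf_len - n`, where `n` is the returned count, so a caller can copy exactly `n` bytes into a binary or a list without scanning for a terminator.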
Refactor it in order to use the new `int*_write_to_ascii_buf` functions, to make it easier to support big integers and to share code across the to_binary and to_list functions. Also remove the `lltoa` function, which is very slow: it relies on 64-bit division, which on most embedded architectures requires a helper function. Signed-off-by: Davide Bettio <davide@uninstall.it>
Refactor it in order to use new `int64_parse_ascii_buf` function. Unlike strtoll the newly introduced function rejects binaries such as "0xFF", so it behaves like OTP. Also it doesn't require copying binaries to \0 terminated bufs. Signed-off-by: Davide Bettio <davide@uninstall.it>
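A minimal sketch of such a strict parser, assuming base 10 only. `parse_int64_base10` is a hypothetical name and does not reproduce the signature of `int64_parse_ascii_buf`; it works on a length-delimited buffer, so no `\0`-terminated copy is needed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical strict parser: optional sign followed by decimal digits only. */
static bool parse_int64_base10(const char *buf, size_t len, int64_t *out)
{
    assert(buf != NULL && out != NULL);
    size_t i = 0;
    bool negative = false;
    if (len > 0 && (buf[0] == '-' || buf[0] == '+')) {
        negative = (buf[0] == '-');
        i = 1;
    }
    if (i == len) {
        return false; /* empty input or a lone sign */
    }
    /* Largest allowed magnitude: 2^63 when negative, 2^63 - 1 otherwise. */
    uint64_t limit = negative ? (uint64_t) INT64_MAX + 1 : (uint64_t) INT64_MAX;
    uint64_t acc = 0;
    for (; i < len; i++) {
        if (buf[i] < '0' || buf[i] > '9') {
            return false; /* rejects "0xFF", " 42" and "12 " */
        }
        uint64_t digit = (uint64_t) (buf[i] - '0');
        if (acc > (limit - digit) / 10) {
            return false; /* acc * 10 + digit would exceed the limit */
        }
        acc = acc * 10 + digit;
    }
    if (negative) {
        /* INT64_MIN is handled apart, so the negation never overflows. */
        *out = (acc == limit) ? INT64_MIN : -(int64_t) acc;
    } else {
        *out = (int64_t) acc;
    }
    return true;
}
```

Unlike `strtoll`, nothing is skipped or auto-detected: any byte that is not a sign in first position or a decimal digit makes the whole input invalid.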
Helpers (`mul/div/add/sub_boxed_helper`) have been refactored in order to prepare to bigint implementation. Signed-off-by: Davide Bettio <davide@uninstall.it>
`intn.c` contains functions for manipulating bigints (arrays of n digits):
- `intn_mulmns`, `intn_divmnu`, and `nlz` are from Hacker's Delight
- Other functions such as `intn_addmns` are original work

This version is an attempt with numbers in two's complement, so division requires a wrapper that calls `divmnu` on an absolute value (this specific function has not been broadly tested yet). The functions are limited to a maximum size for inputs and outputs, defined in `INTN_MAX_IN_LEN` and `INTN_MAX_RES_LEN`. That's the reason it is called intn and not bigint. Signed-off-by: Davide Bettio <davide@uninstall.it>
warning: suggest parentheses around ‘-’ inside ‘>>’ [-Wparentheses], e.g.:
```
vn[i] = (v[i] << s) | (v[i - 1] >> 16 - s);
```
Signed-off-by: Davide Bettio <davide@uninstall.it>
The nlz function is used by the `divmnu` function. Use the compiler builtin when available instead of the C implementation. Signed-off-by: Davide Bettio <davide@uninstall.it>
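A sketch of the builtin-with-fallback pattern, assuming a 32-bit `unsigned int` and 32-bit digits (the division code quoted above appears to use 16-bit halves, so this is illustrative only). `nlz32` and `normalization_shift` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: number of leading zero bits in a 32-bit value. */
static int nlz32(uint32_t x)
{
    if (x == 0) {
        return 32; /* __builtin_clz(0) is undefined, so handle it explicitly */
    }
#if defined(__GNUC__) || defined(__clang__)
    /* Assumes unsigned int is 32 bits wide on the target. */
    return __builtin_clz(x);
#else
    int n = 0;
    while ((x & 0x80000000u) == 0) {
        x <<= 1;
        n++;
    }
    return n;
#endif
}

/* Long division normalizes the divisor so that its top digit has the
 * most significant bit set; the top digit must not be zero. */
static int normalization_shift(uint32_t top_divisor_digit)
{
    assert(top_divisor_digit != 0);
    return nlz32(top_divisor_digit);
}
```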
This function is a first round of integration with the intn bigint implementation.
- Allow printing big integers with `erlang:display/1` and in general with `term_display` functions.
- Allow converting big integers to binaries and lists using `erlang:integer_to_binary/1` and `erlang:integer_to_list/1`.

Signed-off-by: Davide Bettio <davide@uninstall.it>
Implement a first arithmetic operation that uses `intn_mulmns` in order to validate the whole approach. Signed-off-by: Davide Bettio <davide@uninstall.it>
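For readers unfamiliar with multi-digit multiplication, the following is a minimal schoolbook sketch on unsigned magnitudes with 32-bit digits. It is not `intn_mulmns` (the Hacker's Delight routine named above is a signed variant); `mag_mul` is a hypothetical name, and with a separate sign bit the sign of the product would simply be the XOR of the operand signs.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: w = u * v, digits stored least significant first.
 * u has m digits, v has n digits, w must have room for m + n digits. */
static void mag_mul(const uint32_t *u, size_t m, const uint32_t *v, size_t n, uint32_t *w)
{
    assert(u != NULL && v != NULL && w != NULL);
    for (size_t i = 0; i < m + n; i++) {
        w[i] = 0;
    }
    for (size_t j = 0; j < n; j++) {
        uint64_t carry = 0;
        for (size_t i = 0; i < m; i++) {
            /* Cannot overflow: (2^32 - 1)^2 + 2 * (2^32 - 1) = 2^64 - 1 */
            uint64_t t = (uint64_t) u[i] * v[j] + w[i + j] + carry;
            w[i + j] = (uint32_t) t;
            carry = t >> 32;
        }
        w[j + m] = (uint32_t) carry;
    }
}
```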
Add first iteration on bigint tests, starting with tests for `erlang:*/2` and `integer_to_binary/2`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Just use `intn_parse` function. Signed-off-by: Davide Bettio <davide@uninstall.it>
Negative boxed integers have the 3rd bit set (b1s00). Also introduce new defines:
- TERM_BOXED_NEGATIVE_INTEGER
- TERM_BOXED_INTEGER_SIGN_BIT
- TERM_BOXED_INTEGER_SIGN_BIT_POS

Signed-off-by: Davide Bettio <davide@uninstall.it>
Add functions for checking if a term is a positive integer, etc. Function names are inspired by Erlang typespecs (such as non_neg_integer). Signed-off-by: Davide Bettio <davide@uninstall.it>
Start moving existing code to predicates such as `term_is_any_non_neg_integer(t)`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Some operations in two's complement turn out to be quite complex: they require more code and more stack space for storing the absolute value. Hence using a dedicated sign bit (as Erlang does) turns out to be an easier and more pragmatic approach. The sign bit is stored in the boxed header, outside the numeric payload, so the supported range is -(2^256 - 1)..+(2^256 - 1). They might be called int257, but it would be quite confusing. Also add a valgrind suppression file, in order to ignore a bogus warning about overlapping memory in memcpy when executing memmove (which allows overlapping memory). Signed-off-by: Davide Bettio <davide@uninstall.it>
On 32-bit systems, use `make_maybe_boxed_int64` in `neg_boxed_helper`, since `-(INT32_MAX + 1)` is `INT32_MIN`, which fits into a 32-bit boxed integer. Before this change `make_boxed_int64` was used, making a 64-bit boxed integer for an int32 value. The new `term_compare` implementation will check size and sign metadata before performing any actual comparison, so all values must be in their "minimal canonical form". Signed-off-by: Davide Bettio <davide@uninstall.it>
Refactor term_compare to use metadata such as size and sign before performing any integer comparison (that might be expensive for big integers). Perform digit by digit comparison for big integers only when size and sign are equal. Signed-off-by: Davide Bettio <davide@uninstall.it>
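The comparison order described here (sign first, then size, then digits) can be sketched as below. The `SignedMag` struct and `signed_mag_compare` are hypothetical and do not mirror AtomVM's term layout; the sketch assumes normalized operands (no leading zero digits), which is why the canonical-form requirement matters.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sign-magnitude integer: digits are stored least significant
 * first and normalized; zero is len == 0 with negative == false. */
struct SignedMag
{
    bool negative;
    size_t len;
    const uint32_t *digits;
};

/* Returns a negative value, 0, or a positive value, like memcmp. */
static int signed_mag_compare(const struct SignedMag *a, const struct SignedMag *b)
{
    assert(a != NULL && b != NULL);
    if (a->negative != b->negative) {
        return a->negative ? -1 : 1; /* cheapest check: signs differ */
    }
    int sign = a->negative ? -1 : 1;
    if (a->len != b->len) {
        /* More digits means a larger magnitude when normalized. */
        return (a->len > b->len) ? sign : -sign;
    }
    /* Same sign and size: compare digits, most significant first. */
    for (size_t i = a->len; i > 0; i--) {
        if (a->digits[i - 1] != b->digits[i - 1]) {
            return (a->digits[i - 1] > b->digits[i - 1]) ? sign : -sign;
        }
    }
    return 0;
}
```

If one operand carried a redundant leading zero digit, the size check would give a wrong answer, so values have to be reduced to their smallest representation before they are compared.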
Add functions that do not rely on undefined behavior for converting unsigned to signed negative integers (and vice versa), for checking whether a conversion overflows, and for conditional negation. Start using the newly introduced utilities in both intn and externalterm (an old macro is removed). Signed-off-by: Davide Bettio <davide@uninstall.it>
Refactor term_conv_to_float in order to use intn_to_double function. Also make sure in opcodesswitch that term_conv_to_float() returns a finite value. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add the helper function float_to_integer_helper, which checks the float result of functions such as floor, round, trunc, etc. instead of checking the arguments in advance. Furthermore, better upper and lower limits for safe double to int64 conversion have been found. Signed-off-by: Davide Bettio <davide@uninstall.it>
Allow functions such as trunc, round, etc... to return a big integer, when a number above 2^63 or below -2^63 is given. Signed-off-by: Davide Bettio <davide@uninstall.it>
Test for overflows and for not-yet-overflowed values, when converting from float to big int. Also test comparison between floats and big ints. Signed-off-by: Davide Bettio <davide@uninstall.it>
This function takes n bytes in either big- or little-endian format, signed or unsigned, and converts them into an intn integer. Signed-off-by: Davide Bettio <davide@uninstall.it>
This function is required in any place a new bigint term needs to be created. Also rename it to `term_intn_to_term_size`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add big integer support to SMALL_BIG_EXT parsing (that is, integers that are >= 8 bytes and <= 32 bytes). Signed-off-by: Davide Bettio <davide@uninstall.it>
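For reference, in the Erlang external term format a SMALL_BIG_EXT is the tag byte 110, a length byte n, a sign byte (0 for positive, 1 for negative), and n magnitude bytes stored least significant first. The decoder below is a hypothetical sketch that unpacks such a payload into 32-bit digits; `decode_small_big` and `MAX_DIGITS` are illustrative names, and the 8-digit limit simply mirrors the 256-bit limit of this PR.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SMALL_BIG_EXT_TAG 110
#define MAX_DIGITS 8 /* 8 x 32 bits = 256 bits */

/* Hypothetical decoder: fills digits (least significant first) and the sign. */
static bool decode_small_big(const uint8_t *buf, size_t buf_len,
    uint32_t digits[MAX_DIGITS], bool *negative)
{
    assert(buf != NULL && digits != NULL && negative != NULL);
    if (buf_len < 3 || buf[0] != SMALL_BIG_EXT_TAG) {
        return false;
    }
    size_t n = buf[1];
    if (n > MAX_DIGITS * 4 || buf_len < 3 + n) {
        return false; /* too large for 256 bits, or truncated input */
    }
    *negative = (buf[2] != 0);
    for (size_t i = 0; i < MAX_DIGITS; i++) {
        digits[i] = 0;
    }
    for (size_t i = 0; i < n; i++) {
        /* Byte i contributes bits 8*i .. 8*i+7 of the magnitude. */
        digits[i / 4] |= (uint32_t) buf[3 + i] << (8 * (i % 4));
    }
    return true;
}
```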
Allow literals bigger than 64 bit, such as: 16#FFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFFF As a side note, bigger literals than (2^256 - 1) are encoded as external terms. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add functions useful for writing a big integer back to a buffer, as a little/big-endian integer. Signed-off-by: Davide Bettio <davide@uninstall.it>
Allow calling term_to_binary for serializing big integers. Signed-off-by: Davide Bettio <davide@uninstall.it>
Add bigint (256 bit + sign) initial scaffolding

This PR introduces support for signed 256-bit integers. Briefly:
- Add code for generic big integer handling (`intn.c`), limited to 256 bits for the sake of simplicity, but it might be expanded to generic big integers with some additional work
- Refactored arithmetic BIF helpers
- Refactored `integer_to_binary`/`_to_list` in order to reduce code duplication and make it simpler to add big integer support
- Reimplemented `binary_to_integer` to make it compliant with OTP (binaries such as `<<"0xCAFE">>` or `<<" 42">>` must be rejected)
- Replaced `lltoa` with more performant functions that do not rely on slow helpers, especially for base 10 and 16
- Added bigint support to the `*` operator
- Boxed integers can be either positive or negative; also added predicates for checking signed type
- Added support for big literals
- Added support for big integers to `binary_to_term` and `term_to_binary`

These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Remove valgrind-suppressions.sup.license. Signed-off-by: Davide Bettio <davide@uninstall.it>
Support unnormalized intn Fix documentation about functions accepting big integers in not-normalized form, and extend intn_to_double to accept them. These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Remove unused file Remove valgrind-suppressions.sup.license. These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
This function name is going to be used from term.h. Also since it is a static helper, put verb first. Signed-off-by: Davide Bettio <davide@uninstall.it>
Replace duplicated code with new functions in term.h. Signed-off-by: Davide Bettio <davide@uninstall.it>
Use `term_initialize_bigint` instead of `term_intn_data` + `intn_copy` Signed-off-by: Davide Bettio <davide@uninstall.it>
`term_create_uninitialized_intn` -> `term_create_uninitialized_bigint` `term_intn_to_term_size` -> `term_bigint_size_requirements` Also add doxygen documentation. Signed-off-by: Davide Bettio <davide@uninstall.it>
Rename it to BOXED_BIGINT_HEAP_SIZE, and clarify that it must be always used, in order to have the suitable size for allocating space for the bigint term with its boxed header. Signed-off-by: Davide Bettio <davide@uninstall.it>
Use better names than bn1, bn2 (such as big1, big2), etc... Signed-off-by: Davide Bettio <davide@uninstall.it>
Move before all arithmetic and bitwise functions bitwise helpers. Signed-off-by: Davide Bettio <davide@uninstall.it>
When sign parameter is non-null, always set it to a known value. This avoids annoying bugs. Signed-off-by: Davide Bettio <davide@uninstall.it>
Implement everything needed to run JIT compiler (that uses bigint pattern matching) on AtomVM. Some bigint handling parts are not yet implemented, such as =:= binary pattern matching operation. Also unsigned 64-bits pattern matching is not yet fixed. Signed-off-by: Davide Bettio <davide@uninstall.it>
_Static_assert is not compatible with C++, which uses static_assert instead. Signed-off-by: Davide Bettio <davide@uninstall.it>
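One common way to make a header usable from both C and C++ is to select the keyword by language. The sketch below is an assumption about how this can be done, not necessarily the fix applied here; `PORTABLE_STATIC_ASSERT` and `digits_to_bytes` are hypothetical names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical macro: picks the spelling the current language understands. */
#ifdef __cplusplus
#define PORTABLE_STATIC_ASSERT(cond, msg) static_assert(cond, msg)
#else
#define PORTABLE_STATIC_ASSERT(cond, msg) _Static_assert(cond, msg)
#endif

/* Compile-time check that the digit type has the expected width. */
PORTABLE_STATIC_ASSERT(sizeof(uint32_t) == 4, "digits must be 32 bits wide");

/* Hypothetical helper relying on the assertion above. */
static size_t digits_to_bytes(size_t digit_count)
{
    return digit_count * sizeof(uint32_t);
}
```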
Handle big integers also in `skip_compact_term`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Doc and improve bigint term funcs

Add more bigint related functions to `term.h`, in order to avoid code duplication. Also document existing ones, and use the word `bigint` consistently.

The codebase uses two related but distinct terms:
- _intn_ refers to the multi-precision integer implementation (the low-level arithmetic library that operates on arrays of digits)
- _bigint_ refers to the term type in AtomVM's type system (boxed integers larger than int64)

This separation allows the bigint term interface to remain stable even if the underlying multi-precision implementation changes. Functions in term.h use "bigint" because they work with terms, while intn.h contains the actual arithmetic implementation.

Continuation of #1930

These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
bif.c bigint cleanup Move functions & rename variables (to understandable names). Continuation of #1933 These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
jit.erl: add missing skip_compact_term for big integers These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Fix static_assert in header Continuation of #1933 These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
Minimal bigint pattern matching Add everything needed to allow JIT compiler (with big integers support) to run on AtomVM. Continuation of #1933 These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
src/libAtomVM/utils.h
 */
static inline bool int32_is_negative(int32_t i32)
{
    return ((uint32_t) i32) >> 31;
I'm not sure of the usefulness of this, and this is equivalent to:
return i32 < 0;
which arguably is more readable.
I found this producing better assembly while testing this with Compiler Explorer; I cannot remember which target it worked better on.
Fixed.
src/libAtomVM/utils.h
 */
static inline bool int64_is_negative(int64_t i64)
{
    return ((uint64_t) i64) >> 63;
return i64 < 0;
and like int32_is_negative I fail to see why we would want this function.
Fixed.
It was an optimization that I cannot reproduce anymore, discard it. Signed-off-by: Davide Bettio <davide@uninstall.it>
int64_safe_unsigned_abs_set_flag can be replaced with `int64_safe_unsigned_abs` and `(n < 0)`. Signed-off-by: Davide Bettio <davide@uninstall.it>
Remove useless optimization These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license). SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later
pguyot
left a comment
All unused functions should be used or removed in a continuation PR.
This PR introduces support for big integers in AtomVM, allowing arithmetic and bitwise operations on integers up to 256-bit (sign + 256-bit magnitude). This significantly extends AtomVM's numeric capabilities beyond the previous 64-bit limitation.
Key Changes
Core Big Integer Support
- `intn` module
- `term_is_any_integer()` now returns true for big integers
Arithmetic Operations
- Arithmetic operations (`+`, `-`, `*`, `div`, `rem`, `abs`, `neg`) now support integers up to 256-bit
- Bitwise operations (`band`, `bor`, `bxor`, `bnot`, `bsl`, `bsr`) now support integers up to 256-bit
Serialization Support
- `binary_to_term/1` and `term_to_binary/1,2`
- `SMALL_BIG_EXT`
JIT Enhancements
Breaking Changes
Overflow Checking
- `bsl` (bitshift left) now properly checks for overflow. While this shouldn't affect existing code (integers were previously limited to 64 bits), ensure values are masked before left bitshifts: e.g., `(16#FFFF band 16#F) bsl 252`
Error Handling
- `binary_to_integer/1` no longer accepts binaries with whitespace or prefixes like `<<"0xFF">>` or `<<" 123">>`
- `binary_to_integer` and `list_to_integer` now raise `badarg` instead of `overflow` when parsing integers exceeding 256 bits. Update error handling code accordingly
Bug Fixes
- Fixed `list_to_integer` bug with integers close to `INT64_MAX`

These changes are made under both the "Apache 2.0" and the "GNU Lesser General Public License 2.1 or later" license terms (dual license).
SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later