Create Lookup #18

@seanlaw

Description

Continuing this conversation, it would be good to come up with a rough strategy for which algorithm to use depending on len(T).

To get started, I explored the amount of storage that we might need to keep track of everything in memory:

#!/usr/bin/env python

import sys

import numpy as np
from scipy.fft import next_fast_len

# Collect every distinct fast length that next_fast_len can produce for
# inputs up to 2**28 (start at 1 since the target must be a positive integer)
lengths = set()
for i in range(1, 2**28):
    lengths.add(next_fast_len(i, real=True))

# One entry per fast length, plus the index of the algorithm to use for
# that length (all set to 1 as a placeholder for now)
fast_lengths = np.empty(len(lengths), dtype=np.uint32)
fast_methods = np.empty(len(lengths), dtype=np.uint8)
for i, l in enumerate(sorted(lengths)):
    fast_lengths[i] = l
    fast_methods[i] = 1

# Memory footprint of the numpy arrays vs their plain Python list equivalents
print(f"{fast_lengths.nbytes / 1024} kb, {sys.getsizeof(fast_lengths.tolist()) / 1024} kb")
print(f"{fast_methods.nbytes / 1024} kb, {sys.getsizeof(fast_methods.tolist()) / 1024} kb")

This returns:

4.9921875 kb, 10.0390625 kb
1.248046875 kb, 10.0390625 kb

So, fast_lengths stores all of the possible values of len(T) that next_fast_len can generate for inputs from 1 to 2**28, and fast_methods stores the corresponding algorithm index number (for now, I've set all of the algorithm indices to 1 as a placeholder). There are 1,278 such lengths, which at 4 bytes and 1 byte per entry accounts for the ~5 kb and ~1.25 kb figures. The two numbers printed on each line correspond to the amount of space needed for storing things in a numpy array versus a python list (both in-memory).
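As a rough sketch of how this table might be consumed at runtime (the lookup_method helper and the tiny stand-in tables below are hypothetical, not part of the script above): np.searchsorted can find the smallest precomputed fast length >= len(T), and the entry at the same index in fast_methods names the algorithm to use.

import numpy as np

# Hypothetical helper: given n = len(T), find the smallest precomputed
# fast length >= n and return it along with its algorithm index
def lookup_method(n, fast_lengths, fast_methods):
    idx = np.searchsorted(fast_lengths, n, side="left")
    return int(fast_lengths[idx]), int(fast_methods[idx])

# Tiny stand-in tables; the real ones would come from the script above
fast_lengths = np.array([8, 9, 10, 12, 16], dtype=np.uint32)
fast_methods = np.array([1, 1, 2, 2, 1], dtype=np.uint8)

print(lookup_method(11, fast_lengths, fast_methods))  # (12, 2)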

In conclusion, storing all of this information is very cheap (as long as it stays 1D rather than 2D). I should probably simplify, store everything in a plain dict, and see how much space that takes up.
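For reference, a minimal sketch of that dict measurement (a tiny stand-in for the lengths set is used here so the snippet runs on its own). One caveat worth noting: sys.getsizeof on a dict only counts the container itself, so the keys and values have to be added separately for a fairer comparison.

import sys

# `lengths` would be the set built by the script above; a tiny stand-in
# is used here so that the snippet runs on its own
lengths = {8, 9, 10, 12, 16}

# Hypothetical dict variant: map each fast length to its algorithm index
method_lookup = {int(l): 1 for l in sorted(lengths)}

# sys.getsizeof only counts the dict container, not its contents; add
# the keys and values for a rough total (slightly over-counted, since
# CPython caches and reuses small int objects)
shallow = sys.getsizeof(method_lookup)
deep = shallow + sum(sys.getsizeof(k) + sys.getsizeof(v) for k, v in method_lookup.items())
print(f"{shallow / 1024} kb (container only), {deep / 1024} kb (incl. keys and values)")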
