Open
Description
Related
(TODO - move write-up here)
- fix(DONT MERGE): `duckdb>=1.4.1` typing & warnings #3189 (comment)
- fix(DONT MERGE): `duckdb>=1.4.1` typing & warnings #3189 (comment)
Description
We document that Expr.is_in supports Iterable, and check it at runtime.
Lines 971 to 975 in ebb2a40

    def is_in(self, other: Any) -> Self:
        """Check if elements of this expression are present in the other iterable.

        Arguments:
            other: iterable

But in #3189 I found that our tests only cover `list`.
Kinda surprised by which backends do/don't work.
Fixing it for all of them is an easy task - just something like this really:
narwhals/narwhals/_duckdb/expr.py
Lines 260 to 262 in 64f6b4f

    def is_in(self, other: Sequence[Any]) -> Self:
        other_ = tuple(other) if not isinstance(other, (tuple, list)) else other
        return self._with_elementwise(lambda expr: F("contains", lit(other_), expr))

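The normalization step in that snippet can be looked at in isolation. Below is a minimal sketch, not narwhals code: `materialize` is a hypothetical helper name, and rejecting `str`/`bytes` is an assumption of this sketch (they are iterable, but passing one is probably a mistake).

```python
from __future__ import annotations

from collections.abc import Iterable
from typing import Any


def materialize(other: Iterable[Any]) -> tuple[Any, ...] | list[Any]:
    """Hypothetical helper: turn any iterable into a concrete sequence."""
    # Assumption of this sketch: a bare string is iterable, but almost
    # certainly not what the caller meant, so reject it early.
    if isinstance(other, (str, bytes)):
        msg = f"expected a non-string iterable, got {type(other).__name__}"
        raise TypeError(msg)
    # Lists and tuples are already concrete, pass them through untouched.
    if isinstance(other, (tuple, list)):
        return other
    # Iterators, generators, sets, ranges, dict views, ... are consumed once
    # here, so the backend only ever sees a plain tuple.
    return tuple(other)


print(materialize(iter((4, 2))))  # (4, 2)
print(materialize([4, 2]))  # [4, 2]
```

Doing this once, when the expression is built, also means a one-shot iterator is not consumed again if the expression is evaluated more than once.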
Repro
import narwhals as nw
data = {"a": [1, 4, 2, 5]}
df = nw.from_dict(data, backend="polars")
sequence = 4, 2
other = sequence
>>> df.select(nw.col("a").is_in(other))
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| shape: (4, 1) |
| ┌───────┐ |
| │ a │ |
| │ --- │ |
| │ bool │ |
| ╞═══════╡ |
| │ false │ |
| │ true │ |
| │ true │ |
| │ false │ |
| └───────┘ |
└──────────────────┘

Eager
other = iter(sequence)
>>> df.select(nw.col("a").is_in(other))
TypeError: cannot create expression literal for value of type tuple_iterator.
Hint: Pass `allow_object=True` to accept any value and create a literal of type Object.

expr = nw.col("a").is_in(iter(sequence))
>>> nw.from_dict(data, backend="pandas").select(expr)
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| a |
| 0 False |
| 1 True |
| 2 True |
| 3 False |
└──────────────────┘

expr = nw.col("a").is_in(iter(sequence))
>>> nw.from_dict(data, backend="pyarrow").select(expr)
┌────────────────────────────┐
| Narwhals DataFrame |
|----------------------------|
|pyarrow.Table |
|a: bool |
|---- |
|a: [[false,true,true,false]]|
└────────────────────────────┘

Lazy
df = nw.from_dict(data, backend="polars")
expr = nw.col("a").is_in(iter(sequence))
>>> df.lazy("ibis").select(expr).collect("polars")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| shape: (4, 1) |
| ┌───────┐ |
| │ a │ |
| │ --- │ |
| │ bool │ |
| ╞═══════╡ |
| │ false │ |
| │ true │ |
| │ true │ |
| │ false │ |
| └───────┘ |
└──────────────────┘

df = nw.from_dict(data, backend="polars")
expr = nw.col("a").is_in(iter(sequence))
>>> df.lazy("duckdb").select(expr).collect("polars")
NotImplementedException: Not implemented Error: Unable to transform python value of type '<class 'tuple_iterator'>' to DuckDB LogicalType

from sqlframe.duckdb import DuckDBSession
df = nw.from_dict(data, backend="polars")
expr = nw.col("a").is_in(iter(sequence))
>>> df.lazy("sqlframe", session=DuckDBSession()).select(expr).collect("polars")
ValueError: Cannot convert <tuple_iterator object at 0x000001DEC8260970>

df = nw.from_dict(data, backend="polars")
expr = nw.col("a").is_in(iter(sequence))
>>> df.lazy("dask").select(expr).collect("polars")
┌──────────────────┐
|Narwhals DataFrame|
|------------------|
| shape: (4, 1) |
| ┌───────┐ |
| │ a │ |
| │ --- │ |
| │ bool │ |
| ╞═══════╡ |
| │ false │ |
| │ true │ |
| │ true │ |
| │ false │ |
| └───────┘ |
└──────────────────┘
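
Since the tests only cover `list`, one way to widen coverage is a small matrix of iterable kinds. Below is a rough sketch of that idea in plain Python. `ITERABLE_FACTORIES`, `is_in_reference` and `run_matrix` are hypothetical names, and `is_in_reference` is only a pure-Python stand-in for `nw.col("a").is_in(other)`, not how any backend implements it.

```python
from __future__ import annotations

from collections.abc import Callable, Iterable
from typing import Any

SEQUENCE = (4, 2)

# Each factory builds a fresh iterable on every call, because iterators and
# generators are single-use and cannot be shared between test cases.
ITERABLE_FACTORIES: dict[str, Callable[[], Iterable[Any]]] = {
    "list": lambda: list(SEQUENCE),
    "tuple": lambda: SEQUENCE,
    "set": lambda: set(SEQUENCE),
    "range": lambda: range(2, 5, 2),  # yields 2, 4
    "iterator": lambda: iter(SEQUENCE),
    "generator": lambda: (x for x in SEQUENCE),
    "dict_keys": lambda: dict.fromkeys(SEQUENCE).keys(),
}


def is_in_reference(column: list[int], other: Iterable[Any]) -> list[bool]:
    """Pure-Python stand-in for `is_in` (hypothetical, for illustration only)."""
    values = tuple(other)  # materialize once, before any membership test
    return [item in values for item in column]


def run_matrix() -> dict[str, list[bool]]:
    """Evaluate the stand-in against every iterable kind in the matrix."""
    column = [1, 4, 2, 5]  # same data as the repro above
    return {
        name: is_in_reference(column, make())
        for name, make in ITERABLE_FACTORIES.items()
    }


for name, result in run_matrix().items():
    print(f"{name:>10}: {result}")
```

In a real test suite each factory would presumably become a test parameter, with the stand-in replaced by the narwhals expression evaluated on each backend.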