-
Notifications
You must be signed in to change notification settings - Fork 373
feat: make LiteralPredicate serializable via internal IcebergBaseModel #2561
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
In the issue #2523 it is said to derive the class from |
I have now marked it as a Draft since I am not sure now that is the kind of implementation you want. Tests are still passing now using LiteralPredicate as subclass of IcebergBaseModel, but had to make |
Something fishy that I had to pull in order for tests to pass was putting the attribute The problem is that the earlier implementation was calling def _to_unbound_term(term: Union[str, UnboundTerm[Any]]) -> UnboundTerm[Any]:
return Reference(term) if isinstance(term, str) else term If term is not _______________________________________________ test_not_equal_to_invert _______________________________________________
def test_not_equal_to_invert() -> None:
> bound = NotEqualTo(
term=BoundReference( # type: ignore
field=NestedField(field_id=1, name="foo", field_type=StringType(), required=False),
accessor=Accessor(position=0, inner=None),
),
literal="hello",
) Should we address this in this PR or delegate it to another issue? |
That's a bit of an edge case, since you deliberately ignore the type annotation. We could add a check in the function itself: def _to_unbound_term(term: Union[str, UnboundTerm[Any]]) -> UnboundTerm[Any]:
if isinstance(term, str),
return Reference(term)
elif isinstance(term, UnboundTerm):
return term
else:
raise ValueError(f"Expected UnboundTerm or str, but got: {term}")
return Reference(term) if isinstance(term, str) else term |
I do not ignore the type annotation, it is the current implementation and a current test that is ignoring the annotation. I am trying to just implement the issue requirement and since now the types are checked in runtime by pydantic the (old) test is not passing. |
00bc5db
to
5118748
Compare
def __init__(self, *args: Any, **kwargs: Any) -> None: | ||
if args: | ||
if len(args) != 2: | ||
raise TypeError("Expected (term, literal)") | ||
kwargs = {"term": args[0], "literal": args[1], **kwargs} | ||
super().__init__(**kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After having many issues with an init such as:
def __init__(self, term: Union[str, UnboundTerm[Any]], literals: Union[Iterable[L], Iterable[Literal[L]]]):
super().__init__(term=_to_unbound_term(term), items=_to_literal_set(literals))
Because there are some typing errors with _transform_literal
in pyiceberg/transforms.py
for example:
pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[str | None], str | None]"; expected "Callable[[str], str]" [arg-type]
pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[bool | None], bool | None]"; expected "Callable[[str], str]" [arg-type]
pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[int | None], int | None]"; expected "Callable[[str], str]" [arg-type]
pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[float | None], float | None]"; expected "Callable[[str], str]" [arg-type]
pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[bytes | None], bytes | None]"; expected "Callable[[str], str]" [arg-type]
pyiceberg/transforms.py:1113: error: Argument 1 to "_transform_literal" has incompatible type "Callable[[UUID | None], UUID | None]"; expected "Callable[[str], str]" [arg-type]
I decided to just go for this implementation of init. The problem now is that:
assert_type(EqualTo("a", "b"), EqualTo[str]) # <-- Fails
------
tests/expressions/test_expressions.py:1238: error: Expression is of type "LiteralPredicate[L]", not "EqualTo[str]" [assert-type]
So I am really stuck, would you mind lending a hand here? @Fokko
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always! So, I think the linter isn't really sure what to do. It is pretty clear:
In the signature, we see that transform
accepts an optional, which I think is correct. However, _transform_literal
requires non-null, which is incorrect.
def _truncate_number(
name: str, pred: BoundLiteralPredicate[L], transform: Callable[[Optional[L]], Optional[L]]
) -> Optional[UnboundPredicate[Any]]:
boundary = pred.literal
if not isinstance(boundary, (LongLiteral, DecimalLiteral, DateLiteral, TimestampLiteral)):
raise ValueError(f"Expected a numeric literal, got: {type(boundary)}")
if isinstance(pred, BoundLessThan):
return LessThanOrEqual(Reference(name), _transform_literal(transform, boundary.decrement()))
elif isinstance(pred, BoundLessThanOrEqual):
The following change suppresses most of the warnings for me:
-def _transform_literal(func: Callable[[L], L], lit: Literal[L]) -> Literal[L]:
+def _transform_literal(func: Callable[[Any], Any], lit: Literal[L]) -> Literal[L]:
"""Small helper to upwrap the value from the literal, and wrap it again."""
return literal(func(lit.value))
class LiteralPredicate(IcebergBaseModel, UnboundPredicate[L], ABC): | ||
type: TypingLiteral["lt", "lt-eq", "gt", "gt-eq", "eq", "not-eq", "starts-with", "not-starts-with"] = Field(alias="type") | ||
term: UnboundTerm[L] | ||
literal: Literal[L] = Field(serialization_alias="value") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
literal: Literal[L] = Field(serialization_alias="value") | |
value: Literal[L] = Field() |
@field_validator("term", mode="before") | ||
@classmethod | ||
def _coerce_term(cls, v: Any) -> UnboundTerm[Any]: | ||
return _to_unbound_term(v) | ||
|
||
def __init__(self, term: Union[str, UnboundTerm[Any]], literal: Union[L, Literal[L]]): # pylint: disable=W0621 | ||
super().__init__(term) | ||
self.literal = _to_literal(literal) # pylint: disable=W0621 | ||
@field_validator("literal", mode="before") | ||
@classmethod | ||
def _coerce_literal(cls, v: Union[L, Literal[L]]) -> Literal[L]: | ||
return _to_literal(v) | ||
|
||
@field_serializer("literal") | ||
def ser_literal(self, literal: Literal[L]) -> str: | ||
return "Any" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we just call the field value
, and we add a literal
property:
@property
def literal(self) -> Literal[L]:
return self.value
def __init__(self, *args: Any, **kwargs: Any) -> None: | ||
if args: | ||
if len(args) != 2: | ||
raise TypeError("Expected (term, literal)") | ||
kwargs = {"term": args[0], "literal": args[1], **kwargs} | ||
super().__init__(**kwargs) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Always! So, I think the linter isn't really sure what to do. It is pretty clear:
In the signature, we see that transform
accepts an optional, which I think is correct. However, _transform_literal
requires non-null, which is incorrect.
def _truncate_number(
name: str, pred: BoundLiteralPredicate[L], transform: Callable[[Optional[L]], Optional[L]]
) -> Optional[UnboundPredicate[Any]]:
boundary = pred.literal
if not isinstance(boundary, (LongLiteral, DecimalLiteral, DateLiteral, TimestampLiteral)):
raise ValueError(f"Expected a numeric literal, got: {type(boundary)}")
if isinstance(pred, BoundLessThan):
return LessThanOrEqual(Reference(name), _transform_literal(transform, boundary.decrement()))
elif isinstance(pred, BoundLessThanOrEqual):
The following change suppresses most of the warnings for me:
-def _transform_literal(func: Callable[[L], L], lit: Literal[L]) -> Literal[L]:
+def _transform_literal(func: Callable[[Any], Any], lit: Literal[L]) -> Literal[L]:
"""Small helper to upwrap the value from the literal, and wrap it again."""
return literal(func(lit.value))
Closes #2523
Rationale for this change
Are these changes tested?
yes
Are there any user-facing changes?