[ty] Synthesize precise `getitem` overloads for tuple subclasses #19493

AlexWaygood · 2025-07-22T18:49:15Z

Summary

This PR adds synthesized __getitem__ overloads for tuple subclasses, such that for f in the following example, we can infer that f[0] evaluates to int and f[1] evaluates to str:

class Foo(tuple[int, str]): ...

f = Foo((42, "foo"))
reveal_type(f[0])  # revealed: int
reveal_type(f[1])  # revealed: str
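
As a rough illustration, the synthesized overloads for a subclass of tuple[int, str] could be written out by hand as in the sketch below. This is not ty's actual output: ty synthesizes these signatures internally, and the handling of negative literal indices and the exact fallback return types shown here are assumptions made for the example.

```python
from __future__ import annotations

from typing import Literal, SupportsIndex, overload


class Foo(tuple):
    """Hand-written stand-in for `class Foo(tuple[int, str])`."""

    # One overload per literal index (covering negative indices is an assumption).
    @overload
    def __getitem__(self, index: Literal[0, -2], /) -> int: ...
    @overload
    def __getitem__(self, index: Literal[1, -1], /) -> str: ...
    # Fallbacks for non-literal indices and for slices.
    @overload
    def __getitem__(self, index: SupportsIndex, /) -> int | str: ...
    @overload
    def __getitem__(self, index: slice, /) -> tuple[int | str, ...]: ...
    def __getitem__(self, index, /):
        # At runtime, defer to the ordinary tuple implementation.
        return super().__getitem__(index)


f = Foo((42, "foo"))
first, second = f[0], f[1]
```

With overloads of this shape, a checker that resolves f[0] through __getitem__ picks the first overload and reports int, without needing tuple-specific logic for the subclass.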

It was initially my hope when embarking on this PR that we would be able to use these synthesized overloads to fully get rid of the special casing we have for tuples in infer_subscript_expression_types. Over the course of writing the PR, I realised that this would not be possible, for the following reasons:

1. The special casing for slice literals is too complicated to be reasonably implemented via synthesized overloads.

   The synthesized overloads being added for int literals in this PR are already (I believe) the most complicated synthesized functions we have anywhere in our codebase so far. The synthesized overloads required to support slice literals would be far more complicated. It might be theoretically possible to generate them, but I think the inherent complexity makes this realistically untenable. It could also cause performance problems to synthesize that many overloads.

2. The index-out-of-bounds error can't be implemented via synthesized overloads.

   I'd like to explore generalising this diagnostic so that we don't just emit it for specific Type variants that we know to have fixed lengths, but for any type where Type::len() returns Some(). That's for another PR, however.

3. For a tuple type like tuple[int, *tuple[str, ...], bytes], it's impossible to express "if it's subscripted with any literal integer higher than 1, you should infer str | bytes rather than int | str | bytes" using synthesized overloads.

   This is currently implemented in our tuple special-casing, and it would be a shame to introduce a regression on this.
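
To make the last point concrete, here is a minimal sketch of the kind of reasoning the special casing performs for a literal index into a tuple with a variable-length middle. It is not ty's implementation: the function name element_types_at is hypothetical, types are represented as plain strings, and only non-negative indices are handled. An overload cannot capture this, because there is no way to annotate a parameter as "any integer literal above some bound".

```python
def element_types_at(prefix, variadic, suffix, index):
    """Possible element types at `index` in tuple[*prefix, *tuple[variadic, ...], *suffix]."""
    if index < 0:
        raise ValueError("this sketch only handles non-negative indices")
    if index < len(prefix):
        # Still inside the fixed-length prefix: the type is known exactly.
        return {prefix[index]}
    # Past the prefix: depending on how many variadic elements exist at runtime,
    # the index lands on a variadic element or on one of the first
    # `offset + 1` suffix elements.
    offset = index - len(prefix)
    return {variadic, *suffix[: offset + 1]}


# tuple[int, *tuple[str, ...], bytes]
prefix, variadic, suffix = ("int",), "str", ("bytes",)
at_0 = element_types_at(prefix, variadic, suffix, 0)
at_5 = element_types_at(prefix, variadic, suffix, 5)
```

Here at_0 contains only "int", while at_5 contains "str" and "bytes" but not "int", which matches the narrowing described above.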

Because of these issues, I've come to the conclusion that we will not be able to get rid of the hardcoded special casing for tuples in TypeInferenceBuilder::infer_subscript_expression_types, and that we will probably have to extend it so that it also applies to tuple subclasses. Nonetheless, I'm opening this PR anyway, because inferring precise signatures for __getitem__ attributes on specialised tuples has advantages even if we don't end up using these signatures directly when inferring the types of subscript expressions against tuples. A good example of why is protocol assignability: it would be ideal if Bar is understood by ty as a subtype of Proto in the following example:

from typing import Protocol, Literal

class Proto(Protocol):
    def __getitem__(self, index: Literal[0], /) -> int: ...

class Bar(tuple[int, str]): ...

We currently say that Bar is indeed a subtype of Proto, but unless we land something similar to this PR, that will no longer be the case after fixing astral-sh/ty#889. After fixing that issue, we will strictly validate the signature of a class's method against the signature of a protocol it claims to be an instance of. Without this PR, we would look up the __getitem__ signature on tuple[int] and fall back to the generic __getitem__ signature in typeshed, which would lead us to incorrectly infer that Bar is not a subtype of Proto. With this PR, however, we should have the necessary pieces in place to continue considering Bar a subtype of Proto even after #889 has been fixed, because the lookup of __getitem__ on the Bar class object would return the precise synthesized overloads being added here.
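
A small usage sketch of why this matters in practice: a function that accepts Proto should keep accepting a Bar instance. The helper first_item is hypothetical and only illustrates the call site. The snippet runs as ordinary Python, but whether a type checker accepts the call depends on the assignability rules discussed above.

```python
from typing import Literal, Protocol


class Proto(Protocol):
    def __getitem__(self, index: Literal[0], /) -> int: ...


class Bar(tuple[int, str]): ...


def first_item(obj: Proto) -> int:
    # Only index 0 is promised by the protocol.
    return obj[0]


# Passing a Bar where a Proto is expected relies on Bar being a subtype of Proto.
result = first_item(Bar((1, "a")))
```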

Test Plan

Mdtests. The ecosystem hits also LGTM. They use os.stat() and pwd.getpwuid(), both of which return instances of tuple subclasses.
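
For context on the ecosystem hits, the sketch below shows at runtime that os.stat() returns a tuple subclass whose positional items line up with its named fields. It only demonstrates runtime behaviour. That a checker would now infer int rather than int | float for the indexed value is an assumption based on the removed cloud-init diagnostics listed in the primer output.

```python
import os
import stat

st = os.stat(os.curdir)

# os.stat_result is a tuple subclass, so positional indexing works.
is_tuple = isinstance(st, tuple)
mode_by_index = st[stat.ST_MODE]  # stat.ST_MODE is the index of the mode field
mode_by_name = st.st_mode
```

Code that passes st[stat.ST_MODE] to a function such as oct() benefits from the per-index element type being known precisely.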

github-actions · 2025-07-22T18:52:53Z

`mypy_primer` results

Changes were detected when running on open source projects

paasta (https://github.com/yelp/paasta)
- paasta_tools/utils.py:3111:12: error[invalid-return-type] Return type does not match returned value: expected `str`, found `str | int`
- Found 885 diagnostics
+ Found 884 diagnostics

cloud-init (https://github.com/canonical/cloud-init)
- tests/unittests/sources/test_smartos.py:560:32: error[invalid-argument-type] Argument to function `oct` is incorrect: Expected `SupportsIndex`, found `int | float`
- tests/unittests/sources/test_smartos.py:576:32: error[invalid-argument-type] Argument to function `oct` is incorrect: Expected `SupportsIndex`, found `int | float`
- tests/unittests/sources/test_smartos.py:632:35: error[invalid-argument-type] Argument to function `oct` is incorrect: Expected `SupportsIndex`, found `int | float`
- Found 599 diagnostics
+ Found 596 diagnostics

cwltool (https://github.com/common-workflow-language/cwltool)
- cwltool/cwlprov/__init__.py:16:20: warning[possibly-unbound-attribute] Attribute `split` on type `str | int` is possibly unbound
- Found 127 diagnostics
+ Found 126 diagnostics

scipy (https://github.com/scipy/scipy)
- scipy/_lib/_util.py:310:23: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown, ...]`
+ scipy/_lib/_util.py:310:23: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown, ...]`
- scipy/optimize/tests/test_chandrupatla.py:687:9: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
+ scipy/optimize/tests/test_chandrupatla.py:687:9: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
- scipy/optimize/tests/test_chandrupatla.py:691:9: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
+ scipy/optimize/tests/test_chandrupatla.py:691:9: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
- scipy/optimize/tests/test_chandrupatla.py:699:9: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
+ scipy/optimize/tests/test_chandrupatla.py:699:9: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
- scipy/optimize/tests/test_chandrupatla.py:702:9: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
+ scipy/optimize/tests/test_chandrupatla.py:702:9: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
- scipy/optimize/tests/test_chandrupatla.py:710:9: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
+ scipy/optimize/tests/test_chandrupatla.py:710:9: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
- scipy/optimize/tests/test_chandrupatla.py:713:9: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
+ scipy/optimize/tests/test_chandrupatla.py:713:9: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
- scipy/optimize/tests/test_chandrupatla.py:719:9: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
+ scipy/optimize/tests/test_chandrupatla.py:719:9: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
- scipy/optimize/tests/test_chandrupatla.py:724:9: error[call-non-callable] Method `__getitem__` of type `Overload[(key: SupportsIndex, /) -> Unknown, (key: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`
+ scipy/optimize/tests/test_chandrupatla.py:724:9: error[call-non-callable] Method `__getitem__` of type `Overload[(index: SupportsIndex, /) -> Unknown, (index: slice[Any, Any, Any], /) -> tuple[Unknown, ...]]` is not callable on object of type `tuple[Unknown]`

No memory usage changes detected ✅

github-actions · 2025-07-29T13:41:33Z

Diagnostic diff on typing conformance tests

Changes were detected when running ty on typing conformance tests

--- old-output.txt	2025-07-30 11:24:12.533913338 +0000
+++ new-output.txt	2025-07-30 11:24:12.595913773 +0000
@@ -87,6 +87,7 @@
 aliases_variance.py:18:24: error[non-subscriptable] Cannot subscript object of type `<class 'ClassA[T_co]'>` with no `__class_getitem__` method
 aliases_variance.py:28:16: error[non-subscriptable] Cannot subscript object of type `<class 'ClassA[T_co]'>` with no `__class_getitem__` method
 aliases_variance.py:44:16: error[non-subscriptable] Cannot subscript object of type `<class 'ClassB[T_co, T_contra]'>` with no `__class_getitem__` method
+annotations_coroutines.py:27:5: error[type-assertion-failure] Argument does not have asserted type `str`
 annotations_forward_refs.py:22:7: error[unresolved-reference] Name `ClassA` used when not defined
 annotations_forward_refs.py:23:12: error[unresolved-reference] Name `ClassA` used when not defined
 annotations_forward_refs.py:49:10: error[invalid-type-form] Variable of type `Literal[1]` is not allowed in a type expression
@@ -101,6 +102,8 @@
 annotations_forward_refs.py:96:1: error[type-assertion-failure] Argument does not have asserted type `int`
 annotations_generators.py:86:21: error[invalid-return-type] Return type does not match returned value: expected `int`, found `types.GeneratorType`
 annotations_generators.py:91:27: error[invalid-return-type] Return type does not match returned value: expected `int`, found `types.AsyncGeneratorType`
+annotations_generators.py:167:5: error[type-assertion-failure] Argument does not have asserted type `AsyncGenerator[str, None]`
+annotations_generators.py:174:5: error[type-assertion-failure] Argument does not have asserted type `AsyncGenerator[str, None]`
 annotations_generators.py:193:1: error[type-assertion-failure] Argument does not have asserted type `() -> AsyncIterator[int]`
 annotations_methods.py:31:1: error[type-assertion-failure] Argument does not have asserted type `A`
 annotations_methods.py:36:1: error[type-assertion-failure] Argument does not have asserted type `B`
@@ -889,4 +892,4 @@
 tuples_type_form.py:36:1: error[invalid-assignment] Object of type `tuple[Literal[1], Literal[2], Literal[3], Literal[""]]` is not assignable to `tuple[int, ...]`
 typeddicts_operations.py:60:1: error[type-assertion-failure] Argument does not have asserted type `str | None`
 typeddicts_type_consistency.py:101:1: error[invalid-assignment] Object of type `Unknown | None` is not assignable to `str`
-Found 890 diagnostics
+Found 893 diagnostics

sharkdp

Thank you very much.

This is extremely cool. It makes me a little sad, because the protocol matching seems to be the only effect of this once we add the special-casing logic in infer_subscript_expression_types for tuple subclasses. And I'm not really convinced that this is a real-world concern?

If it is a real-world concern, then why do we do this for tuple subclasses only? Shouldn't the same logic apply to normal tuples as well?