Commit 2d549a9
Fix partition column projection with schema evolution (apache#2685)
Closes apache#2672
# Rationale for this change
When performing column projection on partitioned tables with schema
evolution, PyIceberg incorrectly uses the projected schema (containing
only selected columns) instead
of the full table schema when building partition types in
`_get_column_projection_values()`. This causes `ValueError: Could not
find field with id: X` when:
1. Reading from partitioned Iceberg tables
2. Using column projection (selecting specific columns, not `SELECT *`)
3. Selected columns do NOT include the partition field(s)
4. The table has undergone schema evolution (fields added/removed after
initial creation)
5. Reading files that are missing some of the selected columns (written
before schema evolution)
The root cause is that
`partition_spec.partition_type(projected_schema)` fails because the
projected schema may be missing fields that the partition specification
references.
The fix passes the full table schema from
`ArrowScan._table_metadata.schema()` through `_task_to_record_batches()`
to `_get_column_projection_values()`, ensuring all fields are available
when building partition accessors.
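
The failure mode and the fix can be illustrated with a minimal, self-contained sketch. The `Field`, `Schema`, and `PartitionSpec` classes below are hypothetical toy stand-ins written for this example; they are not PyIceberg's classes and only imitate the lookup-by-field-id behaviour described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    field_id: int
    name: str


class Schema:
    """Toy stand-in for a table schema, keyed by field id (not PyIceberg's Schema)."""

    def __init__(self, *fields: Field) -> None:
        self._by_id = {f.field_id: f for f in fields}

    def find_field(self, field_id: int) -> Field:
        if field_id not in self._by_id:
            raise ValueError(f"Could not find field with id: {field_id}")
        return self._by_id[field_id]

    def select(self, *names: str) -> "Schema":
        # Column projection: keep only the named fields.
        return Schema(*(f for f in self._by_id.values() if f.name in names))


@dataclass(frozen=True)
class PartitionSpec:
    """Toy partition spec that only records the source field ids."""

    source_ids: tuple

    def partition_type(self, schema: Schema) -> list:
        # Every partition source field must be resolvable in `schema`.
        return [schema.find_field(i) for i in self.source_ids]


full_schema = Schema(
    Field(1, "id"), Field(2, "name"), Field(3, "region"), Field(4, "new_column")
)
spec = PartitionSpec(source_ids=(3,))  # partitioned by "region"

# Projection that excludes the partition field.
projected_schema = full_schema.select("name", "new_column")

# Before the fix: the projected schema cannot resolve the partition field.
try:
    spec.partition_type(projected_schema)
    error = None
except ValueError as exc:
    error = str(exc)

# After the fix: the full table schema always contains the partition field.
partition_fields = spec.partition_type(full_schema)

print(error)
print([f.name for f in partition_fields])
```

The design point is that the partition type depends on the table, not on the query, so it should be built from the schema that is guaranteed to contain every partition source field.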
## Are these changes tested?
Yes. Added a test
`test_partition_column_projection_with_schema_evolution` that:
- Creates a partitioned table with initial schema
- Writes data with the initial schema
- Evolves the schema by adding a new column
- Writes data with the evolved schema
- Performs column projection that excludes the partition field
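
As a rough illustration of the scenario these steps exercise, the sketch below simulates it with plain dictionaries rather than a real catalog. The names `INITIAL_SCHEMA`, `EVOLVED_SCHEMA`, `scan`, and the file layout are assumptions made for this example only; the actual test uses PyIceberg tables and is not reproduced here.

```python
# Field id -> name for the table schema before and after evolution.
INITIAL_SCHEMA = {1: "id", 2: "name", 3: "region"}
EVOLVED_SCHEMA = {**INITIAL_SCHEMA, 4: "new_column"}
PARTITION_SOURCE_ID = 3  # identity partition on "region"

# Toy "data files": rows keyed by field id.
old_file = {"rows": [{1: 1, 2: "Alice", 3: "eu"}]}  # written before evolution
new_file = {"rows": [{1: 2, 2: "Bob", 3: "us", 4: "extra"}]}  # written after


def scan(files, table_schema, selected):
    """Project `selected` column names out of `files`.

    The partition field is resolved against the full table schema,
    so excluding it from the projection does not raise.
    """
    if PARTITION_SOURCE_ID not in table_schema:
        raise ValueError(f"Could not find field with id: {PARTITION_SOURCE_ID}")
    selected_ids = [fid for fid, name in table_schema.items() if name in selected]
    result = []
    for data_file in files:
        for row in data_file["rows"]:
            # Columns missing from an older file are filled with None.
            result.append({table_schema[fid]: row.get(fid) for fid in selected_ids})
    return result


rows = scan([old_file, new_file], EVOLVED_SCHEMA, selected={"name", "new_column"})
print(rows)
```

The row read from the older file has no value for `new_column`, which is the case that previously triggered the partition-type lookup against the projected schema.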
## Are there any user-facing changes?
No. Only internal helpers are changed.
1 parent 5773b7f commit 2d549a9
2 files changed, +79 -3 lines changed