Commit a3a020f
Fix Schema Duplication Errors in Self‑Referential INTERSECT/EXCEPT by Requalifying Input Sides (#18814)
## Which issue does this PR close?
* Closes #16295.
## Rationale for this change
Self-referential INTERSECT and EXCEPT queries (where both sides
originate from the same table) failed during Substrait round‑trip
consumption with the error:
> "Schema contains duplicate qualified field name"
This happened because the join-based implementation of set operations
attempted to merge two identical schemas without requalification,
resulting in duplicate or ambiguous field names. By ensuring both sides
are requalified when needed, DataFusion can correctly construct valid
logical plans for these operations.
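
The failure mode described above can be illustrated with a minimal, standard-library-only sketch. It is not DataFusion's code: `Field`, `field`, `merge`, and `requalify` are hypothetical stand-ins, the second column name is illustrative, and the error string only mimics the message quoted above. It assumes a join concatenates both input schemas and rejects any repeated qualified name.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for a schema field with an optional table qualifier.
#[derive(Debug, Clone, PartialEq, Eq, Hash)]
struct Field {
    qualifier: Option<String>,
    name: String,
}

/// Convenience constructor for a qualified field.
fn field(qualifier: &str, name: &str) -> Field {
    Field {
        qualifier: Some(qualifier.to_string()),
        name: name.to_string(),
    }
}

/// Concatenate two schemas the way a join would, rejecting any field whose
/// (qualifier, name) pair has already been seen.
fn merge(left: &[Field], right: &[Field]) -> Result<Vec<Field>, String> {
    let mut seen = HashSet::new();
    let mut merged = Vec::new();
    for f in left.iter().chain(right.iter()) {
        if !seen.insert(f.clone()) {
            let shown = match &f.qualifier {
                Some(q) => format!("{}.{}", q, f.name),
                None => f.name.clone(),
            };
            return Err(format!(
                "Schema contains duplicate qualified field name {}",
                shown
            ));
        }
        merged.push(f.clone());
    }
    Ok(merged)
}

/// Replace the qualifier of every field with `alias`, keeping names and order.
fn requalify(fields: &[Field], alias: &str) -> Vec<Field> {
    fields.iter().map(|f| field(alias, &f.name)).collect()
}
```

Merging a schema with itself fails in this sketch, while merging the same schema requalified once as `left` and once as `right` succeeds, because every (qualifier, name) pair is then unique.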
### Before
```
❯ cargo test --test sqllogictests -- --substrait-round-trip intersection.slt:33
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.24s
Running bin/sqllogictests.rs (target/debug/deps/sqllogictests-917e139464eeea33)
Completed 1 test files in 0 seconds
External error: 1 errors in file /Users/kosiew/GitHub/datafusion/datafusion/sqllogictest/test_files/intersection.slt
1. query failed: DataFusion error: Schema error: Schema contains duplicate qualified field name alltypes_plain.int_col
...
```
### After
```
❯ cargo test --test sqllogictests -- --substrait-round-trip intersection.slt:33
Finished `test` profile [unoptimized + debuginfo] target(s) in 0.64s
Running bin/sqllogictests.rs (target/debug/deps/sqllogictests-917e139464eeea33)
Completed 1 test files in 0 seconds
```
## What changes are included in this PR?
* Added a requalification step (`requalify_sides_if_needed`) inside
  `intersect_or_except` to avoid duplicate or ambiguous field names.
* Improved conflict detection logic in `requalify_sides_if_needed` to
  handle:
  1. Duplicate qualified fields
  2. Duplicate unqualified fields
  3. Ambiguous references (qualified vs. unqualified collisions)
* Updated optimizer tests to reflect correct aliasing (`left`, `right`).
* Added new Substrait round‑trip tests for:
  * INTERSECT and EXCEPT (both DISTINCT and ALL variants)
  * Self-referential queries that previously failed
* Minor formatting and consistency improvements in Substrait consumer
  code.
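
The three conflict cases listed above could be detected along the following lines. This is a hedged sketch under simplifying assumptions, not the body of `requalify_sides_if_needed`: `Column`, `Conflict`, `col`, and `find_conflict` are hypothetical names, and columns are modeled as a plain name with an optional relation.

```rust
/// Hypothetical stand-in for a column with an optional relation qualifier.
#[derive(Debug, Clone, PartialEq, Eq)]
struct Column {
    relation: Option<String>,
    name: String,
}

/// The three collision kinds named in the PR description.
#[derive(Debug, PartialEq, Eq)]
enum Conflict {
    DuplicateQualified,
    DuplicateUnqualified,
    AmbiguousReference,
}

/// Convenience constructor; `None` means the column is unqualified.
fn col(relation: Option<&str>, name: &str) -> Column {
    Column {
        relation: relation.map(|r| r.to_string()),
        name: name.to_string(),
    }
}

/// Return the first conflict that merging `left` and `right` would produce,
/// or `None` if the two sides can be combined without requalification.
fn find_conflict(left: &[Column], right: &[Column]) -> Option<Conflict> {
    for l in left {
        for r in right {
            if l.name != r.name {
                continue;
            }
            match (&l.relation, &r.relation) {
                // Case 1: same relation and same name on both sides.
                (Some(lq), Some(rq)) if lq == rq => {
                    return Some(Conflict::DuplicateQualified)
                }
                // Case 2: same bare name on both sides.
                (None, None) => return Some(Conflict::DuplicateUnqualified),
                // Case 3: one side qualified, the other not, same name.
                (Some(_), None) | (None, Some(_)) => {
                    return Some(Conflict::AmbiguousReference)
                }
                // Different relations with the same name do not collide.
                _ => {}
            }
        }
    }
    None
}
```

In this sketch a caller would requalify both sides (for example as `left` and `right`) only when `find_conflict` returns `Some(_)`, leaving plans that already have distinct qualifiers untouched.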
## Are these changes tested?
Yes. The PR includes comprehensive tests that:
* Reproduce the original failure modes.
* Validate that requalification produces stable and correct logical
plans.
* Confirm correct behavior across INTERSECT, EXCEPT, ALL, and DISTINCT
cases.
## Are there any user-facing changes?
No user-facing behavior changes.
This is a correctness improvement: valid SQL queries that previously
failed only in Substrait round‑trip mode now work without error.
## LLM-generated code disclosure
This PR includes LLM-generated code and comments. All LLM-generated
content has been manually reviewed and validated.

1 parent 107cb5e commit a3a020f
File tree
4 files changed, +162 -22 lines changed
- datafusion
  - expr/src/logical_plan
  - optimizer/tests
  - substrait
    - src/logical_plan/consumer/rel
    - tests/cases
Lines changed: 3 additions & 2 deletions
Lines changed: 90 additions & 0 deletions