Fix bin nested fields issue #4606

ahkcs · 2025-10-20T20:59:44Z

Description

Fixed bin command failing on nested fields (e.g., resource.attributes.telemetry.sdk.version).
Updated CalciteRelNodeVisitor.projectPlusOverriding() to use prefix matching instead of exact matching for nested field names.

Related Issues

Resolves #4482: [BUG] PPL bin command fails to produce output for nested/struct fields

ykmr1224 · 2025-10-21T00:45:32Z

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

        originalFieldNames.stream()
-            .filter(newNames::contains)
+            .filter(
+                originalName ->


Let's extract for readability / maintainability.
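One possible shape for the extraction suggested here is a small named predicate. This is only a sketch: the class and method names (`FieldOverrideMatcher`, `isOverriddenBy`) are hypothetical and do not come from the PR; only the two match conditions are taken from the diff.

```java
import java.util.List;

/** Hypothetical helper; the names are illustrative and not from the actual PR. */
final class FieldOverrideMatcher {
  private FieldOverrideMatcher() {}

  /**
   * Returns true when any new name equals the original field name, or
   * addresses a path nested under it (e.g. "resource.x" under "resource").
   */
  static boolean isOverriddenBy(String originalName, List<String> newNames) {
    String nestedPrefix = originalName + ".";
    return newNames.stream()
        .anyMatch(newName -> newName.equals(originalName) || newName.startsWith(nestedPrefix));
  }
}
```

The call site would then reduce to something like `.filter(originalName -> FieldOverrideMatcher.isOverriddenBy(originalName, newNames))`, assuming `newNames` is a list of strings there.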

ykmr1224 · 2025-10-21T00:57:50Z

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

+                                // Match exact field names (e.g., "age" == "age")
+                                // OR nested paths (e.g., "resource.attributes..." starts with
+                                // "resource")
+                                newName.equals(originalName)
+                                    || newName.startsWith(originalName + ".")))


Does this mean it could override every original field that starts with originalName + "."? For example, when the original fields include resource.a, resource.b, etc., will all of them be removed?

Thanks for the question! No - originalFieldNames only contains top-level field names like "resource", not nested paths like "resource.a" or "resource.b".

So when binning resource.c.nested, the filter only matches the top-level "resource" field (via startsWith("resource.")). This causes the entire resource struct to be removed and flattened, with the binned field replacing the original nested value.

Only one top-level struct gets matched and processed, regardless of how many nested fields it contains.

You can refer to this IT test I added: testBinWithNestedFieldWithoutExplicitProjection
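To make the claim above concrete, here is a minimal standalone sketch of the filter applied to a row type that has only top-level names. The class and method names are hypothetical, and the two-field schema is the one used as an illustration in this thread, not a real index mapping.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Illustrative only: mirrors the filter from the diff outside of Calcite. */
final class TopLevelMatchDemo {
  /** Original top-level fields that would be replaced by the given new names. */
  static List<String> fieldsToReplace(List<String> originalFieldNames, List<String> newNames) {
    return originalFieldNames.stream()
        .filter(
            originalName ->
                newNames.stream()
                    .anyMatch(
                        newName ->
                            newName.equals(originalName)
                                || newName.startsWith(originalName + ".")))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Assumed row type: one struct field and one scalar field, both top-level.
    List<String> rowType = List.of("resource", "severityNumber");
    // Binning a nested path matches only the enclosing top-level struct.
    System.out.println(fieldsToReplace(rowType, List.of("resource.c.nested"))); // prints [resource]
  }
}
```

In this sketch `severityNumber` is untouched, and no `resource.a` or `resource.b` entry exists to be matched because the assumed row type never lists nested paths.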

A field name can itself contain ".", so newName.startsWith("resource.") could match multiple fields, as I understand it.
I am unsure how well we currently support field names containing ".", but QualifiedNameResolver implements longest-match logic to decide which field is referenced when a qualified name contains multiple dots.
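For readers unfamiliar with the idea, longest-match resolution can be sketched as below. This is a generic illustration, not the actual QualifiedNameResolver code; the class and method names are hypothetical.

```java
import java.util.Optional;
import java.util.Set;

/** Generic longest-match sketch; not the real QualifiedNameResolver. */
final class LongestMatchDemo {
  /**
   * Returns the longest dot-delimited prefix of qualifiedName that is an
   * existing field name, trimming one trailing segment at a time.
   */
  static Optional<String> longestMatchingField(String qualifiedName, Set<String> fieldNames) {
    String candidate = qualifiedName;
    while (true) {
      if (fieldNames.contains(candidate)) {
        return Optional.of(candidate);
      }
      int lastDot = candidate.lastIndexOf('.');
      if (lastDot < 0) {
        return Optional.empty(); // no segment left to trim
      }
      candidate = candidate.substring(0, lastDot);
    }
  }
}
```

With fields `resource` and `resource.a` both present, `resource.a.b` resolves to `resource.a` rather than `resource`, which is the disambiguation a plain `startsWith` check does not perform.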

I think it would be an unusual case for a field name to contain multiple "."?
Looking at the end-to-end flow, originalFieldNames comes from:
context.relBuilder.peek().getRowType().getFieldNames()

This returns only the direct fields of the current RelNode's row type. For a schema with nested structures, the row type would have:

Top-level field: "resource" (type: STRUCT)

Top-level field: "severityNumber" (type: INT)

It would not contain "resource.a", "resource.b", etc. as separate top-level fields - those only exist as nested fields within the resource struct definition.

The scenario where multiple top-level fields share a prefix (like resource, resource.a, resource.b all being top-level) would require an unusual schema design (I think we would rarely encounter this).

An OpenSearch mapping key can include ".", and a field whose name contains "." can also be introduced easily with the eval command.
Why can't we directly match with the actual field we used?
projectPlusOverriding is used by multiple commands, so this change could have side effects on other commands.

Why can't we directly match with the actual field we used?

We DO directly match. The logic uses both:

Exact match: newName.equals(originalName)

Struct match: newName.startsWith(originalName + ".")

This exact logic is already used by eval. Example:
source=telemetry | eval resource.temp = 1 | bin resource.attributes.telemetry.sdk.version span=2

Step 1 - Eval:

"resource.temp".startsWith("resource.") → true → flattens struct

Schema becomes: ["resource.attributes.telemetry.sdk.enabled", "resource.attributes.telemetry.sdk.language", ..., "resource.temp"]

Step 2 - Bin:

Schema already flattened, exact match works: "resource.attributes.telemetry.sdk.version".equals("resource.attributes.telemetry.sdk.version") → true

Both commands use the same logic safely. I added a test (testBinWithEvalCreatedDottedFieldName) to confirm there are no edge-case issues.
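The two steps can be replayed outside the planner with plain string matching. This is a sketch under assumptions: the flattening itself is not implemented here, the post-eval schema is hard-coded and abbreviated to the field names mentioned in this thread, and the class and method names are hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Replays the eval-then-bin name matching; the flattening is assumed, not performed. */
final class EvalThenBinDemo {
  static boolean matches(String originalName, String newName) {
    return newName.equals(originalName) || newName.startsWith(originalName + ".");
  }

  /** Fields in the schema that the new name would override. */
  static List<String> matched(List<String> schema, String newName) {
    return schema.stream().filter(o -> matches(o, newName)).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Step 1 (eval): the struct is still a single top-level field, matched by prefix.
    List<String> beforeEval = List.of("resource");
    System.out.println(matched(beforeEval, "resource.temp")); // prints [resource]

    // Step 2 (bin): assumed flattened schema (abbreviated), matched exactly.
    List<String> afterEval =
        List.of(
            "resource.attributes.telemetry.sdk.enabled",
            "resource.attributes.telemetry.sdk.language",
            "resource.attributes.telemetry.sdk.version",
            "resource.temp");
    System.out.println(matched(afterEval, "resource.attributes.telemetry.sdk.version"));
    // prints [resource.attributes.telemetry.sdk.version]
  }
}
```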

I think @ykmr1224 is asking whether we can directly identify the sub-fields of a struct and project them, instead of relying on fragile name matching. E.g., if a struct is info {a: string, b: int}, we can directly project info, info.a, info.b instead of relying on the pattern info.* to match the names, because there may be another schema, info: string, info.a: integer, where info.a is not a sub-field of info.
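The concern can be illustrated with the second schema from this comment, where `info` and `info.a` are two unrelated top-level fields. The sketch below only evaluates the name predicate from the diff; whether such a schema actually reaches projectPlusOverriding in this form is not established in this thread, and the class name is hypothetical.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Shows which top-level names the exact-or-prefix predicate selects. */
final class DottedNameCollisionDemo {
  static List<String> matched(List<String> topLevelNames, String newName) {
    return topLevelNames.stream()
        .filter(o -> newName.equals(o) || newName.startsWith(o + "."))
        .collect(Collectors.toList());
  }

  public static void main(String[] args) {
    // Assumed schema: info (string) and info.a (integer), both top-level.
    List<String> schema = List.of("info", "info.a");
    // "info.a" is an exact match, but "info" is also selected via the prefix rule.
    System.out.println(matched(schema, "info.a")); // prints [info, info.a]
  }
}
```

Under that assumption, name matching alone cannot tell that `info` is a plain string rather than the parent struct of `info.a`; type information from the row type would be needed to make the distinction.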

@yuancu Thanks for the clarification! However, I think the current logic handles this situation; see the newly added integration test that uses the eval command: testBinWithEvalCreatedDottedFieldName

For the edge case you mentioned, I think it would be problematic to have two info.a fields with different types, because commands could not tell them apart. For example, with stats count(info.a), which info.a is being referred to? For other cases where there is no naming overlap, you can refer to the eval test case.

RyanL1997 · 2025-10-21T16:21:42Z

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

+                newName.equals(originalName)
+                    // OR match nested paths (e.g., "resource.attributes..." starts with
+                    // "resource.")
+                    || newName.startsWith(originalName + "."));


Is there any other place that uses this detection pattern? Or is there a better way to detect whether a field is nested?

In CalciteRelNodeVisitor.java, the visitFlatten() method uses the same startsWith() pattern:

    public RelNode visitFlatten(Flatten node, CalcitePlanContext context) {
      visitChildren(node, context);
      RelBuilder relBuilder = context.relBuilder;
      String fieldName = node.getField().getField().toString();
      // Match the sub-field names with "field.*"
      List<RelDataTypeField> fieldsToExpand =
          relBuilder.peek().getRowType().getFieldList().stream()
              .filter(f -> f.getName().startsWith(fieldName + ".")) // ← Same pattern!
              .toList();
      // ... rest of method
    }

Signed-off-by: Kai Huang <ahkcs@amazon.com>

(cherry picked from commit 85dc8d9) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* default-main: (34 commits) Enhance dynamic source clause to support only metadata filters (opensearch-project#4554) Make nested alias type support referring to outer context (opensearch-project#4673) Update big5 ppl queries and check plans (opensearch-project#4668) Support push down sort after limit (opensearch-project#4657) Use table scan rowType in filter pushdown could fix rename issue (opensearch-project#4670) Fix: Support Alias Fields in MIN, MAX, FIRST, LAST, and TAKE Aggregations (opensearch-project#4621) Fix bin nested fields issue (opensearch-project#4606) Add `per_minute`, `per_hour`, `per_day` function support (opensearch-project#4531) Pushdown sort aggregate metrics (opensearch-project#4603) Followup: Change ComparableLinkedHashMap to compare Key than Value (opensearch-project#4648) Mitigate the CI failure caused by 500 Internal Server Error (opensearch-project#4646) Allow renaming group-by fields to existing field names (opensearch-project#4586) Publish internal modules separately for downstream reuse (opensearch-project#4484) Revert "Update grammar files and developer guide (opensearch-project#4301)" (opensearch-project#4643) Support Automatic Type Conversion for REX/SPATH/PARSE Command Extractions (opensearch-project#4599) Replace all dots in fields of table scan's PhysType (opensearch-project#4633) Return comparable LinkedHashMap in `valueForCalcite()` of ExprTupleValue (opensearch-project#4629) Refactor JsonExtractAllFunctionIT and MapConcatFunctionIT (opensearch-project#4623) Pushdown case function in aggregations as range queries (opensearch-project#4400) Update GEOIP function to support IP types as input (opensearch-project#4613) ... # Conflicts: # docs/user/ppl/functions/conversion.rst

ahkcs force-pushed the fix/bin_nested branch from 942d497 to 9d329c4 Compare October 20, 2025 23:43

ahkcs marked this pull request as ready for review October 21, 2025 00:05

ykmr1224 reviewed Oct 21, 2025

ahkcs force-pushed the fix/bin_nested branch from dec8679 to 97492f4 Compare October 21, 2025 04:13

RyanL1997 reviewed Oct 21, 2025

ahkcs requested review from RyanL1997 and ykmr1224 October 21, 2025 23:07

ahkcs added 3 commits October 21, 2025 17:38

Fix bin nested fields issue

5e0cecc

Signed-off-by: Kai Huang <ahkcs@amazon.com>

refactor

58f20d8

Signed-off-by: Kai Huang <ahkcs@amazon.com>

add test

c64bea5

Signed-off-by: Kai Huang <ahkcs@amazon.com>

ahkcs force-pushed the fix/bin_nested branch from 97492f4 to c64bea5 Compare October 22, 2025 02:38

yuancu added the bug Something isn't working label Oct 22, 2025

ykmr1224 approved these changes Oct 23, 2025

RyanL1997 added bugFix PPL Piped processing language backport 2.19-dev labels Oct 24, 2025

RyanL1997 approved these changes Oct 24, 2025

RyanL1997 merged commit 85dc8d9 into opensearch-project:main Oct 24, 2025
40 of 41 checks passed

opensearch-trigger-bot bot pushed a commit that referenced this pull request Oct 24, 2025

Fix bin nested fields issue (#4606)

4825a2f

(cherry picked from commit 85dc8d9) Signed-off-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

opensearch-trigger-bot bot mentioned this pull request Oct 24, 2025

[Backport 2.19-dev] Fix bin nested fields issue #4663

Merged
