Skip to content

Conversation

@eskabetxe
Copy link
Member

This pull request refactors the JDBC source split enumeration logic in the Flink connector, replacing the previous SqlTemplateSplitEnumerator and related classes with a new abstraction called SplitterEnumerator. The builder and enumerator classes are updated to use this new abstraction, simplifying split management and improving extensibility. The changes also update lineage handling and testing methods to use the new splitter abstraction.

Split Enumeration Refactoring

  • Replaced SqlTemplateSplitEnumerator and JdbcSqlSplitEnumeratorBase with the new SplitterEnumerator abstraction across JdbcSource, JdbcSourceBuilder, and JdbcSourceEnumerator, updating constructors, fields, and method calls accordingly. [1] [2] [3] [4]
  • Updated the builder to support setting a custom SplitterEnumerator, and added logic to select the appropriate splitter implementation based on provided parameters. [1] [2] [3]
  • Refactored enumerator startup and split assignment logic to use the new splitter, improving asynchronous split enumeration and handling both bounded and unbounded modes.

Lineage and Testing Updates

  • Changed lineage vertex extraction to use the new splitter abstraction for query information, removing direct calls to SqlTemplateSplitEnumerator. [1] [2]
  • Updated visible-for-testing methods to expose the new splitter abstraction instead of the old provider.

Miscellaneous

  • Updated equality checks and state snapshot logic to use the new splitter abstraction and its serializable state. [1] [2]
  • Removed obsolete annotations for the old split enumerator methods in the architecture violation files.

@eskabetxe eskabetxe force-pushed the source_dinamically branch 5 times, most recently from 2d3c1ac to 35399ed Compare November 25, 2025 16:25
@eskabetxe
Copy link
Member Author

@ferenc-csaky can you have a look when have free time..

Copy link
Contributor

@ferenc-csaky ferenc-csaky left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did not check the implementation details yet, added some interface-related comments.

import java.util.function.Supplier;

/** Interface for jdbc sql split enumerator. */
public interface SplitterEnumerator extends AutoCloseable, Serializable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should this IF be considered as a public API? If yes, lets add @PublicEvolving, otherwise @Internal. Based on JdbcSqlSplitEnumeratorBase, I guess @PublicEvolving. And that also means we should add javadoc to the functions. As far as I see most of them can be applied from JdbcSqlSplitEnumeratorBase.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

correct this should be PublicEvolving, the implementation can also be PublicEvolving if we decide to deprecate the JdbcParameterValuesProvider

Comment on lines -37 to -38
@PublicEvolving
public abstract class JdbcSqlSplitEnumeratorBase<SplitT> implements AutoCloseable, Serializable {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this is a public API, we cannot throw it away like this, we need to deprecate it first and keep the related config methods (deprecated) as well.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think about this, and I don't deprecated it because it is only used internally and cannot be passed in any way to source.. could have a wrong annotation?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants