-
Notifications
You must be signed in to change notification settings - Fork 14
feat: Add logic for greedy backfill #1887
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
9a6fa80 to
ef0859a
Compare
| publisherConfiguration = serverContext.configuration().getConfigData(PublisherConfig.class); | ||
|
|
||
| this.scheduledExecutorService = context.threadPoolManager() | ||
| .createSingleThreadScheduledExecutor( | ||
| "StreamPublisherRunner", | ||
| (t, e) -> LOGGER.log(ERROR, "Uncaught exception in thread: " + t.getName(), e)); | ||
| // schedule a notification to prompt backfill to proactively look for newer blocks from other block nodes | ||
| this.scheduledExecutorService.scheduleWithFixedDelay( | ||
| () -> notifyTooFarBehind(UNKNOWN_BLOCK_NUMBER), | ||
| publisherConfiguration.noActivityNotificationIntervalSeconds(), | ||
| publisherConfiguration.noActivityNotificationIntervalSeconds(), | ||
| TimeUnit.SECONDS); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think we need this at all.
Backfill already has scheduled activity and could check for changes in latest available block (rather than publisher blind sending an event assuming it's too far behind)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finished listening the recording and catching up with followup conversation offline.
I agree that if we are just sending this message from the publisher every X amount of time, and not using the no publishers condition we discussed initially. then it does not make sense to have a new scheduled executor on the publisher.
however I think we can re-use the work you did for the scheduled executor within the backfill plugin to schedule the autonomous task that checks for gaps.
and as part of that check, if greedy mode is enabled we can check for newer blocks as well...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed here. Using the work to make scheduled executors work correctly in the Backfill plugin to schedule a check (has the last acknowledged increased since the last backfill, for instance) when "greedy backfill" is enabled seems to be a good approach.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good note.
This is left over design from the first part where it was staged on no publisher.
I should have pulled this out after the design approach update.
Will do this and it should likely simplify all the test changes.
Will fix that up
AlfredoG87
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First pass is looking good.
I agree with Joseph recomendation to avoid triggering the process from the publisher Manager since we don't care if we have or not active publishers in order to backfill in greedy mode.
There is a method getBackfillTypeForBlock that determines if the PersistenceNotification is coming from a autonomous or on-demand flow based on the block number, after this changes we cannot longer trust this method for determining the source. Here I would recommend switching for a map for each flow and store there the blocs that were submitted for verification+persistence and backfill is awaiting to complete before moving on to the next batch.
or even fire and forget and don't care what happens after blocks are submitted, this approach might greatly simplify logic, but might also potentially create some queue saturation if the BN is too slow.
| .withConfigDataType(org.hiero.block.node.app.config.ServerConfig.class) | ||
| .withConfigDataType(org.hiero.block.node.app.config.node.NodeConfig.class); | ||
| .withConfigDataType(org.hiero.block.node.app.config.node.NodeConfig.class) | ||
| .withValue("prometheus.endpointEnabled", "false"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice addition, this will make testing cleaner 💯
| // add block range available from other BN sources | ||
| LongRange detectedRecentGapRange = backfillGrpcClientAutonomous.getNewAvailableRange( | ||
| blockRanges.getLast().end()); | ||
| if (detectedRecentGapRange.size() > 0 && detectedRecentGapRange.start() >= 0) { | ||
| detectedGaps = new ArrayList<>(); | ||
| detectedGaps.add(detectedRecentGapRange); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if we want to wrap this in a feature flag to determine if we want autonomous backfill to be greedy.
| public record PublisherConfig( | ||
| @Loggable @ConfigProperty(defaultValue = "9_223_372_036_854_775_807") @Min(100_000L) long batchForwardLimit) { | ||
| @Loggable @ConfigProperty(defaultValue = "9_223_372_036_854_775_807") @Min(100_000L) long batchForwardLimit, | ||
| @Loggable @ConfigProperty(defaultValue = "10") @Min(5) @Max(60) int noActivityNotificationIntervalSeconds) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we use the same scheduled task of autonomous for greedy we don't even need this property.
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
…eable Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
3d92e48 to
63bf223
Compare
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Codecov Report❌ Patch coverage is ❌ Your patch check has failed because the patch coverage (77.19%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage. @@ Coverage Diff @@
## main #1887 +/- ##
============================================
- Coverage 79.48% 79.18% -0.30%
- Complexity 1193 1215 +22
============================================
Files 128 130 +2
Lines 5684 5857 +173
Branches 610 624 +14
============================================
+ Hits 4518 4638 +120
- Misses 887 937 +50
- Partials 279 282 +3
... and 1 file with indirect coverage changes 🚀 New features to boost your workflow:
|
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Reviewer Notes
Current backfill logic doesn't account for blocks that are available from other blocks when BN has no publishers pushing blocks to it or has publishers not sending new blocks (dealing with duplicated etc) but the network is progressing and other BNs have the blocks.
In this case we want BNs where possible to proactively retrieve new blocks that exist on other block nodes.
We expect due to network latency it may happen that CNs switch around BNs to push blocks, in this case it's likely for a BN to fall behind and we want it to be close enough to live should a CN ever switch back to it.
serverStatusResponse.lastAvailableBlockavailable from other BNsserverStatusResponse.lastAvailableBlockavailable from other BNsNewestBlockKnownToNetworkNotificationevery 10 secs to prompt on demand backfill for new blocksRelated Issue(s)
Fixes #1886
Fixes #1238