Skip to content

Conversation

@Nana-EC
Copy link
Contributor

@Nana-EC Nana-EC commented Nov 21, 2025

Reviewer Notes

Current backfill logic doesn't account for blocks that are available from other blocks when BN has no publishers pushing blocks to it or has publishers not sending new blocks (dealing with duplicated etc) but the network is progressing and other BNs have the blocks.
In this case we want BNs where possible to proactively retrieve new blocks that exist on other block nodes.

We expect due to network latency it may happen that CNs switch around BNs to push blocks, in this case it's likely for a BN to fall behind and we want it to be close enough to live should a CN ever switch back to it.

  • Update autonomous mode to search for blocks up to the max of serverStatusResponse.lastAvailableBlock available from other BNs
  • Update on-demand mode to search for blocks up to the max of serverStatusResponse.lastAvailableBlock available from other BNs
  • Add logic to PublisherPlugin that periodically issues a NewestBlockKnownToNetworkNotification every 10 secs to prompt on demand backfill for new blocks

Related Issue(s)

Fixes #1886
Fixes #1238

@Nana-EC Nana-EC added this to the 0.24.0 milestone Nov 21, 2025
@Nana-EC Nana-EC self-assigned this Nov 21, 2025
@Nana-EC Nana-EC added the Block Node Issues/PR related to the Block Node. label Nov 21, 2025
@Nana-EC Nana-EC force-pushed the 1886-greedy-backfill branch from 9a6fa80 to ef0859a Compare November 26, 2025 05:11
Comment on lines 106 to 117
publisherConfiguration = serverContext.configuration().getConfigData(PublisherConfig.class);

this.scheduledExecutorService = context.threadPoolManager()
.createSingleThreadScheduledExecutor(
"StreamPublisherRunner",
(t, e) -> LOGGER.log(ERROR, "Uncaught exception in thread: " + t.getName(), e));
// schedule a notification to prompt backfill to proactively look for newer blocks from other block nodes
this.scheduledExecutorService.scheduleWithFixedDelay(
() -> notifyTooFarBehind(UNKNOWN_BLOCK_NUMBER),
publisherConfiguration.noActivityNotificationIntervalSeconds(),
publisherConfiguration.noActivityNotificationIntervalSeconds(),
TimeUnit.SECONDS);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this at all.
Backfill already has scheduled activity and could check for changes in latest available block (rather than publisher blind sending an event assuming it's too far behind)

Copy link
Contributor

@AlfredoG87 AlfredoG87 Nov 26, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I finished listening the recording and catching up with followup conversation offline.

I agree that if we are just sending this message from the publisher every X amount of time, and not using the no publishers condition we discussed initially. then it does not make sense to have a new scheduled executor on the publisher.

however I think we can re-use the work you did for the scheduled executor within the backfill plugin to schedule the autonomous task that checks for gaps.

and as part of that check, if greedy mode is enabled we can check for newer blocks as well...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed here. Using the work to make scheduled executors work correctly in the Backfill plugin to schedule a check (has the last acknowledged increased since the last backfill, for instance) when "greedy backfill" is enabled seems to be a good approach.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good note.
This is left over design from the first part where it was staged on no publisher.
I should have pulled this out after the design approach update.
Will do this and it should likely simplify all the test changes.
Will fix that up

Copy link
Contributor

@AlfredoG87 AlfredoG87 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

First pass is looking good.

I agree with Joseph recomendation to avoid triggering the process from the publisher Manager since we don't care if we have or not active publishers in order to backfill in greedy mode.

There is a method getBackfillTypeForBlock that determines if the PersistenceNotification is coming from a autonomous or on-demand flow based on the block number, after this changes we cannot longer trust this method for determining the source. Here I would recommend switching for a map for each flow and store there the blocs that were submitted for verification+persistence and backfill is awaiting to complete before moving on to the next batch.

or even fire and forget and don't care what happens after blocks are submitted, this approach might greatly simplify logic, but might also potentially create some queue saturation if the BN is too slow.

.withConfigDataType(org.hiero.block.node.app.config.ServerConfig.class)
.withConfigDataType(org.hiero.block.node.app.config.node.NodeConfig.class);
.withConfigDataType(org.hiero.block.node.app.config.node.NodeConfig.class)
.withValue("prometheus.endpointEnabled", "false");
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice addition, this will make testing cleaner 💯

Comment on lines 284 to 289
// add block range available from other BN sources
LongRange detectedRecentGapRange = backfillGrpcClientAutonomous.getNewAvailableRange(
blockRanges.getLast().end());
if (detectedRecentGapRange.size() > 0 && detectedRecentGapRange.start() >= 0) {
detectedGaps = new ArrayList<>();
detectedGaps.add(detectedRecentGapRange);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure if we want to wrap this in a feature flag to determine if we want autonomous backfill to be greedy.

public record PublisherConfig(
@Loggable @ConfigProperty(defaultValue = "9_223_372_036_854_775_807") @Min(100_000L) long batchForwardLimit) {
@Loggable @ConfigProperty(defaultValue = "9_223_372_036_854_775_807") @Min(100_000L) long batchForwardLimit,
@Loggable @ConfigProperty(defaultValue = "10") @Min(5) @Max(60) int noActivityNotificationIntervalSeconds) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we use the same scheduled task of autonomous for greedy we don't even need this property.

Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
…eable

Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
@Nana-EC Nana-EC force-pushed the 1886-greedy-backfill branch from 3d92e48 to 63bf223 Compare November 28, 2025 16:22
Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
@codecov
Copy link

codecov bot commented Nov 28, 2025

Codecov Report

❌ Patch coverage is 77.19298% with 26 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...hiero/block/node/app/DefaultThreadPoolManager.java 0.00% 12 Missing ⚠️
.../org/hiero/block/node/backfill/BackfillPlugin.java 81.81% 7 Missing and 5 partials ⚠️
.../hiero/block/node/backfill/BackfillGrpcClient.java 91.66% 0 Missing and 2 partials ⚠️

❌ Your patch check has failed because the patch coverage (77.19%) is below the target coverage (80.00%). You can increase the patch coverage or adjust the target coverage.

@@             Coverage Diff              @@
##               main    #1887      +/-   ##
============================================
- Coverage     79.48%   79.18%   -0.30%     
- Complexity     1193     1215      +22     
============================================
  Files           128      130       +2     
  Lines          5684     5857     +173     
  Branches        610      624      +14     
============================================
+ Hits           4518     4638     +120     
- Misses          887      937      +50     
- Partials        279      282       +3     
Files with missing lines Coverage Δ Complexity Δ
...ero/block/node/backfill/BackfillConfiguration.java 100.00% <ø> (ø) 1.00 <0.00> (ø)
...ro/block/node/backfill/client/BlockNodeClient.java 100.00% <100.00%> (ø) 6.00 <2.00> (+2.00)
...ro/block/node/spi/threading/ThreadPoolManager.java 50.00% <ø> (ø) 2.00 <0.00> (ø)
.../hiero/block/node/backfill/BackfillGrpcClient.java 94.17% <91.66%> (-0.77%) 21.00 <5.00> (+5.00) ⬇️
...hiero/block/node/app/DefaultThreadPoolManager.java 27.50% <0.00%> (-11.79%) 5.00 <0.00> (ø)
.../org/hiero/block/node/backfill/BackfillPlugin.java 81.27% <81.81%> (+5.08%) 46.00 <9.00> (+12.00)

... and 1 file with indirect coverage changes

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.

Signed-off-by: Nana Essilfie-Conduah <nana@swirldslabs.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Block Node Issues/PR related to the Block Node.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

feat(backfill): Greedy Backfill proactively fills in new blocks Handle "no publisher" scenarios in block node

4 participants