@oskarszoon
Contributor

This is a work-in-progress branch where developers can quickly iterate on possible improvements to the scaling environment.

Once we've found good improvements, all or some of these changes will be pushed back to the main codebase in separate PRs.

❗ Do not merge this PR
❗ Do not pull main into this branch

oskarszoon and others added 23 commits November 28, 2025 12:40
…n loading

Instrument the loadUnminedTransactions function with detailed Prometheus metrics to track performance and identify bottlenecks during startup:

- UTXO index readiness tracking (gauge + wait duration histogram)
- Iterator creation timing with success/error status
- Iterator processing with detailed transaction statistics (skipped, already mined, locked, added)
- Mark transactions on longest chain timing
- Sort transactions timing bucketed by volume (<1k, 1k-10k, 10k-100k, 100k-1M, >1M)
- Parent chain validation timing and filtered count
- AddDirectly calls with individual and batch timing
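
The volume bucketing used for the sort timing can be sketched as a small labelling function. This is only an illustration: the name `volumeBucket` is hypothetical, and which side each boundary value falls on is an assumption, since the commit message gives only the ranges.

```go
package main

// volumeBucket maps a transaction count to the label of its volume range.
// The ranges come from the commit message; the handling of the exact
// boundary values (1k, 10k, 100k, 1M) is an assumption.
func volumeBucket(n int) string {
	switch {
	case n < 1_000:
		return "<1k"
	case n < 10_000:
		return "1k-10k"
	case n < 100_000:
		return "10k-100k"
	case n <= 1_000_000:
		return "100k-1M"
	default:
		return ">1M"
	}
}
```

A label like this would typically be attached to the sort-duration histogram so that timings for small and large loads are not mixed in one series.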

Also fixed an unused import in pkg/k8sresolver/k8s.go.

Restructured UTracer.Start() to process logging and metrics
independently of tracing state. Previously, early return when
tracing was disabled prevented log messages, metrics (histograms/
counters), and stats from being recorded.

Changes:
- Move option processing before tracing check
- Execute logging at span start/end regardless of tracing state
- Record metrics (histogram/counter) in endFn unconditionally
- Gate OpenTelemetry span operations behind tracingEnabled check
- Add test coverage for disabled tracing with logging/metrics

This allows observability through logs and metrics even when
distributed tracing is disabled for performance reasons.
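
A minimal sketch of the restructuring described above, assuming a much simplified tracer. The `tracer` and `span` types and the `observe` hook are hypothetical stand-ins, not the real UTracer or OpenTelemetry API; the point is only that logging and metrics run unconditionally while the span is gated.

```go
package main

import (
	"log"
	"time"
)

// span is a hypothetical stand-in for an OpenTelemetry span.
type span struct{ name string }

func (s *span) End() {}

// tracer is a hypothetical, simplified tracer; it is not the real UTracer.
type tracer struct {
	tracingEnabled bool
	logger         *log.Logger           // optional, may be nil
	observe        func(seconds float64) // optional metric hook, may be nil
}

// Start returns an end function. Logging and metrics are handled whether or
// not tracing is enabled; only the span is gated behind tracingEnabled.
func (t *tracer) Start(name string) (end func()) {
	started := time.Now()
	if t.logger != nil {
		t.logger.Printf("start %s", name)
	}

	var s *span
	if t.tracingEnabled {
		s = &span{name: name}
	}

	return func() {
		if t.observe != nil {
			t.observe(time.Since(started).Seconds())
		}
		if t.logger != nil {
			t.logger.Printf("end %s", name)
		}
		if s != nil {
			s.End()
		}
	}
}
```

With `tracingEnabled` set to false, calling `Start` and then the returned function still invokes `observe` once, which is the behavior the commit restores.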
When getminingcandidate is called while processNewBlockAnnouncement is processing a new block, immediately return an empty block template for the new height instead of timing out or blocking. This prevents miners from wasting hashrate on stale work and eliminates timeout errors during block processing.
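
The early-return behavior can be sketched as follows. All names here (`assembler`, `blockTemplate`, `processingBlock`, `newHeight`) are hypothetical, and using atomic flags to signal that a block is being processed is an assumption about the mechanism, not a description of the actual code.

```go
package main

import "sync/atomic"

// blockTemplate is a hypothetical, stripped-down mining candidate.
type blockTemplate struct {
	Height  uint32
	TxCount int
}

// assembler is a hypothetical stand-in for the block assembler.
type assembler struct {
	processingBlock atomic.Bool   // set while a new block is being processed
	newHeight       atomic.Uint32 // height miners should work on next
	generate        func() blockTemplate
}

// getMiningCandidate returns an empty template for the new height right away
// if a new block is being processed, instead of blocking until it finishes.
func (a *assembler) getMiningCandidate() blockTemplate {
	if a.processingBlock.Load() {
		return blockTemplate{Height: a.newHeight.Load()}
	}
	return a.generate()
}
```

Miners can start hashing on the empty template at the new height immediately and pick up a full template on a later call.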
Simplified the mining candidate caching from ~200 lines to ~100 lines by removing:
- Infinite loop with multiple continue statements
- Duplicated "stale cache" logic (appeared twice)
- Complex lock upgrade pattern with double-checking
- generationChan coordination mechanism
- Unnecessary retry loops

New approach:
1. Check cache with read lock (fast path)
2. Acquire write lock for generation (prevents concurrent generation)
3. Double-check cache after acquiring write lock (race prevention)
4. Generate and update cache

This maintains all functionality while being much easier to understand and maintain.
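
The four steps of the new approach can be sketched with a `sync.RWMutex`. The types and the rule that a cached candidate is valid only for the height it was built for are assumptions made for illustration; the commit does not say how staleness is decided.

```go
package main

import "sync"

// miningCandidate is a hypothetical, stripped-down candidate.
type miningCandidate struct{ Height uint32 }

// candidateCache is a hypothetical cache around an expensive generator.
type candidateCache struct {
	mu       sync.RWMutex
	cached   *miningCandidate
	generate func(height uint32) *miningCandidate
}

// validFor reports whether the cached candidate can be served for height.
// Callers must hold mu (read or write).
func (c *candidateCache) validFor(height uint32) bool {
	return c.cached != nil && c.cached.Height == height
}

func (c *candidateCache) Get(height uint32) *miningCandidate {
	// 1. Fast path: check the cache under the read lock.
	c.mu.RLock()
	if c.validFor(height) {
		cand := c.cached
		c.mu.RUnlock()
		return cand
	}
	c.mu.RUnlock()

	// 2. Take the write lock so only one goroutine generates at a time.
	c.mu.Lock()
	defer c.mu.Unlock()

	// 3. Double-check: another goroutine may have filled the cache
	// between releasing the read lock and acquiring the write lock.
	if c.validFor(height) {
		return c.cached
	}

	// 4. Generate and update the cache.
	c.cached = c.generate(height)
	return c.cached
}
```

Releasing the read lock before taking the write lock avoids any lock upgrade; the second check is what keeps that gap safe.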
Added two new Prometheus metrics to track BlockAssembler state changes:

1. teranode_blockassembly_state_transitions_total{from, to}
   - Counter tracking every state transition
   - Labeled with from/to states
   - Shows transition patterns and frequency

2. teranode_blockassembly_state_duration_seconds{state}
   - Histogram of time spent in each state
   - Buckets: 1ms to 60s
   - Enables P50/P95/P99 duration analysis

These metrics capture state changes between Prometheus scrapes, providing complete visibility into state transitions and durations for Grafana dashboards.
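
To illustrate why recording at transition time captures changes that happen between scrapes, here is a dependency-free sketch. `stateTracker` and its fields are hypothetical; in the real service the map updates would presumably be a labeled Prometheus counter increment and a histogram observation.

```go
package main

import (
	"sync"
	"time"
)

// transition identifies one from/to state pair.
type transition struct{ from, to string }

// stateTracker is a hypothetical in-memory stand-in for the two metrics.
type stateTracker struct {
	mu          sync.Mutex
	current     string
	enteredAt   time.Time
	transitions map[transition]int         // counter per {from, to}
	durations   map[string][]time.Duration // observed time spent per state
}

func newStateTracker(initial string) *stateTracker {
	return &stateTracker{
		current:     initial,
		enteredAt:   time.Now(),
		transitions: make(map[transition]int),
		durations:   make(map[string][]time.Duration),
	}
}

// set records the transition and the time spent in the previous state at the
// moment the state changes, so short-lived states are never missed.
func (t *stateTracker) set(next string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	if next == t.current {
		return // not a transition
	}
	now := time.Now()
	t.transitions[transition{from: t.current, to: next}]++
	t.durations[t.current] = append(t.durations[t.current], now.Sub(t.enteredAt))
	t.current = next
	t.enteredAt = now
}
```

A gauge that only exposes the current state would miss any state entered and left between two scrapes; counting at the call site does not.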
…king

Added comprehensive Grafana dashboard to visualize BlockAssembler state transitions and durations:

Panels:
- State Timeline: Visual timeline showing state changes over time
- State Durations: P50/P95/P99 percentiles for each state
- State Transitions: Rate of state changes per second
- Current State: Real-time state display with color coding
- Time in Current State: Duration gauge with thresholds
- State Distribution: Pie chart showing time percentage per state
- State Entries: Count of state entries in last 5 minutes
- State Transition Matrix: Table view of all transitions

Enables monitoring of:
- Block processing performance (BlockchainSubscription duration)
- Mining candidate generation speed (GetMiningCandidate duration)
- Reorg frequency and duration
- State transition patterns

Includes documentation with example queries, alerting rules, and use cases.
Added missing 'name' field in YAML frontmatter to fix agent parse error.
oskarszoon and others added 30 commits December 3, 2025 13:37
… using filter expressions and separate modules for SetMined
Critical performance optimizations addressing 20ms UDF latency spikes:

1. **setMined() - Use list.remove() instead of array rebuilding**
   - Eliminated O(n) array rebuilding in block invalidation path
   - Replaced serial loop with native list.remove() operations
   - Significantly reduces latency in setMined hot path

2. **Cached #blocks calculations**
   - Prevent repeated length calculations within same function
   - Applied to setMined() and setDeleteAtHeight()
   - Every millisecond counts in single-threaded Lua execution

3. **spendMulti() optimizations**
   - Direct array indexing instead of list_iterator()
   - Cached loop variables (#spends, spentUtxos)
   - Reduced record field lookups with defensive nil checks

4. **HEX_LOOKUP table for byte-to-hex conversion**
   - Pre-computed 256-entry hex lookup table (module-level)
   - Eliminates 36 string_format() calls per transaction
   - Applied to spendingDataBytesToHex() and spendingDataBytesToTxHex()
   - ~10-20x faster hex conversion

5. **Removed dead code**
   - Deleted unused bytesToHex() function (only in commented debug code)

Performance impact:
- setMined(): Eliminates array rebuilding bottleneck
- spendMulti(): ~10-15% faster with cached variables
- Hex conversion: ~10-20x faster with lookup table
- Overall: Addresses observed 20ms latency spikes at scale
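
The lookup-table technique from item 4 is shown below in Go for illustration only; the actual change is in the Lua UDF, and the names `hexLookup` and `toHex` are hypothetical. In ordinary Go code `encoding/hex` would be the usual choice.

```go
package main

// hexLookup holds the two-character lowercase hex string for every byte
// value. It is built once at package initialization.
var hexLookup = func() [256]string {
	const digits = "0123456789abcdef"
	var table [256]string
	for i := range table {
		table[i] = string([]byte{digits[i>>4], digits[i&0x0f]})
	}
	return table
}()

// toHex converts bytes to hex with one table lookup per byte instead of one
// formatting call per byte.
func toHex(b []byte) string {
	out := make([]byte, 0, len(b)*2)
	for _, v := range b {
		out = append(out, hexLookup[v]...)
	}
	return string(out)
}
```

The cost of building the 256 entries is paid once, after which each conversion is a plain indexed read.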

Version: teranode_v51 → teranode_v54
Added configurable hash prefix support across all blob store clients, allowing flexible directory organization in storage backends. Implemented automatic UTXO headers export alongside UTXO sets for improved bootstrapping and chain verification. Enhanced settings marshaling with sorted JSON output for consistent configuration display. Added printSettings flag to CLI commands and force option to seeder for reprocessing control.
…d output

Fixed marshalSortedJSON to properly detect and preserve types implementing json.Marshaler, ensuring hash types output as hex strings instead of byte arrays. Changed behavior to preserve original array/slice order instead of sorting them. Added comprehensive tests for custom marshalers, byte slices, byte arrays, and chainhash.Hash types.
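
One way to get sorted keys while respecting `json.Marshaler` and preserving slice order is a marshal, decode, re-marshal round trip. This is a different and simpler technique than the reflection-based detection the commit describes, shown only to illustrate the intended output; `sortedJSON` and `hash4` are hypothetical names, with `hash4` standing in for a hash type such as chainhash.Hash.

```go
package main

import (
	"bytes"
	"encoding/hex"
	"encoding/json"
)

// hash4 is a hypothetical short hash type with a custom JSON encoding.
type hash4 [4]byte

// MarshalJSON emits the hash as a hex string instead of an array of numbers.
func (h hash4) MarshalJSON() ([]byte, error) {
	return json.Marshal(hex.EncodeToString(h[:]))
}

// sortedJSON encodes v with object keys in sorted order.
func sortedJSON(v any) ([]byte, error) {
	// The first pass honours every json.Marshaler implementation in v.
	raw, err := json.Marshal(v)
	if err != nil {
		return nil, err
	}

	// Decode into generic values: objects become map[string]any and
	// arrays become []any, which keeps their original element order.
	dec := json.NewDecoder(bytes.NewReader(raw))
	dec.UseNumber() // keep numbers exactly as written
	var generic any
	if err := dec.Decode(&generic); err != nil {
		return nil, err
	}

	// encoding/json writes map keys in sorted order.
	return json.Marshal(generic)
}
```

For a struct with fields `Zeta`, `Alpha` and a `hash4` field, the output lists the keys alphabetically, keeps the elements of `Alpha` in their original order, and renders the hash as a hex string.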
4 participants