Skip to content

Conversation

@jotak
Copy link
Member

@jotak jotak commented Dec 10, 2025

Description

  • Introduce a configurable reconnection timer + entropy on GRPC client
  • Add tests

Observing the reconnections:

Capture d’écran du 2025-12-10 18-01-21

Dependencies

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
    • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
    • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
    • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
    • Standard QE validation, with pre-merge tests unless stated otherwise.
    • Regression tests only (e.g. refactoring with no user-facing change).
    • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

To run a perfscale test, comment with: /test ebpf-node-density-heavy-25nodes

- Introduce a configurable reconnection timer + entropy on GRPC client
- Add tests
@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Dec 10, 2025

@jotak: This pull request references NETOBSERV-2489 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Description

  • Introduce a configurable reconnection timer + entropy on GRPC client
  • Add tests

Dependencies

  • Operator:

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
  • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
  • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

To run a perfscale test, comment with: /test ebpf-node-density-heavy-25nodes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci
Copy link

openshift-ci bot commented Dec 10, 2025

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by:
Once this PR has been reviewed and has the lgtm label, please assign oliviercazade for approval. For more information see the Code Review Process.

The full list of commands accepted by this bot can be found here.

Details Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Dec 10, 2025

@jotak: This pull request references NETOBSERV-2489 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Description

  • Introduce a configurable reconnection timer + entropy on GRPC client
  • Add tests

Observing the reconnections:

Capture d’écran du 2025-12-10 18-01-21

Dependencies

  • Operator:

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
  • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
  • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

To run a perfscale test, comment with: /test ebpf-node-density-heavy-25nodes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Dec 10, 2025

@jotak: This pull request references NETOBSERV-2489 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.21.0" version, but no target version was set.

Details

In response to this:

Description

  • Introduce a configurable reconnection timer + entropy on GRPC client
  • Add tests

Observing the reconnections:

Capture d’écran du 2025-12-10 18-01-21

Dependencies

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
  • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
  • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

To run a perfscale test, comment with: /test ebpf-node-density-heavy-25nodes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jotak jotak added the needs-review Tells that the PR needs a review label Dec 12, 2025
Copy link
Contributor

@jpinsonneau jpinsonneau left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as initial solution.

If we want to improve this in future, it seems that gRPC is recommending a client aware approach:
https://grpc.io/blog/grpc-load-balancing/#recommendations-and-best-practices

And kubernetes focuses on service mesh:
https://kubernetes.io/blog/2018/11/07/grpc-load-balancing-on-kubernetes-without-tears/

Those could be interesting if we face balancing or performances issues with the current implementation.

@openshift-ci openshift-ci bot added the lgtm label Dec 17, 2025
@jotak jotak removed the needs-review Tells that the PR needs a review label Dec 17, 2025
@Amoghrd
Copy link
Member

Amoghrd commented Dec 17, 2025

/ok-to-test

@openshift-ci openshift-ci bot added the ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. label Dec 17, 2025
@github-actions
Copy link

New images:
quay.io/netobserv/ebpf-bytecode:3fb3a9a
quay.io/netobserv/netobserv-ebpf-agent:3fb3a9a

These will expire after two weeks.

To deploy this build, run from the operator repo, assuming the operator is running:

USER=netobserv VERSION=3fb3a9a make set-agent-image

@openshift-ci
Copy link

openshift-ci bot commented Dec 17, 2025

@jotak: The following tests failed, say /retest to rerun all failed tests or /retest-required to rerun all mandatory failed tests:

Test name Commit Details Required Rerun command
ci/prow/netobserv-cli-tests d71f05d link false /test netobserv-cli-tests
ci/prow/qe-e2e-tests d71f05d link false /test qe-e2e-tests

Full PR test history. Your PR dashboard.

Details

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here.

@Amoghrd
Copy link
Member

Amoghrd commented Dec 18, 2025

/label qe-approved

@openshift-ci openshift-ci bot added the qe-approved QE has approved this pull request label Dec 18, 2025
@openshift-ci-robot
Copy link
Collaborator

openshift-ci-robot commented Dec 18, 2025

@jotak: This pull request references NETOBSERV-2489 which is a valid jira issue.

Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the bug to target the "4.22.0" version, but no target version was set.

Details

In response to this:

Description

  • Introduce a configurable reconnection timer + entropy on GRPC client
  • Add tests

Observing the reconnections:

Capture d’écran du 2025-12-10 18-01-21

Dependencies

Checklist

If you are not familiar with our processes or don't know what to answer in the list below, let us know in a comment: the maintainers will take care of that.

  • Will this change affect NetObserv / Network Observability operator? If not, you can ignore the rest of this checklist.
  • Is this PR backed with a JIRA ticket? If so, make sure it is written as a title prefix (in general, PRs affecting the NetObserv/Network Observability product should be backed with a JIRA ticket - especially if they bring user facing changes).
  • Does this PR require product documentation?
  • If so, make sure the JIRA epic is labelled with "documentation" and provides a description relevant for doc writers, such as use cases or scenarios. Any required step to activate or configure the feature should be documented there, such as new CRD knobs.
  • Does this PR require a product release notes entry?
  • If so, fill in "Release Note Text" in the JIRA.
  • Is there anything else the QE team should know before testing? E.g: configuration changes, environment setup, etc.
  • If so, make sure it is described in the JIRA ticket.
  • QE requirements (check 1 from the list):
  • Standard QE validation, with pre-merge tests unless stated otherwise.
  • Regression tests only (e.g. refactoring with no user-facing change).
  • No QE (e.g. trivial change with high reviewer's confidence, or per agreement with the QE team).

To run a perfscale test, comment with: /test ebpf-node-density-heavy-25nodes

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the openshift-eng/jira-lifecycle-plugin repository.

@jotak jotak merged commit 0ca32f1 into netobserv:main Dec 19, 2025
11 of 14 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

jira/valid-reference lgtm ok-to-test To set manually when a PR is safe to test. Triggers image build on PR. qe-approved QE has approved this pull request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants