Skip to content

Conversation

@Nukitt
Copy link
Contributor

@Nukitt Nukitt commented Nov 24, 2025

Currently, we use global overrides to configure testserver's tenant and DRPC options. These were also using defer statements, but it was concluded that there is not much value in actually performing the cleanup function to reset the global state option in a defer, since the test process exits immediately after finishing the tests.

This change introduces the framework changes required to consolidate the configuration into InitTestServerFactory. For now, we also have the global overrides present to maintain backward compatibility until we can have all the usages of the TestingSetDefaultTenantSelectionOverride and TestingGlobalDRPCOption removed.
We also change the priority order of changing these configs, highest priority is given to the config set in the test itself, then the global override, then factory default and then the enviorment variable change through the cli.

After this changes, we can simply add the configuration we want for the package into InitTestServerFactory. Example usage of it would look like:

serverutils.InitTestServerFactory(server.TestServerFactory,
    serverutils.WithDRPCOption(base.TestDRPCEnabled),
    serverutils.WithTenantOption(base.TestTenantProbabilisticOnly,
)

Epic: None
Fixes: #158004
Release note: None

@blathers-crl
Copy link

blathers-crl bot commented Nov 24, 2025

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@Nukitt Nukitt marked this pull request as ready for review November 24, 2025 16:15
@github-actions
Copy link

Potential Bug(s) Detected

The three-stage Claude Code analysis has identified potential bug(s) in this PR that may warrant investigation.

Next Steps:
Please review the detailed findings in the workflow run.

Note: When viewing the workflow output, scroll to the bottom to find the Final Analysis Summary.

After you review the findings, please tag the issue as follows:

  • If the detected issue is real or was helpful in any way, please tag the issue with O-AI-Review-Real-Issue-Found
  • If the detected issue was not helpful in any way, please tag the issue with O-AI-Review-Not-Helpful

@github-actions github-actions bot added the o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. label Nov 24, 2025
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice! Just a couple of question from me.

@yuzefovich reviewed 1 of 1 files at r1, all commit messages.
Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cthumuluru-crdb)


pkg/testutils/serverutils/test_server_shim.go line 106 at r1 (raw file):

	// we are done.
	if !baseArg.TestTenantNoDecisionMade() {
		if t != nil {

nit: can t ever be nil actually? It looks like we only have non-nil check in 2 places but we log in more, and I wonder whether we could remove these nil checks altogether.


pkg/testutils/serverutils/test_server_shim.go line 129 at r1 (raw file):

	if factoryDefaultTenant != nil {
		defaultArg := *factoryDefaultTenant
		// If factory default made a decision, return it

nit: missing period.


pkg/testutils/serverutils/test_server_shim.go line 326 at r1 (raw file):

	// Update the flags with the actual decisions for test configuration.
	// Priority of these Should* functions:
	// Test explicit value > global override > factory defaults > env vars > metamorphic.

Hm, I feel like the env var should have the highest priority except when the test itself has explicit value (in my mind, if an engineer is specifying the env var, they truly want to enable it pretty much everywhere applicable for the invocation, regardless of different overrides), WDYT?

Copy link
Contributor Author

@Nukitt Nukitt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TFTR!

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @cthumuluru-crdb and @yuzefovich)


pkg/testutils/serverutils/test_server_shim.go line 106 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: can t ever be nil actually? It looks like we only have non-nil check in 2 places but we log in more, and I wonder whether we could remove these nil checks altogether.

the nil check is required imo, and as you pointed out is missing and instead needs to be added in those places. This t comes from StartServerOnlyE which has it set as optional hence it can be nil.

// The first argument is optional. If non-nil; it is used for logging
// server configuration messages.
func StartServerOnlyE(t TestLogger, params base.TestServerArgs) (TestServerInterface, error) {

We have existing tests where t is not set, but since the troubling codepath is never being exercised (since cli tests have tenants disabled and no IssueRef)

func ShouldStartDefaultTestTenant(
t TestLogger, baseArg base.DefaultTestTenantOptions,
) (retval base.DefaultTestTenantOptions) {
// Explicit case for disabling the default test tenant.
if baseArg.TestTenantAlwaysDisabled() {
if issueNum, label := baseArg.IssueRef(); issueNum != 0 {
t.Logf("cluster virtualization disabled due to issue: #%d (expected label: %s)", issueNum, label)
}
return baseArg
When we actually do exercise the log here(commenting out the conditional) we get panic: runtime error: invalid memory address or nil pointer dereference. We never encountered the panic it generates before due to this. More evidence:
// TestCLI wraps a test server and is used by tests to make assertions about the output of CLI commands.
type TestCLI struct {
// Insecure is a copy of the insecure mode parameter.
Insecure bool
Server serverutils.TestServerInterface
tenant serverutils.ApplicationLayerInterface
certsDir string
cleanupFunc func() error
prevStderr *os.File
// t is the testing.T instance used for this test.
// Example_xxx tests may have this set to nil.
t testing.TB

Added the nil checks now.


pkg/testutils/serverutils/test_server_shim.go line 129 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

nit: missing period.

done.


pkg/testutils/serverutils/test_server_shim.go line 326 at r1 (raw file):

Previously, yuzefovich (Yahor Yuzefovich) wrote…

Hm, I feel like the env var should have the highest priority except when the test itself has explicit value (in my mind, if an engineer is specifying the env var, they truly want to enable it pretty much everywhere applicable for the invocation, regardless of different overrides), WDYT?

I find that a lot of times apart from setting an explicit value in the test, people do the same for package level too. For example, I've seen us turning off drpc for a whole package due failing tests, or secondary tenants are disabled for a whole package. So if a test/package has a explicit value for the config, we should not be overriding it since its already known to fail. (I personally faced this a lot when testing for different tenant modes with drpc. I change the env var to set tenant and drpc -> it fails -> turns out it was failing without drpc too.)
Thinking about the use-case of the env var, imo most of the engineers would want to just use it to make sure the changes they did still work on different configs, if its already failing before then it shouldn't worry them. This is why setting the env var lower made sense to me.
Maybe we can have another god-like env var that can do this to enable for everything but that just adds more complexity and confusion on what to use down the line.
Let me know your thoughts on this.

@Nukitt Nukitt force-pushed the initserverfactory-config branch from 375304f to cd14adb Compare November 25, 2025 11:30
Copy link
Member

@yuzefovich yuzefovich left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

@yuzefovich reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @cthumuluru-crdb and @Nukitt)


pkg/testutils/serverutils/test_server_shim.go line 326 at r1 (raw file):

Previously, Nukitt (Nukit) wrote…

I find that a lot of times apart from setting an explicit value in the test, people do the same for package level too. For example, I've seen us turning off drpc for a whole package due failing tests, or secondary tenants are disabled for a whole package. So if a test/package has a explicit value for the config, we should not be overriding it since its already known to fail. (I personally faced this a lot when testing for different tenant modes with drpc. I change the env var to set tenant and drpc -> it fails -> turns out it was failing without drpc too.)
Thinking about the use-case of the env var, imo most of the engineers would want to just use it to make sure the changes they did still work on different configs, if its already failing before then it shouldn't worry them. This is why setting the env var lower made sense to me.
Maybe we can have another god-like env var that can do this to enable for everything but that just adds more complexity and confusion on what to use down the line.
Let me know your thoughts on this.

Ack, sounds reasonable to me.

Copy link
Contributor

@cthumuluru-crdb cthumuluru-crdb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a minor comment. Otherwise looks good.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this comment needs revision?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry, I was referring to line 246

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done.

@cthumuluru-crdb
Copy link
Contributor

@Nukitt, can you reword the commit title so it can be fit in a single line before you commit?

Currently, we use global overrides to configure testserver's
tenant and DRPC options. These were also using defer statements,
but it was concluded that there is not much value in actually
performing the cleanup function to reset the global state option
in a defer, since the test process exits immediately after
finishing the tests.

This change introduces the framework changes required to
consolidate the configuration into InitTestServerFactory.
For now, we also have the global overrides present to
maintain backward compatibility until we can have all
the usages of the `TestingSetDefaultTenantSelectionOverride`
and `TestingGlobalDRPCOption` removed.
We also change the priority order of changing these configs,
highest priority is given to the config set in the test itself,
then the global override, then factory default and then the
enviorment variable change through the cli.

After this changes, we can simply add the configuration we
want for the package into `InitTestServerFactory`. Example
usage of it would look like:

```go
serverutils.InitTestServerFactory(server.TestServerFactory,
    serverutils.WithDRPCOption(base.TestDRPCEnabled),
    serverutils.WithTenantOption(base.TestTenantProbabilisticOnly,
)
```

Epic: None
Fixes: cockroachdb#158004
Release note: None
@Nukitt Nukitt force-pushed the initserverfactory-config branch from cd14adb to dd1d13d Compare November 26, 2025 07:48
@Nukitt Nukitt changed the title testserver: consolidate testserver configuration into InitTestServerF… testserver: consolidate testserver config to InitTestServerFactory Nov 26, 2025
@Nukitt
Copy link
Contributor Author

Nukitt commented Nov 26, 2025

TFTRs! 😄

bors r=yuzefovich, cthumuluru-crdb

@craig
Copy link
Contributor

craig bot commented Nov 26, 2025

@craig craig bot merged commit 65848d0 into cockroachdb:master Nov 26, 2025
25 of 26 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

o-AI-Review-Potential-Issue-Detected AI reviewer found potential issue. Never assign manually—auto-applied by GH action only. target-release-26.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Consolidate test server configuration into InitTestServerFactory

4 participants