Skip to content

Conversation

@changlei-li
Copy link
Contributor

No description provided.

Copy link
Member

@psafont psafont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the API surface could be minimized further to avoid footguns. We need to be careful as well with the expectations of setting time pool-wide, this can't be simply implemented naively as it can be very problematic.

I would also ask you to consider dropping the mode that NTP is not used at all: this means that server time in a pool can drift considerably. I would recommend instead a mode where the server run NTP servers and synchronize between themselves. This means that they might not be accurate, but at least they will be synchronized.

The call to set time and get time need changes as well, depending on the previous decision, it might make more sense to use pool-wide calls for these.

Comment on lines 42 to 43
`host.ntp_mode`, enum host_ntp_mode(ntp_mode_dhcp, ntp_mode_custom, ntp_mode_default)
`host.ntp_custom_servers`, string set
`host.ntp_enabled`, bool
Copy link
Member

@psafont psafont Nov 6, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think merging ntp_enabled and ntp_mode would simplify the control of the feature:

let ntp_mode = DHCP | Manual | Factory | Disabled

To enable ntp mode, users can call set_ntp_mode DHCP; set_ntp_mode Factory; or set_ntp_servers [ntp1.server.com; ntp2.server.com]. And to disable call set_ntp_mode Disabled

And to check whether NTP is enabled, and which mode it is, just call get_ntp_mode

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It should be Manual of string list, but we can't express that in the IDL.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These mode and enabled fields were also implemented and reviewed as part of the earlier PR #6689 already. This design doc has them for completeness, but it a bit late to change now...

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see why can't they be changed, they haven't merged to master yet, nor released :)

Copy link
Contributor Author

@changlei-li changlei-li Nov 14, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need some more rework both for dev and QA. But it is good that this topic is not related with the feature scope. I changed the doc to meet it.

@psafont In your example, set_ntp_servers is misleading that user may think it represents the real NTP servers no matter in any NTP mode. In fact, it just applies to ntp mode Manual. As Edwin said, the best choice is Manual of string list, but we can't use like that now. So I change host.ntp_custom_servers to host.ntp_manual_servers(not host.ntp_servers in your example) to show they are in direct relations. Is this your last against point in the PR? Hope to have your response soon.
And @robhoes please let me know if you have opinions on this change. Thank you.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@changlei-li I'm not sure I follow that... If the mode is called custom then that would match a field called custom_servers?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

...and I think the term "custom" suits better than "manual", so IMO we don't need any changes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The idea was that calling set_ntp_servers changed the mode to manual to enforce the servers being passed by this call, we can change the name if you think it's more clear

@changlei-li changlei-li force-pushed the private/changleli/add-ntp-doc branch from 6c68616 to 2bf0903 Compare November 7, 2025 02:29
@changlei-li
Copy link
Contributor Author

changlei-li commented Nov 7, 2025

I think the API surface could be minimized further to avoid footguns. We need to be careful as well with the expectations of setting time pool-wide, this can't be simply implemented naively as it can be very problematic.

pool-level set APIs don't contain time setting. Sorry I didn't show it explicitly in chapter Pool level set APIs but mentioned in use cases.

I would also ask you to consider dropping the mode that NTP is not used at all: this means that server time in a pool can drift considerably. I would recommend instead a mode where the server run NTP servers and synchronize between themselves. This means that they might not be accurate, but at least they will be synchronized.

"Dropping the mode that NTP is not used at all" You mean not to provide a disable-ntp command but a internal ntp server in the pool. That's a big change.

The call to set time and get time need changes as well, depending on the previous decision, it might make more sense to use pool-wide calls for these.

Your suggestion makes sense. By the internal NTP sever, we can set the server time, then the pool time will be synced. It's good to add mode like ntp_mode_internal in the future.
But not providing the ntp disable is a little arbitrary? If we keep the disable-ntp option, we need the local-time settings.

@psafont
Copy link
Member

psafont commented Nov 7, 2025

"Dropping the mode that NTP is not used at all" You mean not to provide a disable-ntp command but a internal ntp server in the pool. That's a big change.

yes, it's a big change, but it's important to keep it in mind if customers start seeing strange issues when using the manual mode, we might have to provide it as a more robust solution, and deprecate manual mode in the future.

Your suggestion makes sense. By the internal NTP sever, we can set the server time, then the pool time will be synced. It's good to add mode like ntp_mode_internal in the future.
But not providing the ntp disable is a little arbitrary? If we keep the disable-ntp option, we need the local-time settings.

Well, if there's an internal NTP server, the API pool.set_time can be provided, and the disable ntp can be disregarded. I do agree that's for the future, I doubt having such a feature can be done in time for 9.0

@changlei-li
Copy link
Contributor Author

changlei-li commented Nov 10, 2025

it's important to keep it in mind if customers start seeing strange issues when using the manual mode, we might have to provide it as a more robust solution, and deprecate manual mode in the future

Do you mean the disable NTP? I don't see issues on the manual/custom mode.

I doubt having such a feature can be done in time for 9.0

Maybe not. They are not on the plan.

@psafont Let me summary the key points

  • NTP. You think we shouldn't provide the disable option (Then the host.set-time is needless too) as it may cause strange issues. I think it is the option for user to select.
  • time setting. You think we shouldn't provide the time without timezone even there are separate set/get_timezone APIs. I have answered in the inline comments above. Please have a look.

Please confirm the two points. They are mostly the requirement level which we have reviewed internally. I don't think we need to have changes if no defects in the current scope. If there are defects, of course I'm glad to accept and fix.

@psafont
Copy link
Member

psafont commented Nov 12, 2025

Please confirm the two points. They are mostly the requirement level which we have reviewed internally. I don't think we need to have changes if no defects in the current scope. If there are defects, of course I'm glad to accept and fix.

  • I think allowing users to manually set time will lead to strange issues because of time differences, especially in large pools. This is why I prefer to not expose the mode at all, if time permits. We can expose this for the time being, document the fact to recommend API clients to not use this mode.

  • Both host.get_server_localtime and host.set_server_localtime must use only timestamps with time difference from UTC, as explained in the newest comments about edge cases when using them.

@robhoes
Copy link
Member

robhoes commented Nov 12, 2025

I think we need to see these new calls in the right context, which is to provide a way to use the API (and external clients such as XC and XO) to configure NTP, where users currently need to log in to the host and either use xsconsole or use shell commands.

Looking at the NTP options in xsconsole is useful to guide us what we'd need in the API. And I think that is what the current API design was inspired by: xsconsole has options to configure NTP servers through DHCP, set them manually, or accept default servers. It also has an option to disable NTP and set time manually. All of this is at the host level.

I do agree that turning off NTP is generally not desirable. But people may have hosts currently set up like this already, which we'd want to support. I think it makes sense to have feature parity with xsconsole in the API, so that all current use cases are met. Also xsconsole can then use the API itself. Other interesting improvements have been discussed to avoid hosts in a pool getting out of sync, but those will need to be designed and planned separately.

@changlei-li changlei-li force-pushed the private/changleli/add-ntp-doc branch from 2bf0903 to d98a362 Compare November 13, 2025 09:59
@changlei-li
Copy link
Contributor Author

Hi @psafont I have updated the doc. Please review again.

  • Remove all the pool level APIs
  • host.set_server_localtime to host.set_servertime, accept the iso8601 format with UTC offset.

@changlei-li changlei-li force-pushed the private/changleli/add-ntp-doc branch from d98a362 to a15b50e Compare November 14, 2025 06:25
Copy link
Member

@psafont psafont left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for accepting the changes on set time and the minimization of the API. I think there's one last change that needs to be done before I'm happy. It might seem trivial, but the format accepted is quite important, it reduces the complexity of implementation

Signed-off-by: Changlei Li <changlei.li@cloud.com>
@changlei-li changlei-li force-pushed the private/changleli/add-ntp-doc branch from a15b50e to 4d9bf6c Compare November 14, 2025 09:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants