hanouticelina
Contributor

This PR implements a CLI to manage Inference Endpoints. It provides one-liners to deploy, delete, update, etc. endpoints, which could be handy in many cases. The DX intentionally mirrors the UI a bit rather than the API; to quote @ErikKaum:

we're renaming things in the UI quite fast to adapt and make things make more sense. And in many cases in the UI things are configured with slightly different names/groupings than in the API. Just because it's faster than in the API.

I explored a few layouts (e.g. a single deploy command with --catalog), but the cleanest UX ended up being two explicit paths:

  • hf inference-endpoints deploy hub ... – minimal set of hardware/task configs for Hub models.
  • hf inference-endpoints deploy catalog ... – one liner using optimized configs from the model catalog.

The delete and update actions currently live under the "Settings" group in the UI, but it feels more natural to keep them top-level in the CLI 🤷‍♀️

> hf inference-endpoints --help
Usage: hf inference-endpoints [OPTIONS] COMMAND [ARGS]...

  Manage Hugging Face Inference Endpoints.

Options:
  --help  Show this message and exit.

Commands:
  delete         Delete an Inference Endpoint permanently.
  deploy         Deploy Inference Endpoints from the Hub or the Catalog.
  describe       Get information about an Inference Endpoint.
  list           Lists all inference endpoints for the given namespace.
  list-catalog   List available Catalog models.
  pause          Pause an Inference Endpoint.
  resume         Resume an Inference Endpoint.
  scale-to-zero  Scale an Inference Endpoint to zero.
  update         Update an existing endpoint.

Happy to iterate more if there are suggestions to make the DX better (and simpler?).

from ._cli_utils import TokenOpt, get_hf_api, typer_factory


logger = logging.get_logger(__name__)

not related to this PR, but the logger isn't used here (same in the other command files)



@deploy_app.command(name="hub", help="Deploy an Inference Endpoint from a Hub repository.")
def deploy_from_hub(

Split the deploy into two subcommands instead of using a flag to deploy from the Model Catalog. Typer doesn't easily allow conditional requirements (i.e., "these parameters are required unless that flag (e.g. --from-catalog) is set"), which makes validation and type hints messy. Using subcommands lets Typer enforce the required options cleanly for each case.
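
To illustrate the trade-off, here is a minimal sketch of the "one subcommand per case" layout. It uses the standard library's argparse rather than Typer, purely so it runs without extra dependencies, and it is not the code from this PR. The program name and the options (`--repo`, `--instance-type`, `--name`) are hypothetical and chosen only for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical program name; not the real `hf` CLI.
    parser = argparse.ArgumentParser(prog="endpoints-sketch")
    subparsers = parser.add_subparsers(dest="source", required=True)

    # "hub" path: every hardware option is required (option names are illustrative).
    hub = subparsers.add_parser("hub", help="Deploy from a Hub repository.")
    hub.add_argument("name")
    hub.add_argument("--repo", required=True)
    hub.add_argument("--instance-type", required=True)

    # "catalog" path: only the model is required, the rest comes from the catalog.
    catalog = subparsers.add_parser("catalog", help="Deploy from the model catalog.")
    catalog.add_argument("--repo", required=True)
    catalog.add_argument("--name", default=None)
    return parser


def parse(argv: list[str]) -> dict:
    # Return the parsed arguments as a plain dict for easy inspection.
    return vars(build_parser().parse_args(argv))
```

With this layout each subparser declares its own required options, so no "required unless `--from-catalog` is set" logic is needed: omitting `--instance-type` under `hub` is rejected by the parser itself, while `catalog` never asks for it.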

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
