Experimental files API client hangs in Databricks Apps + docs/code mismatch on default behavior #1153

@arcaputo3

Summary

Two related issues with the experimental files API client:

  1. Hangs in Databricks Apps: File downloads hang indefinitely inside Databricks Apps, apparently because the presigned URL request is blocked by network restrictions
  2. Docs/code mismatch: Documentation says experimental features are OFF by default, but the SDK code has them ON by default

Environment

  • SDK Version: 0.74.0 (also reproduced with 0.57.0)
  • Runtime: Databricks Apps (Azure)
  • Python: 3.11+, 3.14
  • Package Manager: uv (also reproducible with pip)

Issue 1: Hangs in Databricks Apps

Steps to Reproduce

  1. Create a Databricks App that downloads files from Unity Catalog volumes
  2. Use workspace.files.download(path) to download a file
  3. App hangs indefinitely (until any configured HTTP timeout kicks in)

Expected Behavior

File download should complete (either via presigned URL or fallback to Files API).

Actual Behavior

The download hangs silently. Based on SDK source analysis:

  1. SDK tries presigned URL path (/api/2.0/fs/create-download-url → direct Azure Blob download)
  2. In Databricks Apps, the direct Azure Blob request appears to be blocked (firewall/network policy?)
  3. The request never returns (no 403, no timeout) - just hangs
  4. The Files API fallback code exists but never triggers because the presigned URL attempt doesn't fail

Workaround

Setting DATABRICKS_DISABLE_EXPERIMENTAL_FILES_API_CLIENT=true forces use of the standard Files API, which works correctly.
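
For anyone hitting the same hang, a minimal sketch of applying the workaround from Python rather than from the app's environment configuration. It assumes the variable is set before the client is constructed; `read_volume_file` is a hypothetical helper name, not part of the SDK.

```python
import os

# Opt out of the experimental client. Set this before WorkspaceClient()
# is constructed so the SDK config picks it up.
os.environ["DATABRICKS_DISABLE_EXPERIMENTAL_FILES_API_CLIENT"] = "true"


def read_volume_file(path: str) -> bytes:
    """Hypothetical helper: download one file via workspace.files.download."""
    # Imported here so the environment variable is already in place.
    from databricks.sdk import WorkspaceClient

    workspace = WorkspaceClient()
    response = workspace.files.download(path)
    return response.contents.read()
```

With the variable set, the download goes through the standard Files API path described above instead of the presigned URL path.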

Issue 2: Documentation/Code Mismatch

Documentation says (from Files API docs):

Some Files API client features are currently experimental. To enable them, set enable_experimental_files_api_client = True in your configuration profile or use the environment variable DATABRICKS_ENABLE_EXPERIMENTAL_FILES_API_CLIENT=True.

Actual SDK behavior:

# config.py - only has DISABLE flag, no ENABLE flag
disable_experimental_files_api_client: bool = ConfigAttribute(
    env="DATABRICKS_DISABLE_EXPERIMENTAL_FILES_API_CLIENT"
)
# Default: None (falsy) → experimental client is ENABLED by default

# files.py - uses the disable flag
if self._config.disable_experimental_files_api_client:
    _LOG.info("Disable experimental files API client, will use the original download method.")
    return super().download(file_path)
# else: uses experimental client (presigned URLs)

Verified the default:

>>> from databricks.sdk.config import Config
>>> c = Config()
>>> print(c.disable_experimental_files_api_client)  # None is falsy, so experimental is ENABLED
None

Summary:

  • Docs say experimental is OFF by default, use ENABLE=True to opt-in
  • Code has experimental ON by default, use DISABLE=True to opt-out
  • The ENABLE env var mentioned in docs doesn't exist
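
To make the opt-out semantics concrete, here is a simplified model of the check quoted from files.py. It is only an illustration of the truthiness logic: `uses_experimental_client` is a hypothetical name, and the sketch does not reproduce how the SDK parses the environment variable.

```python
from typing import Optional


def uses_experimental_client(disable_flag: Optional[bool]) -> bool:
    """Model of `if self._config.disable_experimental_files_api_client:`.

    The experimental path is skipped only when the flag is truthy.
    """
    return not disable_flag


# Unset (None) and False both leave the experimental client active.
print(uses_experimental_client(None))
print(uses_experimental_client(False))
print(uses_experimental_client(True))
```

Under this model the only way to reach the original download method is to set the disable flag explicitly, which is the opposite of what the docs describe.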

Observations

  • Works locally: Same code downloads files successfully on local machine (presigned URL succeeds or fallback triggers properly)
  • Works with env var: Disabling experimental client makes downloads work in Apps
  • Silent failure: No SDK logging indicates the presigned URL attempt is hanging
  • Fallback never triggers: The presigned URL request hangs rather than failing, so fallback code is never reached

Suggestions

For Issue 1 (Hangs):

  1. Add timeout to presigned URL attempts - Don't hang indefinitely
  2. Add logging - Log when attempting presigned URL path and if it's taking too long
  3. Document Apps limitation - Note that presigned URLs may not work in Apps environments
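
A rough sketch of what suggestions 1 and 2 could look like from the caller's side, using only the standard library. This is not the SDK's implementation: `download_with_fallback`, `primary`, and `fallback` are hypothetical names, and the two callables stand in for the presigned URL path and the standard Files API path.

```python
import logging
import threading
from typing import Callable

_LOG = logging.getLogger("download_with_fallback")


def download_with_fallback(
    primary: Callable[[], bytes],
    fallback: Callable[[], bytes],
    timeout_s: float,
) -> bytes:
    """Run primary in a daemon thread; use fallback if it fails or stalls."""
    result: dict = {}

    def worker() -> None:
        try:
            result["data"] = primary()
        except Exception as exc:  # hand the failure back to the calling thread
            result["error"] = exc

    _LOG.info("Attempting primary download (timeout %.1fs)", timeout_s)
    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    thread.join(timeout_s)

    if thread.is_alive():
        # The primary attempt is hanging; abandon it and say so in the log.
        _LOG.warning("Primary download still running after %.1fs; falling back", timeout_s)
        return fallback()
    if "error" in result:
        _LOG.warning("Primary download failed (%s); falling back", result["error"])
        return fallback()
    return result["data"]
```

The abandoned thread keeps running in the background, so this is only a stopgap. A fix inside the SDK would more likely pass a timeout to the underlying HTTP request so the hang turns into an error that triggers the existing fallback.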

For Issue 2 (Docs mismatch):

  1. Update docs to match actual behavior (experimental is ON by default)
  2. Or change code to match docs (make experimental opt-in)
  3. Consider adding the DATABRICKS_ENABLE_EXPERIMENTAL_FILES_API_CLIENT env var that docs mention

Additional Context

This took significant debugging time (~4 hours) because:

  • The failure mode (silent hang) gave no indication of what was failing
  • The code works locally, so the failure is hard to reproduce outside Databricks Apps
  • Initial debugging focused on permissions, scopes, effective_user_api_scopes, and auth (all red herrings)
  • The documentation led us to believe experimental features were disabled

We documented our debugging journey here: https://github.com/TJC-LP/chat-tjc-deep-research-mcp/blob/main/docs/KNOWN_ISSUES.md

Happy to provide more details or test any fixes!
