diff --git a/README.md b/README.md index afcf3a2..923fbfa 100644 --- a/README.md +++ b/README.md @@ -174,13 +174,64 @@ Display the current version of the package. #### `datacustomcode configure` Configure credentials for connecting to Data Cloud. +**Prerequisites:** +- An [External Client App](#creating-an-external-client-app) with OAuth settings configured +- For OAuth Tokens authentication: [refresh token and core token](#obtaining-refresh-token-and-core-token) + Options: - `--profile TEXT`: Credential profile name (default: "default") +- `--auth-type TEXT`: Authentication method (`oauth_tokens` or `username_password`, default: `oauth_tokens`) +- `--login-url TEXT`: Salesforce login URL + +For Username/Password authentication: - `--username TEXT`: Salesforce username - `--password TEXT`: Salesforce password - `--client-id TEXT`: Connected App Client ID - `--client-secret TEXT`: Connected App Client Secret -- `--login-url TEXT`: Salesforce login URL + +For OAuth Tokens authentication: +- `--client-id TEXT`: Connected App Client ID +- `--client-secret TEXT`: Connected App Client Secret +- `--refresh-token TEXT`: OAuth refresh token (see [Obtaining Refresh Token](#obtaining-refresh-token-and-core-token)) +- `--core-token TEXT`: (Optional) OAuth core/access token; if omitted, it is obtained using the refresh token + +##### Using Environment Variables (Alternative) + +Instead of using `datacustomcode configure`, you can set credentials via environment variables. + +> [!NOTE] +> When `SFDC_LOGIN_URL` is set, credentials are read from environment variables and the credentials INI file is ignored, even if both are present. 
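As a rough illustration of the precedence rule in the note above, the sketch below mirrors the check that `Credentials.from_available` performs in this change: environment variables are used whenever `SFDC_LOGIN_URL` is set, otherwise the INI file is tried. `pick_source` is a hypothetical helper written only for this example; it is not part of the SDK, and the default path is assumed from `INI_FILE` in `credentials.py`.

```python
import os
from typing import Mapping, Optional


def pick_source(
    env: Mapping[str, str],
    ini_path: str = "~/.datacustomcode/credentials.ini",
) -> Optional[str]:
    """Return which credential source would be chosen: "env", "ini", or None.

    Hypothetical helper for illustration. It assumes that a non-empty
    SFDC_LOGIN_URL is the signal that selects environment variables, and
    that the INI file is consulted only when that variable is absent.
    """
    # Environment variables win as soon as SFDC_LOGIN_URL is set.
    if env.get("SFDC_LOGIN_URL"):
        return "env"
    # Otherwise fall back to the INI file, if it exists on disk.
    if os.path.exists(os.path.expanduser(ini_path)):
        return "ini"
    # Neither source is available.
    return None
```

Calling `pick_source(os.environ)` before running a script is one way to check which source would be picked up in the current shell.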
+ +**Common (all auth types):** +| Variable | Description | +|----------|-------------| +| `SFDC_LOGIN_URL` | Salesforce login URL (e.g., `https://login.salesforce.com`) | +| `SFDC_CLIENT_ID` | External Client App Client ID | +| `SFDC_AUTH_TYPE` | (Optional) Authentication type: `oauth_tokens` (default) or `username_password` | + +**For OAuth Tokens authentication (`SFDC_AUTH_TYPE=oauth_tokens`):** +| Variable | Description | +|----------|-------------| +| `SFDC_CLIENT_SECRET` | External Client App Client Secret | +| `SFDC_REFRESH_TOKEN` | OAuth refresh token | +| `SFDC_CORE_TOKEN` | (Optional) OAuth core/access token | + +**For Username/Password authentication (`SFDC_AUTH_TYPE=username_password`):** +| Variable | Description | +|----------|-------------| +| `SFDC_USERNAME` | Salesforce username | +| `SFDC_PASSWORD` | Salesforce password | +| `SFDC_CLIENT_SECRET` | External Client App Client Secret | + +Example usage: +```bash +export SFDC_LOGIN_URL="https://login.salesforce.com" +export SFDC_CLIENT_ID="your_client_id" +export SFDC_CLIENT_SECRET="your_client_secret" +export SFDC_REFRESH_TOKEN="your_refresh_token" + +datacustomcode run ./payload/entrypoint.py +``` #### `datacustomcode init` @@ -295,35 +346,97 @@ You can read more about Jupyter Notebooks here: https://jupyter.org/ ## Prerequisite details -### Creating a connected app - -1. Log in to salesforce as an admin. In the top right corner, click on the gear icon and go to `Setup` -2. In the left hand column search for `oauth` -3. Select `OAuth and OpenID Connect Settings` -4. Toggle on `Allow OAuth Username-Password Flows` and accept the dialog box that pops up -5. Clear the search bar -6. Expand `Apps`, expand `External Client Apps`, click `Settings` -7. Toggle on `Allow access to External Client App consumer secrets via REST API` -8. Toggle on `Allow creation of connected apps` -9. Click `Enable` in the warning box -10. Click `New Connected App` button -11. 
Fill in the required fields within the `Basic Information` section -12. Under the `API (Enable OAuth Settings)` section: - a. Click on the checkbox to Enable OAuth Settings. - b. Provide a callback URL like http://localhost:55555/callback - c. In the Selected OAuth Scopes, make sure that `refresh_token`, `api`, `cdp_query_api`, `cdp_profile_api` is selected. - d. Click on Save to save the connected app -13. From the detail page that opens up afterwards, click the `Manage Consumer Details` button to find your client id and client secret -14. Click `Cancel` button once complete -15. Click `Manage` button -16. Click `Edit Policies` -17. Under `IP Relaxation` select `Relax IP restrictions` -18. Click `Save` -19. Logout -20. Use the URL of the login page as the `login_url` value when setting up the SDK +### Creating an External Client app + +1. Log in to Salesforce as an admin. In the top right corner, click on the gear icon and go to `Setup` +2. On the left sidebar, expand `Apps`, expand `External Client Apps`, click `Settings` +3. Expand `Apps`, expand `External Client Apps`, click `External Client App Manager` +4. Click the `New External Client App` button +5. Fill in the required fields within the `Basic Information` section +6. Under the `API (Enable OAuth Settings)` section: + 1. Click on the checkbox to Enable OAuth Settings + 2. Provide a callback URL like `http://localhost:55555/callback` + 3. In the Selected OAuth Scopes, make sure that `refresh_token`, `api`, `cdp_query_api`, `cdp_profile_api` are selected + 4. Check the following: + - Enable Authorization Code and Credentials Flow + - Require user credentials in the POST body for Authorization Code and Credentials Flow + 5. Uncheck `Require Proof Key for Code Exchange (PKCE) extension for Supported Authorization Flows` + 6. Click the `Create` button +7. On your newly created External Client App page, on the `Policies` tab: + 1. 
In the `App Authorization` section, choose a Refresh Token Policy that fits your expected usage. + 2. Under `App Authorization`, set IP Relaxation to `Relax IP restrictions` unless otherwise needed +8. Click `Save` +9. Go to the `Settings` tab and, under `OAuth Settings`, click the `Consumer Key and Secret` button, which opens a new tab. Copy the `client_id` and `client_secret` values from there; you will need them when configuring credentials for this SDK. +10. Log out +11. Use the URL of the login page as the `login_url` value when setting up the SDK You now have all fields necessary for the `datacustomcode configure` command. +### Obtaining Refresh Token and Core Token + +If you're using OAuth Tokens authentication (instead of Username/Password), follow these steps to obtain your refresh token and core token (access token). + +#### Step 1: Note Connected App Details + +From your connected app, note down the following: +- **Client ID** +- **Client Secret** +- **Callback URL** (e.g., `http://localhost:55555/callback`) + +#### Step 2: Obtain Authorization Code + +1. Open a browser and navigate to the following URL (replace placeholders with your values): + + ``` + <LOGIN_URL>/services/oauth2/authorize?response_type=code&client_id=<CLIENT_ID>&redirect_uri=<CALLBACK_URL> + ``` + +2. After authenticating, you'll be redirected to your callback URL. The redirected URL will be in the form: + ``` + <CALLBACK_URL>?code=<CODE> + ``` + +3. Extract the `<CODE>` value from the address bar. If the address bar doesn't show it, check the **Network tab** in your browser's developer tools. + +#### Step 3: Exchange Code for Tokens + +Make a POST request to exchange the authorization code for tokens. 
You can use `curl` or Postman: + +```bash +curl --location --request POST '<LOGIN_URL>/services/oauth2/token' \ + --header 'Content-Type: application/x-www-form-urlencoded' \ + --data-urlencode 'grant_type=authorization_code' \ + --data-urlencode 'code=<CODE>' \ + --data-urlencode 'client_id=<CLIENT_ID>' \ + --data-urlencode 'client_secret=<CLIENT_SECRET>' \ + --data-urlencode 'redirect_uri=<CALLBACK_URL>' +``` + +The response is a JSON object similar to: + +```json +{ + "access_token": "", + "refresh_token": "", + "signature": "", + "scope": "refresh_token cdp_query_api api cdp_profile_api cdp_api full", + "id_token": "", + "instance_url": "https://your-instance.my.salesforce.com", + "id": "https://login.salesforce.com/id/00DSB.../005SB...", + "token_type": "Bearer", + "issued_at": "1767743916187" +} +``` + +The key fields you need are: +| Field | Description | +|-------|-------------| +| `access_token` | The **core token** (also called access token) | +| `refresh_token` | The **refresh token** for obtaining new access tokens | +| `instance_url` | Your Salesforce instance URL | + +Use the `refresh_token` value when running `datacustomcode configure` with OAuth Tokens authentication. + ## Other docs - [Troubleshooting](./docs/troubleshooting.md) diff --git a/src/datacustomcode/__init__.py b/src/datacustomcode/__init__.py index 644dfc1..00cfae3 100644 --- a/src/datacustomcode/__init__.py +++ b/src/datacustomcode/__init__.py @@ -14,7 +14,14 @@ # limitations under the License. 
from datacustomcode.client import Client +from datacustomcode.credentials import AuthType, Credentials from datacustomcode.io.reader.query_api import QueryAPIDataCloudReader from datacustomcode.io.writer.print import PrintDataCloudWriter -__all__ = ["Client", "QueryAPIDataCloudReader", "PrintDataCloudWriter"] +__all__ = [ + "AuthType", + "Client", + "Credentials", + "PrintDataCloudWriter", + "QueryAPIDataCloudReader", +] diff --git a/src/datacustomcode/cli.py b/src/datacustomcode/cli.py index a7ac5d6..ea5607a 100644 --- a/src/datacustomcode/cli.py +++ b/src/datacustomcode/cli.py @@ -43,30 +43,91 @@ def version(): click.echo("Version information not available") -@cli.command() -@click.option("--profile", default="default") -@click.option("--username", prompt=True) -@click.option("--password", prompt=True, hide_input=True) -@click.option("--client-id", prompt=True) -@click.option("--client-secret", prompt=True) -@click.option("--login-url", prompt=True) -def configure( - username: str, - password: str, - client_id: str, - client_secret: str, +def _configure_username_password( login_url: str, + client_id: str, profile: str, ) -> None: - from datacustomcode.credentials import Credentials + """Configure credentials for Username/Password authentication.""" + from datacustomcode.credentials import AuthType, Credentials + + username = click.prompt("Username") + password = click.prompt("Password", hide_input=True) + client_secret = click.prompt("Client Secret") - Credentials( + credentials = Credentials( + login_url=login_url, + client_id=client_id, + auth_type=AuthType.USERNAME_PASSWORD, username=username, password=password, - client_id=client_id, client_secret=client_secret, + ) + credentials.update_ini(profile=profile) + click.secho( + f"Username/Password credentials saved to profile '{profile}' successfully", + fg="green", + ) + + +def _configure_oauth_tokens( + login_url: str, + client_id: str, + profile: str, +) -> None: + """Configure credentials for OAuth Tokens 
authentication.""" + from datacustomcode.credentials import AuthType, Credentials + + client_secret = click.prompt("Client Secret") + refresh_token = click.prompt("Refresh Token") + core_token = click.prompt( + "Core Token (optional, press Enter to skip)", + default="", + show_default=False, + ) + + credentials = Credentials( login_url=login_url, - ).update_ini(profile=profile) + client_id=client_id, + auth_type=AuthType.OAUTH_TOKENS, + client_secret=client_secret, + refresh_token=refresh_token, + core_token=core_token if core_token else None, + ) + credentials.update_ini(profile=profile) + click.secho( + f"OAuth Tokens credentials saved to profile '{profile}' successfully", + fg="green", + ) + + +@cli.command() +@click.option("--profile", default="default", help="Credential profile name") +@click.option( + "--auth-type", + type=click.Choice(["oauth_tokens", "username_password"]), + default="oauth_tokens", + help="""Authentication method to use. + + \b + oauth_tokens - OAuth tokens (refresh_token/core_token) authentication [DEFAULT] + username_password - Traditional username/password OAuth flow + """, +) +def configure(profile: str, auth_type: str) -> None: + """Configure credentials for connecting to Data Cloud.""" + from datacustomcode.credentials import AuthType + + # Common fields for all auth types + click.echo(f"\nConfiguring {auth_type} authentication for profile '{profile}':\n") + login_url = click.prompt("Login URL") + client_id = click.prompt("Client ID") + + # Route to appropriate handler based on auth type + if auth_type == AuthType.USERNAME_PASSWORD.value: + _configure_username_password(login_url, client_id, profile) + elif auth_type == AuthType.OAUTH_TOKENS.value: + _configure_oauth_tokens(login_url, client_id, profile) @cli.command() diff --git a/src/datacustomcode/credentials.py b/src/datacustomcode/credentials.py index d7db5e2..833a36f 100644 --- a/src/datacustomcode/credentials.py +++ b/src/datacustomcode/credentials.py @@ -15,28 +15,94 @@ from 
__future__ import annotations import configparser -from dataclasses import dataclass +from dataclasses import dataclass, field +from enum import Enum import os +from typing import Optional from loguru import logger -ENV_CREDENTIALS = { +INI_FILE = os.path.expanduser("~/.datacustomcode/credentials.ini") + + +class AuthType(str, Enum): + """Supported authentication methods for Salesforce Data Cloud.""" + + USERNAME_PASSWORD = "username_password" + OAUTH_TOKENS = "oauth_tokens" + + +# Environment variable mappings for each auth type +ENV_CREDENTIALS_COMMON = { + "login_url": "SFDC_LOGIN_URL", + "client_id": "SFDC_CLIENT_ID", +} + +ENV_CREDENTIALS_USERNAME_PASSWORD = { "username": "SFDC_USERNAME", "password": "SFDC_PASSWORD", - "client_id": "SFDC_CLIENT_ID", "client_secret": "SFDC_CLIENT_SECRET", - "login_url": "SFDC_LOGIN_URL", } -INI_FILE = os.path.expanduser("~/.datacustomcode/credentials.ini") + +ENV_CREDENTIALS_OAUTH_TOKENS = { + "client_secret": "SFDC_CLIENT_SECRET", + "refresh_token": "SFDC_REFRESH_TOKEN", + "core_token": "SFDC_CORE_TOKEN", +} @dataclass class Credentials: - username: str - password: str - client_id: str - client_secret: str + """Flexible credentials supporting multiple authentication methods. 
+ + Supports two authentication methods: + - OAUTH_TOKENS: OAuth tokens (core_token and refresh_token) authentication + - USERNAME_PASSWORD: Traditional username/password OAuth flow + """ + + # Required for all auth types login_url: str + client_id: str + auth_type: AuthType = field(default=AuthType.OAUTH_TOKENS) + + # Username/Password flow fields + username: Optional[str] = None + password: Optional[str] = None + + # Common field + client_secret: Optional[str] = None + + # OAuth Tokens flow fields + core_token: Optional[str] = None + refresh_token: Optional[str] = None + + def __post_init__(self): + """Validate credentials based on auth_type.""" + self._validate() + + def _validate(self) -> None: + """Validate that required fields are present for the auth type.""" + if self.auth_type == AuthType.USERNAME_PASSWORD: + missing = [] + if not self.username: + missing.append("username") + if not self.password: + missing.append("password") + if not self.client_secret: + missing.append("client_secret") + if missing: + raise ValueError( + f"Username/Password auth requires: {', '.join(missing)}" + ) + + elif self.auth_type == AuthType.OAUTH_TOKENS: + missing = [] + if not self.refresh_token: + missing.append("refresh_token") + if not self.client_secret: + missing.append("client_secret") + if missing: + raise ValueError(f"OAuth Tokens auth requires: {', '.join(missing)}") @classmethod def from_ini( @@ -44,38 +110,152 @@ def from_ini( profile: str = "default", ini_file: str = INI_FILE, ) -> Credentials: + """Load credentials from INI file. 
+ + Args: + profile: Profile section name in the INI file (default: "default") + ini_file: Path to the credentials INI file + + Returns: + Credentials instance loaded from the INI file + + Raises: + KeyError: If the profile or required fields are missing + """ config = configparser.ConfigParser() - logger.debug(f"Reading {ini_file} for profile {profile}") - config.read(ini_file) + expanded_ini_file = os.path.expanduser(ini_file) + logger.debug(f"Reading {expanded_ini_file} for profile {profile}") + + if not os.path.exists(expanded_ini_file): + raise FileNotFoundError(f"Credentials file not found: {expanded_ini_file}") + + config.read(expanded_ini_file) + + if profile not in config: + raise KeyError(f"Profile '{profile}' not found in {expanded_ini_file}") + + section = config[profile] + + # Determine auth type (default to oauth_tokens) + auth_type_str = section.get("auth_type", AuthType.OAUTH_TOKENS.value) + try: + auth_type = AuthType(auth_type_str) + except ValueError as exc: + raise ValueError( + f"Invalid auth_type '{auth_type_str}' in profile '{profile}'. " + f"Valid options: {[t.value for t in AuthType]}" + ) from exc + return cls( - username=config[profile]["username"], - password=config[profile]["password"], - client_id=config[profile]["client_id"], - client_secret=config[profile]["client_secret"], - login_url=config[profile]["login_url"], + login_url=section["login_url"], + client_id=section["client_id"], + auth_type=auth_type, + # Username/Password fields + username=section.get("username"), + password=section.get("password"), + client_secret=section.get("client_secret"), + # OAuth Tokens fields + core_token=section.get("core_token"), + refresh_token=section.get("refresh_token"), ) @classmethod def from_env(cls) -> Credentials: + """Load credentials from environment variables. 
+ + Environment variables: + Common (required): + SFDC_LOGIN_URL: Salesforce login URL + SFDC_CLIENT_ID: Connected App client ID + SFDC_AUTH_TYPE: Authentication type (optional, defaults to oauth_tokens) + + For oauth_tokens (default): + SFDC_CLIENT_SECRET: Connected App client secret + SFDC_REFRESH_TOKEN: OAuth refresh token + SFDC_CORE_TOKEN: OAuth core/access token (optional) + + For username_password: + SFDC_USERNAME: Salesforce username + SFDC_PASSWORD: Salesforce password + SFDC_CLIENT_SECRET: Connected App client secret + + Returns: + Credentials instance loaded from environment variables + + Raises: + ValueError: If required environment variables are missing + """ + # Check for common required variables + login_url = os.environ.get("SFDC_LOGIN_URL") + client_id = os.environ.get("SFDC_CLIENT_ID") + + if not login_url or not client_id: + raise ValueError( + "Environment variables SFDC_LOGIN_URL and SFDC_CLIENT_ID are required." + ) + + # Determine auth type + auth_type_str = os.environ.get("SFDC_AUTH_TYPE", AuthType.OAUTH_TOKENS.value) try: - return cls(**{k: os.environ[v] for k, v in ENV_CREDENTIALS.items()}) - except KeyError as exc: + auth_type = AuthType(auth_type_str) + except ValueError as exc: raise ValueError( - f"All of {ENV_CREDENTIALS.values()} must be set in environment." + f"Invalid SFDC_AUTH_TYPE '{auth_type_str}'. " + f"Valid options: {[t.value for t in AuthType]}" ) from exc + return cls( + login_url=login_url, + client_id=client_id, + auth_type=auth_type, + # Username/Password fields + username=os.environ.get("SFDC_USERNAME"), + password=os.environ.get("SFDC_PASSWORD"), + client_secret=os.environ.get("SFDC_CLIENT_SECRET"), + # OAuth Tokens fields + core_token=os.environ.get("SFDC_CORE_TOKEN"), + refresh_token=os.environ.get("SFDC_REFRESH_TOKEN"), + ) + @classmethod def from_available(cls, profile: str = "default") -> Credentials: - if os.environ.get("SFDC_USERNAME"): + """Load credentials from the first available source. 
+ + Checks sources in order: + 1. Environment variables (if SFDC_LOGIN_URL is set) + 2. INI file (~/.datacustomcode/credentials.ini) + + Args: + profile: Profile name to use when loading from INI file + + Returns: + Credentials instance from the first available source + + Raises: + ValueError: If no credentials are found in any source + """ + # Check environment variables first + if os.environ.get("SFDC_LOGIN_URL"): + logger.debug("Loading credentials from environment variables") return cls.from_env() - if os.path.exists(INI_FILE): + + # Check INI file + if os.path.exists(os.path.expanduser(INI_FILE)): + logger.debug(f"Loading credentials from INI file: {INI_FILE}") return cls.from_ini(profile=profile) + raise ValueError( - "Credentials not found in env or ini file. " + "Credentials not found in environment or INI file. " "Run `datacustomcode configure` to create a credentials file." ) - def update_ini(self, profile: str = "default", ini_file: str = INI_FILE): + def update_ini(self, profile: str = "default", ini_file: str = INI_FILE) -> None: + """Save credentials to INI file. 
+ + Args: + profile: Profile section name in the INI file + ini_file: Path to the credentials INI file + """ config = configparser.ConfigParser() expanded_ini_file = os.path.expanduser(ini_file) @@ -87,11 +267,30 @@ def update_ini(self, profile: str = "default", ini_file: str = INI_FILE): if profile not in config: config[profile] = {} - config[profile]["username"] = self.username - config[profile]["password"] = self.password - config[profile]["client_id"] = self.client_id - config[profile]["client_secret"] = self.client_secret + # Always save common fields + config[profile]["auth_type"] = self.auth_type.value config[profile]["login_url"] = self.login_url + config[profile]["client_id"] = self.client_id + + # Save fields based on auth type + if self.auth_type == AuthType.USERNAME_PASSWORD: + config[profile]["username"] = self.username or "" + config[profile]["password"] = self.password or "" + config[profile]["client_secret"] = self.client_secret or "" + # Remove fields from other auth types + for key in ["refresh_token", "core_token"]: + config[profile].pop(key, None) + + elif self.auth_type == AuthType.OAUTH_TOKENS: + config[profile]["client_secret"] = self.client_secret or "" + config[profile]["refresh_token"] = self.refresh_token or "" + if self.core_token: + config[profile]["core_token"] = self.core_token + # Remove fields from other auth types + for key in ["username", "password"]: + config[profile].pop(key, None) with open(expanded_ini_file, "w") as f: config.write(f) + + logger.debug(f"Saved credentials to {expanded_ini_file} [{profile}]") diff --git a/src/datacustomcode/io/reader/query_api.py b/src/datacustomcode/io/reader/query_api.py index bf300a3..44aa6d0 100644 --- a/src/datacustomcode/io/reader/query_api.py +++ b/src/datacustomcode/io/reader/query_api.py @@ -34,7 +34,7 @@ ) from salesforcecdpconnector.connection import SalesforceCDPConnection -from datacustomcode.credentials import Credentials +from datacustomcode.credentials import AuthType, 
Credentials from datacustomcode.io.reader.base import BaseDataCloudReader if TYPE_CHECKING: @@ -68,10 +68,79 @@ def _pandas_to_spark_schema( return StructType(fields) +def create_cdp_connection( + credentials: Credentials, + dataspace: Optional[str] = None, +) -> SalesforceCDPConnection: + """Create a SalesforceCDPConnection based on the credentials auth type. + + This factory function creates the appropriate connection based on the + authentication method configured in the credentials. + + Args: + credentials: Credentials instance with authentication details. + dataspace: Optional dataspace identifier for multi-tenant queries. + If None or "default", the dataspace argument is not passed to + the connection constructor. + + Returns: + SalesforceCDPConnection configured for the specified auth method. + + Raises: + ValueError: If the auth type is not supported. + """ + effective_dataspace = dataspace if dataspace and dataspace != "default" else None + + if credentials.auth_type == AuthType.USERNAME_PASSWORD: + logger.debug("Creating CDP connection with Username/Password authentication") + if effective_dataspace is not None: + return SalesforceCDPConnection( + credentials.login_url, + username=credentials.username, + password=credentials.password, + client_id=credentials.client_id, + client_secret=credentials.client_secret, + dataspace=effective_dataspace, + ) + else: + return SalesforceCDPConnection( + credentials.login_url, + username=credentials.username, + password=credentials.password, + client_id=credentials.client_id, + client_secret=credentials.client_secret, + ) + + elif credentials.auth_type == AuthType.OAUTH_TOKENS: + logger.debug("Creating CDP connection with OAuth Tokens authentication") + if effective_dataspace is not None: + return SalesforceCDPConnection( + credentials.login_url, + client_id=credentials.client_id, + client_secret=credentials.client_secret, + refresh_token=credentials.refresh_token, + dataspace=effective_dataspace, + ) + else: + return 
SalesforceCDPConnection( + credentials.login_url, + client_id=credentials.client_id, + client_secret=credentials.client_secret, + refresh_token=credentials.refresh_token, + ) + + else: + raise ValueError(f"Unsupported authentication type: {credentials.auth_type}") + + class QueryAPIDataCloudReader(BaseDataCloudReader): """DataCloud reader using Query API. This reader emulates data access within Data Cloud by calling the Query API. + Supports multiple authentication methods: + - OAuth Tokens (default, needs client_id/secret, with refresh_token) authentication + - Username/Password OAuth flow + Supports dataspace configuration for querying data within specific dataspaces. When a dataspace is provided (and not "default"), queries are executed within that dataspace context. @@ -90,30 +159,19 @@ def __init__( Args: spark: SparkSession instance for creating DataFrames. credentials_profile: Credentials profile name (default: "default"). + The profile determines which credentials to load from the + ~/.datacustomcode/credentials.ini file or environment variables. dataspace: Optional dataspace identifier. If provided and not "default", the connection will be configured for the specified dataspace. When None or "default", uses the default dataspace. 
""" self.spark = spark credentials = Credentials.from_available(profile=credentials_profile) - - if dataspace is not None and dataspace != "default": - self._conn = SalesforceCDPConnection( - credentials.login_url, - credentials.username, - credentials.password, - credentials.client_id, - credentials.client_secret, - dataspace=dataspace, - ) - else: - self._conn = SalesforceCDPConnection( - credentials.login_url, - credentials.username, - credentials.password, - credentials.client_id, - credentials.client_secret, - ) + logger.debug( + "Initializing QueryAPIDataCloudReader with " + f"auth_type={credentials.auth_type.value}" + ) + self._conn = create_cdp_connection(credentials, dataspace) def read_dlo( self, diff --git a/src/datacustomcode/io/writer/print.py b/src/datacustomcode/io/writer/print.py index 3f66b31..4eaa1ee 100644 --- a/src/datacustomcode/io/writer/print.py +++ b/src/datacustomcode/io/writer/print.py @@ -23,6 +23,20 @@ class PrintDataCloudWriter(BaseDataCloudWriter): + """Data Cloud writer that prints DataFrames for local testing. + + This writer is used during local development to validate data transformations + without actually writing to Data Cloud. It validates DataFrame columns against + the target DLO schema and prints the DataFrame contents. + + Supports multiple authentication methods through the credentials_profile: + - OAuth Tokens (core_token and refresh_token) authentication + - Username/Password OAuth flow + + The authentication method is determined by the credentials stored in the + profile (configured via `datacustomcode configure`). + """ + CONFIG_NAME = "PrintDataCloudWriter" def __init__( @@ -32,14 +46,31 @@ def __init__( credentials_profile: str = "default", dataspace: Optional[str] = None, ) -> None: + """Initialize PrintDataCloudWriter. + + Args: + spark: SparkSession instance for DataFrame operations. + reader: Optional QueryAPIDataCloudReader instance for schema validation. 
+ If not provided, a new reader will be created using the + credentials_profile and dataspace. + credentials_profile: Credentials profile name (default: "default"). + The profile determines which credentials to load and which + authentication method to use. + dataspace: Optional dataspace identifier for multi-tenant queries. + """ super().__init__(spark) if reader is None: if dataspace is not None: self.reader = QueryAPIDataCloudReader( - self.spark, credentials_profile, dataspace=dataspace + self.spark, + credentials_profile=credentials_profile, + dataspace=dataspace, ) else: - self.reader = QueryAPIDataCloudReader(self.spark, credentials_profile) + self.reader = QueryAPIDataCloudReader( + self.spark, + credentials_profile=credentials_profile, + ) else: self.reader = reader diff --git a/tests/test_credentials.py b/tests/test_credentials.py index e2f1bcc..1bec933 100644 --- a/tests/test_credentials.py +++ b/tests/test_credentials.py @@ -6,68 +6,183 @@ import pytest -from datacustomcode.credentials import ENV_CREDENTIALS, Credentials +from datacustomcode.credentials import AuthType, Credentials class TestCredentials: - def test_from_env(self): - """Test loading credentials from environment variables.""" - test_creds = { - "username": "test_user", - "password": "test_pass", - "client_id": "test_client_id", - "client_secret": "test_secret", - "login_url": "https://test.login.url", + """Test suite for Credentials class supporting multiple auth types.""" + + # ============== OAuth Tokens Tests (Default) ============== + + def test_from_env_oauth_tokens_default(self): + """Test loading OAuth Tokens credentials from env vars (default).""" + env_vars = { + "SFDC_LOGIN_URL": "https://test.login.url", + "SFDC_CLIENT_ID": "test_client_id", + "SFDC_CLIENT_SECRET": "test_secret", + "SFDC_REFRESH_TOKEN": "test_refresh_token", + "SFDC_CORE_TOKEN": "test_core_token", } - with patch.dict( - os.environ, {v: test_creds[k] for k, v in ENV_CREDENTIALS.items()} + with 
patch.dict(os.environ, env_vars, clear=True):
+            creds = Credentials.from_env()
+
+            assert creds.auth_type == AuthType.OAUTH_TOKENS
+            assert creds.client_secret == "test_secret"
+            assert creds.refresh_token == "test_refresh_token"
+            assert creds.core_token == "test_core_token"
+            assert creds.client_id == "test_client_id"
+            assert creds.login_url == "https://test.login.url"
+
+    def test_from_env_oauth_tokens_explicit(self):
+        """Test loading OAuth Tokens credentials with explicit auth type."""
+        env_vars = {
+            "SFDC_LOGIN_URL": "https://test.login.url",
+            "SFDC_CLIENT_ID": "test_client_id",
+            "SFDC_AUTH_TYPE": "oauth_tokens",
+            "SFDC_CLIENT_SECRET": "test_secret",
+            "SFDC_REFRESH_TOKEN": "test_refresh_token",
+        }
+
+        with patch.dict(os.environ, env_vars, clear=True):
+            creds = Credentials.from_env()
+
+            assert creds.auth_type == AuthType.OAUTH_TOKENS
+            assert creds.client_secret == "test_secret"
+            assert creds.refresh_token == "test_refresh_token"
+
+    def test_from_ini_oauth_tokens(self):
+        """Test loading OAuth Tokens credentials from an INI file."""
+        ini_content = """
+        [oauth_profile]
+        auth_type = oauth_tokens
+        login_url = https://oauth.login.url
+        client_id = oauth_client_id
+        client_secret = oauth_secret
+        refresh_token = oauth_refresh_token
+        core_token = oauth_core_token
+        """
+
+        with (
+            patch("os.path.exists", return_value=True),
+            patch("builtins.open", mock_open(read_data=ini_content)),
+        ):
+            mock_config = configparser.ConfigParser()
+            mock_config.read_string(ini_content)
+
+            with patch.object(configparser, "ConfigParser", return_value=mock_config):
+                creds = Credentials.from_ini(
+                    profile="oauth_profile", ini_file="fake_path"
+                )
+                assert creds.auth_type == AuthType.OAUTH_TOKENS
+                assert creds.client_secret == "oauth_secret"
+                assert creds.refresh_token == "oauth_refresh_token"
+                assert creds.core_token == "oauth_core_token"
+                assert creds.client_id == "oauth_client_id"
+
+    def test_from_ini_default_auth_type(self):
+        """Test that INI files without auth_type default to oauth_tokens."""
+        ini_content = """
+        [default]
+        login_url = https://ini.login.url
+        client_id = ini_client_id
+        client_secret = ini_secret
+        refresh_token = ini_refresh_token
+        """
+
+        with (
+            patch("os.path.exists", return_value=True),
+            patch("builtins.open", mock_open(read_data=ini_content)),
         ):
+            mock_config = configparser.ConfigParser()
+            mock_config.read_string(ini_content)
+
+            with patch.object(configparser, "ConfigParser", return_value=mock_config):
+                creds = Credentials.from_ini(profile="default", ini_file="fake_path")
+                assert creds.auth_type == AuthType.OAUTH_TOKENS
+                assert creds.client_secret == "ini_secret"
+                assert creds.refresh_token == "ini_refresh_token"
+
+    def test_oauth_tokens_missing_refresh_token(self):
+        """Test that OAuth Tokens auth requires refresh token."""
+        with pytest.raises(ValueError, match="refresh_token"):
+            Credentials(
+                login_url="https://test.login.url",
+                client_id="test_client_id",
+                auth_type=AuthType.OAUTH_TOKENS,
+                client_secret="test_secret",
+            )
+
+    def test_oauth_tokens_missing_client_secret(self):
+        """Test that OAuth Tokens auth requires client secret."""
+        with pytest.raises(ValueError, match="client_secret"):
+            Credentials(
+                login_url="https://test.login.url",
+                client_id="test_client_id",
+                auth_type=AuthType.OAUTH_TOKENS,
+                refresh_token="test_refresh_token",
+            )
+
+    # ============== Username/Password Tests ==============
+
+    def test_from_env_username_password(self):
+        """Test loading username/password credentials from environment variables."""
+        env_vars = {
+            "SFDC_LOGIN_URL": "https://test.login.url",
+            "SFDC_CLIENT_ID": "test_client_id",
+            "SFDC_AUTH_TYPE": "username_password",
+            "SFDC_USERNAME": "test_user",
+            "SFDC_PASSWORD": "test_pass",
+            "SFDC_CLIENT_SECRET": "test_secret",
+        }
+
+        with patch.dict(os.environ, env_vars, clear=True):
             creds = Credentials.from_env()

-            assert creds.username == test_creds["username"]
-            assert creds.password == test_creds["password"]
-            assert creds.client_id == test_creds["client_id"]
-            assert creds.client_secret == test_creds["client_secret"]
-            assert creds.login_url == test_creds["login_url"]
+            assert creds.auth_type == AuthType.USERNAME_PASSWORD
+            assert creds.username == "test_user"
+            assert creds.password == "test_pass"
+            assert creds.client_id == "test_client_id"
+            assert creds.client_secret == "test_secret"
+            assert creds.login_url == "https://test.login.url"

     def test_from_env_missing_vars(self):
         """Test that missing environment variables raise appropriate error."""
-        # Ensure environment variables are not set
         with patch.dict(os.environ, {}, clear=True):
-            with pytest.raises(ValueError, match="must be set in environment"):
+            with pytest.raises(ValueError, match="SFDC_LOGIN_URL and SFDC_CLIENT_ID"):
                 Credentials.from_env()

-    def test_from_ini(self):
-        """Test loading credentials from an INI file."""
+    def test_from_ini_username_password(self):
+        """Test loading username/password credentials from an INI file."""
         ini_content = """
         [default]
+        auth_type = username_password
+        login_url = https://ini.login.url
+        client_id = ini_client_id
         username = ini_user
         password = ini_pass
-        client_id = ini_client_id
         client_secret = ini_secret
-        login_url = https://ini.login.url

         [other_profile]
+        auth_type = username_password
+        login_url = https://other.login.url
+        client_id = other_client_id
         username = other_user
         password = other_pass
-        client_id = other_client_id
         client_secret = other_secret
-        login_url = https://other.login.url
         """

         with (
-            patch("configparser.ConfigParser.read"),
+            patch("os.path.exists", return_value=True),
             patch("builtins.open", mock_open(read_data=ini_content)),
         ):
-
-            # Mock the configparser behavior for reading the file
             mock_config = configparser.ConfigParser()
             mock_config.read_string(ini_content)

             with patch.object(configparser, "ConfigParser", return_value=mock_config):
                 # Test default profile
                 creds = Credentials.from_ini(profile="default", ini_file="fake_path")
+                assert creds.auth_type == AuthType.USERNAME_PASSWORD
                 assert creds.username == "ini_user"
                 assert creds.password == "ini_pass"
                 assert creds.client_id == "ini_client_id"
@@ -84,39 +199,49 @@ def test_from_ini(self):
                 assert creds.client_secret == "other_secret"
                 assert creds.login_url == "https://other.login.url"

+    def test_username_password_missing_username(self):
+        """Test that Username/Password auth requires username."""
+        with pytest.raises(ValueError, match="username"):
+            Credentials(
+                login_url="https://test.login.url",
+                client_id="test_client_id",
+                auth_type=AuthType.USERNAME_PASSWORD,
+                password="test_pass",
+                client_secret="test_secret",
+            )
+
+    # ============== from_available Tests ==============
+
     def test_from_available_env(self):
         """Test that from_available uses environment variables when available."""
-        test_creds = {
-            "username": "test_user",
-            "password": "test_pass",
-            "client_id": "test_client_id",
-            "client_secret": "test_secret",
-            "login_url": "https://test.login.url",
+        env_vars = {
+            "SFDC_LOGIN_URL": "https://test.login.url",
+            "SFDC_CLIENT_ID": "test_client_id",
+            "SFDC_CLIENT_SECRET": "test_secret",
+            "SFDC_REFRESH_TOKEN": "test_refresh_token",
         }

         with (
-            patch.dict(
-                os.environ, {v: test_creds[k] for k, v in ENV_CREDENTIALS.items()}
-            ),
+            patch.dict(os.environ, env_vars, clear=True),
             patch("os.path.exists", return_value=False),
         ):
             creds = Credentials.from_available()

-            assert creds.username == test_creds["username"]
-            assert creds.password == test_creds["password"]
-            assert creds.client_id == test_creds["client_id"]
-            assert creds.client_secret == test_creds["client_secret"]
-            assert creds.login_url == test_creds["login_url"]
+            assert creds.auth_type == AuthType.OAUTH_TOKENS
+            assert creds.client_id == "test_client_id"
+            assert creds.client_secret == "test_secret"
+            assert creds.refresh_token == "test_refresh_token"
+            assert creds.login_url == "https://test.login.url"

     def test_from_available_ini(self):
         """Test that from_available uses INI file when env vars not available."""
         ini_content = """
         [default]
-        username = ini_user
-        password = ini_pass
+        auth_type = oauth_tokens
+        login_url = https://ini.login.url
         client_id = ini_client_id
         client_secret = ini_secret
-        login_url = https://ini.login.url
+        refresh_token = ini_refresh_token
         """

         with (
@@ -124,18 +249,16 @@ def test_from_available_ini(self):
             patch("os.path.exists", return_value=True),
             patch("builtins.open", mock_open(read_data=ini_content)),
         ):
-
-            # Mock the configparser behavior
             mock_config = configparser.ConfigParser()
             mock_config.read_string(ini_content)

             with patch.object(configparser, "ConfigParser", return_value=mock_config):
                 creds = Credentials.from_available()

-                assert creds.username == "ini_user"
-                assert creds.password == "ini_pass"
+                assert creds.auth_type == AuthType.OAUTH_TOKENS
                 assert creds.client_id == "ini_client_id"
                 assert creds.client_secret == "ini_secret"
+                assert creds.refresh_token == "ini_refresh_token"
                 assert creds.login_url == "https://ini.login.url"

     def test_from_available_no_creds(self):
@@ -147,23 +270,69 @@ def test_from_available_no_creds(self):
             with pytest.raises(ValueError, match="Credentials not found"):
                 Credentials.from_available()

-    def test_update_ini(self):
-        """Test updating credentials in an INI file."""
+    # ============== update_ini Tests ==============
+
+    def test_update_ini_oauth_tokens(self):
+        """Test updating OAuth Tokens credentials in an INI file."""
         ini_content = """
         [default]
-        username = old_user
-        password = old_pass
+        auth_type = oauth_tokens
+        login_url = https://old.login.url
         client_id = old_client_id
         client_secret = old_secret
+        refresh_token = old_refresh_token
+        """
+
+        creds = Credentials(
+            login_url="https://new.login.url",
+            client_id="new_client_id",
+            auth_type=AuthType.OAUTH_TOKENS,
+            client_secret="new_secret",
+            refresh_token="new_refresh_token",
+            core_token="new_core_token",
+        )
+
+        mock_file = mock_open(read_data=ini_content)
+
+        with (
+            patch("os.path.expanduser", return_value="/fake/expanded/path"),
+            patch("os.path.exists", return_value=True),
+            patch("os.makedirs"),
+            patch("builtins.open", mock_file),
+        ):
+            mock_config = configparser.ConfigParser()
+            mock_config.read_string(ini_content)
+
+            with patch.object(configparser, "ConfigParser", return_value=mock_config):
+                creds.update_ini(profile="default", ini_file="~/fake_path")
+
+                mock_file.assert_called_with("/fake/expanded/path", "w")
+                assert mock_config["default"]["auth_type"] == "oauth_tokens"
+                assert mock_config["default"]["client_id"] == "new_client_id"
+                assert mock_config["default"]["client_secret"] == "new_secret"
+                assert mock_config["default"]["refresh_token"] == "new_refresh_token"
+                assert mock_config["default"]["core_token"] == "new_core_token"
+                assert mock_config["default"]["login_url"] == "https://new.login.url"
+
+    def test_update_ini_username_password(self):
+        """Test updating username/password credentials in an INI file."""
+        ini_content = """
+        [default]
+        auth_type = username_password
         login_url = https://old.login.url
+        client_id = old_client_id
+        username = old_user
+        password = old_pass
+        client_secret = old_secret
         """

         creds = Credentials(
+            login_url="https://new.login.url",
+            client_id="new_client_id",
+            auth_type=AuthType.USERNAME_PASSWORD,
             username="new_user",
             password="new_pass",
-            client_id="new_client_id",
             client_secret="new_secret",
-            login_url="https://new.login.url",
         )

         mock_file = mock_open(read_data=ini_content)
@@ -174,18 +343,14 @@ def test_update_ini(self):
             patch("os.makedirs"),
             patch("builtins.open", mock_file),
         ):
-
-            # Mock the configparser behavior
             mock_config = configparser.ConfigParser()
             mock_config.read_string(ini_content)

             with patch.object(configparser, "ConfigParser", return_value=mock_config):
                 creds.update_ini(profile="default", ini_file="~/fake_path")

-                # Check if the file was opened for writing
                 mock_file.assert_called_with("/fake/expanded/path", "w")
-
-                # Check if the config has the updated values
+                assert mock_config["default"]["auth_type"] == "username_password"
                 assert mock_config["default"]["username"] == "new_user"
                 assert mock_config["default"]["password"] == "new_pass"
                 assert mock_config["default"]["client_id"] == "new_client_id"
@@ -196,19 +361,19 @@ def test_update_ini_new_profile(self):
         """Test updating credentials with a new profile."""
         ini_content = """
         [existing]
-        username = existing_user
-        password = existing_pass
+        auth_type = oauth_tokens
+        login_url = https://existing.login.url
         client_id = existing_client_id
         client_secret = existing_secret
-        login_url = https://existing.login.url
+        refresh_token = existing_refresh_token
         """

         creds = Credentials(
-            username="new_profile_user",
-            password="new_profile_pass",
+            login_url="https://new.profile.login.url",
             client_id="new_profile_client_id",
+            auth_type=AuthType.OAUTH_TOKENS,
             client_secret="new_profile_secret",
-            login_url="https://new.profile.login.url",
+            refresh_token="new_profile_refresh_token",
         )

         mock_file = mock_open(read_data=ini_content)
@@ -219,48 +384,47 @@ def test_update_ini_new_profile(self):
             patch("os.makedirs"),
             patch("builtins.open", mock_file),
         ):
-
-            # Mock the configparser behavior
             mock_config = configparser.ConfigParser()
             mock_config.read_string(ini_content)

             with patch.object(configparser, "ConfigParser", return_value=mock_config):
                 creds.update_ini(profile="new_profile", ini_file="~/fake_path")

-                # Check if the new profile was created
                 assert "new_profile" in mock_config
-                assert mock_config["new_profile"]["username"] == "new_profile_user"
-                assert mock_config["new_profile"]["password"] == "new_profile_pass"
                 assert (
                     mock_config["new_profile"]["client_id"] == "new_profile_client_id"
                 )
                 assert (
                     mock_config["new_profile"]["client_secret"] == "new_profile_secret"
                 )
+                assert (
+                    mock_config["new_profile"]["refresh_token"]
+                    == "new_profile_refresh_token"
+                )
                 assert (
                     mock_config["new_profile"]["login_url"]
                     == "https://new.profile.login.url"
                 )
-
-                # Check that existing profile was not modified
-                assert mock_config["existing"]["username"] == "existing_user"
+                assert (
+                    mock_config["existing"]["refresh_token"] == "existing_refresh_token"
+                )

     def test_from_available_with_custom_profile(self):
         """Test that from_available uses custom profile when specified."""
         ini_content = """
         [default]
-        username = default_user
-        password = default_pass
+        auth_type = oauth_tokens
+        login_url = https://default.login.url
         client_id = default_client_id
         client_secret = default_secret
-        login_url = https://default.login.url
+        refresh_token = default_refresh_token

         [custom_profile]
-        username = custom_user
-        password = custom_pass
+        auth_type = oauth_tokens
+        login_url = https://custom.login.url
         client_id = custom_client_id
         client_secret = custom_secret
-        login_url = https://custom.login.url
+        refresh_token = custom_refresh_token
         """

         with (
@@ -268,46 +432,36 @@ def test_from_available_with_custom_profile(self):
             patch("os.path.exists", return_value=True),
             patch("builtins.open", mock_open(read_data=ini_content)),
         ):
-            # Mock the configparser behavior for reading the file
             mock_config = configparser.ConfigParser()
             mock_config.read_string(ini_content)

             with patch.object(configparser, "ConfigParser", return_value=mock_config):
-                # Test default profile
                 creds_default = Credentials.from_available()
-                assert creds_default.username == "default_user"
+                assert creds_default.client_secret == "default_secret"
+                assert creds_default.refresh_token == "default_refresh_token"
                 assert creds_default.login_url == "https://default.login.url"

-                # Test custom profile
                 creds_custom = Credentials.from_available(profile="custom_profile")
-                assert creds_custom.username == "custom_user"
-                assert creds_custom.password == "custom_pass"
                 assert creds_custom.client_id == "custom_client_id"
                 assert creds_custom.client_secret == "custom_secret"
+                assert creds_custom.refresh_token == "custom_refresh_token"
                 assert creds_custom.login_url == "https://custom.login.url"

-    def test_from_available_fallback_to_default(self):
-        """Test that from_available falls back to default when no profile specified."""
-        ini_content = """
-        [default]
-        username = default_user
-        password = default_pass
-        client_id = default_client_id
-        client_secret = default_secret
-        login_url = https://default.login.url
-        """
+    # ============== AuthType Enum Tests ==============

-        with (
-            patch("datacustomcode.credentials.INI_FILE", "fake_path"),
-            patch("os.path.exists", return_value=True),
-            patch("builtins.open", mock_open(read_data=ini_content)),
-        ):
-            # Mock the configparser behavior for reading the file
-            mock_config = configparser.ConfigParser()
-            mock_config.read_string(ini_content)
+    def test_auth_type_values(self):
+        """Test AuthType enum values."""
+        assert AuthType.USERNAME_PASSWORD.value == "username_password"
+        assert AuthType.OAUTH_TOKENS.value == "oauth_tokens"

-            with patch.object(configparser, "ConfigParser", return_value=mock_config):
-                # Test that no profile parameter defaults to "default"
-                creds = Credentials.from_available()
-                assert creds.username == "default_user"
-                assert creds.login_url == "https://default.login.url"
+    def test_invalid_auth_type_from_env(self):
+        """Test that invalid auth type from env raises error."""
+        env_vars = {
+            "SFDC_LOGIN_URL": "https://test.login.url",
+            "SFDC_CLIENT_ID": "test_client_id",
+            "SFDC_AUTH_TYPE": "invalid_auth_type",
+        }
+
+        with patch.dict(os.environ, env_vars, clear=True):
+            with pytest.raises(ValueError, match="Invalid SFDC_AUTH_TYPE"):
+                Credentials.from_env()
diff --git a/tests/test_credentials_profile_integration.py b/tests/test_credentials_profile_integration.py
index 92a1538..3b5d941 100644
--- a/tests/test_credentials_profile_integration.py
+++ b/tests/test_credentials_profile_integration.py
@@ -11,6 +11,7 @@ from unittest.mock import MagicMock, patch

 from datacustomcode.config import config
+from datacustomcode.credentials import AuthType
 from datacustomcode.io.reader.query_api import QueryAPIDataCloudReader
 from datacustomcode.io.writer.print import PrintDataCloudWriter

@@ -27,6 +28,7 @@ def test_query_api_reader_with_custom_profile(self):
         ) as mock_from_available:
             # Mock credentials for custom profile
             mock_credentials = MagicMock()
+            mock_credentials.auth_type = AuthType.USERNAME_PASSWORD
             mock_credentials.login_url = "https://custom.salesforce.com"
             mock_credentials.username = "custom@example.com"
             mock_credentials.password = "custom_password"
@@ -52,10 +54,10 @@ def test_query_api_reader_with_custom_profile(self):
                 # Verify the connection was created with the custom credentials
                 mock_conn_class.assert_called_once_with(
                     "https://custom.salesforce.com",
-                    "custom@example.com",
-                    "custom_password",
-                    "custom_client_id",
-                    "custom_secret",
+                    username="custom@example.com",
+                    password="custom_password",
+                    client_id="custom_client_id",
+                    client_secret="custom_secret",
                 )

     def test_print_writer_with_custom_profile(self):
@@ -67,6 +69,7 @@ def test_print_writer_with_custom_profile(self):
         ) as mock_from_available:
             # Mock credentials for custom profile
             mock_credentials = MagicMock()
+            mock_credentials.auth_type = AuthType.USERNAME_PASSWORD
             mock_credentials.login_url = "https://custom.salesforce.com"
             mock_credentials.username = "custom@example.com"
             mock_credentials.password = "custom_password"
@@ -160,6 +163,7 @@ def test_credentials_profile_consistency(self):
         ) as mock_from_available:
             # Mock credentials
             mock_credentials = MagicMock()
+            mock_credentials.auth_type = AuthType.USERNAME_PASSWORD
             mock_credentials.login_url = "https://consistent.salesforce.com"
             mock_credentials.username = "consistent@example.com"
             mock_credentials.password = "consistent_password"
@@ -201,6 +205,7 @@ def test_multiple_profiles_isolation(self):
             # Mock different credentials for different profiles
             def mock_credentials_side_effect(profile="default"):
                 mock_creds = MagicMock()
+                mock_creds.auth_type = AuthType.USERNAME_PASSWORD
                 if profile == "profile1":
                     mock_creds.login_url = "https://profile1.salesforce.com"
                     mock_creds.username = "profile1@example.com"
diff --git a/tests/test_deploy.py b/tests/test_deploy.py
index 8fe3219..e65487d 100644
--- a/tests/test_deploy.py
+++ b/tests/test_deploy.py
@@ -10,7 +10,7 @@
 import pytest
 import requests

-from datacustomcode.credentials import Credentials
+from datacustomcode.credentials import AuthType, Credentials
 from datacustomcode.deploy import DloPermission, Permissions

 # Patch get_version before importing deploy module
@@ -461,11 +461,12 @@ class TestRetrieveAccessToken:
     def test_retrieve_access_token(self, mock_make_api_call):
         """Test retrieving access token."""
         credentials = Credentials(
+            login_url="https://example.com",
+            client_id="id",
+            auth_type=AuthType.USERNAME_PASSWORD,
             username="user",
             password="pass",
-            client_id="id",
             client_secret="secret",
-            login_url="https://example.com",
         )

         mock_make_api_call.return_value = {
@@ -828,11 +829,12 @@ def test_deploy_full(
     ):
         """Test full deployment process."""
         credentials = Credentials(
+            login_url="https://example.com",
+            client_id="id",
+            auth_type=AuthType.USERNAME_PASSWORD,
             username="user",
             password="pass",
-            client_id="id",
             client_secret="secret",
-            login_url="https://example.com",
         )
         metadata = TransformationJobMetadata(
             name="test_job",
@@ -911,11 +913,12 @@ def test_deploy_full_happy_path(
     ):
         """Test full deployment process with Docker dependency building."""
         credentials = Credentials(
+            login_url="https://example.com",
+            client_id="id",
+            auth_type=AuthType.USERNAME_PASSWORD,
             username="user",
             password="pass",
-            client_id="id",
             client_secret="secret",
-            login_url="https://example.com",
         )
         metadata = TransformationJobMetadata(
             name="test_job",