* [@dp.create_streaming_table](#create_streaming_table) or [CREATE STREAMING TABLE](../sql/SparkSqlAstBuilder.md/#visitCreatePipelineDataset) (with no flows; flows can be defined later with [@dp.append_flow](#append_flow) or [CREATE FLOW AS INSERT INTO BY NAME](../sql/SparkSqlAstBuilder.md/#visitCreatePipelineInsertIntoFlow), as in the sketch below)
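
For illustration, here is a minimal Python sketch of a streaming table created with no flows and a flow attached afterwards. This is a hypothetical example meant to live in a pipeline's transformations file (evaluated by the Pipelines CLI): the table name, function name, and the `rate` source are made up, and decorator signatures in `pyspark.pipelines` may differ in your Spark version.

```python
from pyspark import pipelines as dp
from pyspark.sql import SparkSession

# A streaming table created with no flows attached (yet).
dp.create_streaming_table("events")


# A flow defined later that appends rows into the streaming table.
# `target` names the streaming table to insert into.
@dp.append_flow(target="events")
def events_from_rate():
    # Hypothetical source: the built-in `rate` streaming source,
    # used only to keep this sketch self-contained.
    spark = SparkSession.getActiveSession()
    return spark.readStream.format("rate").load()
```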

Activate (_source_) the virtual environment (that `uv` helped us create).

```shell
source .venv/bin/activate
```

This activation brings in all the necessary Spark Declarative Pipelines Python dependencies (not released yet and only available in source format) for non-`uv` tools and CLIs, incl. the [Spark Pipelines CLI](#spark-pipelines) itself.

```shell
$SPARK_HOME/bin/spark-pipelines --help
```

!!! note ""

    ```text
    usage: cli.py [-h] {run,dry-run,init} ...

    Pipelines CLI

    positional arguments:
      {run,dry-run,init}
        run       Run a pipeline. If no refresh options specified, a
                  default incremental update is performed.
        dry-run   Launch a run that just validates the graph and checks
                  for errors.
        init      Generate a sample pipeline project, with a spec file and
                  example transformations.

    options:
      -h, --help  show this help message and exit
    ```

??? note "macOS and PYSPARK_PYTHON"
    On macOS, you may want to define the `PYSPARK_PYTHON` environment variable to point at Python >= 3.10 (e.g., `export PYSPARK_PYTHON=python3.10`).

Follow [Demo: Create Virtual Environment for Python Client](#demo-create-virtual-environment-for-python-client) before getting started with this demo.

### 1️⃣ Display Pipelines Help

Run `spark-pipelines --help` to learn the options.

```shell
$SPARK_HOME/bin/spark-pipelines --help
```

!!! note ""

    ```text
    usage: cli.py [-h] {run,dry-run,init} ...

    Pipelines CLI

    positional arguments:
      {run,dry-run,init}
        run       Run a pipeline. If no refresh options specified, a
                  default incremental update is performed.
        dry-run   Launch a run that just validates the graph and checks
                  for errors.
        init      Generate a sample pipeline project, including a spec
                  file and example definitions.

    options:
      -h, --help  show this help message and exit
    ```

### 2️⃣ Create Pipelines Demo Project
469
+
470
+
You've only created an empty Python project so far (using `uv`).
440
471
441
472

Create a `hello-spark-pipelines` demo pipelines project with a sample `pipeline.yml` and sample transformations (in Python and in SQL).
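
For orientation, a Python transformation in such a project could look like the following. This is a hypothetical sketch rather than the actual generated sample: the function name is made up, and the decorator details in `pyspark.pipelines` may differ in your Spark version.

```python
from pyspark import pipelines as dp
from pyspark.sql import DataFrame, SparkSession


# A hypothetical materialized view definition; the Pipelines CLI
# discovers such decorated functions in the project's transformations.
@dp.materialized_view
def hello_spark_pipelines() -> DataFrame:
    spark = SparkSession.getActiveSession()
    return spark.range(5)
```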