Commit 0e7be56

Typed processes
Signed-off-by: Ben Sherman <bentshermann@gmail.com>
1 parent 3fb8a58 commit 0e7be56

88 files changed: +7311 -3269 lines changed

docs/migrations/25-10.md

Lines changed: 60 additions & 15 deletions
@@ -16,16 +16,16 @@ The `params` block is a new way to declare pipeline parameters in a Nextflow scr
1616

1717
```nextflow
1818
params {
19-
// Path to input data.
20-
input: Path
19+
// Path to input data.
20+
input: Path
2121
22-
// Whether to save intermediate files.
23-
save_intermeds: Boolean = false
22+
// Whether to save intermediate files.
23+
save_intermeds: Boolean = false
2424
}
2525
2626
workflow {
27-
println "params.input = ${params.input}"
28-
println "params.save_intermeds = ${params.save_intermeds}"
27+
println "params.input = ${params.input}"
28+
println "params.save_intermeds = ${params.save_intermeds}"
2929
}
3030
```
3131

@@ -39,33 +39,56 @@ Type annotations are a way to denote the *type* of a variable. They help documen
3939

4040
```nextflow
4141
workflow RNASEQ {
42-
take:
43-
reads: Channel<Path>
44-
index: Value<Path>
42+
take:
43+
reads: Channel<Path>
44+
index: Value<Path>
4545
46-
main:
47-
samples_ch = QUANT( reads, index )
46+
main:
47+
samples_ch = QUANT( reads, index )
4848
49-
emit:
50-
samples: Channel<Path> = samples_ch
49+
emit:
50+
samples: Channel<Path> = samples_ch
5151
}
5252
5353
def isSraId(id: String) -> Boolean {
54-
return id.startsWith('SRA')
54+
return id.startsWith('SRA')
5555
}
5656
```
5757

5858
The following declarations can be annotated with types:
5959

6060
- Pipeline parameters (the `params` block)
6161
- Workflow takes and emits
62+
- Process inputs and outputs
6263
- Function parameters and returns
6364
- Local variables
6465
- Closure parameters
6566
- Workflow outputs (the `output` block)
6667

6768
Type annotations can refer to any of the {ref}`standard types <stdlib-types>`.
6869

70+
Some types have *generic type parameters*, which allow them to be reused in a type-safe manner with different types of data. For example:
71+
72+
- The generic type `E` in `List<E>` and `Channel<E>` refers to the type of the elements in the list or channel
73+
74+
- The generic types `K` and `V` in `Map<K,V>` refer to the types of keys and values in the map
75+
76+
Here are some concrete examples of types that use type parameters:
77+
78+
```nextflow
79+
// List<E> where E is String
80+
def sequences: List<String> = ['ATCG', 'GCTA', 'TTAG']
81+
82+
// List<E> where E is Path
83+
def fastqFiles: List<Path> = [file('sample1.fastq'), file('sample2.fastq')]
84+
85+
// Map<K,V> where K is String and V is Integer
86+
def readCounts: Map<String,Integer> = [sample1: 1000, sample2: 1500]
87+
88+
// Channel<E> where E is Path
89+
def inputFiles: Channel<Path> = channel.fromPath('*.bam')
90+
```
91+
6992
Type annotations can be appended with `?` to denote that the value can be `null`:
7093

7194
```nextflow
@@ -78,6 +101,26 @@ In the type system, queue channels are represented as `Channel`, while value cha
78101
Nextflow supports Groovy-style type annotations using the `<type> <name>` syntax, but this approach is deprecated in {ref}`strict syntax <strict-syntax-page>`. While Groovy-style annotations remain valid for functions and local variables, the language server and `nextflow lint` automatically convert them to Nextflow-style annotations during code formatting.
79102
:::
80103

104+
Type annotations can also be used for process inputs/outputs:
105+
106+
```nextflow
107+
process fastqc {
108+
input:
109+
(id, fastq_1, fastq_2): Tuple<String,Path,Path>
110+
111+
output:
112+
logs = tuple(id, file('fastqc_logs'))
113+
114+
script:
115+
"""
116+
mkdir fastqc_logs
117+
fastqc -o fastqc_logs -f fastq -q ${fastq_1} ${fastq_2}
118+
"""
119+
}
120+
```
121+
122+
See {ref}`migrating-static-types` for details.
123+
81124
## Enhancements
82125

83126
<h3>Nextflow plugin registry</h3>
@@ -113,7 +156,7 @@ workflow {
113156

114157
This syntax is simpler and easier to use with the {ref}`strict syntax <strict-syntax-page>`. See {ref}`workflow-handlers` for details.
115158

116-
<h3>Improved handling of dynamic directives</h3>
159+
<h3>Simpler syntax for dynamic directives</h3>
117160

118161
The {ref}`strict syntax <strict-syntax-page>` allows dynamic process directives to be specified without a closure:
119162

@@ -131,6 +174,8 @@ process hello {
131174
}
132175
```
133176

177+
Dynamic process settings in configuration files must still be specified with closures.
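
As a minimal sketch of what this means in practice, a configuration file would still wrap a dynamic setting in a closure. The `withName` selector and the `memory` setting are standard Nextflow configuration; the choice of `memory` and the `2.GB * task.attempt` expression are illustrative assumptions, not taken from this change:

```nextflow
// nextflow.config (illustrative fragment)
process {
    withName: 'hello' {
        // The closure is evaluated for each task, so `task.attempt` is available.
        memory = { 2.GB * task.attempt }
    }
}
```
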
178+
134179
See {ref}`dynamic-directives` for details.
135180

136181
<h3>Configurable date formatting</h3>

docs/process-typed.md

Lines changed: 248 additions & 0 deletions
@@ -0,0 +1,248 @@
1+
(process-typed-page)=
2+
3+
# Processes (typed)
4+
5+
:::{versionadded} 25.10.0
6+
:::
7+
8+
:::{note}
9+
This feature requires the {ref}`strict syntax <strict-syntax-page>` to be enabled (`NXF_SYNTAX_PARSER=v2`).
10+
:::
11+
12+
Process inputs and outputs can be defined using static types:
13+
14+
```nextflow
15+
process hello {
16+
input:
17+
message: String
18+
19+
output:
20+
file('hello.txt')
21+
22+
script:
23+
"""
24+
echo '${message}' > hello.txt
25+
"""
26+
}
27+
```
28+
29+
See {ref}`syntax-process-typed` for a full description of the process syntax. See {ref}`migrating-static-types` for more information on migrating existing code to static types.
30+
31+
## Inputs
32+
33+
The `input:` section is used to declare the inputs of a process. An input declaration in a typed process consists of a name and a type:
34+
35+
```nextflow
36+
process fastqc {
37+
input:
38+
(meta, fastq): Tuple<Map,Path>
39+
extra_args: String
40+
41+
script:
42+
"""
43+
echo 'meta: ${meta}'
44+
echo 'fastq: ${fastq}'
45+
echo 'extra_args: ${extra_args}'
46+
"""
47+
}
48+
```
49+
50+
Any of the {ref}`standard types <stdlib-types>` can be used as type annotations (except for `Channel` and `Value`, which can only be used in workflows).
51+
52+
### File inputs
53+
54+
Inputs of type `Path` or a collection of `Path` (e.g. `Set<Path>`) are automatically staged into the task directory.
55+
56+
By default, the task will fail if any input receives a `null` value. You can mark an input as nullable by appending `?` to the type annotation:
57+
58+
```nextflow
59+
process cat_opt {
60+
input:
61+
input: Path?
62+
63+
stage:
64+
stageAs 'input.txt', input
65+
66+
output:
67+
stdout()
68+
69+
script:
70+
'''
71+
[[ -f input.txt ]] && cat input.txt || echo 'empty input'
72+
'''
73+
}
74+
```
75+
76+
### Stage directives
77+
78+
The `stage:` section can be specified after the `input:` section. You can use it to specify custom staging behavior using *stage directives*. These directives serve the same purpose as input qualifiers such as `env` and `stdin` in the legacy syntax.
79+
80+
The `env` directive declares an environment variable in terms of task inputs:
81+
82+
```nextflow
83+
process echo_env {
84+
input:
85+
hello: String
86+
87+
stage:
88+
env 'HELLO', hello
89+
90+
script:
91+
'''
92+
echo "$HELLO world!"
93+
'''
94+
}
95+
```
96+
97+
The `stdin` directive defines the standard input of the task script:
98+
99+
```nextflow
100+
process cat {
101+
input:
102+
message: String
103+
104+
stage:
105+
stdin message
106+
107+
script:
108+
"""
109+
cat -
110+
"""
111+
}
112+
```
113+
114+
The `stageAs` directive stages an input file (or files) under a custom file pattern:
115+
116+
```nextflow
117+
process blast {
118+
input:
119+
fasta: Path
120+
121+
stage:
122+
stageAs 'query.fa', fasta
123+
124+
script:
125+
"""
126+
blastp -query query.fa -db nr
127+
"""
128+
}
129+
```
130+
131+
The file pattern can also reference task inputs:
132+
133+
```nextflow
134+
process grep {
135+
input:
136+
id: String
137+
fasta: Path
138+
139+
stage:
140+
stageAs "${id}.fa", fasta
141+
142+
script:
143+
"""
144+
cat ${id}.fa | grep '>'
145+
"""
146+
}
147+
```
148+
149+
See {ref}`process-reference-typed` for the set of available stage directives.
150+
151+
## Outputs
152+
153+
The `output:` section is used to declare the outputs of a typed process. An output declaration in a typed process consists of a name, an optional type, and an output value:
154+
155+
```nextflow
156+
process echo {
157+
input:
158+
message: String
159+
160+
output:
161+
out_env: String = env('MESSAGE')
162+
out_file: Path = file('message.txt')
163+
out_std: String = stdout()
164+
165+
script:
166+
"""
167+
export MESSAGE='${message}'
168+
169+
echo \$MESSAGE > message.txt
170+
171+
cat message.txt
172+
"""
173+
}
174+
```
175+
176+
When there is only one output, the name can be omitted:
177+
178+
```nextflow
179+
process echo {
180+
input:
181+
message: String
182+
183+
output:
184+
stdout()
185+
186+
script:
187+
"""
188+
echo '${message}'
189+
"""
190+
}
191+
```
192+
193+
See {ref}`process-reference-typed` for the set of available output functions.
194+
195+
### File outputs
196+
197+
You can use the `file()` and `files()` functions in the `output:` section to get a single file or collection of output files from the task directory.
198+
199+
By default, the `file()` function will fail if the specified file is not present in the task directory. You can specify `optional: true` to allow the file to be missing, in which case the `file()` function will return `null`. For example:
200+
201+
```nextflow
202+
process foo {
203+
output:
204+
file('output.txt', optional: true)
205+
206+
script:
207+
"""
208+
exit 0
209+
"""
210+
}
211+
```
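
The `files()` function can be sketched in the same way. The example below is a hedged illustration rather than part of the reference: the process name `split_lines` is hypothetical, and it assumes that `files()` accepts a glob pattern matched against the task directory:

```nextflow
// Hypothetical process: splits an input file into 100-line chunks.
process split_lines {
    input:
    fasta: Path

    output:
    // Assumption: files() takes a glob pattern and collects every match.
    files('chunk_*')

    script:
    """
    split -l 100 ${fasta} chunk_
    """
}
```

Here `split` writes `chunk_aa`, `chunk_ab`, and so on, and the single unnamed output collects all of them.
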
212+
213+
## Topics
214+
215+
The `topic:` section is used to emit values to a {ref}`topic channel <channel-topic>`. A topic emission consists of an output value and a topic name:
216+
217+
```nextflow
218+
process cat {
219+
input:
220+
message: Path
221+
222+
output:
223+
stdout()
224+
225+
topic:
226+
tuple('bash', eval('bash --version')) >> 'versions'
227+
tuple('cat', eval('cat --version')) >> 'versions'
228+
229+
script:
230+
"""
231+
cat ${message}
232+
"""
233+
}
234+
```
235+
236+
Topic emissions can use the same {ref}`output functions <process-reference-typed>` that are available in the `output:` section.
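
A workflow can then read from the topic with the `channel.topic()` factory. The following is a minimal sketch that assumes the `cat` process above is defined in the same script; the `*.txt` glob is a placeholder for your own input files:

```nextflow
workflow {
    // Placeholder input: any text files in the launch directory.
    cat( channel.fromPath('*.txt') )

    // Receive every value sent to the 'versions' topic by any process.
    channel.topic('versions').view()
}
```
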
237+
238+
## Script
239+
240+
The `script:` and `exec:` sections behave the same way as in {ref}`legacy processes <process-script>`.
241+
242+
## Stub
243+
244+
The `stub:` section behaves the same way as in {ref}`legacy processes <process-stub>`.
245+
246+
## Directives
247+
248+
Directives behave the same way as in {ref}`legacy processes <process-directives>`.
