Commit 122ca9d

Fix documentation for skip_data flag (#70)
1 parent 59a1d9c commit 122ca9d

4 files changed: +49 -22 lines changed

api-reference/python/tilebox.datasets/Collection.find.mdx

Lines changed: 12 additions & 2 deletions
@@ -19,7 +19,7 @@ Find a specific datapoint in a collection by its id.
 </ParamField>

 <ParamField path="skip_data" type="bool">
-Whether to skip loading the data for the datapoint. If `True`, only the metadata for the datapoint is loaded.
+If `True`, the response contains only the ID and the timestamp for the datapoint. Defaults to `False`.
 </ParamField>

 ## Returns
@@ -38,7 +38,17 @@ Since it returns only a single data point, the output xarray dataset does not in
 ```python Python
 data = collection.find(
     "0186d6b6-66cc-fcfd-91df-bbbff72499c3",
-    skip_data = False,
 )
+
+
+# check if a datapoint exists
+try:
+    collection.find(
+        "0186d6b6-66cc-fcfd-91df-bbbff72499c3",
+        skip_data=True,
+    )
+    exists = True
+except NotFoundError:
+    exists = False
 ```
 </RequestExample>
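
The existence check added in this hunk could be wrapped in a small helper. The following is a minimal sketch, not part of the commit. It assumes `find` raises an exception named `NotFoundError`, as the example above suggests; the diff does not show where that exception is imported from, so a stand-in class is defined here. `datapoint_exists` and `FakeCollection` are hypothetical names used only so the snippet runs without a Tilebox client.

```python
class NotFoundError(Exception):
    """Stand-in for the SDK's not-found exception (name assumed from the example above)."""


def datapoint_exists(collection, datapoint_id: str) -> bool:
    """Return True if `collection.find` locates the datapoint, False otherwise."""
    try:
        # skip_data=True requests only the ID and timestamp, not the data fields
        collection.find(datapoint_id, skip_data=True)
    except NotFoundError:
        return False
    return True


class FakeCollection:
    """Hypothetical in-memory stand-in for a collection, for illustration only."""

    def __init__(self, known_ids):
        self._known_ids = set(known_ids)

    def find(self, datapoint_id, skip_data=False):
        if datapoint_id not in self._known_ids:
            raise NotFoundError(datapoint_id)
        return {"id": datapoint_id}


collection = FakeCollection(["0186d6b6-66cc-fcfd-91df-bbbff72499c3"])
print(datapoint_exists(collection, "0186d6b6-66cc-fcfd-91df-bbbff72499c3"))  # True
print(datapoint_exists(collection, "00000000-0000-0000-0000-000000000000"))  # False
```

With a real collection, only `datapoint_exists` would be kept, and `NotFoundError` would come from the SDK instead of the stand-in class.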

api-reference/python/tilebox.datasets/Collection.query.mdx

Lines changed: 14 additions & 6 deletions
@@ -32,7 +32,7 @@ If no data exists for the requested time or interval, an empty `xarray.Dataset`
 </ParamField>

 <ParamField path="skip_data" type="bool">
-If `True`, the response contains only the [required fields for the dataset type](/datasets/types/timeseries) without the actual dataset-specific fields. Defaults to `False`.
+If `True`, the response contains only the ID and the timestamp for each datapoint. Defaults to `False`.
 </ParamField>

 <ParamField path="show_progress" type="bool">
@@ -54,7 +54,13 @@ data = collection.query(temporal_extent=time)

 # querying a time interval
 interval = ("2023-05-01", "2023-08-01")
-data = collection.query(temporal_extent=interval, show_progress=True)
+data = collection.query(temporal_extent=interval)
+
+# displaying a progress bar while querying
+data = collection.query(
+    temporal_extent=interval,
+    show_progress=True,
+)

 # querying a time interval with TimeInterval
 interval = TimeInterval(
@@ -63,11 +69,13 @@ interval = TimeInterval(
     start_exclusive=False,
     end_inclusive=False,
 )
-data = collection.query(temporal_extent=interval, show_progress=True)
+data = collection.query(temporal_extent=interval)

 # querying with an iterable
-meta_data = collection.query(temporal_extent=..., skip_data=True)
-first_50 = collection.query(temporal_extent=meta_data.time[:50], skip_data=False)
+datapoints = collection.query(
+    temporal_extent=interval,
+    skip_data=True,  # only fetch datapoint IDs and time
+)
+first_50 = collection.query(temporal_extent=datapoints.time[:50])
 ```
 </RequestExample>
-
datasets/delete.mdx

Lines changed: 4 additions & 4 deletions
@@ -89,15 +89,15 @@ Deleted 2 data points.

 ## Deleting a time interval

-One common way to delete data is to first load it from a collection and then forward it to the `delete` method. For
-this use case it often is a good idea to query the datapoints with `skip_data=True` to avoid loading the data fields,
-since you only need the datapoint IDs. See [fetching only metadata](/datasets/query#fetching-only-metadata) for more details.
+One common way to delete all datapoints in a time interval is to first query them from a collection and then delete the
+datapoints found. For this use case, it is often a good idea to query the datapoints with `skip_data=True` to avoid
+loading the data fields, since only the datapoint IDs are required. See [skipping data fields](/datasets/query#skipping-data-fields) for more details.

 <CodeGroup>
 ```python Python
 to_delete = collection.query(temporal_extent=("2023-05-01", "2023-06-01"), skip_data=True)

-n_deleted = collection.delete(datapoints)
+n_deleted = collection.delete(to_delete)
 print(f"Deleted {n_deleted} data points.")
 ```
 ```go Go

datasets/query.mdx

Lines changed: 19 additions & 10 deletions
@@ -274,9 +274,9 @@ You can specify a time interval by using an iterable of `TimeScalar`s as the `te
 <CodeGroup>
 ```python Python
 interval = ("2017-01-01", "2023-01-01")
-meta_data = collection.query(temporal_extent=interval, skip_data=True)
+found_datapoints = collection.query(temporal_extent=interval, skip_data=True)

-first_50_data_points = collection.query(temporal_extent=meta_data.time[:50], skip_data=False)
+first_50_data_points = collection.query(temporal_extent=found_datapoints.time[:50])
 print(first_50_data_points)
 ```
 </CodeGroup>
@@ -423,19 +423,23 @@ if err != nil {
 ```
 </CodeGroup>

-## Fetching only metadata
+## Skipping data fields

-Sometimes, it may be useful to load only dataset metadata fields without the actual data fields. This can be done by setting the `skip_data` parameter to `True`.
-For example, when only checking if a datapoint exists, you may want to use `skip_data=True` to avoid loading the data fields.
-If this flag is set, the response will only include the required fields for the given dataset type, but no custom data fields.
+Sometimes, only the ID or timestamp associated with a datapoint is required. In this case, loading the full data fields for each datapoint is not necessary and can be avoided by
+setting the `skip_data` parameter to `True`.
+
+For example, when only checking how many datapoints exist in a given time interval, you can use `skip_data=True` to avoid loading the data fields.

 <CodeGroup>
 ```python Python
-data = collection.query(temporal_extent="2024-08-01 00:00:01.362", skip_data=True)
-print(data)
+interval = ("2023-01-01", "2023-02-01")
+data = collection.query(temporal_extent=interval, skip_data=True)
+print(f"Found {data.sizes['time']} data points.")
 ```
 ```go Go
-temporalExtent := query.NewPointInTime(time.Date(2024, time.August, 1, 0, 0, 1, 362000000, time.UTC))
+startDate := time.Date(2023, time.January, 1, 0, 0, 0, 0, time.UTC)
+endDate := time.Date(2023, time.February, 1, 0, 0, 0, 0, time.UTC)
+interval := query.NewTimeInterval(startDate, endDate)

 var datapoints []*v1.Sentinel1Sar
 err = client.Datapoints.QueryInto(ctx,
@@ -592,10 +596,15 @@ Data variables: (12/30)
 </CodeGroup>

 <Tip>
-You can also set the `skip_data` parameter when calling `find` to query only the required fields of the data point, same as for `load`.
+You can also set the `skip_data` parameter when calling `find` to query only the required fields of the data point, same as for `query`.
 </Tip>

 ## Automatic pagination

 Querying large time intervals can return a large number of data points.
 Tilebox automatically handles pagination for you by sending paginated requests to the server.
+
+<Tip>
+When using the Python SDK in an interactive notebook environment, you can also display a
+progress bar to keep track of the progress of the query by setting the `show_progress` parameter to `True`.
+</Tip>
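
For readers unfamiliar with the pattern, client-side pagination in general can be sketched as below. This is a generic illustration and not Tilebox's implementation: `iter_all`, `fetch_page`, the string page token, and the fake in-memory server are all hypothetical and do not correspond to any SDK call.

```python
from typing import Callable, Iterator, List, Optional, Tuple

# One page of results: the items plus a token for the next page (None on the last page)
Page = Tuple[List[int], Optional[str]]


def iter_all(fetch_page: Callable[[Optional[str]], Page]) -> Iterator[int]:
    """Yield every item by requesting pages until no next-page token is returned."""
    token: Optional[str] = None
    while True:
        items, token = fetch_page(token)
        yield from items
        if token is None:
            break


# Hypothetical fake server holding 7 items and serving 3 per page
DATA = list(range(7))
PAGE_SIZE = 3


def fake_fetch_page(token: Optional[str]) -> Page:
    start = int(token) if token is not None else 0
    end = start + PAGE_SIZE
    next_token = str(end) if end < len(DATA) else None
    return DATA[start:end], next_token


print(list(iter_all(fake_fetch_page)))  # [0, 1, 2, 3, 4, 5, 6]
```

The caller sees a single stream of results while several requests happen underneath, which is also the point at which a progress indicator can be updated.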
