Commit 060175d

Save local key-value store records with the correct extension (#91)
There was a bug where files in the local key-value store were saved with the wrong extension, e.g. `Actor.set_value('image.png', data)` would save the record as `image.png.bin`.

This fixes it by making the filename generation more sensible:

- the `.bin` extension won't be added anymore if the content type is `application/octet-stream`
- the record metadata is no longer saved as `{record_key}.__metadata__.json`, but as `{record_filename}.__metadata__.json`
  - this makes it a lot easier to find the right metadata for a given file when opening the store
- the metadata no longer contains the `extension` attribute, just the key
- the internal record representation now contains the `filename` attribute pointing to the filename under which the record is saved

I think the logic is a bit more readable this way, and it works better. I also wrote a lot of tests to check that it works correctly. They're a bit hacky, but good enough.

Fixes #85.
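The naming rules described above can be sketched as follows. This is a minimal illustration based only on the commit message, not the code from this commit: `record_filename`, `metadata_filename`, and the `EXTENSIONS_BY_CONTENT_TYPE` table are hypothetical names, and leaving the key unchanged for unknown content types is an assumption.

```python
from typing import Optional

# Illustrative mapping only (assumption); the real extension lookup is not shown here.
EXTENSIONS_BY_CONTENT_TYPE = {
    'application/json': 'json',
    'text/plain': 'txt',
    'image/png': 'png',
}


def record_filename(key: str, content_type: Optional[str]) -> str:
    """Hypothetical helper: derive the on-disk filename for a record."""
    # Drop parameters such as "; charset=utf-8" before the lookup.
    mime_type = (content_type or '').split(';')[0].strip().lower()
    extension = EXTENSIONS_BY_CONTENT_TYPE.get(mime_type)
    # No '.bin' for application/octet-stream; unknown types are also
    # left without an extension (assumption).
    if mime_type == 'application/octet-stream' or extension is None:
        return key
    return f'{key}.{extension}'


def metadata_filename(filename: str) -> str:
    """Metadata sits next to the record, named after the record's filename."""
    return f'{filename}.__metadata__.json'


filename = record_filename('image.png', 'application/octet-stream')
print(filename)                     # image.png
print(metadata_filename(filename))  # image.png.__metadata__.json
print(record_filename('OUTPUT', 'application/json; charset=utf-8'))  # OUTPUT.json
```

Deriving the metadata name from the record filename rather than from the key means the two files always share a prefix on disk, which is the property the commit message calls out.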
1 parent 7826a38 commit 060175d

4 files changed: +305 -162 lines changed

CHANGELOG.md

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ Changelog
 ### Fixed
 
 - started enforcing local storage to always use the UTF-8 encoding
+- fixed saving key-value store values to local storage with the right extension for a given content type
 
 [1.0.0](../../releases/tag/v1.0.0) - 2022-03-13
 -----------------------------------------------

src/apify/_memory_storage/file_storage_utils.py

Lines changed: 0 additions & 39 deletions
@@ -41,45 +41,6 @@ async def _update_dataset_items(
             await f.write(_json_dumps(item).encode('utf-8'))
 
 
-async def _set_or_delete_key_value_store_record(
-    *,
-    entity_directory: str,
-    persist_storage: bool,
-    record: Dict,
-    should_set: bool,
-    write_metadata: bool,
-) -> None:
-    # Skip writing files to the disk if the client has the option set to false
-    if not persist_storage:
-        return
-
-    # Ensure the directory for the entity exists
-    await makedirs(entity_directory, exist_ok=True)
-
-    # Create files for the record
-    record_path = os.path.join(entity_directory, f"""{record['key']}.{record['extension']}""")
-    record_metadata_path = os.path.join(entity_directory, f"""{record['key']}.__metadata__.json""")
-
-    await _force_remove(record_path)
-    await _force_remove(record_metadata_path)
-
-    if should_set:
-        if write_metadata:
-            async with aiofiles.open(record_metadata_path, mode='wb') as f:
-                await f.write(_json_dumps({
-                    'key': record['key'],
-                    'contentType': record.get('content_type') or 'unknown/no content type',
-                    'extension': record['extension'],
-                }).encode('utf-8'))
-
-        # Convert to bytes if string
-        if isinstance(record['value'], str):
-            record['value'] = record['value'].encode('utf-8')
-
-        async with aiofiles.open(record_path, mode='wb') as f:
-            await f.write(record['value'])
-
-
 async def _update_request_queue_item(
     *,
     request_id: str,
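
The path construction in the removed function can be replayed in isolation to see where `image.png.bin` came from. This is a minimal sketch: `old_record_paths` is a hypothetical helper that copies only the two f-strings from the deleted code, and the `'bin'` extension value is assumed to be what the old logic assigned to `application/octet-stream` content.

```python
import os
from typing import Dict, Tuple


def old_record_paths(entity_directory: str, record: Dict) -> Tuple[str, str]:
    """Hypothetical helper reproducing the removed path construction."""
    # The extension was always appended to the key, even if the key already had one.
    record_path = os.path.join(entity_directory, f"{record['key']}.{record['extension']}")
    # The metadata file was named after the key, not after the record file.
    record_metadata_path = os.path.join(entity_directory, f"{record['key']}.__metadata__.json")
    return record_path, record_metadata_path


# Assumed record shape for binary data stored under the key 'image.png'.
record = {'key': 'image.png', 'extension': 'bin'}
record_path, metadata_path = old_record_paths('key_value_store', record)

print(os.path.basename(record_path))    # image.png.bin
print(os.path.basename(metadata_path))  # image.png.__metadata__.json
```

The record and its metadata end up with different prefixes (`image.png.bin` versus `image.png`), which is the mismatch the commit message says the new `{record_filename}.__metadata__.json` naming removes.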