-
Notifications
You must be signed in to change notification settings - Fork 137
synchronized_request._Request is_media sniff bug #187
Description
(see Conclusion for TL;DR)
Package version (pip): azure_cosmos-3.1.2-py3-none-any.whl
Observed error
When trying to upsert a document to a collection:
document = { 'id': str(datetime.datetime.utcnow()), 'key': 'example' }
collection_link = 'dbs/campaigns/colls/digitalmedia' # spoiler: note this collection link
options = { 'partitionKey': document['key'] }
client.UpsertItem(collection_link, document, options)I get an error thrown from inside azure.cosmos.cosmos_client at line 2880:
File "...\env-3.5.4\lib\site-packages\azure\cosmos\cosmos_client.py", line 2880, in _AddPartitionKey
partitionKeyDefinition = collection.get('partitionKey')
AttributeError: 'str' object has no attribute 'get'
This error had not been encountered in previous uses of the library.
Investigation
Upon printing the value of collection just above that line, I saw that it was a JSON string, instead of a dict, as the code must have expected.
So I followed the calls backwards: each time, the raw return value of the callee was returned:
cosmos_client.ReadContainer(self, collection_link, options=None)cosmos_client.Read(self, path, type, id, initial_headers, options=None)cosmos_client.__Get(self, path, request, headers)synchronized_request.SynchronizedRequest(client, request, global_endpoint_manager, connection_policy, requests_session, method, path, request_data, query_params, headers)synchronized_request._Request(global_endpoint_manager, request, connection_policy, requests_session, path, request_options, request_body)
At last, in the final synchronized_request._Request function, I see at the end:
if is_media:
result = data
else:
if len(data) > 0:
try:
result = json.loads(data)
except:
raise errors.JSONParseFailure(data)
return (result, headers)So this tells me that if is_media is set, the response is not parsed from JSON, but instead returned as-is. Injecting a little print(is_media) in there told me that the flag was indeed set, so I went up to see how it was set:
def _Request(global_endpoint_manager, request, connection_policy, requests_session, path, request_options, request_body):
"""Makes one http request using the requests module.
...
:return:
tuple of (result, headers)
:rtype:
tuple of (dict, dict)
"""
is_media = request_options['path'].find('media') > -1(note that the return type is said to be (dict, dict), but we now know that to not always be the case)
So is_media is sniffed from request_options['path']. Intrigued, I print(request_options['path']) only to be greeted by a familiar sight:
/dbs/campaigns/colls/digitalmedia/
Conclusion
If the database id, collection id or document id (I assume) contains the substring media, then the result of the API GET request in synchronized_request._Request will not be parsed as JSON, and the azure-cosmos library will fail further down the line when expecting the returned value to be a dict rather than a str.
The obvious workaround at this point would be to just use a container with a different name. I can do this during the testing phase, but we already expect it to be called what it is, so just changing the name is not really viable. Is there a safer way of determining whether a request is_media?
The second workaround is for us to fork this library, remove the is_media check and use our fork instead (whether by putting it locally in the project or hosting it somewhere for pip to pick up). This is more effort than is desirable, but may be necessary to not be affected by this bug.