-
Notifications
You must be signed in to change notification settings - Fork 88
[SVCS-528] [Blocked] Improve gdoc rendering - WB Part #304
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: develop
Are you sure you want to change the base?
[SVCS-528] [Blocked] Improve gdoc rendering - WB Part #304
Conversation
| else: | ||
|
|
||
| # TODO figure out metadata.raw.get('downloadUrl') | ||
| download_url = metadata.raw.get('downloadUrl') or drive_utils.get_export_link(metadata.raw) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This assumes that metadata.raw.get('downloadUrl') is always None. Why?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My understanding of this line is that downloadUrl is always present unless it's a file that that isn't exportable in it's native type, so a .txt will have a downloadUrl, but a .gdoc won't. Google files have frustratingly nonstandard metadata.
| metadata = await self.metadata(path, revision=revision) | ||
|
|
||
| if kwargs.get('mfr', None) and kwargs['mfr'].lower() == 'true': | ||
| download_url = drive_utils.get_alt_export_link(metadata.raw) # type: ignore |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
download_url = metadata.raw.get('downloadUrl') or drive_utils.get_alt_export_link(metadata.raw) 4768410 to
38b4e1a
Compare
We can export .gdoc as a pdf if we know its coming from mfr. Added `alt_export` functions and properties to Gdrive utils and metadata. Gdrive download now looks for 'mfr' in its kwargs and will request file as pdf if used properly.
raises key error and update gdrive utils to use `dict.get()` instead of `dict[]`.
38b4e1a to
a7c70ef
Compare
| metadata = await self.metadata(path, revision=revision) | ||
|
|
||
| if kwargs.get('mfr', None) and kwargs['mfr'].lower() == 'true': | ||
| download_url = metadata.raw.get('downloadUrl') or drive_utils.get_alt_export_link(metadata.raw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FYI, fixed. Need to check metadata.raw.get('downloadUrl') first. There are file types on gdrive (e.g. .txt) that do not have export_links.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about
if kwargs.get('mfr', '').lower() == 'true':
Also I don't get the or here.
| metadata = await self.metadata(path, revision=revision) | ||
|
|
||
| if kwargs.get('mfr', None) and kwargs['mfr'].lower() == 'true': | ||
| download_url = metadata.raw.get('downloadUrl') or drive_utils.get_alt_export_link(metadata.raw) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what about
if kwargs.get('mfr', '').lower() == 'true':
Also I don't get the or here.
| if format['mime_type'] == metadata['mimeType']: | ||
| return format | ||
| for format_type in DOCS_FORMATS: | ||
| if format_type.get('mime_type') == metadata.get('mimeType'): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
metadata.get('mimeType') should be outside the for loop so it doesn't need to be accessed each iteration. Given that this data structure is tiny so I doubt there will be any perf. difference.
| format = get_format(metadata) | ||
| return metadata['exportLinks'][format['type']] | ||
| format_type = get_format(metadata) | ||
| return metadata.get('exportLinks').get(format_type.get('type')) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if metadata doesn't have an exportLinks?
|
I think some of the stuff here may be better passed as a query param that waterbutler passes through to gdrive transparently, this pr currently puts code that deals with exporting into mfr, I'd like to look at if theres a way to put that logic in mfr |
Ticket
https://openscience.atlassian.net/browse/SVCS-528
MFR counterpart: CenterForOpenScience/modular-file-renderer#303
Purpose
This PR replaces #290. Credit goes to @AddisonSchiller 🎆 🎆 .
Allow MFR to request
.gdocfiles in.pdfformat while keep.docxformat for downloading. This will allow MFR to cut out the expensive unoconv conversion for .gdoc filesWe can export .gdoc as a pdf if we know its coming from mfr. Added
alt_exportfunctions and properties to Gdrive utils and metadata. Gdrive download now looks for 'mfr' in its kwargs and will request file as pdf if used properly.Changes
WB V1 provider now looks for a
mfr=query param. Ifmfr=true, the google drive provider will usealt_export_nameandalt_export_link()instead of its normal counterparts. This will cause the file being sent to be a.pdfinstead of.docx.Side effects
None
QA Notes
Test that
.gdocrendering works as usual.Deployment Notes
This PR must be merged/deployed at the same time as its MFR counterpart.
This PR should be blocked by @TomBaxter 's v3 update: #276.