Added non-git source puller functionality #194
Open: sean-morris wants to merge 45 commits into jupyterhub:main from sean-morris:non-git
Changes from 1 commit
Commits (45):
ea87f2b Command-line argument repo_dir is changed (sean-morris)
10385bb Added non-git source puller functionality (sean-morris)
ab80daf Added async functionality to non-git archives (sean-morris)
71ca2f4 Update nbgitpuller/plugin_helper.py (sean-morris)
ae66e53 Update nbgitpuller/hookspecs.py (sean-morris)
8934f5f renamed and simplified the test_files (sean-morris)
ac2072c added README to plugins (sean-morris)
a84096d added docstring to progress_loop function (sean-morris)
86fd7bf Update tests/test_download_puller.py (sean-morris)
c686651 Update tests/test_download_puller.py (sean-morris)
f8e04f1 Removed Downloader Plugins from Repo (sean-morris)
958b0b1 Added Custom Exception for Bad Provider (sean-morris)
2048e8d Merge branch 'main' of https://github.com/jupyterhub/nbgitpuller (sean-morris)
398a03f merged from master and fixed conflicts (sean-morris)
9a8fcab Removed unused import from test file (sean-morris)
78e31c3 Added packages to dev-requirements.txt (sean-morris)
a131b93 Moved the two constants and REPO_PARENT_DIR out of __init__.py (sean-morris)
55da5e1 Revert some trivial formatting changes (consideRatio)
0ca6cf9 Apply suggestions from code review (sean-morris)
9e808e5 Changes from code review (sean-morris)
8d63ee4 Apply suggestions from code review (sean-morris)
deecc7b Removed setTerminalVisibility from automatically opening in UI (sean-morris)
a9e08c4 Reverted a mistaken change to command-line args (sean-morris)
09c9249 Hookspecs renamed and documented (sean-morris)
0085fab Hookspecs name and seperate helper_args (sean-morris)
88ec806 Renamed for clarity (sean-morris)
8592d1f Seperated actual query_line_args from helper_args (sean-morris)
21d8f0f fixed conflicts (sean-morris)
ab5dd10 Fixed tests (sean-morris)
e8ae5ca Removed changes not meant to merged (sean-morris)
56ad1ee Apply suggestions from code review (sean-morris)
af567ca Refactored docstrings (sean-morris)
782a35b Refactored docstrings (sean-morris)
d034d37 Merge branch 'non-git' of https://github.com/sean-morris/nbgitpuller … (sean-morris)
9729464 Fix temp download dir to use the package tempfile (sean-morris)
602ef01 provider is now contentProvider in the html/js/query parameters (sean-morris)
3ebdc7e The download_func and download_func_params brought in separately (sean-morris)
e22d076 Moved the handle_files_helper in Class (sean-morris)
3b14405 Moved downloader-plugin util to own repo (sean-morris)
613f863 Moved downloader-plugin util to own repo (sean-morris)
5f39c68 Merge branch 'non-git' of https://github.com/sean-morris/nbgitpuller … (sean-morris)
f618560 Removed nested_asyncio from init.py (sean-morris)
367f3c7 Moved downloader-plugin handling to puller thread (sean-morris)
8893970 Moved downloader plugins handling to pull.py (sean-morris)
7590c38 Access downloader-plugin results from plugin instance variable (sean-morris)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -1,5 +1,6 @@ | ||
| include *.md | ||
| include LICENSE | ||
| include setup.cfg | ||
| recursive-include nbgitpuller/plugins * | ||
| recursive-include nbgitpuller/static * | ||
| recursive-include nbgitpuller/templates * | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,21 @@ | ||
| import pluggy | ||
| | ||
| hookspec = pluggy.HookspecMarker("nbgitpuller") | ||
| | ||
| | ||
| @hookspec | ||
| def handle_files(self, repo, repo_parent_dir): | ||
| """ | ||
| :param str repo: download url to source | ||
| :param str repo_parent_dir: where we will store the downloaded repo | ||
| :return two parameter json unzip_dir and origin_repo_path | ||
| :rtype json object | ||
| This handles the downloading of non-git source | ||
| files into the user directory. Once downloaded, | ||
| the files are merged into a local git repository. | ||
| Once the local git repository is updated (or created | ||
| the first time), git puller can then handle this | ||
| directory as it would sources coming from a | ||
| git repository. | ||
| """ | ||
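
For reviewers less familiar with pluggy, here is a minimal sketch of how a hookspec like the one above might be registered and called. `DownloaderSpec`, `FakeDownloader`, `call_handle_files`, and the example URL and paths are hypothetical names used only for illustration; they are not part of this PR, and the real plugin wiring may differ.

```python
import pluggy

hookspec = pluggy.HookspecMarker("nbgitpuller")
hookimpl = pluggy.HookimplMarker("nbgitpuller")


class DownloaderSpec:
    """Hypothetical spec class mirroring the handle_files hookspec."""

    @hookspec
    def handle_files(self, repo, repo_parent_dir):
        """Download a non-git source and describe where it landed."""


class FakeDownloader:
    """Hypothetical plugin; returns a canned result instead of downloading."""

    @hookimpl
    def handle_files(self, repo, repo_parent_dir):
        return {
            "unzip_dir": "demo",
            "origin_repo_path": repo_parent_dir + "/.origin_non_git_sources",
        }


def call_handle_files(repo, repo_parent_dir):
    manager = pluggy.PluginManager("nbgitpuller")
    manager.add_hookspecs(DownloaderSpec)
    manager.register(FakeDownloader())
    # Hooks must be called with keyword arguments. Without firstresult=True,
    # pluggy returns a list with one entry per registered implementation.
    return manager.hook.handle_files(repo=repo, repo_parent_dir=repo_parent_dir)


results = call_handle_files("https://example.org/materials.zip", "/tmp/course")
print(results)
```

Because the call returns a list, a caller that expects a single dict would need to pick an entry (or the spec would need `firstresult=True`).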
Empty file.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,116 @@ | ||
| import subprocess | ||
| import os | ||
| import logging | ||
| import requests | ||
| from requests_file import FileAdapter | ||
| import shutil | ||
| import re | ||
| | ||
| | ||
| # for large files from Google Drive | ||
| def get_confirm_token(response): | ||
| for key, value in response.cookies.items(): | ||
| if key.startswith('download_warning'): | ||
| return value | ||
| return None | ||
| | ||
| | ||
| # sets up a local repo that acts like a remote | ||
| def initialize_local_repo(local_repo_path): | ||
| logging.info(f"Creating local_repo_path: {local_repo_path}") | ||
| os.makedirs(local_repo_path, exist_ok=True) | ||
| | ||
| subprocess.check_output(["git", "init", "--bare"], cwd=local_repo_path) | ||
| | ||
| | ||
| # local repo cloned from the "remote" which is in user drive | ||
| def clone_local_origin_repo(origin_repo_path, temp_download_repo): | ||
| logging.info(f"Creating temp_download_repo: {temp_download_repo}") | ||
| os.makedirs(temp_download_repo, exist_ok=True) | ||
| | ||
| cmd = ["git", "clone", f"file://{origin_repo_path}", temp_download_repo] | ||
| subprocess.check_output(cmd, cwd=temp_download_repo) | ||
| | ||
| | ||
| # this is needed to unarchive various formats (e.g. zip, tgz, etc.) | ||
| def determine_file_extension(url, response): | ||
| file_type = response.headers.get('content-type') | ||
| content_disposition = response.headers.get('content-disposition') | ||
| ext = None | ||
| if content_disposition: | ||
| fname = re.findall("filename\\*?=([^;]+)", content_disposition) | ||
| fname = fname[0].strip().strip('"') | ||
| ext = fname.split(".")[1] | ||
| elif file_type and "/zip" in file_type: | ||
| ext = "zip" | ||
| else: | ||
| url = url.split("/")[-1] | ||
| if "?" in url: | ||
| url = url[0:url.find('?')] | ||
| if "." in url: | ||
| ext = url.split(".")[1] | ||
| | ||
| if not ext: | ||
| m = f"Could not determine the file extension for unarchiving: {url}" | ||
| raise Exception(m) | ||
| return ext | ||
| | ||
| | ||
| # the downloaded content is in the response -- unarchive and save to the disk | ||
| def save_response_content(url, response, temp_download_repo): | ||
| try: | ||
| ext = determine_file_extension(url, response) | ||
| CHUNK_SIZE = 32768 | ||
| temp_download_file = f"{temp_download_repo}/download.{ext}" | ||
| with open(temp_download_file, "wb") as f: | ||
| for chunk in response.iter_content(CHUNK_SIZE): | ||
| # filter out keep-alive new chunks | ||
| if chunk: | ||
| f.write(chunk) | ||
| | ||
| shutil.unpack_archive(temp_download_file, temp_download_repo) | ||
| | ||
| os.remove(temp_download_file) | ||
| except Exception as e: | ||
| m = f"Problem handling file download: {str(e)}" | ||
| raise Exception(m) | ||
| | ||
| | ||
| # grab archive file from url | ||
| def fetch_files(url, id=-1): | ||
| session = requests.Session() | ||
| session.mount('file://', FileAdapter()) # add adapter for pytests | ||
| response = session.get(url, params={'id': id}, stream=True) | ||
| token = get_confirm_token(response) | ||
| if token: | ||
| params = {'id': id, 'confirm': token} | ||
| response = session.get(url, params=params, stream=True) | ||
| | ||
| return response | ||
| | ||
| | ||
| # this drives the file handling -- called from zip_puller by all the | ||
| # handle_files implementations for GoogleDrive, Dropbox, and standard | ||
| # Web url | ||
| def handle_files_helper(args): | ||
| try: | ||
| origin_repo = args["repo_parent_dir"] + args["origin_dir"] | ||
| temp_download_repo = args["repo_parent_dir"] + args["download_dir"] | ||
| if os.path.exists(temp_download_repo): | ||
| shutil.rmtree(temp_download_repo) | ||
| | ||
| if not os.path.exists(origin_repo): | ||
| initialize_local_repo(origin_repo) | ||
| | ||
| clone_local_origin_repo(origin_repo, temp_download_repo) | ||
| save_response_content(args["repo"], args["response"], temp_download_repo) | ||
| subprocess.check_output(["git", "add", "."], cwd=temp_download_repo) | ||
| subprocess.check_output(["git", "-c", "user.email=nbgitpuller@nbgitpuller.link", "-c", "user.name=nbgitpuller", "commit", "-m", "test", "--allow-empty"], cwd=temp_download_repo) | ||
| subprocess.check_output(["git", "push", "origin", "master"], cwd=temp_download_repo) | ||
| unzipped_dirs = os.listdir(temp_download_repo) | ||
| | ||
| dir_names = list(filter(lambda dir: ".git" not in dir, unzipped_dirs)) | ||
| return {"unzip_dir": dir_names[0], "origin_repo_path": origin_repo} | ||
| except Exception as e: | ||
| logging.exception(e) | ||
| raise ValueError(e) | ||
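
One observation on `determine_file_extension` in this diff: it takes the second dot-separated piece of the filename (`split(".")[1]`), so a name such as `my.notes.zip` would come back as `notes`. A possible alternative, sketched here with only the standard library, is to match the filename against the suffixes that `shutil.unpack_archive` already recognises. `guess_archive_extension` is a hypothetical name and not part of the PR; treat this as a sketch, not a drop-in replacement.

```python
import shutil
from pathlib import PurePosixPath
from urllib.parse import urlparse


def guess_archive_extension(name_or_url):
    """Return the longest archive suffix shutil can unpack, or None."""
    # urlparse().path drops any ?query or #fragment; a bare filename passes through.
    filename = PurePosixPath(urlparse(name_or_url).path).name.lower()
    # Each registered format is a (name, extensions, description) tuple.
    known = [ext for _, extensions, _ in shutil.get_unpack_formats() for ext in extensions]
    # Try longer suffixes first so ".tar.gz" wins over a shorter match.
    for ext in sorted(known, key=len, reverse=True):
        if filename.endswith(ext):
            return ext.lstrip(".")
    return None


print(guess_archive_extension("my.notes.zip"))
print(guess_archive_extension("https://example.org/files/course.tar.gz?dl=1"))
```

The same function could be applied to the `Content-Disposition` filename first and to the URL as a fallback, which is the order the PR code already uses.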
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,79 @@ | ||
| from .plugin_helper import fetch_files | ||
| from .plugin_helper import handle_files_helper | ||
| import pluggy | ||
| | ||
| hookimpl = pluggy.HookimplMarker("nbgitpuller") | ||
| TEMP_DOWNLOAD_REPO_DIR = ".temp_download_repo" | ||
| CACHED_ORIGIN_NON_GIT_REPO = ".origin_non_git_sources" | ||
| | ||
| | ||
| # handles standard web addresses (not google drive or dropbox) | ||
| class ZipSourceWebDownloader(object): | ||
| @hookimpl | ||
| def handle_files(self, repo, repo_parent_dir): | ||
| """ | ||
| :param str repo: publicly accessible url to compressed source files | ||
| :param str repo_parent_dir: where we will store the downloaded repo | ||
| :return two parameter json unzip_dir and origin_repo_path | ||
| :rtype json object | ||
| """ | ||
| response = fetch_files(repo) | ||
| args = { | ||
| "repo": repo, | ||
| "repo_parent_dir": repo_parent_dir, | ||
| "response": response, | ||
| "origin_dir": CACHED_ORIGIN_NON_GIT_REPO, | ||
| "download_dir": TEMP_DOWNLOAD_REPO_DIR | ||
| } | ||
| return handle_files_helper(args) | ||
| | ||
| | ||
| # handles downloads from google drive | ||
| class ZipSourceGoogleDriveDownloader(object): | ||
| def __init__(self): | ||
| self.DOWNLOAD_URL = "https://docs.google.com/uc?export=download" | ||
| | ||
| def get_id(self, repo): | ||
| start_id_index = repo.index("d/") + 2 | ||
| end_id_index = repo.index("/view") | ||
| return repo[start_id_index:end_id_index] | ||
| | ||
| @hookimpl | ||
| def handle_files(self, repo, repo_parent_dir): | ||
| """ | ||
| :param str repo: google drive share link to compressed source files | ||
| :param str repo_parent_dir: where we will store the downloaded repo | ||
| :return two parameter json unzip_dir and origin_repo_path | ||
| :rtype json object | ||
| """ | ||
| response = fetch_files(self.DOWNLOAD_URL, self.get_id(repo)) | ||
| args = { | ||
| "repo": repo, | ||
| "repo_parent_dir": repo_parent_dir, | ||
| "response": response, | ||
| "origin_dir": CACHED_ORIGIN_NON_GIT_REPO, | ||
| "download_dir": TEMP_DOWNLOAD_REPO_DIR | ||
| } | ||
| return handle_files_helper(args) | ||
| | ||
| | ||
| # handles downloads from DropBox | ||
| class ZipSourceDropBoxDownloader(object): | ||
| @hookimpl | ||
| def handle_files(self, repo, repo_parent_dir): | ||
| """ | ||
| :param str repo: dropbox download link to compressed source files | ||
| :param str repo_parent_dir: where we will store the downloaded repo | ||
| :return two parameter json unzip_dir and origin_repo_path | ||
| :rtype json object | ||
| """ | ||
| repo = repo.replace("dl=0", "dl=1") # download set to 1 for dropbox | ||
| response = fetch_files(repo) | ||
| args = { | ||
| "repo": repo, | ||
| "repo_parent_dir": repo_parent_dir, | ||
| "response": response, | ||
| "origin_dir": CACHED_ORIGIN_NON_GIT_REPO, | ||
| "download_dir": TEMP_DOWNLOAD_REPO_DIR | ||
| } | ||
| return handle_files_helper(args) | ||
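
Both `get_id` and the `dl=0` replacement above depend on the share link having an exact shape: `repo.index(...)` raises `ValueError` when `/view` is missing, and `replace("dl=0", "dl=1")` silently does nothing when the link carries no `dl` parameter. A slightly more tolerant approach might look like the sketch below. The function names are hypothetical, and the link formats are assumptions inferred from the code in this diff rather than from any provider documentation.

```python
import re
from urllib.parse import parse_qsl, urlencode, urlparse, urlunparse


def google_drive_file_id(share_link):
    """Extract the id segment that follows /d/ in a share link."""
    match = re.search(r"/d/([^/?#]+)", share_link)
    if match is None:
        raise ValueError(f"No file id found in Google Drive link: {share_link}")
    return match.group(1)


def dropbox_direct_link(share_link):
    """Force dl=1 in the query string, adding the parameter if it is absent."""
    parts = urlparse(share_link)
    query = dict(parse_qsl(parts.query, keep_blank_values=True))
    query["dl"] = "1"
    return urlunparse(parts._replace(query=urlencode(query)))


print(google_drive_file_id("https://drive.google.com/file/d/abc123_XY/view?usp=sharing"))
print(dropbox_direct_link("https://www.dropbox.com/s/xyz/files.zip?dl=0"))
```

Parsing the query string rather than substituting text also avoids rewriting a `dl=0` that happens to appear elsewhere in the URL.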