Description
The script get-all-data.sh downloads most of the data files successfully but fails to download the following files due to 404 errors:
data/nmt/eng-fra.txt
data/nmt/simplest_eng_fra.csv
data/yelp/raw_train.csv
Example Error Content
In the case of raw_train.csv, the downloaded file contains the following HTML, indicating a "404 Not Found" error:
<html lang="en" dir="ltr">
<meta charset="utf-8">
<meta name="viewport" content="initial-scale=1, minimum-scale=1, width=device-width">
<title>Error 404 (Not Found)!!1</title>
<style>
/* Truncated for brevity */
</style>
<main>
<a href="//www.google.com">
<span id="logo" aria-label="Google" role="img"></span>
</a>
<p><b>404.</b> <ins>That’s an error.</ins></p>
<p>The requested URL was not found on this server. <ins>That’s all we know.</ins></p>
</main>
</html>
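Because the script still writes a file to disk when the server answers with an error page, the failure is easy to miss. A minimal sketch of a check for this, assuming the error payload always begins with an HTML tag as in the example above (the function name `looks_like_html_error` and the 2048-byte probe size are illustrative choices, not part of the repository):

```python
from pathlib import Path

# Byte prefixes suggesting the payload is an HTML page rather than a data file.
HTML_MARKERS = (b"<html", b"<!doctype html")


def looks_like_html_error(path, probe_bytes=2048):
    """Return True if the start of the file looks like an HTML page."""
    with open(path, "rb") as handle:
        head = handle.read(probe_bytes)
    # Ignore leading whitespace and letter case before comparing prefixes.
    return head.lstrip().lower().startswith(HTML_MARKERS)


if __name__ == "__main__":
    # Paths taken from the list of failing files in this report.
    failing = (
        "data/nmt/eng-fra.txt",
        "data/nmt/simplest_eng_fra.csv",
        "data/yelp/raw_train.csv",
    )
    for name in failing:
        target = Path(name)
        if not target.exists():
            print(f"{name}: missing")
        elif looks_like_html_error(target):
            print(f"{name}: HTML page instead of data")
        else:
            print(f"{name}: does not look like HTML")
```

Run from the repository root after get-all-data.sh finishes, this should flag any of the three files that was replaced by an error page. It only inspects the first bytes of each file, so it does not confirm that a file passing the check holds the expected data.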
Steps to Reproduce
- Run the get-all-data.sh script as described in the book.
- Observe that the listed files are not downloaded, and the resulting files contain 404 error messages.
Expected Behavior
The script should download the required data files or provide updated instructions if the files have been moved or removed.
Suggestions
- Verify whether the file links (Google Drive IDs or other sources) are still valid.
- Provide updated links or alternative sources for the missing files.
- If the files are no longer available, include placeholder data or instructions to generate equivalent datasets.
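For the first suggestion, a small helper could report the HTTP status of each download link before the script is updated. This is only a sketch: `link_status` is a hypothetical name, the URLs have to be copied from get-all-data.sh because they are not listed in this report, and some hosts may answer a HEAD request differently from the GET that a download performs.

```python
import urllib.error
import urllib.request


def link_status(url, opener=urllib.request.urlopen, timeout=10):
    """Return the HTTP status code for url, or None if no response arrived."""
    # HEAD asks for headers only, so no file body is transferred.
    request = urllib.request.Request(url, method="HEAD")
    try:
        with opener(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        # The server answered with an error status, e.g. 404 for a removed file.
        return err.code
    except OSError:
        # Covers URLError and timeouts: DNS failure, refused connection, etc.
        return None
```

Calling `link_status(url)` for each link in the script and listing the ones that do not return 200 would show which sources need replacing. A 200 response is not conclusive on its own, since a host can serve an HTML notice with a success status, so the content of the downloaded file should still be checked.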
Thank you for addressing this issue!