Here's a minimal-ish repro:

```python
from fsspec.implementations.http import HTTPFileSystem

remote_path = "https://huggingface.co/api/datasets/abisee/cnn_dailymail/parquet/3.0.0/train/0.parquet"
expected_data_size = 256540614

filesystem = HTTPFileSystem()
with filesystem.open(remote_path) as file:
    total_read = 0
    while data := file.read(256 * 1024):
        total_read += len(data)

assert (
    total_read == expected_data_size
), f"Data mismatch: {total_read} != {expected_data_size}"
# AssertionError: Data mismatch: 5767168 != 256540614
```

This issue causes Ray Data's `from_huggingface` API to break. See https://github.com/ray-project/ray/issues/54101
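
For comparison, here is a minimal sketch of the same chunked read loop run against an in-memory buffer instead of the network. `count_bytes` and `payload` are hypothetical names introduced only for illustration, and the buffer is just a stand-in for the remote file. If the loop counts every byte here, the short read in the repro would seem to come from the HTTP file object rather than from the calling code.

```python
import io


def count_bytes(file, chunk_size=256 * 1024):
    """Read `file` to EOF in fixed-size chunks and return the total byte count."""
    total_read = 0
    # Same loop shape as the repro: stop on the first empty read.
    while data := file.read(chunk_size):
        total_read += len(data)
    return total_read


# In-memory stand-in for the remote file (an assumption, not the real dataset).
# The size is deliberately not a multiple of the chunk size.
payload = b"x" * (3 * 256 * 1024 + 123)
local_total = count_bytes(io.BytesIO(payload))
assert local_total == len(payload)
```

The same helper can be passed the object returned by `filesystem.open(remote_path)` to compare its total against `expected_data_size`.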