some kind of fatal on parsing - detentionstats columns changed #74

@HongPong

This is on the current head, with #66 already merged in. The headers changed in the detentionstats.xlsx file: FY25 was replaced with FY26, and the column "Pending FY25 Inspection" disappeared. I suspect "Pending FY26 Inspection" could appear later, so we might want to change the parser so it can handle that column being either present or absent. Anyway, I made quick fixes and left a couple of comments to get it working again for now...
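One way to make the header handling less brittle across fiscal years is sketched below. This is only an illustration, not what the quick fix does: `normalize_header` and `normalize_headers` are hypothetical names that do not exist in the project, and the idea of collapsing the year token to a bare "FY" is an assumption about what the parser would want.

```python
import re

# Hypothetical helpers; none of these names come from the project.
# Matches a two-digit fiscal-year token such as "FY25" or "fy26".
_FY_TOKEN = re.compile(r"\bFY\s?(\d{2})\b", flags=re.IGNORECASE)


def normalize_header(name: str) -> str:
    """Replace a fiscal-year token like 'FY25' with a year-agnostic 'FY'."""
    return _FY_TOKEN.sub("FY", name).strip()


def normalize_headers(headers, optional=()):
    """Return year-agnostic headers and the optional columns that are absent."""
    normalized = [normalize_header(h) for h in headers]
    missing = [col for col in optional if col not in normalized]
    return normalized, missing


# Illustrative header list: the inspection column is absent, as in the new sheet.
normalized, missing = normalize_headers(
    ["Last Final Rating"],
    optional=["Pending FY Inspection"],
)
print(normalized, missing)
```

With something like this, the code downstream could key on "Pending FY Inspection" regardless of the year, and treat the column as optional when it shows up in `missing` rather than failing.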

uv run python main.py --scrape --debug
ICE Detention Facilities Scraper by the Open Security Mapping Project. MIT License.
Collecting initial facility data from https://www.ice.gov/detain/detention-management
Found sheet at: https://www.ice.gov/doclib/detention/FY26_detentionStats11202025.xlsx
Downloading detention stats sheet from https://www.ice.gov/doclib/detention/FY26_detentionStats11202025.xlsx
Wrote 220106 byte sheet to ice_detention_scraper/ice_scrapers/detentionstats.xlsx
Traceback (most recent call last):
  File "ice_detention_scraper/main.py", line 173, in <module>
    main()
    ~~~~^^
  File "ice_detention_scraper/main.py", line 134, in main
    facilities_data = facilities_scrape_wrapper(
        keep_sheet=not args.delete_sheets,
        force_download=not args.skip_downloads,
        skip_vera=not args.use_vera,
    )
  File "ice_detention_scraper/ice_scrapers/general.py", line 15, in facilities_scrape_wrapper
    facilities = load_sheet(keep_sheet, force_download)
  File "ice_detention_scraper/ice_scrapers/spreadsheet_load.py", line 109, in load_sheet
    df, sheet_url = _download_sheet(keep_sheet, force_download)
                    ~~~~~~~~~~~~~~~^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "ice_detention_scraper/ice_scrapers/spreadsheet_load.py", line 93, in _download_sheet
    df = polars.read_excel(
        drop_empty_rows=True,
    ...<5 lines>...
        source=open(filename, "rb"),
    )
  File "ice_detention_scraper/.venv/lib/python3.13/site-packages/polars/_utils/deprecation.py", line 128, in wrapper
    return function(*args, **kwargs)
  File "ice_detention_scraper/.venv/lib/python3.13/site-packages/polars/_utils/deprecation.py", line 128, in wrapper
    return function(*args, **kwargs)
  File "ice_detention_scraper/.venv/lib/python3.13/site-packages/polars/io/spreadsheet/functions.py", line 403, in read_excel
    _read_spreadsheet(
    ~~~~~~~~~~~~~~~~~^
        src,
        ^^^^
    ...<13 lines>...
        drop_empty_cols=drop_empty_cols,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "ice_detention_scraper/.venv/lib/python3.13/site-packages/polars/io/spreadsheet/functions.py", line 687, in _read_spreadsheet
    name: reader_fn(
          ~~~~~~~~~^
        parser=parser,
        ^^^^^^^^^^^^^^
    ...<7 lines>...
        drop_empty_cols=drop_empty_cols,
        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
    )
    ^
  File "ice_detention_scraper/.venv/lib/python3.13/site-packages/polars/io/spreadsheet/functions.py", line 1093, in _read_spreadsheet_calamine
    ws_arrow = parser.load_sheet_eager(sheet_name, **read_options)
  File "ice_detention_scraper/.venv/lib/python3.13/site-packages/fastexcel/__init__.py", line 576, in load_sheet_eager
    return self._reader.load_sheet(
           ~~~~~~~~~~~~~~~~~~~~~~~^
        idx_or_name=idx_or_name,
        ^^^^^^^^^^^^^^^^^^^^^^^^
    ...<8 lines>...
        eager=True,
        ^^^^^^^^^^^
    )
    ^
_fastexcel.CannotRetrieveCellDataError: cannot retrieve cell data at (7, 27)
Context:
    0: could not determine dtype for column Last Final Rating
