
Conversation

@igor-raits

The read() system call on Linux transfers at most 0x7ffff000 bytes
(~2.1GB) in a single call. When asked for more than that, a single
read() returns fewer bytes than requested.

The previous implementation assumed a single read() would return all
requested bytes, causing silent data truncation for large files:
slurping a 3GB file, for example, would return only ~2.1GB of data.

Fix by looping until all bytes are read or EOF is reached, using the
4-argument form of read() to append at the correct buffer offset.

Fixes: #316
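
For context, a minimal sketch of that looping pattern, assuming a filehandle $fh already opened with the :unix layer and a size taken from -s (the variable names are illustrative, not the PR's actual diff):

    # Keep calling 4-argument read() until all bytes are in; each retry
    # appends at OFFSET ($off) instead of overwriting the buffer.
    my $size = -s $fh;
    my $buf  = "";
    my $off  = 0;
    while ( $off < $size ) {
        my $rc = read( $fh, $buf, $size - $off, $off );
        die "read failed: $!" unless defined $rc;  # the real fix throws via the module
        last if $rc == 0;                          # EOF before the expected size
        $off += $rc;
    }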

@xdg
Contributor

xdg commented Dec 18, 2025

Thank you. I'll take a look but it may not be until next week.

@ap
Contributor

ap commented Dec 18, 2025

For the second hunk I’m somewhat uncomfortable with sometimes switching to syswrite and sometimes not, but – ya learn something new every day, even after 30 years – I went looking and found that not only (as I knew) is write not the complement to read (like sysread and syswrite are), but that apparently there isn’t any: no other I/O write function returns the number of bytes written. It astounds me that I never noticed this before. So there is no way around syswrite after all, I guess.

Separately from that, though, I also see a real issue with the use of join, which will always make a full copy of the data before writing it – unlike the code it replaces, which just passes the scalars to print directly. (Even if the individual scalars do get copied, perl uses copy-on-write to avoid copying their actual data buffers. But the join function cannot do that, it must always copy the data. Though I don’t know if it short-circuits to a copy-on-write scalar if it is only given a single scalar to join.)

And a more minor quibble is that the conditional is kinda convoluted. I think that hunk as a whole should instead be something like this:

    if ( defined($binmode) and $binmode eq ":unix" ) {
        # Use syswrite in a loop to handle write() syscall size limit (~2GB)
        for my $data ( map { ref eq 'ARRAY' ? @$_ : $_ } @data ) {
            my $total_left = length $data;
            my $total_written = 0;
            my $rc = 0;
            while ( $total_left and ( $rc = syswrite $fh, $data, $total_left, $total_written ) ) {
                $total_left -= $rc;
                $total_written += $rc;
            }
            $self->_throw('syswrite', $temp->[PATH]) unless defined $rc;
        }
    }

@igor-raits
Author

@ap thanks for the review & feedback. I have incorporated it and force-pushed.

@ap
Contributor

ap commented Dec 18, 2025

Thanks.

More thoughts, this time about the slurp hunk, which I couldn’t include in the first comment because I had to research the details a bit to put a fix together.

There is a subtle TOCTOU race condition in the patch: when the read is retried, the file may have been truncated or extended by someone else. To avoid this, the loop must always run until read returns 0, regardless of what -s said about how much data there is. But -s is still necessary, because read allocates the buffer up front and so has to be passed a reasonable value for its LENGTH parameter – which basically means -s has to be re-checked within the loop. Overall that leads to this:

    if ( defined($binmode) and $binmode eq ":unix"
        and my $total_left = -s $fh )
    {
        # Read in a loop to handle read() syscall size limit (~2GB)
        my $buf = "";
        my $total_read = 0;
        my $rc = 0;
        while ( $rc = read $fh, $buf, $total_left, $total_read ) {
            $total_read += $rc;
            # Ensure we will keep read()ing until we get 0 or undef
            # even if someone else changed the file length from under us
            $total_left = ( -s $fh ) - $total_read;
            $total_left = 1 if $total_left < 1;
        }
        $self->_throw('read') unless defined $rc;
        return $buf;
    }

The read() and write() system calls on Linux transfer at most
0x7ffff000 bytes (~2.1GB) in a single call. When using the
:unix PerlIO layer (which bypasses buffering), this limit caused
silent data truncation for large files.

Affected methods:
- slurp_raw / slurp with binmode => ":unix"
- spew_raw / spew_utf8 (when Unicode::UTF8 is available)

For example, reading or writing a 3GB file would silently truncate
to ~2.1GB.

Fix by using loops that continue reading/writing until all data is
processed:
- slurp: loop with 4-argument read() to append at correct offset
- spew: loop with 4-argument syswrite() over each data element,
  avoiding unnecessary data copying

The buffered PerlIO path (regular slurp/spew without :unix) was not
affected as PerlIO handles chunking internally.
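
For reference, the affected call sites look roughly like this (file names and sizes are illustrative, and the comments simply restate the commit message above):

    use Path::Tiny;

    # These take the :unix fast path and previously truncated at ~2.1GB:
    my $data = path("huge.bin")->slurp_raw;   # like slurp with binmode => ":unix"
    path("copy.bin")->spew_raw($data);

    # The default buffered path was never affected; PerlIO chunks internally:
    my $same = path("huge.bin")->slurp;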

@igor-raits
Author

@ap thanks again, adjusted!

@Leont

Leont commented Dec 19, 2025

The read part looks more or less the same as what I'm doing in File::Slurper. I never got around to implementing the fast writer path because it is a PITA.

@xdg
Contributor

xdg commented Dec 19, 2025

FYI: I am not ignoring this, but given some of the complexity, I'm going to move slowly on this and haven't had a chance to do a close read. Thank you, everyone, for your feedback so far.


Development

Successfully merging this pull request may close these issues.

Linux: slurp file bigger than 2GiB reads only part of it
