-
Notifications
You must be signed in to change notification settings - Fork 1.9k
Bugfix/downstream oss mem fixes #17973
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Bugfix/downstream oss mem fixes #17973
Conversation
HIGHMEM kmap interfaces operate on only a single page at a time, yet ZFS hadn't accounted for this, resulting in crashes and potential memory corruption on HIGHMEM (typically 32-bit) systems. This was caught by PaX's KERNSEAL feature as it makes use of HIGHMEM functionality on x64.
On typical 64-bit systems, this issue wouldn't have been observed, as the map interfaces simply fall back to returning an address in lowmem where the contiguous pages can be accessed directly.
Joint work with the PaX Team, tested by Mark van Dijk
Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
ZFS typically preserves proper LIFO ordering regarding map/unmap operations that wrap the Linux kernel's kmap interfaces that require such ordering, but one instance in abd_raidz_gen_iterate() did not.
Similar issues have been fixed in the Linux kernel in the past, see for instance CVE-2025-39899 for userfaultfd.
Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
Fix another instance where ZFS assumes multiple pages can be mapped at once via zfs_kmap_local(), resulting in crashes and potential memory corruption on HIGHMEM-enabled (typically 32-bit) systems. Signed-off-by: bspengler-oss <94915855+bspengler-oss@users.noreply.github.com>
|
Ping @behlendorf for continuity from the prior conversation |
behlendorf
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice work running these highmem cases down. Unfortunately, we don't regularly test on 32-bit systems to catch this kind of thing. The fixes themselves look right to me.
@sempervictus can you fix the cstyle issues by updating the patches to wrap at 80 columns, and the commit comments at 72. Then rebase this again on the latest master.
| paddr = zfs_kmap_local(sg_page(aiter->iter_sg)); | ||
| page = sg_page(aiter->iter_sg); | ||
| if (PageHighMem(page)) { | ||
| page = nth_page(page, offset / PAGE_SIZE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's change this to offset >> PAGE_SHIFT to avoid depending on the compiler to optimize this.
|
|
||
| page = sg_page(aiter->iter_sg); | ||
| if (PageHighMem(page)) | ||
| offset &= ~PAGE_MASK; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One thing I wanted to mention, which had been pointed out to me by the PaX Team, is you may (depending on style preferences) want to use PAGE_SIZE - 1 here and at the other use on line 923 above rather than ~PAGE_MASK, that is, both lines would be turned into:
offset &= PAGE_SIZE - 1;
Both evaluate to the same value on Linux, but on FreeBSD PAGE_MASK has a different interpretation/value which may cause some confusion for developers working on both sides of the codebase.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh interesting, I didn't realize things were different on FreeBSD in this regard. Then yes, let's update it to PAGE_SIZE - 1 to avoid any ambiguity here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, FreeBSD uses PAGE_MASK to mean the offset within the page (therefore has a 0xfff value), whereas Linux uses PAGE_MASK to mean the address aligned to page boundary (~0xfff). Exact opposite meaning and value, unfortunately.
Motivation and Context
fix #15668 and proxy upstreaming per discussion therein
Description
Upstream @bspengler-oss memory fixes for ZFS permitting operation under the much more rigorous memory and execution controls of grsecurity/pax kernels.
How Has This Been Tested?
2.3.4 no longer blows up with KERNSEAL+PRIVKSTACK+RAP+KERNEXEC+UDEREF and so forth
Types of changes
Checklist:
Signed-off-by.