Commit f636cbd

secret-hiding: Update kernel patches for kvm-clock

We used a very ad-hoc solution for kvm-clock. The new kernel patches make
gfn_to_pfn_cache (that kvm-clock is based on) work for guest_memfd without
the direct map.

Signed-off-by: Takahiro Itazuri <itazur@amazon.com>

1 parent 3b0a2ef commit f636cbd

File tree

4 files changed, +284 -103 lines changed
Lines changed: 72 additions & 0 deletions
@@ -0,0 +1,72 @@
From 363385a3c2cd4f7fe445ed71329e55d190cb14d5 Mon Sep 17 00:00:00 2001
From: Takahiro Itazuri <itazur@amazon.com>
Date: Tue, 2 Dec 2025 12:15:49 +0000
Subject: [RFC PATCH 0/2] KVM: pfncache: Support guest_memfd without direct map

[ based on kvm/next with [1] ]

Recent work on guest_memfd [1] is introducing support for removing guest
memory from the kernel direct map (note that it has not been merged yet,
which is why this patch series is labelled RFC). While this was
originally motivated by CoCo VMs, the feature is also useful for
non-CoCo VMs: preventing the host kernel from accidentally or
speculatively accessing guest memory is a general safety improvement.
Pages for guest_memfd created with GUEST_MEMFD_FLAG_NO_DIRECT_MAP have
their direct-map PTEs explicitly disabled, and thus cannot rely on the
direct map.

This breaks the facilities that use gfn_to_pfn_cache, including
kvm-clock. gfn_to_pfn_cache caches the pfn and kernel host virtual
address (khva) for a given gfn so that KVM can repeatedly read or write
the corresponding guest page. The cached khva may later be dereferenced
from atomic contexts in some cases. Such contexts cannot tolerate
sleeping or page faults, and therefore cannot use a userspace mapping
(uhva), as those mappings may fault at any time. As a result,
gfn_to_pfn_cache requires a stable, fault-free kernel virtual address
for the backing pages, independent of the userspace mapping.

This small patch series enables gfn_to_pfn_cache to work correctly when
a memslot is backed by guest_memfd with GUEST_MEMFD_FLAG_NO_DIRECT_MAP.
The first patch teaches gfn_to_pfn_cache to obtain the pfn for
guest_memfd-backed memslots via kvm_gmem_get_pfn() instead of GUP
(hva_to_pfn()). The second patch makes gfn_to_pfn_cache use
vmap()/vunmap() to create a fault-free kernel address for such pages.
We believe that establishing such a mapping for paravirtual guest/host
communication is acceptable since such pages do not contain sensitive
data.

One possible approach was to use memremap() instead of vmap(), since
gpc_map() already falls back to memremap() if pfn_valid() is false.
However, vmap() was chosen for the following reason. memremap() with
MEMREMAP_WB first attempts to use the direct map via try_ram_remap(),
and then falls back to arch_memremap_wb(), which explicitly refuses to
map system RAM. It would be possible to relax this restriction, but the
side effects are unclear because memremap() is widely used throughout
the kernel. Changing memremap() to support system RAM without the
direct map solely for gfn_to_pfn_cache feels disproportionate. If
additional users appear that need to map RAM without the direct map,
revisiting and generalizing memremap() might make sense. For now,
vmap()/vunmap() provides a contained and predictable solution.

Another possible approach is to use the "ephmap" (or proclocal) proposed
in [2], but it is not yet clear when that work will be merged, since it
is expected to be relatively large and complex. In contrast, the
changes in this patch series are small and self-contained, yet
immediately allow gfn_to_pfn_cache (including kvm-clock) to operate
correctly with direct-map-removed guest_memfd. Once ephmap is
eventually merged, gfn_to_pfn_cache can be updated to make use of it as
appropriate.

[1]: https://lore.kernel.org/all/20250924151101.2225820-1-patrick.roy@campus.lmu.de/
[2]: https://lore.kernel.org/all/20250812173109.295750-1-jackmanb@google.com/

Takahiro Itazuri (2):
  KVM: pfncache: Use kvm_gmem_get_pfn() for guest_memfd-backed memslots
  KVM: pfncache: Use vmap() for guest_memfd pages without direct map

 include/linux/kvm_host.h |  7 ++++++
 virt/kvm/pfncache.c      | 52 +++++++++++++++++++++++++++++-----------
 2 files changed, 45 insertions(+), 14 deletions(-)

--
2.50.1
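For context, the consumer pattern that motivates the fault-free khva
requirement looks roughly like the following. This is a simplified sketch
modelled on kvm-clock's pvclock update path, not verbatim kernel code;
example_write_pvclock() is an illustrative name introduced here.

#include <linux/kvm_host.h>
#include <asm/pvclock-abi.h>

/*
 * Sketch of a gfn_to_pfn_cache consumer (simplified from kvm-clock).
 * The read side runs under gpc->lock with IRQs disabled and therefore
 * must not fault when dereferencing gpc->khva; this is why the cache
 * needs a fault-free kernel mapping rather than the userspace uhva.
 */
static void example_write_pvclock(struct gfn_to_pfn_cache *gpc,
				  struct pvclock_vcpu_time_info *hv_clock)
{
	unsigned long flags;

	read_lock_irqsave(&gpc->lock, flags);
	while (!kvm_gpc_check(gpc, sizeof(*hv_clock))) {
		read_unlock_irqrestore(&gpc->lock, flags);

		/* Refresh may sleep (kmap()/vmap()/memremap()), so it runs unlocked. */
		if (kvm_gpc_refresh(gpc, sizeof(*hv_clock)))
			return;

		read_lock_irqsave(&gpc->lock, flags);
	}

	/* This dereference must never fault; the mapping was set up by refresh. */
	memcpy(gpc->khva, hv_clock, sizeof(*hv_clock));

	read_unlock_irqrestore(&gpc->lock, flags);
}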
Lines changed: 100 additions & 0 deletions
@@ -0,0 +1,100 @@
From 0d1d88a9f88afa2e068a26ddd004c7c5deea27c9 Mon Sep 17 00:00:00 2001
From: Takahiro Itazuri <itazur@amazon.com>
Date: Mon, 1 Dec 2025 14:58:44 +0000
Subject: [PATCH 1/2] KVM: pfncache: Use kvm_gmem_get_pfn() for guest_memfd-backed memslots

gfn_to_pfn_cache currently relies on hva_to_pfn(), which resolves PFNs
through GUP. GUP assumes that the page has a valid direct-map PTE,
which is not true for guest_memfd created with
GUEST_MEMFD_FLAG_NO_DIRECT_MAP, because their direct-map PTEs are
explicitly removed via set_direct_map_valid_noflush().

Introduce a helper function, gpc_to_pfn(), that routes PFN lookup to
kvm_gmem_get_pfn() for guest_memfd-backed memslots (regardless of
whether GUEST_MEMFD_FLAG_NO_DIRECT_MAP is set), and otherwise falls
back to the existing hva_to_pfn() path. Rename hva_to_pfn_retry() to
gpc_to_pfn_retry() accordingly.

Reviewed-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
---
 virt/kvm/pfncache.c | 34 +++++++++++++++++++++++-----------
 1 file changed, 23 insertions(+), 11 deletions(-)

diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index 728d2c1b488a..bf8d6090e283 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -152,22 +152,34 @@ static inline bool mmu_notifier_retry_cache(struct kvm *kvm, unsigned long mmu_s
 	return kvm->mmu_invalidate_seq != mmu_seq;
 }
 
-static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
+static kvm_pfn_t gpc_to_pfn(struct gfn_to_pfn_cache *gpc, struct page **page)
 {
-	/* Note, the new page offset may be different than the old! */
-	void *old_khva = (void *)PAGE_ALIGN_DOWN((uintptr_t)gpc->khva);
-	kvm_pfn_t new_pfn = KVM_PFN_ERR_FAULT;
-	void *new_khva = NULL;
-	unsigned long mmu_seq;
-	struct page *page;
+	if (kvm_slot_has_gmem(gpc->memslot)) {
+		kvm_pfn_t pfn;
+
+		kvm_gmem_get_pfn(gpc->kvm, gpc->memslot, gpa_to_gfn(gpc->gpa),
+				 &pfn, page, NULL);
+		return pfn;
+	}
 
 	struct kvm_follow_pfn kfp = {
 		.slot = gpc->memslot,
 		.gfn = gpa_to_gfn(gpc->gpa),
 		.flags = FOLL_WRITE,
 		.hva = gpc->uhva,
-		.refcounted_page = &page,
+		.refcounted_page = page,
 	};
+	return hva_to_pfn(&kfp);
+}
+
+static kvm_pfn_t gpc_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
+{
+	/* Note, the new page offset may be different than the old! */
+	void *old_khva = (void *)PAGE_ALIGN_DOWN((uintptr_t)gpc->khva);
+	kvm_pfn_t new_pfn = KVM_PFN_ERR_FAULT;
+	void *new_khva = NULL;
+	unsigned long mmu_seq;
+	struct page *page;
 
 	lockdep_assert_held(&gpc->refresh_lock);
 
@@ -206,7 +218,7 @@ static kvm_pfn_t hva_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 		cond_resched();
 	}
 
-	new_pfn = hva_to_pfn(&kfp);
+	new_pfn = gpc_to_pfn(gpc, &page);
 	if (is_error_noslot_pfn(new_pfn))
 		goto out_error;
 
@@ -319,7 +331,7 @@ static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned l
 		}
 	}
 
-	/* Note: the offset must be correct before calling hva_to_pfn_retry() */
+	/* Note: the offset must be correct before calling gpc_to_pfn_retry() */
 	gpc->uhva += page_offset;
 
 	/*
@@ -327,7 +339,7 @@ static int __kvm_gpc_refresh(struct gfn_to_pfn_cache *gpc, gpa_t gpa, unsigned l
 	 * drop the lock and do the HVA to PFN lookup again.
 	 */
 	if (!gpc->valid || hva_change) {
-		ret = hva_to_pfn_retry(gpc);
+		ret = gpc_to_pfn_retry(gpc);
 	} else {
 		/*
 		 * If the HVA→PFN mapping was already valid, don't unmap it.
--
2.50.1
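For reference, the guest_memfd lookup that the new gpc_to_pfn() helper relies
on has roughly the following prototype in the kvm/next base noted in the cover
letter. This is listed here as an assumption for readability and may differ
slightly across trees; the call site above is the authoritative usage.

/*
 * Resolves a gfn in a guest_memfd-backed slot to a pfn and takes a
 * reference on the backing page; max_order may be NULL when the caller
 * only needs a single page, as gpc_to_pfn() does above.
 */
int kvm_gmem_get_pfn(struct kvm *kvm, struct kvm_memory_slot *slot,
		     gfn_t gfn, kvm_pfn_t *pfn, struct page **page,
		     int *max_order);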

resources/hiding_ci/linux_patches/11-kvm-clock/0001-KVM-x86-use-uhva-for-kvm-clock-if-kvm_gpc_refresh-fa.patch

Lines changed: 0 additions & 103 deletions
This file was deleted.
Lines changed: 112 additions & 0 deletions
@@ -0,0 +1,112 @@
From f1161445f66071ea305a9acbe1117d4aef232b3a Mon Sep 17 00:00:00 2001
From: Takahiro Itazuri <itazur@amazon.com>
Date: Mon, 1 Dec 2025 16:47:05 +0000
Subject: [PATCH 2/2] KVM: pfncache: Use vmap() for guest_memfd pages without direct map

gfn_to_pfn_cache currently maps RAM PFNs with kmap(), which relies on
the direct map. A guest_memfd created with GUEST_MEMFD_FLAG_NO_DIRECT_MAP
has its direct-map PTEs disabled via set_direct_map_valid_noflush(), so
the linear address returned by kmap()/page_address() will fault if
dereferenced.

In some cases, gfn_to_pfn_cache dereferences the cached kernel address
(khva) from atomic contexts where page faults cannot be tolerated.
Therefore khva must always refer to a fault-free kernel mapping. Since
mapping and unmapping happen exclusively in the refresh path, which may
sleep, using vmap()/vunmap() for these pages is safe and sufficient.

Introduce kvm_slot_no_direct_map() to detect guest_memfd slots without
the direct map, and make gpc_map()/gpc_unmap() use vmap()/vunmap() for
such pages.

This allows the facilities based on gfn_to_pfn_cache (e.g. kvm-clock) to
work correctly with guest_memfd regardless of whether its direct-map
PTEs are valid.

Reviewed-by: David Woodhouse <dwmw2@infradead.org>
Signed-off-by: Takahiro Itazuri <itazur@amazon.com>
---
 include/linux/kvm_host.h |  7 +++++++
 virt/kvm/pfncache.c      | 21 ++++++++++++++++-----
 2 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 70e6a5210ceb..793d98f97928 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -15,6 +15,7 @@
 #include <linux/minmax.h>
 #include <linux/mm.h>
 #include <linux/mmu_notifier.h>
+#include <linux/pagemap.h>
 #include <linux/preempt.h>
 #include <linux/msi.h>
 #include <linux/slab.h>
@@ -628,6 +629,12 @@ static inline bool kvm_slot_dirty_track_enabled(const struct kvm_memory_slot *sl
 	return slot->flags & KVM_MEM_LOG_DIRTY_PAGES;
 }
 
+static inline bool kvm_slot_no_direct_map(const struct kvm_memory_slot *slot)
+{
+	return slot && kvm_slot_has_gmem(slot) &&
+	       mapping_no_direct_map(slot->gmem.file->f_mapping);
+}
+
 static inline unsigned long kvm_dirty_bitmap_bytes(struct kvm_memory_slot *memslot)
 {
 	return ALIGN(memslot->npages, BITS_PER_LONG) / 8;
diff --git a/virt/kvm/pfncache.c b/virt/kvm/pfncache.c
index bf8d6090e283..c98109bd2876 100644
--- a/virt/kvm/pfncache.c
+++ b/virt/kvm/pfncache.c
@@ -96,10 +96,16 @@ bool kvm_gpc_check(struct gfn_to_pfn_cache *gpc, unsigned long len)
 	return true;
 }
 
-static void *gpc_map(kvm_pfn_t pfn)
+static void *gpc_map(struct gfn_to_pfn_cache *gpc, kvm_pfn_t pfn)
 {
-	if (pfn_valid(pfn))
-		return kmap(pfn_to_page(pfn));
+	if (pfn_valid(pfn)) {
+		struct page *page = pfn_to_page(pfn);
+
+		if (kvm_slot_no_direct_map(gpc->memslot))
+			return vmap(&page, 1, VM_MAP, PAGE_KERNEL);
+
+		return kmap(page);
+	}
 
 #ifdef CONFIG_HAS_IOMEM
 	return memremap(pfn_to_hpa(pfn), PAGE_SIZE, MEMREMAP_WB);
@@ -115,6 +121,11 @@ static void gpc_unmap(kvm_pfn_t pfn, void *khva)
 		return;
 
 	if (pfn_valid(pfn)) {
+		if (is_vmalloc_addr(khva)) {
+			vunmap(khva);
+			return;
+		}
+
 		kunmap(pfn_to_page(pfn));
 		return;
 	}
@@ -224,13 +235,13 @@ static kvm_pfn_t gpc_to_pfn_retry(struct gfn_to_pfn_cache *gpc)
 
 	/*
 	 * Obtain a new kernel mapping if KVM itself will access the
-	 * pfn. Note, kmap() and memremap() can both sleep, so this
+	 * pfn. Note, kmap(), vmap() and memremap() can sleep, so this
 	 * too must be done outside of gpc->lock!
 	 */
 	if (new_pfn == gpc->pfn)
 		new_khva = old_khva;
 	else
-		new_khva = gpc_map(new_pfn);
+		new_khva = gpc_map(gpc, new_pfn);
 
 	if (!new_khva) {
 		kvm_release_page_unused(page);
--
2.50.1
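To make the mapping choice in gpc_map()/gpc_unmap() concrete, here is a
minimal standalone sketch of the same idea. It is illustrative only;
map_one_page()/unmap_one_page() are made-up names, not part of the patch.

#include <linux/highmem.h>
#include <linux/mm.h>
#include <linux/vmalloc.h>

/*
 * vmap() installs a fresh PTE in the vmalloc area, so the returned address
 * stays usable even when the page's direct-map PTE has been invalidated.
 * Because that address lies in the vmalloc range, is_vmalloc_addr() can
 * later tell it apart from a kmap()/direct-map address at unmap time,
 * which is how gpc_unmap() above picks vunmap() versus kunmap().
 */
static void *map_one_page(struct page *page, bool no_direct_map)
{
	if (no_direct_map)
		return vmap(&page, 1, VM_MAP, PAGE_KERNEL);

	return kmap(page);
}

static void unmap_one_page(struct page *page, void *addr)
{
	if (is_vmalloc_addr(addr))
		vunmap(addr);
	else
		kunmap(page);
}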
