Commit 1857518

readme
1 parent 36c7368 commit 1857518

1 file changed: +56 -61 lines changed

README.md

Lines changed: 56 additions & 61 deletions
@@ -31,6 +31,31 @@ void WINAPI MySleep(DWORD _dwMilliseconds)
 The previous implementation, utilising `StackWalk64`, can be accessed in this [commit c250724](https://github.com/mgeeky/ThreadStackSpoofer/tree/c2507248723d167fb2feddf50d35435a17fd61a2).
 
 
+## Demo
+
+This is what a call stack may look like when it is **NOT** spoofed:
+
+![not-spoofed](images/not-spoofed.png)
+
+And this is what it looks like when thread stack spoofing is enabled:
+
+![spoofed](images/spoofed2.png)
+
+Above we can see that the last frame on our call stack is our `MySleep` callback. That immediately opens an opportunity for IOC hunting: looking for threads whose call stacks do not unwind into the following two commonly expected system entry points:
+```
+kernel32!BaseThreadInitThunk+0x14
+ntdll!RtlUserThreadStart+0x21
+```
+
+However, a brief examination of my system showed that there are plenty of threads whose call stacks do not unwind to the above handlers:
+
+![legit call stack](images/legit-call-stack.png)
+
+The above screenshot shows an unmodified, unhooked thread of Total Commander x64.
+
+Why should we care about carefully faking our call stack when there are processes exhibiting traits that we can simply mimic?
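
The entry-point check discussed in the Demo section can be sketched from the defender's side as follows. This is a minimal illustration, not code from this repository: it assumes the call stack has already been symbolized into `module!function+offset` strings ordered from the innermost frame to the outermost one, and the helper names `stripOffset` and `unwindsToThreadEntry` are hypothetical. The exact, case-sensitive string comparison is a simplification as well.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical helper: drops a trailing "+0x..." offset from a symbolized frame,
// e.g. "kernel32!BaseThreadInitThunk+0x14" -> "kernel32!BaseThreadInitThunk".
std::string stripOffset(const std::string& frame) {
    const std::size_t plus = frame.find('+');
    return plus == std::string::npos ? frame : frame.substr(0, plus);
}

// Returns true when the two outermost frames are the commonly expected
// thread entry points. Frames are ordered innermost first (an assumption).
bool unwindsToThreadEntry(const std::vector<std::string>& frames) {
    if (frames.size() < 2) {
        return false;  // too short to contain both entry points
    }
    const std::string& outermost = frames[frames.size() - 1];
    const std::string& beforeOutermost = frames[frames.size() - 2];
    return stripOffset(beforeOutermost) == "kernel32!BaseThreadInitThunk"
        && stripOffset(outermost) == "ntdll!RtlUserThreadStart";
}
```

As the Total Commander example shows, a `false` result on its own is a weak signal, because legitimate threads can fail this check too.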
+
+
 ## How it works?
 
 This program performs shellcode self-injection (roughly via the classic `VirtualAlloc` + `memcpy` + `CreateThread`).
@@ -62,31 +87,41 @@ _(the above image was borrowed from **Eli Bendersky's** post named [Stack frame
 This precise logic is provided by the `walkCallStack` and `spoofCallStack` functions in `main.cpp`.
 
 
-## Demo
+## Example run
 
-This is how a call stack may look like when it is **NOT** spoofed:
+Use case:
 
-![not-spoofed](images/not-spoofed.png)
+```
+C:\> ThreadStackSpoofer.exe <shellcode> <spoof>
+```
 
-This in turn, when thread stack spoofing is enabled:
+Where:
+- `<shellcode>` is a path to the shellcode file
+- `<spoof>` enables thread stack spoofing when set to `1` or `true`; anything else disables it.
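
The argument handling described in the list above could look roughly like this. It is a sketch under stated assumptions rather than the parsing code from `main.cpp`: the `Options` struct and `parseArgs` function are hypothetical names, and treating `1` and `true` as exact, case-sensitive matches is an assumption the README does not confirm.

```cpp
#include <optional>
#include <string>

// Hypothetical container for the two command-line arguments.
struct Options {
    std::string shellcodePath;  // value of <shellcode>
    bool spoof;                 // true when <spoof> is "1" or "true"
};

// Returns std::nullopt unless exactly two arguments follow the program name.
std::optional<Options> parseArgs(int argc, const char* const argv[]) {
    if (argc != 3) {
        return std::nullopt;
    }
    const std::string spoofArg = argv[2];
    // Assumption: exact, case-sensitive match; anything else disables spoofing.
    const bool spoof = (spoofArg == "1" || spoofArg == "true");
    return Options{argv[1], spoof};
}
```

Returning `std::optional` (C++17) keeps the "wrong number of arguments" case separate from the "spoofing disabled" case.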
 
-![spoofed](images/spoofed2.png)
 
-Above we can see that the last frame on our call stack is our `MySleep` callback. That immediately brings opportunities for IOCs hunting for threads having call stacks not unwinding into following two commonly expected system entry points:
-```
-kernel32!BaseThreadInitThunk+0x14
-ntdll!RtlUserThreadStart+0x21
-```
+Example run that spoofs the beacon's thread call stack:
 
-However a brief examination of my system shown, that there are plenty of threads having call stacks not unwinding to the above handlers:
+```
+PS D:\dev2\ThreadStackSpoofer> .\x64\Release\ThreadStackSpoofer.exe .\tests\beacon64.bin 1
+[.] Reading shellcode bytes...
+[.] Hooking kernel32!Sleep...
+[.] Injecting shellcode...
+[+] Shellcode is now running.
+[>] Original return address: 0x1926747bd51. Finishing call stack...
 
-![legit call stack](images/legit-call-stack.png)
+===> MySleep(5000)
 
-The above screenshot shows unmodified, unhooked, thread of Total Commander x64.
+[<] Restoring original return address...
+[>] Original return address: 0x1926747bd51. Finishing call stack...
 
-Why should we care about carefully faking our call stack when there are processes exhibiting traits that we can simply mimic?
+===> MySleep(5000)
 
+[<] Restoring original return address...
+[>] Original return address: 0x1926747bd51. Finishing call stack...
+```
 
+---
 
 ## How do I use it?
 
@@ -101,28 +136,21 @@ While developing your advanced shellcode loader, you might also want to implemen
 - **Unhook everything you might have hooked** (such as AMSI, ETW, WLDP) before sleeping and then re-hook afterwards.
 
 
+---
+
 ## Actually this is not (yet) a true stack spoofing
 
-As it's been pointed out to me, the technique here is not _yet_ truly holding up to its name for being a _stack spoofer_. Since we're merely overwriting return addresses on the thread's stack, we're not spoofing the remaining areas of the stack itself. Moreover we leave a sequence of `::CreateFileW` addresses which looks very odd and let the thread be unable to unwind its stack. That's because `CreateFile` was meant to solely act as an example, we're making the stack non-unwindable but still obscuring references to our shellcode memory pages.
+As has been pointed out to me, the technique here does not _yet_ truly live up to its name as a _stack spoofer_. Since we're merely overwriting return addresses on the thread's stack, we're not spoofing the remaining areas of the stack itself. Moreover, we're leaving our call stack _non-unwindable_, making it look anomalous, since the system will not be able to properly walk the entire chain of call stack frames.
 
 Although I'm aware of these shortcomings, I've left it as is for the moment, since I cared mostly about evading automated scanners that could iterate over processes, enumerate their threads, walk those threads' stacks and pick up on any return address pointing back to non-image memory (such as `SEC_PRIVATE`, the kind allocated dynamically by `VirtualAlloc` and friends). A focused malware analyst would immediately spot the oddity and consider the thread rather unusual, hunting down our implant. I'm more than sure about that. Yet I don't believe that today's automated scanners such as AV/EDR have heuristics implemented that would _actually walk each thread's stack_ to verify whether it's unwindable `¯\_(ツ)_/¯`.
 
 Surely this project (and the commercial implementations found in C2 frameworks) gives AV & EDR vendors arguments to consider implementing appropriate heuristics covering such a novel evasion technique.
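
The scanner heuristic mentioned above, flagging return addresses that point back to non-image memory, can be illustrated with a small platform-independent sketch. Everything here is an assumption made for illustration: `RegionType`, `Region`, `findRegion` and `suspiciousReturnAddresses` are hypothetical names, and the region list stands in for what a real scanner on Windows might collect with `VirtualQueryEx` (the `Type` field of `MEMORY_BASIC_INFORMATION`, where `MEM_IMAGE` marks image-backed pages).

```cpp
#include <cstdint>
#include <vector>

// Hypothetical, simplified stand-in for a memory region's backing type.
enum class RegionType { Image, Mapped, Private };

struct Region {
    std::uint64_t base;  // first address of the region
    std::uint64_t size;  // region length in bytes
    RegionType type;
};

// Returns the region containing addr, or nullptr when none does.
const Region* findRegion(const std::vector<Region>& regions, std::uint64_t addr) {
    for (const Region& r : regions) {
        if (addr >= r.base && addr - r.base < r.size) {
            return &r;
        }
    }
    return nullptr;
}

// Collects return addresses that are unmapped or not backed by an image.
std::vector<std::uint64_t> suspiciousReturnAddresses(
        const std::vector<Region>& regions,
        const std::vector<std::uint64_t>& returnAddresses) {
    std::vector<std::uint64_t> hits;
    for (const std::uint64_t ra : returnAddresses) {
        const Region* r = findRegion(regions, ra);
        if (r == nullptr || r->type != RegionType::Image) {
            hits.push_back(ra);
        }
    }
    return hits;
}
```

Note that this check only inspects return addresses already collected from a stack walk. It says nothing about whether the stack can be unwound at all, which is the separate heuristic this section speculates about.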
 
-The research on the subject is not yet finished and hopefully will result in a better quality _Stack Spoofing_ in upcoming days. Nonetheless, I'm releasing what I got so far in hope of sparkling inspirations and interest community into further researching this area.
-
-Next areas for improving the outcome are to research how we can _exchange_ or copy stacks with one of the following ideas:
+In order to improve this technique, one can aim for a true _Thread Stack Spoofer_ by inserting carefully crafted fake stack frames established in a reverse-unwinding process.
+Read more on this idea below.
 
-1. utilising [`GetCurrentThreadStackLimits`](https://docs.microsoft.com/en-us/windows/win32/api/processthreadsapi/nf-processthreadsapi-getcurrentthreadstacklimits)/`NtQueryInformationThread`) from a legitimate thread running `kernel32!Sleep(INFINITE)`
 
-2. manipulating our Beacon's thread `TEB/TIB` structures and fields such as `TebBaseAddress`, `NT_TIB.StackBase / NT_TIB.StackLimit` by swapping them with values taken from another legitimate thread.
-
-3. playing with `RBP/EBP` and `RSP/ESP` pointers on a paused Beacon's thread to change stacks in a similar manner to ROP chains - by swapping values of these registers while Beacon's thread is suspended.
-
-4. Create a new user stack with `RtlCreateUserStack` / `RtlFreeUserStack` and exchange stacks from a Beacons thread into that newly created one
-
-
-## Implementing a true Thread Stack Spoofer
+### Implementing a true Thread Stack Spoofer
 
 An hours-long conversation with [namazso](https://twitter.com/namazso) taught me that, in order to aim for a proper thread stack spoofer, we would need to reverse the x64 call stack unwinding process.
 First, one needs to carefully study the stack unwinding process explained in (a) linked below. When the system traverses a thread's call stack on the x64 architecture, it does not simply rely on return addresses scattered around the thread's stack, but rather it:
@@ -197,40 +225,7 @@ This PoC does not follows replicate this algorithm, because my current understan
 - **c)** [`.pdata` section](https://docs.microsoft.com/en-us/windows/win32/debug/pe-format#the-pdata-section)
 - **d)** [another sample implementation of `RtlpUnwindPrologue`](https://github.com/hzqst/unicorn_pe/blob/master/unicorn_pe/except.cpp#L773)
 
-
-## Example run
-
-Use case:
-
-```
-C:\> ThreadStackSpoofer.exe <shellcode> <spoof>
-```
-
-Where:
-- `<shellcode>` is a path to the shellcode file
-- `<spoof>` when `1` or `true` will enable thread stack spoofing and anything else disables it.
-
-
-Example run that spoofs beacon's thread call stack:
-
-```
-PS D:\dev2\ThreadStackSpoofer> .\x64\Release\ThreadStackSpoofer.exe .\tests\beacon64.bin 1
-[.] Reading shellcode bytes...
-[.] Hooking kernel32!Sleep...
-[.] Injecting shellcode...
-[+] Shellcode is now running.
-[>] Original return address: 0x1926747bd51. Finishing call stack...
-
-===> MySleep(5000)
-
-[<] Restoring original return address...
-[>] Original return address: 0x1926747bd51. Finishing call stack...
-
-===> MySleep(5000)
-
-[<] Restoring original return address...
-[>] Original return address: 0x1926747bd51. Finishing call stack...
-```
+---
 
 ## Word of caution
 