
Conversation

@PhilippTakacs
Contributor

So you don't need to manually clear the TB after changing code.

see #2258

@gerph
Contributor

gerph commented Dec 4, 2025

Presumably, this will make the calls to mem_write very costly as they need to search all the TBs for the possible blocks to be flushed.

Consider a data-processing loop which reads bytes from memory and writes back a transformed value, stopping when a terminal condition is met, e.g.:

# contrived example using the Python bindings; uc is an initialised Uc instance
while True:
    byte = uc.mem_read(address, 1)[0]
    if byte == 255:
        break
    uc.mem_write(address, bytes([byte ^ 255]))
    address += 1

(yes, that's a contrived example, but if you think about string operations or a compression operation, you might have something similar)

Such an example is now slowed down by a redundant TB flush on every mem_write call.

It's also potentially wrong, because on some architectures you must perform operations to flush the cache in order to allow the newly written instruction to be executed. For example, on ARM if you had executed that instruction previously (or, as in #2258, it had been pulled in as part of the cache lines for existing code) it would (depending on the translation tables) be cached in the instruction cache. Unless you explicitly flushed that cache, the execution of that address would execute the old instruction. That's what the cache is there for. (sorry, I know ARM best, but possibly other architectures have this behaviour - the example was for x86, and if that's normal for x86 then it should be protected as such in this change).

If you want to have the behaviour of the CPU, the existing behaviour (where you must explicitly flush the TB if it has been executed and you changed the address) is more correct. It depends on what you are trying to do - if you just want to run code, then it might help to have the cached addresses automatically flushed. But if you're trying to see what the CPU would execute then you do not want any automatic behaviour - convenience behaviour like this would break any code that tried to show the behaviour of a race condition on a code block store. Would that matter? If you were trying to simulate a bad actor exploiting odd behaviour then maybe. If you were trying to test whether your code would work on a real system then certainly it would matter, because you're not doing what a real system does.

I would suggest that the automatic TB flush be protected by a run-time configuration option to allow the user to decide which behaviour they prefer, and that the option default to disabled so that performance and behavioural regressions are not seen by existing code - you should have to opt in to something that could change the behaviour of your application.
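
To make it concrete, here is a minimal sketch of the kind of opt-in I mean, assuming a new control is added alongside the existing uc_ctl options (UC_CTL_UC_TB_FLUSH_ON_VMEM_WRITE is a hypothetical name, and the default would be off):

    #include <unicorn/unicorn.h>

    /* Hypothetical opt-in: users who want the automatic TB flush on
     * uc_vmem_write ask for it explicitly; everyone else keeps the
     * existing behaviour. */
    static uc_err enable_vmem_write_tb_flush(uc_engine *uc)
    {
        return uc_ctl(uc, UC_CTL_WRITE(UC_CTL_UC_TB_FLUSH_ON_VMEM_WRITE, 1), 1);
    }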

@PhilippTakacs
Contributor Author

Presumably, this will make the calls to mem_write very costly as they need to search all the TBs for the possible blocks to be flushed

First of all, the uc_mem_write function is not changed; only the uc_vmem_write function is. Even when you want to use the MMU, you can still use uc_vmem_translate in combination with uc_mem_write.
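
Roughly like this (a sketch only, assuming uc_vmem_translate takes a uc_prot and returns the physical address through an out parameter, and that the write doesn't cross a page boundary):

    #include <unicorn/unicorn.h>

    /* Write through the guest MMU without uc_vmem_write: translate the
     * virtual address first, then write to the physical address. */
    static uc_err write_via_translate(uc_engine *uc, uint64_t vaddr,
                                      const void *data, size_t len)
    {
        uint64_t paddr;
        uc_err err = uc_vmem_translate(uc, vaddr, UC_PROT_WRITE, &paddr);
        if (err != UC_ERR_OK) {
            return err;
        }
        /* assumes [vaddr, vaddr + len) does not cross a page boundary */
        return uc_mem_write(uc, paddr, data, len);
    }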

It's also potentially wrong, because on some architectures you must perform operations to flush the cache in order to allow the newly written instruction to be executed. For example, on ARM if you had executed that instruction previously (or, as in #2258, it had been pulled in as part of the cache lines for existing code) it would (depending on the translation tables) be cached in the instruction cache. Unless you explicitly flushed that cache, the execution of that address would execute the old instruction

I would argue that the uc_vmem_write API should behave as if the CPU itself executed code doing this write. So when your architecture doesn't flush the instruction cache for you, you still need to do that yourself. On an architecture which does clear the instruction cache automatically, it should just work. As far as I can see, this is exactly what happens with my patch.

To make it clear: I don't need this patch. I just remembered the change while making my other PR.

@PhilippTakacs
Contributor Author

As far as I can see, this is exactly what happens with my patch

Also, I can write some tests to check if my assumption is correct, but this needs some time.

@gerph
Contributor

gerph commented Dec 4, 2025

First of all, the uc_mem_write function is not changed; only the uc_vmem_write function is. Even when you want to use the MMU, you can still use uc_vmem_translate in combination with uc_mem_write.

Oh! I missed the v on that... ok, so that's a lot less impactful, as I suspect the number of users using the MMU is far smaller.

I would argue that the uc_vmem_write API should behave as if the CPU itself executed code doing this write. So when your architecture doesn't flush the instruction cache for you, you still need to do that yourself. On an architecture which does clear the instruction cache automatically, it should just work. As far as I can see, this is exactly what happens with my patch.

That's exactly what I was saying - I didn't see any code that omits the flush on ARM architectures, which means that it's not the same as what you would see if you executed the code on a real ARM. The line:

if (uc_ctl_remove_cache(uc, address, address + len) != UC_ERR_OK) {

has no conditions to stop it being flushed on ARM.
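
By "protected as such" I mean something along these lines (just a sketch, assuming the write path can see the engine's arch field and that x86 is the case where real hardware keeps instruction fetches coherent with data writes):

    /* Only flush automatically on architectures where real hardware keeps
     * instruction fetches coherent with data writes (e.g. x86); on ARM the
     * caller must still flush explicitly. */
    if (uc->arch == UC_ARCH_X86) {
        if (uc_ctl_remove_cache(uc, address, address + len) != UC_ERR_OK) {
            /* handle the error as the surrounding code already does */
        }
    }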

@PhilippTakacs
Contributor Author

PhilippTakacs commented Dec 5, 2025

Now I understand what your problem is. I think we are talking about different things. I want to remove entries from the TB cache, which is an extra cache in unicorn/qemu holding prebuilt execution blocks.

I assume that clearing the TB cache on ARM has no effect on ARM's ICache. So when you change the memory without removing the cached TB, the cached TB is emulated. When removing the cached TB (my patch), the instructions are fetched from the ICache, translated to a new TB and executed. So if you want behaviour similar to #2258, you still need to clear the ICache.

This is only my assumption; I hadn't thought about the ICache before you mentioned it.
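
For reference, the explicit flush we both keep referring to is the existing UC_CTL_TB_REMOVE_CACHE control; a minimal sketch of the current workflow after patching code (names and range are only illustrative):

    #include <unicorn/unicorn.h>

    /* Current behaviour: after overwriting instructions you drop any cached
     * TBs covering the range yourself, so the next run retranslates them. */
    static uc_err patch_code(uc_engine *uc, uint64_t address,
                             const void *code, size_t len)
    {
        uc_err err = uc_mem_write(uc, address, code, len);
        if (err != UC_ERR_OK) {
            return err;
        }
        return uc_ctl_remove_cache(uc, address, address + len);
    }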

@gerph
Contributor

gerph commented Dec 5, 2025

I assume that clearing the TB cache on ARM has no effect on ARM's ICache.

I'm treating the TBs as the instruction cache - they're essentially cached blocks which can be run if you start execution. If you modify the memory, the cached block will be used rather than the modified form, so they're equivalent to the ICache in that sense. They don't have the same effect of having cache lines populated, and you may get a lot more cached into a TB than you would have with real hardware, but they're very close.
