-
Notifications
You must be signed in to change notification settings - Fork 816
Propagate to exact types in struct-utils.h #7889
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Key collected struct information on both heap type and exactness, allowing queries for exact types to return more precise results than queries on the corresponding inexact types. Use this to fix a bug in CFP where it failed to take into account exactness and would unnecessarily and incorrectly emit a select between two values of different types where a single exact type was expected. Also update GTO to propagate to exact types even though it does not take advantage of them. This is necessary because the FieldScanner now collects information for exact and inexact types separately and they need to be combined.
} | ||
|
||
auto iter = rawNewInfos.find(type); | ||
auto iter = rawNewInfos.find({type, Exact}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rawNewInfos
is always exact, though. This is what my last revision in the other PR uses directly.
They are always exact because they are info from struct.new
, which always has an exact type.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I think the idea in this PR is good! I'll read the code in detail monday.
But my point is that CFP already has exact info, in the form of rawNewInfos
. We can probably remove that entirely, in this PR. And for the immediate fix, the latest version of my other PR is just a few lines, basically simply using rawNewInfos
in a second place.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll have follow-up PRs removing rawNewInfos
and making a few other improvements to CFP.
// depends on the underlying T defining a combine() method. | ||
template<typename T> | ||
struct StructValuesMap : public std::unordered_map<HeapType, StructValues<T>> { | ||
struct StructValuesMap |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Worth explaining in the comment what the meaning of the mapping is, that is, what do the exact and inexact entries mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
src/ir/struct-utils.h
Outdated
|
||
StructValues<T>& operator[](HeapType type) { | ||
return (*this)[{type, Inexact}]; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we disallow this operator? Or, are we not risking users getting the less-precise version by default?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The idea here was to allow other users of these utilities to keep using them as-is without changes. Otherwise there would be a lot of mechanical changes here just adding Inexact
in a bunch of places in various passes.
auto& fields = heapType.getStruct().fields; | ||
auto& infos = functionNewInfos[this->getFunction()][heapType]; | ||
auto ht = std::make_pair(heapType, Exact); | ||
auto& infos = functionNewInfos[this->getFunction()][ht]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if we can remove functionNewInfos
entirely?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, probably. I can look at this in a follow-up.
CFP takes advantage of exact type information, but it currently does so only for immutable fields. It is also unnecessarily conservative about how it propagates type information so that sets to a type inhibit optimizations of its sibling types, even though those sets cannot possibly affect the siblings. Add tests for these cases to demonstrate the benefit of follow-on PRs that will fix these issues.
src/ir/struct-utils.h
Outdated
// account fields. | ||
void propagateToSuperTypes(StructValuesMap<T>& infos) { | ||
propagate(infos, false, true); | ||
propagate(infos, false, true, true); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this defaulting to exact while the others have new variants for that, and default to inexact?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Propagating up never propagates to exact types, so its behavior does not depend on the includeExact
flag. Even though they would have identical behavior, I think true
is a little more honest than false
because it can still propagate from exact types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Propagating up never propagates to exact types,
But propagating down does? Why is there this asymmetry?
(What I feel might be happening is that this class left "reasoning" to the users, that is, it provided low-level "propagate here or there", and the user was responsible for picking which made sense for their use case. It's not obvious to me why this is asymmetrical so I wonder if this is anticipating certain use cases. If so that might be fine but could be documented perhaps.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because exact types are subtypes of their inexact counterparts, but have no subtypes themselves no matter how many subtypes their inexact counterparts have. This entirely due to the structure of the type relationships, not for any particular use cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, I see, so includeExact
means literally "include Exact when looking at subtypes", not "take exactness into account when propagating"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, pretty much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, perhaps a comment with that? The wrong meaning was my initial impression.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
Co-authored-by: Alon Zakai <azakai@google.com>
src/ir/struct-utils.h
Outdated
std::vector<std::pair<HeapType, Exactness>> subs; | ||
if (includeExact) { | ||
subs.emplace_back(type, Exact); | ||
} | ||
for (auto subType : subTypes.getImmediateSubTypes(type)) { | ||
auto& subInfos = combinedInfos[subType]; | ||
subs.emplace_back(subType, Inexact); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::vector<std::pair<HeapType, Exactness>> subs; | |
if (includeExact) { | |
subs.emplace_back(type, Exact); | |
} | |
for (auto subType : subTypes.getImmediateSubTypes(type)) { | |
auto& subInfos = combinedInfos[subType]; | |
subs.emplace_back(subType, Inexact); | |
} | |
auto subs = subTypes.getImmediateSubTypes(type); | |
if (includeExact) { | |
subs.emplace_back(type, Exact); | |
} |
Though even this is not ideal, as it always copies the output of subTypes.getImmediateSubTypes
... but maybe that's ok?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can also factor the loop body into a lambda to avoid any copying at all.
StructUtils::TypeHierarchyPropagator<StructUtils::CombinableBool> | ||
boolPropagator(subTypes); | ||
boolPropagator.propagateToSubTypes(combinedCopyInfos); | ||
boolPropagator.propagateToSubTypesWithExact(combinedCopyInfos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment above explains in detail why we propagate as we do, and should be updated for exactness.
Perhaps something like "when we read from an exact reference, the value at its field may be influenced by writes to non-exact super- and subtypes."
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's address this in follow-on PRs that remove the false distinction between values written by news and values written by sets. The explanation of the analysis can be made much simpler and more rigorous once we simplify the analysis itself.
propagator.propagateToSuperAndSubTypesWithExact(dataFromSubsAndSupersMap); | ||
auto dataFromSupersMap = std::move(combinedSetGetInfos); | ||
propagator.propagateToSubTypes(dataFromSupersMap); | ||
propagator.propagateToSubTypesWithExact(dataFromSupersMap); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same issue as before - why do we want exactness here? On the one hand it seems obvious we do, but it isn't the default - why not?
And, I see no test changes from this - if it doesn't help, why do it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the FieldInfoScanner
now writes the info it discovers at a mix of exact and inexact keys, we now need to propagate information to the exact keys as well to ensure all the data is properly combined.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Makes sense, but that sounds like it applies to all users of the utility? (why do it here and not in all others?)
auto& dataFromSupers = dataFromSupersMap[type]; | ||
auto ht = std::make_pair(type, Exact); | ||
auto& dataFromSubsAndSupers = dataFromSubsAndSupersMap[ht]; | ||
auto& dataFromSupers = dataFromSupersMap[ht]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I must say, I find it confusing to see "use an Exact reference, but look in data that was propagated from subs and supers". Those feel in contradiction. What's the right way to look at this?
Concretely, I'm not sure why this Exact should not be InExact.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
An (exact $Foo)
is a subtype of $Foo
and is propagated to and from just like any other subtype of $Foo
. Exactness fits right into the existing super and subtype relationships, so there is no contradiction between talking about exactness and talking about super- and subtypes.
Concretely, this needs to be Exact
because the inexact keys no longer have all the data in dataFromSupersMap
. The field scanner now writes data to a mix of exact and inexact keys. This doesn't matter for dataFromSubsAndSupersMap
because it propagates up and down simultaneously, ensuring the exact and inexact entries for a given heap type have the same data. But for dataFromSupersMap
, the inexact entry has never been joined with the data in the exact entry since propagation has only gone down.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see, thanks.
That makes sense, but this does seem more complicated than before. Continuing my comment from a moment ago, can we
- Make the propagation always go into exact types? As you say, they are just a normal part of subtyping. So there shouldn't be a need to oddly avoid propagating them.
- That the inexact keys don't have all the data, and that the solution is to query with an exact ref, seems surprising, in the sense that I would expect the point of "querying with an exact ref" to be "get exact results". Can we perhaps add an API to query a heap type without specifying exactness, and that will get all the data (internally, using an exact reference if that makes sense in the concrete data structure, but my point is that the query is just trying to get all the propagated info for the heap type).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Another solution to limit the spread of this complexity would be to revert the shared changes and so something bespoke in CFP. I already have further changes to how copies are noted in the works, and that also only benefits CFP. I could also get some nice performance gains by doing a custom propagation algorithm for CFP, so maybe it makes sense not try to share any of that with other passes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Interesting, how would the custom propagation differ from the existing?
Though it seems CFP has fairly general needs, so I would hope we can keep sharing logic between it and other passes?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As we discussed offline, I will remove the complexity of trying to sometimes include exactness and sometimes not. We will always make exactness explicit for all users. As we also discussed, the problem with suggestion (2) is that the fully collected information for a particular heap type might be in its exact or inexact entry (or both) depending on which direction the information was propagated in.
@kripken, I've made the changes we discussed offline. PTAL! |
auto& dataFromSupers = dataFromSupersMap[type]; | ||
// Use the exact entry because information from the inexact entry will | ||
// have been propagated down into it but not vice versa. | ||
auto ht = std::make_pair(type, Exact); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment makes sense for dataFromSupersMap
which was propagated using propagateToSubTypes
- only down. But what about dataFromSubsAndSupersMap
which was propagated both ways?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's propagated both ways, then the exact and inexact entries are the same. I'll add that to the comment.
auto oldType = fields[i].type; | ||
auto& info = finalInfos[type][i]; | ||
// Use inexact because exact info will have been propagated up to | ||
// inexact entries but not necessarily vice versa. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We propagate in both directions though?
binaryen/src/passes/TypeRefining.cpp
Line 177 in 1d61606
propagator.propagateToSuperAndSubTypes(combinedSetGetInfos); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh, it looks like all our (3) propagations include supertypes. So we do always go up.
Key collected struct information on both heap type and exactness,
allowing queries for exact types to return more precise results than
queries on the corresponding inexact types.
Use this to fix a bug in CFP where it failed to take into account
exactness and would unnecessarily and incorrectly emit a select between
two values of different types where a single exact type was expected.
Also update GTO to propagate to exact types even though it does not take
advantage of them. This is necessary because the FieldScanner now
collects information for exact and inexact types separately and they
need to be combined.