-
Notifications
You must be signed in to change notification settings - Fork 14.8k
[DA] Fix the check between Subscript and Size after delinearization #151326
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@llvm/pr-subscribers-llvm-analysis Author: Ryotaro Kasuga (kasuga-fj) ChangesDelinearization provides two values: the size of the array, and the subscript of the access. DA checks their validity (
This patch introduces extra checks to ensure the sign of Fix #150604 Full diff: https://github.com/llvm/llvm-project/pull/151326.diff 2 Files Affected:
diff --git a/llvm/lib/Analysis/DependenceAnalysis.cpp b/llvm/lib/Analysis/DependenceAnalysis.cpp
index f1473b2694ca4..fbf60dc34ee33 100644
--- a/llvm/lib/Analysis/DependenceAnalysis.cpp
+++ b/llvm/lib/Analysis/DependenceAnalysis.cpp
@@ -1126,17 +1126,32 @@ bool DependenceInfo::isKnownLessThan(const SCEV *S, const SCEV *Size) const {
Size = SE->getTruncateOrZeroExtend(Size, MaxType);
// Special check for addrecs using BE taken count
- const SCEV *Bound = SE->getMinusSCEV(S, Size);
- if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(Bound)) {
- if (AddRec->isAffine()) {
+ if (const SCEVAddRecExpr *AddRec = dyn_cast<SCEVAddRecExpr>(S))
+ if (AddRec->isAffine() && AddRec->hasNoSignedWrap() &&
+ AddRec->hasNoUnsignedWrap()) {
const SCEV *BECount = SE->getBackedgeTakenCount(AddRec->getLoop());
- if (!isa<SCEVCouldNotCompute>(BECount)) {
- const SCEV *Limit = AddRec->evaluateAtIteration(BECount, *SE);
- if (SE->isKnownNegative(Limit))
- return true;
- }
+ const SCEV *Start = AddRec->getStart();
+ const SCEV *Step = AddRec->getStepRecurrence(*SE);
+ const SCEV *End = AddRec->evaluateAtIteration(BECount, *SE);
+ const SCEV *Diff0 = SE->getMinusSCEV(Start, Size);
+ const SCEV *Diff1 = SE->getMinusSCEV(End, Size);
+
+ // If the value of Step is non-negative and the AddRec is non-wrap, it
+ // reaches its maximum at the last iteration. So it's enouth to check
+ // whether End - Size is negative.
+ if (SE->isKnownNonNegative(Step) && SE->isKnownNegative(Diff1))
+ return true;
+
+ // If the value of Step is non-positive and the AddRec is non-wrap, the
+ // initial value is its maximum.
+ if (SE->isKnownNonPositive(Step) && SE->isKnownNegative(Diff0))
+ return true;
+
+ // Even if we don't know the sign of Step, either Start or End must be
+ // the maximum value of the AddRec since it is non-wrap.
+ if (SE->isKnownNegative(Diff0) && SE->isKnownNegative(Diff1))
+ return true;
}
- }
// Check using normal isKnownNegative
const SCEV *LimitedBound =
diff --git a/llvm/test/Analysis/DependenceAnalysis/DADelin.ll b/llvm/test/Analysis/DependenceAnalysis/DADelin.ll
index b2e4959a7812e..410480188f364 100644
--- a/llvm/test/Analysis/DependenceAnalysis/DADelin.ll
+++ b/llvm/test/Analysis/DependenceAnalysis/DADelin.ll
@@ -594,14 +594,15 @@ for.end12: ; preds = %for.inc10, %entry
}
+; FIXME? It seems that we cannot prove that %N is non-negative...
define void @nonnegative(ptr nocapture %A, i32 %N) {
; CHECK-LABEL: 'nonnegative'
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 1, ptr %arrayidx, align 4
-; CHECK-NEXT: da analyze - none!
+; CHECK-NEXT: da analyze - output [* *]!
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4
-; CHECK-NEXT: da analyze - consistent output [0 0|<]!
+; CHECK-NEXT: da analyze - output [* *|<]!
; CHECK-NEXT: Src: store i32 2, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4
-; CHECK-NEXT: da analyze - none!
+; CHECK-NEXT: da analyze - output [* *]!
;
entry:
%cmp44 = icmp eq i32 %N, 0
@@ -630,3 +631,78 @@ for.latch:
exit:
ret void
}
+
+; i = 0;
+; do {
+; a[k * i] = 42;
+; a[k * (i + 1)] = 42;
+; i++;
+; } while (i < k);
+;
+; The dependency direction between the two stores depends on the sign of k.
+; FIXME: Each store has loop-carried dependencies on itself if k is zero.
+;
+define void @coeff_may_negative(ptr %a, i32 %k) {
+; CHECK-LABEL: 'coeff_may_negative'
+; CHECK-NEXT: Src: store i8 42, ptr %idx.0, align 1 --> Dst: store i8 42, ptr %idx.0, align 1
+; CHECK-NEXT: da analyze - none!
+; CHECK-NEXT: Src: store i8 42, ptr %idx.0, align 1 --> Dst: store i8 42, ptr %idx.1, align 1
+; CHECK-NEXT: da analyze - consistent output [-1]!
+; CHECK-NEXT: Src: store i8 42, ptr %idx.1, align 1 --> Dst: store i8 42, ptr %idx.1, align 1
+; CHECK-NEXT: da analyze - none!
+;
+entry:
+ br label %loop
+
+loop:
+ %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
+ %i.next = add i32 %i, 1
+ %subscript.0 = mul i32 %i, %k
+ %subscript.1 = mul i32 %i.next, %k
+ %idx.0 = getelementptr i8, ptr %a, i32 %subscript.0
+ %idx.1 = getelementptr i8, ptr %a, i32 %subscript.1
+ store i8 42, ptr %idx.0
+ store i8 42, ptr %idx.1
+ %cond.exit = icmp eq i32 %i.next, %k
+ br i1 %cond.exit, label %exit, label %loop
+
+exit:
+ ret void
+}
+
+; i = 0;
+; do {
+; a[k * i] = 42;
+; a[k * (i + 1)] = 42;
+; i++;
+; } while (i < k);
+;
+; We can infer that the value of k is non-negative from the nsw flag.
+;
+define void @coeff_positive(ptr %a, i32 %k) {
+; CHECK-LABEL: 'coeff_positive'
+; CHECK-NEXT: Src: store i8 42, ptr %idx.0, align 1 --> Dst: store i8 42, ptr %idx.0, align 1
+; CHECK-NEXT: da analyze - none!
+; CHECK-NEXT: Src: store i8 42, ptr %idx.0, align 1 --> Dst: store i8 42, ptr %idx.1, align 1
+; CHECK-NEXT: da analyze - consistent output [-1]!
+; CHECK-NEXT: Src: store i8 42, ptr %idx.1, align 1 --> Dst: store i8 42, ptr %idx.1, align 1
+; CHECK-NEXT: da analyze - none!
+;
+entry:
+ br label %loop
+
+loop:
+ %i = phi i32 [ 0, %entry ], [ %i.next, %loop ]
+ %i.next = add nsw i32 %i, 1
+ %subscript.0 = mul i32 %i, %k
+ %subscript.1 = mul i32 %i.next, %k
+ %idx.0 = getelementptr i8, ptr %a, i32 %subscript.0
+ %idx.1 = getelementptr i8, ptr %a, i32 %subscript.1
+ store i8 42, ptr %idx.0
+ store i8 42, ptr %idx.1
+ %cond.exit = icmp eq i32 %i.next, %k
+ br i1 %cond.exit, label %exit, label %loop
+
+exit:
+ ret void
+}
|
@@ -1126,17 +1126,32 @@ bool DependenceInfo::isKnownLessThan(const SCEV *S, const SCEV *Size) const { | |||
Size = SE->getTruncateOrZeroExtend(Size, MaxType); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although this may not be directly related to the change, this might be incorrect, as Size
can be negative at this point.
// If the value of Step is non-negative and the AddRec is non-wrap, it | ||
// reaches its maximum at the last iteration. So it's enouth to check | ||
// whether End - Size is negative. | ||
if (SE->isKnownNonNegative(Step) && SE->isKnownNegative(Diff1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case, it might be sufficient to ensure that the AddRec
has nsw (nuw is unnecessary)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isKnownNonNegative
assumes a signed representation, so nsw is sufficient.
If it is not nsw, shouldn't these SCEVs be SCEVCouldNotCompute anyway?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is not nsw, shouldn't these SCEVs be SCEVCouldNotCompute anyway?
As I tried with @nonnegative
in DADelin.ll, it didn't become SCEVCouldNotCompute.
Anyway, nsw seems sufficient in this case.
; FIXME? It seems that we cannot prove that %N is non-negative... | ||
define void @nonnegative(ptr nocapture %A, i32 %N) { | ||
; CHECK-LABEL: 'nonnegative' | ||
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 1, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: da analyze - output [* *]! | ||
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - consistent output [0 0|<]! | ||
; CHECK-NEXT: da analyze - output [* *|<]! | ||
; CHECK-NEXT: Src: store i32 2, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: da analyze - output [* *]! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not confident whether this change is valid...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failure test seems to be the same case as this. If I'm correct,
- We can reason from
nusw
that the AddRec of GEP doesn't wrap, as LoopAccessAnalysis does - But it doesn't imply each subscript doesn't wrap as well?
So, in this specific case,
- From
nusw
, we can assume{{0,+,(4 * %N)}<%for.outer>,+,4}<%for.inner>
(the SCEV representation for%add
) doesn't wrap - It is delinearized as follows
- Array size:
ArrayDecl[UnknownSize][%N]
- Access:
ArrayRef[{0,+,1}<nuw><%for.outer>][{0,+,1}<nuw><%for.inner>]
- Array size:
- Currently fails to infer
{0,+,1}<nuw><%for.inner>
is less than%N
- Can we infer this based on the fact that the original AddRec cannot wrap?
- Since the backedge taken count of the
%for.inner
is%N - 1
, IIUIC, we can prove the predication if we know that the subscript is nsw?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have to debug this to find out. Might have to do with isKnownNonNegative
etc only working on constants?
At least the change is only more pessimistic, so still correct.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect the original result is incorrect, we cannot prove %N
to be non-negative. But I'm pretty sure I missed something...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The original result is correct in the sense that %N
should be be interpreted as an unsigned integer, i.e. always non-negative. It is compared to %add19
which has an nuw flag which means %i.043
will never have the same value in different iterations (and since %mul
and %A
are invariant, also %add
and %arrayidx
).
%N can be larger than INT32_MAX which woud indeed be negative in a signed interpretation. But do we want to only consider signed loop indiction variables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, what happens when
%N
isINT32_MIN
? When%h.045
is 0,%mul
is also 0, and%add
has the same value as%i.043
.
Since %N
is unsigned, its value should be interpreted as UINT32_MAX
. But dang, I only considered the inner loop. With getementptr/add you can add any invariant value and the result would is injective (different %h.045
results in different %arrayidx
), not that I would assume DA would consider that. inbounds
and therefore poison
only applies to what getelementptr
does (for the sake of the argument, assume that %A
is zero in kernel-space in a memory block that spans the entire virtual memory space -- no wrapping behaviour can happen here), but the mul
is variant in the outer loop and might wrap / has no no-wrap flag.
So the store 1 -> store 1 self-dependency might be something like output [* <]
. Again , not something I expect DA to figure out.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My concern to the assumption that values are always signed. The only indicator for wrapping behavior are %add16
, %add19
which are no-UNSIGNED-wrap. Moving from 0b01111111
to 0b10000000
is valid and must be considered. We cannot implicitly assume the result of it is poison/cannot happen because it is valid.
Every single SE->isKnownNegative
assumes a signed interpretation, and it seems not even
https://github.com/kasuga-fj/llvm-project/blob/b06f10d96c6d0fb89253c75c7f1d75c4cf519339/llvm/lib/Analysis/DependenceAnalysis.cpp#L1144
considers that it could be unsigned. But also getBackedgeTakenCount()
logically cannot be negative. That is, we always have a mix of unsigned and signed representations. getMinusSCEV()
uses two-complement wrapping behavior.
Assuming and i8 SCEV:
getMinusSCEV(129, 0) == 129 == 0b10000001 == -127
isKnownNegative(-127)
so isKnownNegative
is happy to interpret an (obviously positive) unsigned number as negtive. Just start the iteration space with the 'negative' number and isKnownNegative
returns exactly the opposite of what you would expect
for (uint8_t i = 128; i < 255; ++i)
A[i];
If we wanted to exploit the fact that memory blocks larger than half the address space might be undefined behavior (due to ptrdiff_t
being signed), we would need to sign-extend the SCEV to pointer size. Also, the logic would only be valid for getelementptr argument, not in general. Also, by adding/multiplying a sufficient larger number, the actuall array index may again be within the sweet sport of below half the address space.
I seems doubious to me that <
is implemented in terms of getMinusSCEV < 0
. The more I think about it the more worried I become. The issue can easily be shown when using constants.
This PR at least adds a check for the AddRec having signed semantics. But AddRec->evaluateAtIteration(BECount)
could still be larger than INTn_MAX as above. Then isKnownNegative
returns true on Diff1
it even if Size is zero. AddRec->hasNoSignedWrap()
just means the AddRec itself does not wrap, but maybe the index that is computed from that AddRec.
I am considering just LGTM since it clearly fixes a bug. Cannot fix all the pre-existing corner cases. Some idea for rigerous/systematic testing would be to use loops with a single iteration only (in two different loops; so AddRec resolves to a constant), or with two iterations, one before the (unsigned and/or signed) wrap and one below. With so just a single iteration it is easier to reason about and SCEV less often has to fall back to SCEVCouldNotCompute. Instead of comparing just the first/last value of the array index SCEV, one might possibly (also) need to recursively visit the SCEV tyo analyze whether each of its constituents is monotonic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I seems doubious to me that < is implemented in terms of getMinusSCEV < 0. The more I think about it the more worried I become. The issue can easily be shown when using constants.
I couldn't agree more. I believe this patch fixes some bugs, but not sure if this is a foundamental solution. And probably the similar bugs exist in other places of DA as well...
With my earlier comment, "all values should be interpreted as signed", what I meant was that we should proceed with the analysis only if we can prove that none of the subscripts overflow in the signed sense. If we encounter a value that can overflow (e.g., transitioning from 0b01111111
to 0b10000000
), or if we cannot prove that this doesn't happen, we should bail out early and return something like "unknown", in DA, probably a Dependence
object instead of a FullDependence
one. It may also be fine to assume that all values are unsigned, but anyway, it would be safer to unify the interpretation used in subsequent processing. Mixing signed and unsigned representations complicates things and, in my opinion, will lead to inconsistencies with the mathematical foundations of the DA algorithm.
Also, I think we need to pay more attention to the SCEV computation (like getAddExpr
) within each function (ZIV, SIV, MIV, etc.). At the very least, the calculation on SCEVConstant
seems to wrap silently (in two-complement sense) if they are sufficently large. Although it might sound a bit extreme, I'm starting to think we should replace all ScalaEvolution::getAddExpr
invocations in DA with something like getAddExprIfNoConstantOverflow
(Minus and Mul as well). Once an overflow is detected, the analysis should be terminated immediately. Or, using wider types in such cases would be another solution.
There is one bit of good news; regarding an AddRec used as an argument of GEP, maybe we can reason about wrapping behavior it if the GEP has nusw
flag (not the wraps about the during the GEP operation, the AddRec itself!). Looking at this commit message, the same logic seems to apply in DA as well. However, at the same time, I also have the following concerns:
-
It says that the AddRec in the GEP argument doesn't wrap. So, what about subscripts after delinearization? At the moment, I think it would also not wrap if sizes are also positive (signed sense? unsigned sense?).
-
The result of the GEP being poison itself doesn't trigger UB. It will be triggered only when it is used as the pointer argument to a load/store. So, we might need to add some control-flow checks, to rule out cases like this:
... %gep = getelementptr inbounds i32, ptr %A, i64 %addrec %cond = ... ; an expression that always evaluates to false whenever %gep is poison br i1 %cond, label %if.then, label %sink if.then: store i32 42, ptr %gep br %sink sink: ...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It says that the AddRec in the GEP argument doesn't wrap. So, what about subscripts after delinearization? At the moment, I think it would also not wrap if sizes are also positive (signed sense? unsigned sense?).
Thats exatly what I mentioned as well. I think "non-wrap" of the subscript has to be replaced with a notion of monotonocity of the SCEV expression.
The result of the GEP being poison itself doesn't trigger UB. It will be triggered only when it is used as the pointer argument to a load/store. So, we might need to add some control-flow checks, to rule out cases like this:
DA does not support loop versioning like LAA does, we cannot require additional checks. I aslo don't think it is even possible to find such an expression to figure out whether GEP results in poison.
Since we can assume that the program we are analyzing has no UB, we can assume that any poison returned by GEP will not be used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It says that the AddRec in the GEP argument doesn't wrap. So, what about subscripts after delinearization? At the moment, I think it would also not wrap if sizes are also positive (signed sense? unsigned sense?).
Thats exatly what I mentioned as well. I think "non-wrap" of the subscript has to be replaced with a notion of monotonocity of the SCEV expression.
Hm, it looks reasonable to me for now.
The result of the GEP being poison itself doesn't trigger UB. It will be triggered only when it is used as the pointer argument to a load/store. So, we might need to add some control-flow checks, to rule out cases like this:
DA does not support loop versioning like LAA does, we cannot require additional checks. I aslo don't think it is even possible to find such an expression to figure out whether GEP results in poison. Since we can assume that the program we are analyzing has no UB, we can assume that any poison returned by GEP will not be used.
I didn't mean inserting additional checks, I'm considering the case where the input IR has such a condition. DA reasons about something based on the GEP flags (example), but violating these flags doesn't trigger immediate UB. It actually triggered when it is used as, e.g.. pointer argument for load/store. I'm now wondering if this inference is only allowed when the value of the GEP is guaranteed to be used.
// If the value of Step is non-negative and the AddRec is non-wrap, it | ||
// reaches its maximum at the last iteration. So it's enouth to check | ||
// whether End - Size is negative. | ||
if (SE->isKnownNonNegative(Step) && SE->isKnownNegative(Diff1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
isKnownNonNegative
assumes a signed representation, so nsw is sufficient.
If it is not nsw, shouldn't these SCEVs be SCEVCouldNotCompute anyway?
; FIXME? It seems that we cannot prove that %N is non-negative... | ||
define void @nonnegative(ptr nocapture %A, i32 %N) { | ||
; CHECK-LABEL: 'nonnegative' | ||
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 1, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: da analyze - output [* *]! | ||
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - consistent output [0 0|<]! | ||
; CHECK-NEXT: da analyze - output [* *|<]! | ||
; CHECK-NEXT: Src: store i32 2, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: da analyze - output [* *]! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have to debug this to find out. Might have to do with isKnownNonNegative
etc only working on constants?
At least the change is only more pessimistic, so still correct.
// If the value of Step is non-negative and the AddRec is non-wrap, it | ||
// reaches its maximum at the last iteration. So it's enouth to check | ||
// whether End - Size is negative. | ||
if (SE->isKnownNonNegative(Step) && SE->isKnownNegative(Diff1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is not nsw, shouldn't these SCEVs be SCEVCouldNotCompute anyway?
As I tried with @nonnegative
in DADelin.ll, it didn't become SCEVCouldNotCompute.
Anyway, nsw seems sufficient in this case.
@@ -1111,9 +1111,9 @@ bool DependenceInfo::isKnownPredicate(ICmpInst::Predicate Pred, const SCEV *X, | |||
} | |||
} | |||
|
|||
/// Compare to see if S is less than Size, using isKnownNegative(S - max(Size, 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The rationale for using max here isn't entirely clear, but S < max(Size, 1)
doesn't imply S < Size
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The want a Size of 0 to pass the test. Zero size means no memory accessed, so obviously no dependency.
And then there is the issue that Size
could evaluate to a negative number... , which first needs to verify that it is being used in a GEP inbounds expression (that is then used to access memory)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before comparing Subscript
and Size
, we check that Subscript
is non-negative. This function is, as the name suggests, returns true if Subscript
is less than Size
. If Size
being 0 pass implies 0 <= Subscript < 0
. Something seems off.
I think the case where Size
is 0 is a special one and should be handled more pessimistically. For example, an offset like i*d0*d1 + j*d1 + k
would be delinearized with Sizes as [UnknownSize][d0][d1]
and Subscripts as [i][j][k]
.
If d1
is 0, the offset calculation yields k
, regardless of the values of the first two subscripts. However, if the subscripts of Src
are [2*i+1][j][k]
and those of Dst
are [2*i][j][k]
, then DA would conclude that no dependency exists between them.
; XFAIL: * | ||
; At the moment, DependenceAnalysis cannot infer `n` to be positive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it permitted to add XFAIL: *
temporary to the failed test?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
There are many remaining issues with DA, but let's improve it step-by-step.
; XFAIL: * | ||
; At the moment, DependenceAnalysis cannot infer `n` to be positive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes
; FIXME? It seems that we cannot prove that %N is non-negative... | ||
define void @nonnegative(ptr nocapture %A, i32 %N) { | ||
; CHECK-LABEL: 'nonnegative' | ||
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 1, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: da analyze - output [* *]! | ||
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - consistent output [0 0|<]! | ||
; CHECK-NEXT: da analyze - output [* *|<]! | ||
; CHECK-NEXT: Src: store i32 2, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: da analyze - output [* *]! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It says that the AddRec in the GEP argument doesn't wrap. So, what about subscripts after delinearization? At the moment, I think it would also not wrap if sizes are also positive (signed sense? unsigned sense?).
Thats exatly what I mentioned as well. I think "non-wrap" of the subscript has to be replaced with a notion of monotonocity of the SCEV expression.
The result of the GEP being poison itself doesn't trigger UB. It will be triggered only when it is used as the pointer argument to a load/store. So, we might need to add some control-flow checks, to rule out cases like this:
DA does not support loop versioning like LAA does, we cannot require additional checks. I aslo don't think it is even possible to find such an expression to figure out whether GEP results in poison.
Since we can assume that the program we are analyzing has no UB, we can assume that any poison returned by GEP will not be used.
@@ -1111,9 +1111,9 @@ bool DependenceInfo::isKnownPredicate(ICmpInst::Predicate Pred, const SCEV *X, | |||
} | |||
} | |||
|
|||
/// Compare to see if S is less than Size, using isKnownNegative(S - max(Size, 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The want a Size of 0 to pass the test. Zero size means no memory accessed, so obviously no dependency.
And then there is the issue that Size
could evaluate to a negative number... , which first needs to verify that it is being used in a GEP inbounds expression (that is then used to access memory)
@@ -719,12 +719,14 @@ for.end: ; preds = %for.body | |||
;; for(int j = 0; j < M; j+=1) | |||
;; A[M*N + M*i + j] = 2; | |||
|
|||
; FIXME: Currently failing to infer %M being positive. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like header guards for %M
being negative are not present. There is one for %N
though, but seems for.body.us
is executed at least once with negative %M
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might be misunderstanding something, but in principle, we can infer it from nsw
on %add10.us
?
The more I think about it, the more confused I get. Still, unsigned feels dangerous. Consider the array size is estimated as So, is such check sufficient? Probably No. I found the following example (godbolt): ; void f(char *a, unsigned long long d) {
; if (d == UINT64_MAX)
; for (unsigned long long i = 0; i < d; i++)
; a[i * (d + 1)] = 42;
; }
define void @f(ptr %a, i64 %d) {
entry:
%guard = icmp eq i64 %d, -1
%stride = add nsw i64 %d, 1
br i1 %guard, label %loop, label %exit
loop: ; %d being -1, %stride is 0
%i = phi i64 [ 0, %entry ], [ %i.next, %loop ]
%offset = phi i64 [ 0, %entry ], [ %offset.next, %loop ]
%idx = getelementptr inbounds i8, ptr %a, i64 %offset
store i8 42, ptr %idx
%i.next = add nuw i64 %i, 1
%offset.next = add nsw nuw i64 %offset, %stride
%cond = icmp eq i64 %i.next, %d
br i1 %cond, label %exit, label %loop
exit:
ret void
} This loop stores 42 to The SCEV representation for
Well, this result is also suspicious, maybe Anyway, we know that the back-edge taken count is |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are many remaining issues with DA, but let's improve it step-by-step.
Thank you, that really encourages me.
; FIXME? It seems that we cannot prove that %N is non-negative... | ||
define void @nonnegative(ptr nocapture %A, i32 %N) { | ||
; CHECK-LABEL: 'nonnegative' | ||
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 1, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: da analyze - output [* *]! | ||
; CHECK-NEXT: Src: store i32 1, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - consistent output [0 0|<]! | ||
; CHECK-NEXT: da analyze - output [* *|<]! | ||
; CHECK-NEXT: Src: store i32 2, ptr %arrayidx, align 4 --> Dst: store i32 2, ptr %arrayidx, align 4 | ||
; CHECK-NEXT: da analyze - none! | ||
; CHECK-NEXT: da analyze - output [* *]! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- It says that the AddRec in the GEP argument doesn't wrap. So, what about subscripts after delinearization? At the moment, I think it would also not wrap if sizes are also positive (signed sense? unsigned sense?).
Thats exatly what I mentioned as well. I think "non-wrap" of the subscript has to be replaced with a notion of monotonocity of the SCEV expression.
Hm, it looks reasonable to me for now.
The result of the GEP being poison itself doesn't trigger UB. It will be triggered only when it is used as the pointer argument to a load/store. So, we might need to add some control-flow checks, to rule out cases like this:
DA does not support loop versioning like LAA does, we cannot require additional checks. I aslo don't think it is even possible to find such an expression to figure out whether GEP results in poison. Since we can assume that the program we are analyzing has no UB, we can assume that any poison returned by GEP will not be used.
I didn't mean inserting additional checks, I'm considering the case where the input IR has such a condition. DA reasons about something based on the GEP flags (example), but violating these flags doesn't trigger immediate UB. It actually triggered when it is used as, e.g.. pointer argument for load/store. I'm now wondering if this inference is only allowed when the value of the GEP is guaranteed to be used.
@@ -1111,9 +1111,9 @@ bool DependenceInfo::isKnownPredicate(ICmpInst::Predicate Pred, const SCEV *X, | |||
} | |||
} | |||
|
|||
/// Compare to see if S is less than Size, using isKnownNegative(S - max(Size, 1)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Before comparing Subscript
and Size
, we check that Subscript
is non-negative. This function is, as the name suggests, returns true if Subscript
is less than Size
. If Size
being 0 pass implies 0 <= Subscript < 0
. Something seems off.
I think the case where Size
is 0 is a special one and should be handled more pessimistically. For example, an offset like i*d0*d1 + j*d1 + k
would be delinearized with Sizes as [UnknownSize][d0][d1]
and Subscripts as [i][j][k]
.
If d1
is 0, the offset calculation yields k
, regardless of the values of the first two subscripts. However, if the subscripts of Src
are [2*i+1][j][k]
and those of Dst
are [2*i][j][k]
, then DA would conclude that no dependency exists between them.
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/186/builds/11409 Here is the relevant piece of the build log for the reference
|
LLVM Buildbot has detected a new failure on builder Full details are available at: https://lab.llvm.org/buildbot/#/builders/18/builds/20221 Here is the relevant piece of the build log for the reference
|
Delinearization provides two values: the size of the array, and the subscript of the access. DA checks their validity (
0 <= subscript < size
), with some special handling. In particular, to ensuresubscript < size
, calculate the maximum value ofsubscript - size
and check if it is negative. There was an issue in its process: whensubscript - size
is expressed as an affine format likeinit + step * i
, the value in the last iteration (start + step * (num_iterations - 1)
) was assumed to be the maximum value. This assumption is incorrect in the following cases:step
is negativeThis patch introduces extra checks to ensure the sign of
step
and verify the existence of nsw/nuw flags.Also,
isKnownNonNegative(S - smax(1, Size))
was used as a regular check, which is incorrect whenSize
is negative. This patch also replace it withisKnownNonNegative(S - Size)
, although it's still unclear whether usingisKnownNonNegative
is appropriate in the first place.Fix #150604