This removes some erroneous debug info from the tests, which should address
the test failures that showed up when this was previously committed.
This reverts commit 6716ce8b641f0e42e2343e1694ee578b027be0c4.
This started out as an attempt to combine bf16 fpround into BFCVT2
instructions, but ended up removing the aarch64.neon.bfcvt intrinsics in
favour of generating fpround instructions directly. This simplifies the
patterns and can lead to other optimizations. The BFCVT2 instruction is
adjusted to make sure the types are valid, and a bfcvt2 is now
generated in more places. The old intrinsics are auto-upgraded to fptrunc
instructions too.
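For illustration, a call to the removed intrinsic now auto-upgrades to a
plain truncating conversion. This is only a sketch and the intrinsic
signature shown is approximate:
```
; Before (old intrinsic, signature approximate):
%r = call bfloat @llvm.aarch64.neon.bfcvt(float %x)
; After auto-upgrade:
%r = fptrunc float %x to bfloat
```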
Summary:
This is spelled `ompx_aligned_barrier` when used directly, but wasn't
included in the list of known assumptions. Fix that so that the test now
works.
This reverts commit c3a935e3f967f8f22f5db240d145459ee621c1e0.
The only difference from the previously reverted commit is that this also
updates the OCaml bindings according to the C debug-info API changes.
The build failure originally introduced was:
```
FAILED: bindings/ocaml/debuginfo/debuginfo_ocaml.o /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.o
cd /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo && /usr/bin/ocamlfind ocamlc -c /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.c -ccopt "-I/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/bindings/ocaml/debuginfo/../llvm -D_GNU_SOURCE -D_DEBUG -D_GLIBCXX_ASSERTIONS -DEXPENSIVE_CHECKS -D_GLIBCXX_DEBUG -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/b/1/llvm-clang-x86_64-expensive-checks-debian/build/include -I/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/include -DNDEBUG "
/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.c: In function ‘llvm_dibuild_create_object_pointer_type’:
/b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.c:620:30: error: too few arguments to function ‘LLVMDIBuilderCreateObjectPointerType’
620 | LLVMMetadataRef Metadata = LLVMDIBuilderCreateObjectPointerType(
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
In file included from /b/1/llvm-clang-x86_64-expensive-checks-debian/build/bindings/ocaml/debuginfo/debuginfo_ocaml.c:23:
/b/1/llvm-clang-x86_64-expensive-checks-debian/llvm-project/llvm/include/llvm-c/DebugInfo.h:880:17: note: declared here
880 | LLVMMetadataRef LLVMDIBuilderCreateObjectPointerType(LLVMDIBuilderRef Builder,
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
```
print-after-all is useful for diffing IR between two passes. When one of
the two is a function pass, and the other is a loop pass, the diff
becomes useless. Add an option which prints the entire function for loop
passes.
This came up recently with a nodebug case on CodeView, which caused a null
entry in the elements list and crashed LLVM.
The original clang fix, which avoids generating IR like this: 504dd577675e8c85cdc8525990a7c8b517a38a89
If a pointer gets freed, it may not be dereferenceable any longer, even
though there is a dominating dereferenceable assumption. As a first step,
when UseDerefAtPointSemantics is used, only consider assumptions if the
pointer value cannot be freed.
PR: https://github.com/llvm/llvm-project/pull/123196
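As an illustrative IR sketch of the issue (function and value names are made
up):
```
declare void @llvm.assume(i1)
declare void @free(ptr)

define i64 @example(ptr %p) {
  call void @llvm.assume(i1 true) [ "dereferenceable"(ptr %p, i64 8) ]
  call void @free(ptr %p)
  ; Even though the assumption above dominates this load, %p may have been
  ; freed, so it must not be treated as dereferenceable at this point.
  %v = load i64, ptr %p
  ret i64 %v
}
```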
This patch adds the possibility to specify the alignment for the
llvm.masked.expandload/llvm.masked.compressstore intrinsics in IRBuilder
(this is mostly NFC for now, since it is only used in MemorySanitizer, but
there is an intention to generate these intrinsics in compiler
passes, e.g. in LoopVectorizer).
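For example (a sketch: the overload name and the use of an `align` parameter
attribute on the pointer operand reflect my reading of the intrinsic, not
code from the patch itself):
```
; Expand-load four floats from %ptr under %mask, with the pointer
; known to be 4-byte aligned.
%res = call <4 x float> @llvm.masked.expandload.v4f32(ptr align 4 %ptr, <4 x i1> %mask, <4 x float> %passthru)
```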
66badf2 (VT: teach a special-case optz about samesign) introduced a
compile-time regression due to the use of CmpPredicate::getMatching,
which is unnecessarily inefficient. Introduce
CmpPredicate::getPreferredSignedPredicate, which alleviates the
inefficiency problem and squashes the compile-time regression.
CmpPredicate::getMatching implicitly assumes that both predicates are
integer-predicates, and this has led to a crash being reported in
VectorCombine after e409204 (VectorCombine: teach foldExtractedCmps
about samesign). FP predicates are simple enough to handle as there is
never any samesign information associated with them: hence handle them
in CmpPredicate::getMatching, fixing the VectorCombine crash and
guarding against future incorrect usages.
Create an abstraction over isImplied{True,False}ByMatchingCmp to
faithfully communicate the result of both functions, cleaning up code in
callsites. While at it, fix a bug in the implied-false version of the
function, which was inadvertently dropping samesign information.
This introduces the `captures` attribute as described in:
https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420
This initial patch only introduces the IR/bitcode support for the
attribute and its in-memory representation as `CaptureInfo`. This will
be followed by a patch to upgrade and remove the `nocapture` attribute,
and then by actual inference/analysis support.
Based on the RFC feedback, I've used a syntax similar to the `memory`
attribute, though the only "location" that can be specified is `ret`.
I've added some pretty extensive documentation to LangRef on the
semantics. One non-obvious bit here is that using ptrtoint will not
result in a "return-only" capture, even if the ptrtoint result is only
used in the return value. Without this requirement we wouldn't be able
to continue ordinary capture analysis on the return value.
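A couple of illustrative declarations using the new attribute (the component
spellings follow the RFC and may differ slightly from the final syntax):
```
; The callee does not capture %p at all.
declare void @no_capture(ptr captures(none) %p)

; %p may be captured, but only through the return value.
declare ptr @passthrough(ptr captures(ret: address, provenance) %p)
```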
Move isImplied{True,False}ByMatchingCmp from CmpInst to ICmpInst, so
that it can operate on CmpPredicate instead of CmpInst::Predicate, and
teach it about samesign. There are two callers of this function, and we
choose to migrate the one in ValueTracking, namely
isImpliedCondMatchingOperands to CmpPredicate, hence teaching it about
samesign, with visible test impact.
This is a split-off from #109833 and only adds code relating to checking
if a struct-returning call can be vectorized.
This initial patch only allows the case where all users of the struct
return are `extractvalue` operations that can be widened.
```
%call = tail call { float, float } @foo(float %in_val)
%extract_a = extractvalue { float, float } %call, 0
%extract_b = extractvalue { float, float } %call, 1
```
Note: The tests require the VFABI changes from #119000 to pass.
This PR is motivated by a mismatch we discovered between compilation
results with vs. without `-g3`. We noticed this when compiling SPEC2017
testcases. The specific instance we saw is fixed in this PR by modifying
a guard (see below), but it is likely similar instances exist elsewhere
in the codebase.
The specific case fixed in this PR manifests itself in the `SimplifyCFG`
pass doing different things depending on whether DebugInfo is generated
or not. At the end of this comment, there is reduced example code that
shows the behavior in question.
The differing behavior has two root causes:
1. Commit https://github.com/llvm/llvm-project/commit/c07e19b adds loop
metadata including debug locations to loops that otherwise would not
have loop metadata
2. Commit https://github.com/llvm/llvm-project/commit/ac28efa6c100 adds
a guard to a simplification action in `SimplifyCFG` that prevents it
from simplifying away loop metadata
So, the change in 2. does not account for the fact that, when compiling with
debug symbols, loops that otherwise would not have metadata that needs
preserving now carry debug locations in their loop metadata. Thus, with
`-g3`, `SimplifyCFG` behaves differently than without it.
The larger issue is that while debug info is not supposed to influence
the final compilation result, commits like 1. blur the line between what
is and is not debug info, and not all optimization passes account for
this.
This PR does not address that and rather just modifies this particular
guard in order to restore equivalent behavior between debug and
non-debug builds in this one instance.
---
Here is a reduced version of a file from `526.blender_r` that showcases
the behavior in question:
```C
struct LinkNode;
typedef struct LinkNode {
  struct LinkNode *next;
  void *link;
} LinkNode;
void do_projectpaint_thread_ph_v_state() {
  int *ps = do_projectpaint_thread_ph_v_state;
  LinkNode *node;
  while (do_projectpaint_thread_ph_v_state)
    for (node = ps; node; node = node->next)
      ;
}
```
Compiling this with and without DebugInfo, and then disassembling the
results, leads to different outcomes (tested on SystemZ and X86). The
reason for this is that the `SimplifyCFG` pass does different things in
either case.
This patch updates the `VFABIDemangler` to support vector functions that
return struct types. For example, a vector variant of `sincos` that
returns a vector of sine values and a vector of cosine values within a
struct.
This patch also adds some helpers for vectorizing types (including
struct types). Some of these are used in the `VFABIDemangler`, and
others will be used in subsequent patches, so this patch simply adds
tests for them.
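As an example of the kind of vector variant this enables (the mangled name
below is only meant to be indicative of the VFABI naming scheme, not taken
from an actual library):
```
; Hypothetical 4-lane variant returning sines and cosines in a struct
; of two vectors.
declare { <4 x float>, <4 x float> } @_ZGVnN4v_sincosf(<4 x float>)
```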
This makes sure no optimizations are applied that assume the
bigger alignment or size, which could be incorrect if we link
together with non-instrumented code.
Improve the non-fatal cases to use DiagnosticInfo, which will now
provide a location. The allocators attempt to report a different error
if they happen to see that inline assembly is involved (this detection is
quite unreliable), using srcloc instead of dbgloc. For now, leave this
behavior unchanged. I think reporting the full location and context
function would be more useful.
Strip hash_value() for CmpPredicate, as different callers have different
hashing use-cases. In this case, there is just one caller, namely
EarlyCSE, which calls hash_combine() on a CmpPredicate, which used to
call hash_combine() on a CmpInst::Predicate prior to 4a0d53a
(PatternMatch: migrate to CmpPredicate). This has uncovered a bug where
two icmp instructions differing in just the fact that one of them has
the samesign flag on it are hashed differently, leading to divergent
hashing, and a crash. Fix this crash by dropping samesign information on
icmp instructions before hashing them, preserving the former behavior.
Fixes #119893.
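For illustration, the pair of instructions in question differs only in the
samesign flag on one of them; dropping samesign before hashing makes such a
pair hash the same again:
```
%c1 = icmp samesign ult i32 %a, %b
%c2 = icmp ult i32 %a, %b
```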
With the introduction of CmpPredicate in 51a895a (IR: introduce struct
with CmpInst::Predicate and samesign), PatternMatch is one of the first
key pieces of infrastructure that must be updated to match a CmpInst
respecting samesign information. Implement this change to Cmp-matchers.
This is a preparatory step in migrating the codebase over to
CmpPredicate. Since no functional changes are desired at this stage,
we have chosen not to migrate CmpPredicate::operator==(CmpPredicate)
calls to use CmpPredicate::getMatching(), as that would have visible
impact on tests that are not yet written: instead, we call
CmpPredicate::operator==(Predicate), preserving the old behavior, while
also inserting a few FIXME comments for follow-ups.
To avoid changing the behavior in the general case, only do this
for anonymous functions. Otherwise, we'll end up with a leading
'@' on the name, which may not be meaningful to end users.
Currently LLVMContext::emitError emits any error as an "inline asm"
error which does not make any sense. InlineAsm appears to be special,
in that it uses a "LocCookie" from srcloc metadata, which looks like
a parallel mechanism to ordinary source line locations. This meant
that other types of failures had degraded source information reported
when available.
Introduce some new generic error types, and only use inline asm
in the appropriate contexts. The DiagnosticInfo types are still
a bit of a mess, and I'm not sure why DiagnosticInfoWithLocationBase
exists instead of just having an optional DiagnosticLocation in the
base class.
DK_Generic is for any error that derives from an IR level instruction,
and thus can pull debug locations directly from it. DK_GenericWithLoc
is functionally the generic codegen error, since it does not depend
on the IR and instead can construct a DiagnosticLocation from the
MI debug location.
This PR changes how target extension types are printed when they are
emitted as IR.
This prevents phrases like "struct = type {...}" from being repeated
over and over in the emitted IR.
Additionally, it should allow opt to not crash when parsing the DXIL
output.
Fixes [#114131](https://github.com/llvm/llvm-project/issues/114131)
Clang [defaults to aligning `__int128_t` to 16 bytes], while LLVM
`datalayout` strings [default to aligning `i128` to 8 bytes]. Wasm is
currently using the defaults for both, so it's inconsistent. Fix this by
adding `-i128:128` to Wasm's `datalayout` string so that it aligns
`i128` to 16 bytes too.
This is similar to
[llvm/llvm-project@dbad963](dbad963a69)
for SPARC.
This fixes rust-lang/rust#133991; see that issue for further discussion.
[defaults to aligning `__int128_t` to 16 bytes]:
f8b4182f07/clang/lib/Basic/TargetInfo.cpp (L77)
[default to aligning `i128` to 8 bytes]:
https://llvm.org/docs/LangRef.html#langref-datalayout
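As a sketch, the relevant part of the wasm32 `datalayout` string with the
new component looks like this (only a fragment; the real string contains
additional components):
```
target datalayout = "e-m:e-p:32:32-i64:64-i128:128-n32:64-S128"
```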
This avoids the need to dynamically relocate each pointer in the table.
To make this work, this PR also moves the binary search of intrinsic
names into an internal function with an adjusted signature, and switches
the unit tests to test against actual intrinsics.