236176 Commits

Author SHA1 Message Date
Johannes Doerfert
c72d93a08a [Attributor][NFC] Remove unnecessary overwritten methods 2022-07-21 21:57:02 -05:00
Fangrui Song
9742166935 [LoongArch] Support load/store of dso_local PIC global values
lowerGlobalAddress added by D128427 can be used for PIC. The actual condition is
that the global value needs to be dso_local (a dso_preemptable one needs GOT
indirection).

load-store.ll has UB due to out-of-bounds load/store. Fix the UB in the variable
test and add an array test. Note: NOPIC array index is currently wrong.

Reviewed By: wangleiat

Differential Revision: https://reviews.llvm.org/D129977
2022-07-21 19:37:56 -07:00
Fangrui Song
d805aabe8f [verify-uselistorder] Hide unrelated options 2022-07-21 18:41:28 -07:00
Chenbing Zheng
1a0187c9e7 [InstCombine] remove useless ‘InstCombiner::’. nfc
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D130220
2022-07-22 09:24:24 +08:00
Fangrui Song
61b8a8a672 [sanstats] Hide unrelated options 2022-07-21 18:08:34 -07:00
Fangrui Song
2b9bfa6044 [sancov] --help: hide unrelated options 2022-07-21 18:00:30 -07:00
Phoebe Wang
02fe96b240 [X86][FP16] Do not split FP64->FP16 to FP64->FP32->FP16
Truncation from double to half is not always identical to truncating to float first and then to half. https://godbolt.org/z/56s9517hd

On the other hand, expanding to float and then to double is always identical to expanding to double directly. https://godbolt.org/z/Ye8vbYPnY
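
The double-rounding effect described above can be reproduced outside the compiler. This is a minimal sketch, not the code from the patch: it uses Python's `struct` module (`"e"` is IEEE binary16, `"f"` is binary32), and the helper names `to_f16` and `to_f32` and the input value are illustrative assumptions.

```python
import struct

def to_f32(x: float) -> float:
    """Round a binary64 value to binary32 and return it as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]

def to_f16(x: float) -> float:
    """Round a binary64 value to binary16 and return it as a Python float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

# Illustrative input: just above the midpoint between the two neighbouring
# half values 1.0 and 1.0 + 2**-10.
x = 1.0 + 2.0**-11 + 2.0**-30

direct = to_f16(x)            # one rounding: the small extra term breaks the tie upward
two_step = to_f16(to_f32(x))  # float drops the 2**-30 term, leaving an exact tie
                              # that rounds to even, i.e. down to 1.0

print(direct, two_step)

# Widening is lossless: a half value survives a round trip through float.
h = to_f16(0.1)
widening_is_exact = to_f32(h) == h
```

Truncating through float loses the bits that decide the final rounding direction, whereas widening never discards bits, which matches the asymmetry the commit message describes.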

Reviewed By: RKSimon, skan

Differential Revision: https://reviews.llvm.org/D130151
2022-07-22 08:36:05 +08:00
Ilia Diachkov
b8e1544b9d [SPIRV] add SPIRVPrepareFunctions pass and update other passes
The patch adds SPIRVPrepareFunctions pass, which modifies function
signatures containing aggregate arguments and/or return values before
IR translation. Information about the original signatures is stored in
metadata. It is used during call lowering to restore correct SPIR-V types
of function arguments and return values. This pass also substitutes some
llvm intrinsic calls to function calls, generating the necessary functions
in the module, as the SPIRV translator does.

The patch also includes changes in other modules, fixing errors and
enabling many SPIR-V features that were omitted earlier. And 15 LIT tests
are also added to demonstrate the new functionality.

Differential Revision: https://reviews.llvm.org/D129730

Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com>
Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com>
Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>
2022-07-22 04:00:48 +03:00
Nick Desaulniers
0ccb6da725 precommit update_mir_test_checks run for D130316 NFC 2022-07-21 17:10:53 -07:00
Philip Reames
bd75350180 [LV] Fix a conceptual mistake around meaning of uniform in isPredicatedInst
This code confuses LV's "Uniform" and LVL/LAI's "Uniform".  Despite the
common name, these are different.
* LV's notion means that only the first lane *of each unrolled part* is
  required.  That is, lanes within a single unroll factor are considered
  uniform.  This allows e.g. widenable memory ops to be considered
  uses of uniform computations.
* LVL and LAI's notion refers to all lanes across all unrollings.

IsUniformMem is in turn defined in terms of LAI's notion.  Thus a
UniformMemOp is a memory operation with a loop invariant address.
This means the same address is accessed in every iteration.

The tweaked piece of code was trying to match a uniform mem op (i.e.
fully loop invariant address), but instead checked for LV's notion of
uniformity.  In theory, this meant with UF > 1, we could speculate
a load which wasn't safe to execute.

This ends up being mostly silent in current code as it is nearly
impossible to create the case where this difference is visible.  The
closest I've come is the test case from 54cb87, but even then, the
incorrect result is only visible in the vplan debug output; before this
change we sink the unsafely speculated load back into the user's predicate
blocks before emitting IR.  Both before and after IR are correct so the
differences aren't "interesting".

The other test changes are uninteresting.  They're cases where LV's uniform
analysis is slightly weaker than SCEV isLoopInvariant.
2022-07-21 15:44:34 -07:00
Philip Reames
54cb87964d [LV] Add a load focused version of the r45679 test
This is a reproducer for a bug in predicated instruction handling.  The final result code is correct, but the reasoning by which we get there isn't.
2022-07-21 15:33:42 -07:00
Craig Topper
ab2348a6fa [RISCV] Add sext.b/h and zext.b/h/w to RISCVInstrInfo::foldMemoryOperandImpl.
We can always fold zext.b since it is just andi. The others require
Zba/Zbb.
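
As an illustration of why these extensions can be folded into a memory operand, here is a small model of the arithmetic. It is only a sketch: it assumes a 64-bit little-endian register spilled to a stack slot and assumes the fold replaces the extension with a narrow load from that slot, which the commit message does not spell out. All helper names are hypothetical.

```python
XLEN = 64
MASK = (1 << XLEN) - 1

def zext(value: int, bits: int) -> int:
    """Zero-extend the low `bits` bits of a register value."""
    return value & ((1 << bits) - 1)

def sext(value: int, bits: int) -> int:
    """Sign-extend the low `bits` bits of a register value to XLEN bits."""
    low = value & ((1 << bits) - 1)
    sign = 1 << (bits - 1)
    return ((low ^ sign) - sign) & MASK

def andi(value: int, imm: int) -> int:
    """Model of andi for a small non-negative immediate."""
    return value & imm & MASK

def spill(value: int) -> bytes:
    """Hypothetical stack slot holding the full register, little-endian."""
    return (value & MASK).to_bytes(XLEN // 8, "little")

def load_unsigned(slot: bytes, nbytes: int) -> int:
    """Narrow zero-extending load from the start of the slot."""
    return int.from_bytes(slot[:nbytes], "little")

def load_signed(slot: bytes, nbytes: int) -> int:
    """Narrow sign-extending load from the start of the slot."""
    return int.from_bytes(slot[:nbytes], "little", signed=True) & MASK

samples = [0, 0x7F, 0x80, 0xFFFF, 0x8000_0000, 0x1234_5678_9ABC_DEF0, MASK]

# zext.b is the same operation as andi with 255.
zext_b_is_andi = all(zext(v, 8) == andi(v, 255) for v in samples)

# Extending a spilled value equals a narrow load of the right signedness.
folds_match = all(
    zext(v, 8 * n) == load_unsigned(spill(v), n)
    and sext(v, 8 * n) == load_signed(spill(v), n)
    for v in samples
    for n in (1, 2, 4)
)
```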

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D130302
2022-07-21 14:54:58 -07:00
Alexander Shaposhnikov
e9afdf838e [GlobalOpt] Enable evaluation of atomic loads
Relax the check to allow evaluation of atomic loads
(but still skip volatile loads).

Test plan:
1/ ninja check-llvm check-clang
2/ Bootstrapped LLVM/Clang pass tests

Differential revision: https://reviews.llvm.org/D130211
2022-07-21 21:36:11 +00:00
LLVM GN Syncbot
674cab116d [gn build] Port 1d057a6d4306 2022-07-21 21:26:59 +00:00
LLVM GN Syncbot
31049b3d2b [gn build] Port 1dad6247d275 2022-07-21 20:54:39 +00:00
Daniel Thornburgh
cc0a1078f5 Fix use after free in MarkupFilter.cpp 2022-07-21 13:52:24 -07:00
Teresa Johnson
1dad6247d2 [MemProf] Add memprof metadata related analysis utilities
Adds a number of utilities that are used to help create and update
memprof related metadata. These will be used during profile matching
and annotation, as well as by the inliner when updating the metadata.
Also adds unit tests for the utilities.

See also related RFCs:
RFC: Sanitizer-based Heap Profiler [1]
RFC: A binary serialization format for MemProf [2]
RFC: IR metadata format for MemProf [3]
(Note that the IR metadata format has changed from the RFC during
implementation, as described in the preceding patch adding the basic
metadata and verification support.)

Depends on D128141.

Differential Revision: https://reviews.llvm.org/D128854
2022-07-21 13:46:01 -07:00
Martin Storsjö
606348cc72 [MinGW] Don't currently set visibility=hidden when building for MinGW
If we build the Target libraries with -fvisibility=hidden, then
LLVM_EXTERNAL_VISIBILITY must also be able to override it back
to default visibility.

Currently, the LLVM_EXTERNAL_VISIBILITY define is a no-op for
mingw targets, thus set CMAKE_CXX_VISIBILITY_PRESET correspondingly.

This unbreaks the mingw dylib build, if the compiler actually
takes hidden visibility into account (e.g. after D130121).

(Later, once hidden visibility can be used for MinGW targets, we
can make LLVM_EXTERNAL_VISIBILITY and LLVM_LIBRARY_VISIBILITY expand
to actual attributes, and reverse this commit.)

Differential Revision: https://reviews.llvm.org/D130200
2022-07-21 23:16:33 +03:00
Philip Reames
83993d666b [LV][SVE] Autogen a test for ease of update 2022-07-21 13:12:53 -07:00
Augie Fackler
bd6aa67e02 BuildLibCalls: move inference of freeing memory later
This probably should have been part of D123089, but the effects of it
don't show up until we start removing functions from the table in
D130107. Oops.

Differential Revision: https://reviews.llvm.org/D130184
2022-07-21 15:31:16 -04:00
Augie Fackler
62f48cadfd MemoryBuiltins: accept non-TLI funcs with attribs as allocator funcs
This allows us to accept annotations from out-of-tree languages (the
example test is derived from Rust) so they can enjoy the benefits of
LLVM's optimizations without requiring LLVM to have language-specific
knowledge.

Differential Revision: https://reviews.llvm.org/D123091
2022-07-21 15:31:16 -04:00
Augie Fackler
5a3e3675f6 MemoryBuiltins: start using properties of functions
Prior to this change, we relied on the hard-coded list for all of the
information performed by MemoryBuiltins. With this change, we're able to
start relying on properties of functions described in attributes, which
opens the door to out-of-tree compilers being able to describe their
allocator functions to LLVM's optimizer logic without having to register
their implementation details with LLVM.

Differential Revision: https://reviews.llvm.org/D123090
2022-07-21 15:31:15 -04:00
Sanjay Patel
78c09f0f24 [PatternMatch][InstCombine] match a vector with constant expression element(s) as a constant expression
The InstCombine test is reduced from issue #56601. Without the more
liberal match for ConstantExpr, we try to rearrange constants in
Negator forever.

Alternatively, we could adjust the definition of m_ImmConstant to be
more conservative, but that's probably a larger patch, and I don't
see any downside to changing m_ConstantExpr. We never capture and
modify a ConstantExpr; transforms just want to avoid it.

Differential Revision: https://reviews.llvm.org/D130286
2022-07-21 15:23:57 -04:00
Sanjay Patel
b03891268c [PatternMatch] add tests for constant expression matcher; NFC 2022-07-21 15:23:57 -04:00
Arthur Eubanks
04d398db46 [LoopAccessAnalysis] Simplify D119047
No need to add checks for every type per pointer that we couldn't create
a check for the first time around, just the types that weren't
successful.

Reviewed By: fhahn

Differential Revision: https://reviews.llvm.org/D119376
2022-07-21 12:16:02 -07:00
Philip Reames
27945f9282 [RISCV][LV] Split coverage of uniform load with outside use
Turns out this has a large effect on tail folding, so split out a single test to cover that case and remove it from the others.
2022-07-21 12:07:26 -07:00
John Ericson
07b749800c [cmake] Don't export LLVM_TOOLS_INSTALL_DIR anymore
First of all, `LLVM_TOOLS_INSTALL_DIR` put there breaks our NixOS
builds, because `LLVM_TOOLS_INSTALL_DIR` defined the same as
`CMAKE_INSTALL_BINDIR` becomes an *absolute* path, and then when
downstream projects try to install there too this breaks because our
builds always install to fresh directories for isolation's sake.

Second of all, note that `LLVM_TOOLS_INSTALL_DIR` stands out against the
other specially crafted `LLVM_CONFIG_*` variables substituted in
`llvm/cmake/modules/LLVMConfig.cmake.in`.

@beanz added it in d0e1c2a550ef348aae036d0fe78cab6f038c420c to fix a
dangling reference in `AddLLVM`, but I am suspicious of how this
variable doesn't follow the pattern.

Those other ones are carefully made to be build-time vs install-time
variables depending on which `LLVMConfig.cmake` is being generated, are
carefully made relative as appropriate, etc. etc. For my NixOS use-case
they are also fine because they are never used as downstream install
variables, only for reading not writing.

To avoid the problems I face, and restore symmetry, I deleted the
exported variable and arranged to have many `${project}_TOOLS_INSTALL_DIR`s.
`AddLLVM` now instead expects each project to define its own, and they
do so based on `CMAKE_INSTALL_BINDIR`. `LLVMConfig` still exports
`LLVM_TOOLS_BINARY_DIR` which is the location for the tools defined in
the usual way, matching the other remaining exported variables.

For the `AddLLVM` changes, I tried to copy the existing pattern of
internal vs non-internal or for LLVM vs for downstream function/macro
names, but it would be good to confirm I did that correctly.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D117977
2022-07-21 19:04:00 +00:00
Daniel Thornburgh
6605187103 [NFC] Fix compiler warning in MarkupFilter 2022-07-21 12:00:29 -07:00
Daniel Thornburgh
17e4c217b6 [Symbolizer] Implement contextual symbolizer markup elements.
This change implements the contextual symbolizer markup elements: reset,
module, and mmap. These provide information about the runtime context of
the binary necessary to resolve addresses to symbolic values.

Summary information is printed to the output about this context.
Multiple mmap elements for the same module line are coalesced together.
The standard requires that such elements occur on their own lines to
allow for this; accordingly, anything after a contextual element on a
line is silently discarded.

Implementing this cleanly requires that the filter drive the parser;
this allows skipped sections to avoid being parsed. This also makes the
filter quite a bit easier to use, at the cost of some unused
flexibility.

Reviewed By: peter.smith

Differential Revision: https://reviews.llvm.org/D129519
2022-07-21 11:29:19 -07:00
Zequan Wu
4979b16db1 [llvm-cov] Improve error message by printing the object file name that produces error
If error occurs on constructing coverage info for one of the object files, it prints the name of the object file, so that users know which one is the cause of error.

Differential Revision: https://reviews.llvm.org/D130196
2022-07-21 11:26:51 -07:00
Philip Reames
bb5dc2918f [RISCV][LV] Add tail folding coverage of uniform load store cases 2022-07-21 11:15:36 -07:00
Philip Reames
56a25ed208 [RISCV][LV] Add a test for uniform store of a loop varying value 2022-07-21 11:15:36 -07:00
Philip Reames
0ae46693f0 [RISCV][LV] Split out and expand tests for uniform loads and stores 2022-07-21 10:42:18 -07:00
Pengxuan Zheng
53d7bf3052 [llvm-lib] Ignore /VERBOSE flag
Ignore the flag for now, but we can start using it for verbose output if needed.

Reviewed By: thakis

Differential Revision: https://reviews.llvm.org/D130202
2022-07-21 10:06:13 -07:00
David Sherwood
f15b6b2907 [AArch64] Add target hook for preferPredicateOverEpilogue
This patch adds the AArch64 hook for preferPredicateOverEpilogue,
which currently returns true if SVE is enabled and one of the
following conditions (non-exhaustive) is met:

1. The "sve-tail-folding" option is set to "all", or
2. The "sve-tail-folding" option is set to "all+noreductions"
and the loop does not contain reductions,
3. The "sve-tail-folding" option is set to "all+norecurrences"
and the loop has no first-order recurrences.

Currently the default option is "disabled", but this will be
changed in a later patch.
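
The decision described above can be summarised in a few lines. This is a rough model rather than the hook's actual implementation: the function name and its boolean parameters are hypothetical, only the option strings come from the commit message, and since the listed conditions are non-exhaustive the real hook may accept more cases.

```python
def prefer_predicate_over_epilogue(
    has_sve: bool,
    sve_tail_folding: str = "disabled",
    loop_has_reductions: bool = False,
    loop_has_recurrences: bool = False,
) -> bool:
    """Sketch of the three listed conditions; anything else returns False."""
    if not has_sve:
        return False
    if sve_tail_folding == "all":
        return True
    if sve_tail_folding == "all+noreductions":
        return not loop_has_reductions
    if sve_tail_folding == "all+norecurrences":
        return not loop_has_recurrences
    # "disabled" (the current default) and any value not modelled here.
    return False

print(prefer_predicate_over_epilogue(True, "all+noreductions", loop_has_reductions=True))
```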

I've added new tests to show the options behave as expected here:

  Transforms/LoopVectorize/AArch64/sve-tail-folding-option.ll

Differential Revision: https://reviews.llvm.org/D129560
2022-07-21 17:20:06 +01:00
Joe Nash
ccbab2ca15 [AMDGPU] NFC. Auto-generate test for vcclo 2022-07-21 10:50:02 -04:00
Phoebe Wang
f621e568f3 [X86] Remove cfi directives and duplicated check in tests. NFC 2022-07-21 22:55:25 +08:00
David Sherwood
ceb6c23b70 [NFC][LoopVectorize] Explicitly disable tail-folding on some SVE tests
This patch is in preparation for enabling vectorisation with tail-folding
by default for SVE targets. Once we do that many existing tests will
break that depend upon having normal unpredicated vector loops. For
all such tests I have added the flag:

  -prefer-predicate-over-epilogue=scalar-epilogue

Differential Revision: https://reviews.llvm.org/D129137
2022-07-21 15:23:00 +01:00
Graham Hunter
0a715c1146 [LAA] Precommit add/sub tests for forked pointers
Adds new tests for add and sub instructions before reaching a select.

Also adds tests using different bit widths for memory, including
non-power-of-two integers.
2022-07-21 15:16:15 +01:00
Ivan Kosarev
4b9dbbdb09 [AMDGPU][MC][NFC] Refine SMEM load definitions.
Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D130009
2022-07-21 14:56:56 +01:00
Ivan Kosarev
75950be836 [AMDGPU][NFC] Validate G_MERGE_VALUES as we match zero-extended 32-bit scalars.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D130001
2022-07-21 14:49:57 +01:00
Joseph Huber
bc33c2fa0c [Binary] Hard-code the alignment of the offloading binary
Summary:
We previously used `alignof` to get the necessary alignment of the
binary header. However this was different on 32-bit platforms and caused
a few tests to fail because of it. This patch just changes this to be a
hard-coded constant of 8.
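
To see why a platform-dependent alignment changes the layout, consider rounding an offset up to the next aligned boundary. This is only a sketch: the offset of 44 and the 32-bit alignment of 4 are made-up values for illustration, and only the constant 8 comes from the commit message.

```python
def align_to(offset: int, alignment: int) -> int:
    """Round `offset` up to the next multiple of a power-of-two `alignment`."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return (offset + alignment - 1) & ~(alignment - 1)

OFFLOAD_BINARY_ALIGNMENT = 8   # hard-coded value named in the commit message
ASSUMED_32BIT_ALIGNOF = 4      # assumption: what alignof might yield on a 32-bit target
payload_end = 44               # hypothetical offset where the next binary would start

with_alignof = align_to(payload_end, ASSUMED_32BIT_ALIGNOF)
with_constant = align_to(payload_end, OFFLOAD_BINARY_ALIGNMENT)
print(with_alignof, with_constant)
```

With a fixed constant, the same input produces the same offsets on every host, so tests that check the layout no longer depend on the platform.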
2022-07-21 09:28:26 -04:00
Jay Foad
716ca2e3ef [AMDGPU] Pre-sink IR input for some tests
Edit the IR input for some codegen tests to simulate what the IR code
sinking pass would do to it. This makes the tests immune to the presence
or absence of the code sinking pass in the codegen pass pipeline, which
does not belong there.

Differential Revision: https://reviews.llvm.org/D130169
2022-07-21 14:25:44 +01:00
Matt Arsenault
5a5439cb73 AMDGPU: Refine user-sgpr-init16-bug
It only applies to gfx1100 and gfx1102, and for wave32.
2022-07-21 08:57:00 -04:00
Nikita Popov
1f69503107 [MemoryBuiltins] Add getReallocatedOperand() function (NFC)
Replace the value-accepting isReallocLikeFn() overload with a
getReallocatedOperand() function, which returns which operand is
the one being reallocated. Currently, this is always the first one,
but once allockind(realloc) is respected, the reallocated operand
will be determined by the allocptr parameter attribute.
2022-07-21 14:54:16 +02:00
Nikita Popov
46e6dd84b7 [MemoryBuiltins] Remove isFreeCall() function (NFC)
Remove isFreeCall() in favor of getFreedOperand(). Replace the
two remaining uses with a getFreedOperand() != nullptr check, as
they only care that something is getting freed. (The usage in DSE
is correct as such. The allocator-related checks in CFLGraph look
rather questionable in general.)
2022-07-21 14:44:23 +02:00
Nikita Popov
5e856a8578 [InstCombine] Use getFreedOperand() (NFC)
Use getFreedOperand() instead of isFreeCall() to remove the
implicit assumption that any pointer operand to a free function
is the operand being freed. This won't actually matter until we
handle allockind(free).
2022-07-21 14:33:55 +02:00
Nikita Popov
3ac8587a2b [Attributor] Use getFreedOperand() (NFC)
Track which operand is actually freed, to avoid the implicit
assumption that it is the first call argument.
2022-07-21 14:26:47 +02:00
Thomas Symalla
fd64a857ee [AMDGPU] Combine s_or_saveexec, s_xor instructions.
This patch merges a consecutive sequence of

s_or_saveexec s_o, s_i
s_xor exec, exec, s_o

into a single

s_andn2_saveexec s_o, s_i instruction.
This patch also cleans up the SIOptimizeExecMasking pass a bit.
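
The equivalence behind this combine can be checked with plain bit arithmetic. The following is a simplified model, not the pass itself: it uses a hypothetical 4-bit exec mask so the check can be exhaustive, ignores SCC, and assumes the usual semantics of the instructions (save exec into the destination, then update exec).

```python
WAVE_BITS = 4                   # hypothetical narrow mask, small enough to enumerate
MASK = (1 << WAVE_BITS) - 1

def s_or_saveexec(exec_mask: int, src: int) -> tuple[int, int]:
    """Return (sdst, new_exec): sdst = exec, exec = src | exec."""
    return exec_mask, (src | exec_mask) & MASK

def s_xor(lhs: int, rhs: int) -> int:
    return (lhs ^ rhs) & MASK

def s_andn2_saveexec(exec_mask: int, src: int) -> tuple[int, int]:
    """Return (sdst, new_exec): sdst = exec, exec = src & ~exec."""
    return exec_mask, src & ~exec_mask & MASK

def two_instruction_sequence(exec_mask: int, s_i: int) -> tuple[int, int]:
    s_o, exec_mask = s_or_saveexec(exec_mask, s_i)
    exec_mask = s_xor(exec_mask, s_o)   # s_xor exec, exec, s_o
    return s_o, exec_mask

def combined_instruction(exec_mask: int, s_i: int) -> tuple[int, int]:
    return s_andn2_saveexec(exec_mask, s_i)

# Compare both forms for every exec / s_i combination.
mismatches = [
    (e, s)
    for e in range(MASK + 1)
    for s in range(MASK + 1)
    if two_instruction_sequence(e, s) != combined_instruction(e, s)
]
print(mismatches)
```

The identity being used is `(s_i | exec) ^ exec == s_i & ~exec`, which holds bit by bit.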

Reviewed By: nhaehnle

Differential Revision: https://reviews.llvm.org/D129073
2022-07-21 14:16:37 +02:00
Alexey Lapshin
8bb4451a65 [Reland][DebugInfo][llvm-dwarfutil] Combine overlapped address ranges.
DWARF files may contain overlapping address ranges. For example, this can happen if
two copies of a function have identical instruction sequences and end up being shared.
That looks incorrect from the point of view of the DWARF spec. The current implementation
of DWARFLinker does not combine overlapped address ranges. It would be good to handle
such ranges in some useful way. Thus, this patch allows DWARFLinker
to combine overlapped ranges into a single one.
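
Combining overlapped ranges is essentially interval merging. The sketch below shows the general idea under stated assumptions: ranges are half-open `(low, high)` pairs, only strictly overlapping ranges are merged, and `combine_ranges` is a hypothetical name. It is not DWARFLinker's data structure or algorithm.

```python
def combine_ranges(ranges: list[tuple[int, int]]) -> list[tuple[int, int]]:
    """Merge overlapping half-open [low, high) address ranges."""
    merged: list[tuple[int, int]] = []
    for low, high in sorted(ranges):
        if merged and low < merged[-1][1]:
            # Overlaps the previous range: extend it instead of adding a new one.
            prev_low, prev_high = merged[-1]
            merged[-1] = (prev_low, max(prev_high, high))
        else:
            merged.append((low, high))
    return merged

# Made-up addresses: the first two ranges overlap, the third is separate.
example = [(0x2000, 0x2010), (0x1010, 0x1030), (0x1000, 0x1020)]
combined = combine_ranges(example)
print([(hex(lo), hex(hi)) for lo, hi in combined])
```

Whether ranges that merely touch should also be merged is a design choice this sketch leaves out; changing `<` to `<=` would merge them.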

Depends on D86539

Reviewed By: aprantl

Differential Revision: https://reviews.llvm.org/D123469
2022-07-21 14:15:39 +03:00