6192 Commits

Author SHA1 Message Date
Julian Lettner
22570bac69 Lower @llvm.global_dtors using __cxa_atexit on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121736
2022-03-17 10:47:13 -07:00
Nikita Popov
20531b3a6b [RelLookupTableConverter] Avoid querying TTI for declarations
This code queries TTI on a single function, which is considered to
be representative. This is a bit odd, but probably fine in practice.

However, I think we should at least avoid querying declarations,
which e.g. will generally lack target attributes, and for which
we don't seem to ever query TTI in other places.
2022-03-16 10:39:28 +01:00
Simon Pilgrim
7262eacd41 Revert rG9c542a5a4e1ba36c24e48185712779df52b7f7a6 "Lower @llvm.global_dtors using __cxa_atexit on MachO"
Mane of the build bots are complaining: Unknown command line argument '-lower-global-dtors'
2022-03-15 13:01:35 +00:00
Julian Lettner
9c542a5a4e Lower @llvm.global_dtors using __cxa_atexit on MachO
For MachO, lower `@llvm.global_dtors` into `@llvm_global_ctors` with
`__cxa_atexit` calls to avoid emitting the deprecated `__mod_term_func`.

Reuse the existing `WebAssemblyLowerGlobalDtors.cpp` to accomplish this.

Enable fallback to the old behavior via Clang driver flag
(`-fregister-global-dtors-with-atexit`) or llc / code generation flag
(`-lower-global-dtors-via-cxa-atexit`).  This escape hatch will be
removed in the future.

Differential Revision: https://reviews.llvm.org/D121327
2022-03-14 17:51:18 -07:00
Teresa Johnson
fee0bde4c6 [WPD] Extend checking mode to support fallback to indirect call
Extend -wholeprogramdevirt-check to support both the existing
trapping mode on an incorrect devirtualization, as well as a new
mode to fallback to an indirect call on a mismatch. The new mode is

The new mode is useful in cases where we want to enable
devirtualization but cannot fully guarantee whole program visibility
(e.g in the case where LTO has been disabled for a small set of objects
that could potentially override virtual methods without having a symbol
reference to anything in the base class including the vtable).

Remove !prof and !callees metadata (which are used by indirect call
promotion) from both the new direct call and the fallback indirect call
(so that we don't perform another round of promotion on the latter).
Also remove it from the direct call in the non-fallback cases, which
was an oversight, although it didn't seem to cause any issues. Add tests
for the metadata removal covering the various cases.

Differential Revision: https://reviews.llvm.org/D121419
2022-03-14 10:16:28 -07:00
Nikita Popov
067c035012 [GlobalOpt] Handle undef global_ctors gracefully
If there are no ctors, then this can have an arbirary zero-sized
value. The current code checks for null, but it could also be
undef or poison.

Replacing the specific null check with a check for
non-ConstantArray.
2022-03-10 16:02:12 +01:00
Benoit Jacob
851332a1f2 Fix linking error, undefined class static constants.
Reviewed By: spupyrev

Differential Revision: https://reviews.llvm.org/D121293
2022-03-09 10:01:38 -08:00
Vitaly Buka
ce29a0429b Revert "Attempt to fix linking issue on the bot"
The issue was fixed with 48c74bb2e2a72830f1068823bfc2f6fd4b53d427

This reverts commit ac423a8c8aa87a128e51f3690afc1405d06b8c9d.
2022-03-08 16:16:01 -08:00
Florian Mayer
e86bd32b71 [NFC] [HWASan] [MTE] Use function_ref over template. 2022-03-08 15:49:55 -08:00
Vitaly Buka
ac423a8c8a Attempt to fix linking issue on the bot 2022-03-08 15:33:10 -08:00
Fangrui Song
48c74bb2e2 [SampleProfileInference] Work around odr-use of const non-inline static data member to fix -O0 builds after D120508
MinBaseDistance may be odr-used by std::max, leading to an undefined symbol linker error:

```
ld.lld: error: undefined symbol: (anonymous namespace)::MinCostMaxFlow::MinBaseDistance
>>> referenced by SampleProfileInference.cpp:744 (/home/ray/llvm-project/llvm/lib/Transforms/Utils/SampleProfileInference.cpp:744)
>>>               lib/Transforms/Utils/CMakeFiles/LLVMTransformUtils.dir/SampleProfileInference.cpp.o:((anonymous namespace)::FlowAdjuster::jumpDistance(llvm::FlowJump*) const)
```

Since llvm-project is still using C++ 14, workaround it with a cast.
2022-03-08 14:34:53 -08:00
spupyrev
81aedab7dd introducing some profi flags
Differential Revision: https://reviews.llvm.org/D120508
2022-03-08 12:35:15 -08:00
William S. Moses
87ec6f41bb [OpenMPIRBuilder] Allocate temporary at the correct block in a nested parallel
The OpenMPIRBuilder has a bug. Specifically, suppose you have two nested openmp parallel regions (writing with MLIR for ease)

```
omp.parallel {
  %a = ...
  omp.parallel {
    use(%a)
  }
}
```

As OpenMP only permits pointer-like inputs, the builder will wrap all of the inputs into a stack allocation, and then pass this
allocation to the inner parallel. For example, we would want to get something like the following:

```
omp.parallel {
  %a = ...
  %tmp = alloc
  store %tmp[] = %a
  kmpc_fork(outlined, %tmp)
}
```

However, in practice, this is not what currently occurs in the context of nested parallel regions. Specifically to the OpenMPIRBuilder,
the entirety of the function (at the LLVM level) is currently inlined with blocks marking the corresponding start and end of each
region.

```
entry:
  ...

parallel1:
  %a = ...
  ...

parallel2:
  use(%a)
  ...

endparallel2:
  ...

endparallel1:
  ...
```

When the allocation is inserted, it presently inserted into the parent of the entire function (e.g. entry) rather than the parent
allocation scope to the function being outlined. If we were outlining parallel2, the corresponding alloca location would be parallel1.

This causes a variety of bugs, including https://github.com/llvm/llvm-project/issues/54165 as one example.

This PR allows the stack allocation to be created at the correct allocation block, and thus remedies such issues.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D121061
2022-03-06 18:34:25 -05:00
Augie Fackler
b32735d599 BuildLibCalls: add allocalign attributes for memalign and aligned_alloc
This gets us close to being able to remove a column from the table in
MemoryBuiltins.cpp.

Differential Revision: https://reviews.llvm.org/D117923
2022-03-04 15:57:53 -05:00
Augie Fackler
d664c4b73c Attributes: add a new allocalign attribute
This will let us start moving away from hard-coded attributes in
MemoryBuiltins.cpp and put the knowledge about various attribute
functions in the compilers that emit those calls where it probably
belongs.

Differential Revision: https://reviews.llvm.org/D117921
2022-03-04 15:57:53 -05:00
Alexandros Lamprineas
910eb988eb [FuncSpec][NFC] Refactor internal structures.
`ArgInfo` is reduced to only contain a pair of {formal,actual} values.
The specialized function `Fn` and the `Partial` flag are redundant in
this structure. The `Gain` is moved to a new struct `SpecializationInfo`.

The value mappings created by cloneCandidateFunction() are being used
by rewriteCallSites() for matching the formal arguments of recursive
functions.

The list of specializations is passed by reference to calculateGains()
instead of being returned by value.

The `IsPartial` flag is removed from isArgumentInteresting() and
getPossibleConstants() as it's no longer used anywhere in the code.

Differential Revision: https://reviews.llvm.org/D120753
2022-03-03 13:08:13 +00:00
spupyrev
f2ade65fb2 [CSSPGO] Even flow distribution
Differential Revision: https://reviews.llvm.org/D118640
2022-03-02 13:12:05 -08:00
Stephen Long
2f6c14816a [LoopPeel] Add EXPENSIVE_CHECKS ifdef guard around domtree verify call
The verify call was taking 50% of the compile time in our internal LLVM
fork when trying to unroll many loops.

Differential Revision: https://reviews.llvm.org/D113028
2022-03-02 09:56:20 -08:00
spupyrev
bcdc047731 speeding up ext-tsp for huge instances
Differential Revision: https://reviews.llvm.org/D120780
2022-03-02 07:17:48 -08:00
serge-sans-paille
a494ae43be Cleanup includes: TransformsUtils
Estimation on the impact on preprocessor output:
before: 1065307662
after:  1064800684

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D120741
2022-03-01 21:00:07 +01:00
Tong Zhang
17ce89fa80 [SanitizerBounds] Add support for NoSanitizeBounds function
Currently adding attribute no_sanitize("bounds") isn't disabling
-fsanitize=local-bounds (also enabled in -fsanitize=bounds). The Clang
frontend handles fsanitize=array-bounds which can already be disabled by
no_sanitize("bounds"). However, instrumentation added by the
BoundsChecking pass in the middle-end cannot be disabled by the
attribute.

The fix is very similar to D102772 that added the ability to selectively
disable sanitizer pass on certain functions.

In this patch, if no_sanitize("bounds") is provided, an additional
function attribute (NoSanitizeBounds) is attached to IR to let the
BoundsChecking pass know we want to disable local-bounds checking. In
order to support this feature, the IR is extended (similar to D102772)
to make Clang able to preserve the information and let BoundsChecking
pass know bounds checking is disabled for certain function.

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D119816
2022-03-01 18:47:02 +01:00
Alexandros Lamprineas
b803aee67b [FuncSpec][NFC] Improve debug messages.
Adds diagnostic messages when debugging the pass.

Differential Revision: https://reviews.llvm.org/D119875
2022-03-01 11:55:08 +00:00
Alexandros Lamprineas
7b74123a3d [FuncSpec][NFC] Variable renaming.
Just preparing the ground for follow up patches to make the reviews easier.

Differential Revision: https://reviews.llvm.org/D119874
2022-03-01 11:38:57 +00:00
Nikita Popov
16a2d5f885 [SCEVExpander] Use early returns in FindValueInExprValueMap() (NFC) 2022-02-25 10:09:16 +01:00
Nikita Popov
2d0fc3e46f [SCEV] Return ArrayRef from getSCEVValues() (NFC)
Return a read-only view on this set. For the one internal use,
directly access ExprValueMap.
2022-02-25 09:32:22 +01:00
Nikita Popov
d9715a7266 [SCEV] Don't try to reuse expressions with offset
SCEVs ExprValueMap currently tracks not only which IR Values
correspond to a given SCEV expression, but additionally stores that
it may be expanded in the form X+Offset. In theory, this allows
reusing existing IR Values in more cases.

In practice, this doesn't seem to be particularly useful (the test
changes are rather underwhelming) and adds a good bit of complexity.
Per https://github.com/llvm/llvm-project/issues/53905, we have an
invalidation issue with these offseted expressions.

Differential Revision: https://reviews.llvm.org/D120311
2022-02-25 09:16:48 +01:00
Joseph Huber
7aef8b3754 [OpenMP] Make section variable external to prevent collisions
Summary:
We use a section to embed offloading code into the host for later
linking. This is normally unique to the translation unit as it is thrown
away during linking. However, if the user performs a relocatable link
the sections will be merged and we won't be able to access the files
stored inside. This patch changes the section variables to have external
linkage and a name defined by the section name, so if two sections are
combined during linking we get an error.
2022-02-24 10:57:09 -05:00
Matthias Braun
6a383369f9 PGOInstrumentation, GCOVProfiling: Split indirectbr critical edges regardless of PHIs
The `SplitIndirectBrCriticalEdges` function was originally designed for
`CodeGenPrepare` and skipped splitting of edges when the destination
block didn't contain any `PHI` instructions. This only makes sense when
reducing COPYs like `CodeGenPrepare`. In the case of
`PGOInstrumentation` or `GCOVProfiling` it would result in missed
counters and wrong result in functions with computed goto.

Differential Revision: https://reviews.llvm.org/D120096
2022-02-23 16:27:37 -08:00
Bill Wendling
a5bbc6ef99 [NFC] Remove unnecessary "#include"s from header files 2022-02-23 01:20:48 -08:00
Nikita Popov
f8d7210032 [GlobalStatus] Keep Visited set in isSafeToDestroyConstant()
Constants cannot be cyclic, but they can be tree-like. Keep a
visited set to ensure we do not degenerate to exponential run-time.

This fixes the problem reported in https://reviews.llvm.org/D117223#3335482,
though I haven't been able to construct a concise test case for
the issue. This requires a combination of dead constants and the
kind of constant expression tree that textual IR cannot represent
(because the textual representation, unlike the in-memory
representation, is also exponential in size).
2022-02-22 10:02:37 +01:00
Arthur Eubanks
053c2a0020 [SimplifyCFG][OpaquePtr] Check store type when merging conditional store 2022-02-20 11:29:54 -08:00
Arthur Eubanks
129af4daa7 [SCEVExpander][OpaquePtr] Check GEP source type when finding identical GEP
Fixes an opaque pointers miscompile.

Reviewed By: #opaque-pointers, nikic

Differential Revision: https://reviews.llvm.org/D120004
2022-02-17 08:48:11 -08:00
Nikita Popov
36fdfaba19 [RelLookupTableConverter] Ensure that GV, GEP and load types match
This code could be generalized to be type-independent, but for now
just ensure that the same type constraints are enforced with opaque
pointers as with typed pointers.
2022-02-17 12:05:05 +01:00
Roman Lebedev
371fcb720e
[SimplifyCFG][PhaseOrdering] Defer lowering switch into an integer range comparison and branch until after at least the IPSCCP
That transformation is lossy, as discussed in
https://github.com/llvm/llvm-project/issues/53853
and https://github.com/rust-lang/rust/issues/85133#issuecomment-904185574

This is an alternative to D119839,
which would add a limited IPSCCP into SimplifyCFG.

Unlike lowering switch to lookup, we still want this transformation
to happen relatively early, but after giving a chance for the things
like CVP to do their thing. It seems like deferring it just until
the IPSCCP is enough for the tests at hand, but perhaps we need to
be more aggressive and disable it until CVP.

Fixes https://github.com/llvm/llvm-project/issues/53853
Refs. https://github.com/rust-lang/rust/issues/85133

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D119854
2022-02-17 12:13:55 +03:00
Florian Mayer
c195addb60 [NFC] [MTE] [HWASan] Remove unnecessary member of AllocaInfo
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119981
2022-02-16 15:19:30 -08:00
Nikita Popov
c9032f1a69 [LowerMemIntrinsics] Explicitly use i8 type in memmove lowering
By convention, memcpy/memmove intrinsics are always used with i8
pointers (though this is not enforced), so in practice this code
was always using an i8 type. Make that explicit.

Of course, i8 is not a very profitable choice, and this code could
be more performant by picking an appropriate larger type. But that
would require additional test coverage and correctness review, and
certainly shouldn't be a decision based on the pointer element type.
2022-02-16 16:31:55 +01:00
Max Kazantsev
bfc1217119 [NFC] Introduce option to switch off compatible invokes merge
Does not affect default behavior (transform is on).
2022-02-15 21:51:03 +07:00
Florian Mayer
8de457eafc [HWASAN] use common alignAndPadAlloca
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119614
2022-02-14 15:28:32 -08:00
Florian Mayer
205308de6b [NFC] [MTE] Move alignAndPadAlloca to MemoryTaggingSupport.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119610
2022-02-14 14:54:04 -08:00
Florian Mayer
6759cdd829 [NFC] [MTE] Use helpers for stack tagging.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119503
2022-02-11 16:01:46 -08:00
Florian Mayer
bf2f72fa10 [hwasan] keep debug intrinsicts in AllocaInfo.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119498
2022-02-11 16:01:02 -08:00
Florian Mayer
26dbc47468 Revert "[hwasan] keep debug intrinsicts in AllocaInfo."
This reverts commit 19fdf85f5858fa0aaae5f28f7140fbf12643993c.
2022-02-11 14:41:24 -08:00
Florian Mayer
b1bd64aeee Revert "[NFC] [MTE] Use helpers for stack tagging."
This reverts commit 8f0e5b4e26a54a3ec347d6499cbb43b15ad6352c.
2022-02-11 14:41:24 -08:00
Florian Mayer
8f0e5b4e26 [NFC] [MTE] Use helpers for stack tagging.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119503
2022-02-11 10:59:09 -08:00
Florian Mayer
19fdf85f58 [hwasan] keep debug intrinsicts in AllocaInfo.
Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D119498
2022-02-11 10:56:53 -08:00
Florian Mayer
e7356fb3e2 [nfc] [hwasan] factor out logic to collect info about stack
this is the first step in unifying some of the logic between hwasan and
mte stack tagging. this only moves around code, changes to converge
different implementations of the same logic follow later.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D118947
2022-02-11 10:54:12 -08:00
Sameer Sahasrabuddhe
d8f99bb6e0 [AMDGPU] replace hostcall module flag with function attribute
The module flag to indicate use of hostcall is insufficient to catch
all cases where hostcall might be in use by a kernel. This is now
replaced by a function attribute that gets propagated to top-level
kernel functions via their respective call-graph.

If the attribute "amdgpu-no-hostcall-ptr" is absent on a kernel, the
default behaviour is to emit kernel metadata indicating that the
kernel uses the hostcall buffer pointer passed as an implicit
argument.

The attribute may be placed explicitly by the user, or inferred by the
AMDGPU attributor by examining the call-graph. The attribute is
inferred only if the function is not being sanitized, and the
implictarg_ptr does not result in a load of any byte in the hostcall
pointer argument.

Reviewed By: jdoerfert, arsenm, kpyzhov

Differential Revision: https://reviews.llvm.org/D119216
2022-02-11 22:51:56 +05:30
Philip Reames
5ba115031d [PSE] Remove assumption that top level predicate is union from public interface [NFC*]
Note that this doesn't actually cause the top level predicate to become a non-union just yet.

The * above comes from a case in the LoopVectorizer where a predicate which is later proven no longer blocks vectorization due to a change from checking if predicates exists to whether the predicate is possibly false.
2022-02-10 16:14:52 -08:00
Philip Reames
d39f4ac494 [SCEV] Unwind SCEVUnionPredicate from getPredicatedBackedgeTakenCount [NFC]
For those curious, the whole reason for tracking the predicate set seperately as opposed to just immediately registering the dependencies appears to be allowing the printing code to print a result without changing the PSE state.  It's slightly questionable if this justifies the complexity, but since we can preserve it with local ugliness, I did so.
2022-02-09 12:55:40 -08:00
Roman Lebedev
c8ba2b67a0
[SimplifyCFG] 'merge compatible invokes': fully support indirect invokes
As long as *all* the invokes in the set are indirect,
we can merge them, but don't merge direct invokes into the set,
even though it would be legal to do.
2022-02-08 21:29:38 +03:00