510371 Commits

Author SHA1 Message Date
kadir çetinkaya
b47d7ce812
[clangd] Update TidyFastChecks for release/19.x (#106354)
Run for clang-tidy checks available in release/19.x branch.

Some notable findings:
- altera-id-dependent-backward-branch stays slow at 13%.
- misc-const-correctness became faster, going from 261% to 67%, but is
  still above the 8% threshold.
- misc-header-include-cycle is a new SLOW check with 10% runtime
  implications.
- readability-container-size-empty went from 16% to 13%, still SLOW.
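
The classification in the list above can be sketched as a simple threshold test. This is only an illustration: the percentages are the ones quoted in the message, while the names `SLOW_THRESHOLD_PERCENT` and `is_slow`, and the choice of a strict "greater than" comparison, are assumptions rather than the actual clangd logic.

```python
# Hypothetical model: a check counts as slow when its measured runtime
# overhead is above the threshold. Numbers are taken from the list above.
SLOW_THRESHOLD_PERCENT = 8

overhead_percent = {
    "altera-id-dependent-backward-branch": 13,
    "misc-const-correctness": 67,
    "misc-header-include-cycle": 10,
    "readability-container-size-empty": 13,
}


def is_slow(percent, threshold=SLOW_THRESHOLD_PERCENT):
    """Return True when the overhead exceeds the threshold (assumed strict)."""
    return percent > threshold


slow_checks = sorted(name for name, pct in overhead_percent.items() if is_slow(pct))
```

All four checks named in the message land in `slow_checks`, including misc-const-correctness despite its large improvement.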
2024-09-02 15:25:26 +02:00
Sam Tebbs
44cfbef1b3
[AArch64] Lower partial add reduction to udot or svdot (#101010)
This patch introduces lowering of the partial add reduction intrinsic to
a udot or svdot for AArch64. This also involves adding a
`shouldExpandPartialReductionIntrinsic` target hook, which AArch64 will
return false from in the cases that it can be lowered.
2024-09-02 14:06:14 +01:00
David Sherwood
df3d70b5a7
[Analysis] Add getPredicatedExitCount to ScalarEvolution (#105649)
Due to a reviewer request on PR #88385 I have created this patch
to add a getPredicatedExitCount function, which is similar to
getExitCount except that it uses the predicated backedge taken
information. With PR #88385 we will start to care about more
loops with multiple exits, and want the ability to query exit
counts for a particular exiting block. Such loops may require
predicates in order to be vectorised.

New tests added here:

Analysis/ScalarEvolution/predicated-exit-count.ll
2024-09-02 14:05:26 +01:00
Hans
ef26afcb88
Win release packaging: Don't try to use rpmalloc for 32-bit x86 (#106969)
because that doesn't work (results in `LINK : error LNK2001: unresolved
external symbol malloc`).
Based on the title of #91862 it was only intended for use in 64-bit
builds.
2024-09-02 15:04:13 +02:00
Florian Hahn
b0de7fa466
[VPlan] Use op from underlying call in computeCost if needed.
This fixes a divergence between legacy and VPlan-based cost model, e.g.
if one of the operands has a first-order recurrence phi as operand.
2024-09-02 14:00:10 +01:00
Pavel Labath
181cc75ea8
[lldb/linux] Make truncated reads work (#106532)
Previously, we were returning an error if we couldn't read the whole
region. This doesn't matter most of the time, because lldb caches memory
reads, and in that process it aligns them to cache line boundaries. As
(LLDB) cache lines are smaller than pages, the reads are unlikely to
cross page boundaries.

Nonetheless, this can cause a problem for large reads (which bypass the
cache), where we're unable to read anything even if just a single byte
of the memory is unreadable. This patch fixes the lldb-server to do
that, and also changes the linux implementation, to reuse any partial
results it got from the process_vm_readv call (to avoid having to
re-read everything again using ptrace, only to find that it stopped at
the same place).

This matches debugserver behavior. It is also consistent with the gdb
remote protocol documentation, but -- notably -- not with actual
gdbserver behavior (which returns errors instead of partial results). We
filed a [clarification bug](https://sourceware.org/bugzilla/show_bug.cgi?id=24751) several
years ago. Though we did not really reach a conclusion there, I think
this is the most logical behavior.

The associated test does not currently pass on windows, because the
windows memory read APIs don't support partial reads (I have a WIP patch
to work around that).
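
The difference between the old all-or-nothing reads and the truncated reads described above can be sketched with a toy memory model. Everything here is hypothetical (`ToyProcess`, `read_all_or_error`, `read_truncated`); it does not use lldb, `process_vm_readv` or ptrace, and only illustrates the intended result.

```python
class ToyProcess:
    """Hypothetical stand-in for a debuggee: flat memory plus readable ranges."""

    def __init__(self, memory, readable_ranges):
        self.memory = memory
        self.readable_ranges = readable_ranges  # list of (start, end) half-open

    def is_readable(self, addr):
        return any(lo <= addr < hi for lo, hi in self.readable_ranges)


def read_all_or_error(proc, addr, size):
    """Old behavior in this model: fail if any byte in the range is unreadable."""
    if not all(proc.is_readable(a) for a in range(addr, addr + size)):
        raise OSError("memory read failed")
    return proc.memory[addr:addr + size]


def read_truncated(proc, addr, size):
    """New behavior in this model: return the readable prefix of the range."""
    end = addr
    while end < addr + size and proc.is_readable(end):
        end += 1
    return proc.memory[addr:end]


# 16 bytes of memory, of which only the first 8 are readable.
proc = ToyProcess(bytes(range(16)), [(0, 8)])
partial = read_truncated(proc, 4, 8)  # bytes at addresses 4..7 only
```

A read of 8 bytes starting at address 4 crosses into the unreadable region, so the strict variant raises while the truncated variant returns the 4 bytes it could read.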
2024-09-02 14:44:18 +02:00
Nikita Popov
0c0bac94c0 [InstCombine] Add additional tests for arm intrinsic alignment (NFC) 2024-09-02 14:43:49 +02:00
David Sherwood
dc6c3ba4c4
[NFC][IR] Add CreateCountTrailingZeroElems helper (#106711)
The LoopIdiomVectorize pass already creates calls to the intrinsic
experimental_cttz_elts, but PR #88385 will start calling this more
too so I've created a helper for it.
2024-09-02 13:40:14 +01:00
Mital Ashok
dc3f66af58
[NFC] Fix dead links in TargetCXXABI.def (#96348)
http://itanium-cxx-abi.github.io/cxx-abi/

> This website may be mirrored in many places, some of which may become
stale. The current canonical location is:
>  * http://itanium-cxx-abi.github.io/cxx-abi/

https://github.com/ARM-software/abi-aa

> This is the official place for the latest documents of the Application
Binary Interface for the Arm® Architecture, both for source files and
officially released documents.
2024-09-02 14:35:21 +02:00
Martin Storsjö
b32dc67732 Revert "[compiler-rt][fuzzer] SetThreadName build fix for Mingwin attempt (#106902)"
This reverts commit 7c4cffd9d8be424e9e9542be9aec3b5a6f69073e.

This commit broke compilation in environments that don't use
winpthreads.
2024-09-02 15:25:56 +03:00
Marius Brehler
8b2ad5c8f1
[mlir][EmitC] Remove restrictions on include op (#106953)
An `emitc.include` should be usable even though the parent is not a
ModuleOp. This requirement is therefore removed.
2024-09-02 14:21:09 +02:00
kadir çetinkaya
60ed1043d7
[include-cleaner] Report refs for enum constants used through namespace aliases (#106706) 2024-09-02 14:16:49 +02:00
Nikita Popov
224112f833 [ARM] Regenerate test checks (NFC) 2024-09-02 14:15:03 +02:00
Timm Baeder
a9006bffa9
[clang][bytecode] Fix zero-init of first union member (#106962)
... if done via an ImplicitValueInitExpr.
We were already doing this later in visitZeroRecordInitializer().
2024-09-02 13:51:01 +02:00
Timm Baeder
f838d6b1b2
[clang][bytecode] Implement __noop (#106714)
This does nothing and returns 0.
2024-09-02 13:50:22 +02:00
Roger Ferrer Ibáñez
4ed90920a8
[Flang][Lower] Handle mangling of a generic name with a homonym specific procedure (#106693)
This may happen when using modules.

Fixes #93707
2024-09-02 13:48:39 +02:00
Alastair Houghton
bdfd780490
[RuntimeDyld][Windows] Allocate space for dllimport things. (#106958)
We weren't taking account of the space we require in the stubs for
things that are dllimported, and as a result we could hit the assertion
failure for running out of stub space. Fix that.

Also add a couple of `override` specifiers that were missing last time
(#102586).

rdar://133473673
2024-09-02 12:34:12 +01:00
Christian Sigg
e90b21959a
[llvm][bazel] Port 1e65b76 to bazel.
1e65b76587
2024-09-02 13:33:49 +02:00
Andrzej Warzyński
a9c71d3665
[mlir][vector] Add more tests for ConvertVectorToLLVM (5/n) (#106510) 2024-09-02 12:19:00 +01:00
Simon Pilgrim
f19dff1b80 [X86] scmp/ucmp - add SSE42/AVX2/AVX512 test coverage to show current state of vector legalization/lowering 2024-09-02 12:17:21 +01:00
Timm Bäder
f79722b932 [clang][bytecode][NFC] Move test case to -verify=both style 2024-09-02 13:15:48 +02:00
Jeremy Morse
25f87f2d70
[DebugInfo][RemoveDIs] Find types hidden in DbgRecords (#106547)
When serialising to textual IR, there can be constant Values referred to
by DbgRecords that don't appear anywhere else, and have types hidden
even deeper inside them. Enumerate these when enumerating all types.

Test by Mikael Holmén.
2024-09-02 11:56:40 +01:00
Nikita Popov
5dcea4628d [AutoUpgrade] Preserve attributes when upgrading named struct return
For example, if the argument has an alignment attribute, preserve it.
2024-09-02 12:42:52 +02:00
Tobias Gysi
751975530e
Reapply "[MLIR][LLVM] Make DISubprogramAttr cyclic" (#106571) with fixes (#106947)
This reverts commit fa93be4, restoring
commit d884b77, with fixes that ensure the CAPI declarations are
exported properly.

This commit implements LLVM_DIRecursiveTypeAttrInterface for the
DISubprogramAttr to ensure cyclic subprograms can be imported properly.
In the process multiple shortcuts around the recently introduced
DIImportedEntityAttr can be removed.
2024-09-02 12:26:15 +02:00
Brad Smith
1e65b76587
[llvm][Support] Add support for thread naming under DragonFly BSD and Solaris/illumos (#106944) 2024-09-02 06:17:40 -04:00
Benjamin Maxwell
c42512436b
[mlir][ArmSME] Rename slice move operations to insert/extract_tile_slice (#106755)
This renames:

- `arm_sme.move_tile_slice_to_vector` to `arm_sme.extract_tile_slice`
- `arm_sme.move_vector_to_tile_slice` to `arm_sme.insert_tile_slice`

The new names are more consistent with the rest of MLIR and should be
easier to understand. The current names (to me personally) are hard to
parse and easy to mix up when skimming through code.

Additionally, the syntax for `insert_tile_slice` has changed from:

```mlir
%4 = arm_sme.insert_tile_slice %0, %1, %2
  : vector<[16]xi8> into vector<[16]x[16]xi8>
```

To:

```mlir
%4 = arm_sme.insert_tile_slice %0, %1[%2]
  : vector<[16]xi8> into vector<[16]x[16]xi8>
```

This is for consistency with `extract_tile_slice`, but also helps with
readability as it makes it clear which operand is the index.
2024-09-02 11:12:40 +01:00
Nikita Popov
b9bba6ca9f
[BasicAA] Track nuw through decomposed expressions (#106512)
When we decompose the GEP offset expression, and the arithmetic is not
performed using nuw operations, we cannot retain the nuw flag on the
decomposed GEP.

For example, if we have `gep nuw p, (a-1)`, this is not at all the same
as `gep nuw (gep nuw p, a), -1`.

Fix this by tracking NUW through linear expression decomposition,
similarly to what we already do for the NSW flag.

This fixes the miscompilation reported in
https://github.com/llvm/llvm-project/pull/105496#issuecomment-2315322220.
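
The `gep nuw p, (a-1)` example can be checked with a small arithmetic model. This is a simplified sketch, assuming 64-bit pointers, byte-sized elements, and `None` standing in for poison; `gep_nuw` and `sub_wrapping` are hypothetical helpers, not LLVM APIs.

```python
BITS = 64
MASK = (1 << BITS) - 1


def sub_wrapping(a, b):
    """Plain `sub` without nuw: the result wraps modulo 2**64."""
    return (a - b) & MASK


def gep_nuw(ptr, offset):
    """Model of `gep nuw`: unsigned add of pointer and offset.

    Returns None (standing in for poison) if the unsigned addition wraps,
    or if the incoming pointer is already poison.
    """
    if ptr is None:
        return None
    total = ptr + (offset & MASK)  # the offset is interpreted as unsigned
    return None if total > MASK else total


p, a = 0x1000, 1

# gep nuw p, (a-1): the offset is 0, so no unsigned wrap occurs.
original = gep_nuw(p, sub_wrapping(a, 1))

# gep nuw (gep nuw p, a), -1: -1 is 2**64 - 1 as unsigned, so the add wraps.
decomposed = gep_nuw(gep_nuw(p, a), -1)
```

With `a = 1` the original expression yields `p`, while the naively decomposed form is poison in this model, which is why the nuw flag cannot be kept when the offset arithmetic itself is not nuw.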
2024-09-02 12:11:03 +02:00
Brad Smith
d7100111f4
[llvm][Support] Adjust maximum thread name length to the right value for OpenBSD (#106956)
The thread name length is derived from _MAXCOMLEN which is 24.
2024-09-02 06:02:24 -04:00
Nikita Popov
24fe1d4fd6
[SCCP] Infer return attributes in SCCP as well (#106732)
We can infer the range/nonnull attributes in non-interprocedural SCCP as
well. The results may be better after the function has been simplified.
2024-09-02 11:44:37 +02:00
Alastair Houghton
87d904871f
Revert "[RuntimeDyld][Windows] Allocate space for dllimport things." (#106954)
Looks like I missed an `override` (maybe that warning was enabled recently?). Will revert and fix.

Reverts llvm/llvm-project#102586
2024-09-02 10:27:28 +01:00
c8ef
eaea4d15ac
[clang] The ms-extension __noop should return zero in a constexpr context. (#106849)
Fixes #106713.
2024-09-02 11:15:44 +02:00
Tom Eccles
cde3838c43
[flang][runtime] long double isn't always f80 (#106746)
f80 is only a thing on x86, and even then the size of long double can be
changed with compiler flags. Instead set the size according to the host
system (this is what is already done for integer types).
2024-09-02 10:12:43 +01:00
Alastair Houghton
a0a253181e
[RuntimeDyld][Windows] Allocate space for dllimport things. (#102586)
We weren't taking account of the space we require in the stubs for
things that are dllimported, and as a result we could hit the assertion
failure for running out of stub space. Fix that.

rdar://133473673

---------

Co-authored-by: Saleem Abdulrasool <compnerd@compnerd.org>
Co-authored-by: Lang Hames <lhames@gmail.com>
Co-authored-by: Ben Barham <b.n.barham@gmail.com>
2024-09-02 10:07:11 +01:00
Yingwei Zheng
a156b5a47d
[SLP] Add vectorization support for [u|s]cmp (#106747)
This patch adds vectorization support for [u|s]cmp intrinsic calls.
2024-09-02 17:06:07 +08:00
Owen Pan
0fa78b6c7b
[clang-format] Correctly annotate braces in macro definition (#106662)
Fixes #106418.
2024-09-02 01:40:13 -07:00
Nikita Popov
34b10e165d [InstCombine] Remove optional LoopInfo dependency
https://github.com/llvm/llvm-project/pull/106075 has removed the
last dependency on LoopInfo in InstCombine, so don't fetch the
analysis anymore and remove the use-loop-info pass option.
2024-09-02 10:25:45 +02:00
David Spickett
5bd3ee0ac0
[libcxx][test] Use long double test macro in strong_order.pass.cpp (#106742) 2024-09-02 09:06:14 +01:00
Antonio Frighetto
d79c4c1119 [CGP] Regenerate revert-constant-ptr-propagation-on-calls.ll test (NFC)
Multiple buildbots were previously failing.
2024-09-02 09:55:43 +02:00
Oliver Stannard
9cf68679c4
[ARM] Fix failure to register-allocate CMP_SWAP_64 pseudo-inst (#106721)
This test case was failing to compile with a "ran out of registers
during register allocation" error at -O0. This was because CMP_SWAP_64
has 3 operands which must be an even-odd register pair, and two other
GPR operands. All of the def operands are also early-clobber, so
registers can't be shared between uses and defs. Because the function
has an over-aligned alloca it needs frame and base pointers, so r6 and
r11 are both reserved. That leaves r0/r1, r2/r3, r4/r5 and r8/r9 as the
only valid register pairs, and if the two individual GPR operands happen
to get allocated to registers in different pairs then only 2 pairs will
be available for the three GPRPair operands.

To fix this, I've merged the two GPR operands into a single GPRPair
operand. This means that the instruction now has 4 GPRPair operands,
which can always be allocated without relying on luck. This does
constrain register allocation a bit more, but this pseudo instruction is
only used at -O0, so I don't think that's a problem.
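
The counting argument above can be replayed with a short enumeration. This is a sketch under a simplifying assumption: it only considers the case where the two individual GPR operands land in registers belonging to the four listed pairs. `PAIRS` and `free_pairs` are names made up for the illustration.

```python
from itertools import combinations

# Even-odd pairs left once r6 and r11 are reserved (from the message above).
PAIRS = [("r0", "r1"), ("r2", "r3"), ("r4", "r5"), ("r8", "r9")]


def free_pairs(taken):
    """Pairs with neither register occupied by an individual GPR operand."""
    return [p for p in PAIRS if not (set(p) & set(taken))]


regs = [r for pair in PAIRS for r in pair]

# Try every placement of the two individual GPR operands.
counts = [len(free_pairs(choice)) for choice in combinations(regs, 2)]
worst_case = min(counts)  # GPRs land in different pairs
best_case = max(counts)   # GPRs happen to share one pair
```

In the worst case only 2 pairs remain for the 3 pair operands, so allocation fails. After the fix there are 4 pair operands and 4 pairs, which always fits.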
2024-09-02 08:54:10 +01:00
Nikita Popov
30cc198c2d
[APInt] Add default-disabled assertion to APInt constructor (#106524)
If the uint64_t constructor is used, assert that the value is actually a
signed or unsigned N-bit integer depending on whether the isSigned flag
is set. Provide an implicitTrunc flag to restore the previous behavior,
where the argument is silently truncated instead.

In this commit, implicitTrunc is enabled by default, which means that
the new assertions are disabled and no actual change in behavior occurs.
The plan is to flip the default once all places violating the assertion
have been fixed. See #80309 for the scope of the necessary changes.

The primary motivation for this change is to avoid incorrectly specified
isSigned flags. A recurring problem we have is that people write
something like `APInt(BW, -1)` and this works perfectly fine -- until
the code path is hit with `BW > 64`. Most of our i128 specific
miscompilations are caused by variants of this issue.

The cost of the change is that we have to specify the correct isSigned
flag (and make sure there are no excess bits) for uses where BW is
always <= 64 as well.
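
The `APInt(BW, -1)` pitfall can be illustrated with a toy model of the constructor. This is not the LLVM implementation: `apint` and its parameters are hypothetical Python stand-ins that mirror the behavior described above, with a `ValueError` playing the role of the new assertion.

```python
U64_MASK = (1 << 64) - 1


def apint(bit_width, val, is_signed=False, implicit_trunc=True):
    """Toy model: build a bit_width-bit value from a 64-bit argument."""
    raw = val & U64_MASK  # the argument arrives as a uint64_t
    width_mask = (1 << bit_width) - 1
    if is_signed:
        # Reinterpret the 64 bits as signed, then sign-extend or truncate.
        signed = raw - (1 << 64) if raw >> 63 else raw
        half = 1 << (bit_width - 1)
        if not implicit_trunc and not (-half <= signed < half):
            raise ValueError("value is not a signed N-bit integer")
        return signed & width_mask
    if not implicit_trunc and raw > width_mask:
        raise ValueError("value is not an unsigned N-bit integer")
    return raw & width_mask  # zero-extend or silently truncate


narrow = apint(32, -1)                   # all 32 bits set, by truncation
wide = apint(128, -1)                    # only the low 64 bits are set
wide_ok = apint(128, -1, is_signed=True)  # all 128 bits set
```

With the default flags, `apint(32, -1)` happens to give all ones, but the same call at 128 bits leaves the upper half zero. Passing `implicit_trunc=False` makes the 32-bit call fail in this model, which corresponds to forcing callers to state `is_signed` correctly.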
2024-09-02 09:48:54 +02:00
Antonio Frighetto
e4e0dfb0c2 [CGP] Undo constant propagation of pointers across calls
It may be profitable to revert SCCP propagation of C++ static values,
if such constants are pointers, in order to avoid redundant pointer
computation, since the method returning the constant is non-removable.
2024-09-02 09:33:23 +02:00
Antonio Frighetto
ed6d9f6d2a [CGP] Introduce test for PR102926 (NFC) 2024-09-02 09:33:23 +02:00
pudge62
fe1006b7f2
[TSan] fix crash when symbolize on darwin platforms (#99441)
The `dli_sname` field in `Dl_info` may be `NULL`, which could cause a
crash.
Lang Hames
08a72cbd6b [clang] Bump up DIAG_SIZE_SEMA by 500 for downstream diagnostics.
Recently added HLSL diagnostics (89fb8490a99e) pushed the Swift compiler over
the existing limit.

rdar://135126738
2024-09-02 17:29:07 +10:00
Craig Topper
cd3667d1db
[CodeGen] Update a few places that were passing Register to raw_ostream::operator<< (#106877)
These would implicitly cast the register to `unsigned`. Switch most of
them to use printReg, which gives more readable output. Change some
others to use Register::id() so we can eventually remove the implicit
cast to `unsigned`.
2024-09-02 00:19:19 -07:00
Craig Topper
c950ecb90e
[RISCV] Remove zfbfmin.ll. NFC (#106937)
Most of it is redundant with bfloat-convert.ll. One testcase is found in
bfloat-imm.ll. The load and stores are more thoroughly tested in
bfloat-mem.ll.
2024-09-02 00:18:52 -07:00
Nikita Popov
f044564db1
[InstCombine] Make backedge check in op of phi transform more precise (#106075)
The op of phi transform wants to prevent moving an operation across a
backedge, as this may lead to an infinite combine loop.

Currently, this is done using isPotentiallyReachable(). The problem with
that is that all blocks inside a loop are reachable from each other.
This means that the op of phi transform is effectively completely
disabled for code inside loops, even when it's not actually operating on
a loop phi (just a phi that happens to be in a loop).

Fix this by explicitly computing the backedges inside the function
instead. Do this via RPOT, which is a bit more efficient than using
FindFunctionBackedges() (which does it without any pre-computed
analyses).

For irreducible cycles, there may be multiple possible choices of
backedge, and this just picks one of them. This is still sufficient to
prevent combine loops.

This also removes the last use of LoopInfo in InstCombine -- I'll drop
the analysis in a followup.
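
Computing backedges from a reverse post-order traversal can be sketched as follows. This is a minimal model, assuming the CFG is given as a dictionary of successor lists; the function names and the example CFG are made up and this is not the InstCombine code itself.

```python
def reverse_post_order(succs, entry):
    """Blocks reachable from entry, in reverse post-order (iterative DFS)."""
    visited, post = {entry}, []
    stack = [(entry, iter(succs.get(entry, ())))]
    while stack:
        block, successors = stack[-1]
        for s in successors:
            if s not in visited:
                visited.add(s)
                stack.append((s, iter(succs.get(s, ()))))
                break
        else:
            # All successors handled: the block is finished.
            post.append(block)
            stack.pop()
    return post[::-1]


def compute_back_edges(succs, entry):
    """An edge is treated as a backedge if its target was already visited."""
    seen, back_edges = set(), set()
    for block in reverse_post_order(succs, entry):
        seen.add(block)
        for s in succs.get(block, ()):
            if s in seen:
                back_edges.add((block, s))
    return back_edges


# A loop whose body contains an if/else diamond merging in "latch".
cfg = {
    "entry": ["header"],
    "header": ["then", "else"],
    "then": ["latch"],
    "else": ["latch"],
    "latch": ["header", "exit"],
    "exit": [],
}
back_edges = compute_back_edges(cfg, "entry")
```

Only `latch -> header` is reported as a backedge. The edges into `latch` from `then` and `else` are not, so a phi in `latch` is an ordinary merge even though every block in the loop is reachable from every other.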
2024-09-02 09:09:21 +02:00
Brad Smith
d2ce9dc85e
Add support for retrieving the thread ID on DragonFly BSD (#106938) 2024-09-02 02:38:23 -04:00
Pavel Labath
dd5d730072
[lldb] Better matching of types in anonymous namespaces (#102111)
This patch extends TypeQuery matching to support anonymous namespaces. A
new flag is added to control the behavior. In the "strict" mode, the
query must match the type exactly -- all anonymous namespaces included.
The dynamic type resolver in the itanium abi (the motivating use case
for this) uses this flag, as it queries using the name from the
demangler, which includes anonymous namespaces.

This ensures we don't confuse a type with a same-named type in an
anonymous namespace. However, this does *not* ensure we don't confuse
two types in anonymous namespaces (in different CUs). To resolve this, we
would need to use a completely different lookup algorithm, which
probably also requires a DWARF extension.

In the "lax" mode (the default), the anonymous namespaces in the query
are optional, and this allows one to search for the type using the usual
language rules (`::A` matches `::(anonymous namespace)::A`).

This patch also changes the type context computation algorithm in
DWARFDIE, so that it includes anonymous namespace information. This
causes a slight change in behavior: the algorithm previously stopped
computing the context after encountering an anonymous namespace, which
caused the outer namespaces to be ignored. This meant that a type like
`NS::(anonymous namespace)::A` would be (incorrectly) recognized as
`::A`. This can cause code depending on the old behavior to misbehave.
The fix is to specify all the enclosing namespaces in the query, or use
a non-exact match.
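
The strict and lax modes can be sketched with a toy matcher over lists of context components. This is only an illustration under stated assumptions: queries are anchored at the global namespace (as in `::A`), and `matches` and `ANON` are hypothetical names, not the lldb TypeQuery API.

```python
ANON = "(anonymous namespace)"


def matches(query, type_context, strict=False):
    """Toy model of matching a fully qualified query against a type's context.

    strict: every component, anonymous namespaces included, must match.
    lax:    anonymous namespaces in the type may be omitted from the query.
    """
    if strict:
        return query == type_context
    qi = 0
    for component in type_context:
        if qi < len(query) and query[qi] == component:
            qi += 1  # component matched the next part of the query
        elif component == ANON:
            continue  # anonymous namespace left out of the query
        else:
            return False  # a named component cannot be skipped
    return qi == len(query)


lax_hit = matches(["A"], [ANON, "A"])                    # ::A vs ::(anon)::A
strict_miss = matches(["A"], [ANON, "A"], strict=True)
outer_ns_miss = matches(["A"], ["NS", ANON, "A"])        # NS is not optional
outer_ns_hit = matches(["NS", "A"], ["NS", ANON, "A"])
```

In this model `::A` finds `::(anonymous namespace)::A` only in lax mode, and `NS::(anonymous namespace)::A` is no longer mistaken for `::A` because the named outer namespace must appear in the query.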
2024-09-02 08:34:14 +02:00
Akshat Oke
da13754103
AMDGPU/NewPM Port SILoadStoreOptimizer to NPM (#106362) 2024-09-02 11:41:56 +05:30