57409 Commits

Author SHA1 Message Date
Thorsten Schütt
71ac1eb509
Revert "[GlobalISel] Combine [s,z]ext of undef into 0" (#118746)
Reverts llvm/llvm-project#117439
2024-12-05 07:48:20 +01:00
Craig Topper
3e0e1c13ce
[RISCV][GISel] Support fp128 arithmetic and conversion for RV64. (#118707)
We can support these via libcalls in libgcc/compiler-rt or integer
operations for fneg/fabs/fcopysign. fp128 values will be passed in two
64-bit GPRs according to the psABI.

Supporting RV32 requires sret which is not supported by libcall handling
in LegalizerHelper.cpp yet. It doesn't call canLowerReturn.
2024-12-04 21:43:29 -08:00
Kazu Hirata
50f8580e2c
[memprof] Add IndexedMemProfData::addFrame (#118724)
This patch adds a helper function to replace an idiom like:

  FrameId Id = F.hash();
  MemProfData.Frames.try_emplace(Id, F);
  // Do something with Id.
2024-12-04 20:33:35 -08:00
Kazu Hirata
7b8cf147ad
[memprof] Update YAML traits for writer purposes (#118720)
For Frames, we prefer the inline notation for the brevity.

For PortableMemInfoBlock, we go through all member fields and print
out those that are populated.
2024-12-04 19:23:27 -08:00
Matthias Springer
f50ce316ec
[llvm][NFC] APFloat: Add missing semantics to enum (#117291)
* Add missing semantics to the `Semantics` enum.
* Move all documentation of the semantics to the header file.
* Also rename some functions for consistency.
2024-12-04 14:58:59 -08:00
Zequan Wu
2e425bf629 Reapply "[lldb][dwarf] Compute fully qualified names on simplified template names with DWARFTypePrinter (#117071)"
9de73b20404f0b2db1cbf70d164cfe0789d5bb94 lands a fix to DWARFTypePrinter that is used by lldb in this change.
2024-12-04 13:05:36 -08:00
Matt Arsenault
e0f52538c9
AMDGPU: Change bitop3 intrinsic operand to i32 (#118647) 2024-12-04 15:44:04 -05:00
Sander de Smalen
048fc2bc10
[LiveIntervals] Ignore artificial regs when adding kill flags (#116963)
If parts of a physical register for a given liverange, as assigned by
the register allocator, can be used to store other values not
represented by this liverange, then `LiveIntervals::addKillFlags`
normally avoids adding a kill flag on the use of this register
when the value's liverange ends.

However, if all the other regunits are artificial, then we can
still safely add the kill flag, since those parts of the register
can never be accessed independently.
2024-12-04 20:25:31 +00:00
Kazu Hirata
c3d15188cf
[memprof] Move YAML traits to MemProf.h (NFC) (#118668)
This patch moves the MemProf YAML traits to MemProf.h so that the YAML
writer can access them from outside MemProfReader.cpp in the future.
2024-12-04 12:01:39 -08:00
Florian Hahn
9e66206638
[Passes] Generalize ShouldRunExtraVectorPasses to allow re-use (NFCI). (#118323)
Generalize ShouldRunExtraVectorPasses to ShouldRunExtraPasses, to allow
re-use for other transformations.

PR: https://github.com/llvm/llvm-project/pull/118323
2024-12-04 16:55:06 +00:00
Thorsten Schütt
148fdc519c
[GlobalISel] Add G_ABDS and G_ABDU instructions (#118122)
The DAG has the same instructions: the signed and unsigned absolute
difference of it's input. For AArch64, they map to uabd and sabd for
Neon and SVE. The Neon and SVE instructions will require custom
patterns.

They are pseudo opcodes and are not imported by the IRTranslator. We
need combines to create them.

PowerPC, ARM, and AArch64 have native instructions.

/// i.e trunc(abs(sext(Op0) - sext(Op1))) becomes abds(Op0, Op1) 
///  or trunc(abs(zext(Op0) - zext(Op1))) becomes abdu(Op0, Op1)

For GlobalISel, we are going to write the combines in MIR patterns.

see:
llvm/test/CodeGen/AArch64/abd-combine.ll

- [ ] combine into abd
- [ ] legalize and add td patterns
2024-12-04 12:53:15 +01:00
John Brawn
ecbe4d1e36
[IR] Allow fast math flags on fptrunc and fpext (#115894)
This consists of:
 * Make these instructions part of FPMathOperator.
* Adjust bitcode/ir readers/writers to expect fast math flags on these
instructions.
 * Make IRBuilder set the fast math flags on these instructions.
 * Update langref and release notes.
* Update a bunch of tests. Some of these are due to InstCombineCasts
incorrectly adding fast math flags to fptrunc, which will be fixed in a
later patch.
2024-12-04 10:53:04 +00:00
Sam Elliott
73731d6873
[llvm-tblgen] Increase Coverage Index Size (#118329) 2024-12-04 09:19:13 +00:00
Akshat Oke
d9b4bdbff5
[CodeGen][NewPM] Port LiveDebugVariables to NPM (#115468)
The existing analysis was already a pimpl wrapper.

I have extracted legacy pass logic to a LDVImpl wrapper named
`LiveDebugVariables` which is the analysis::Result now. This controls
whether to activate the LDV (depending on `-live-debug-variables` and
DIsubprogram) itself.

The legacy and new analysis only construct the LiveDebugVariables.

VirtRegRewriter will test this.
2024-12-04 14:31:34 +05:30
ronryvchin
ff281f7d37
[PGO] Add option to always instrumenting loop entries (#116789)
This patch extends the PGO infrastructure with an option to prefer the
instrumentation of loop entry blocks.
This option is a generalization of
19fb5b467b,
and helps to cover cases where the loop exit is never executed.
An example where this can occur are event handling loops.

Note that change does NOT change the default behavior.
2024-12-04 07:56:46 +01:00
Akshat Oke
92ed7e2924
[CodeGen][PM] Use errs() instead of dbgs() in printer passes (#118469)
Printing passes is not exactly a debug activity, it is used in release (and dbgs() is errs() in release)
2024-12-04 12:21:33 +05:30
Craig Topper
b076fbb844
[TargetLowering] Use Type* instead of EVT in shouldSignExtendTypeInLibCall. (#118587)
I want to use this function for GISel too so Type * is a better common
interface. All of the callers already convert EVT to Type * as needed
by calling lowering anyway.
2024-12-03 22:06:55 -08:00
Kyungwoo Lee
4f41862c5a Reapply "[StructuralHash] Global Variable (#118412)"
This reverts commit 6a0d6fc2e92bcfb7cb01a4c6cdd751a9b4b4c159.
2024-12-03 21:33:03 -08:00
Lang Hames
3e11ae69ab [ORC] Merge ostream operators for SymbolStringPtrs into SymbolStringPool.h. NFC.
These are simple and commonly used. Having them in the SymbolStringPool header
saves clients from having to #include "DebugUtils.h" everywhere.
2024-12-04 16:06:46 +11:00
Kazu Hirata
a93b77ce49
[memprof] Fix IndexedMemProfRecord::clear (#118533)
This patch ensures that IndexedMemProfRecord::clear clears every field
of IndexedMemProfRecord.

This fix is not critical at the moment.  The only use of this function
is in RecordWriterTrait::EmitData to release the memory we are done
with.  That is, we never clear the data structure for the purpose of
reusing it.
2024-12-03 19:23:27 -08:00
Craig Topper
caa8aa551b
[SelectionDAG] Rename CallOptions::IsSExt to IsSigned. NFC (#118574)
This is eventually passed to shouldSignExtendTypeInLibCall which calls
it IsSigned.
2024-12-03 18:25:44 -08:00
Kyungwoo Lee
6a0d6fc2e9 Revert "[StructuralHash] Global Variable (#118412)"
This reverts commit 1afb81dfaf902c1c42bd91fec1a7385e6e1529d3.
2024-12-03 17:19:30 -08:00
Shubham Sandeep Rastogi
259bdc0033 Revert "Reland "[NFC] Move DroppedVariableStats to its own file and redesign it to be extensible. (#117042)" (#118546)"
This reverts commit 0c8928d456ac3ef23ed25bfc9e5d491dd7b62a11.

Broke Bot: https://lab.llvm.org/buildbot/#/builders/76/builds/5008

error: undefined reference to `vtable for llvm::DroppedVariableStatsIR'
2024-12-03 16:50:53 -08:00
Kyungwoo Lee
1afb81dfaf
[StructuralHash] Global Variable (#118412)
This update enhances the implementation of structural hashing for global
variables, using their initial contents. Private global variables or
constants are often used for metadata, where their names are not unique.
This can lead to the creation of different hash results although they
could be merged by the linker as they are effectively identical.
- Refine the hashing of GlobalVariables for strings or certain
Objective-C metadata cases that have section names. This can be further
extended to other scenarios.
- Expose StructuralHash for GlobalVariable so that this API can be
utilized by MachineStableHashing, which is also employed in the global
function outliner.

This change significantly improves size reduction by an additional 1% on
the LLD binary when the global function outliner and merger are enabled
together. As discussed in the RFC
https://discourse.llvm.org/t/loh-conflicting-with-machineoutliner/83279/8?u=kyulee-com,
if we disable or relocate the LOH pass, the size impact could increase
to 4%.
2024-12-03 16:01:50 -08:00
Shubham Sandeep Rastogi
0c8928d456
Reland "[NFC] Move DroppedVariableStats to its own file and redesign it to be extensible. (#117042)" (#118546)
Removed the virtual destructor in the derived
class DroppedVariableStatsIR
2024-12-03 14:13:06 -08:00
Florian Hahn
45ff28746f
[ConstraintSystem] Fix signed overflow in negate.
Use AddOverflow for potentially overflowing addition to fixed signed
integer overflow.

Compile-time impact is in the noise
https://llvm-compile-time-tracker.com/compare.php?from=bfb26202e05ee2932b4368b5fca607df01e8247f&to=195b0707148b567c674235e59712458e7ce1bb0e&stat=instructions:u
2024-12-03 21:06:36 +00:00
Shubham Sandeep Rastogi
80987ef4b6 Revert "Reland [NFC] Move DroppedVariableStats to its own file and redesign it to be extensible. (#117042)"
This reverts commit acf3b1aa932b2237c181686e52bc61584a80a3ff.

Broke https://lab.llvm.org/buildbot/#/builders/76/builds/5002

tools/clang/lib/CodeGen/CMakeFiles/obj.clangCodeGen.dir/BackendUtil.cpp.o:(.toc+0x258): undefined reference to `vtable for llvm::DroppedVariableStatsIR'
2024-12-03 12:51:24 -08:00
Shubham Sandeep Rastogi
d8b5af4504 Revert "Reland "Add a pass to collect dropped var stats for MIR" (#117044)"
This reverts commit 249755cedb17ffa707253edcef1a388f807caa35.

Broke https://lab.llvm.org/buildbot/#/builders/160/builds/9420

Note: This is test shard 99 of 154.
[==========] Running 2 tests from 2 test suites.
[----------] Global test environment set-up.
[----------] 1 test from DroppedVariableStatsMIR
[ RUN      ] DroppedVariableStatsMIR.InlinedAt

--
exit: -11
2024-12-03 12:50:13 -08:00
Shubham Sandeep Rastogi
249755cedb
Reland "Add a pass to collect dropped var stats for MIR" (#117044)
Moved the MIR Test to the unittests/CodeGen folder

This is patch is part of a stack of patches, and follows
https://github.com/llvm/llvm-project/pull/117042

I moved the MIR test to the unittests/CodeGen folder 

I am trying to reland https://github.com/llvm/llvm-project/pull/115566
2024-12-03 12:37:30 -08:00
Shubham Sandeep Rastogi
acf3b1aa93
Reland [NFC] Move DroppedVariableStats to its own file and redesign it to be extensible. (#117042)
Moved the IR unit test to the CodeGen folder to resolve linker errors:

`error: undefined reference to 'vtable for
llvm::DroppedVariableStatsIR'`

This patch is trying to reland
https://github.com/llvm/llvm-project/pull/115563
2024-12-03 10:39:40 -08:00
Ramkumar Ramachandra
51a895aded
IR: introduce struct with CmpInst::Predicate and samesign (#116867)
Introduce llvm::CmpPredicate, an abstraction over a floating-point
predicate, and a pack of an integer predicate with samesign information,
in order to ease extending large portions of the codebase that take a
CmpInst::Predicate to respect the samesign flag.

We have chosen to demonstrate the utility of this new abstraction by
migrating parts of ValueTracking, InstructionSimplify, and InstCombine
from CmpInst::Predicate to llvm::CmpPredicate. There should be no
functional changes, as we don't perform any extra optimizations with
samesign in this patch, or use CmpPredicate::getMatching.

The design approach taken by this patch allows for unaudited callers of
APIs that take a llvm::CmpPredicate to silently drop the samesign
information; it does not pose a correctness issue, and allows us to
migrate the codebase piece-wise.
2024-12-03 13:31:04 +00:00
Julian Nagele
4dafb091a0
Revert "Add symbol visibility macros to abi-breaking.h.cmake" (#118464)
Reverts llvm/llvm-project#110898

This change has caused a cyclic module dependency `fatal error: cyclic
dependency in module 'LLVM_Utils': LLVM_Utils ->
LLVM_Config_ABI_Breaking -> LLVM_Utils`. Reverting for now until we the
right fix.
2024-12-03 10:28:12 +00:00
Yingwei Zheng
78e35e4c0e
[ValueTracking] Fix typo in isKnownNegative and MaskedValueIsZero NFC. (#118438)
Fix typos introduced by
d9e8ae7d2f
and
42b6c8ed3a.
2024-12-03 17:14:55 +08:00
Antonio Frighetto
1d6ab189be [MemCpyOpt] Drop dead memmove calls on memset'd source data
When a memmove happens to clobber source data, and such data have
been previously memset'd, the memmove may be redundant.
2024-12-03 09:50:57 +01:00
Lang Hames
d09707070c Re-apply "[ORC][JITLink] Add jitlink::Scope::SideEffectsOnly" with fixes.
This reapplies aba6bb0820b, which was reverted in 28e2a891210 due to bot
failures. It contains fixes to silence warnings for uncovered switches,
and for incorrect initializer-symbol handling on ELF and COFF.
2024-12-03 08:00:29 +00:00
Fangrui Song
e776484a02 [InstructionCost] Optimize operator==
`!(*this < RHS) && !(RHS < *this)` is difficult for the optimizer to
reason about.
2024-12-02 23:22:33 -08:00
Thorsten Schütt
45162635bf
[GlobalISel] Combine [s,z]ext of undef into 0 (#117439)
Alternative for https://github.com/llvm/llvm-project/pull/113764

It builds on a minimalistic approach with the legality check in match
and a blind apply. The precise patterns are used for better compile-time
and modularity. It also moves the pattern check into combiner. While
unary_undef_to_zero and propagate_undef_any_op rely on custom C++ code
for pattern matching.

Is there a limit on the number of patterns?

G_ANYEXT of undef -> undef
G_SEXT of undef -> 0
G_ZEXT of undef -> 0

The combine is not a member of the post legalizer combiner for AArch64.

Test:
llvm/test/CodeGen/AArch64/GlobalISel/combine-cast.mir
2024-12-03 07:14:49 +01:00
Lang Hames
28e2a89121 Revert "[ORC][JITLink] Add jitlink::Scope::SideEffectsOnly, use it in ORC Platforms."
This reverts commit aba6bb0820b247d4caf4b5e00810909214a58053 while I investigate bot
failures (e.g. https://lab.llvm.org/buildbot/#/builders/143/builds/3848)
2024-12-03 16:56:23 +11:00
Lang Hames
aba6bb0820 [ORC][JITLink] Add jitlink::Scope::SideEffectsOnly, use it in ORC Platforms.
SideEffectsOnly is a new jitlink::Scope value that corresponds to the
JITSymbolFlags::MaterializationSideEffectsOnly flag: Symbols with this scope
can be looked up (and form part of the initial interface of a LinkGraph) but
never actually resolve to an address (so can only be looked up with a
WeaklyReferencedSymbol lookup).

Previously ObjectLinkingLayer implicitly treated JITLink symbols as having this
scope, regardless of a Symbol's actual scope, if the
MaterializationSideEffectsOnly flag was set on the corresponding symbol in the
MaterializationResponsibility object. Using an explicit scope in JITLink for
this (1) allows JITLink plugins to identify and correctly handle
side-effects-only symbols, and (2) allows raw LinkGraphs to define
side-effects-only symbols without clients having to manually modify their
`MaterializationUnit::Interface`.
2024-12-03 15:36:03 +11:00
Mingming Liu
6faf17b762
[ThinLTO]Supports declaration import for global variables in distributed ThinLTO (#117616)
When `-import-declaration` option is enabled, declaration import is
supported for functions. https://github.com/llvm/llvm-project/pull/88024
has the context for this option.

This patch supports declaration import for global variables in
distributed ThinLTO. The motivating use case is to propagate `dso_local`
attribute of global variables across modules, to optimize global
variable access when a binary is built with
`-fno-direct-access-external-data`.
* With `-fdirect-access-external-data`, non thread-local global
variables will [have `dso_local`
attributes](fe3c23b439/clang/lib/CodeGen/CodeGenModule.cpp (L1730-L1746)).
This optimizes the global variable access as shown by
https://gcc.godbolt.org/z/vMzWcKdh3
2024-12-02 16:15:52 -08:00
Justin Bogner
bd92e46204
[HLSL] Implement RWBuffer::operator[] via __builtin_hlsl_resource_getpointer (#117017)
This introduces `__builtin_hlsl_resource_getpointer`, which lowers to
`llvm.dx.resource.getpointer` and is used to implement indexing into
resources.

This will only work through the backend for typed buffers at this point,
but the changes to structured buffers should be correct as far as the
frontend is concerned.

Note: We probably want this to return a reference in the HLSL device
address space, but for now we're just using address space 0. Creating a
device address space and updating this code can be done later as
necessary.

Fixes #95956
2024-12-02 14:03:31 -08:00
Florian Hahn
4226e0a0c7
[TTI] Add SCEVExpansionBudget to loop unrolling options. (#118316)
Add an extra know to UnrollingPreferences to let backends control the
maximum budget for SCEV expansions.

This gives backends more fine-grained control on the cost of the runtime
checks for runtime unrolling.

PR: https://github.com/llvm/llvm-project/pull/118316
2024-12-02 21:35:00 +00:00
Krzysztof Parzyszek
bde79c0e27
[flang][OpenMP] Use new modifiers in DEPEND/GRAINSIZE/NUM_TASKS (#117917)
The usual changes, added more references to OpenMP specs.
2024-12-02 15:22:05 -06:00
Krzysztof Parzyszek
cdbd22876b
[flang][OpenMP] Use new modifiers in ALLOCATE clause (#117627)
Again, this simplifies the semantic checks and lowering quite a bit.
Update the check for positive alignment to use a more informative
message, and to highlight the modifier itsef, not the whole clause.
Remove the checks for the allocator expression itself being positive:
there is nothing in the spec that says that it should be positive.

Remove the "simple" modifier from the AllocateT template, since both
simple and complex modifiers are the same thing, only differing in
syntax.
2024-12-02 08:11:49 -06:00
Devajith
e48c7fe49f
Reland '[lineeditor] Add setHistorySize() method for adjusting history size' (#115442)
Reland: https://github.com/llvm/llvm-project/pull/110092

Buildbot was failing due to an unused variable warning as there were 2
implementations (with and without libedit). Compared to last version,
wrap the variable inside the `#ifdef HAVE_LIBEDIT`.

Apologies for the oversight @vgvassilev
2024-12-02 15:17:51 +02:00
Veera
979a0356d4
[InstCombine] Fold X Pred C2 ? X BOp C1 : C2 BOp C1 to min/max(X, C2) BOp C1 (#116888)
Fixes #82414.

General Proof: https://alive2.llvm.org/ce/z/ERjNs4 
Proof for Tests: https://alive2.llvm.org/ce/z/K-934G

This PR transforms `select` instructions of the form `select (Cmp X C1)
(BOp X C2) C3` to `BOp (min/max X C1) C2` iff `C3 == BOp C1 C2`.

This helps in eliminating a noop loop in
https://github.com/rust-lang/rust/issues/123845 but does not improve
optimizations.
2024-12-02 09:33:45 +01:00
Adam Yang
0a44b24d66
[DXIL] Add GroupMemoryBarrierWithGroupSync intrinsic (#114349)
fixes #112974
partially fixes #70103

An earlier version of this change was reverted so some issues could be fixed.

### Changes
- Added new tablegen based way of lowering dx intrinsics to DXIL ops.
- Added int_dx_group_memory_barrier_with_group_sync intrinsic in
IntrinsicsDirectX.td
- Added expansion for int_dx_group_memory_barrier_with_group_sync in
DXILIntrinsicExpansion.cpp`
- Added DXIL backend test case

### Related PRs
* [[clang][HLSL] Add GroupMemoryBarrierWithGroupSync intrinsic
#111883](https://github.com/llvm/llvm-project/pull/111883)
* [[SPIRV] Add GroupMemoryBarrierWithGroupSync intrinsic
#111888](https://github.com/llvm/llvm-project/pull/111888)
2024-12-01 22:31:40 -08:00
Daniil Kovalev
a9ad9e27ca
[PAC][llvm-readobj][AArch64] Move PAuth GOT relocs out of private space (#118214)
Apply change from the spec
https://github.com/ARM-software/abi-aa/pull/300
2024-12-02 08:53:49 +03:00
Fangrui Song
574f64ca61 [TimeProfiler] Remove unneeded check 2024-12-01 15:12:57 -08:00
Yingwei Zheng
f7ef0721d6
[SCEV] Do not allow refinement in the rewriting of BEValue (#117152)
See the following case:
```
; bin/opt -passes="print<scalar-evolution>" test.ll --disable-output
define i32 @widget() {
b:
  br label %b1

b1:                                              ; preds = %b5, %b
  %phi = phi i32 [ 0, %b ], [ %udiv6, %b5 ]
  %phi2 = phi i32 [ 1, %b ], [ %add, %b5 ]
  %icmp = icmp eq i32 %phi, 0
  br i1 %icmp, label %b3, label %b8

b3:                                              ; preds = %b1
  %udiv = udiv i32 10, %phi2
  %urem = urem i32 %udiv, 10
  %icmp4 = icmp eq i32 %urem, 0
  br i1 %icmp4, label %b7, label %b5

b5:                                              ; preds = %b3
  %udiv6 = udiv i32 %phi2, 0
  %add = add i32 %phi2, 1
  br label %b1

b7:                                              ; preds = %b3
  ret i32 5

b8:                                              ; preds = %b1
  ret i32 7
}
```
```
%phi2 = phi i32 [ 1, %b ], [ %add, %b5 ] -->  {1,+,1}<nuw><nsw><%b1>
%udiv6 = udiv i32 %phi2, 0 --> ({1,+,1}<nuw><nsw><%b1> /u 0)
%phi = phi i32 [ 0, %b ], [ %udiv6, %b5 ] --> ({0,+,1}<nuw><nsw><%b1> /u 0)
```
`ScalarEvolution::createAddRecFromPHI` gives a wrong SCEV result for
`%phi`:

d7d6fb1804/llvm/lib/Analysis/ScalarEvolution.cpp (L5926-L5950)
It converts `phi(0, ({1,+,1}<nuw><nsw><%b1> /u 0))` into `phi(0 / 0,
({1,+,1}<nuw><nsw><%b1> /u 0))`. Then it simplifies the expr into
`{0,+,1}<nuw><nsw><%b1> /u 0`.

As we did in
acd700a24b,
this patch disallows udiv simplification if we cannot prove that the
denominator is a well-defined non-zero value.

Fixes https://github.com/llvm/llvm-project/issues/117133.
2024-12-01 20:11:09 +08:00