1248 Commits

Author SHA1 Message Date
Philip Reames
4d629f9744
[MIR] Remove std::variant from multiple save/restore point handling [nfc] (#153226)
In review of bbde6b, I had originally proposed that we support the
legacy text format. As review evolved, it became clear this had been a
bad idea (too much complexity), but in order to let that patch finally
move forward, I approved the change with the variant. This change undoes
the variant, and updates all the tests to just use the array form.
2025-08-12 11:23:05 -07:00
Sudharsan Veeravalli
3796efb5dc
[Hexagon] Add nounwind to hexagon-strcpy.ll (#151293)
The test does not check anything related to CFI information, so we
don't need it in the test checks. Also, it looks like there were
some failures on the Alpine Linux builders due to the placement of the
CFI information in the output assembly.

I have also changed `-march` to `-mtriple` in the run line similar to
2208c97
2025-07-31 11:13:55 +05:30
Ellis Hoag
819f020b28
Use F.hasOptSize() instead of checking optsize directly (#147348) 2025-07-28 08:38:52 -07:00
Ryotaro Kasuga
6df012ab48
[MachinePipeliner] Fix incorrect dependency direction (#149436)
This patch fixes a bug introduced in #145878. A dependency was added in
the wrong direction, causing an assertion failure due to broken
topological order.
2025-07-22 09:53:13 +09:00
Abinaya Saravanan
fcabb53f0c
[HEXAGON] Add AssertSext in sign-extended mpy (#149061)
The pattern i32xi32->i64 should be matched to the sign-extended
multiply op, instead of explicit sign-extension of the operands
followed by a non-widening multiply (this takes 4 operations instead of
one). Currently, if one of the operands of a multiply inside a loop is a
constant, the sign-extension of this constant is hoisted out of the loop
by the LICM pass and this pattern is not matched by ISel.

This change handles a multiply operand whose opcode is AssertSext,
which is seen when the sign-extension is hoisted out of the loop.
It modifies DetectUseSxtw() to check for this.
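
As a rough source-level illustration of the pattern described above (a sketch, not taken from the patch; the function name `scaleAndSum` and its parameters are hypothetical), a loop of this shape multiplies two 32-bit values into a 64-bit result while one operand stays loop-invariant:

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical example: each product is an i32 x i32 -> i64 widening multiply.
// `scale` does not change inside the loop, so its sign extension is
// loop-invariant and is a candidate for hoisting out of the loop.
int64_t scaleAndSum(const int32_t *data, std::size_t n, int32_t scale) {
  int64_t sum = 0;
  for (std::size_t i = 0; i < n; ++i)
    sum += static_cast<int64_t>(data[i]) * static_cast<int64_t>(scale);
  return sum;
}
```

Whether a given compiler version actually hoists the extension for this exact input is not verified here; the sketch only shows the kind of code the commit message talks about.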
2025-07-17 17:27:13 +05:30
Brian Cain
2b952753f8
[hexagon] Add support for llvm.thread.pointer (#148752)
UGP contains the pointer for thread data: 

> The TLS area is accessed at the processor level through the special
register UGP. This register is set to the address one location above the
TLS area, which grows downwards from UGP.

From the Hexagon ABI spec -
https://docs.qualcomm.com/bundle/publicresource/80-N2040-23_REV_K_Qualcomm_Hexagon_Application_Binary_Interface_User_Guide.pdf

Also: disable clang-format for `NodeType` enum in
`llvm/lib/Target/Hexagon/HexagonISelLowering.h` to avoid disruptive
formatting.
2025-07-15 09:59:04 -05:00
Matt Arsenault
ee5b9369cd
Hexagon: Add frexp intrinsic test (#148671) 2025-07-15 09:00:03 +09:00
Matt Arsenault
43206d1b2e
Hexagon: Add test for llvm.exp10 intrinsic (#148664)
This is mostly to test the libcall behavior
2025-07-15 08:56:35 +09:00
Matt Arsenault
7299250c03
DAG: Use fast variants of fast math libcalls (#147481)
Hexagon currently has an untested global flag to control fast
math variants of libcalls. Add fast variants as explicit libcall
options so this can be a flag based lowering decision, and implement
it. I have no idea what fast math flags the hexagon case requires,
so I picked the maximally potentially relevant set of flags although
this probably is refinable per call. Looking in compiler-rt, I'm not
sure if the fast variants are anything more than aliases.
2025-07-13 10:41:45 +09:00
aankit-ca
f9d3278901
[Hexagon] Add saturating add instructions (#148132)
Generate the saturating add instructions for sadd.sat for scalar and
vector instructions
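
For reference, the scalar semantics of a signed saturating add can be sketched as follows. This is a plain C++ model of what `sadd.sat` computes for 32-bit values, not the Hexagon lowering itself; the helper name `saddSat32` is made up for illustration.

```cpp
#include <cstdint>
#include <limits>

// Reference model of a signed 32-bit saturating add: the sum is computed in
// a wider type and clamped to the representable int32_t range.
int32_t saddSat32(int32_t a, int32_t b) {
  const int64_t wide = static_cast<int64_t>(a) + static_cast<int64_t>(b);
  if (wide > std::numeric_limits<int32_t>::max())
    return std::numeric_limits<int32_t>::max();
  if (wide < std::numeric_limits<int32_t>::min())
    return std::numeric_limits<int32_t>::min();
  return static_cast<int32_t>(wide);
}
```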

Co-authored-by: aankit-quic <aankit@quicinc.com>
Co-authored-by: Jyotsna Verma <jverma@quicinc.com>
2025-07-11 15:00:05 -07:00
pkarveti
d679dc7822
[Hexagon] Handle truncate of v4i8/v2i16 -> v4i1/v2i1 when Hvx is enabled (#147476) 2025-07-11 01:27:07 -07:00
pkarveti
1f39b92a16
[Hexagon] Handle bitcast of i32/v2i16/v4i8 -> v32i1 when Hvx is enabled (#147466) 2025-07-11 01:26:53 -07:00
Ryotaro Kasuga
c0b82df5f3
[MachinePipeliner] Add validation for missed loop-carried memory deps (#145878)
This patch adds an additional validation step to ensure that the
generated schedule does not violate loop-carried memory dependencies.
Prior to this patch, incorrect schedules could be produced due to the
lack of checks for the following types of dependencies:

- load-to-store backward (from bottom to top within the BB) dependencies
- store-to-load dependencies
- store-to-store dependencies

One possible solution to this issue is to add these dependencies
directly to the dependency graph, although doing so may lead to
performance degradation. In addition, no known cases of incorrect code
generation caused by these missing dependencies have been observed in
practice. Given these factors, this patch introduces a post-scheduling
validation phase to check for such previously missed dependencies,
instead of adding them to the graph before searching for a schedule.
Since no actual problems have been identified so far, it is likely that
most generated schedules are already valid. Therefore, this additional
validation is not expected to cause performance degradation in practice.
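
The idea of checking a finished schedule against loop-carried dependencies can be sketched with the textbook modulo-scheduling constraint. This is a simplified model under stated assumptions, not the code added by the patch: `LoopCarriedDep` and `scheduleRespectsDeps` are hypothetical names, latency is ignored, and ordering within a single cycle is not modeled.

```cpp
#include <vector>

// Hypothetical record: instruction `src` must not be overtaken by the
// instance of `dst` that runs `distance` iterations later.
struct LoopCarriedDep {
  int src;
  int dst;
  int distance;
};

// With initiation interval `ii`, the instance of `dst` from `distance`
// iterations later issues at cycle[dst] + distance * ii. The schedule is
// rejected if that is earlier than the cycle of `src`.
bool scheduleRespectsDeps(const std::vector<int> &cycle, int ii,
                          const std::vector<LoopCarriedDep> &deps) {
  for (const LoopCarriedDep &d : deps) {
    if (cycle[d.dst] + d.distance * ii < cycle[d.src])
      return false;
  }
  return true;
}
```

A scheduler built this way would discard a candidate schedule that fails the check and try again, typically with a larger initiation interval.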

Split off from #135148 .

The remaining tasks are as follows:

- Address other missing loop-carried dependencies (e.g., output
dependencies between physical registers, barrier instructions, and
instructions that may raise floating-point exceptions)
- Remove code that is currently retained to maintain the existing
behavior but is probably unnecessary.
- Eliminate `SwingSchedulerDAG::isLoopCarriedDep` and use
`SwingSchedulerDDG` to traverse edges after dependency analysis part.
2025-07-11 09:20:43 +09:00
Matt Arsenault
c8fbcb6590
Hexagon: Add sincos intrinsic test (#147474) 2025-07-10 16:08:53 +09:00
pkarveti
de732df551
[Hexagon] Handle Call Operand vxi1 in Hexagon without HVX Enabled (#136546)
This commit updates the Hexagon backend to handle vxi1 call operands
without HVX enabled. It ensures compatibility for vector types of sizes
4, 8, 16, 32, 64, and 128 x i1 when HVX is not enabled.
2025-07-08 09:43:15 -07:00
Simon Pilgrim
075c1b1afc [Hexagon] aggr-copy-order.ll - regenerate test checks 2025-07-07 18:31:12 +01:00
Simon Pilgrim
728cb7f6d6 [Hexagon] clr_set_toggle.ll - regenerate test checks 2025-07-07 18:31:12 +01:00
John Brawn
c34508023b
[NFC] Remove undef in swp-const-tc1.ll test (#147287)
Change undef branch conditions to the values that loop-simplify gives
them, and handle other undef values by using extra arguments. I'm making
this change because of an upcoming loop strength reduction change that
results in instsimplify removing more instructions due to them using
undef, causing the test checks to fail.
2025-07-07 17:15:33 +01:00
Simon Pilgrim
747496269a [Hexagon] build-vector-float-type.ll - regenerate test checks 2025-07-07 14:33:17 +01:00
Simon Pilgrim
6190d407e0 [Hexagon] vect-vshifts.ll - regenerate test checks 2025-07-07 14:33:17 +01:00
Guy David
76274eb2b3
[PHIElimination] Revert #131837 #146320 #146337 (#146850)
Reverting because mis-compiles:
- https://github.com/llvm/llvm-project/pull/131837
- https://github.com/llvm/llvm-project/pull/146320
- https://github.com/llvm/llvm-project/pull/146337
2025-07-03 07:48:08 -04:00
Sudharsan Veeravalli
15ab4bb5c8
[Hexagon] Implement shouldConvertConstantLoadToIntImm (#146452)
This will convert loads of constant strings to immediate values. Put
this behind a flag that is enabled by default so that we can toggle it
if need be.
2025-07-01 17:52:09 +05:30
Guy David
f5c62ee0fa
[PHIElimination] Reuse existing COPY in predecessor basic block (#131837)
The insertion point of COPY isn't always optimal and could eventually
lead to a worse block layout, see the regression test in the first
commit.

This change affects many architectures but the total number of
instructions in the test cases seems to be slightly lower.
2025-06-29 21:28:42 +03:00
Ryotaro Kasuga
ef60ee6005
[MachinePipeliner] Introduce a new class for loop-carried deps (#137663)
In MachinePipeliner, loop-carried memory dependencies are represented by
a DAG, which makes things complicated and causes some necessary
dependencies to be missing. This patch introduces a new class to manage
loop-carried memory dependencies to simplify the logic. The ultimate
goal is to add currently missing dependencies, but this is a first step
toward that, and this patch doesn't intend to change current behavior. This
patch also adds new tests that show the missed dependencies, which
should be fixed in the future.

Split off from #135148
2025-06-05 21:30:27 +09:00
YunQiang Su
5a7e72d575
Hexagon: sfmax/sfmin instructions are IEEE754-2019 (#139056)
The min/max instructions of Hexagon follow IEEE754-2019
minimumNumber/maximumNumber,
aka 
   FMINIMUMNUM and FMAXIMUMNUM
instead of
  FMAXNUM and FMINNUM
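
To make the distinction concrete, here is a small reference model of IEEE 754-2019 maximumNumber in C++. It is a sketch of the semantics only (the helper name `maximumNumber` is chosen for illustration and is not an LLVM API): a single NaN operand is ignored, and -0.0 is treated as smaller than +0.0. NaN handling and signed-zero ordering are the areas where the older maxNum-style operations give weaker guarantees.

```cpp
#include <cmath>

// Reference model of IEEE 754-2019 maximumNumber for float.
float maximumNumber(float a, float b) {
  if (std::isnan(a))
    return b; // a single NaN operand is ignored; NaN only if both are NaN
  if (std::isnan(b))
    return a;
  if (a == b) // also true for +0.0 vs -0.0, so pick the non-negative one
    return std::signbit(a) ? b : a;
  return a > b ? a : b;
}
```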
2025-05-14 11:55:11 +08:00
Douglas Yung
194a4a333a Fix test pfalse-v4i1.ll added in #138712 to require asserts.
Should fix build bot failure: https://lab.llvm.org/buildbot/#/builders/202/builds/1102
2025-05-07 06:14:01 +00:00
Ikhlas Ajbar
57e88993fe
[Hexagon] Add missing patterns to select PFALSE and PTRUE (#138712)
Fixes #134659
2025-05-06 16:47:25 -05:00
Ryotaro Kasuga
3cd6b86cc1
[MachinePipeliner] Use AliasAnalysis properly when analyzing loop-carried dependencies (#136691)
MachinePipeliner uses AliasAnalysis to collect loop-carried memory
dependencies. To analyze loop-carried dependencies, we need to
explicitly tell AliasAnalysis that the values may come from different
iterations. Before this patch, MachinePipeliner didn't do this, so some
loop-carried dependencies might be missed. For example, in the following
case, there is a loop-carried dependency from the load to the store, but
it wasn't considered.

```
def @f(ptr noalias %p0, ptr noalias %p1) {
entry:
  br label %body

body:
  %idx0 = phi ptr [ %p0, %entry ], [ %p1, %body ]
  %idx1 = phi ptr [ %p1, %entry ], [ %p0, %body ]
  %v0 = load %idx0
  ...
  store %v1, %idx1
  ...
}
```
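
A C++ analogue of the IR sketch above may make the cross-iteration aliasing easier to see. This is an illustrative model only (the name `pingPong` and the `+ 1` computation are assumptions standing in for the elided parts): within any single iteration the load and the store touch different objects, but the pointers change roles after the first iteration, so an access in one iteration can hit the object used by the other access in the next.

```cpp
// The first iteration reads *p0 and writes *p1; later iterations read *p1
// and write *p0, mirroring the two phi nodes in the IR sketch.
void pingPong(int *p0, int *p1, int n) {
  int *loadPtr = p0;
  int *storePtr = p1;
  for (int i = 0; i < n; ++i) {
    int v = *loadPtr;  // corresponds to the load through %idx0
    *storePtr = v + 1; // corresponds to the store through %idx1
    loadPtr = p1;      // value the phi takes on the back edge
    storePtr = p0;
  }
}
```

An alias query that only compares the two pointers as seen in the same iteration would report no conflict here, which is the situation the commit message describes.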

Further, the handling of the underlying objects was not sound. If there
is no information about memory operands (i.e., `memoperands()` is
empty), it must be handled conservatively. However, MachinePipeliner
uses a dummy value (namely `UnknownValue`). It is distinguished from
other "known" objects, causing necessary dependencies to be missed.
(NOTE: in such cases, `buildSchedGraph` adds non-loop-carried
dependencies correctly, so perhaps a critical problem has not occurred.)

This patch fixes the above problems. This change has increased false
dependencies that didn't exist before. Therefore, this patch also
introduces additional alias checks with the underlying objects.

Split off from #135148
2025-04-23 18:11:34 +09:00
Ryotaro Kasuga
b6820c35c5
[MachinePipeliner] Remove UB from tests (NFC) (#123169)
This patch removes UB from some tests for MachinePipeliner. This patch
fixes the following cases.

- Branching on an `undef` value.
- Using `undef`/`null` as a pointer operand of a load/store.

There are other pipeliner tests that contain the same UB, but for
now, this patch fixes the cases that were particularly unstable while
I was developing the pipeliner.
2025-04-21 16:12:25 +09:00
Akshat Oke
31ddaef8d1
[CodeGen][NPM] Port UnreachableMachineBlockElim to NPM (#136127) 2025-04-18 15:06:30 +05:30
Yingwei Zheng
78b37ca2a3
[Hexagon] Pre-commit tests for PR130742. NFC. (#135604)
Needed by https://github.com/llvm/llvm-project/pull/130742.
2025-04-17 14:28:53 +08:00
aankit-ca
da8ce56c53
[HEXAGON] Fix corner cases for hwloops pass (#135439)
Add check to make sure Dist > 0 or Dist < 0 for appropriate cmp cases to
hexagon hardware loops pass. The change modifies the
HexagonHardwareLoops pass to add runtime checks to make sure that
end_value > initial_value for less than comparisons and end_value <
initial_value for greater than comparisons.
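
The kind of guard described above can be sketched in C++ for the less-than case. This is a host-side model with hypothetical names (`tripCountLessThan`), not the code the pass emits: it computes the closed-form trip count of `for (i = init; i < end; i += step)` and refuses to do so when the distance is not positive.

```cpp
#include <cstdint>
#include <optional>

// Returns the number of iterations of `for (i = init; i < end; i += step)`
// for step > 0, or std::nullopt when end_value > initial_value does not
// hold and the closed-form count would be meaningless.
std::optional<uint64_t> tripCountLessThan(int32_t init, int32_t end,
                                          int32_t step) {
  if (step <= 0)
    return std::nullopt;
  const int64_t dist = static_cast<int64_t>(end) - static_cast<int64_t>(init);
  if (dist <= 0) // the Dist > 0 check for a less-than comparison
    return std::nullopt;
  return static_cast<uint64_t>((dist + step - 1) / step); // ceil(dist / step)
}
```

The greater-than case is symmetric, with the distance required to be negative.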

Fix for https://github.com/llvm/llvm-project/issues/133241

@androm3da @iajbar PTAL

---------

Co-authored-by: aankit-quic <aankit@quicinc.com>
2025-04-14 13:03:10 -05:00
Ikhlas Ajbar
32c39092ea
[llvm][Hexagon] Promote operand v2i1 to v2i32 (#135409)
Fixes #118879
2025-04-11 14:25:50 -05:00
Afanasyev Ivan
337bad3921
[EarlyIfConverter] Fix reg killed twice after early-if-predicator and ifcvt (#133554)
The bug relates to the `early-if-predicator` and `early-ifcvt` passes. If a
virtual register has the "killed" flag in both basic blocks to be merged
into the head, both instructions in the head basic block will have the
"killed" flag for this register. This makes the MIR incorrect.

Example:

```
  bb.0: ; if
    ...
    %0:intregs = COPY $r0
    J2_jumpf %2, %bb.2, implicit-def dead $pc
    J2_jump %bb.1, implicit-def dead $pc

  bb.1: ; if.then
    ...
    S4_storeiri_io killed %0, 0, 1
    J2_jump %bb.3, implicit-def dead $pc

  bb.2: ; if.else
    ...
    S4_storeiri_io killed %0, 0, 1
    J2_jump %bb.3, implicit-def dead $pc
```

After early-if-predicator, this becomes:

```
  bb.0:
    %0:intregs = COPY $r0
    S4_storeirif_io %1, killed %0, 0, 1
    S4_storeirit_io %1, killed %0, 0, 1
```

Having the `killed` flag set twice in bb.0 for `%0` is incorrect MIR.
2025-04-01 12:06:30 +02:00
Alexey Karyakin
c0b2c10e9f
[hexagon] Bump the default version to v68 (#132304)
Set the default processor version to v68 when the user does not specify
one on the command line. This includes changes in the LLVM backend and
linker (lld). Since lld normally sets the version based on inputs, this
change will only affect cases when there are no inputs.

Fixes #127558
2025-03-21 20:08:45 -05:00
Hua Tian
b09b9ac108
[llvm][CodeGen] Fix the empty interval issue in Window Scheduler (#129204)
The live interval of a newly generated reg in ModuloScheduleExpander is
empty. This causes a crash in some corner cases. This patch recalculates
the live intervals of these regs.
2025-03-17 14:28:47 +08:00
aankit-ca
d642eec78f
[HEXAGON] Fix semantics of ordered FP compares (#131089)
For the ordered FP compare bitcode instructions, the Hexagon backend was
assuming that no operand could be a NaN. This assumption is flawed. This
patch fixes the code generation to produce fpcmp.uo and the appropriate
bit comparison operators to account for the case when an operand to an FP
compare is a NaN.
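
As a semantic reference, an ordered compare must be false whenever either operand is a NaN. The C++ sketch below models ordered not-equal (`fcmp one` in LLVM IR) by combining an explicit unordered test with the plain comparison; the helper name `orderedNotEqual` is made up, and this is a model of the required result, not the instruction sequence the backend emits.

```cpp
#include <cmath>

// Ordered not-equal: true only if neither operand is NaN and a != b.
// Plain `a != b` alone is true when an operand is NaN, which is the
// unordered-or-not-equal result and therefore not sufficient.
bool orderedNotEqual(float a, float b) {
  const bool unordered = std::isnan(a) || std::isnan(b);
  return !unordered && a != b;
}
```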

Fix for https://github.com/llvm/llvm-project/issues/129391

Co-authored-by: aankit-quic <aankit@quicinc.com>
2025-03-13 14:48:31 -05:00
Akshat Oke
5952972c91
[CodeGen][NPM] Port BranchFolder to NPM (#128858)
EnableTailMerge is false by default and is handled by the pass builder.
Passes are independent of target pipeline options.

This completes the generic `MachineLateOptimization` passes for the NPM
pipeline.
2025-03-13 13:41:28 +05:30
Abinaya Saravanan
9c65e6ac11
[HEXAGON] Add support to lower "FREEZE a half(f16)" instruction on Hexagon and fix the isel-buildvector-v2f16.ll assertion (#130977) 2025-03-12 16:58:26 -05:00
aankit-ca
29d3fc3f11
[HEXAGON] Fix hvx-isel for extract_subvector op (#129672)
Fixes a crash with extract_subvectors in the Hexagon backend seen when the
source vector is a vector-pair and the result vector is not of HVX vector size.

LLVM Issue: https://github.com/llvm/llvm-project/issues/128775
Fixes #128775
---------

Co-authored-by: aankit-quic <aankit@quicinc.com>
2025-03-06 17:02:10 -06:00
Daniel Paoliello
16e051f0b9
[win] NFC: Rename EHCatchret to EHCont to allow for EH Continuation targets that aren't catchret instructions (#129953)
This change splits out the renaming and comment updates from #129612 as a non-functional change.
2025-03-06 09:28:44 -08:00
Akshat Oke
77f44a9642
[CodeGen][NewPM] Port MachineSink to NPM (#115434)
Targets can set the EnableSinkAndFold option in CGPassBuilderOptions for
the NPM pipeline in buildCodeGenPipeline(... &Opts, ...)
2025-03-03 15:49:37 +05:30
pkarveti
37559c8401
[Hexagon] Handle Call Operand vxi1 in Hexagon Backend (#128027)
This commit updates the Hexagon backend to handle
vxi1 call operands. It ensures compatibility for
vector types of sizes 4, 8, 16, 32, 64, and 128 x i1 when HVX is
enabled.

~Fixes #59009 and #118879~
2025-02-25 10:26:28 -06:00
Ikhlas Ajbar
4f7d8948d9
[Hexagon] Add a case to BitTracker for new register class (#128580)
Code in the HexagonBitTracker checks for a specific register class when
processing sub-registers. A crash occurred due to a register class that
was not handled. The register class is
DoubleRegs_with_isub_hi_in_IntRegsLow8RegClassID, which is a class
formed by creating a register pair when one of the sub registers is a
Low8 integer register.
Fixes #128078
Patch by: Brendon Cahoon
2025-02-25 09:07:29 -06:00
Brian Cain
788cb725d8
[Hexagon] Explicitly truncate constant in UAddSubO (#127360)
After #117558 landed, this code would assert "Value is not an N-bit
unsigned value" in getConstant(), from a test case in zig.

Co-authored-by:  Craig Topper <craig.topper@sifive.com>
Fixes #127296
2025-02-17 09:30:48 -06:00
Yashas Andaluri
a361de6d13
[RDF] Create phi nodes for clobbering defs (#123694)
When a def in a block A reaches another block B that is in A's iterated
dominance frontier, a phi node is added to B for the def register.

A clobbering def can be created at a call instruction, for a register
clobbered by a call.
However, phi nodes are not created for a register, when one of the
reaching defs of the register is a clobbering def.

This patch adds phi nodes for registers that have a clobbering reaching
def. These additional phis help in checking reaching defs for an
instruction in RDF based copy propagation and addressing mode
optimizations.
2025-02-07 08:28:29 -06:00
Yuta Mukai
e3abe940d8
[MachinePipeliner] Improve loop carried dependence analysis (#94185)
The previous implementation had false positive/negative cases in the
analysis of the loop carried dependency.

A missed dependency case is caused by incorrect analysis of address
increments. This is fixed by strict analysis of recursive definitions.
See added test swp-carried-dep4.mir.

Excessive dependency detection is fixed by improving the formula
for determining the overlap of address ranges to be accessed. See added test
swp-carried-dep5.mir.
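
For orientation, the basic building block of such an overlap test is a half-open interval check on the accessed byte ranges. The sketch below is a generic version with hypothetical names (`rangesOverlap`); the formula in the patch also has to account for how addresses advance between iterations, which is not modeled here.

```cpp
#include <cstdint>

// Two accesses covering [offA, offA + sizeA) and [offB, offB + sizeB)
// overlap exactly when each range starts before the other one ends.
bool rangesOverlap(int64_t offA, uint64_t sizeA, int64_t offB, uint64_t sizeB) {
  const int64_t endA = offA + static_cast<int64_t>(sizeA);
  const int64_t endB = offB + static_cast<int64_t>(sizeB);
  return offA < endB && offB < endA;
}
```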
2025-02-05 21:08:20 +09:00
Christudasan Devadasan
44f638f88e
[CodeGen][NewPM] Port PostRAScheduler to NPM. (#125798) 2025-02-05 12:45:59 +05:30
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
Hua Tian
a9d2834508
[llvm][CodeGen] Fix the issue caused by live interval checking in window scheduler (#123184)
In some corner cases, the cloned MI still retains an old slot index,
which leads to a compiler crash. This patch updates the slot index
map before deleting the recycled MI.

https://github.com/llvm/llvm-project/issues/123165
2025-01-23 09:39:03 +08:00