775 Commits

Author SHA1 Message Date
antangelo
73f087b331
[NFC][SelectionDAG] Replace generic @llvm.expect.with.probability codegen test with X86 test (#117848)
Adds test case for X86 to check that the output of
@llvm.expect.with.probability's generic lowering is reasonable. This
replaces a generic test which only asserts that llc does not crash.
2024-12-01 17:46:08 -05:00
Sergei Barannikov
61a23646c9
[SjLjEHPrepare] Configure call sites correctly (#117656)
After 9fe78db4, the pass inserts `store volatile i32 -1, ptr %call_site`
before all invoke instruction except the one in the entry block, which
has the effect of bypassing landing pads on exceptions.

When configuring the call site for a potentially throwing instruction
check that it is not `InvokeInst` -- they are handled by earlier code.
2024-11-27 08:03:47 +03:00
antangelo
dd4844722d
[SelectionDAG] Add generic implementation for @llvm.expect.with.probability when optimizations are disabled (#117459)
Handle \@llvm.expect.with.probability in SelectionDAGBuilder, FastISel,
and IntrinsicLowering in the same way \@llvm.expect is handled, where
the value is passed through as-is. This can be reached if the intrinsic
is used without optimizations, where it would otherwise be properly
transformed out.

Fixes #115411 for SelectionDAG. A similar patch is likely needed for
GlobalISel.
2024-11-26 20:22:25 -05:00
Kyungwoo Lee
fe69a20cc1 Reland [CGData][GMF] Skip No Params (#116548)
This update follows up on change #112671 and is mostly a NFC, with the following exceptions:
  - Introduced `-global-merging-skip-no-params` to bypass merging when no parameters are required.
  - Parameter count is now calculated based on the unique hash count.
  - Added `-global-merging-inst-overhead` to adjust the instruction overhead, reflecting the machine instruction size.
  - Costs and benefits are now computed using the double data type. Since the finalization process occurs offline, this should not significantly impact build time.
  - Moved a sorting operation outside of the loop.

This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-25 13:55:02 -08:00
Kyungwoo Lee
fe3c23b439 Revert "[CGData][GMF] Skip No Params (#116548)"
This reverts commit fdf1f69c57ac3667d27c35e097040284edb1f574.
2024-11-25 11:09:29 -08:00
Kyungwoo Lee
fdf1f69c57
[CGData][GMF] Skip No Params (#116548)
This update follows up on change #112671 and is mostly a NFC, with the following exceptions:
  - Introduced `-global-merging-skip-no-params` to bypass merging when no parameters are required.
  - Parameter count is now calculated based on the unique hash count.
  - Added `-global-merging-inst-overhead` to adjust the instruction overhead, reflecting the machine instruction size.
  - Costs and benefits are now computed using the double data type. Since the finalization process occurs offline, this should not significantly impact build time.
  - Moved a sorting operation outside of the loop.

This is a patch for
https://discourse.llvm.org/t/rfc-global-function-merging/82608.
2024-11-25 10:57:41 -08:00
Rahman Lavaee
68f7b075c0
[BasicBlockSections] Allow mixing of -basic-block-sections with MFS. (#117076)
This PR allows mixing `-basic-block-sections` with
`-enable-machine-function-splitter`. The strategy is to let
`-basic-block-sections` take precedence over functions with profiles.
2024-11-22 22:23:29 -08:00
Kyungwoo Lee
816c975ea7
Fix crash from [CGData] Global Merge Functions (#112671) (#116241)
Module summary index is optional for this pass, and we shouldn't run it,
but import it as necessary.
2024-11-15 14:57:17 -08:00
Koakuma
23d209f350
[SPARC] Allow overaligned allocas (#107223)
SPARC ABI doesn't use stack realignment, so let LLVM know about it in
`SparcFrameLowering`. This has the side effect of making all overaligned
allocations go through `LowerDYNAMIC_STACKALLOC`, so implement the
missing logic there too for overaligned allocations.
This makes the SPARC backend not crash on overaligned `alloca`s and fix
https://github.com/llvm/llvm-project/issues/89569.
2024-11-03 22:53:03 +07:00
Afanasyev Ivan
4e1b9d34f9
[mir-strip-debug] Fix debug location info strip for bundled instructions (#113676)
Fix bug that `mir-strip-debug` pass does not remove debug location from
bundled instructions.

Problem arises during testing that debug info does not affect
optimization passes output (`llvm-lit` with ` -Dllc="llc
-debugify-and-strip-all-safe"`), when pass operates on MIR with bundled
instructions + memory operands.

Let mir test check looks like:

```
CHECK-NEXT: BUNDLE {
CHECK-NEXT:   $r3 = LD $r1, $r2 :: (load (s64) from %ir.a, !tbaa !2)
CHECK-NEXT: }
```

So as `mir-strip-debug` pass does not process bundled instructions,
running `llc -debugify-and-strip-all-safe` on the test will produce the
following output:

```
BUNDLE {
  $r3 = LD $r1, $r2, debug-location !DILocation(line: 3, column: 1, scope: <0x608cb2b99b10>) :: (load (s64) from %ir.a, !tbaa !2)
}
```

And test will fail, but it shouldn't.

Seems like the root cause is that `mir-strip-debug` pass should remove
debug location from bundled instructions.
2024-10-29 10:26:15 -07:00
Alex Bradbury
2fe1f84db3 [test] Fix llc-start-stop.ll when the default target enables the loop terminator folding pass
Previously this would fail if the default target enabled the loop
terminator folding pass (currently just RISC-V), as it runs after loop
strength reduction.
2024-10-07 16:06:44 +01:00
Sean Perry
27b5dc422c
Add target-byteorder for cases where endian in target triple is what matters (#107915)
I came across the subtly when setting up lit for z/OS and running it on
a Linux on Power machine. Linux on Power is little endian. This was
resulting in all of these tests being run even though the target triple
was z/OS which is big endian. The lit should really be checking if the
target is little endian not the host. The previous way didn't handle
cross compilation while running lit.
2024-09-23 13:00:44 -04:00
Alexis Engelke
fa92d51f9e
[VP] Merge ExpandVP pass into PreISelIntrinsicLowering (#101652)
Similar to #97727; avoid an extra pass over the entire IR by performing
the lowering as part of the pre-isel-intrinsic-lowering pass.
2024-08-06 09:27:59 +02:00
Jonas Paulsson
f231d3dab3
Fix some X86 tests (#101944)
extractelement-shuffle.ll: Test for bugfix in DAGCombiner, moved to
Generic.

2010-07-06-DbgCrash.ll and 2006-10-02-BoolRetCrash.ll: Bugfixes in X86,
run tests with X86 backend.
2024-08-05 11:50:05 +02:00
Matt Arsenault
af1d2b9fb1
CodeGen: Remove -disable-debug-info-print cl::opt (#100319)
This was first introduced way back in in 2010 by
6c74a872a8d34d41b751efb68e335cbe91b5a5cc, and has little evidence
of use. Only one test attempts to make use of this, but it's
also redundant since it's also using strip to drop debug info anyway
(and that also makes the test buggy, since it's intended to test
with and without debug info).

The other tests using it were only added to test the option after
discovering it was untested and moved, in later commits.
2024-07-25 16:39:39 +04:00
paperchalice
ab58b6d58e
Revert "[CodeGen][NewPM] Port machine-branch-prob to new pass manager" (#96858)
Reverts llvm/llvm-project#96389
Some ppc bots failed.
2024-06-27 15:00:17 +08:00
paperchalice
73e46c2bb4
[CodeGen][NewPM] Port machine-branch-prob to new pass manager (#96389)
Like IR version `print<branch-prob>`, there is also a
`print<machine-branch-prob>`.
2024-06-27 14:04:51 +08:00
Stephen Tozer
094572701d
[RemoveDIs] Print IR with debug records by default (#91724)
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records. 

If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for a guide on how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.

For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
2024-06-14 15:07:27 +01:00
Nikita Popov
deab451e7a
[IR] Remove support for icmp and fcmp constant expressions (#93038)
Remove support for the icmp and fcmp constant expressions.

This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179

As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
2024-06-04 08:31:03 +02:00
Min-Yih Hsu
f8063ffe73
[VP][RISCV] Add vp.reduce.fmaximum/fminimum and its RISC-V codegen (#91782)
`vp.reduce.fmaximum/fminimum` are the VP version of
`vector.reduce.fmaximum/fminimum`.
2024-05-10 16:01:47 -07:00
Kevin P. Neal
79726ef5d2
[VP] Correct lowering of predicated fma and faddmul to avoid strictfp. (#85272)
Correct missing cases in a switch that result in @llvm.vp.fma.v4f32
getting lowered to a constrained fma intrinsic. Vector predicated
lowering to contrained intrinsics is not supported currently, and
there's no consensus on the path forward. We certainly shouldn't be
introducing constrained intrinsics into a function that isn't strictfp.

Problem found with D146845.
2024-04-17 08:34:25 -04:00
Stephen Tozer
1c6b0f779f
[RemoveDI] Add support for debug records to debugify (#87383)
This patch changes debugify to support debug variable records, and
subsequently to no longer convert modules automatically to intrinsics
when entering debugify.
2024-04-16 17:07:46 +01:00
Arthur Eubanks
5d6d8dcd29
[clang][llvm] Remove "implicit-section-name" attribute (#87906)
D33412/D33413 introduced this to support a clang pragma to set section
names for a symbol depending on if it would be placed in
bss/data/rodata/text, which may not be known until the backend. However,
for text we know that only functions will go there, so just directly set
the section in clang instead of going through a completely separate
attribute.

Autoupgrade the "implicit-section-name" attribute to directly setting
the section on a Fuction.
2024-04-11 12:29:29 -07:00
Weining Lu
0f5f931a9b [CodeGen] Fix test after #86049 2024-04-03 22:28:02 +08:00
Vitaly Buka
cbb27bef3e [CodeGen] Fix test after #86049 2024-04-01 00:44:27 -07:00
Vitaly Buka
d76a1233f7 [CodeGen] Fix test after #86049 2024-03-31 23:48:23 -07:00
Vitaly Buka
b890c17892 [CodeGen] Fix test after #86049 2024-03-31 23:22:07 -07:00
Vitaly Buka
289d2cc3f3 [CodeGen] Fix test after #86049 2024-03-31 23:10:21 -07:00
Vitaly Buka
20f56e1f8e
[CodeGen] Add default lowering for llvm.allow.{runtime,ubsan}.check() (#86049)
RFC:
https://discourse.llvm.org/t/rfc-add-llvm-experimental-hot-intrinsic-or-llvm-hot/77641
2024-03-31 22:19:33 -07:00
Alex MacLean
89b7b3b995
[NVPTX] support dynamic allocas with PTX alloca instruction (#84585)
Add support for dynamically sized alloca instructions with the PTX
alloca instruction introduced in PTX 7.3 
([9.7.15.3. Stack Manipulation Instructions: alloca]
(https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#stack-manipulation-instructions-alloca))
2024-03-15 11:51:46 -07:00
paperchalice
edc2066465
[CodeGen][GC] Skip function without GC in GCLoweringPass (#84421) 2024-03-14 13:07:41 +08:00
Paul Walker
900bea9b1c [LLVM][test] Convert remaining instances of ConstantExpr based splats to use splat().
This is mostly NFC but some output does change due to consistently
inserting into poison rather than undef and using i64 as the index
type for inserts.
2024-02-27 13:37:23 +00:00
Rohit Aggarwal
36adfec155
Adding support of AMDLIBM vector library (#78560)
Hi,

AMD has it's own implementation of vector calls. This patch include the
changes to enable the use of AMD's math library using -fveclib=AMDLIBM.
Please refer https://github.com/amd/aocl-libm-ose 

---------

Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
2024-02-15 12:13:07 +05:30
Nikita Popov
ff9af4c43a [CodeGen] Convert tests to opaque pointers (NFC) 2024-02-05 14:07:09 +01:00
Aiden Grossman
b1778c7d7b
[AsmPrinter] Remove mbb-profile-dump flag (#76595)
Now that the work embedding PGO information in SHT_LLVM_BB_ADDR_MAP ELF
sections has landed, there is no longer a need to keep around the
mbb-profile-dump flag.
2024-01-23 16:48:10 -08:00
Min-Yih Hsu
03be448cce
[RISCV][AMDGPU] Mark test/CodeGen/Generic/live-debug-label.ll XFAIL for RISCV and AMDGPU (#77631)
Both RISC-V and AMDGPU(GCN) deploy two VirtRegRewriter in their codegen
pipeline. This test prematurely stops at the first one, which doesn't
cleanup the virtual register map and cause an assertion failure. Ideally
we can solve this by teaching `-stop-after` how to stop at the last
instance of a Pass, but we're just marking XFAIL for these two targets
for now.
2024-01-10 16:47:34 -08:00
Nick Anderson
f1ec0d12bb
Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#77182)
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #75380

Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
2024-01-09 13:32:59 +07:00
Daniel Hoekwater
def42537ee [NFC][CodeGen][AArch64] Add tests for unconditional branch duplication
c9f3288 introduced unconditional branch deduplication for basic block
sections and machine function splitting, but it didn't add tests for
AArch64 since prior behavior crashed the test.

This change adds tests for AArch64 and has no functional change.
2024-01-05 23:39:01 +00:00
Orlando Cazalet-Hyams
10b03e6662
[RemoveDIs] Handle DPValues in FastISel (#76952)
The change is fairly mechanical:
1. Factor code from `FastISel::selectIntrinsicCall`, which converts
debug intrinsics into debug instructions, into functions (NFC).
2. Call those functions for DPValues attached to instructions too.

The test updates look the same as other RemoveDIs changes: re-run the
tests with `--try-experimental-debuginfo-iterators`, which checks the
output is identical using the new debug info format (if it has been
enabled in the cmake configuration).

Depends on #76941 (otherwise some modified tests spuriously fail).
2024-01-05 15:11:47 +00:00
Simon Pilgrim
7648371c25 Revert 4d7c5ad58467502fcbc433591edff40d8a4d697d "[NewPM] Update CodeGenPreparePass reference in CodeGenPassBuilder (#77054)"
Revert e0c554ad87d18dcbfcb9b6485d0da800ae1338d1 "Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)"

Revert #75380 and #77054 as they were breaking EXPENSIVE_CHECKS buildbots: https://lab.llvm.org/buildbot/#/builders/104
2024-01-05 12:28:10 +00:00
Nick Anderson
e0c554ad87
Port CodeGenPrepare to new pass manager (and BasicBlockSectionsProfil… (#75380)
Port CodeGenPrepare to new pass manager and dependency
BasicBlockSectionsProfileReader
Fixes: #64560

Co-authored-by: Krishna-13-cyber <84722531+Krishna-13-cyber@users.noreply.github.com>
2024-01-05 13:47:56 +07:00
paperchalice
9bd32d78a9
[CodeGen] Update DwarfEHPreparePass references in CodeGenPassBuilder.h (#74068)
Forgot to update the counterpart in `CodeGenPassBuilder.h`. Also Rename `dwarfehprepare` -> `dwarf-eh-prepare`.
2023-12-11 09:26:01 +08:00
Min-Yih Hsu
0e24179797
[SelectionDAG] Add support to filter SelectionDAG dumps during ISel by function names (#72696)
`-debug-only=isel-dump` is the new debug type for printing SelectionDAG
after each ISel phase. This can be furthered filter by
`-filter-print-funcs=<function names>`.
Note that the existing `-debug-only=isel` will take precedence over the
new behavior and print SelectionDAG dumps of every single function
regardless of `-filter-print-funcs`'s values.
2023-11-20 14:00:47 -08:00
David Sherwood
bdc0afc871
[CodeGen][AArch64] Set min jump table entries to 13 for AArch64 targets (#71166)
There are some workloads that are negatively impacted by using jump
tables when the number of entries is small. The SPEC2017 perlbench
benchmark is one example of this, where increasing the threshold to
around 13 gives a ~1.5% improvement on neoverse-v1. I chose the minimum
threshold based on empirical evidence rather than science, and just
manually increased the threshold until I got the best performance
without impacting other workloads. For neoverse-v1 I saw around ~0.2%
improvement in the SPEC2017 integer geomean, and no overall change for
neoverse-n1. If we find issues with this threshold later on we can
always revisit this.

The most significant SPEC2017 score changes on neoverse-v1 were:

500.perlbench_r: +1.6%
520.omnetpp_r: +0.6%

and the rest saw changes < 0.5%.

I updated CodeGen/AArch64/min-jump-table.ll to reflect the new
threshold. For most of the affected tests I manually set the min number
of entries back to 4 on the RUN line because the tests seem to rely upon
this behaviour.
2023-11-14 13:00:28 +00:00
Nikita Popov
17764d2c87
[IR] Remove FP cast constant expressions (#71408)
Remove support for the fptrunc, fpext, fptoui, fptosi, uitofp and sitofp
constant expressions. All places creating them have been removed
beforehand, so this just removes the APIs and uses of these constant
expressions in tests.

With this, the only remaining FP operation that still has constant
expression support is fcmp.

This is part of
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179.
2023-11-07 09:34:16 +01:00
Luke Lau
2e85123bfe
[VP] Check if VP ops with functional intrinsics are speculatable (#69504)
Noticed whilst working on #69494. VP intrinsics whose functional
equivalent is
an intrinsic were being marked as their lanes being non-speculatable,
even if
the underlying intrinsic was speculatable.

This meant that

```llvm
  %1 = call <4 x i32> @llvm.vp.umax(<4 x i32> %x, <4 x i32> %y, <4 x i1> %mask, i32 %evl)
```

would be expanded out to

```llvm
  %.splatinsert = insertelement <4 x i32> poison, i32 %evl, i64 0
  %.splat = shufflevector <4 x i32> %.splatinsert, <4 x i32> poison, <4 x i32> zeroinitializer
  %1 = icmp ult <4 x i32> <i32 0, i32 1, i32 2, i32 3>, %.splat
  %2 = and <4 x i1> %1, %mask
  %3 = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %x, <4 x i32> %y)
```

instead of

```llvm
  %1 = call <4 x i32> @llvm.umax.v4i32(<4 x i32> %x, <4 x i32> %y)
```

The cause of this was isSafeToSpeculativelyExecuteWithOpcode checking
the
function attributes for the VP instruction itself, not the functional
intrinsic. Since isSafeToSpeculativelyExecuteWithOpcode expects an
already
materialized instruction, we can't use it directly for the intrinsic
case. So
this fixes it by manually checking the function attributes on the
intrinsic.
2023-10-26 13:46:32 +01:00
Mircea Trofin
f179486204
[AsmPrint] Correctly factor function entry count when dumping MBB frequencies (#67826)
The goal in #66818 was to capture function entry counts, but those are not the same as the frequency of the entry (machine) basic block. This fixes that, and adds explicit profiles to the test.

We also increase the precision of `MachineBlockFrequencyInfo::getBlockFreqRelativeToEntryBlock` to double. Existing code uses it as float so should be unaffected.
2023-09-29 18:06:53 -07:00
Aiden Grossman
3dc2f2618b
[MLGO] Move MBB Profile Dump test to Generic (#66856)
This patch moves the MBB Profile Dump to ./llvm/test/CodeGen/Generic
from ./llvm/test/CodeGen/MlRegAlloc as the profile dump doesn't have
anything to do with the ML guided register allocation heuristic.
2023-09-20 11:50:33 -07:00
David Spickett
69f1cd58aa [llvm][AArch64] Disable BigByval with expensive checks
AArch64 incorrectly nests ADJCALLSTACKDOWN/ADJCALLSTACKUP which fails
to verify with expensive checks enabled.

See https://github.com/llvm/llvm-project/issues/62137 and
https://github.com/llvm/llvm-project/issues/62138.
2023-08-31 10:15:45 +00:00
Daniel Hoekwater
0982d96186 [CodeGen][AArch64] Don't split inline asm goto blocks or their targets
Machine function splitting + branch relaxation currently don't properly
handle inline asm goto blocks that conditional branch to cold goto
labels. While such inline asm is technically invalid, machine
function splitting is the only thing that exposes it as such.

Since machine function splitting doesn't help too much in these
circumstances anyway, disable it for asm goto blocks and their targets.

Differential Revision: https://reviews.llvm.org/D158647
2023-08-29 20:24:38 +00:00