3882 Commits

Author SHA1 Message Date
esmeyi
c119da23af [PowerPC] Function descriptor symbol may be omitted for external symbol. #97526
If a function's address is taken, which means it may be called via a function pointer,
we need the function descriptor for it.
Otherwise, the function descriptor can be omitted for external symbols.
2024-07-08 03:47:33 -04:00
Matt Arsenault
db9252b115
DAG: Call SimplifyDemandedBits on fcopysign sign value (#97151)
Math library code has quite a few places with complex bit
logic that are ultimately fed into a copysign. This helps
avoid some regressions in a future patch.

This assumes the position in the float type, which should
at least be valid for IEEE types. Not sure if we need to guard
against ppc_fp128 or anything else weird.

There appears to be some value in simplifying the value operand
as well, but I'll address that separately.
2024-07-01 12:19:17 +02:00
Chen Zheng
e1c03ddc9b [PowerPC] use r1 as the frame pointer when there is dynamic alloca
On PPC, when there is dynamic alloca, only r1 points to the backchain.
2024-06-20 22:26:52 -04:00
Chen Zheng
abaaa48ce6 [PowerPC] fix frameaddress error when there is dynamic alloca call, NFC 2024-06-20 22:26:48 -04:00
Zaara Syeda
898b8a42b5
[PPC] Add DwarfRegAlias for VSRPair (#95837)
Add DwarfRegAlias for VSRPair as it shares dwarfRegNum with the VR
registers.
2024-06-20 11:30:58 -04:00
Kai Luo
480a788e49
[PowerPC] Make verifier happy after peephole on MMA COPYs (#94321) 2024-06-20 12:06:47 +08:00
Simon Pilgrim
2a57a08829 [PowerPC] Regenerate p8altivec-shuffles-pred.ll with update_llc_test_checks script 2024-06-19 16:55:33 +01:00
Kai Luo
117921e071
[PowerPC] Alignment of toc-data symbol should not be increased during optimizations (#94593)
Currently, the alignment of toc-data symbol might be changed during
instcombine
```
IC: Visiting:   %global = alloca %struct.widget, align 8                                                                                         
Found alloca equal to global:   %global = alloca %struct.widget, align 8                                                                         
  memcpy =   call void @llvm.memcpy.p0.p0.i64(ptr nonnull align 1 %global, ptr align 1 @global, i64 3, i1 false)
```
The `alloca` is created with `PrefAlign` which is 8 and after IC, the
alignment of `@global` is enforced into `8`, same as the `alloca`. This
is not expected, since toc-data symbol has the same alignment as toc
entry and should not be increased during optimizations.

---------

Co-authored-by: Sean Fertile <sd.fertile@gmail.com>
Co-authored-by: Eli Friedman <efriedma@quicinc.com>
2024-06-18 09:58:37 +08:00
Stefan Pintilie
1af1c9fb98 [NFC][PowerPC] Update the option to -enable-subreg-liveness. 2024-06-14 13:57:21 -05:00
Stefan Pintilie
e84ecf26fa
[NFC][PowerPC] Add test to check lanemasks for subregisters. (#94363)
This change adds a test case to check the lane masks for a varitey of
subregisters.
2024-06-14 13:49:37 -04:00
David Green
706e197540
[CodeGen] Remove target SubRegLiveness flags (#95437)
This removes the uses of target flags to disable subreg liveness,
relying on the `-enable-subreg-liveness` flag instead. The
`-enable-subreg-liveness` flag has been changed to take precedence over
the subtarget if set, and one use of `Subtarget->enableSubRegLiveness()`
has been changed to `MRI->subRegLivenessEnabled()` to make sure the
option properly applies.
2024-06-14 08:51:56 +01:00
Amy Kwan
19b43e1757 [PowerPC][NFC] Pre-commit test case to prepare for patch to merge internal and private global data 2024-06-13 10:53:19 -05:00
Farzon Lotfi
189d471191
[clang] Reland Add tanf16 builtin and support for tan constrained intrinsic (#94559)
Relanding this PR now that
https://github.com/llvm/llvm-project/pull/90503 has merged. with `FTAN`
landing in
[TargetLoweringBase.cpp:L1021](https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetLoweringBase.cpp#L1020C23-L1021C63
) There is now a llvm tan intrinsic 32\64\128 Expand case for all llvm
backends.

In LLVM, the `llvm.experimental.constrained.cos` and
`llvm.experimental.constrained.sin` intrinsics are used for performing
cosine and sine calculations with additional constraints on
floating-point operations. This behavior is expected for all
floating-point math intrinsics. This change adds these constraints for
the `tan` intrinsic.

-  `Builtins.td` - replace TanF128 with F16F128MathTemplate
- `CGBuiltin.cpp` - map existing tan builtins to `tan` and
`constrained_tan` intrinsic
-   `ConstrainedOps.def` map tan and constrained_tan  to an ISDOpcode.

resolves  #91421

---------

Co-authored-by: Farzon Lotfi <farzon@farzon.com>
2024-06-10 20:46:26 -04:00
Chen Zheng
3453dedfaf [PowerPC] return correct frame address for frameaddress intrinsic 2024-06-07 05:17:22 -04:00
Chen Zheng
0749b01c81 [PowerPC] modify the frameaddress case, NFC 2024-06-07 05:15:41 -04:00
paperchalice
1bc8b3258e
[NewPM][CodeGen] Port regallocfast to new pass manager (#94426)
This pull request port `regallocfast` to new pass manager. It exposes
the parameter `filter` to handle different register classes for AMDGPU.
IIUC AMDGPU need to allocate different register classes separately so it
need implement its own `--<reg-class>-regalloc`. Now users can use e.g.
`-passe=regallocfast<filter=sgpr>` to allocate specific register class.
The command line option `--regalloc-npm` is still in work progress, plan
to reuse the syntax of passes, e.g. use
`--regalloc-npm=regallocfast<filter=sgpr>,greedy<filter=vgpr>` to
replace `--sgpr-regalloc` and `--vgpr-regalloc`.
2024-06-07 12:22:42 +08:00
Kai Luo
bf02f81da7
[PowerPC] Adjust operand order of ADDItoc to be consistent with other ADDI* nodes (#93642)
Simultaneously, the `ADDItoc` machineinstr is generated in
`PPCISelDAGToDAG::Select` so the pattern is not used and can be removed.
2024-06-06 17:15:53 +08:00
Kai Luo
d3f8eab0ac [PowerPC] Add test to show alignment of toc-data symbol is changed. NFC.
After O3 opt pipeline, the alignment of toc-data symbol is changed which is
unexpected.
2024-06-06 08:58:04 +00:00
Zarko Todorovski
0295c2ada4
[PowerPC][AIX] Support ByVals with greater alignment then pointer size (#93341)
Implementation is NOT compatible with IBM XL C 16.1 and earlier but is
compatible with GCC.
It handles all ByVals with greater alignment then pointer width the same
way IBM XL C handles Byvals
that have vector members. For overaligned objects that do not contain
vectors IBM XL C does not align them properly if they are passed in the
GPR
argument registers.

This patch was originally written by Sean Fertile @mandlebug. 

Previously on Phabricator https://reviews.llvm.org/D105659
2024-06-05 12:19:16 -04:00
Kai Luo
8ab578a126 [PowerPC] Add test of non-zero addend in tocdata relocation. NFC.
It intends to check if IAS handles non-zero addend correctly.
2024-06-05 08:55:57 +00:00
Kai Luo
8423337502 [PowerPC] Add test for ppc-mi-peepholes on MMA register COPYs. NFC. 2024-06-04 08:13:34 +00:00
Nikita Popov
deab451e7a
[IR] Remove support for icmp and fcmp constant expressions (#93038)
Remove support for the icmp and fcmp constant expressions.

This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179

As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
2024-06-04 08:31:03 +02:00
Kai Luo
4d20f495df
[PowerPC] Remove DAG matching in ADDIStocHA (#93905)
The MI is generated in `PPCDAGToDAGISel::Select` so the match pattern isn't used and can be removed.
2024-06-04 10:00:43 +08:00
Simon Pilgrim
2b1dfd2b35
[DAG] Replace getValid*ShiftAmountConstant helpers with getValid*ShiftAmount helpers to support KnownBits analysis (#93182)
The getValidShiftAmountConstant/getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant helpers only worked with constant shift amounts, which could be problematic after type legalization (e.g. v2i64 might be partially scalarized or split into v4i32 on some targets such as 32-bit x86, Thumb2 MVE).

This patch proposes we generalize these helpers to work with ConstantRange+KnownBits if a scalar/buildvector constant isn't available.

Most restrictions are the same - the helper fails if any shift amount is out of bounds, getValidShiftConstant must be a specific constant uniform etc.

However, getValidMinimumShiftAmount/getValidMaximumShiftAmount now can return bounds values that aren't values in the actual data, as they are based off the common KnownBits of every vector element.

This addresses feedback on #92096
2024-06-01 16:48:26 +01:00
Yingwei Zheng
47fd32f81c
[DAGCombine] Fix type mismatch in (shl X, cttz(Y)) -> (mul (Y & -Y), X) (#94008)
Proof: https://alive2.llvm.org/ce/z/J7GBMU

Same as https://github.com/llvm/llvm-project/pull/92753, the types of
LHS and RHS in shift nodes may differ.
+ When VT is smaller than ShiftVT, it is safe to use trunc.
+ When VT is larger than ShiftVT, it is safe to use zext iff
`is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See
also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2
proofs.

Fixes issue
https://github.com/llvm/llvm-project/pull/85066#issuecomment-2142553617.
2024-06-01 19:04:55 +08:00
Kai Luo
59116e0941 [PowerPC] Update test so that target flags are exposed. NFC. 2024-06-01 13:00:18 +08:00
Egor Pasko
cab81dd038
[EntryExitInstrumenter] Move passes out of clang into LLVM default pipelines (#92171)
Move EntryExitInstrumenter(PostInlining=true) to as late as possible and
EntryExitInstrumenter(PostInlining=false) to an early pre-inlining stage
(but skip for ThinLTO post-link).

This should fix the issues reported in
https://github.com/rust-lang/rust/issues/92109 and
https://github.com/llvm/llvm-project/issues/52853. These are caused
by https://reviews.llvm.org/D97608.
2024-05-31 12:48:45 -07:00
zhijian lin
6127f15e5b
[PowerPC] option -msoft-float should not block the PC-relative address instruction (#92543)
The Prefix instruction is introduced on PowerPC ISA3_1.

In the PR,  
1. The `FeaturePrefixInstrs` do not imply the `FeatureP8Vector`
,`FeatureP9Vector` .
2. `FeaturePrefixInstrs`  implies only the FeatureISA3_1.
3. For the prefix instructions `paddi` and `pli` , they have `Predicates
= [PrefixInstrs] `
4. For the prefix instructions `plfs` and `plfd`, they have `Predicates
= [PrefixInstrs, HasFPU] `
5. For the prefix instructions "plxv` , "plxssp` and `plxsd` , they have
`Predicates = [PrefixInstrs, HasP10Vector]`

Fixes #62372
2024-05-29 10:53:00 -04:00
Matt Arsenault
465bc5e729
AArch64/ARM/PPC/X86: Add some atomic tests (#92933)
FP typed atomic load/store coverage was mostly missing, especially for
half and bfloat.
2024-05-29 07:05:55 +02:00
Tyker
6e1a04247d Fix failure after d46e37348ec3f8054b10bcbbe7c11149d7f61031 2024-05-28 15:23:13 +02:00
Nikita Popov
9f85bc834b
[PPCMergeStringPool] Only replace constant once (#92996)
In #88846 I changed this code to use RAUW to perform the replacement
instead of manual updates -- but kept the outer loop, which means we try
to perform RAUW once per user. However, some of the users might be freed
by the RAUW operation, resulting in use-after-free.

The case where this happens is constant users where the replacement
might result in the destruction of the original constant.

Fixes https://github.com/llvm/llvm-project/issues/92991.
2024-05-27 08:54:11 +02:00
Nikita Popov
1579e9ca9c Revert "Run ObjCContractPass in Default Codegen Pipeline (#92331)"
This reverts commit 8cc8e5d6c6ac9bfc888f3449f7e424678deae8c2.
This reverts commit dae55c89835347a353619f506ee5c8f8a2c136a7.

Causes major compile-time regressions for unoptimized builds.
2024-05-24 08:14:26 +02:00
Chen Zheng
cd9bab2e2a
[PowerPC] handle toc-data in load selection of fast-isel (#91916)
Support the address selection for toc-data globals in fast isel. This
benefits instruction selection for fast-isel for toc data symbol for
example for load selection. This also aligns the code generation
with/without -mtocdata.
2024-05-24 11:09:37 +08:00
Nuri Amari
8cc8e5d6c6
Run ObjCContractPass in Default Codegen Pipeline (#92331)
Prior to this patch, when using -fthinlto-index= the ObjCARCContractPass isn't run prior to CodeGen, and instruction selection fails on IR containing arc intrinsics. This patch is motivated by that usecase.

The pass was previously added in various places codegen is performed. This patch adds the pass to the default codegen pipepline, makes sure it bails immediately if no arc intrinsics are found, and removes the adhoc scheduling of the pass. 

Co-authored-by: Nuri Amari <nuriamari@fb.com>
2024-05-23 10:04:55 -07:00
Nikita Popov
ca478bc6cc
[SCEV] Support ule/sle exit counts via widening (#92206)
If we have an exit condition of the form IV <= Limit, we will first try
to convert it into IV < Limit+1 or IV-1 < Limit based on range info (in
icmp simplification). If that fails, we try to convert it to IV < Limit
+ 1 based on controlling exits in non-infinite loops.

However, if all else fails, we can still determine the exit count by
rewriting to ext(IV) < ext(Limit) + 1, where the zero/sign extension
ensures that the addition does not overflow.

Proof: https://alive2.llvm.org/ce/z/iR-iYd
2024-05-23 07:54:08 +02:00
Zaara Syeda
29456e9bcc
[PowerPC] Fix assembler error with toc-data and data-sections (#91976)
We should not emit the label for the toc-data variable when
data-sections=false.
2024-05-22 14:07:51 -04:00
Zaara Syeda
194e7cc7aa
[PowerPC][AIX] 64-bit large code-model support for toc-data (#90619)
This patch adds support for toc-data for 64-bit large code-model on AIX.
The sequence ADDIStocHA8/ADDItocL8 is used to access the data directly
from the TOC.
When emitting the instruction ADDIStocHA8, we check if the symbol has
toc-data attribute before creating a toc entry for it. When emitting the
instruction ADDItocL8, we use the LA8 instruction to load the address.
2024-05-21 14:00:24 -04:00
Nikita Popov
d0e0205bfc [InstCombine] Check for poison instead of undef in single shuffle fold
Otherwise we'll convert undef to poison. Alive2 was already flagging
the existing test8 test as a miscompile.
2024-05-21 16:03:20 +02:00
Chen Zheng
2143b7cd7d [PowerPC]perform bitcast lowering only at 64 bit
Perform bitcast lowering requires 64-bit to be native supported,
However this is not true on 32-bit targets. Explicitly require
64-bit target.

Fixes #92233
2024-05-20 03:17:21 -04:00
Simon Pilgrim
117d755b1b
[DAG] SimplifyDemandedBits - use ComputeKnownBits instead of getValidShiftAmountConstant to check for constant shift amounts. (#92412)
This allows us to handle cases where the constant has already been type legalized behind a bitcast

Despite calling ComputeKnownBits I'm not seeing any notable change in compile time.
2024-05-16 17:04:30 +01:00
Jake Egan
d9db266499
[PowerPC][test] Catch any exception when retrieving git revision (#92004)
This makes the `vc-rev-enabled` feature unsupported if we fail to
retrieve the git revision for any reason, such as if git is not
installed.
2024-05-14 10:32:30 -04:00
Simon Pilgrim
31fb0ae23d [PowerPC] Regenerate and_sext.ll with test checks
I've kept the grep checks for extsh/extsb instructions, but we can now see the actual codegen as well
2024-05-14 11:58:48 +01:00
Chen Zheng
662267daea [PPC] add testcase, nfc 2024-05-13 01:49:00 -04:00
Matt Arsenault
6a8d30b1c1
DAG: Skip 0 sign handling in minimum/maximum lowering for _ieee case (#91326)
dc9664a8adae17f2083fbcc8e96cfce606c56d57 changed the documentation to
assume these order -0 as less than +0.
2024-05-09 14:41:13 +02:00
Nikita Popov
3a3aeb8eba
[PPCMergeStringPool] Avoid replacing constant with instruction (#88846)
String pool merging currently, for a reason that's not entirely clear to
me, tries to create GEP instructions instead of GEP constant expressions
when replacing constant references. It only uses constant expressions in
cases where this is required. However, it does not catch all cases where
such a requirement exists. For example, the landingpad catch clause has
to be a constant.

Fix this by always using the constant expression variant, which also
makes the implementation simpler.

Additionally, there are some edge cases where even replacement with a
constant GEP is not legal. The one I am aware of is the
llvm.eh.typeid.for intrinsic, so add a special case to forbid
replacements for it.

Fixes https://github.com/llvm/llvm-project/issues/88844.
2024-05-09 13:27:20 +09:00
Felix (Ting Wang)
ea126aebdc
[PowerPC] Tune AIX shared library TLS model at function level (#84132)
Under some circumstance (library loaded with the main program), TLS
initial-exec model can be applied to local-dynamic access(es). We
could use some simple heuristic to decide the update at function level:
* If there is equal or less than a number of TLS local-dynamic access(es)
in the function, use TLS initial-exec model. (the threshold which default to
1 is controlled by hidden option)
2024-05-09 09:50:36 +08:00
Felix (Ting Wang)
19220110ac
[PowerPC][AIX] Refactor existing logic to handle non-zero offsets for aix-small-local-dynamic-tls (#89182)
To enable optimized small local-dynamic access sequence for non-zero
offsets, this patch refactors existing
2a50921553798d2db52ca6330c89f0f8a5bc2215.
2024-05-08 18:37:51 +08:00
Maryam Moghadas
9a28814f59
[PowerPC] Spill non-volatile registers required for traceback table (#71115)
On AIX we need to spill all [rfv]N-[rfv]31 when a function clobbers
[rfv]N so that the traceback table contains accurate information.
2024-05-07 16:23:37 -04:00
Jake Egan
8cde1cfc60
[AIX] Add git revision to .file string (#88164)
If `LLVM_APPEND_VC_REV` is on, add the git revision to the `.file`
string. The revision can be set with `LLVM_FORCE_VC_REVISION`.

Before:
`.file	"git_revision.cpp",,"LLVM version 19.0.0git"`

After:
`.file	"git_revision.cpp",,"LLVM version 19.0.0git (LLVM_REVISION)"`
2024-04-30 20:37:35 -04:00
zhijian lin
70ada5b178
NFC add a new precommit test case for PPCMIpeephole (#90656)
Add pre-commit MIR test for PR "[Promote Pseudo Opcode from 32-bit to
64-bit after eliminating the extsw instruction in PPCMIPeepholes
optimization](https://github.com/llvm/llvm-project/pull/85451)" which
fixes bug reported in the issue "[Inconsistent Output at -O1 and -O2
Optimization Levels on PowerPC64 Due to Complex Type Casting and Nested
Loop Structure](https://github.com/llvm/llvm-project/issues/71030)".
2024-04-30 16:27:34 -04:00