llvm-project

Author	SHA1	Message	Date
esmeyi	c119da23af	[PowerPC] Function descriptor symbol may be omitted for external symbol. #97526 If a function's address is taken, which means it may be called via a function pointer, we need the function descriptor for it. Otherwise, the function descriptor can be omitted for external symbols.	2024-07-08 03:47:33 -04:00
Matt Arsenault	db9252b115	DAG: Call SimplifyDemandedBits on fcopysign sign value (#97151 ) Math library code has quite a few places with complex bit logic that are ultimately fed into a copysign. This helps avoid some regressions in a future patch. This assumes the position in the float type, which should at least be valid for IEEE types. Not sure if we need to guard against ppc_fp128 or anything else weird. There appears to be some value in simplifying the value operand as well, but I'll address that separately.	2024-07-01 12:19:17 +02:00
Chen Zheng	e1c03ddc9b	[PowerPC] use r1 as the frame pointer when there is dynamic alloca On PPC, when there is dynamic alloca, only r1 points to the backchain.	2024-06-20 22:26:52 -04:00
Chen Zheng	abaaa48ce6	[PowerPC] fix frameaddress error when there is dynamic alloca call, NFC	2024-06-20 22:26:48 -04:00
Zaara Syeda	898b8a42b5	[PPC] Add DwarfRegAlias for VSRPair (#95837 ) Add DwarfRegAlias for VSRPair as it shares dwarfRegNum with the VR registers.	2024-06-20 11:30:58 -04:00
Kai Luo	480a788e49	[PowerPC] Make verifier happy after peephole on MMA COPYs (#94321 )	2024-06-20 12:06:47 +08:00
Simon Pilgrim	2a57a08829	[PowerPC] Regenerate p8altivec-shuffles-pred.ll with update_llc_test_checks script	2024-06-19 16:55:33 +01:00
Kai Luo	117921e071	[PowerPC] Alignment of toc-data symbol should not be increased during optimizations (#94593 ) Currently, the alignment of toc-data symbol might be changed during instcombine ``` IC: Visiting: %global = alloca %struct.widget, align 8 Found alloca equal to global: %global = alloca %struct.widget, align 8 memcpy = call void @llvm.memcpy.p0.p0.i64(ptr nonnull align 1 %global, ptr align 1 @global, i64 3, i1 false) ``` The `alloca` is created with `PrefAlign` which is 8 and after IC, the alignment of `@global` is enforced into `8`, same as the `alloca`. This is not expected, since toc-data symbol has the same alignment as toc entry and should not be increased during optimizations. --------- Co-authored-by: Sean Fertile <sd.fertile@gmail.com> Co-authored-by: Eli Friedman <efriedma@quicinc.com>	2024-06-18 09:58:37 +08:00
Stefan Pintilie	1af1c9fb98	[NFC][PowerPC] Update the option to -enable-subreg-liveness.	2024-06-14 13:57:21 -05:00
Stefan Pintilie	e84ecf26fa	[NFC][PowerPC] Add test to check lanemasks for subregisters. (#94363 ) This change adds a test case to check the lane masks for a variety of subregisters.	2024-06-14 13:49:37 -04:00
David Green	706e197540	[CodeGen] Remove target SubRegLiveness flags (#95437 ) This removes the uses of target flags to disable subreg liveness, relying on the `-enable-subreg-liveness` flag instead. The `-enable-subreg-liveness` flag has been changed to take precedence over the subtarget if set, and one use of `Subtarget->enableSubRegLiveness()` has been changed to `MRI->subRegLivenessEnabled()` to make sure the option properly applies.	2024-06-14 08:51:56 +01:00
Amy Kwan	19b43e1757	[PowerPC][NFC] Pre-commit test case to prepare for patch to merge internal and private global data	2024-06-13 10:53:19 -05:00
Farzon Lotfi	189d471191	[clang] Reland Add tanf16 builtin and support for tan constrained intrinsic (#94559 ) Relanding this PR now that https://github.com/llvm/llvm-project/pull/90503 has merged. with `FTAN` landing in [TargetLoweringBase.cpp:L1021](https://github.com/llvm/llvm-project/blob/main/llvm/lib/CodeGen/TargetLoweringBase.cpp#L1020C23-L1021C63 ) There is now a llvm tan intrinsic 32\64\128 Expand case for all llvm backends. In LLVM, the `llvm.experimental.constrained.cos` and `llvm.experimental.constrained.sin` intrinsics are used for performing cosine and sine calculations with additional constraints on floating-point operations. This behavior is expected for all floating-point math intrinsics. This change adds these constraints for the `tan` intrinsic. - `Builtins.td` - replace TanF128 with F16F128MathTemplate - `CGBuiltin.cpp` - map existing tan builtins to `tan` and `constrained_tan` intrinsic - `ConstrainedOps.def` map tan and constrained_tan to an ISDOpcode. resolves #91421 --------- Co-authored-by: Farzon Lotfi <farzon@farzon.com>	2024-06-10 20:46:26 -04:00
Chen Zheng	3453dedfaf	[PowerPC] return correct frame address for frameaddress intrinsic	2024-06-07 05:17:22 -04:00
Chen Zheng	0749b01c81	[PowerPC] modify the frameaddress case, NFC	2024-06-07 05:15:41 -04:00
paperchalice	1bc8b3258e	[NewPM][CodeGen] Port `regallocfast` to new pass manager (#94426 ) This pull request ports `regallocfast` to the new pass manager. It exposes the parameter `filter` to handle different register classes for AMDGPU. IIUC AMDGPU needs to allocate different register classes separately, so it needs to implement its own `--<reg-class>-regalloc`. Now users can use e.g. `-passes=regallocfast<filter=sgpr>` to allocate a specific register class. The command line option `--regalloc-npm` is still a work in progress; the plan is to reuse the syntax of passes, e.g. use `--regalloc-npm=regallocfast<filter=sgpr>,greedy<filter=vgpr>` to replace `--sgpr-regalloc` and `--vgpr-regalloc`.	2024-06-07 12:22:42 +08:00
Kai Luo	bf02f81da7	[PowerPC] Adjust operand order of ADDItoc to be consistent with other ADDI* nodes (#93642 ) Simultaneously, the `ADDItoc` machineinstr is generated in `PPCISelDAGToDAG::Select` so the pattern is not used and can be removed.	2024-06-06 17:15:53 +08:00
Kai Luo	d3f8eab0ac	[PowerPC] Add test to show alignment of toc-data symbol is changed. NFC. After O3 opt pipeline, the alignment of toc-data symbol is changed which is unexpected.	2024-06-06 08:58:04 +00:00
Zarko Todorovski	0295c2ada4	[PowerPC][AIX] Support ByVals with greater alignment than pointer size (#93341 ) Implementation is NOT compatible with IBM XL C 16.1 and earlier but is compatible with GCC. It handles all ByVals with greater alignment than pointer width the same way IBM XL C handles ByVals that have vector members. For overaligned objects that do not contain vectors, IBM XL C does not align them properly if they are passed in the GPR argument registers. This patch was originally written by Sean Fertile @mandlebug. Previously on Phabricator https://reviews.llvm.org/D105659	2024-06-05 12:19:16 -04:00
Kai Luo	8ab578a126	[PowerPC] Add test of non-zero addend in tocdata relocation. NFC. It intends to check if IAS handles non-zero addend correctly.	2024-06-05 08:55:57 +00:00
Kai Luo	8423337502	[PowerPC] Add test for ppc-mi-peepholes on MMA register COPYs. NFC.	2024-06-04 08:13:34 +00:00
Nikita Popov	deab451e7a	[IR] Remove support for icmp and fcmp constant expressions (#93038 ) Remove support for the icmp and fcmp constant expressions. This is part of: https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179 As usual, many of the updated tests will no longer test what they were originally intended to -- this is hard to preserve when constant expressions get removed, and in many cases just impossible as the existence of a specific kind of constant expression was the cause of the issue in the first place.	2024-06-04 08:31:03 +02:00
Kai Luo	4d20f495df	[PowerPC] Remove DAG matching in ADDIStocHA (#93905 ) The MI is generated in `PPCDAGToDAGISel::Select` so the match pattern isn't used and can be removed.	2024-06-04 10:00:43 +08:00
Simon Pilgrim	2b1dfd2b35	[DAG] Replace getValidShiftAmountConstant helpers with getValidShiftAmount helpers to support KnownBits analysis (#93182 ) The getValidShiftAmountConstant/getValidMinimumShiftAmountConstant/getValidMaximumShiftAmountConstant helpers only worked with constant shift amounts, which could be problematic after type legalization (e.g. v2i64 might be partially scalarized or split into v4i32 on some targets such as 32-bit x86, Thumb2 MVE). This patch proposes we generalize these helpers to work with ConstantRange+KnownBits if a scalar/buildvector constant isn't available. Most restrictions are the same - the helper fails if any shift amount is out of bounds, getValidShiftConstant must be a specific constant uniform etc. However, getValidMinimumShiftAmount/getValidMaximumShiftAmount now can return bounds values that aren't values in the actual data, as they are based off the common KnownBits of every vector element. This addresses feedback on #92096	2024-06-01 16:48:26 +01:00
Yingwei Zheng	47fd32f81c	[DAGCombine] Fix type mismatch in `(shl X, cttz(Y)) -> (mul (Y & -Y), X)` (#94008 ) Proof: https://alive2.llvm.org/ce/z/J7GBMU Same as https://github.com/llvm/llvm-project/pull/92753, the types of LHS and RHS in shift nodes may differ. + When VT is smaller than ShiftVT, it is safe to use trunc. + When VT is larger than ShiftVT, it is safe to use zext iff `is_zero_poison` is true (i.e., `opcode == ISD::CTTZ_ZERO_UNDEF`). See also the counterexample `src_shl_cttz2 -> tgt_shl_cttz2` in the alive2 proofs. Fixes issue https://github.com/llvm/llvm-project/pull/85066#issuecomment-2142553617.	2024-06-01 19:04:55 +08:00
Kai Luo	59116e0941	[PowerPC] Update test so that target flags are exposed. NFC.	2024-06-01 13:00:18 +08:00
Egor Pasko	cab81dd038	[EntryExitInstrumenter] Move passes out of clang into LLVM default pipelines (#92171 ) Move EntryExitInstrumenter(PostInlining=true) to as late as possible and EntryExitInstrumenter(PostInlining=false) to an early pre-inlining stage (but skip for ThinLTO post-link). This should fix the issues reported in https://github.com/rust-lang/rust/issues/92109 and https://github.com/llvm/llvm-project/issues/52853. These are caused by https://reviews.llvm.org/D97608.	2024-05-31 12:48:45 -07:00
zhijian lin	6127f15e5b	[PowerPC] option `-msoft-float` should not block the PC-relative address instruction (#92543 ) Prefix instructions were introduced in PowerPC ISA 3.1. In this PR: 1. `FeaturePrefixInstrs` does not imply `FeatureP8Vector` or `FeatureP9Vector`. 2. `FeaturePrefixInstrs` implies only `FeatureISA3_1`. 3. The prefix instructions `paddi` and `pli` have `Predicates = [PrefixInstrs]`. 4. The prefix instructions `plfs` and `plfd` have `Predicates = [PrefixInstrs, HasFPU]`. 5. The prefix instructions `plxv`, `plxssp` and `plxsd` have `Predicates = [PrefixInstrs, HasP10Vector]`. Fixes #62372	2024-05-29 10:53:00 -04:00
Matt Arsenault	465bc5e729	AArch64/ARM/PPC/X86: Add some atomic tests (#92933 ) FP typed atomic load/store coverage was mostly missing, especially for half and bfloat.	2024-05-29 07:05:55 +02:00
Tyker	6e1a04247d	Fix failure after d46e37348ec3f8054b10bcbbe7c11149d7f61031	2024-05-28 15:23:13 +02:00
Nikita Popov	9f85bc834b	[PPCMergeStringPool] Only replace constant once (#92996 ) In #88846 I changed this code to use RAUW to perform the replacement instead of manual updates -- but kept the outer loop, which means we try to perform RAUW once per user. However, some of the users might be freed by the RAUW operation, resulting in use-after-free. The case where this happens is constant users where the replacement might result in the destruction of the original constant. Fixes https://github.com/llvm/llvm-project/issues/92991.	2024-05-27 08:54:11 +02:00
Nikita Popov	1579e9ca9c	Revert "Run ObjCContractPass in Default Codegen Pipeline (#92331 )" This reverts commit 8cc8e5d6c6ac9bfc888f3449f7e424678deae8c2. This reverts commit dae55c89835347a353619f506ee5c8f8a2c136a7. Causes major compile-time regressions for unoptimized builds.	2024-05-24 08:14:26 +02:00
Chen Zheng	cd9bab2e2a	[PowerPC] handle toc-data in load selection of fast-isel (#91916 ) Support the address selection for toc-data globals in fast isel. This benefits instruction selection for fast-isel for toc data symbol for example for load selection. This also aligns the code generation with/without -mtocdata.	2024-05-24 11:09:37 +08:00
Nuri Amari	8cc8e5d6c6	Run ObjCContractPass in Default Codegen Pipeline (#92331 ) Prior to this patch, when using -fthinlto-index= the ObjCARCContractPass isn't run prior to CodeGen, and instruction selection fails on IR containing arc intrinsics. This patch is motivated by that use case. The pass was previously added in various places where codegen is performed. This patch adds the pass to the default codegen pipeline, makes sure it bails immediately if no arc intrinsics are found, and removes the ad hoc scheduling of the pass. Co-authored-by: Nuri Amari <nuriamari@fb.com>	2024-05-23 10:04:55 -07:00
Nikita Popov	ca478bc6cc	[SCEV] Support ule/sle exit counts via widening (#92206 ) If we have an exit condition of the form IV <= Limit, we will first try to convert it into IV < Limit+1 or IV-1 < Limit based on range info (in icmp simplification). If that fails, we try to convert it to IV < Limit + 1 based on controlling exits in non-infinite loops. However, if all else fails, we can still determine the exit count by rewriting to ext(IV) < ext(Limit) + 1, where the zero/sign extension ensures that the addition does not overflow. Proof: https://alive2.llvm.org/ce/z/iR-iYd	2024-05-23 07:54:08 +02:00
Zaara Syeda	29456e9bcc	[PowerPC] Fix assembler error with toc-data and data-sections (#91976 ) We should not emit the label for the toc-data variable when data-sections=false.	2024-05-22 14:07:51 -04:00
Zaara Syeda	194e7cc7aa	[PowerPC][AIX] 64-bit large code-model support for toc-data (#90619 ) This patch adds support for toc-data for 64-bit large code-model on AIX. The sequence ADDIStocHA8/ADDItocL8 is used to access the data directly from the TOC. When emitting the instruction ADDIStocHA8, we check if the symbol has toc-data attribute before creating a toc entry for it. When emitting the instruction ADDItocL8, we use the LA8 instruction to load the address.	2024-05-21 14:00:24 -04:00
Nikita Popov	d0e0205bfc	[InstCombine] Check for poison instead of undef in single shuffle fold Otherwise we'll convert undef to poison. Alive2 was already flagging the existing test8 test as a miscompile.	2024-05-21 16:03:20 +02:00
Chen Zheng	2143b7cd7d	[PowerPC] perform bitcast lowering only at 64 bit The bitcast lowering requires 64-bit to be natively supported; however, this is not true on 32-bit targets. Explicitly require a 64-bit target. Fixes #92233	2024-05-20 03:17:21 -04:00
Simon Pilgrim	117d755b1b	[DAG] SimplifyDemandedBits - use ComputeKnownBits instead of getValidShiftAmountConstant to check for constant shift amounts. (#92412 ) This allows us to handle cases where the constant has already been type legalized behind a bitcast Despite calling ComputeKnownBits I'm not seeing any notable change in compile time.	2024-05-16 17:04:30 +01:00
Jake Egan	d9db266499	[PowerPC][test] Catch any exception when retrieving git revision (#92004 ) This makes the `vc-rev-enabled` feature unsupported if we fail to retrieve the git revision for any reason, such as if git is not installed.	2024-05-14 10:32:30 -04:00
Simon Pilgrim	31fb0ae23d	[PowerPC] Regenerate and_sext.ll with test checks I've kept the grep checks for extsh/extsb instructions, but we can now see the actual codegen as well	2024-05-14 11:58:48 +01:00
Chen Zheng	662267daea	[PPC] add testcase, nfc	2024-05-13 01:49:00 -04:00
Matt Arsenault	6a8d30b1c1	DAG: Skip 0 sign handling in minimum/maximum lowering for _ieee case (#91326 ) dc9664a8adae17f2083fbcc8e96cfce606c56d57 changed the documentation to assume these order -0 as less than +0.	2024-05-09 14:41:13 +02:00
Nikita Popov	3a3aeb8eba	[PPCMergeStringPool] Avoid replacing constant with instruction (#88846 ) String pool merging currently, for a reason that's not entirely clear to me, tries to create GEP instructions instead of GEP constant expressions when replacing constant references. It only uses constant expressions in cases where this is required. However, it does not catch all cases where such a requirement exists. For example, the landingpad catch clause has to be a constant. Fix this by always using the constant expression variant, which also makes the implementation simpler. Additionally, there are some edge cases where even replacement with a constant GEP is not legal. The one I am aware of is the llvm.eh.typeid.for intrinsic, so add a special case to forbid replacements for it. Fixes https://github.com/llvm/llvm-project/issues/88844.	2024-05-09 13:27:20 +09:00
Felix (Ting Wang)	ea126aebdc	[PowerPC] Tune AIX shared library TLS model at function level (#84132 ) Under some circumstances (library loaded with the main program), the TLS initial-exec model can be applied to local-dynamic accesses. A simple heuristic decides the update at function level: * If the number of TLS local-dynamic accesses in the function is at or below a threshold, use the TLS initial-exec model. (The threshold, which defaults to 1, is controlled by a hidden option.)	2024-05-09 09:50:36 +08:00
Felix (Ting Wang)	19220110ac	[PowerPC][AIX] Refactor existing logic to handle non-zero offsets for aix-small-local-dynamic-tls (#89182 ) To enable optimized small local-dynamic access sequence for non-zero offsets, this patch refactors existing 2a50921553798d2db52ca6330c89f0f8a5bc2215.	2024-05-08 18:37:51 +08:00
Maryam Moghadas	9a28814f59	[PowerPC] Spill non-volatile registers required for traceback table (#71115 ) On AIX we need to spill all [rfv]N-[rfv]31 when a function clobbers [rfv]N so that the traceback table contains accurate information.	2024-05-07 16:23:37 -04:00
Jake Egan	8cde1cfc60	[AIX] Add git revision to .file string (#88164 ) If `LLVM_APPEND_VC_REV` is on, add the git revision to the `.file` string. The revision can be set with `LLVM_FORCE_VC_REVISION`. Before: `.file "git_revision.cpp",,"LLVM version 19.0.0git"` After: `.file "git_revision.cpp",,"LLVM version 19.0.0git (LLVM_REVISION)"`	2024-04-30 20:37:35 -04:00
zhijian lin	70ada5b178	NFC add a new precommit test case for PPCMIpeephole (#90656 ) Add pre-commit MIR test for PR "[Promote Pseudo Opcode from 32-bit to 64-bit after eliminating the extsw instruction in PPCMIPeepholes optimization](https://github.com/llvm/llvm-project/pull/85451)" which fixes bug reported in the issue "[Inconsistent Output at -O1 and -O2 Optimization Levels on PowerPC64 Due to Complex Type Casting and Nested Loop Structure](https://github.com/llvm/llvm-project/issues/71030)".	2024-04-30 16:27:34 -04:00
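
The fold named in commit 47fd32f81c above, `(shl X, cttz(Y)) -> (mul (Y & -Y), X)`, rests on `Y & -Y` isolating the lowest set bit of `Y`, which equals `1 << cttz(Y)`. The following is a minimal Python sketch of that identity, assuming 8-bit unsigned values and nonzero `Y`. The helper names (`cttz`, `shl_cttz`, `mul_lowbit`, `identity_holds`) are hypothetical, and the sketch does not model the type mismatch that the commit actually fixes, nor the DAGCombine code itself.

```python
BITS = 8                      # assumed width for the illustration
MASK = (1 << BITS) - 1


def cttz(y: int) -> int:
    """Count trailing zeros of a nonzero BITS-wide value (hypothetical helper)."""
    assert 0 < y <= MASK
    # y & -y keeps only the lowest set bit, i.e. 2**k, whose bit_length is k + 1.
    return (y & -y).bit_length() - 1


def shl_cttz(x: int, y: int) -> int:
    """Left-hand side of the fold: x << cttz(y), truncated to BITS bits."""
    return (x << cttz(y)) & MASK


def mul_lowbit(x: int, y: int) -> int:
    """Right-hand side of the fold: x * (y & -y), truncated to BITS bits."""
    return (x * (y & -y)) & MASK


def identity_holds() -> bool:
    """Exhaustively compare both sides for every x and every nonzero y."""
    return all(
        shl_cttz(x, y) == mul_lowbit(x, y)
        for x in range(MASK + 1)
        for y in range(1, MASK + 1)
    )
```

`Y == 0` is left out on purpose: the shift amount would then equal the bit width, which is exactly the kind of edge case the commit message handles through `is_zero_poison`.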

3882 Commits
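
Commit ca478bc6cc above describes rewriting an exit condition `IV <= Limit` as `ext(IV) < ext(Limit) + 1`, where the extension keeps the `+ 1` from overflowing. The following is a small Python sketch of why the widening matters for the unsigned case, assuming an 8-bit induction variable. The function names are hypothetical, and this only checks the comparison, not SCEV's exit-count computation.

```python
BITS = 8                      # assumed narrow width for the illustration
MAXV = (1 << BITS) - 1


def narrow_rewrite(iv: int, limit: int) -> bool:
    """IV < Limit + 1 evaluated in BITS bits; Limit + 1 wraps to 0 when Limit == MAXV."""
    return iv < ((limit + 1) & MAXV)


def widened_rewrite(iv: int, limit: int) -> bool:
    """zext(IV) < zext(Limit) + 1 evaluated in BITS + 1 bits; the add cannot wrap."""
    wide_mask = (1 << (BITS + 1)) - 1
    return iv < ((limit + 1) & wide_mask)


def widened_matches_ule() -> bool:
    """Check the widened form against IV <= Limit for all unsigned BITS-wide pairs."""
    return all(
        (iv <= limit) == widened_rewrite(iv, limit)
        for iv in range(MAXV + 1)
        for limit in range(MAXV + 1)
    )
```

With `limit == MAXV`, `narrow_rewrite` compares against a wrapped `0` and is false for every `iv`, while `widened_rewrite` still agrees with `iv <= limit`.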