llvm-project

Author	SHA1	Message	Date
Jay Foad	7ea33e6848	[CodeGen] Remove unused first operand of SUBREG_TO_REG (#179690 ) The first input operand of SUBREG_TO_REG was an immediate that most targets set to 0. In practice it had no effect on codegen. Remove it.	2026-02-04 17:35:21 +00:00
paperchalice	1ffe78811e	[PowerPC] Remove NoInfsFPMath uses (#163029 ) Only `ninf` should be used. This is the PowerPC part.	2026-02-04 00:35:15 +00:00
Wael Yehia	e1f69ee8e8	[AIX] Implement the ifunc attribute. (#153049 ) Currently, the AIX linker and loader do not provide a mechanism to implement ifuncs similar to GNU_ifunc on ELF Linux. On AIX, we will lower `__attribute__((ifunc("resolver")))` to the llvm `ifunc` as other platforms do. The llvm `ifunc` in turn will get lowered at late stages of the optimization pipeline to an AIX-specific implementation. No special linkage or relocations are needed when generating assembly/object output. On AIX, a function `foo` has two symbols associated with it: a function descriptor (`foo`) residing in the `.data` section, and an entry point (`.foo`) residing in the `.text` section. The first field of the descriptor is the address of the entry point. Typically, the address field in the descriptor is initialized once: statically, at load time (?), or at runtime if runtime linking is enabled. Here we would like to use the address field in the descriptor to implement the `ifunc` semantics. Specifically, the ifunc function will become a stub that jumps to the entry point in the address field. A constructor function is linked into every linkage module. The constructor walks an array of `{descriptor, resolver}` pairs, calling the resolver and saving the result in the address field in the descriptor (thus setting `foo`'s descriptor to point to the resolved version early during program runtime). Known limitations: - Due to bug #161576, which affects the object generation path, you will need either `-ffunction-sections` or `-fno-integrated-as` to generate a correct/linkable object file. - aliases to ifuncs are not supported; a testcase has been added and marked XFAIL. I'm planning to address this in a follow-up PR because it's not important enough, IMHO, for this PR - dead ifuncs in a CU that contains at least one live ifunc will result in all ifuncs being kept by the linker. The fix for this is common with a similar problem we have with PGO. 
PR #159435 is trying to provide a mechanism that will allow the ifunc and PGO implementations to avoid the dead code retention at the link step. - the resolver must return a function that is in the same DSO as the ifunc; the compiler will try to detect if this condition is violated and report it, but it cannot detect it in general. To be safe, all candidate functions (returned by a particular resolver) must either be static or have hidden/protected visibility. This is so that the ifunc stub doesn't have to save and restore the TOC register r2. In future work, this case will be supported and the requirement will be lifted. --------- Co-authored-by: Wael Yehia <wyehia@ca.ibm.com>	2026-02-03 14:15:16 -05:00
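The descriptor-patching scheme described in the AIX ifunc commit above can be modelled in plain C++. This is only a sketch under stated assumptions: `Descriptor`, `IfuncPair`, `initIfuncs`, `callThroughDescriptor`, `HasFastPath` and the two candidate functions are hypothetical names, only the first descriptor field (the entry-point address) is modelled, and the real stub and constructor are emitted by the compiler rather than written by hand.

```cpp
#include <cassert>

// Hypothetical model: an "entry point" is just a function pointer here.
using Impl = int (*)(int);

// Only the first field of the descriptor (entry-point address) is modelled.
struct Descriptor { Impl Entry; };

using Resolver = Impl (*)();

// One {descriptor, resolver} pair per ifunc, as in the commit description.
struct IfuncPair { Descriptor *Desc; Resolver Resolve; };

// Two candidate implementations that compute the same result.
static int scaleGeneric(int X) { return X * 2; }
static int scaleTuned(int X) { return X + X; }

// Stand-in for a runtime CPU feature query (assumption for illustration).
static bool HasFastPath = true;
static Impl resolveScale() { return HasFastPath ? scaleTuned : scaleGeneric; }

static Descriptor ScaleDesc = {nullptr};
static IfuncPair Pairs[] = {{&ScaleDesc, resolveScale}};

// Model of the constructor: call each resolver once and store the result
// in the address field of the matching descriptor.
void initIfuncs(IfuncPair *Begin, IfuncPair *End) {
  for (IfuncPair *P = Begin; P != End; ++P)
    P->Desc->Entry = P->Resolve();
}

// Model of the ifunc stub: jump to whatever the descriptor points at.
int callThroughDescriptor(const Descriptor &D, int Arg) { return D.Entry(Arg); }

void demoIfunc() {
  initIfuncs(Pairs, Pairs + 1);
  assert(ScaleDesc.Entry == &scaleTuned);
  assert(callThroughDescriptor(ScaleDesc, 21) == 42);
}
```

Because the resolver runs once in the constructor, later calls pay only for the indirect jump through the descriptor.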
Nicolai Hähnle	6f0b873f1c	[CodeGen] Refactor targets to override the new getTgtMemIntrinsic overload (NFC) (#175844 ) This is a fairly mechanical change. Instead of returning true/false, we either keep the Infos vector empty or push one entry.	2026-02-02 17:40:02 -08:00
SiliconA-Z	b4797d4c03	[PowerPC] Fix miscompilation when using 32-bit ucmp on 64-bit PowerPC (#178979 ) I forgot that you need to clear the upper 32 bits for the carry flag to work properly on ppc64 or else there will be garbage and possibly incorrect results. Fixes: https://github.com/llvm/llvm-project/issues/179119 I do not have merge permissions.	2026-02-02 09:00:40 +01:00
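The failure mode in the ucmp fix above can be illustrated with ordinary integers. This is a minimal sketch, not the backend's instruction sequence: `ucmp64` and `ucmp32InReg` are hypothetical helpers, and a `uint64_t` stands in for a 64-bit register whose upper half may hold leftover bits.

```cpp
#include <cassert>
#include <cstdint>

// Three-way unsigned compare over the full 64-bit value: -1, 0 or 1.
int ucmp64(uint64_t A, uint64_t B) { return (A > B) - (A < B); }

// A 32-bit compare carried out with 64-bit operations: the upper 32 bits
// are cleared first so that only the low halves take part.
int ucmp32InReg(uint64_t RegA, uint64_t RegB) {
  return ucmp64(RegA & 0xFFFFFFFFu, RegB & 0xFFFFFFFFu);
}

void demoUcmp() {
  // The low 32 bits hold 1, the upper 32 bits hold garbage.
  const uint64_t Dirty = 0xDEADBEEF00000001ull;
  assert(ucmp64(Dirty, 2) == 1);       // garbage makes 1 look larger than 2
  assert(ucmp32InReg(Dirty, 2) == -1); // masked compare gives the right order
}
```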
Sam Elliott	7184229fea	[NFC][MI] Tidy Up RegState enum use (2/2) (#177090 ) This Change makes `RegState` into an enum class, with bitwise operators. It also: - Updates declarations of flag variables/arguments/returns from `unsigned` to `RegState`. - Updates empty RegState initializers from 0 to `{}`. If this is causing problems in downstream code: - Adopt the `RegState getXXXRegState(bool)` functions instead of using a ternary operator such as `bool ? RegState::XXX : 0`. - Adopt the `bool hasRegState(RegState, RegState)` function instead of using a bitwise check of the flags.	2026-01-23 00:19:03 -08:00
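For downstream code affected by the change above, the migration pattern looks roughly like the following standalone sketch. It does not reproduce the LLVM headers: the `sketch` namespace, the flag values, and the choice that `hasRegState` returns true when any queried bit is set are all assumptions made for illustration. Only the names `RegState`, `NoFlags`, `hasRegState`, and the `getXXXRegState(bool)` shape come from the commit messages.

```cpp
#include <cassert>

namespace sketch {

// Scoped enum for flags; the numeric values are illustrative only.
enum class RegState : unsigned { NoFlags = 0, Define = 0x2, Kill = 0x8 };

constexpr RegState operator|(RegState A, RegState B) {
  return static_cast<RegState>(static_cast<unsigned>(A) |
                               static_cast<unsigned>(B));
}

constexpr RegState operator&(RegState A, RegState B) {
  return static_cast<RegState>(static_cast<unsigned>(A) &
                               static_cast<unsigned>(B));
}

// Replaces `bool ? RegState::Kill : 0`, which no longer type-checks.
constexpr RegState getKillRegState(bool B) {
  return B ? RegState::Kill : RegState::NoFlags;
}

// Replaces a raw `Flags & RegState::Kill` test (assumed: any bit set).
constexpr bool hasRegState(RegState Value, RegState Test) {
  return (Value & Test) != RegState::NoFlags;
}

} // namespace sketch

void demoRegState() {
  using namespace sketch;
  RegState Flags = {}; // empty initializer instead of 0
  Flags = Flags | RegState::Define | getKillRegState(true);
  assert(hasRegState(Flags, RegState::Kill));
  assert(!hasRegState(getKillRegState(false), RegState::Kill));
}
```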
Simon Pilgrim	7e01b33a42	[PPC] Fix suspicious AltiVec VAVG patterns (#176891 ) The existing ((X+Y+1)>>1) patterns didn't correctly handle overflow the way the VAVG instructions do. Remove the old patterns and correctly mark the AltiVec VAVGS/VAVGU patterns as matching the ISD::AVGCEIL opcodes; the generic DAG folds will handle everything else. I've updated the vavg.ll tests to correctly match ISD::AVGCEILS/U patterns and added the old tests as negative "overflow" patterns that shouldn't fold to VAVG instructions. Fixes #174718	2026-01-21 16:48:26 +00:00
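The overflow problem with the `((X+Y+1)>>1)` pattern shows up already for a single 8-bit lane. The sketch below uses hypothetical helper names (`naiveAvgU8`, `avgCeilU8`) and models one unsigned element rather than a whole vector. The naive form wraps at the element width, while a rounding-up average must behave as if the sum were computed in a wider type.

```cpp
#include <cassert>
#include <cstdint>

// (X + Y + 1) >> 1 evaluated at the element width: the sum wraps mod 256.
constexpr uint8_t naiveAvgU8(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>(static_cast<uint8_t>(X + Y + 1) >> 1);
}

// Rounding-up average with no intermediate overflow: widen, add, shift.
constexpr uint8_t avgCeilU8(uint8_t X, uint8_t Y) {
  return static_cast<uint8_t>((static_cast<uint16_t>(X) + Y + 1) >> 1);
}

static_assert(naiveAvgU8(1, 2) == avgCeilU8(1, 2), "agree when nothing wraps");
static_assert(naiveAvgU8(255, 255) == 127, "wrapped sum gives a wrong average");
static_assert(avgCeilU8(255, 255) == 255, "widened sum gives the true average");

void demoAvg() {
  assert(avgCeilU8(200, 100) == 150);
  assert(naiveAvgU8(200, 100) != 150); // 301 wraps to 45 before the shift
}
```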
Aditi Medhane	7cf30a7d3d	[PowerPC] Add Support for BCDSHIFT, BCDSHIFTR, BCDTRUNC, BCDUTRUNC, and BCDUSHIFT instruction support (#154715 ) Support the following BCD format conversion builtins for PowerPC. - `__builtin_bcdshift` – Shifts a packed decimal value by a specified number of decimal digits. - `__builtin_bcdshiftround` – Shifts a packed decimal value by a specified number of decimal digits, with rounding applied. - `__builtin_bcdtruncate` – Truncates a packed decimal value to a specified number of digits. - `__builtin_bcdunsignedtruncate` – Truncates a packed decimal value and returns the result as an unsigned packed decimal. - `__builtin_bcdunsignedshift` – Shifts an unsigned packed decimal value by a specified number of digits. > Note: These built-in functions are valid only when all of the following conditions are met: > -qarch is set to utilize POWER9 technology. > The bcd.h file is included. ## Prototypes ```c vector unsigned char __builtin_bcdshift(vector unsigned char, int, unsigned char); vector unsigned char __builtin_bcdshiftround(vector unsigned char, int, unsigned char); vector unsigned char __builtin_bcdtruncate(vector unsigned char, int, unsigned char); vector unsigned char __builtin_bcdunsignedtruncate(vector unsigned char, int); vector unsigned char __builtin_bcdunsignedshift(vector unsigned char, int); ``` ---------	2026-01-21 21:34:06 +05:30
Matt Arsenault	2c9cc88e25	FastISel: Thread LibcallLoweringInfo through (#176799 ) Boilerplate change to prepare to take LibcallLoweringInfo from an analysis. For now, it just sets it from the copy inside of TargetLowering.	2026-01-19 20:44:48 +00:00
Maryam Moghadas	196548988e	[PowerPC] Add support for AMO store builtins (#170933 ) This commit adds 4 Clang builtins for PowerPC AMO store operations: __builtin_amo_stwat for 32-bit unsigned operations __builtin_amo_stdat for 64-bit unsigned operations __builtin_amo_stwat_s for 32-bit signed operations __builtin_amo_stdat_s for 64-bit signed operations and maps GCC's AMO store functions to these Clang builtins for compatibility.	2026-01-19 10:58:32 -05:00
Sam Elliott	2042887709	Reland "[NFC][MI] Tidy Up RegState enum use (1/2)" (#176277 ) This Change is to prepare to make RegState into an enum class. It: - Updates documentation to match the order in the code. - Brings the `get<>RegState` functions together and makes them `constexpr`. - Adopts the `get<>RegState` where RegStates were being chosen with ternary operators in backend code. - Introduces `hasRegState` to make querying RegState easier once it is an enum class. - Adopts `hasRegState` where equivalent was done with bitwise arithmetic. - Introduces `RegState::NoFlags`, which will be used for the lack of flags. - Documents that `0x1` is a reserved flag value used to detect if someone is passing `true` instead of flags (due to implicit bool to unsigned conversions). - Updates two calls to `MachineInstrBuilder::addReg` which were passing `false` to the flags operand, to no longer pass a value. - Documents that `getRegState` seems to have forgotten a call to `getEarlyClobberRegState`. This PR relands llvm/llvm-project#176091 (commit 1d616cdca3aba9d22f120888bb6b09b75ca90b92) which was reverted in llvm/llvm-project#176190 (commit 6309cd8668fc2ae589f156b23f86821f4ce5b7ea).	2026-01-16 13:05:06 -08:00
Maryam Moghadas	6f8a7e79db	[PowerPC] Add AMO load builtins for conditional increment/decrement (#169435 ) This commit adds 4 Clang builtins for PowerPC AMO load conditional increment and decrement operations: __builtin_amo_lwat_cond for 32-bit unsigned operations __builtin_amo_ldat_cond for 64-bit unsigned operations __builtin_amo_lwat_cond_s for 32-bit signed operations __builtin_amo_ldat_cond_s for 64-bit signed operations	2026-01-16 12:11:45 -05:00
Akshay Deodhar	3860147a7f	[NFC][TargetLowering] Make shouldExpandAtomicRMWInIR and shouldExpandAtomicCmpXchgInIR take a const Instruction pointer (#176073 ) Splits out change from https://github.com/llvm/llvm-project/pull/176015 Changes shouldExpandAtomicRMWInIR to take a constant argument: This is to allow some other TargetLowering constant-argument functions to call it. This change touches several backends. An alternative solution exists, but to me, this seems the "right" way.	2026-01-15 14:22:57 -08:00
Sam Elliott	6309cd8668	Revert "[NFC][MI] Tidy Up RegState enum use (1/2)" (#176190 ) Reverts llvm/llvm-project#176091 Reverting because some compilers were erroring on the call to `Reg.isReg()` (which is not `constexpr`) in a `constexpr` function.	2026-01-15 07:58:05 -08:00
Sam Elliott	1d616cdca3	[NFC][MI] Tidy Up RegState enum use (1/2) (#176091 ) This Change is to prepare to make RegState into an enum class. It: - Updates documentation to match the order in the code. - Brings the `get<>RegState` functions together and makes them `constexpr`. - Adopts the `get<>RegState` where RegStates were being chosen with ternary operators in backend code. - Introduces `hasRegState` to make querying RegState easier once it is an enum class. - Adopts `hasRegState` where equivalent was done with bitwise arithmetic. - Introduces `RegState::NoFlags`, which will be used for the lack of flags. - Documents that `0x1` is a reserved flag value used to detect if someone is passing `true` instead of flags (due to implicit bool to unsigned conversions). - Updates two calls to `MachineInstrBuilder::addReg` which were passing `false` to the flags operand, to no longer pass a value. - Documents that `getRegState` seems to have forgotten a call to `getEarlyClobberRegState`.	2026-01-15 07:47:05 -08:00
RolandF77	057c7a79e3	[PowerPC] Add type checking for DMF insert (#172078 ) Create PPCISD nodes for DMF DMXXINSTDMR512 and DMXXINSTDMR256 operations to allow type checking.	2026-01-08 12:34:43 -05:00
Kevin Per	98b82f90df	[PowerPC]: Add check for cast when shufflevector (#172443 ) The crash happens because the cast for `Mask = cast<ShuffleVectorSDNode>(Res)->getMask();` fails for node `t197: v16i8 = vector_shuffle<16,17,18,19,4,5,6,7,8,9,10,11,u,u,u,u> t196, t196`. However, both `LHS` and `RHS` are the same node, so `DAG.getCommutedVectorShuffle` doesn't return a `ShuffleVectorSDNode` and crashes. The fix is to add a check before the cast is performed. Closes https://github.com/llvm/llvm-project/issues/172265	2025-12-18 17:14:01 +08:00
Frederik Harwath	6ad41bcc49	[CodeGen] expand-fp: Change frem expansion criterion (#158285 ) The existing condition for checking whether or not to expand an frem instruction in expand-fp is not sufficiently precise. The expansion on targets other than AMDGPU - which is the only intended user right now - is only prevented due to the interaction with the MaxLegalFpConvertBitWidth check. Relying on this is conceptually wrong and limits the use of the pass for other targets and further expansions (e.g. merging with the similar ExpandLargeDivRem pass). Change the expansion criterion to always expand frem of a given type for targets that use "Expand" as the legalization action for the underlying scalar type and use this to exit the pass early for targets which do not require any expansions. This requires changing the frem legalization action from "Expand" to "LibCall" for all targets which do not want frem to be expanded in this pass. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-12-16 17:31:26 +01:00
paperchalice	c05ba635c4	[PowerPC] Use the same lowering rule for vector rounding instructions (#166307 ) They should have the same lowering rule.	2025-12-09 00:49:14 +00:00
Sean Fertile	7dfe599bda	Fix VarArgs FixedStack object on AIX. (#170240 ) Create a mutable aliased fixed stack object for the va_list when any of the optional arguments are passed in gprs. Since we need to spill the gpr registers into the parameter save area the stack object is not immutable, and since the values will almost certainly be accessed through the IR value for a va_list make the stack object aliased as well.	2025-12-08 14:36:45 -05:00
YunQiang Su	c6f45f51fb	PowerPC/VSX: Select FMINNUM and FMAXNUM (#135739 ) In LangRef, we claim that FMINNUM and FMAXNUM should follow the minNum and maxNum operators in IEEE754-2008. PowerPC/VSX does have these instructions XSMINDP and XSMAXDP. Now we use FMINNUM_IEEE and FMAXNUM_IEEE, since they are used by the non-arch expand codes now. In future, we may replace all FMINNUM_IEEE/FMAXNUM_IEEE with FMINNUM and FMAXNUM. --------- Co-authored-by: Your Name <you@example.com>	2025-12-08 13:18:52 +08:00
Maryam Moghadas	f650330665	[PowerPC] Add initial support for AMO load builtins (#168746 ) This commit adds two Clang builtins for PowerPC AMO load operations: __builtin_amo_lwat for 32-bit unsigned operations __builtin_amo_ldat for 64-bit unsigned operations Also adds an amo.h header that maps GCC's AMO functions to these Clang builtins for compatibility.	2025-12-03 17:47:56 -05:00
Robert Imschweiler	5c3c0020af	[NFC] Refactor TargetLowering::getTgtMemIntrinsic to take CallBase parameter (#170334 ) cf. https://github.com/llvm/llvm-project/pull/133907#discussion_r2578576548	2025-12-02 19:42:31 +01:00
Kazu Hirata	0917a38c69	[PowerPC] Fix a warning This patch fixes: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:15676:17: error: unused variable 'CC' [-Werror,-Wunused-variable]	2025-11-25 11:22:31 -08:00
zhijian lin	0c9c62adf1	[PowerPC] Convert `(setcc (and X, 1), 0, eq)` to `XORI (and X, 1), 1` (#168384 ) Convert `(setcc (and X, 1), 0, eq)` to `XORI (and X, 1), 1`, which saves one instruction.	2025-11-25 13:16:39 -05:00
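The equivalence behind this fold is easy to check on scalars: `(X & 1)` is either 0 or 1, so comparing it with 0 yields the same bit as flipping it with XOR 1. A small sketch, with hypothetical function names:

```cpp
#include <cassert>
#include <cstdint>

// setcc form: 1 when the low bit of X is clear, otherwise 0.
constexpr uint32_t setccForm(uint32_t X) { return (X & 1u) == 0 ? 1u : 0u; }

// xori form: flip the already-isolated low bit.
constexpr uint32_t xoriForm(uint32_t X) { return (X & 1u) ^ 1u; }

static_assert(setccForm(0) == xoriForm(0), "even input");
static_assert(setccForm(1) == xoriForm(1), "odd input");

void demoSetccFold() {
  for (uint32_t X = 0; X < 1000; ++X)
    assert(setccForm(X) == xoriForm(X));
  assert(xoriForm(0xFFFFFFFEu) == 1u);
}
```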
Himadhith	e4a4bb0f6d	[PowerPC] Replace vspltisw+vadduwm instructions with xxleqv+vsubuwm for adding the vector {1, 1, 1, 1} (#160882 ) This patch optimizes vector addition operations involving `all-ones` vectors by generating a vector of -1s (using `xxleqv`), which is cheaper than generating a vector of 1s (using `vspltisw`). These are the respective vector types. `v2i64`: `A + vector {1, 1}` `v4i32`: `A + vector {1, 1, 1, 1}` `v8i16`: `A + vector {1, 1, 1, 1, 1, 1, 1, 1}` `v16i8`: `A + vector {1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1}` The optimized version replaces `vspltisw (4 cycles)` with `xxleqv (2 cycles)` using the following identity: `A - (-1) = A + 1`. --------- Co-authored-by: himadhith <himadhith.v@ibm.com> Co-authored-by: Tony Varghese <tonypalampalliyil@gmail.com>	2025-11-21 12:26:58 +05:30
Jim Lin	5c5c83d8bc	[PowerPC] Fix Wparentheses warning PPCISelLowering.cpp:15567:27: warning: suggest parentheses around '&&' within '\|\|' [-Wparentheses] 15567 \| CC == ISD::SETEQ && "CC mus be ISD::SETNE or ISD::SETEQ"); \| ~~~~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~	2025-11-21 13:37:01 +08:00
Matt Arsenault	a757c4e74e	CodeGen: Add subtarget to TargetLoweringBase constructor (#168620 ) Currently LibcallLoweringInfo is defined inside of TargetLowering, which is owned by the subtarget. Pass in the subtarget so we can construct LibcallLoweringInfo with the subtarget. This is a temporary step that should be revertable in the future, after LibcallLoweringInfo is moved out of TargetLowering.	2025-11-19 19:18:13 +00:00
Aditi Medhane	fa50a684c5	[PowerPC] Add custom lowering for SADD overflow for i32 and i64 (#159255 ) This patch improves the codegen for saddo on i32 and i64 in both 32-bit and 64-bit modes by custom lowering. It implements signed-add overflow detection using the `(x eqv y) & (sum xor x)` bit-level sequence.	2025-11-19 10:11:51 +05:30
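The `(x eqv y) & (sum xor x)` test can be written out for a 32-bit scalar. The sketch below uses a hypothetical helper, `saddOverflows32`, and is not the backend's code. Here `eqv` is the complement of XOR, so its sign bit is set when both operands have the same sign; the second term has its sign bit set when the wrapped sum's sign differs from `x`. Overflow occurred exactly when both hold.

```cpp
#include <cassert>
#include <cstdint>

// Signed-add overflow check using only bitwise operations on the operands
// and the wrapped sum; the answer is the sign bit of the final AND.
bool saddOverflows32(int32_t X, int32_t Y) {
  const uint32_t UX = static_cast<uint32_t>(X);
  const uint32_t UY = static_cast<uint32_t>(Y);
  const uint32_t Sum = UX + UY;      // wrapping add, well defined for unsigned
  const uint32_t Eqv = ~(UX ^ UY);   // sign bit set: X and Y share a sign
  const uint32_t Flip = Sum ^ UX;    // sign bit set: sum's sign differs from X
  return ((Eqv & Flip) >> 31) != 0;
}

void demoSaddo() {
  assert(saddOverflows32(INT32_MAX, 1));
  assert(saddOverflows32(INT32_MIN, -1));
  assert(!saddOverflows32(1, 2));
  assert(!saddOverflows32(-1, 1)); // mixed signs can never overflow
}
```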
Sergei Barannikov	43dacd07f6	[PowerPC] TableGen-erate SDNode descriptions (#168108 ) This allows SDNodes to be validated against their expected type profiles and reduces the number of changes required to add a new node. The validation functionality has detected several issues, see `PPCSelectionDAGInfo::verifyTargetNode()`. Most of the nodes have a description in `*.td` files and were successfully "imported". Those that don't have a description are listed in the enum in `PPCSelectionDAGInfo.td`. These nodes are not validated. Part of #119709. Pull Request: https://github.com/llvm/llvm-project/pull/168108	2025-11-17 22:58:26 +00:00
zhijian lin	55aff64d2c	[PowerPC] fold i128 equality/inequality compares of two loads into a vectorized compare using vcmpequb.p when Altivec is available (#158657 ) The patch adds a 16-byte load size to PPCTTIImpl::enableMemCmpExpansion and folds i128 equality/inequality compares of two loads into a vectorized compare using vcmpequb.p when Altivec is available. Rationale: A scalar i128 SETCC (eq/ne) normally lowers to multiple scalar ops. On VSX-capable subtargets, we can instead reinterpret the i128 loads as v16i8 vectors and use the AltiVec vcmpequb.p instruction to perform a full 128-bit equality check in a single vector compare. Example Result: This transformation replaces memcmp(a, b, 16) with two vector loads and one vector compare instruction.	2025-11-13 10:06:36 -05:00
RolandF77	411ea8e9dd	[PowerPC] Lowering support for EVL type VP_LOAD/VP_STORE (#165910 ) Map EVL type VP_LOAD/VP_STORE for fixed length vectors to PPC load/store with length.	2025-11-07 10:27:46 -05:00
Shimin Cui	531fd45e92	[PPC] Set minimum of largest number of comparisons to use bit test for switch lowering (#155910 ) Currently it is considered suitable to lower to a bit test for a set of switch case clusters when the number of unique destinations (`NumDests`) and the number of total comparisons (`NumCmps`) satisfy: `(NumDests == 1 && NumCmps >= 3) \|\| (NumDests == 2 && NumCmps >= 5) \|\| (NumDests == 3 && NumCmps >= 6)` However it is found for some cases on powerpc, for example, when NumDests is 3, and the number of comparisons for each destination is all 2, it's not profitable to lower the switch to bit test. This is to add an option to set the minimum of largest number of comparisons to use bit test for switch lowering. --------- Co-authored-by: Shimin Cui <scui@xlperflep9.rtp.raleigh.ibm.com>	2025-10-28 10:24:32 -04:00
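The suitability rule quoted in the commit above, and the case it calls unprofitable, can be sketched as follows. `isSuitableForBitTests` restates the quoted condition. `passesMinLargestCmps` is a guess at how the new option might be applied: its name, its parameters, and the reading that the largest per-destination comparison count must reach a configured minimum are assumptions, not the actual implementation.

```cpp
#include <cassert>

// The quoted condition on unique destinations and total comparisons.
bool isSuitableForBitTests(unsigned NumDests, unsigned NumCmps) {
  return (NumDests == 1 && NumCmps >= 3) ||
         (NumDests == 2 && NumCmps >= 5) ||
         (NumDests == 3 && NumCmps >= 6);
}

// Hypothetical extra gate: additionally require that the destination with
// the most comparisons has at least MinLargestCmps of them.
bool passesMinLargestCmps(unsigned NumDests, unsigned NumCmps,
                          unsigned LargestCmpsPerDest,
                          unsigned MinLargestCmps) {
  return isSuitableForBitTests(NumDests, NumCmps) &&
         LargestCmpsPerDest >= MinLargestCmps;
}

void demoBitTest() {
  // Three destinations with two comparisons each: six comparisons in total,
  // which satisfies the quoted condition.
  assert(isSuitableForBitTests(3, 6));
  // With an assumed minimum of 3, the same cluster set is rejected.
  assert(!passesMinLargestCmps(3, 6, /*LargestCmpsPerDest=*/2,
                               /*MinLargestCmps=*/3));
}
```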
paperchalice	15d11ebc84	[NFC] "unsafe-fp-math" post cleanup (code comments part) (#164582 )	2025-10-22 11:07:23 +00:00
zhijian lin	7aa6c62bdb	[PowerPC] Hint branch `bne-` for atomic operation after the store-conditional (#152529 ) The branches emitted for atomic operations after the store-conditional are currently not hinted, even though they should be. According to the Power10 Processor Chip User’s Manual: ` “Without static prediction, if the lock is not acquired in the first iteration, the branch history mechanism works to update the prediction to predict taken; that is, predict lock acquisition failure and cause more lwarx traffic for the next iteration.”` This patch addresses the issue by adding explicit branch hints for atomic operations after the store-conditional.	2025-10-21 09:37:30 -04:00
paperchalice	26feb1a9f1	[PowerPC] Remove `UnsafeFPMath` uses (#154901 ) Try to remove `UnsafeFPMath` uses in PowerPC backend. These global flags block some improvements like https://discourse.llvm.org/t/rfc-honor-pragmas-with-ffp-contract-fast/80797. Remove them incrementally. FP operations that may raise exceptions are replaced by constrained intrinsics. However, vector types are not supported by these intrinsics.	2025-10-21 19:01:34 +08:00
Tony Varghese	60ee515b8c	[PowerPC] Emit lxvkq and vsrq instructions for build vector patterns (#157625 ) ### Optimize BUILD_VECTOR having special quadword patterns This change optimizes `BUILD_VECTOR` operations by using the `lxvkq` or `xxspltib + vsrq` instructions to inline constants matching specific 128-bit patterns: - MSB set pattern: `0x8000_0000_0000_0000_0000_0000_0000_0000` - LSB set pattern: `0x0000_0000_0000_0000_0000_0000_0000_0001` ### Implementation Details The `lxvkq` instruction loads special quadword values into VSX registers: ```asm lxvkq XT, UIM # When UIM=16: loads 0x8000_0000_0000_0000_0000_0000_0000_0000 ``` The optimization reconstructs the 128-bit register pattern from `BUILD_VECTOR` operands, accounting for target endianness. For example, the MSB pattern can be represented as: - Big-Endian: `<i64 -9223372036854775808, i64 0>` - Little-Endian: `<i64 0, i64 -9223372036854775808>` Both produce the same register value: `0x8000_0000_0000_0000_0000_0000_0000_0000` ### MSB Pattern (`0x8000...0000`) All vector types (`v2i64`, `v4i32`, `v8i16`, `v16i8`) generate: ```asm lxvkq v2, 16 ``` ### LSB Pattern (`0x0000...0001`) All vector types generate: ```asm xxspltib v2, 255 vsrq v2, v2, v2 ``` --------- Co-authored-by: Tony Varghese <tony.varghese@ibm.com>	2025-10-15 10:54:04 +05:30
AZero13	07eeb5f08d	[PowerPC] Lower ucmp using subtractions (#146446 ) Source: Hacker's delight, page 21. Using the carry, we can use contractions to use the ucmp.	2025-10-11 12:34:30 +09:00
Nikita Popov	8b824f3b3e	[PowerPC] Avoid working on deleted node in ext bool trunc combine (#160050 ) This code was already creating HandleSDNodes to handle the case where a node gets replaced with an equivalent node. However, the code before the handles are created also performs RAUW operations, which can end up CSEing and deleting nodes. Fix this issue by moving the handle creation earlier. Fixes https://github.com/llvm/llvm-project/issues/160040.	2025-09-22 21:37:13 +02:00
RolandF77	1eb575dcae	[PowerPC] Fix vector extend result types in BUILD_VECTOR lowering (#159398 ) The result type of the vector extend intrinsics generated by the BUILD_VECTOR lowering code should match how they are actually defined. Currently the result type is defaulting to the operand type there. This can conflict with calls to the same intrinsic from other paths.	2025-09-19 10:43:22 -04:00
Kazu Hirata	d77aafbeee	[PowerPC] Remove an unnecessary cast (NFC) (#156599 ) getSExtValue already returns int64_t.	2025-09-03 07:48:36 -07:00
Tony Varghese	3fc1aad65b	[PowerPC] Merge vsr(vsro(input, byte_shift), bit_shift) to vsrq(input, res_bit_shift) (#154388 ) This change implements a patfrag based pattern matching ~dag combiner~ that combines consecutive `VSRO (Vector Shift Right Octet)` and `VSR (Vector Shift Right)` instructions into a single `VSRQ (Vector Shift Right Quadword)` instruction on Power10+ processors. Vector right shift operations like `vec_srl(vec_sro(input, byte_shift), bit_shift)` generate two separate instructions `(VSRO + VSR)` when they could be optimised into a single `VSRQ` instruction that performs the equivalent operation. ``` vsr(vsro (input, vsro_byte_shift), vsr_bit_shift) to vsrq(input, vsrq_bit_shift) where vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift ``` Note: ``` vsro : Vector Shift Right by Octet VX-form - vsro VRT, VRA, VRB - The contents of VSR[VRA+32] are shifted right by the number of bytes specified in bits 121:124 of VSR[VRB+32]. - Bytes shifted out of byte 15 are lost. - Zeros are supplied to the vacated bytes on the left. - The result is placed into VSR[VRT+32]. vsr : Vector Shift Right VX-form - vsr VRT, VRA, VRB - The contents of VSR[VRA+32] are shifted right by the number of bits specified in bits 125:127 of VSR[VRB+32]. 3 bits. - Bits shifted out of bit 127 are lost. - Zeros are supplied to the vacated bits on the left. - The result is placed into VSR[VRT+32], except if, for any byte element in VSR[VRB+32], the low-order 3 bits are not equal to the shift amount, then VSR[VRT+32] is undefined. vsrq : Vector Shift Right Quadword VX-form - vsrq VRT,VRA,VRB - Let src1 be the contents of VSR[VRA+32]. Let src2 be the contents of VSR[VRB+32]. - src1 is shifted right by the number of bits specified in the low-order 7 bits of src2. - Bits shifted out the least-significant bit are lost. - Zeros are supplied to the vacated bits on the left. - The result is placed into VSR[VRT+32]. 
``` --------- Co-authored-by: Tony Varghese <tony.varghese@ibm.com>	2025-09-01 10:14:12 +05:30
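The shift-amount arithmetic in the commit above, `vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift`, can be checked with a small software model of a 128-bit register. Everything here is a sketch: `U128`, `shiftRightBits`, the `*Model` functions and `mergedShift` are hypothetical names, the shift amounts are passed as plain integers rather than read from a vector register, and the undefined-result case of `vsr` is not modelled.

```cpp
#include <cassert>
#include <cstdint>

// 128-bit value as two 64-bit halves; Hi holds the most significant bits.
struct U128 { uint64_t Hi, Lo; };

// Logical right shift by 0..127 bits, filling with zeros from the left.
U128 shiftRightBits(U128 V, unsigned Bits) {
  Bits &= 127;
  if (Bits == 0)
    return V;
  if (Bits >= 64)
    return {0, V.Hi >> (Bits - 64)};
  return {V.Hi >> Bits, (V.Lo >> Bits) | (V.Hi << (64 - Bits))};
}

// vsro shifts by whole bytes (0..15), vsr by bits (0..7), vsrq by 0..127 bits.
U128 vsroModel(U128 V, unsigned Bytes) { return shiftRightBits(V, (Bytes & 15) * 8); }
U128 vsrModel(U128 V, unsigned Bits) { return shiftRightBits(V, Bits & 7); }
U128 vsrqModel(U128 V, unsigned Bits) { return shiftRightBits(V, Bits & 127); }

// Combined amount from the commit message: bytes * 8 + bits.
unsigned mergedShift(unsigned Bytes, unsigned Bits) {
  return (Bytes & 15) * 8 + (Bits & 7);
}

void demoVsrq() {
  const U128 In = {0x0123456789ABCDEFull, 0xFEDCBA9876543210ull};
  const U128 TwoStep = vsrModel(vsroModel(In, 9), 5);
  const U128 OneStep = vsrqModel(In, mergedShift(9, 5));
  assert(TwoStep.Hi == OneStep.Hi && TwoStep.Lo == OneStep.Lo);
}
```

The largest combined amount is 15 * 8 + 7 = 127, which fits in the 7-bit shift field that `vsrq` reads.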
RolandF77	d1cbe6ed74	[PowerPC] Add DMF builtins for build and disassemble (#153097 ) Add support for PPC Dense Math builtins mma_build_dmr and mma_disassemble_dmr builtins.	2025-08-25 12:14:55 -04:00
Matt Arsenault	65d12622fa	RuntimeLibcalls: Add entries for stackprotector globals (#154930 ) Add entries for __stack_chk_guard, __ssp_canary_word, __security_cookie, and __guard_local. As far as I can tell these are all just different names for the same shaped functionality on different systems. These aren't really functions, but special global variable names. They should probably be treated the same way; all the same contexts that need to know about emittable function names also need to know about this. This avoids a special case check in IRSymtab. This isn't a complete change, there's a lot more cleanup which should be done. The stack protector configuration system is a complete mess. There are multiple overlapping controls, used in 3 different places. Some of the target control implementations overlap with conditions used in the emission points, and some use correlated but not identical conditions in different contexts. i.e. useLoadStackGuardNode, getIRStackGuard, getSSPStackGuardCheck and insertSSPDeclarations are all used in inconsistent ways so I don't know if I've tracked the intention of the system correctly. The PowerPC test change is a bug fix on linux. Previously the manual conditions were based around !isOSOpenBSD, which is not the condition where __stack_chk_guard is used. Now getSDagStackGuard returns the proper global reference, resulting in LOAD_STACK_GUARD getting a MachineMemOperand which allows scheduling.	2025-08-23 10:21:00 +09:00
Kazu Hirata	11b4f110e0	[llvm] Remove unused includes of SmallSet.h (NFC) (#154893 ) We just replaced SmallSet<T , N> with SmallPtrSet<T , N>, bypassing the redirection found in SmallSet.h. With that, we no longer need to include SmallSet.h in many files.	2025-08-22 10:33:46 -07:00
DanilaZhebryakov	0a3ee7de9c	[PowerPC] fix bug affecting float to int32 conversion on LE PowerPC (#150194 ) When moving fcti results from float registers to normal registers through memory, even though MPI was adjusted to account for endianness, FIPtr was always adjusted for big-endian, which caused loads of the wrong half of a value in little-endian mode.	2025-08-20 12:37:14 +02:00
Nikita Popov	fea7e6934a	[PowerPC] Remove custom original type tracking (NFCI) (#154090 ) The OrigTy is passed to CC lowering nowadays, so use it directly instead of custom pre-analysis.	2025-08-20 12:36:23 +02:00
Kazu Hirata	07eb7b7692	[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068 ) This patch replaces SmallSet<T , N> with SmallPtrSet<T , N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType, N> : public SmallPtrSet<PointeeType, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.	2025-08-18 07:01:29 -07:00
Sergei Barannikov	aa2fe4eb3d	[PowerPC] Remove some unused SDNodes and FastISel workaround (NFC) (#153964 ) These nodes have never been used since introduction in 2013/2015.	2025-08-16 17:01:03 +00:00
Nikita Popov	01bc742185	[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817 ) This ensures that the required fields are set, and also makes the construction more convenient.	2025-08-15 18:06:07 +02:00


1986 Commits