llvm-project

Author	SHA1	Message	Date
RolandF77	1eb575dcae	[PowerPC] Fix vector extend result types in BUILD_VECTOR lowering (#159398 ) The result type of the vector extend intrinsics generated by the BUILD_VECTOR lowering code should match how they are actually defined. Currently the result type is defaulting to the operand type there. This can conflict with calls to the same intrinsic from other paths.	2025-09-19 10:43:22 -04:00
Kazu Hirata	d77aafbeee	[PowerPC] Remove an unnecessary cast (NFC) (#156599 ) getSExtValue already returns int64_t.	2025-09-03 07:48:36 -07:00
Tony Varghese	3fc1aad65b	[PowerPC] Merge vsr(vsro(input, byte_shift), bit_shift) to vsrq(input, res_bit_shift) (#154388 ) This change implements a patfrag based pattern matching ~dag combiner~ that combines consecutive `VSRO (Vector Shift Right Octet)` and `VSR (Vector Shift Right)` instructions into a single `VSRQ (Vector Shift Right Quadword)` instruction on Power10+ processors. Vector right shift operations like `vec_srl(vec_sro(input, byte_shift), bit_shift)` generate two separate instructions `(VSRO + VSR)` when they could be optimised into a single `VSRQ` instruction that performs the equivalent operation. ``` vsr(vsro (input, vsro_byte_shift), vsr_bit_shift) to vsrq(input, vsrq_bit_shift) where vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift ``` Note: ``` vsro : Vector Shift Right by Octet VX-form - vsro VRT, VRA, VRB - The contents of VSR[VRA+32] are shifted right by the number of bytes specified in bits 121:124 of VSR[VRB+32]. - Bytes shifted out of byte 15 are lost. - Zeros are supplied to the vacated bytes on the left. - The result is placed into VSR[VRT+32]. vsr : Vector Shift Right VX-form - vsr VRT, VRA, VRB - The contents of VSR[VRA+32] are shifted right by the number of bits specified in bits 125:127 of VSR[VRB+32]. 3 bits. - Bits shifted out of bit 127 are lost. - Zeros are supplied to the vacated bits on the left. - The result is placed into VSR[VRT+32], except if, for any byte element in VSR[VRB+32], the low-order 3 bits are not equal to the shift amount, then VSR[VRT+32] is undefined. vsrq : Vector Shift Right Quadword VX-form - vsrq VRT,VRA,VRB - Let src1 be the contents of VSR[VRA+32]. Let src2 be the contents of VSR[VRB+32]. - src1 is shifted right by the number of bits specified in the low-order 7 bits of src2. - Bits shifted out of the least-significant bit are lost. - Zeros are supplied to the vacated bits on the left. - The result is placed into VSR[VRT+32].
``` --------- Co-authored-by: Tony Varghese <tony.varghese@ibm.com>	2025-09-01 10:14:12 +05:30
RolandF77	d1cbe6ed74	[PowerPC] Add DMF builtins for build and disassemble (#153097 ) Add support for PPC Dense Math builtins mma_build_dmr and mma_disassemble_dmr builtins.	2025-08-25 12:14:55 -04:00
Matt Arsenault	65d12622fa	RuntimeLibcalls: Add entries for stackprotector globals (#154930 ) Add entries for __stack_chk_guard, __ssp_canary_word, __security_cookie, and __guard_local. As far as I can tell these are all just different names for the same shaped functionality on different systems. These aren't really functions, but special global variable names. They should probably be treated the same way; all the same contexts that need to know about emittable function names also need to know about this. This avoids a special case check in IRSymtab. This isn't a complete change, there's a lot more cleanup which should be done. The stack protector configuration system is a complete mess. There are multiple overlapping controls, used in 3 different places. Some of the target control implementations overlap with conditions used in the emission points, and some use correlated but not identical conditions in different contexts. i.e. useLoadStackGuardNode, getIRStackGuard, getSSPStackGuardCheck and insertSSPDeclarations are all used in inconsistent ways so I don't know if I've tracked the intention of the system correctly. The PowerPC test change is a bug fix on linux. Previously the manual conditions were based around !isOSOpenBSD, which is not the condition where __stack_chk_guard is used. Now getSDagStackGuard returns the proper global reference, resulting in LOAD_STACK_GUARD getting a MachineMemOperand which allows scheduling.	2025-08-23 10:21:00 +09:00
Kazu Hirata	11b4f110e0	[llvm] Remove unused includes of SmallSet.h (NFC) (#154893 ) We just replaced SmallSet<T *, N> with SmallPtrSet<T *, N>, bypassing the redirection found in SmallSet.h. With that, we no longer need to include SmallSet.h in many files.	2025-08-22 10:33:46 -07:00
DanilaZhebryakov	0a3ee7de9c	[PowerPC] fix bug affecting float to int32 conversion on LE PowerPC (#150194 ) When moving fcti results from float registers to normal registers through memory, even though MPI was adjusted to account for endianness, FIPtr was always adjusted for big-endian, which caused loads of the wrong half of the value in little-endian mode.	2025-08-20 12:37:14 +02:00
Nikita Popov	fea7e6934a	[PowerPC] Remove custom original type tracking (NFCI) (#154090 ) The OrigTy is passed to CC lowering nowadays, so use it directly instead of custom pre-analysis.	2025-08-20 12:36:23 +02:00
Kazu Hirata	07eb7b7692	[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068 ) This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.	2025-08-18 07:01:29 -07:00
Sergei Barannikov	aa2fe4eb3d	[PowerPC] Remove some unused SDNodes and FastISel workaround (NFC) (#153964 ) These nodes have never been used since introduction in 2013/2015.	2025-08-16 17:01:03 +00:00
Nikita Popov	01bc742185	[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817 ) This ensures that the required fields are set, and also makes the construction more convenient.	2025-08-15 18:06:07 +02:00
Nikita Popov	e92b7e9641	[CodeGen] Provide original IR type to CC lowering (NFC) (#152709 ) It is common to have ABI requirements for illegal types: For example, two i64 argument parts that originally came from an fp128 argument may have a different call ABI than ones that came from an i128 argument. The current calling convention lowering does not provide access to this information, so backends come up with various hacks to support it (like additional pre-analysis cached in CCState, or bypassing the default logic entirely). This PR adds the original IR type to InputArg/OutputArg and passes it down to CCAssignFn. It is not actually used anywhere yet, this just does the mechanical changes to thread through the new argument.	2025-08-11 08:57:53 +02:00
Nikita Popov	406d9b1dd6	[CodeGen] Move IsFixed into ArgFlags (NFCI) (#152319 ) The information whether a specific argument is vararg or fixed is currently stored separately from all the other argument information in ArgFlags. This means that it is not accessible from CCAssign, and backends have developed all kinds of workarounds for how they can access it after all. Move this information to ArgFlags to make it directly available in all relevant places. I've opted to invert this and store it as IsVarArg, as I think that both makes the meaning more obvious and provides for a better default (which is IsVarArg=false).	2025-08-07 09:12:40 +02:00
Sean Fertile	ab40909810	Implement the trampoline intrinsics and nest parameter for AIX. (#149388 ) We can expand the init intrinsic to create a descriptor for the nested procedure by combining the entry point and TOC pointer from the global descriptor with the nest argument. The normal indirect call sequence then calls the nested procedure through the descriptor like all other calls. Patch also implements support for a nest parameter by mapping it to gpr 11.	2025-08-06 12:15:27 -04:00
zhijian lin	23b3203113	[POWERPC] Fixes an error in the handling of the MTVSRBMI instruction for big-endian (#151565 ) The patch fixes a bug introduced by the patch [[PowePC] using MTVSRBMI instruction instead of constant pool in power10+](https://github.com/llvm/llvm-project/pull/144084#top). The issue arose because the layout of vector register elements differs between little-endian and big-endian modes: the elements appear in reverse order. This led to incorrect behavior when loading constants using MTVSRBMI in big-endian configurations.	2025-08-06 09:36:37 -04:00
Paul Walker	94d374ab6c	[LLVM][CGP] Allow finer control for sinking compares. (#151366 ) Compare sinking is selectable based on the result of hasMultipleConditionRegisters. This function is too coarse grained by not taking into account the differences between scalar and vector compares. This PR extends the interface to take an EVT to allow finer control. The new interface is used by AArch64 to disable sinking of scalable vector compares, but with isProfitableToSinkOperands updated to maintain the cases that are specifically tested.	2025-08-05 11:43:41 +01:00
Fangrui Song	d6c2e53151	MCSymbolXCOFF: Migrate away from classof The object file format specific derived classes are used in context where the type is statically known. We don't use isa/dyn_cast and we want to eliminate MCSymbol::Kind in the base class.	2025-08-03 18:18:44 -07:00
Amy Kwan	f48a8da342	[AIX] Handle arbitrary sized integers when lowering formal arguments passed on the stack (#149351 ) When arbitrary sized (non-simple type, or non-power of two types) integers are passed on the stack, these integers are not handled when lowering formal arguments on AIX as we always assume we will encounter simple type integers. However, it is possible for frontends to generate arbitrary sized immediate values in IR. Specifically in rustc, it will generate an integer value in LLVM IR for small structures that are less than a pointer size, which is done for optimization purposes for the Rust ABI. For example, if a Rust structure of three characters is passed into a function on the stack, ``` struct my_struct { field1: u8, field2: u8, field3: u8, } ``` This will generate an `i24` type in LLVM IR. Currently, it is not obvious for the backend to distinguish an integer versus something that wasn't an integer to begin with (such as a struct), and the latter case would not have an extend on the parameter. Thus, this PR allows us to perform a truncation and extend on integers, both non-simple and simple types.	2025-08-01 08:01:26 -04:00
Boyao Wang	697beb3f17	[TargetLowering] Change getOptimalMemOpType and findOptimalMemOpLowering to take LLVM Context (#147664 ) Add LLVM Context to getOptimalMemOpType and findOptimalMemOpLowering. So that we can use EVT::getVectorVT to generate EVT type in getOptimalMemOpType. Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).	2025-07-10 11:11:09 +08:00
Dominik Steenken	acdf1c7526	[DAG] Add generic expansion for ISD::FCANONICALIZE nodes (#142105 ) This PR takes the work previously done by @pawan-nirpal-031 on X86 in #106370, and makes it available in common code. This should enable all targets to use `__builtin_canonicalize` for all `f(16|32|64|128)` data types. Canonicalization is implemented here as multiplication by `1.0`, as suggested in [the docs](https://llvm.org/docs/LangRef.html#llvm-canonicalize-intrinsic).	2025-07-08 16:12:17 +01:00
Matt Arsenault	d8ef156379	DAG: Remove verifyReturnAddressArgumentIsConstant (#147240 ) The intrinsic argument is already marked with immarg so non-constant values are rejected by the IR verifier.	2025-07-07 16:28:47 +09:00
Kazu Hirata	f46c1d6bcc	[PowerPC] Fix a warning This patch fixes: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:9588:16: error: unused variable 'NumOps' [-Werror,-Wunused-variable]	2025-07-04 07:53:29 -07:00
zhijian lin	45909ec469	[PowePC] using MTVSRBMI instruction instead of constant pool in power10+ (#144084 ) The MTVSRBMI instruction sets each byte of a VSR to 0x00 or 0xFF based on a bit mask. Using this instruction instead of a constant pool load can reduce the asm code size and the number of instructions on power10.	2025-07-04 10:07:03 -04:00
Jie Fu	25d52fbf96	[PowerPC] Prevent copying in loop variables (NFC) /data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:5769:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct] for (const auto [Reg, N] : RegsToPass) ^ /data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:5769:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying for (const auto [Reg, N] : RegsToPass) ^~~~~~~~~~~~~~~~~~~~~ & /data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6193:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct] for (const auto [Reg, N] : RegsToPass) { ^ /data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6193:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying for (const auto [Reg, N] : RegsToPass) { ^~~~~~~~~~~~~~~~~~~~~ & /data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6806:19: error: loop variable '[Reg, N]' creates a copy from type 'std::pair<unsigned int, llvm::SDValue> const' [-Werror,-Wrange-loop-construct] for (const auto [Reg, N] : RegsToPass) { ^ /data/llvm-project/llvm/lib/Target/PowerPC/PPCISelLowering.cpp:6806:8: note: use reference type 'std::pair<unsigned int, llvm::SDValue> const &' to prevent copying for (const auto [Reg, N] : RegsToPass) { ^~~~~~~~~~~~~~~~~~~~~ & 3 errors generated.	2025-06-29 10:21:00 +08:00
Kazu Hirata	bad5a740e1	[PowerPC] Use range-based for loops (NFC) (#146221 )	2025-06-28 13:04:08 -07:00
Wael Yehia	735d721de4	[PowerPC] Fix handling of undefs in the PPC::isSplatShuffleMask query (#145149 ) Currently, the query assumes that a single undef byte implies the rest of the `EltSize - 1` bytes are undefs, but that's not always true. e.g. isSplatShuffleMask( <0,1,2,3,4,5,6,7,undef,undef,undef,undef,0,1,2,3>, 8) should return false. --------- Co-authored-by: Wael Yehia <wyehia@ca.ibm.com>	2025-06-23 13:22:33 -04:00
Matt Arsenault	48155f93dd	CodeGen: Emit error if getRegisterByName fails (#145194 ) This avoids using report_fatal_error and standardizes the error message in a subset of the error conditions.	2025-06-23 16:33:35 +09:00
Nikita Popov	7ea7ccd24d	[PowerPC][AIX] Specify pointer info and alignment for stack store (#144526 ) When lowering call arguments to stack, specify a stack MPI, as well as the stack alignment, instead of using the defaults (which would be an unknown location with ABI alignment). I believe the asm diffs are just changes in scheduling.	2025-06-18 10:50:17 +02:00
zhijian lin	85a9f2e148	[PowerPC] enable AtomicExpandImpl::expandAtomicCmpXchg for powerpc (#142395 ) In PowerPC, the AtomicCmpXchgInst is lowered to ISD::ATOMIC_CMP_SWAP_WITH_SUCCESS. However, this node does not handle the weak attribute of AtomicCmpXchgInst. As a result, when compiling C++ atomic_compare_exchange_weak_explicit, the generated assembly includes a "reservation lost" loop — i.e., it branches back and retries if the stwcx. (store-conditional) fails. This differs from GCC’s codegen, which does not include that loop for weak compare-exchange. Since PowerPC uses LL/SC-style atomic instructions, the patch enables AtomicExpandImpl::expandAtomicCmpXchg for PowerPC. With this, the weak attribute is properly respected, and the "reservation lost" loop is removed for weak operations. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-06-13 09:14:48 -04:00
Matt Arsenault	55858527da	PowerPC: Move runtime libcall configuration to RuntimeLibcallsInfo (#142542 ) These should not be set in the TargetLowering constructor, RuntimeLibcalls needs to be accurate outside of codegen contexts.	2025-06-10 07:28:04 +09:00
RolandF77	5d6218d311	[PowerPC] extend smaller splats into bigger splats (with fix) (#142194 ) For pwr9, xxspltib is a byte splat with a range -128 to 127 - it can be used with a following vector extend sign to make splats of i16, i32, or i64 element size. For pwr8, vspltisw with a following vector extend sign can be used to make splats of i64 elements in the range -16 to 15. Add check for P8 to make sure the 64-bit vector ops are there.	2025-06-09 14:01:38 -04:00
Lei Huang	649020c680	[PowerPC] Change default for auto gen stxvp for cpu=future (#142826 ) For cpu=future, we want to auto generate stxvp instructions by default.	2025-06-09 12:34:50 -04:00
Hubert Tong	8f486254e4	Revert "[PowerPC] extend smaller splats into bigger splats (#141282 )" The subject commit causes the build to ICE on AIX: https://lab.llvm.org/buildbot/#/builders/64/builds/3890/steps/5/logs/stdio This reverts commit 7fa365843d9f99e75c38a6107e8511b324950e74.	2025-05-29 01:10:55 -04:00
RolandF77	7fa365843d	[PowerPC] extend smaller splats into bigger splats (#141282 ) For pwr9, xxspltib is a byte splat with a range -128 to 127 - it can be used with a following vector extend sign to make splats of i16, i32, or i64 element size. For pwr8, vspltisw with a following vector extend sign can be used to make splats of i64 elements in the range -16 to 15.	2025-05-28 10:11:28 -04:00
Kazu Hirata	9738373c0b	[PowerPC] Fix warnings This patch fixes: llvm/lib/Target/PowerPC/PPCISelLowering.cpp:11897:8: error: unused variable 'IsV2048i1' [-Werror,-Wunused-variable] llvm/lib/Target/PowerPC/PPCISelLowering.cpp:12035:8: error: unused variable 'IsV2048i1' [-Werror,-Wunused-variable]	2025-05-26 08:55:51 -07:00
Maryam Moghadas	a54300b32c	[PowerPC] Add load/store support for v2048i1 and DMF cryptography instructions (#136145 ) This commit adds support for loading and storing v2048i1 DMR pairs and introduces Dense Math Facility cryptography instructions: DMSHA2HASH, DMSHA3HASH, and DMXXSHAPAD, along with their corresponding intrinsics and tests.	2025-05-26 10:59:35 -04:00
RolandF77	bbca78fbcb	[PowerPC] vector shift word/double by element size - 1 use all ones (#139794 ) Vector shift word or double requires a shift amount vector of 31 or 63 which is too big for splat immediate and requires a multi-instruction sequence. However the PPC instructions only use 5 or 6 bits of the shift amount vector elements so an all ones mask, which we can generate efficiently, works.	2025-05-23 10:49:37 -04:00
RolandF77	99f0309669	[PowerPC] catch v2i64 shift left by 1 is add case (#138772 ) Catch missing case in PPC BE for v2i64 x << 1 and generate x + x.	2025-05-13 11:26:46 -04:00
zhijian lin	41647412c6	[PowerPC] Fix an LowerADDSUBO_CARRY error when converting carry bit for usubo_carry (#137809 ) In PowerPC, if a borrow occurs during a subtraction, the carry bit is zero (unset). The carry bit is set if no borrow occurs. For ISD::USUBO_CARRY, the nodes produce two results: the normal result of the addition or subtraction, and a boolean value that is 1 if and only if there is an outgoing carry or borrow. Therefore, we need to convert a 1 (which indicates a borrow in ISD::USUBO_CARRY) to 0 to match PowerPC's definition of borrow. Similarly, we need to convert a 0 (no borrow in ISD::USUBO_CARRY) to 1 for PowerPC. To perform this conversion, we use XOR 1 instead of XOR DAG.getAllOnesConstant(DL, CarryOp.getValueType()).	2025-04-30 10:39:09 -04:00
Craig Topper	e4d2ff5b01	[SelectionDAG][PowerPC] Remove setTruncatingStore from StoreSDNode. (#137667 ) Mutating a node after it has been created isn't a good idea. After e17f07c4debbe76f5ebcdeeda619e7438700e2ad, we have a version of setStore that can create a truncating indexed store. Use that instead of MorphNodeTo+setTruncatingStore in PowerPC. Unfortunately, if we return the newly created node, DAGCombiner will visit the node and change the constant. To prevent this, we use DCI.CombineTo and avoid adding the new node to the worklist.	2025-04-28 16:48:37 -07:00
RolandF77	a903c7b7f5	[PowerPC] Intrinsics and tests for dmr insert/extract (#135653 ) Add some intrinsics and LIT tests for PPC dmr insert/extract instructions.	2025-04-24 11:27:22 -04:00
Lei Huang	b518242156	[PowerPC] Fix instruction name for dmr insert (#134301 )	2025-04-04 15:56:30 -04:00
zhijian lin	1a540c3b8b	[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#133155 ) ISD::ADDC, ISD::ADDE, ISD::SUBC and ISD::SUBE are being deprecated in favor of ISD::UADDO_CARRY and ISD::USUBO_CARRY. This patch lowers UADDO, UADDO_CARRY, USUBO, and USUBO_CARRY.	2025-04-03 13:22:49 -04:00
Kazu Hirata	86c382514e	[Target] Construct SmallVector with ArrayRef (NFC) (#134019 )	2025-04-01 21:59:19 -07:00
Rahul Joshi	74b7abf154	[IRBuilder] Add new overload for CreateIntrinsic (#131942 ) Add a new `CreateIntrinsic` overload with no `Types`, useful for creating calls to non-overloaded intrinsics that don't need additional mangling.	2025-03-31 08:10:34 -07:00
Lei Huang	ade22fc1d9	[PowerPC] Support conversion between f16 and f128 (#130158 ) Enables conversion between f16 and f128. The conversion is expanded on pre-Power9 targets and uses HW instructions on Power9. Fixes https://github.com/llvm/llvm-project/issues/92866 Commandeer of: https://github.com/llvm/llvm-project/pull/97677 --------- Co-authored-by: esmeyi <esme.yi@ibm.com>	2025-03-19 10:19:57 -04:00
RolandF77	a73e591f33	[PowerPC] custom lower v1024i1 load/store (#126969 ) Support moving PPC dense math register values to and from storage with LLVM IR load/store.	2025-02-28 10:25:07 -05:00
David Tenty	aa9e519b24	Revert "[PowerPC] Deprecate uses of ISD::ADDC/ISD::ADDE/ISD::SUBC/ISD::SUBE (#116984 )" This reverts commit 7763119c6eb0976e4836f81c9876c49a36d46d73 (leaving the modifications from 03cb46d248b08).	2025-02-19 09:44:39 -05:00
Nikita Popov	03cb46d248	[CodeGen] Use getSignedConstant() in more places (#127501 ) Use getSignedConstant() in a few more places, based on a search of `\bgetConstant(-`. Most of these were fine as-is (e.g. because they work on 64-bits), but I think it's better to use getSignedConstant() consistently for negative numbers.	2025-02-18 09:29:25 +01:00
Craig Topper	256145b4b0	[PowerPC] Use getSignedTargetConstant in SelectOptimalAddrMode. (#127305 ) Fixes #127298.	2025-02-15 14:13:32 -08:00
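
The shift-amount identity quoted in commit 3fc1aad65b, `vsrq_bit_shift = (vsro_byte_shift * 8) + vsr_bit_shift`, can be checked with a small software model. The following is a minimal sketch and not code from the patch: `U128`, `shiftRight`, `modelVsro`, `modelVsr`, `modelVsrq`, `mergedShiftAmount`, and `mergeHolds` are hypothetical names, and the model assumes each instruction behaves as a plain logical right shift of a 128-bit value, ignoring how the shift amount is encoded in the VRB register.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical 128-bit value held as two 64-bit halves (illustration only).
struct U128 {
  uint64_t hi;
  uint64_t lo;
};

inline bool operator==(const U128 &a, const U128 &b) {
  return a.hi == b.hi && a.lo == b.lo;
}

// Logical right shift of the whole 128-bit value by n bits, 0 <= n <= 127.
inline U128 shiftRight(U128 v, unsigned n) {
  assert(n < 128 && "shift amount must fit in 7 bits");
  if (n == 0)
    return v;
  if (n >= 64)
    return {0, v.hi >> (n - 64)};
  return {v.hi >> n, (v.lo >> n) | (v.hi << (64 - n))};
}

// Simplified models: vsro shifts by whole bytes (0..15), vsr by bits (0..7),
// vsrq by the low-order 7 bits of the shift amount (0..127).
inline U128 modelVsro(U128 v, unsigned byteShift) {
  return shiftRight(v, (byteShift & 15u) * 8u);
}
inline U128 modelVsr(U128 v, unsigned bitShift) {
  return shiftRight(v, bitShift & 7u);
}
inline U128 modelVsrq(U128 v, unsigned bitShift) {
  return shiftRight(v, bitShift & 127u);
}

// The merged amount from the commit message.
inline unsigned mergedShiftAmount(unsigned byteShift, unsigned bitShift) {
  return byteShift * 8u + bitShift;
}

// Checks vsr(vsro(x, b), s) == vsrq(x, b * 8 + s) for every legal b and s.
inline bool mergeHolds(U128 input) {
  for (unsigned byteShift = 0; byteShift < 16; ++byteShift) {
    for (unsigned bitShift = 0; bitShift < 8; ++bitShift) {
      U128 twoStep = modelVsr(modelVsro(input, byteShift), bitShift);
      U128 oneStep =
          modelVsrq(input, mergedShiftAmount(byteShift, bitShift));
      if (!(twoStep == oneStep))
        return false;
    }
  }
  return true;
}
```

The largest merged amount is 15 * 8 + 7 = 127, which still fits in the 7 bits that `vsrq` reads, so under this model no combination of byte and bit shift falls outside the range of the single instruction.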


1947 Commits
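
The carry conversion described in commit 41647412c6 (a borrow of 1 in `ISD::USUBO_CARRY` corresponds to a PowerPC carry bit of 0, and the reverse) can be sketched with 32-bit arithmetic. This is an illustrative model only, not the lowering code: `SubResult`, `usuboCarry`, `ppcSubtractWithCarry`, `borrowToPPCCarry`, `borrowXorAllOnes`, and `conventionsAgree` are hypothetical names, and the comparison with an all-ones XOR assumes the flag is held in a type wider than one bit.

```cpp
#include <cassert>
#include <cstdint>

// Result value plus an outgoing flag (borrow or carry, depending on the model).
struct SubResult {
  uint32_t value;
  uint32_t flag;
};

// Generic convention: computes a - b - borrowIn, flag is 1 if a borrow occurred.
inline SubResult usuboCarry(uint32_t a, uint32_t b, uint32_t borrowIn) {
  assert(borrowIn <= 1);
  uint64_t wide = static_cast<uint64_t>(a) - b - borrowIn;
  return {static_cast<uint32_t>(wide),
          static_cast<uint32_t>((wide >> 32) & 1u)};
}

// PowerPC-style convention: computes a + ~b + carryIn, flag is the carry out,
// which is 1 when no borrow occurred.
inline SubResult ppcSubtractWithCarry(uint32_t a, uint32_t b,
                                      uint32_t carryIn) {
  assert(carryIn <= 1);
  uint64_t wide = static_cast<uint64_t>(a) + static_cast<uint32_t>(~b) + carryIn;
  return {static_cast<uint32_t>(wide), static_cast<uint32_t>(wide >> 32)};
}

// Conversion between the two conventions: flip only the low bit.
inline uint32_t borrowToPPCCarry(uint32_t borrow) { return borrow ^ 1u; }

// For comparison: XOR with all ones also flips every upper bit.
inline uint32_t borrowXorAllOnes(uint32_t borrow) {
  return borrow ^ 0xFFFFFFFFu;
}

// Both models give the same value, and their flags differ exactly by XOR 1.
inline bool conventionsAgree(uint32_t a, uint32_t b, uint32_t borrowIn) {
  SubResult generic = usuboCarry(a, b, borrowIn);
  SubResult ppc = ppcSubtractWithCarry(a, b, borrowToPPCCarry(borrowIn));
  return generic.value == ppc.value &&
         borrowToPPCCarry(generic.flag) == ppc.flag;
}
```

In this model `borrowToPPCCarry(1)` yields 0, while `borrowXorAllOnes(1)` yields `0xFFFFFFFE`, which is not a valid 0/1 flag.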