llvm-project

Author	SHA1	Message	Date
Maryam Moghadas	8a0cb9ac86	[PowerPC] Add custom lowering for ssubo (#111748 ) This patch is to improve the codegen for ssubo node for i32 in 64-bit mode by custom lowering.	2024-10-29 15:43:05 -04:00
Lei Huang	522f34cfff	[PowerPC] Expand global named register support (#113482 ) Enable all valid registers for intrinsics that read from and write to global named registers.	2024-10-24 10:05:18 -04:00
Lei Huang	a19f05b9ec	Revert "[PowerPC] Expand global named register support" (#113457 ) Reverts llvm/llvm-project#112603	2024-10-23 09:36:28 -04:00
Lei Huang	06d192925d	[PowerPC] Expand global named register support (#112603 ) Enable all valid registers for intrinsics that read from and write to global named registers.	2024-10-22 14:34:24 -04:00
RolandF77	fc59f2cc0f	[PowerPC] special case small int constant for custom scalar_to_vector (#109850 ) Special case small int constant in the PPC custom lowering of scalar_to_vector.	2024-10-21 12:19:07 -04:00
Zaara Syeda	c5ca1b8626	[PPC] Add custom lowering for uaddo (#110137 ) Improve the codegen for uaddo node for i64 in 64-bit mode and i32 in 32-bit mode by custom lowering.	2024-10-21 11:13:16 -04:00
Keith Packard	44b020a381	[PowerPC][ISelLowering] Support -mstack-protector-guard=tls (#110928 ) Add support for using a thread-local variable with a specified offset for holding the stack guard canary value. This supports both 32- and 64-bit PowerPC targets. This mirrors changes from #108942 but targeting PowerPC instead of RISCV. Because both of these PRs modify the same driver functions, this series is stacked on top of the RISC-V one. --------- Signed-off-by: Keith Packard <keithp@keithp.com>	2024-10-17 19:06:47 -07:00
Jay Foad	85c17e4092	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112706 ) Convert many instances of: Fn = Intrinsic::getOrInsertDeclaration(...); CreateCall(Fn, ...) to the equivalent CreateIntrinsic call.	2024-10-17 16:20:43 +01:00
Jay Foad	d9c95efb6c	[LLVM] Make more use of IRBuilder::CreateIntrinsic. NFC. (#112546 ) Convert almost every instance of: CreateCall(Intrinsic::getOrInsertDeclaration(...), ...) to the equivalent CreateIntrinsic call.	2024-10-16 15:43:30 +01:00
Rahul Joshi	fa789dffb1	[NFC] Rename `Intrinsic::getDeclaration` to `getOrInsertDeclaration` (#111752 ) Rename the function to reflect its correct behavior and to be consistent with `Module::getOrInsertFunction`. This is also in preparation of adding a new `Intrinsic::getDeclaration` that will have behavior similar to `Module::getFunction` (i.e, just lookup, no creation).	2024-10-11 05:26:03 -07:00
Zaara Syeda	22067a8eb4	[PowerPC] Fix assert exposed by PR 95931 in LowerBITCAST (#108062 ) Hit Assertion failed: Num < NumOperands && "Invalid child # of SDNode!" Fix by checking opcode and value type before calling getOperand.	2024-09-10 14:14:01 -04:00
Qiu Chaofan	06c331163e	[PowerPC] Implement llvm.set.rounding intrinsic (#67302 )	2024-09-10 14:30:31 +08:00
anjenner	4af249fe6e	Add usub_cond and usub_sat operations to atomicrmw (#105568 ) These both perform conditional subtraction, returning the minuend and zero respectively, if the difference is negative.	2024-09-06 16:19:20 +01:00
RolandF77	26ba186bd0	[PowerPC] Improve pwr7 codegen for v4i8 load (#104507 ) There are no partial vector loads on pwr7 so current v4i8 codegen is an int load then store to vector sized temp and re-load as vector. Try to use lfiwax to load 32 bits into an FP reg and take advantage of VSX FP and vector reg sharing to move the result to the right vector position.	2024-09-04 12:55:27 -04:00
Matt Arsenault	911b96058a	PPC: Custom lower ppcf128 is_fpclass if is_fpclass is custom (#105540 ) Unfortunately expandIS_FPCLASS is called directly in SelectionDAGBuilder depending on whether IS_FPCLASS is custom or not. This helps avoid ppc test regressions in a future patch where the custom lowering would be bypassed.	2024-08-29 14:01:54 +04:00
RolandF77	89bbcbe285	[PowerPC] fix legalization crash (#105563 ) If v2i64 scalar_to_vector is made custom, llc can crash in certain legalization cases where v2i64 vectors are injected, even if they weren't otherwise present. The code generated would be fine, but that operation is not handled in ReplaceNodeResults. Add handling.	2024-08-28 11:22:23 -04:00
Craig Topper	4b0c0ec6b8	[CodeGen] Use MCRegister for CCState::AllocateReg and CCValAssign::getReg. NFC (#106032 )	2024-08-26 11:40:25 -07:00
Craig Topper	e994494a59	[PowerPC] Use MathExtras helpers to simplify code. NFC (#104691 )	2024-08-17 23:12:52 -07:00
Kazu Hirata	3080c80671	[PowerPC] Use range-based for loops (NFC) (#104410 )	2024-08-15 17:58:46 -07:00
RolandF77	8b6e9de3dd	[PowerPC] improve P10 store forwarding on P7 scalar to vector (#102330 ) Try to make P7 code with scalar to vector operations that use store/re-load to run smoother on P10 by supplying enough store width to cover the load and allow hardware store forwarding.	2024-08-12 12:30:06 -04:00
Kazu Hirata	f4fb735840	[llvm] Construct SmallVector<SDValue> with ArrayRef (NFC) (#102578 )	2024-08-09 09:15:42 -07:00
Tim Gymnich	408d82d352	[PowerPC] Respect endianness when bitcasting to fp128 (#95931 ) Fixes #92246 Match the behaviour of `bitcast v2i64 (BUILD_PAIR %lo %hi)` when encountering `bitcast fp128 (BUILD_PAIR %lo %hi)` by inserting a missing swap of the arguments based on endianness. ### Current behaviour: fp128: bitcast fp128 (BUILD_PAIR %lo %hi) => BUILD_FP128 %lo %hi; BUILD_FP128 %lo %hi => MTVSRDD %hi %lo. v2i64: bitcast v2i64 (BUILD_PAIR %lo %hi) => BUILD_VECTOR %hi %lo; BUILD_VECTOR %hi %lo => MTVSRDD %lo %hi.	2024-08-08 08:51:04 +08:00
Qiu Chaofan	20957d2091	[AIX] Add -msave-reg-params to save arguments to stack (#97524 ) In the PowerPC ABI, the first few arguments are passed in registers, but their slots in the parameter save area are still reserved; arguments passed in memory go after the reserved locations. For debugging purposes, we may want to save a copy of the register-passed arguments into their correct places on the stack. The new option achieves this by adding a new function-level attribute and making argument lowering aware of it.	2024-07-24 20:58:37 +08:00
azhan92	1df4d866cc	[PowerPC] Add support for -mcpu=pwr11 / -mtune=pwr11 (#99511 ) This PR adds support for -mcpu=pwr11/power11 and -mtune=pwr11/power11 in clang and llvm.	2024-07-23 09:49:41 -04:00
Joseph Huber	615b7eeaa9	Reapply "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512 )" This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5. I moved the `ISD` dependencies into the CodeGen portion of the handling, it's a little awkward but it's the easiest solution I can think of for now.	2024-07-20 09:29:31 -05:00
NAKAMURA Takumi	740161a9b9	Revert "[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512 )" This reverts commit c05126bdfc3b02daa37d11056fa43db1a6cdef69. (llvmorg-19-init-17714-gc05126bdfc3b) See #99610	2024-07-20 12:36:57 +09:00
Matt Arsenault	0f0cfcff2c	CodeGen: Avoid some references to MachineFunction's getMMI (#99652 ) MachineFunction's probably should not include a backreference to the owning MachineModuleInfo. Most of these references were used just to query the MCContext, which MachineFunction already directly stores. Other contexts are using it to query the LLVMContext, which can already be accessed through the IR function reference.	2024-07-19 22:09:05 +04:00
Amara Emerson	f270a4dd66	[AArch64] Don't tail call memset if it would convert to a bzero. (#98969 ) Well, not quite that simple. We can tc memset since it returns the first argument but bzero doesn't do that and therefore we can end up miscompiling. This patch also refactors the logic out of isInTailCallPosition() into the callers. As a result memcpy and memmove are also modified to do the same thing for consistency. rdar://131419786	2024-07-17 01:31:52 -07:00
Joseph Huber	c05126bdfc	[LLVM][LTO] Factor out RTLib calls and allow them to be dropped (#98512 ) Summary: The LTO pass and LLD linker have logic in them that forces extraction and prevent internalization of needed runtime calls. However, these currently take all RTLibcalls into account, even if the target does not support them. The target opts-out of a libcall if it sets its name to nullptr. This patch pulls this logic out into a class in the header so that LTO / lld can use it to determine if a symbol actually needs to be kept. This is important for targets like AMDGPU that want to be able to use `lld` to perform the final link step, but does not want the overhead of uncalled functions. (This adds like a second to the link time trivially)	2024-07-16 06:22:09 -05:00
Joseph Huber	3f1a767572	[LLVM] Factor disabled Libcalls into the initializer (#98421 ) Summary: These Libcalls represent which functions are available to the backend. If a runtime call is not available, the target sets the name to `nullptr`. Currently, this logic is spread around the various targets. This patch pulls all of the locations that disable libcalls into the initializer. This patch is effectively NFC. The motivation behind this patch is that currently the LTO handling uses the list of all runtime calls to determine which functions cannot be internalized and must be extracted from static libraries. We do not want this to happen for libcalls that are not emitted by the backend. A follow-up patch will move out this logic so the LTO pass can know which rtlib calls are actually used by the backend.	2024-07-11 12:59:25 -05:00
Nikita Popov	9df71d7673	[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919 ) Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.	2024-06-28 08:36:49 +02:00
Kazu Hirata	0a5292ebbb	[PowerPC] Remove extraneous ArrayRef (NFC) (#96092 ) ArrayRef can be implicitly constructed from a C array while inferring its size.	2024-06-19 13:38:46 -07:00
Zarko Todorovski	0295c2ada4	[PowerPC][AIX] Support ByVals with greater alignment then pointer size (#93341 ) Implementation is NOT compatible with IBM XL C 16.1 and earlier but is compatible with GCC. It handles all ByVals with greater alignment than pointer width the same way IBM XL C handles ByVals that have vector members. For overaligned objects that do not contain vectors, IBM XL C does not align them properly if they are passed in the GPR argument registers. This patch was originally written by Sean Fertile @mandlebug. Previously on Phabricator https://reviews.llvm.org/D105659	2024-06-05 12:19:16 -04:00
zhijian lin	6127f15e5b	[PowerPC] option `-msoft-float` should not block the PC-relative address instruction (#92543 ) The Prefix instruction is introduced on PowerPC ISA3_1. In the PR, 1. `FeaturePrefixInstrs` does not imply `FeatureP8Vector` or `FeatureP9Vector`. 2. `FeaturePrefixInstrs` implies only `FeatureISA3_1`. 3. The prefix instructions `paddi` and `pli` have `Predicates = [PrefixInstrs]`. 4. The prefix instructions `plfs` and `plfd` have `Predicates = [PrefixInstrs, HasFPU]`. 5. The prefix instructions `plxv`, `plxssp` and `plxsd` have `Predicates = [PrefixInstrs, HasP10Vector]`. Fixes #62372	2024-05-29 10:53:00 -04:00
Chen Zheng	2143b7cd7d	[PowerPC]perform bitcast lowering only at 64 bit Performing bitcast lowering requires native 64-bit support; however, this is not true on 32-bit targets. Explicitly require a 64-bit target. Fixes #92233	2024-05-20 03:17:21 -04:00
Jay Foad	1650f1b3d7	Fix typo "indicies" (#92232 )	2024-05-15 13:10:16 +01:00
Felix (Ting Wang)	ea126aebdc	[PowerPC] Tune AIX shared library TLS model at function level (#84132 ) Under some circumstances (library loaded with the main program), the TLS initial-exec model can be applied to local-dynamic accesses. We could use a simple heuristic to decide the update at function level: * If the number of TLS local-dynamic accesses in the function is at or below a threshold, use the TLS initial-exec model. (The threshold, which defaults to 1, is controlled by a hidden option.)	2024-05-09 09:50:36 +08:00
Felix (Ting Wang)	09d51a841d	[PowerPC][AIX] Enable aix-small-local-dynamic-tls target attribute (#86641 ) Following the aix-small-local-exec-tls target attribute, this patch adds a target attribute for an AIX-specific option in llc that informs the compiler that it can use a faster access sequence for the local-dynamic TLS model (formally named aix-small-local-dynamic-tls) when TLS variables are less than ~32KB in size. The patch either produces an addi/la with a displacement off of module handle (return value from .__tls_get_mod) when the address is calculated, or it produces an addi/la followed by a load/store when the address is calculated and used for further accesses. --------- Co-authored-by: Amy Kwan <amy.kwan1@ibm.com>	2024-04-12 08:18:01 +08:00
Chen Zheng	053750c3b4	[PowerPC] Fix the undef register for VECINSERT If the V2 of the vector_shuffle is undef, the two vector inputs are expected to be the same when do the VECINSERT transformation. For now the first operand of VECINSERT is set to undef which is not right. This patch fixes this bug.	2024-04-11 04:01:07 -04:00
Chen Zheng	f33a6dcf95	[PPC][NFC] add an option for GatherAllAliasesMaxDepth (#87071 ) GatherAllAliases is time consuming. Add a debug option on PPC to control the complexity of the function. This is useful when debugging compile-time related issues.	2024-04-02 08:40:28 +08:00
Amy Kwan	a3efc53f16	[AIX][TLS] Produce a faster local-exec access sequence for the "aix-small-tls" global variable attribute (#83053 ) Similar to 3f46e5453d9310b15d974e876f6132e3cf50c4b1, this patch allows the backend to produce a faster access sequence for the local-exec TLS model, where loading from the TOC can be avoided, for local-exec TLS variables that are annotated with the "aix-small-tls" attribute. The expectation is for local-exec TLS variables to be set with this attribute through PGO. Furthermore, the optimized access sequence is only generated for local-exec TLS variables annotated with "aix-small-tls", only if they are less than ~32KB in size.	2024-03-28 09:18:45 -04:00
Qiu Chaofan	e5b20c83e5	[PowerPC] Update chain uses when emitting lxsizx (#84892 )	2024-03-18 22:31:05 +08:00
Qiu Chaofan	65ae09eeb6	[PowerPC] Fix behavior of rldimi/rlwimi/rlwnm builtins (#85040 ) rldimi is a 64-bit instruction, so the corresponding builtin should not be available in 32-bit mode. The rotate amount should be in range, and cases where the mask is zero need special handling. This change also swaps the first and second operands of rldimi/rlwimi to match previous behavior. For masks not ending at bit 63-SH, a rotation will be inserted before rldimi.	2024-03-18 14:17:16 +08:00
David Green	601e102bdb	[CodeGen] Use LocationSize for MMO getSize (#84751 ) This is part of #70452 that changes the type used for the external interface of MMO to LocationSize as opposed to uint64_t. This means the constructors take LocationSize, and convert ~UINT64_C(0) to LocationSize::beforeOrAfter(). The getSize methods return a LocationSize. This allows us to be more precise with unknown sizes, not accidentally treating them as unsigned values, and in the future should allow us to add proper scalable vector support but none of that is included in this patch. It should mostly be an NFC. Global ISel is still expected to use the underlying LLT as it needs, and are not expected to see unknown sizes for generic operations. Most of the changes are hopefully fairly mechanical, adding a lot of getValue() calls and protecting them with hasValue() where needed.	2024-03-17 18:15:56 +00:00
Arthur Eubanks	94c988bcfd	[NFC] Remove unused parameter from shouldAssumeDSOLocal()	2024-03-11 19:48:17 +00:00
Qiu Chaofan	906580bad3	[PowerPC] Add intrinsics for rldimi/rlwimi/rlwnm (#82968 ) These builtins are already there in Clang, however current codegen may produce suboptimal results due to their complex behavior. Implement them as intrinsics to ensure expected instructions are emitted.	2024-03-04 21:13:59 +08:00
Felix (Ting Wang)	5b05870953	[PowerPC] Support local-dynamic TLS relocation on AIX (#66316 ) Supports TLS local-dynamic on AIX and generates the following code sequence: ``` .tc foo[TC],foo[TL]@ld # Variable offset, ld relocation specifier .tc mh[TC],mh[TC]@ml # Module handle for the caller lwz 3,mh[TC](2) $$ For 64-bit: ld 3,mh[TC](2) bla .__tls_get_mod # Modifies r0,r3,r4,r5,r11,lr,cr0 #r3 = &TLS for module lwz 4,foo[TC](2) $$ For 64-bit: ld 4,foo[TC](2) add 5,3,4 # Compute &foo .rename mh[TC], "_$TLSML" # Symbol for the module handle must have the name "_$TLSML" ``` --------- Co-authored-by: tingwang <tingwang@tingwangs-MBP.lan> Co-authored-by: tingwang <tingwang@tingwangs-MacBook-Pro.local>	2024-03-01 08:09:40 +08:00
Kai Luo	d1924f0474	[PowerPC] Do not generate `isel` instruction if target doesn't have this instruction (#72845 ) When expanding `select_cc` in finalize-isel, we should not generate `isel` for targets that do not have it.	2024-03-01 08:03:06 +08:00
Kazu Hirata	ae46855f53	[Target] Use getConstantOperand (NFC)	2024-01-28 18:03:38 -08:00
Kazu Hirata	1f5934a901	[PowerPC] Directly call Instruction::getMetadata (NFC)	2024-01-28 18:03:36 -08:00


1876 Commits