llvm-project

Author	SHA1	Message	Date
zhijian lin	e99d8bb0dc	[PowerPC] eliminate RLWINM instruction following LBARX as possible (#144089 ) LBARX loads a byte from memory into a register, automatically setting the remaining bits of the register to zero. If a subsequent RLWINM instruction is used to clear those same bits (which LBARX has already set to zero), the RLWINM is redundant and can be eliminated. These redundant clear instructions are introduced by 85a9f2e14859b.	2025-06-25 09:26:40 -04:00
Shimin Cui	b1017a4b84	Use getSignedTargetConstant for offset (#141149 ) This is to fix an assertion failure with PeepholePPC64. The load/store offset can be negative. A reduced case from one of our failures is added as well.	2025-05-26 11:08:13 -04:00
Kazu Hirata	aa15596b5f	[llvm] Remove unused local variables (NFC) (#138478 )	2025-05-04 21:33:54 -07:00
Kazu Hirata	71935281e0	[Target] Use *Set::insert_range (NFC) (#132140 ) DenseSet, SmallPtrSet, SmallSet, SetVector, and StringSet recently gained C++23-style insert_range. This patch replaces: Dest.insert(Src.begin(), Src.end()); with: Dest.insert_range(Src); This patch does not touch custom begin like succ_begin for now.	2025-03-20 09:09:30 -07:00
Fangrui Song	5e0be962fe	[PowerPC] Support PIC Secure PLT for CALL_RM https://reviews.llvm.org/D111433 introduced PPCISD::CALL_RM for -frounding-math. -msecure-plt -frounding-math {-fpic,-fPIC} codegen for PPC32 became incorrect when a function contains function calls but no global variable references (GlobalBaseReg). As reported by @q66 , musl/src/dirent/closedir.c implements such a function, which is miscompiled. PPCISD::CALL has custom logic to set up the base register (https://reviews.llvm.org/D42112). Add an extra case for CALL_RM. While here, improve the test to * actually test `case PPCISD::CALL`: we need a non-leaf function that doesn't access global variables (global variables lead to GlobalBaseReg, which call `getGlobalBaseReg()` as well). * test `ExternalSymbolSDNode` with a memset. Supersedes: #72758 Pull Request: https://github.com/llvm/llvm-project/pull/121281	2025-01-06 08:59:42 -08:00
Craig Topper	f139bde8d8	[SelectionDAG] Move SDNode::use_iterator::getOperandNo to SDUse. (#120536 ) This allows us to write more range based for loops because we no longer need the iterator. It also matches IR's Use class.	2024-12-19 09:07:42 -08:00
Craig Topper	e6b2495545	[SelectionDAG] Split SDNode::use_iterator into user_iterator and use_iterator. (#120531 ) SDNode::use_iterator now returns an SDUse& when dereferenced. SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses work on use_iterator. SDNode::user_begin/user_end/users work on user_iterator. We can now write range based for loops using SDUse& and SDNode::uses(). I've converted many of these in this patch. I didn't update loops that have additional variables updated in their for statement. Some loops use SDNode::use_iterator::getOperandNo() which also prevents using range based for loops. I plan to move this into SDUse in a follow up patch.	2024-12-19 08:35:32 -08:00
Craig Topper	bd261ecc5a	[SelectionDAG] Add SDNode::user_begin() and use it in some places (#120509 ) Most of these are just places that want the first user and aren't iterating over the whole list. While there I changed some use_size() == 1 to hasOneUse() which is more efficient. This is part of an effort to rename use_iterator to user_iterator and provide a use_iterator that dereferences to SDUse&. This patch helps reduce the diff on later patches.	2024-12-18 22:13:04 -08:00
Craig Topper	104ad9258a	[SelectionDAG] Rename SDNode::uses() to users(). (#120499 ) This function is most often used in range based loops or algorithms where the iterator is implicitly dereferenced. The dereference returns an SDNode * of the user rather than SDUse * so users() is a better name. I've long been annoyed that we can't write a range based loop over SDUse when we need getOperandNo. I plan to rename use_iterator to user_iterator and add a use_iterator that returns SDUse& on dereference. This will make it more like IR.	2024-12-18 20:09:33 -08:00
Nikita Popov	04a2d50efd	[PPC] Use getSignedConstant() for frame index offset The offset is signed. Fixes assertion failure reported at: https://github.com/llvm/llvm-project/pull/117558#issuecomment-2504413074	2024-11-28 10:49:45 +01:00
Nikita Popov	157d847ba7	[PowerPC] Use getSignedConstant() where necessary (#117177 ) This is to prevent assertion failures when we disable implicit truncation in getConstant(). getCanonicalConstSplat() works with a mix of unsigned and signed values, so I explicitly truncate the APInt there.	2024-11-22 09:40:19 +01:00
Lei Huang	f895fc9550	[NFC][PowerPC] Add getScalarIntVT to return MVT based on arch (#115203 ) Add `getScalarIntVT()` to return scalar int VT based on if arch is 32 or 64bit.	2024-11-11 12:25:14 -05:00
Craig Topper	5e231ffe29	[PowerPC] Use APInt::getZExtValue() instead of getRawData(). NFC	2024-08-13 19:05:12 -07:00
Kai Luo	bf02f81da7	[PowerPC] Adjust operand order of ADDItoc to be consistent with other ADDI* nodes (#93642 ) Simultaneously, the `ADDItoc` machineinstr is generated in `PPCISelDAGToDAG::Select` so the pattern is not used and can be removed.	2024-06-06 17:15:53 +08:00
paperchalice	7652a59407	Reland "[NewPM][CodeGen] Port selection dag isel to new pass manager" (#94149 ) - Fix build with `EXPENSIVE_CHECKS` - Remove unused `PassName::ID` to resolve warning - Mark `~SelectionDAGISel` virtual so AArch64 backend can work properly	2024-06-04 08:10:58 +08:00
paperchalice	8917afaf0e	Revert "[NewPM][CodeGen] Port selection dag isel to new pass manager" (#94146 ) This reverts commit de37c06f01772e02465ccc9f538894c76d89a7a1 to de37c06f01772e02465ccc9f538894c76d89a7a1 It still breaks EXPENSIVE_CHECKS build. Sorry.	2024-06-02 14:31:52 +08:00
paperchalice	d2cdc8ab45	[NewPM][CodeGen] Port selection dag isel to new pass manager (#83567 ) Port selection dag isel to new pass manager. Only `AMDGPU` and `X86` support new pass version. `-verify-machineinstrs` in new pass manager belongs to verify instrumentation, it is enabled by default.	2024-06-02 09:12:33 +08:00
Zaara Syeda	194e7cc7aa	[PowerPC][AIX] 64-bit large code-model support for toc-data (#90619 ) This patch adds support for toc-data for 64-bit large code-model on AIX. The sequence ADDIStocHA8/ADDItocL8 is used to access the data directly from the TOC. When emitting the instruction ADDIStocHA8, we check if the symbol has toc-data attribute before creating a toc entry for it. When emitting the instruction ADDItocL8, we use the LA8 instruction to load the address.	2024-05-21 14:00:24 -04:00
Felix (Ting Wang)	19220110ac	[PowerPC][AIX] Refactor existing logic to handle non-zero offsets for aix-small-local-dynamic-tls (#89182 ) To enable optimized small local-dynamic access sequence for non-zero offsets, this patch refactors existing 2a50921553798d2db52ca6330c89f0f8a5bc2215.	2024-05-08 18:37:51 +08:00
Kazu Hirata	c18bcd0a57	[Target] Use StringRef::operator== instead of StringRef::equals (NFC) (#91072 ) (#91138 ) I'm planning to remove StringRef::equals in favor of StringRef::operator==. - StringRef::operator==/!= outnumber StringRef::equals by a factor of 38 under llvm/ in terms of their usage. - The elimination of StringRef::equals brings StringRef closer to std::string_view, which has operator== but not equals. - S == "foo" is more readable than S.equals("foo"), especially for !Long.Expression.equals("str") vs Long.Expression != "str".	2024-05-05 13:43:10 -07:00
Lei Huang	520ccca2f9	NFC: fix clang format spacing and documentation (#90775 ) Some minor fixes to clean up tabs and language in code documentation.	2024-05-02 12:06:44 -04:00
Zaara Syeda	76ad289748	[PowerPC] 32-bit large code-model support for toc-data (#85129 ) This patch adds the pseudo op ADDItocL for 32-bit large code-model support for toc-data.	2024-04-17 09:24:53 -04:00
Amy Kwan	a3efc53f16	[AIX][TLS] Produce a faster local-exec access sequence for the "aix-small-tls" global variable attribute (#83053 ) Similar to 3f46e5453d9310b15d974e876f6132e3cf50c4b1, this patch allows the backend to produce a faster access sequence for the local-exec TLS model, where loading from the TOC can be avoided, for local-exec TLS variables that are annotated with the "aix-small-tls" attribute. The expectation is for local-exec TLS variables to be set with this attribute through PGO. Furthermore, the optimized access sequence is only generated for local-exec TLS variables annotated with "aix-small-tls", only if they are less than ~32KB in size.	2024-03-28 09:18:45 -04:00
Sean Fertile	2d80505401	[AIX] Support per global code model. (#79202 ) Exploit the per global code model attribute on AIX. On AIX we need to update both the code sequence used to access the global (either 1 or 2 instructions for small and large code model respectively) and the storage mapping class that we emit the toc entry. --------- Co-authored-by: Amy Kwan <akwan0907@gmail.com>	2024-03-15 12:52:04 -04:00
Zaara Syeda	cc761a7c35	[PowerPC][NFC] Rename ADDItocL to match the 64-bit naming convention (#85099 ) In preparation for adding a similar instruction for large code model on AIX for 32-bit, rename the existing 64-bit ADDItocL instruction to ADDItocL8 to match the naming convention of other instructions with 32-bit and 64-bit variants.	2024-03-13 11:57:07 -04:00
Zaara Syeda	37b5eb0a0a	[AIX][TOC] Add -mtocdata/-mno-tocdata options on AIX (#67999 ) This patch enables support that the XL compiler had for AIX under -qdatalocal/-qdataimported.	2024-03-13 10:26:31 -04:00
Qiu Chaofan	292d9e869f	[PowerPC] Mask constant operands in ValueBit tracking (#67653 ) In IR or C code, shift amount larger than value size is undefined behavior. But in practice, backend lowering for shift_parts produces add/sub of shift amounts, thus constant shift amounts might be negative or larger than value size, which depends on ISA definition. PowerPC ISA says, the lowest 7 bits (6 bits for 32-bit instruction) will be taken, and if the highest among them is 1, result will be zero, otherwise the low 6 bits (or 5 on 32-bit) are used as shift amount. This commit emulates the behavior and avoids array overflow in bit permutation's value bits calculator.	2024-02-06 18:37:31 +08:00
Jie Fu	40f6b7d476	[PowerPC] Fix -Wunused-variable in PPCAsmPrinter.cpp and PPCISelDAGToDAG.cpp (NFC) llvm-project/llvm/lib/Target/PowerPC/PPCAsmPrinter.cpp:1648:15: error: unused variable 'InstDisp' [-Werror,-Wunused-variable] ptrdiff_t InstDisp = TLSVarAddress + Offset - Delta; ^ llvm-project/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp:7624:19: error: unused variable 'TPReg' [-Werror,-Wunused-variable] RegisterSDNode *TPReg = dyn_cast<RegisterSDNode>(TPRegNode.getNode()); ^ llvm-project/llvm/lib/Target/PowerPC/PPCISelDAGToDAG.cpp:7625:23: error: unused variable 'Subtarget' [-Werror,-Wunused-variable] const PPCSubtarget &Subtarget = ^	2024-02-01 22:50:14 +08:00
Amy Kwan	2a50921553	[AIX][TLS] Optimize the small local-exec access sequence for non-zero offsets (#71485 ) This patch utilizes the -maix-small-local-exec-tls option to produce a faster, non-TOC-based access sequence for the local-exec TLS model. Specifically, for when the offsets from the TLS variable are non-zero. In particular, this patch produces either a single: - addi/la with a displacement off of R13 plus a non-zero offset for when an address is calculated, or - load or store off of R13 plus a non-zero offset for when an address is calculated and used for further access where R13 is the thread pointer, respectively. In order to produce a single addi or load/store off of the thread pointer with a non-zero offset, this patch also adds the necessary support in the assembly printer when printing these instructions. Specifically: - The non-zero offset is added to the TLS variable address when the address of the TLS variable + its offset is less than 32KB. - Otherwise, when the address of the TLS variable + its offset is greater than 32KB, the non-zero offset (and a multiple of 64KB) is subtracted from the TLS address. This handling in the assembly printer is necessary to ensure that the TLS address + the non-zero offset is between [-32768, 32768), so that the total displacement can fit within the addi/load/store instructions. This patch is meant to be a follow-up to 3f46e5453d9310b15d974e876f6132e3cf50c4b1 (where the optimization occurs for when the offset is zero).	2024-02-01 09:29:21 -05:00
Zaara Syeda	a03a6e9964	[AIX] [XCOFF] Add support for common and local common symbols in the TOC (#79530 ) This patch adds support for common and local symbols in the TOC for AIX. Note that we need to update isVirtualSection so as a common symbol in TOC will have the symbol type XTY_CM and will be initialized when placed in the TOC so sections with this type are no longer virtual. --------- Co-authored-by: Zaara Syeda <syzaara@ca.ibm.com>	2024-01-31 16:34:21 -05:00
Nico Weber	184ca39529	[llvm] Move CodeGenTypes library to its own directory (#79444 ) Finally addresses https://reviews.llvm.org/D148769#4311232 :) No behavior change.	2024-01-25 12:01:31 -05:00
Alex Bradbury	197214e39b	[RFC][SelectionDAG] Add and use SDNode::getAsZExtVal() helper (#76710 ) This follows on from #76708, allowing `cast<ConstantSDNode>(N)->getZExtValue()` to be replaced with just `N->getAsZExtVal();` Introduced via `git grep -l "cast<ConstantSDNode>\(.\).getZExtValue" \| xargs sed -E -i 's/cast<ConstantSDNode>\((.*)\)->getZExtValue/\1->getAsZExtVal/'` and then using `git clang-format` on the result.	2024-01-09 12:25:17 +00:00
Alex Bradbury	80aeb62211	[llvm][NFC] Use SDValue::getConstantOperandVal(i) where possible (#76708 ) This helper function shortens examples like `cast<ConstantSDNode>(Node->getOperand(1))->getZExtValue();` to `Node->getConstantOperandVal(1);`. Implemented with: `git grep -l "cast<ConstantSDNode>\(.->getOperand\(.\)\)->getZExtValue\(\)" \| xargs sed -E -i 's/cast<ConstantSDNode>\((.)->getOperand\((.)\)\)->getZExtValue\(\)/\1->getConstantOperandVal(\2)/` and `git grep -l "cast<ConstantSDNode>\(.\.getOperand\(.\)\)->getZExtValue\(\)" \| xargs sed -E -i 's/cast<ConstantSDNode>\((.)\.getOperand\((.)\)\)->getZExtValue\(\)/\1.getConstantOperandVal(\2)/'`. With a couple of simple manual fixes needed. Result then processed by `git clang-format`.	2024-01-02 13:14:28 +00:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295 ) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Nick Desaulniers	86735a4353	reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66264 ) reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) This reverts commit ee643b706be2b6bef9980b25cc9cc988dab94bb5. Fix up build failures in targets I missed in #66003 Kept as 3 commits for reviewers to see better what's changed. Will squash when merging. - reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003) - fix all the targets I missed in #66003 - fix off by one found by llvm/test/CodeGen/SystemZ/inline-asm-addr.ll	2023-09-13 13:31:24 -07:00
Amy Kwan	3f46e5453d	[AIX][TLS] Produce a faster local-exec access sequence with -maix-small-local-exec-tls (And optimize when load/store offsets are 0) This patch utilizes the -maix-small-local-exec-tls option added in D155544 to produce a faster access sequence for the local-exec TLS model, where loading from the TOC can be avoided. The patch either produces an addi/la with a displacement off of r13 (the thread pointer) when the address is calculated, or it produces an addi/la followed by a load/store when the address is calculated and used for further accesses. This patch also optimizes this sequence a bit more where we can remove the addi/la when the load/store offset is 0. A follow up patch will be posted to account for when the load/store offset is non-zero, and currently in these situations we keep the addi/la that precedes the load/store. Furthermore, this access sequence is only performed for TLS variables that are less than ~32KB in size. Differential Revision: https://reviews.llvm.org/D155600	2023-09-07 20:05:29 -05:00
esmeyi	b85a9b3093	[PowerPC] Try to use less instructions to materialize 64-bit constant when High32=Low32. Summary: Materializing a 64-bit constant with High32=Low32 only requires 2 instructions instead of 3 when Low32 can be materialized in 1 instruction. Reviewed By: qiucf Differential Revision: https://reviews.llvm.org/D158495	2023-09-07 13:03:17 -04:00
Stefan Pintilie	492c1f3d7c	[PowerPC] Merge rotate and clear into single instruction. This patch tries to catch a codegen opportunity where the rotate and mask can be merged into a single RLDCL instruction. Reviewed By: lei, amyk Differential Revision: https://reviews.llvm.org/D158328	2023-09-07 09:25:41 -04:00
Kazu Hirata	69f3319cbe	[PowerPC] Use isNullConstant (NFC)	2023-08-21 08:19:28 -07:00
Sean Fertile	cef56b9318	Revert "[XCOFF][AIX] Peephole optimization for toc-data." This reverts commit 5e28d30f1fb10faf2db2f8bf0502e7fd72e6ac2e.	2023-08-15 10:40:35 -04:00
Sean Fertile	ce658829c9	Revert "[PPC][AIX] Fix toc-data peephole bug and some related cleanup." This reverts commit b37c7ed0c95c7f24758b1532f04275b4bb65d3c1.	2023-08-15 10:40:35 -04:00
Sean Fertile	b37c7ed0c9	[PPC][AIX] Fix toc-data peephole bug and some related cleanup. Set the ReplaceFlags variable to false, since there is code meant only for the ADDItocHi/ADDItocL nodes. This has the side effect of disabling the peephole when the load/store instruction has a non-zero offset. This patch also fixes retrieving the `ImmOpnd` node from the AIX small code model pseudos and does the same for the register operand node. This allows cleaning up the later calls to replaceOperands. Finally move calculating the MaxOffset into the code guarded by ReplaceFlags as it is only used there and the comment is specific to the ELF ABI. Fixes https://github.com/llvm/llvm-project/issues/63927 Differential Revision: https://reviews.llvm.org/D155957	2023-08-10 10:23:15 -04:00
Sean Fertile	5e28d30f1f	[XCOFF][AIX] Peephole optimization for toc-data. Followup to D101178 - peephole optimization that converts a load address instruction and a consuming load/store into just the load/store when its safe to do so. eg: converts the 2 instruction code sequence la 4, i[TD](2) stw 3, 0(4) to stw 3, i[TD](2) Differential Revision: https://reviews.llvm.org/D101470	2023-07-13 20:40:09 -04:00
Amy Kwan	598cccea80	[AIX][TLS] Generate optimized local-exec access code sequence using X-Form loads/stores This patch is a follow up to D149722, D152669 and D153645, where a slightly more optimized code sequence is generated for 64-bit and 32-bit local-exec accesses when optimizations are turned on. Handling is added to PPCISelDAGToDAG.cpp in order to check if any D-form loads or stores that follow a PPCISD::ADD_TLS can be optimized to use an X-Form load or store. In this particular situation, this allows the ADD_TLS node to be removed completely. Differential Revision: https://reviews.llvm.org/D150367	2023-07-06 07:57:05 -05:00
Amy Kwan	11b71ade51	[PowerPC][TLS] Add additional TLS X-Form loads/store instructions This patch is a follow up to D43315, and adds the following new load/store TLS specific instructions for integer and floating point scalar types: ``` LHAXTLS LWAXTLS LHAXTLS_32 LWAXTLS_32 LFSXTLS LFDXTLS STFSXTLS STFDXTLS ``` These instructions can be used to optimize TLS sequences when D-Form loads/stores follow an ADD_TLS instruction. Duplicate versions of these instructions are also added within an isAsmParserOnly=1 block (similar to D47382) to allow llvm-mc to assemble these instructions. Differential Revision: https://reviews.llvm.org/D153645	2023-06-27 11:33:38 -05:00
Elliot Goodrich	b0abd4893f	[llvm] Add missing StringExtras.h includes In preparation for removing the `#include "llvm/ADT/StringExtras.h"` from the header to source file of `llvm/Support/Error.h`, first add in all the missing includes that were previously included transitively through this header.	2023-06-25 15:42:22 +01:00
Craig Topper	6006d43e2d	LLVM_FALLTHROUGH => [[fallthrough]]. NFC Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D150996	2023-05-24 12:40:10 -07:00
NAKAMURA Takumi	c1221251fb	Restore CodeGen/MachineValueType.h from `Support` This is rework of; - rG13e77db2df94 (r328395; MVT) Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h` can be restored as well. Depends on D148767 Differential Revision: https://reviews.llvm.org/D149024	2023-05-03 00:13:20 +09:00
Kazu Hirata	53ead5215b	[Target] Use isNullConstant and isOneConstant (NFC)	2023-04-10 18:23:07 -07:00
Craig Topper	219ff07f72	[Targets] Rename Flag->Glue. NFC Long long ago Glue was called Flag, and it was never completely renamed.	2023-04-02 19:28:51 -07:00
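
The shift-amount rule described in commit 292d9e869f (take the lowest 7 bits of the amount, or 6 for 32-bit instructions; the result is zero if the highest of those bits is set; otherwise the remaining low bits are the shift amount) can be sketched as follows. This is only an illustration of the rule as worded in that commit message, not the code from the patch; the function names `emulateShl32` and `emulateShl64` are hypothetical.

```cpp
#include <cstdint>

// Hypothetical helper: emulates a 32-bit left shift under the rule quoted
// in the commit message, for any (possibly negative or oversized) amount.
uint32_t emulateShl32(uint32_t value, int64_t amount) {
  // Only the lowest 6 bits of the amount are taken for 32-bit shifts.
  uint32_t amt = static_cast<uint32_t>(amount) & 0x3F;
  // If the highest of those 6 bits is set, the result is zero.
  if (amt & 0x20)
    return 0;
  // Otherwise the low 5 bits are used as the shift amount.
  return value << (amt & 0x1F);
}

// Hypothetical helper: the 64-bit counterpart, using 7 bits / low 6 bits.
uint64_t emulateShl64(uint64_t value, int64_t amount) {
  uint64_t amt = static_cast<uint64_t>(amount) & 0x7F;
  if (amt & 0x40)
    return 0;
  return value << (amt & 0x3F);
}
```

Masking before shifting keeps the effective amount strictly below the value width, which is what avoids both C++ undefined behavior in the emulation and out-of-range indexing when tracking individual value bits.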

703 Commits