llvm-project

Author	SHA1	Message	Date
Sean Fertile	5e28d30f1f	[XCOFF][AIX] Peephole optimization for toc-data. Followup to D101178 - peephole optimization that converts a load address instruction and a consuming load/store into just the load/store when it's safe to do so. e.g. converts the 2 instruction code sequence la 4, i[TD](2) stw 3, 0(4) to stw 3, i[TD](2) Differential Revision: https://reviews.llvm.org/D101470	2023-07-13 20:40:09 -04:00
Amy Kwan	598cccea80	[AIX][TLS] Generate optimized local-exec access code sequence using X-Form loads/stores This patch is a follow up to D149722, D152669 and D153645, where a slightly more optimized code sequence is generated for 64-bit and 32-bit local-exec accesses when optimizations are turned on. Handling is added to PPCISelDAGToDAG.cpp in order to check if any D-form loads or stores that follow a PPCISD::ADD_TLS can be optimized to use an X-Form load or store. In this particular situation, this allows the ADD_TLS node to be removed completely. Differential Revision: https://reviews.llvm.org/D150367	2023-07-06 07:57:05 -05:00
Amy Kwan	11b71ade51	[PowerPC][TLS] Add additional TLS X-Form loads/store instructions This patch is a follow up to D43315, and adds the following new load/store TLS specific instructions for integer and floating point scalar types: ``` LHAXTLS LWAXTLS LHAXTLS_32 LWAXTLS_32 LFSXTLS LFDXTLS STFSXTLS STFDXTLS ``` These instructions can be used to optimize TLS sequences when D-Form loads/stores follow an ADD_TLS instruction. Duplicate versions of these instructions are also added within an isAsmParserOnly=1 block (similar to D47382) to allow llvm-mc to assemble these instructions. Differential Revision: https://reviews.llvm.org/D153645	2023-06-27 11:33:38 -05:00
Elliot Goodrich	b0abd4893f	[llvm] Add missing StringExtras.h includes In preparation for removing the `#include "llvm/ADT/StringExtras.h"` from the header to source file of `llvm/Support/Error.h`, first add in all the missing includes that were previously included transitively through this header.	2023-06-25 15:42:22 +01:00
Craig Topper	6006d43e2d	LLVM_FALLTHROUGH => [[fallthrough]]. NFC Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D150996	2023-05-24 12:40:10 -07:00
NAKAMURA Takumi	c1221251fb	Restore CodeGen/MachineValueType.h from `Support` This is a rework of: - rG13e77db2df94 (r328395; MVT) Since `LowLevelType.h` has been restored to `CodeGen`, `MachineValueType.h` can be restored as well. Depends on D148767 Differential Revision: https://reviews.llvm.org/D149024	2023-05-03 00:13:20 +09:00
Kazu Hirata	53ead5215b	[Target] Use isNullConstant and isOneConstant (NFC)	2023-04-10 18:23:07 -07:00
Craig Topper	219ff07f72	[Targets] Rename Flag->Glue. NFC Long long ago Glue was called Flag, and it was never completely renamed.	2023-04-02 19:28:51 -07:00
Simon Pilgrim	c7d844ea0f	[DAG] Use ISD::isBitwiseLogicOp in AND/OR/XOR checks. NFCI. There are additional cases we can clean up (mainly in target code), but this tries to clean up generic code and PPC which had an equivalent helper.	2023-03-13 13:39:02 +00:00
esmeyi	2224b53f06	[PowerPC] Improve materialization for immediates which is almost a 32 bit splat. Summary: Some 64 bit constants can be materialized with fewer instructions than we currently use. We consider a 64 bit immediate value divided into four parts, Hi16OfHi32 (bits 48...63), Lo16OfHi32 (bits 32...47), Hi16OfLo32 (bits 16...31), Lo16OfLo32 (bits 0...15). When any three parts are equal, the immediate can be treated as "almost" a splat of a 32 bit value in a 64 bit register. For such case, we can use 3 instructions to generate the splat and use 1 instruction to modify the different part: Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D139813	2023-01-31 06:02:17 -05:00
Kazu Hirata	e078201835	[Target] Use llvm::count{l,r}_{zero,one} (NFC)	2023-01-28 09:23:07 -08:00
Nick Desaulniers	b50327eea6	[llvm][PPCISelDAGToDAG] rename ppc-codegen to ppc-isel Every other subclass of SelectionDAGISel calls this pass "<arch>-isel". No existing tests refer to ppc-codegen so this is purely a cosmetic change to bring the pass name in line with other architecture's SelectionDAGISel subclasses. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D140497	2023-01-09 15:24:25 -08:00
Nick Desaulniers	19a004b468	[llvm][SelectionDAGISel] support -{start|stop}-{before|after}= for remaining targets Follow up to the series: 1. https://reviews.llvm.org/D140161 2. https://reviews.llvm.org/D140349 3. https://reviews.llvm.org/D140331 4. https://reviews.llvm.org/D140323 Completes the work from the previous two for remaining targets. This creates the following named passes that can be run via `llc -{start|stop}-{before|after}`: - arc-isel - arm-isel - avr-isel - bpf-isel - csky-isel - hexagon-isel - lanai-isel - loongarch-isel - m68k-isel - msp430-isel - mips-isel - nvptx-isel - ppc-codegen - riscv-isel - sparc-isel - systemz-isel - ve-isel - wasm-isel - xcore-isel A nice way to write tests for SelectionDAGISel might be to use a RUN: line like: llc -mtriple=<triple> -start-before=<arch>-isel -stop-after=finalize-isel -o - Fixes: https://github.com/llvm/llvm-project/issues/59538 Reviewed By: asb, zixuan-wu Differential Revision: https://reviews.llvm.org/D140364	2022-12-21 13:25:15 -08:00
Lei Huang	7a7e9109a2	[PowerPC] Implement P10 Byte Reverse Instructions Generate brh, brw and brd instructions for byte-swap operations on P10 and generate a single instruction for a 32-bit swap followed by a 16-bit right shift. Reviewed By: stefanp Differential Revision: https://reviews.llvm.org/D140414	2022-12-21 09:15:57 -06:00
Kai Nacke	44fe4e25e4	[PowerPC][NFC] Fix typos in PPCISelDAGToDAG Change: negtive -> negative is -> are Thanks to tschuett for finding these.	2022-12-16 16:34:46 +00:00
Craig Topper	c09edce1b3	[SelectionDAG] Give all the target specific subclasses of SelectionDAGISel their own pass ID. Previously we had a shared ID in SelectionDAGISel. AMDGPU has an initializePass function for its subclass of SelectionDAGISel. No other target does. This causes all target specific SelectionDAGISel passes to be known as "amdgpu-isel". I'm not sure what would happen if another target tried to implement an initializePass function too since the ID is already claimed. This patch gives all targets their own ID and passes it down to SelectionDAGISel constructor to MachineFunctionPass's constructor. Unfortunately, I think this causes most targets to lose print-before/after-all support for their SelectionDAGISel pass. And they probably no longer support start/stop-before/after. We can add initializePass functions to fix this as a follow up. NOTE: This was probably also broken if the AMDGPU target isn't compiled in. Step 1 to fixing PR59538. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D140161	2022-12-15 15:48:55 -08:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Nemanja Ivanovic	0d253bbd33	[PowerPC] Change CRNOT to a code gen single operand instruction Inputs to crnor can come from operands with chains so if it is being used simply to negate such an operand, the repeated input cannot be CSE'd. This patch just adds a code-gen only instruction for this that takes a single input and duplicates it in the encoding of the underlying crnor. Differential revision: https://reviews.llvm.org/D133577	2022-10-13 20:09:44 -05:00
Paul Scoropan	ce004fb4f2	[PowerPC] XCOFF exception section support on the direct assembler path This feature implements support for making entries in the exception section on XCOFF on the direct assembly path using the ".except" pseudo-op. It also provides functionality to lower entries (comprised of language and reason codes) into the exception section through the use of annotation metadata attached to llvm.ppc.trap/trapd/tw/tdw intrinsics. Integrated assembler support will be provided in another review. https://reviews.llvm.org/D133030 needs to merge first for LIT tests Reviewed By: shchenz, RKSimon Differential Revision: https://reviews.llvm.org/D132146	2022-09-26 22:24:20 -04:00
Kazu Hirata	86e8164a8f	[llvm] Qualify auto in range-based for loops (NFC) Identified with readability-qualified-auto.	2022-09-03 11:17:49 -07:00
Chen Zheng	d9004dfbab	[PowerPC] mapping hardware loop intrinsics to powerpc pseudo Map hardware loop intrinsics loop_decrement and set_loop_iteration to the new PowerPC pseudo instructions, so that the hardware loop intrinsics will be expanded to normal cmp+branch form or ctrloop form based on the CTR register usage on MIR level. Reviewed By: lkail Differential Revision: https://reviews.llvm.org/D123366	2022-08-08 21:34:20 -04:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00
esmeyi	28b1ba1c07	[PowerPC] Add an ISEL pattern for i32 MULLI. We add the following ISEL pattern for i64 imm in D87384, this patch is for i32. `mul with (2^N * int16_imm) -> MULLI + RLWINM` Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D129708	2022-07-18 04:40:51 -04:00
Kai Luo	5018a5dcbe	[PowerPC] Support huge frame size for PPC64 Support allocation of huge stack frame(>2g) on PPC64. For ELFv2 ABI on Linux, quoted from the spec 2.2.3.1 General Stack Frame Requirements > There is no maximum stack frame size defined. On AIX, XL allows such huge frame. Reviewed By: #powerpc, nemanjai Differential Revision: https://reviews.llvm.org/D107886	2022-06-06 09:08:28 +00:00
Amy Kwan	0bf3c38b0b	Fix build failure revealed by c35ca3a1c78f693b749ad11742350b7fc6c5cd89 This commit resolves a Linux kernel build failure that was revealed by c35ca3a1c78f693b749ad11742350b7fc6c5cd89. The patch introduces two new intrinsics, which ultimately changes the intrinsic numbering of other PPC intrinsics. This causes an issue introduced by ff40fb07ad6309131c2448ca00572a078c7a2d59, as the patch checks for intrinsics with particular values, but the addition of the fnabs/fnabss intrinsics updates the original sqrt/sdiv intrinsic values.	2022-05-24 16:32:04 -05:00
serge-sans-paille	989f1c72e0	Cleanup codegen includes This is a (fixed) recommit of https://reviews.llvm.org/D121169 after: 1061034926 before: 1063332844 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D121681	2022-03-16 08:43:00 +01:00
Nico Weber	a278250b0f	Revert "Cleanup codegen includes" This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20. Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang, and many LLVM tests, see comments on https://reviews.llvm.org/D121169	2022-03-10 07:59:22 -05:00
serge-sans-paille	7f230feeea	Cleanup codegen includes after: 1061034926 before: 1063332844 Differential Revision: https://reviews.llvm.org/D121169	2022-03-10 10:00:30 +01:00
Kazu Hirata	69ccc96162	[llvm] Use the default constructor for SDValue (NFC)	2022-01-01 10:36:59 -08:00
Chen Zheng	63cd1842a7	[PowerPC] use lvx + splat directly for aligned splat load Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D114062	2021-12-08 02:02:18 +00:00
Yousuf Ali	415e821a50	[PowerPC][AIX] Add toc-data support for 64-bit AIX small code model. The patch expands the existing 32-bit toc-data attribute support to 64-bit. In both 32-bit and 64-bit it is supported for small code model only. Differential Revision: https://reviews.llvm.org/D114654	2021-12-01 10:56:21 -05:00
Nemanja Ivanovic	c933c2eb33	[PowerPC] Add BCD add/sub/cmp builtins Support for builtins that use bcdadd./bcdsub. to add/subtract Binary Coded Decimal values as well as to determine validity and compare BCD values. Differential revision: https://reviews.llvm.org/D114088	2021-11-23 11:42:36 -06:00
Chen Zheng	9bda9a3980	[PowerPC] fix typos in comments, NFC	2021-11-18 08:55:23 +00:00
Kazu Hirata	609ccbb240	[PowerPC] Use SDNode::uses (NFC)	2021-11-13 08:34:22 -08:00
Jordan Rupprecht	da4822f6c8	[PowerPC][NFC] Ignore unused var in release builds. Note we can't inline this call into assert because `isIntS16Immediate` has a side effect. But we only look at the return value in asserts builds.	2021-11-11 08:57:40 -08:00
Victor Huang	18fe0a0d9e	[PowerPC] PPC backend optimization to lower int_ppc_tdw/int_ppc_tw intrinsics to TDI/TWI machine instructions This patch adds the backend optimization to match XL behavior for the two builtins __tdw and __tw that when the second input argument is an immediate, emitting tdi/twi instructions instead of td/tw. Reviewed By: nemanjai, amyk, PowerPC Differential revision: https://reviews.llvm.org/D112285	2021-11-11 09:52:00 -06:00
Chen Zheng	9695027066	[PowerPC] address post-commit comments for D106555; NFC Address nemanjai's post-commit comments.	2021-11-05 05:30:53 +00:00
Chen Zheng	5a8b196340	[PowerPC] handle more splat loads without stack operation This mostly improves splat loads code generation on Power7 Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D106555	2021-11-03 05:17:41 +00:00
Amy Kwan	5041a485b9	[PowerPC] Exploit Prefixed Load/Stores using the refactored Load/Store Implementation This patch exploits the prefixed load and store instructions utilizing the refactored load/store implementation introduced in D93370. Prefixed load and store instructions are emitted whenever we are loading or storing a value with an offset that fits into a 34-bit signed immediate. Patterns for the prefixed load and stores are added in this patch, as well as the implementation that detects when we are loading and storing a value with an offset that fits in 34-bits. Differential Revision: https://reviews.llvm.org/D96075	2021-09-14 08:39:49 -05:00
Amy Kwan	351a0d8a90	[PowerPC] Update PC-Relative Load/Store Patterns to use the refactored Load/Store Implementation This patch updates the PC-Relative load and store patterns to utilize the refactored load/store implementation introduced in D93370. PC-Relative implementation has been added to PPCISelLowering.cpp, and also the patterns in PPCInstrPrefix.td have been updated and no longer require AddedComplexity. All existing test cases pass with this update. Differential Revision: https://reviews.llvm.org/D95116	2021-09-09 15:38:42 -05:00
Craig Topper	9af8f1b18e	[SelectionDAG] Add isZero/isAllOnes methods to ConstantSDNode. Soft deprecate isNullValue/isAllOnesValue and update in tree callers. This matches the changes to the APInt interface from D109483. Reviewed By: lattner Differential Revision: https://reviews.llvm.org/D109535	2021-09-09 13:28:30 -07:00
Quinn Pham	e002d251dd	[PowerPC] Floating Point Builtins for XL Compat. This patch is in a series of patches to provide builtins for compatibility with the XL compiler. This patch adds builtins related to floating point operations Reviewed By: #powerpc, nemanjai, amyk, NeHuang Differential Revision: https://reviews.llvm.org/D103986	2021-07-21 08:33:39 -05:00
Arthur Eubanks	693bc04bf6	[OpaquePtr] Use GlobalValue::getValueType() more	2021-07-13 09:34:34 -07:00
Stefan Pintilie	54310fc176	[PowerPC] Add ROP Protection to prologue and epilogue Added hashst to the prologue and hashchk to the epilogue. The hash for the prologue and epilogue must always be stored as the first element in the local variable space on the stack. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D99377	2021-05-13 12:54:44 -05:00
Amy Kwan	64d951be61	[PowerPC] Add new infrastructure to select load/store instructions, update P8/P9 load/store patterns. This patch introduces a new infrastructure that is used to select the load and store instructions in the PPC backend. The primary motivation is that the current implementation of selecting load/stores is dependent on the ordering of patterns in TableGen. Given this limitation, we are not able to easily and reliably generate the P10 prefixed load and store instructions (such as when the immediates fit within 34 bits). This refactoring is meant to provide us with more control over the patterns/different forms to exploit, as well as eliminating the dependency on pattern declaration order in TableGen. The idea of this refactoring is that it introduces a set of addressing modes that correspond to different instruction formats of a particular load and store instruction, along with a set of common flags that describe a load/store. Whenever a load/store instruction is being selected, we analyze the instruction and compute a set of flags for it. The computed flags are then used to select the most optimal load/store addressing mode. This patch is the first of a series of patches to be committed - it contains the initial implementation of the refactored load/store selection infrastructure and also updates P8/P9 patterns to adopt this infrastructure. The idea is that incremental patches will add more implementation and support, and eventually the old implementation will be removed. Differential Revision: https://reviews.llvm.org/D93370	2021-04-30 09:53:19 -05:00
Sidharth Baveja	70c433a184	[XCOFF][AIX] Add Global Variables Directly to TOC for 32 bit AIX Summary: This patch implements the backend implementation of adding global variables directly to the table of contents (TOC), rather than adding the address of the variable to the TOC. Currently, this patch will look for the "toc-data" attribute on symbols in the IR, and then add those symbols to the TOC. ATM, this is implemented for 32 bit AIX. Reviewers: sfertile Differential Revision: https://reviews.llvm.org/D101178	2021-04-30 14:48:02 +00:00
Qiu Chaofan	ece7345859	[PowerPC] Lower f128 SETCC/SELECT_CC as libcall if p9vector disabled XSCMPUQP is not available for pre-P9 subtargets. This patch will lower them into libcall for correct behavior on power7/power8. Reviewed By: steven.zhang Differential Revision: https://reviews.llvm.org/D92083	2021-04-12 10:33:32 +08:00
Qiu Chaofan	033c9c2552	[PowerPC] Fix use check of swap-reduction This will fix swap-reduction in DAGISel for cases where COPY_TO_REGCLASS has multiple uses.	2021-04-07 15:55:52 +08:00
Amy Kwan	bd6033eca7	[PowerPC] Materialize 34-bit constants with pli directly Previously, 34-bit constants were materialized in selectI64Imm(), and we relied on td pattern matching to instead produce a pli. This becomes problematic as there is no guarantee that the 34-bit constant will reach the td pattern selection for pli. It is also possible for other transformations (such as complex bit permutations) to also produce and utilize the 34-bit constant materialized through selectI64Imm(). This patch instead produces pli on Power10 directly whenever the constant fits within 34-bits. Differential Revision: https://reviews.llvm.org/D99906	2021-04-06 13:38:11 -05:00
Stefan Pintilie	b8f3c6d011	[PowerPC][NFC] Do not enter prefix selection if it cannot do better. Do not try to materialize a constant using prefix instructions if the selection using non prefix instructions was able to do it using a single non prefix instruction. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D98791	2021-03-22 09:17:52 -05:00
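
The "almost a 32 bit splat" condition described in commit 2224b53f06 above can be illustrated with a small sketch. This is not the code from the PowerPC backend: the function names are hypothetical, and treating "three parts equal" as "exactly three" (so that a full splat with all four parts equal counts as a separate case) is an assumption made for this example.

```python
def split_16bit_parts(imm):
    """Split a 64-bit value into its four 16-bit parts.

    Returned order: Lo16OfLo32 (bits 0...15), Hi16OfLo32 (bits 16...31),
    Lo16OfHi32 (bits 32...47), Hi16OfHi32 (bits 48...63).
    """
    imm &= 0xFFFFFFFFFFFFFFFF  # keep only the low 64 bits
    return [(imm >> shift) & 0xFFFF for shift in (0, 16, 32, 48)]


def is_almost_32bit_splat(imm):
    """Return True when exactly three of the four 16-bit parts are equal.

    Hypothetical helper: such a value differs from a splat of a 32-bit
    value in a single 16-bit part.
    """
    parts = split_16bit_parts(imm)
    return any(parts.count(p) == 3 for p in parts)
```

For instance, `0x1234123412345678` has three parts equal to `0x1234` and one part equal to `0x5678`, so the sketch classifies it as an "almost" splat, whereas `0x1234567812345678` is already an exact splat of a 32-bit value and is not.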


661 Commits