llvm-project

Author	SHA1	Message	Date
Nick Desaulniers	39811e2e53	[llvm][test] enable/disable -verify-machineinstrs where possible for callbr I introduced new tests in commit 5cc1016a57b3 ("[llvm][SelectionDAGBuilder] codegen callbr.landingpad intrinsic") https://reviews.llvm.org/D140160 that fail expensive checks. Disable -verify-machineinstrs in those tests for now. Enable it in other tests for now, since MachineVerifier isn't on by default for assertion builds. Link: https://github.com/llvm/llvm-project/issues/60827	2023-02-16 20:28:18 -08:00
Nick Desaulniers	a3a84c9e25	[llvm] add CallBrPrepare pass to pipelines Capstone of https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Clang changes are still necessary to enable the use of outputs along indirect edges of asm goto statements. Link: https://github.com/llvm/llvm-project/issues/53562 Reviewed By: void Differential Revision: https://reviews.llvm.org/D140180	2023-02-16 17:58:34 -08:00
Nick Desaulniers	5cc1016a57	[llvm][SelectionDAGBuilder] codegen callbr.landingpad intrinsic Given a CallBrInst, retain its first virtual register in SelectionDagBuilder's FunctionLoweringInfo if there's corresponding landingpad. Walk the list of COPY MachineInstr to find the original virtual and physical registers defined by the INLINEASM_BR MachineInst. Test cases from https://reviews.llvm.org/D139565. Link: https://github.com/llvm/llvm-project/issues/59538 Part 3 from https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Follow up patches still need to wire up CallBrPrepare into the pass pipelines. Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D140160	2023-02-16 17:58:34 -08:00
Ting Wang	52a774fd4c	[PowerPC] remove XXSWAPD after load from CP which is a splat value If the value from constant-pool is a splat value of vector type, do not need swap after load from constant-pool. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D139491	2023-02-16 19:21:35 -05:00
Nemanja Ivanovic	56e41fcf50	[PowerPC] Bail out of FISel when lowering long calls We currently don't handle tail calls in fast-isel but we continue with the lowering when -mlongcall is specified and lower the calls normally. We should defer to SDISel for this so that it is lowered correctly. Differential revision: https://reviews.llvm.org/D123997	2023-02-16 16:15:32 -05:00
Matt Arsenault	09dd4d870e	DAG: Remove hasBitPreservingFPLogic This doesn't make sense as an option. fneg and fabs are bit preserving by definition. If a target has some fneg or fabs instruction that are not bitpreserving it's incorrect to lower fneg/fabs to use it.	2023-02-14 10:25:24 -04:00
Arthur Eubanks	7c6b46e87e	Revert "[DAGCombiner] handle more store value forwarding" This reverts commit f35a09daebd0a90daa536432e62a2476f708150d. Causes miscompiles, see D138899	2023-02-13 19:07:28 -08:00
Chen Zheng	6ee2f770ef	[PowerPC][GISel] add support for fpconstant Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D133340	2023-02-14 02:39:22 +00:00
Stefan Pintilie	2e47aafb02	[PowerPC] Fix float materialization patterns. Two of the float materialization patterns use the VSSRC register class. This register class is not available before Power 8. The patterns will stay the same for Power 8 and up but must use the class F4RC for Power 7 and earlier. This patch fixes those patterns. Reviewed By: nemanjai, amyk, #powerpc Differential Revision: https://reviews.llvm.org/D142120	2023-02-13 10:18:53 -05:00
Samuel Parker	2a58be4239	[HardwareLoops] NewPM support. With the NPM, we're now defaulting to preserving LCSSA, so a couple of tests have changed slightly. Differential Revision: https://reviews.llvm.org/D140982	2023-02-13 09:46:31 +00:00
Andrew Savonichev	c65b4d64d4	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2023-02-09 18:45:20 +03:00
Anton Sidorenko	6820cb2dd5	[Test] Fix YAML mapping keys duplication. NFC. The YAML specification does not allow duplicate keys in a mapping. However, the YAML parser in LLVM does not check for that and uses only the last key entry. In this change duplicated keys are merged to satisfy the spec. Differential Revision: https://reviews.llvm.org/D141848	2023-02-09 12:59:50 +03:00
Kai Luo	96aaebd12e	[MachineCopyPropagation] Eliminate spillage copies that might be caused by eviction chain Remove spill-reload like copy chains. For example ``` r0 = COPY r1 r1 = COPY r2 r2 = COPY r3 r3 = COPY r4 <def-use r4> r4 = COPY r3 r3 = COPY r2 r2 = COPY r1 r1 = COPY r0 ``` will be folded into ``` r0 = COPY r1 r1 = COPY r4 <def-use r4> r4 = COPY r1 r1 = COPY r0 ``` Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D122118	2023-02-08 03:34:25 +00:00
Simon Pilgrim	9ffe58dc27	[PowerPC] aix32-cc-abi-vaarg.ll - improve DAG checks More closely match the actual output and should make the merge with D127115 easier.	2023-02-04 11:17:36 +00:00
Ting Wang	1d8f13ae45	[PowerPC] add a peephole to remove redundant swap instructions after vector splats on P8 Vector store on P8 little endian will have a swap instruction added before the store in PPCISelLowering. If the vector is generated by splat, the swap instruction can be eliminated. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D139691	2023-02-02 20:52:52 -05:00
Nemanja Ivanovic	a5b662a834	[SelectionDAG] Correctly widen bitcast of scalar to vector for big endian For big endian targets that need a node such as this: v2i8 = bitcast i16:tN legalized by: 1. Promoting the i16 input type 2. Widening the v2i32 result type The result will be incorrect because the legalizer will promote the input type and then produce a scalar_to_vector from that wider type to a vector of N elements of that type. That puts the desired bits into the low order bytes of element zero and they need to be in the high order bytes on big endian systems. This patch changes the legalization to widen to a vector with elements of the original scalar size. Differential revision: https://reviews.llvm.org/D140365	2023-02-02 12:01:14 -06:00
Chen Zheng	f35a09daeb	[DAGCombiner] handle more store value forwarding When lowering calls on target like PPC, some stack loads will be generated for by value parameters. Node CALLSEQ_START prevents such loads from being combined. Suggested by @RolandF, this patch removes the unnecessary loads for the byval parameter by extending ForwardStoreValueToDirectLoad Reviewed By: nemanjai, RolandF Differential Revision: https://reviews.llvm.org/D138899	2023-02-01 21:06:17 -05:00
Chen Zheng	0a32e693e3	[DAGCombiner][NFC] add testcases for D138899	2023-02-01 21:06:09 -05:00
Nemanja Ivanovic	19311e0a2e	[PowerPC] Do not convert lwz to lwa if the offset is not a multiple of 4 The transform that converts this checks the alignment of the global object being accessed. However, there was no check for the offset within the global object which caused the compiler to produce a DS relocation for an unaligned address.	2023-01-31 09:54:29 -06:00
esmeyi	2224b53f06	[PowerPC] Improve materialization for immediates that are almost a 32 bit splat. Summary: Some 64 bit constants can be materialized with fewer instructions than we currently use. We consider a 64 bit immediate value divided into four parts, Hi16OfHi32 (bits 48...63), Lo16OfHi32 (bits 32...47), Hi16OfLo32 (bits 16...31), Lo16OfLo32 (bits 0...15). When any three parts are equal, the immediate can be treated as "almost" a splat of a 32 bit value in a 64 bit register. For such a case, we can use 3 instructions to generate the splat and use 1 instruction to modify the different part: Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D139813	2023-01-31 06:02:17 -05:00
Nemanja Ivanovic	7087f053f6	[PowerPC] Pre-commit test for fix to peephole opt This just adds a test case with current code gen. The patch with the fix will correct the code gen.	2023-01-30 21:01:21 -06:00
Nemanja Ivanovic	f68fc8d9d2	[PowerPC] Fix incorrect shift amount for build_vector The pattern for a build_vector node was incorrect for big endian subtargets.	2023-01-30 16:36:08 -06:00
Sergei Barannikov	de66efdb25	[PowerPC] Convert more tests to opaque pointers (NFC)	2023-01-30 05:19:24 +03:00
Sergei Barannikov	f4fbcd62af	[PowerPC] Convert more tests to opaque pointers (NFC) * Add -fast-isel=false to func-alias.ll. The test was added as a SelectionDAG test. Without this option, FastISel successfully selects the call that had a ConstantExpr argument. * fast-isel-branch.ll couldn't be handled by FastISel. Now it can, hence the change in the stack offsets.	2023-01-30 04:27:10 +03:00
Sergei Barannikov	fd9f42fad2	[PowerPC] Convert some tests to opaque pointers (NFC)	2023-01-30 00:40:12 +03:00
Simon Pilgrim	846ec90924	[PowerPC] ppc64-P9-vabsd.ll - add some basic ISD::ABDS test coverage Test coverage to ensure D142313 lowers ISD::ABDU -> VABSD but not ISD::ABDS (although I think v4i32 would be compatible with the XVNEGSP trick)	2023-01-27 11:12:16 +00:00
Matt Arsenault	778cf5431c	IR: Add atomicrmw uinc_wrap and udec_wrap These are essentially add/sub 1 with a clamping value. AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use target intrinsics for these, but those do not carry the ordering and syncscope. Add these to atomicrmw so we can carry these and benefit from the regular legalization processes.	2023-01-24 17:55:11 -04:00
Simon Pilgrim	2e8aa2dcbc	[PowerPC] Regenerate vec_absd.ll test checks	2023-01-22 17:19:48 +00:00
OCHyams	99c12afeb4	[Assignment Tracking] Fix tests for buildbot failure (2) Follow-up for 4ece50737d5385fb80cfa23f5297d1111f8eed39 (D142027). Assignment Tracking Analysis now always runs and is skipped internally if assignment tracking is disabled. Update these tests to expect to see the pass run. Buildbot failure: https://lab.llvm.org/buildbot/#/builders/57/builds/24094	2023-01-20 15:58:35 +00:00
Paul Kirth	557a5bc336	[codegen] Add StackFrameLayoutAnalysisPass Issue #58168 describes the difficulty diagnosing stack size issues identified by -Wframe-larger-than. For simple code, it's easy to understand the stack layout and where space is being allocated, but in more complex programs, where code may be heavily inlined, unrolled, and have duplicated code paths, it is no longer easy to manually inspect the source program and understand where stack space can be attributed. This patch implements a machine function pass that emits remarks with a textual representation of stack slots, and also outputs any available debug information to map source variables to those slots. The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout` to the compiler invocation. Like other remarks the diagnostic information can be saved to a file in a machine readable format by adding -fsave-optimization-record. Fixes: #58168 Reviewed By: nickdesaulniers, thegameg Differential Revision: https://reviews.llvm.org/D135488	2023-01-19 01:51:14 +00:00
Nikita Popov	9ed2f14c87	[AsmParser] Remove typed pointer auto-detection IR is now always parsed in opaque pointer mode, unless -opaque-pointers=0 is explicitly given. There is no automatic detection of typed pointers anymore. The -opaque-pointers=0 option is added to any remaining IR tests that haven't been migrated yet. Differential Revision: https://reviews.llvm.org/D141912	2023-01-18 09:58:32 +01:00
Lei Huang	ee559b21b9	[P10] Fix the implementation for BRH Fixes the patterns for the brh instruction to include a clrldi when emitted. Reviewed By: amyk Differential Revision: https://reviews.llvm.org/D141697	2023-01-16 13:53:43 -06:00
Roman Lebedev	f8d9097168	[DAGCombiner] `combineShuffleOfSplatVal()`: try to canonicalize to a splat shuffle As noted in https://reviews.llvm.org/D141778#inline-1369900, we fail to produce splat shuffles from certain sequences of shuffles, that may have non-shuffles in the middle of seq. There is a big pitfall to avoid here: just because `isSplatValue()` says that all demanded elements are splat, we can't pick any random one of them, because some of them could be undef! We must ignore those!	2023-01-15 21:11:33 +03:00
Roman Lebedev	cc39c3b17f	[Codegen][LegalizeIntegerTypes] New legalization strategy for scalar shifts: shift through stack https://reviews.llvm.org/D140493 is going to teach SROA how to promote allocas that have variably-indexed loads. That does bring up questions of cost model, since that requires creating wide shifts. Indeed, our legalization for them is not optimal. We either split it into parts, or lower it into a libcall. But if the shift amount is by a multiple of CHAR_BIT, we can also legalize it through stack. The basic idea is very simple: 1. Get a stack slot 2x the width of the shift type 2. store the value we are shifting into one half of the slot 3. pad the other half of the slot. for logical shifts, with zero, for arithmetic shift with signbit 4. index into the slot (starting from the base half into which we spilled, either upwards or downwards) 5. load 6. split loaded integer This works for both little-endian and big-endian machines: https://alive2.llvm.org/ce/z/YNVwd5 And better yet, if the original shift amount was not a multiple of CHAR_BIT, we can just shift by that remainder afterwards: https://alive2.llvm.org/ce/z/pz5G-K I think, if we are going to perform shift->shift-by-parts expansion more than once, we should instead go through stack, which is what this patch does. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D140638	2023-01-14 19:12:18 +03:00
Paul Kirth	fdc0bf6adc	Revert "[codegen] Add StackFrameLayoutAnalysisPass" This breaks on some AArch64 bots This reverts commit 0a652c540556a118bbd9386ed3ab7fd9e60a9754.	2023-01-13 22:59:36 +00:00
Paul Kirth	0a652c5405	[codegen] Add StackFrameLayoutAnalysisPass Issue #58168 describes the difficulty diagnosing stack size issues identified by -Wframe-larger-than. For simple code, it's easy to understand the stack layout and where space is being allocated, but in more complex programs, where code may be heavily inlined, unrolled, and have duplicated code paths, it is no longer easy to manually inspect the source program and understand where stack space can be attributed. This patch implements a machine function pass that emits remarks with a textual representation of stack slots, and also outputs any available debug information to map source variables to those slots. The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout` to the compiler invocation. Like other remarks the diagnostic information can be saved to a file in a machine readable format by adding -fsave-optimization-record. Fixes: #58168 Reviewed By: nickdesaulniers, thegameg Differential Revision: https://reviews.llvm.org/D135488	2023-01-13 20:52:48 +00:00
esmeyi	5ce0a26bd1	[XCOFF] handle the toc-data for object file generation. Summary: The toc-data feature has been supported for assembly file generation. This patch handles the toc-data for object file generation. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D139516	2023-01-11 23:27:47 -05:00
Kai Luo	d9630c34f4	[PowerPC][GISel] Select sync instructions required by atomic operations This is part of selecting `G_ATOMIC*` instructions. Select `isync`, `sync` and `lwsync` in GISel. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D141360	2023-01-11 16:25:46 +08:00
esmeyi	2aa4b69bd6	[XCOFF][NFC] Update the test aix-xcoff-huge-relocs.ll	2023-01-10 05:18:53 -05:00
esmeyi	ea6dec1b3a	[XCOFF] support the overflow section (only relocation overflow is handled). Summary: This patch handles relocation field overflows in an XCOFF32 file. (XCOFF64 files may not have overflow section headers.) If a section has more than 65,534 relocation entries or line number entries, both of these fields are set to a value of 65535. In this case, an overflow section header with the s_flags field equal to STYP_OVRFLO is used to contain the relocation and line-number count information. Since line number is not supported, this patch only handles the relocation overflow. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D137819	2023-01-10 02:39:02 -05:00
Josh Stone	87f57f459e	[RegAllocFast] Handle new debug values for spills These new debug values get inserted after the place where the spill happens, which means they won't be reached by the reverse traversal of basic block instructions. This would crash or fail assertions if they contained any virtual registers to be replaced. We can manually handle the new debug values right away to resolve this. Fixes https://github.com/llvm/llvm-project/issues/59172 Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D139590	2023-01-05 20:41:11 -08:00
Chen Zheng	85edf1fc70	[PowerPC] remove the ctr clobbers check related to TLS access Dynamic tls access model will be lowered to MI which clobbers CTR in the loop in ISEL(ADDItlsgdLADDR) and post-isel CTR loop pass will revert the loop to a normal compare + branch form. So no need to add this clobber check in hardware loop insertion pass now. Reviewed By: nemanjai Differential revision: https://reviews.llvm.org/D140367	2023-01-05 21:23:29 -05:00
Chen Zheng	dd0edc876c	[PowerPC][NFC] add an option to keep the test point Passes before hardware loop insertion change the loop to a form which is not a hardware loop candidate (return early before checking the ctr clobbers). And the PHI in the loop exit block is also optimized away. This breaks the previous test point when the case was committed. Fixing this by running this case just before hardware loop insertion pass. Reviewed By: nemanjai Differential revision: https://reviews.llvm.org/D140366	2023-01-05 21:18:53 -05:00
Luke Drummond	108766fc7e	Fix typos I found one typo of "implemnt", then some more. s/implemnt/implement/g	2023-01-05 18:49:23 +00:00
Nikita Popov	60442f0d44	[CodeGen] Convert some tests to opaque pointers (NFC) These are mostly MIR tests, which I did not handle during previous conversions.	2023-01-05 13:21:20 +01:00
Chen Zheng	6a930e8891	1: use class instead of MVT 2: minor fix for the comments	2023-01-05 07:53:59 +00:00
Chen Zheng	ac93a4e77d	[PowerPC][GISel]fcmp support This patch also includes: 1: CRRegBank support 2: Some workarounds in PPC table gen for anyext/setcc patterns selection. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D140878	2023-01-05 07:45:29 +00:00
Stefan Pintilie	c1d0118459	[PowerPC] Materialize floats in the range [-16.0, 15.0]. Previous to this patch we only materialized 0.0 and all other floating point values would be loaded from the TOC. This patch adds materialization for the floating point values that can be represented as integers in [-16.0, 15.0]. For example we will now materialize 3.0 and -5.0 but not 4.7. Reviewed By: nemanjai, lei, #powerpc Differential Revision: https://reviews.llvm.org/D138844	2023-01-04 12:52:30 -06:00
Matt Arsenault	bf4596bf58	CodeGen: Clean up some tests with broken "strictfp" attribute	2023-01-03 20:26:57 -05:00
Craig Topper	8abd70081f	[TargetLowering] Teach BuildUDIV to take advantage of leading zeros in the dividend. If the dividend has leading zeros, we can use them to reduce the size of the multiplier and avoid the fixup cases. This patch is for scalars only, but we might be able to do this for vectors in a follow up. Differential Revision: https://reviews.llvm.org/D140750	2022-12-29 13:58:46 -08:00
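
The "shift through stack" legalization summarized in commit cc39c3b17f above can be modelled in a few lines. What follows is a minimal Python sketch of the idea, not the LLVM implementation: the function name `shift_through_stack`, its parameters, and the little-endian byte layout are assumptions made for illustration, and the final "split loaded integer" step is omitted because Python integers are arbitrary precision.

```python
def shift_through_stack(value, amount, width_bits=128, kind="lshr"):
    """Model a wide shift as a store into a double-width slot plus an offset load.

    Hypothetical helper for illustration only; assumes a little-endian layout.
    kind is one of "lshr", "ashr", "shl".
    """
    assert width_bits % 8 == 0 and 0 <= amount < width_bits
    assert kind in ("lshr", "ashr", "shl")
    nbytes = width_bits // 8
    mask = (1 << width_bits) - 1
    value &= mask

    # Whole bytes are handled by indexing into the slot; the leftover
    # bits (amount not a multiple of 8) are shifted afterwards.
    byte_shift, bit_rem = divmod(amount, 8)
    data = value.to_bytes(nbytes, "little")
    negative = bool(value >> (width_bits - 1))

    if kind == "shl":
        # Value goes in the upper half, zero padding below it.
        slot = bytes(nbytes) + data
        offset = nbytes - byte_shift
    else:
        # Value goes in the lower half; pad above it with zeros for a
        # logical shift or with copies of the sign bit for an arithmetic one.
        pad = b"\xff" if (kind == "ashr" and negative) else b"\x00"
        slot = data + pad * nbytes
        offset = byte_shift

    # The "load" step: read one value-sized window out of the slot.
    loaded = int.from_bytes(slot[offset:offset + nbytes], "little")

    # Apply the sub-byte remainder of the shift amount.
    if kind == "shl":
        return (loaded << bit_rem) & mask
    if kind == "ashr" and negative:
        loaded -= 1 << width_bits  # reinterpret as a signed value
    return (loaded >> bit_rem) & mask
```

Indexing into the slot replaces the byte-granular part of the shift, so only a single narrow shift by the remainder (less than 8 bits) is left to do on the loaded value.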

3554 Commits