llvm-project

Author	SHA1	Message	Date
Jeremy Morse	30cce54dad	[X86] Return src/dest register from stack spill/restore recogniser LLVM provides target hooks to recognise stack spill and restore instructions, such as isLoadFromStackSlot, and it also provides post frame elimination versions such as isLoadFromStackSlotPostFE. These are supposed to return the store-source and load-destination registers; unfortunately on X86, the PostFE recognisers just return "1", apparently to signify "yes it's a spill/load". This patch alters the hooks to correctly return the store-source and load-destination registers. This is really useful for debug-info as it helps follow variable values as they move on/off the stack. There should be no codegen changes: the only other users of these PostFE target hooks are MachineInstr::getRestoreSize and MachineInstr::getSpillSize, which don't attempt to interpret the returned register location. While we're here, delete the (InstrRef) LiveDebugValues heuristic that tries to find the spill source register by looking for a killed reg -- we should be able to rely on the target hooks for that. This involves temporarily turning off an InstrRef LiveDebugValues test on aarch64 (patch to re-enable it is in D104521). Differential Revision: https://reviews.llvm.org/D105428	2021-07-09 18:12:30 +01:00
Nikita Popov	23dd750279	Revert "[IR] Don't mark mustprogress as type attribute" This reverts commit 84ed3a794b4ffe7bd673f1e5a17d507aa3113d12. A number of clang tests are also affected by this change. Revert until I can update them.	2021-07-09 18:46:00 +02:00
Nikita Popov	84ed3a794b	[IR] Don't mark mustprogress as type attribute This is a simple enum attribute. Test changes are because enum attributes are sorted before type attributes.	2021-07-09 18:24:16 +02:00
Nico Weber	97c675d3d4	Revert "Revert "Temporarily do not drop volatile stores before unreachable"" This reverts commit 52aeacfbf5ce5f949efe0eae029e56db171ea1f7. There isn't full agreement on a path forward yet, but there is agreement that this shouldn't land as-is. See discussion on https://reviews.llvm.org/D105338 Also reverts unreviewed "[clang] Improve `-Wnull-dereference` diag to be more in-line with reality" This reverts commit f4877c78c0fc98be47b926439bbfe33d5e1d1b6d. And all the related changes to tests: This reverts commit 9a0152799f8e4a59e0483728c9f11c8a7805616f. This reverts commit 3f7c9cc27422f7302cf5a683eeb3978e6cb84270. This reverts commit 329f8197ef59f9bd23328b52d623ba768b51dbb2. This reverts commit aa9f58cc2c48ca6cfc853a2467cd775dc7622746. This reverts commit 2df37d5ddd38091aafbb7d338660e58836f4ac80. This reverts commit a72a44181264fd83e05be958c2712cbd4560aba7.	2021-07-09 11:44:34 -04:00
zhijian	841077a7e9	[AIX][XCOFF] Use bit order of has_vec and longtbtable bits as defined in AIX header debug.h Summary: The bit order of the has_vec and longtbtable bits in the traceback table generated by the XL compiler flipped at some point after v12.1. This is different from the definition in the AIX header debug.h. The change in the XL compiler that caused the deviation from the OS header definition was unintentional. Since both orderings are extant and the XL compiler runtime also expects the ordering defined by the OS, we will correct the output from LLVM to match the defined ordering given by the OS (which is also consistent with the Assembler Language Reference). Mitigation for traceback tables encoded with the wrong ordering is required for either ordering. Reviewers: XingXue, HubertTong Differential Revision: https://reviews.llvm.org/D105487	2021-07-09 11:06:46 -04:00
Roman Lebedev	52aeacfbf5	Revert "Temporarily do not drop volatile stores before unreachable" This reverts commit 4e413e16216d0c94ada2171f3c59e0a85f4fa4b6, which landed almost 10 months ago under premise that the original behavior didn't match reality and was breaking users, even though it was correct as per the LangRef. But the LangRef change still hasn't appeared, which might suggest that the affected parties aren't really worried about this problem. Please refer to discussion in: * https://reviews.llvm.org/D87399 (`Revert "[InstCombine] erase instructions leading up to unreachable"`) * https://reviews.llvm.org/D53184 (`[LangRef] Clarify semantics of volatile operations.`) * https://reviews.llvm.org/D87149 (`[InstCombine] erase instructions leading up to unreachable`) clang has `-Wnull-dereference` which will diagnose the obvious cases of null dereference, it was adjusted in f4877c78c0fc98be47b926439bbfe33d5e1d1b6d, but it will only catch the cases where the pointer is a null literal, it will not catch the cases where an arbitrary store is expected to trap. Differential Revision: https://reviews.llvm.org/D105338	2021-07-09 14:16:54 +03:00
Simon Pilgrim	9dbeac16ba	[X86] ReplaceNodeResults - fp_to_sint/uint - manually widen v2i32 results to let us add AssertSext/AssertZext It's proving tricky to move this to the generic legalizer code, so manually insert the v2i32 subvector into v4i32, insert the AssertSext/AssertZext node, then extract the subvector again. This avoids masks in the truncation/pack code, which means we avoid a PSHUFB in the fp_to_sint/uint code for sub-128 bit types (specific targets can still combine the packs to a pshufb if they have fast variable per-lane shuffles). This was noticed when I was trying to improve fp_to_sint/uint costs with D103695 (and some targets had very high fp_to_sint costs due to the PSHUFB), so we can then update the fp_to_uint codegen from D89697.	2021-07-09 12:07:33 +01:00
Roman Lebedev	2df37d5ddd	[NFC][Codegen] Harden a few tests to not rely that volatile store to null isn't erased	2021-07-09 13:30:42 +03:00
Kai Luo	55bd12d4b7	[PowerPC] Remove implicit use register after transformToImmFormFedByLI() When the instruction has an imm form and is fed by LI, we can remove the redundant LI instruction. Below is an example: ``` renamable $x5 = LI8 2 renamable $x4 = exact SRD killed renamable $x4, killed renamable $r5, implicit $x5 ``` will be converted to: ``` renamable $x5 = LI8 2 renamable $x4 = exact RLDICL killed renamable $x4, 62, 2, implicit killed $x5 ``` But when we do this optimization, we forget to remove implicit killed $x5. This bug has caused an lnt case error. This patch fixes the above bug. Reviewed By: #powerpc, shchenz Differential Revision: https://reviews.llvm.org/D85288	2021-07-09 04:42:54 +00:00
Muhammad Omair Javaid	932e3d9960	Revert "GlobalISel/AArch64: don't optimize away redundant branches at -O0" This reverts commit 458c230b5ef893238d2471fcff27cd275e8026d5. This broke LLDB buildbot testcase where breakpoint set at start of loop failed to hit. https://lab.llvm.org/buildbot/#/builders/96/builds/9404 https://github.com/llvm/llvm-project/blob/main/lldb/test/API/commands/process/attach/main.cpp#L15 Differential Revision: https://reviews.llvm.org/D105238	2021-07-09 08:23:36 +05:00
Ben Shi	ed102ce20a	[RISCV][test] Add new tests for mul optimization in the zba extension with SH*ADD This patch will show the following optimization by future patches. (mul x imm) -> (SH1ADD x, (SLLI x, bits)) when imm = 2^n + 2. (mul x imm) -> (SH2ADD x, (SLLI x, bits)) when imm = 2^n + 4. (mul x imm) -> (SH3ADD x, (SLLI x, bits)) when imm = 2^n + 8. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D105614	2021-07-09 09:48:23 +08:00
Stanislav Mekhanoshin	e5b0fe1b83	[AMDGPU] Mark more SOP instructions as rematerializable The rest of the SOP instructions implicitly set SCC and are not suitable for rematerialization. Differential Revision: https://reviews.llvm.org/D105670	2021-07-08 16:00:45 -07:00
Thomas Lively	3dd75f5371	[WebAssembly] Scalarize extract_vector_elt of binops Override the `shouldScalarizeBinop` target lowering hook using the same implementation used in the x86 backend. This causes `extract_vector_elt`s of vector binary ops to be scalarized if the scalarized version would be supported. Differential Revision: https://reviews.llvm.org/D105646	2021-07-08 14:31:53 -07:00
David Green	e2bc88f175	[ARM] Extra v8i16 -> i64 reduction tests with loads. NFC	2021-07-08 22:27:23 +01:00
Alexey Bataev	0d74fd3fdf	[SLP][COST][X86]Improve cost model for masked gather. Revived D101297 in its original form + added some changes in X86 legalization checking for masked gathers. This solution is the most stable and the most correct one. We have to check the legality before trying to build the masked gather in SLP. Without this check we have incorrect cost (for SLP) in case the masked gather is not legal/slower than the gather. And we're missing some vectorization opportunities. This can be fixed in the cost model, but in this case we need to add special checks for the cost of GEPs for ScatterVectorize node, add special check for small trees, etc., i.e. there are a lot of corner cases here and there, which increase the code base and make it harder to maintain the code. > Can't we rely on cost model to deal with this? This can be profitable for further vectorization, when we can start from such gather loads as seed. The question from D101297. Actually, no, it can't. Actually, simple gather may give us better result, especially after we started vectorization of insertelements. Plus, like I said before, the cost for non-legal masked gathers leads to missed vectorization opportunities. Differential Revision: https://reviews.llvm.org/D105042	2021-07-08 11:53:30 -07:00
Alexey Bataev	9d826fdb28	[X86][NFC]Add run lines for AVX512VL for masked gather test, NFC.	2021-07-08 11:30:31 -07:00
Stanislav Mekhanoshin	de5582be26	[AMDGPU] Fix more indention in llc-pipeline test. NFC.	2021-07-08 11:20:00 -07:00
Stanislav Mekhanoshin	9dae86ce56	[AMDGPU] Fix indention in llc-pipeline test. NFC.	2021-07-08 11:08:25 -07:00
Stanislav Mekhanoshin	74a5760d35	[AMDGPU] Set LoopInfo as preserved by SIAnnotateControlFlow The pass does not change loops, it just adds calls. Differential Revision: https://reviews.llvm.org/D105583	2021-07-08 09:34:43 -07:00
Bradley Smith	026bb84bcd	[AArch64][SVE] Add ISel patterns for floating point compare with zero instructions Additionally, lower the floating point compare SVE intrinsics to SETCC_MERGE_ZERO ISD nodes to avoid duplicating ISel patterns. Differential Revision: https://reviews.llvm.org/D105486	2021-07-08 10:46:12 +00:00
Thomas Lively	f8c5a4c670	[WebAssembly] Optimize out shift masks WebAssembly's shift instructions implicitly mask the shift count, so optimize out redundant explicit masks of the shift count. For vector shifts, this currently only works if the mask is applied before splatting the shift count, but this should be addressed in a future commit. Resolves PR49655. Differential Revision: https://reviews.llvm.org/D105600	2021-07-07 23:14:31 -07:00
Qiu Chaofan	a22ecb4508	[PowerPC] Fix i64 to vector lowering on big endian Lowering for scalar to vector would skip if the current subtarget is big endian and the scalar is larger than or equal to 64 bits. However, there's an issue in the implementation: SToVRHS may refer to SToVLHS's scalar size if SToVLHS is present, which leads to a crash. Reviewed By: nemanjai, shchenz Differential Revision: https://reviews.llvm.org/D105094	2021-07-08 11:05:09 +08:00
Stanislav Mekhanoshin	0fdb25cd95	[AMDGPU] Disable garbage collection passes Differential Revision: https://reviews.llvm.org/D105593	2021-07-07 15:47:57 -07:00
Jinsong Ji	89f2d98b98	[PowerPC] Add P7 RUN line for load and splat test	2021-07-07 21:43:46 +00:00
David Green	ab0096de05	[ARM] Add some opaque pointer gather/scatter tests. NFC They seem to work OK. Some other test cleanups at the same time.	2021-07-07 22:03:53 +01:00
Adrian Prantl	458c230b5e	GlobalISel/AArch64: don't optimize away redundant branches at -O0 This patch prevents GlobalISel from optimizing out redundant branch instructions when compiling without optimizations. The motivating example is code like the following common pattern in Swift, where users expect to be able to set a breakpoint on the early exit: public func f(b: Bool) { guard b else { return // I would like to set a breakpoint here. } ... } The patch modifies two places in GlobalISel: The first one is in IRTranslator.cpp where the removal of redundant branches is made conditional on the optimization level. The second one is in AArch64InstructionSelector.cpp where an -O0 only optimization is being removed. Disabling these optimizations increases code size at -O0 by ~8%. However, doing so improves debuggability, and debug builds are the primary reason why developers compile without optimizations. We thus concluded that this is the right trade-off. rdar://79515454 Differential Revision: https://reviews.llvm.org/D105238	2021-07-07 12:51:55 -07:00
Eli Friedman	85bac9d7f9	[AArch64] Simplify sve-breakdown-scalable-vectortype.ll. Fix the calling convention so we don't spill every SVE register.	2021-07-07 12:32:17 -07:00
Nemanja Ivanovic	6a06dbafa1	[PowerPC] Disable permuted SCALAR_TO_VECTOR on LE without direct moves There are some patterns involving the permuted scalar to vector node for which we don't have patterns without direct moves on little endian subtargets. This causes selection errors. While we can of course add the missing patterns, any additional effort to make this work is not useful since there is no support for any CPU that can run in little endian mode and does not support direct moves.	2021-07-07 13:50:49 -05:00
Irina Dobrescu	5888a194c1	[AArch64][GlobalISel] Lower vector types for min/max Differential Revision: https://reviews.llvm.org/D105433	2021-07-07 15:34:03 +01:00
Zarko Todorovski	ee6ca9c7df	[AIX] Use VSSRC/VSFRC Register classes for f32/f64 callee arguments on P8 and above Adding usage of VSSRC and VSFRC when adding the live in registers on AIX. This matches the behaviour of the rest of PPC Subtargets. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D104396	2021-07-07 09:18:20 -04:00
Dylan Fleming	8ae9ab43dd	[SVE] Fixed cast<FixedVectorType> on scalable vector in SelectionDAGBuilder::getUniformBase Reviewed By: sdesmalen Differential Revision: https://reviews.llvm.org/D105350	2021-07-07 10:48:17 +01:00
David Green	4ce26deac2	[DAG] Reassociate Add with Or We already have reassociation code for Adds and Ors separately in DAG combiner, this adds it for the combination of the two where Ors act like Adds. It reassociates (add (or x, c), y) -> (add (add x, y), c) where we know that the Or's operands have no common bits set, and the Or has one use. Differential Revision: https://reviews.llvm.org/D104765	2021-07-07 10:21:07 +01:00
Tom Stellard	7f1c077c30	tests/CodeGen: Use %python lit substitution when invoking python This will use the python that LLVM was configured to use rather than python from PATH. Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D105224	2021-07-06 18:46:36 -07:00
Nemanja Ivanovic	3553698de7	[PowerPC] Re-enable combine for i64 BSWAP on targets without LDBRX The combine was disabled in 4e22c7265d86 as it caused failures in the ppc64be-multistage (bootstrap) bot. It turns out that the combine did not correctly update the MMO for the high load which caused aliased stores to be reported as unaliased. This patch fixes that problem and re-enables the combine.	2021-07-06 20:42:01 -05:00
Eli Friedman	75eb43ab49	[AArch64] Add more tests related to vselect with constant condition. Not a complete set of tests, but a starting point if anyone wants to look at improving this.	2021-07-06 17:06:22 -07:00
Stanislav Mekhanoshin	a0ab45799b	[AMDGPU] Move atomic expand past infer address spaces There are cases where infer address spaces pass cannot yet infer an address space in the opt pipeline and then in the llc pipeline it runs too late for atomic expand pass to benefit from a specific address space. Move atomic expand pass past the infer address spaces. Fixes: SWDEV-293410 Differential Revision: https://reviews.llvm.org/D105511	2021-07-06 15:53:32 -07:00
Stanislav Mekhanoshin	5915d33874	[AMDGPU] Do not run IR optimizations at -O0 Differential Revision: https://reviews.llvm.org/D105515	2021-07-06 15:29:52 -07:00
Krzysztof Parzyszek	94e01d579c	[Hexagon] Generate trap/undef if misaligned access is detected This applies to memory accesses to (compile-time) constant addresses (such as memory-mapped registers). Currently when a misaligned access to such an address is detected, a fatal error is reported. This change will emit a remark, and the compilation will continue with a trap, and "undef" (for loads) emitted. This fixes https://llvm.org/PR50838. Differential Revision: https://reviews.llvm.org/D50524	2021-07-06 14:52:23 -05:00
Eli Friedman	7ac1c7bead	Recommit [ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subtracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll because I couldn't figure out how to repair it to test what it was actually trying to test. Recommitting with fix to MemoryDepChecker::isDependent. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 12:16:05 -07:00
Craig Topper	12d51f95fe	[RISCV] Implement lround/llround/lrint/llrint with fcvt instruction with -fno-math-errno These are fp->int conversions using either RMM or dynamic rounding modes. The lround and lrint opcodes have a return type of either i32 or i64 depending on sizeof(long) in the frontend which should follow xlen. llround/llrint should always return i64 so we'll need a libcall for those on rv32. The frontend will only emit the intrinsics if -fno-math-errno is in effect otherwise a libcall will be emitted which will not use these ISD opcodes. gcc also does this optimization. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D105206	2021-07-06 11:43:22 -07:00
David Green	be0924ad17	[Tests] Update some tests for D104765. NFC	2021-07-06 19:23:52 +01:00
Eli Friedman	a6d081b2cb	Revert "[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers." This reverts commit 74d6ce5d5f169e9cf3fac0eb1042602e286dd2b9. Seeing crashes on buildbots in MemoryDepChecker::isDependent.	2021-07-06 11:17:13 -07:00
Eli Friedman	74d6ce5d5f	[ScalarEvolution] Make getMinusSCEV() fail for unrelated pointers. As part of making ScalarEvolution's handling of pointers consistent, we want to forbid multiplying a pointer by -1 (or any other value). This means we can't blindly subtract pointers. There are a few ways we could deal with this: 1. We could completely forbid subtracting pointers in getMinusSCEV() 2. We could forbid subtracting pointers with different pointer bases (this patch). 3. We could try to ptrtoint pointer operands. The option in this patch is more friendly to non-integral pointers: code that works with normal pointers will also work with non-integral pointers. And it seems like there are very few places that actually benefit from the third option. As a minimal patch, the ScalarEvolution implementation of getMinusSCEV still ends up subtracting pointers if they have the same base. This should eliminate the shared pointer base, but eventually we'll need to rewrite it to avoid negating the pointer base. I plan to do this as a separate step to allow measuring the compile-time impact. This doesn't cause obvious functional changes in most cases; the one case that is significantly affected is ICmpZero handling in LSR (which is the source of almost all the test changes). The resulting changes seem okay to me, but suggestions welcome. As an alternative, I tried explicitly ptrtoint'ing the operands, but the result doesn't seem obviously better. I deleted the test lsr-undef-in-binop.ll because I couldn't figure out how to repair it to test what it was actually trying to test. Differential Revision: https://reviews.llvm.org/D104806	2021-07-06 10:54:41 -07:00
Jonas Paulsson	458eac2573	[SystemZ] Support the 'N' code for the odd register in inline-asm. The odd register of a (128 bit) register pair is accessed with the 'N' code with an inline assembly operand. Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D105502	2021-07-06 19:46:49 +02:00
Craig Topper	2b5e53111a	[RISCV] Add support for matching vwmul(u) and vwmacc(u) from fixed vectors. This adds a DAG combine to detect sext/zext inputs and emit a new ISD opcode. The extends will either be removed or replaced with narrower extends. Isel patterns are used to match add and widening mul to vwmacc similar to the recently added vmacc patterns. There's still some work to be done to match vmulsu. We should also rewrite splats that were extended as scalars and then splatted. Reviewed By: arcbbb Differential Revision: https://reviews.llvm.org/D104802	2021-07-06 10:24:31 -07:00
Jonas Paulsson	37a92f3b03	[SystemZ] Generate XC loop for memset 0 of variable length. Benchmarking has shown that it is worthwhile to implement a variable length memset of 0 with XC (exclusive or) like gcc does, instead of using a libcall. This requires the use of the EXecute Relative Long (EXRL) instruction which can now be done in a framework that can also be used with other target instructions (not just XC). Review: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D103865	2021-07-06 18:07:31 +02:00
Bradley Smith	5ab9000fbb	[AArch64][SVE] Fix selection failures for scalable MLOAD nodes with passthru Differential Revision: https://reviews.llvm.org/D105348	2021-07-06 14:17:23 +00:00
Peter Waller	c5dfee44b9	[CodeGen][AArch64][SVE] Use ld1r[bhsd] for vector splat from memory This avoids the use of the vector unit for copying from scalar to vector. There is an extra ptrue instruction, but a predicate register with the ptrue pattern populated is likely to be free in the context of real code. Tests were generated from a template to cover the axes mentioned at the top of the test file. Co-authored-by: Francesco Petrogalli <francesco.petrogalli@arm.com> Differential Revision: https://reviews.llvm.org/D103170	2021-07-06 12:03:54 +00:00
Sebastian Neubauer	db646de3ee	[AMDGPU] Set optional PAL metadata Set informational fields in the .shader_functions table. Also correct the documentation, .scratch_memory_size and .lds_size are integers. Differential Revision: https://reviews.llvm.org/D105116	2021-07-06 11:58:00 +02:00
Albion Fung	7d10dd60ce	[PowerPC] Implement Load and Reserve and Store Conditional Builtins This patch implements the load and reserve and store conditional builtins for the PowerPC target, in order to have feature parity with xlC on AIX. Differential revision: https://reviews.llvm.org/D105236	2021-07-05 21:35:41 -05:00
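
The identity behind the "[DAG] Reassociate Add with Or" entry (4ce26deac2) can be checked with a small Python sketch. This is not LLVM code: the helper names and the 32-bit width are assumptions made for illustration. It only shows that when x and c share no set bits, (or x, c) equals (add x, c), so (add (or x, c), y) can be rewritten as (add (add x, y), c).

```python
MASK32 = 0xFFFFFFFF  # assumed width: model 32-bit wraparound arithmetic


def add32(a, b):
    """Add two values with 32-bit wraparound."""
    return (a + b) & MASK32


def no_common_bits(x, c):
    """Precondition named in the commit message: the Or's operands are disjoint."""
    return (x & c) == 0


def original(x, c, y):
    """(add (or x, c), y)"""
    return add32(x | c, y)


def reassociated(x, c, y):
    """(add (add x, y), c)"""
    return add32(add32(x, y), c)


if __name__ == "__main__":
    x, c, y = 0xF0, 0x0F, 5
    print(no_common_bits(x, c), original(x, c, y) == reassociated(x, c, y))
```

Without the disjointness precondition the two forms can differ: for x = 3, c = 1, y = 0 the Or yields 3 while the adds yield 4.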

39506 Commits