llvm-project

Author	SHA1	Message	Date
Luke Lau	18013bea46	[RISCV] Add tests for unaligned segmented loads and stores Reviewed By: reames Differential Revision: https://reviews.llvm.org/D154535	2023-07-07 15:34:22 +01:00
Matt Arsenault	94e24624c2	AMDGPU: Remove attempt at simplifying the format string in printf lowering This avoids computing the dominator tree by removing the simplifyInstruction use. This was applying simplification with some kind of questionable load-store forwarding and looking for the global. This had to have been an ancient hack copied from previous backends. In the OpenCL case, this is always emitted as required the direct global reference anyway.	2023-07-07 09:26:07 -04:00
Lucas Prates	54c7aec449	[AArch64][RCPC3] Instruction selection for LDAP1/STL1 instructions This implements the DAG patterns to enable instruction selection for the LDAP1 and STL1 instructions from FEAT_LRCPC3. The instructions should match the following combinations: * Acquiring atomic load + vector insert element for LDAP1. * Vector extract element + releasing atomic store for STL1. Patterns have also been added to cope with the DAG structure found when dealing with 1-lane sub-vectors. Reviewed By: tmatheson, efriedma Differential Revision: https://reviews.llvm.org/D153129	2023-07-07 12:32:56 +01:00
WuXinlong	c0221e006d	[RISCV] Add a pass to combine `cm.pop` and `ret` insts `RISCVPushPopOptimizer.cpp` combines `cm.pop` and `ret` to generate `cm.popretz` or `cm.popret`. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D150416	2023-07-07 14:04:11 +08:00
Jim Lin	43927542d8	[RISCV] Rename prefix `fixed-vector` to `fixed-vectors` to be the same with other testcases. NFC. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D154679	2023-07-07 13:04:00 +08:00
Craig Topper	a403124998	[RISCV] Don't sink i1 vectors in shouldSinkOperands. These can't create .vx instructions so there's no reason to sink them.	2023-07-06 20:36:55 -07:00
WuXinlong	6269ed24cf	[RISCV] Readjusting the framestack for Zcmp This patch readjusts the frame stack for the push and pop instructions co-author: @Lukacma Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D134599	2023-07-07 11:24:21 +08:00
Matt Arsenault	64df9573a7	DAG: Handle inversion of fcSubnormal | fcZero There are a number of more test combinations here that can be done together and reduce the number of instructions. https://reviews.llvm.org/D143191	2023-07-06 21:19:44 -04:00
Eduard Zingerman	0bf9bfeacc	Revert "[BPF] Undo transformation for LICM.cpp:hoistMinMax()" This reverts commit 09feee559a294611257ee157dba039fb05fe4f68. Revert because of a testbot failure: https://lab.llvm.org/buildbot/#/builders/5/builds/34931	2023-07-07 04:01:31 +03:00
Craig Topper	be253cb987	[RISCV] Support i32 brev8 intrinsic on RV64. Similar to what we do for orc.b. Another patch will expose this as a builtin in clang.	2023-07-06 17:24:53 -07:00
Derek Schuff	ad14659f72	[WebAssembly] Add frexp{f,l} libcall signatures The llvm.frexp.* family of intrinsics and their corresponding libcalls were recently added, which means we need to know their signatures. Differential Revision: https://reviews.llvm.org/D154639 Fixed: https://github.com/llvm/llvm-project/issues/63657	2023-07-06 13:37:11 -07:00
Matt Arsenault	61820f8b5d	CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel currently doesn't attempt the inversion optimization. https://reviews.llvm.org/D143182	2023-07-06 13:03:57 -04:00
Matt Arsenault	1588e18b2d	DAG: Check isCondCodeLegal in is_fpclass expansion to fcmp eq 0 Results in some x86 codegen diffs. Some look better, some look worse. https://reviews.llvm.org/D152094	2023-07-06 13:00:52 -04:00
Matt Arsenault	9df70e4a4d	AMDGPU: Fix not applying the correct default memcpy expansion threshold Fixes 3c848194f28decca41b7362f9dd35d4939797724. The TTI hook name got renamed at some point in the process and the target implementation was left behind. Fixes: SWDEV-407329	2023-07-06 12:14:14 -04:00
zhijian	d6d7f7b1d2	[AIX][XCOFF] print out the traceback info Summary: Adding a new option -traceback-table to print out the traceback info of an XCOFF object file. Reviewers: James Henderson, Fangrui Song, Stephen Peckham, Xing Xue Differential Revision: https://reviews.llvm.org/D89049	2023-07-06 11:47:08 -04:00
Simon Pilgrim	a69ffd6c73	[X86] isTargetShuffleEquivalent - ensure the reference operands are vector types Fixes #63700	2023-07-06 15:38:01 +01:00
Matt Arsenault	c70cae6315	AMDGPU: Make SIFixVGPRCopies preserve everything All this does is add uses of reserved registers, which aren't tracked by anything. Saves a loop info computation.	2023-07-06 10:26:21 -04:00
Matt Arsenault	8ee1cc82c9	AMDGPU: Fold out sign bit ops on frexp_exp The sign bit has no impact on the exponent, so strip these away. Saves on the source modifier encoding cost. I left the GlobalISel handling until there's a resolution to issue #62628. We should do this in instcombine too, but legalization should be introducing more frexps than it currently is where this would occur.	2023-07-06 10:26:21 -04:00
Paul Walker	90b83a6d6c	[SVE] Add isel for 32-bit add/sub(cntp()) -> incp/decp. Patterns already exist for 64-bit that I've simply copied and converted to include the necessary truncation. Differential Revision: https://reviews.llvm.org/D154350	2023-07-06 14:25:18 +00:00
Eduard Zingerman	09feee559a	[BPF] Undo transformation for LICM.cpp:hoistMinMax() Extended BPFCheckAndAdjustIR pass with sinkMinMax() transformation that undoes LICM hoistMinMax pass. The undo transformation converts the following patterns: x < min(a, b) -> x < a && x < b; x > min(a, b) -> x > a || x > b; x < max(a, b) -> x < a || x < b; x > max(a, b) -> x > a && x > b; where 'a' or 'b' is a constant. Also supports `sext min(...) ...` and `zext min(...) ...`. Differential Revision: https://reviews.llvm.org/D147990	2023-07-06 16:19:59 +03:00
Simon Pilgrim	c63be92fc8	[GlobalISel][X86] Regenerate add/sub legalization tests	2023-07-06 14:09:11 +01:00
Amy Kwan	598cccea80	[AIX][TLS] Generate optimized local-exec access code sequence using X-Form loads/stores This patch is a follow up to D149722, D152669 and D153645, where a slightly more optimized code sequence is generated for 64-bit and 32-bit local-exec accesses when optimizations are turned on. Handling is added to PPCISelDAGToDAG.cpp in order to check if any D-form loads or stores that follow a PPCISD::ADD_TLS can be optimized to use an X-Form load or store. In this particular situation, this allows the ADD_TLS node to be removed completely. Differential Revision: https://reviews.llvm.org/D150367	2023-07-06 07:57:05 -05:00
Alex Bradbury	619c6c0e38	[RISCV][test] Add RV32I and RV64I RUN lines to llvm.frexp.ll Thanks to D154555, these intrinsics no longer crash when used with a soft float ABI.	2023-07-06 13:36:03 +01:00
Ivan Kosarev	b4049b409b	[AMDGPU] Add GlobalISel test coverage for floating-point truncations. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154527	2023-07-06 11:37:09 +01:00
Simon Pilgrim	3f7470c33d	[X86] Fold BITOP(PACKSS(X,Z),PACKSS(Y,W)) --> PACKSS(BITOP(X,Y),BITOP(Z,W)) (REAPPLIED) Fold allsignbits pack patterns to make better use of cheap (and commutable) logic ops Reapplied after a32d14fd4c0a / 156913cb7764 with bitcast fix	2023-07-06 10:56:07 +01:00
Simon Pilgrim	bb65e5b881	[X86] Add base SSE2 i686 test coverage to vector bitlogic reduction tests	2023-07-06 10:56:07 +01:00
Simon Pilgrim	819d070e0e	[X86] Add base SSE2 i686 test coverage to vector bool reduction tests	2023-07-06 10:56:06 +01:00
Valery Pykhtin	98aa8439f5	[AMDGPU] Fix register class for a subreg in GCNRewritePartialRegUses. 1. Improved code that deduces register class from instruction definitions. Previously if some instruction didn't contain a reg class for an operand it was considered as no information on register class even if other instructions specified the class. 2. Added check on required size of resulting register because in some cases classes with smaller registers had been selected (for example VReg_1). Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D152832	2023-07-06 08:48:45 +02:00
Jianjian GUAN	a813a633d5	[RISCV][NFC] Use common prefix to simplify test. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D154487	2023-07-06 11:52:51 +08:00
Craig Topper	ee34fa0032	[RISCV] Add DAG combine for (fmv_w_x_rv64 (fmv_x_anyextw_rv64 X)) This pattern started showing up more after D151284	2023-07-05 19:35:13 -07:00
Matt Arsenault	e8ed6e35bd	DAG: Implement soften float for ffrexp Fixes #63661 https://reviews.llvm.org/D154555	2023-07-05 21:42:27 -04:00
Nemanja Ivanovic	7cd9084c69	Revert "[PowerPC] Remove extend between shift and and" This reverts commit a57236de4eb8f38b4201647b10146941cbbb5c0b. Causes a bootstrap failure on ppc64be.	2023-07-05 20:04:49 -04:00
Arthur Eubanks	156913cb77	Revert "[X86] Fold BITOP(PACKSS(X,Z),PACKSS(Y,W)) --> PACKSS(BITOP(X,Y),BITOP(Z,W))" This reverts commit a32d14fd4c0a43c154f251df1ccfe57e8b0a711a. Causes crashes, see https://reviews.llvm.org/rGa32d14fd4c0a43c154f251df1ccfe57e8b0a711a.	2023-07-05 14:52:57 -07:00
Matt Arsenault	20964c901a	DAG: Fix dropping flags when widening unary vector ops	2023-07-05 17:25:24 -04:00
Matt Arsenault	5491666248	AMDGPU: Correctly lower llvm.exp.f32 The library expansion has too many paths for all the permutations of DAZ, unsafe and the 3 exp functions. It's easier to expand it in the backend when we know all of these things. The library currently misses the no-infinity check on the overflow, which this handles optimizing out. Some of the <3 x half> fast tests regress due to vector widening dropping flags which will be fixed separately. Apparently there is no exp10 intrinsic, but there should be. Adds some deadish code in preparation for adding one while I'm following along with the current library expansion.	2023-07-05 17:23:49 -04:00
Matt Arsenault	ed556a1ad5	AMDGPU: Correctly lower llvm.exp2.f32 Previously this did a fast math expansion only.	2023-07-05 17:23:48 -04:00
Oskar Wirga	198df5f682	Weaken MFI Max Call Frame Size Assertion A year ago when I was not invested at all into compilers, I found an assertion error when building an AArch64 debug build with LTO + CFI, among other combinations. It was posted as a github issue here: https://github.com/llvm/llvm-project/issues/54088 I took it upon myself to revisit the issue now that I have spent some more time working on LLVM. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D151276	2023-07-05 14:02:51 -07:00
Matt Arsenault	9c82dc6a6b	AMDGPU: Always use v_rcp_f16 and v_rsq_f16 These inherited the fast math checks from f32, but the manual suggests these should be accurate enough for unconditional use. The definition of correctly rounded is 0.5ulp, but the manual says "0.51ulp". I've been a bit nervous about changing this as the OpenCL conformance test does not cover half. Brute force produces identical values compared to a reference host implementation for all values.	2023-07-05 16:53:01 -04:00
Matt Arsenault	59c311c5d4	AMDGPU: Add more tests for f16 fdiv lowering Probably should merge the DAG and gisel tests.	2023-07-05 16:53:01 -04:00
Nemanja Ivanovic	a57236de4e	[PowerPC] Remove extend between shift and and The SDAG will sometimes insert an extend between the shift and an and (immediate) even though the immediate is narrower than the narrow size. This does not allow us to produce a rotate instruction (such as rlwinm). This patch just adds a combine to move the extend onto the and. Differential revision: https://reviews.llvm.org/D152911	2023-07-05 16:33:07 -04:00
Amaury Séchet	872276de4b	[NFC] Autogenerate CodeGen/SystemZ/int-{uadd,sub}-0*.ll	2023-07-05 20:14:43 +00:00
Philip Reames	403261eafd	[RISCV] Remove legacy TA/TU pseudo distinction for load instructions This change continues with the line of work discussed in https://discourse.llvm.org/t/riscv-transition-in-vector-pseudo-structure-policy-variants/71295. This change targets all the pseudos used in loads (unit, strided, segmented, fault first, and their combinations). As with previous changes in the series, we replace the existing TA and TU forms with a single unified pseudo with a passthru (which may be implicit_def) and a policy operand. One quirk is that I went ahead and treated the unmasked mask load instruction (vlm) the same way. We need the pass thru operand to model tail undefined, but since the instruction is unconditionally agnostic and the instruction has no mask, the policy operand is arguably unneeded. I kept it mostly for consistency sake. Another quirk worth highlighting is that segment loads require a bit of dedicated handling. Surprisingly, we don't have IMPLICIT_DEF nodes of the right types, and attempting to use them results in some odd looking codegen and a few crashes. Instead, I left the REG_SEQUENCE form, and extended InsertVSETVLI to recognize the complex undefs. Arguably, we should probably revisit the handling of undef reg_sequence nodes here, but I'm hoping to side step that in this patch. As before, we see codegen changes (some improvements and some regressions) due to scheduling differences caused by the extra implicit_def instructions. I did have to delete one register allocation regression test as I couldn't figure out how to meaningfully update it. I spent a significant amount of time trying, and finally gave up. Differential Revision: https://reviews.llvm.org/D154141	2023-07-05 13:11:58 -07:00
Matt Arsenault	4e15f378ee	AMDGPU: Correctly lower llvm.log.f32 and llvm.log10.f32 Previously we expanded these in a fast-math way and the device libraries were relying on this behavior. The libraries have a pending change to switch to the new target intrinsic. Unlike the library version, this takes advantage of no-infinities on the result overflow check.	2023-07-05 15:30:35 -04:00
Luke Lau	1039aec30b	[RISCV] Fix interleave/deinterleave store test output Looks like the output changed after rebasing	2023-07-05 19:52:50 +01:00
Luke Lau	ea62fc79e7	[RISCV] Lower deinterleave2 intrinsics to vlseg2 Following from D153864, this patch implements the lowerDeinterleaveIntrinsic hook to lower deinterleaves of loads into vlseg2 intrinsics. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D153876	2023-07-05 19:24:15 +01:00
Luke Lau	86a9bbfdb3	[RISCV] Add tests for vector.deinterleave2s of loads Reviewed By: reames Differential Revision: https://reviews.llvm.org/D153875	2023-07-05 19:24:10 +01:00
Luke Lau	70093fcf6c	[RISCV] Lower interleave2 intrinsics to vsseg2 This patch teaches the RISCV TargetLowering class to lower interleave intrinsics to vsseg2, so it can lower interleaved stores for scalable vectors. Previously, we could only lower stores of interleaves for fixed length vectors with vector shuffles. This uses the lowerInterleaveIntrinsic interface for the interleaved access pass that was added in D146218, and subsumes the DAG combine approach taken in D144175 Reviewed By: reames Differential Revision: https://reviews.llvm.org/D153864	2023-07-05 19:24:05 +01:00
Luke Lau	d914686da2	[RISCV] Add tests for stores of vector.interleave2 Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D153863	2023-07-05 19:24:01 +01:00
Yusra Syeda	163aad6bcb	[SystemZ][z/OS] z/OS ADA codegen and emission This patch adds support for the ADA (associated data area), doing the following: -Creates the ADA table to handle displacements -Emits the ADA section in the SystemZAsmPrinter -Lowers the ADA_ENTRY node into the appropriate load instruction Differential Revision: https://reviews.llvm.org/D153788	2023-07-05 13:21:52 -04:00
Igor Kirillov	7f20407cee	[CodeGen] Add support for Splats in ComplexDeinterleaving pass This commit allows generating of complex number intrinsics for expressions with constants or loops invariants, which are represented as splats. For instance, after vectorizing loops in the following code snippets, the ComplexDeinterleaving pass will be able to generate complex number intrinsics: ``` complex<> x = ...; for (int i = 0; i < N; ++i) c[i] = a[i] * b[i] * x; ``` or ``` for (int i = 0; i < N; ++i) c[i] = a[i] * b[i] * (11.0 + 3.0i); ``` Differential Revision: https://reviews.llvm.org/D153355	2023-07-05 17:02:52 +00:00
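
Note on 61820f8b5d (is.fpclass fcZero|fcSubnormal): the message says the two class checks are combined into a single check that the exponent bits are 0. The sketch below illustrates that idea for IEEE-754 binary32 values in plain Python. It is not the LLVM lowering code; the names `EXP_MASK`, `f32_bits`, `is_zero_or_subnormal` and `reference_check` are hypothetical and chosen only for this illustration.

```python
import math
import struct

# Assumption: IEEE-754 binary32 layout (1 sign bit, 8 exponent bits, 23 fraction bits).
EXP_MASK = 0x7F800000


def f32_bits(x: float) -> int:
    """Round x to binary32 and return its raw 32-bit pattern."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def to_f32(x: float) -> float:
    """Round x to the nearest binary32 value, returned as a Python float."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def is_zero_or_subnormal(x: float) -> bool:
    """Single combined check: all exponent bits are zero."""
    return (f32_bits(x) & EXP_MASK) == 0


def reference_check(x: float) -> bool:
    """Two separate checks (zero, or magnitude below the smallest normal)."""
    y = to_f32(x)
    if math.isnan(y) or math.isinf(y):
        return False
    return y == 0.0 or abs(y) < 2.0 ** -126


if __name__ == "__main__":
    samples = [0.0, -0.0, 2.0 ** -149, 1e-40, -1e-40,
               2.0 ** -126, 1.0, -1.0, float("inf"), float("nan")]
    for s in samples:
        assert is_zero_or_subnormal(s) == reference_check(s)
    print("combined exponent check agrees with the two separate checks")
```

The sign bit is masked out, so positive and negative zeros and subnormals are treated alike, while infinities and NaNs are rejected because their exponent bits are all ones.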

52796 Commits