llvm-project

Author	SHA1	Message	Date
Fangrui Song	5240e0b891	[VE,test] Change llc -march= to -mtriple= Similar to 806761a7629df268c8aed49657aeccffa6bca449 -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $ve-apple-darwin as ELF instead of rejecting it outright.	2024-12-15 10:24:14 -08:00
Kazushi (Jam) Marukawa	2e2395651e	[VE] Change the way of lowering store Change lowering store iff the data operand is legalized. In this way, LLVM can lower only the operands first, then lower the store instruction later. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D158253	2023-08-18 17:13:55 +09:00
Kazushi (Jam) Marukawa	922ac64b04	[VE] Avoid vectorizing store/load in scalar mode Avoid vectorizing store and load instructions in scalar mode. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D158049	2023-08-17 02:15:54 +09:00
Craig Topper	a983ef2c17	[DAGCombiner][AArch64][VE] Teach BuildUDIV/SDIV to use 2x mul when mulh/mul_lohi are not available. Correct the legality of i32 mul_lohi on AArch64. Previously, AArch64 incorrectly reported i32 mul_lohi as Legal. This allowed BuildUDIV/SDIV to use them. A later DAGCombiner would replace them with MULHS/MULHU because only the high half was used. This conversion does not check the legality of MULHS/MULHU under the assumption that LegalizeDAG can turn it back into MUL_LOHI later. After they are converted to MULHS/MULHU, DAGCombine ran and saw that these operations aren't supported but an i64 MUL is. So they get converted to that plus a shift. Without this, LegalizeDAG would convert back MUL_LOHI and isel would fail to find a pattern. This patch teaches BuildUDIV/SDIV to create the wide mul and shift so that we can report the correct operation legality on AArch64. It also enables div by constant folding for more cases on VE. I don't know if VE wants this div by constant optimization or not. If they don't want it, they can use the isIntDivCheap hook to disable it. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D150333	2023-05-12 09:06:17 -07:00
Nikita Popov	b006b60dc9	[VE] Convert some tests to opaque pointers (NFC)	2022-12-19 13:06:34 +01:00
Sanjay Patel	fe05a0a3dd	[SDAG] avoid udiv/urem transform for vector/scalar type mismatches This solves the crashing from issue #58994. I don't know anything about VE, so I don't know if the output is as expected or even correct.	2022-11-15 11:01:18 -05:00
Peter Rong	c2e7c9cb33	[CodeGen] Using ZExt for extractelement indices. In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`. This is because IRTranslator uses SExt for indices. In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt. This change includes both documentation, SelectionDAG and IRTranslator. We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86 This patch fixes issue #57452. Differential Revision: https://reviews.llvm.org/D132978	2022-10-15 15:45:35 -07:00
Craig Topper	1121eca685	[VP][VE] Default VP_SREM/UREM to Expand and add generic expansion using VP_SDIV/UDIV+VP_MUL+VP_SUB. I want to default all VP operations to Expand. These 2 were blocking because VE doesn't support them and the tests were expecting them to fail a specific way. Using Expand caused them to fail differently. Seemed better to emulate them using operations that are supported. @simoll mentioned on Discord that VE has some expansion downstream. Not sure if it's done like this or in the VE target. Reviewed By: frasercrmck, efocht Differential Revision: https://reviews.llvm.org/D133514	2022-09-16 13:19:02 -07:00
Kazushi (Jam) Marukawa	469044cfd3	[VE] Support load/store/spill of vector mask registers Support load/store/spill of vector mask registers and add regression tests. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D129415	2022-07-19 10:29:21 +09:00
Kazushi (Jam) Marukawa	adbb46ea65	[VE] Support load/store vm registers Support load/store vm registers to memory location as a first step. As a next step, support load/store vm registers to stack location. This patch also adds several regression tests for not only load/store vm registers but also missing load/store for vr registers. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D128610	2022-07-01 08:25:24 +09:00
Daniil Kovalev	62a983ebc5	Revert "[CodeGen] Place SDNode debug ID declaration under appropriate #if" This reverts commit 83a798d4b0e17ac41d5430f1290d3661343eee1e. As discussed in D120714 with @thakis, the patch added unneeded complexity without noticeable benefits.	2022-04-06 20:32:53 +03:00
Daniil Kovalev	83a798d4b0	[CodeGen] Place SDNode debug ID declaration under appropriate #if Place PersistentId declaration under #if LLVM_ENABLE_ABI_BREAKING_CHECKS to reduce memory usage when it is not needed. Differential Revision: https://reviews.llvm.org/D120714	2022-04-06 14:09:32 +03:00
Craig Topper	49c2206b3b	[VP] Preserve address space of pointer for strided load/store intrinsics. This adds LLVMAnyPointerToElt to use instead of LLVMPointerToElt. This allows us to preserve the address space as part of the type overload for the intrinsic, but still require the vector element type to match the pointer type. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D122042	2022-03-22 09:52:54 -07:00
Jake Egan	c7dc9dbaff	[VE] Remove output to /dev/stdout Sending output to /dev/stdout on AIX gets an llc permission denied error, so this patch removes this from the tests. Reviewed By: simoll, hubert.reinterpretcast Differential Revision: https://reviews.llvm.org/D121799	2022-03-16 11:42:09 -04:00
Simon Moll	91fad1167a	[VE] v512|256 f32|64 fneg isel and tests fneg instruction isel and tests. We do this also in preparation of fused negate-multiply-add fp operations. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D121620	2022-03-16 11:31:26 +01:00
Simon Moll	6ac3d8ef9c	[VE] strided v256.32 isel and tests ISel for experimental.vp.strided.load|store for v256.32 types via lowering to vvp_load|store SDNodes. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D121616	2022-03-15 15:29:19 +01:00
Simon Moll	3297571e32	[VE] v256f32|64 fma isel llvm.fma|fmuladd vp.fma isel and tests Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D121477	2022-03-14 15:59:13 +01:00
Simon Moll	f318d1e26b	[VE] v256i32|64 reduction isel and tests and|add|or|xor|smax v256i32|64 isel and tests for vp and vector.reduce intrinsics Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D121469	2022-03-14 11:10:38 +01:00
Simon Moll	a5f1262332	[VE] v256.32|64 gather|scatter isel and tests This adds support for v256.32|64 scatter|gather isel. vp.gather|scatter and regular gather|scatter intrinsics are both lowered to the internal VVP layer. Splitting these ops on v512.32 is the subject of future patches. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D121288	2022-03-14 10:38:56 +01:00
Simon Moll	9ebaec461a	[VE] (masked) load|store v256.32|64 isel Add `vvp_load|store` nodes. Lower to `vld`, `vst` where possible. Use `vgt` for masked loads for now. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D120413	2022-03-02 13:31:29 +01:00
Simon Moll	f27423027d	[VE] Enable v256 fcmp true|false tests The broadcast patterns for all-true|false masks are available now. Enable the true|false fcmp predicate tests that use them. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D119936	2022-02-18 13:26:18 +01:00
Simon Moll	53efbc15cb	[VE] v256i1 broadcast isel and tests Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D119241	2022-02-15 12:40:51 +01:00
Simon Moll	ce48fe47af	[VE] v256i1 and|or|xor isel and tests Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D119239	2022-02-14 08:47:06 +01:00
Simon Moll	ae1bb44ed8	[VE] v256.32|64 setcc isel and tests Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D119223	2022-02-08 13:20:55 +01:00
Simon Moll	43994e9a4a	[VE] vp_select+vectorBinOp passthru isel and tests Extend the VE binaryop vector isel patterns to use passthru when the result of a SDNode is used in a vector select or merge. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D117495	2022-01-18 11:31:14 +01:00
Simon Moll	95bf5ac8a8	[VE] select|vp.merge|vp.select v256 isel and tests Use the `VMRG` for all three operations for now. `vp_select` will be used in passthru patterns. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D117206	2022-01-17 15:58:54 +01:00
Simon Moll	b2cea573c9	[VE] FADD,FSUB,FMUL,FDIV v256f32|f64 isel and tests Depends on D115940 for the `Binary_rv_vr_vv` pattern class op isel fragment used for divisions. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D116035	2021-12-21 09:15:31 +01:00
Simon Moll	8c51812913	[VE] U|SDIV v256i32|64 isel and tests Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D115940	2021-12-21 08:51:01 +01:00
Simon Moll	676af1272b	[VE] SHL,SRA,SRL v256i32|64 isel and tests Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D115734	2021-12-15 11:32:18 +01:00
Simon Moll	6847379e89	[VE] MUL,SUB,OR,XOR v256i32|64 isel v256i32|i64 isel patterns and tests. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D115643	2021-12-14 13:23:48 +01:00
Simon Moll	444013d324	[VE][NFC] Use POSIX-compatible stream redirection Drop Bash-style stream redirect in favor of POSIX stream redirection to fix spurious test failures on Windows. Failure: https://lab.llvm.org/buildbot/#/builders/123/builds/7509/steps/8/logs/stdio	2021-12-01 17:28:57 +01:00
Simon Moll	611d3c63f3	[VP] ISD helper functions [VE] isel for vp_add, vp_and This implements vp_add, vp_and for the VE target by lowering them to the VVP_* layer. We also add helper functions for VP SDNodes (isVPSDNode, getVPMaskIdx, getVPExplicitVectorLengthIdx). Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D93766	2021-01-08 14:29:45 +01:00
Simon Moll	eeba70a463	[VE] Expand single-element BUILD_VECTOR to INSERT_VECTOR_ELT We do this mostly to be able to test the insert_vector_elt isel patterns. As long as we don't, most single element insertions show up as `BUILD_VECTOR` in the backend. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D93759	2021-01-08 11:48:01 +01:00
Simon Moll	d1b606f897	[VE] Extract & insert vector element isel Isel and tests for extract_vector_elt and insert_vector_elt. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D93687	2021-01-08 11:46:59 +01:00
Simon Moll	c3acda0798	[VE] Vector 'and' isel and tests Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D93709	2020-12-23 13:29:29 +01:00
Kazushi (Jam) Marukawa	8c2ad9e85f	[VE] Correct VMP allocation in calling conv VE used to allocate VM1, VM2, VMP2 (VM4+VM5), and VM3. This patch corrects to allocate VM1, VM2, VMP2 (VM4+VM5), and VM6. Also add a regression test. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93570	2020-12-21 22:42:24 +09:00
Kazushi (Jam) Marukawa	af83b74dc2	[VE] Support copy of vector mask registers Support VM and VMP registers in copyPhysReg() function. Also add regression tests. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D93547	2020-12-19 09:16:43 +09:00
Kazushi (Jam) Marukawa	e954ba28bc	[VE][NFC] Disable VP tests VP tests recently added don't work on Release mode. They work on Debug mode, so I disable them on Release mode to make tests work.	2020-12-10 15:13:05 +09:00
Simon Moll	3ffbc79357	[VP] Build VP SDNodes Translate VP intrinsics to VP_* SDNodes. The tests check whether a matching vp_* SDNode is emitted. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D91441	2020-12-09 11:36:51 +01:00
Kazushi (Jam) Marukawa	686988a50f	[VE] Optimize prologue/epilogue instructions Optimize the FP elimination mechanism. This time, optimize functions that have no calls but do have fixed stack objects. LLVM eliminates FP on such functions now. Also, optimize GOT/PLT register save/restore instructions if a given function doesn't use them. In addition, remove the mechanism for generating `.cfi` instructions since those are taken from other architectures and not inspected yet. Update regression tests, also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92251	2020-11-30 22:22:33 +09:00
Kazushi (Jam) Marukawa	44a679eaa4	[VE] Change the behaviour of truncate Change the way to truncate i64 to i32 in I64 registers. VE assumed sext values previously. Change it to zext values this time to make it match to the LLVM behaviour. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D92226	2020-11-30 22:12:45 +09:00
Simon Moll	b955c7e630	[VE] VE Vector Predicated SDNode, vector add isel and tests VE Vector Predicated (VVP) SDNodes form an intermediate layer between VE vector instructions and the initial SDNodes. We introduce 'vvp_add' with isel and tests as the first of these VVP nodes. VVP nodes have a mask and explicit vector length operand, which we will make proper use of later. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D91802	2020-11-23 17:17:07 +01:00
Simon Moll	ffe6c97f6b	[VE] VEC_BROADCAST, lowering and isel This defines the vec_broadcast SDNode along with lowering and isel code. We also remove unused type mappings for the vector register classes (all vector MVTs that are not used in the ISA go). We will implement support for short vectors later by intercepting nodes with illegal vector EVTs before LLVM has had a chance to widen them. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D91646	2020-11-19 09:44:56 +01:00
Kazushi (Jam) Marukawa	44a4f93925	[VE] Optimize leaf functions Optimize leaf functions by not generating save/restore for callee saved registers. Update regression tests also. Reviewed By: simoll Differential Revision: https://reviews.llvm.org/D91539	2020-11-17 00:38:01 +09:00
Simon Moll	a598c08ac8	[VE] fastcc and vreg-to-vreg copy This defines a 'fastcc' for the VE target and implements vreg-to-vreg copy for parameter passing. The 'fastcc' extends the standard CC for SX-Aurora with register passing of vector-typed parameters and return values. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D90842	2020-11-16 16:24:22 +01:00
Simon Moll	351c10cc72	[VE] Add +vpu attribute `+vpu` controls whether VEISelLowering adds any vregs. This defaults to `-vpu` to have scalar code generation out of the box. We bring up vector isel under the `+vpu` flag. Once vector isel is stable we switch to `+vpu` and advertise vregs and vops in TTI. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D90465	2020-11-04 12:42:00 +01:00

46 Commits