llvm-project

Author	SHA1	Message	Date
Fangrui Song	2417618d5c	[Verifier] Reject dllexport with non-default visibility Add a visibility check for dllimport and dllexport. Note: dllimport with a non-default visibility (implicit dso_local) is already rejected, but with a less clear dso_local error. The MC level visibility `MCSA_Exported` (D123951) is mapped from IR level default visibility when dllexport is specified. The D123951 error is now very difficult to trigger (needs to disable the IR verifier). Reviewed By: mstorsjo Differential Revision: https://reviews.llvm.org/D133267	2022-09-05 10:53:41 -07:00
Amara Emerson	511f2169a8	[GlobalISel] Update combine-build-vector.mir test checks before patch.	2022-09-05 16:06:05 +01:00
Amara Emerson	22b6a4fcac	[GlobalISel] Update test checks before a patch.	2022-09-05 15:24:07 +01:00
David Sherwood	ffa6267300	[CodeGen] Support extracting fixed-length vectors from illegal scalable vectors For some indices we can simply extract the fixed-length subvector from the low half of the scalable vector, for example when the index is less than the minimum number of elements in the low half. For all other cases we can expand the operation through the stack by storing out the vector and reloading the fixed-length part we need. Fixes https://github.com/llvm/llvm-project/issues/55412 Tests added here: CodeGen/AArch64/sve-extract-fixed-from-scalable-vector.ll Differential Revision: https://reviews.llvm.org/D117499	2022-09-05 15:05:14 +01:00
Ivan Kosarev	5db8d6fd2b	[AMDGPU][CodeGen] Support (base | offset) SMEM loads. Prevents generation of unnecessary s_or_b32 instructions. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D132552	2022-09-05 14:22:06 +01:00
Andrey Tretyakov	1268cf6454	[SPIRV] Add tests to improve test coverage Differential Revision: https://reviews.llvm.org/D133265	2022-09-05 15:52:01 +03:00
Ivan Kosarev	1f550d86b2	[AMDGPU][CodeGen] Pre-commit a test on (base | offset) SMEM loads for D132552. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D133021	2022-09-05 13:12:43 +01:00
Ivan Kosarev	f33645301e	[AMDGPU][CodeGen] Support (soffset + offset) s_buffer_load's. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D130263	2022-09-05 12:53:05 +01:00
Amara Emerson	fb60e50c78	[GlobalISel] Fix a combine crash due to a negative G_INSERT_VECTOR_ELT idx. These should really be folded away to undef but we shouldn't crash in any case.	2022-09-05 12:10:17 +01:00
Simon Pilgrim	4e6783f866	[DAG] getFreeze()/getNode() - account for operand depth when calling isGuaranteedNotToBeUndefOrPoison (PR57554) Similar to #57402 - we were calling isGuaranteedNotToBeUndefOrPoison on the freeze operand (with Depth = 0), but wasn't accounting for the fact that a later isGuaranteedNotToBeUndefOrPoison assertion will call from the new node (with Depth = 0 as well) - which will then recursively call isGuaranteedNotToBeUndefOrPoison for its operands with Depth = 1 Fixes #57554	2022-09-05 11:46:46 +01:00
Craig Topper	0d1d36cfa6	[X86] Pre-commit tests for D130862. NFC	2022-09-04 21:19:01 -07:00
gonglingqin	bc743bf666	[LoongArch] Add codegen support for fcopysign Differential Revision: https://reviews.llvm.org/D133185	2022-09-05 11:03:54 +08:00
Simon Pilgrim	e438ce5694	[AMDGPU] Add -verify-machineinstrs to attr-amdgpu-flat-work-group-size* tests These were affected by D131825 (and reported on Issue #57149) - adding the verification will help ensure that we don't hit this again on builds with EXPENSIVE_CHECKS enabled	2022-09-03 13:47:41 +01:00
Simon Pilgrim	62cdfdab4d	[DAG] canCreateUndefOrPoison - add freeze(insert_subvector(x,y,c)) -> insert_subvector(freeze(x),freeze(y),c) support We already have plenty of assertions in place to ensure that the insertion index is constant and inrange	2022-09-03 13:41:33 +01:00
Simon Pilgrim	3968844bff	[X86] Add test showing failure to fold freeze(insert_subvector(x,y,c)) -> insert_subvector(freeze(x),freeze(y),c) If at least one of x and y are known never poison.	2022-09-03 13:27:08 +01:00
Daniil Fukalov	b4e1b0e00d	[LiveIntervals] Split live intervals on any dead def Each dead def of the same virtual register is required to be split into multiple virtual registers with separate live intervals to avoid MachineVerifier error. Partially fixes https://github.com/llvm/llvm-project/issues/56050 and https://github.com/llvm/llvm-project/issues/56051 Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D130477	2022-09-02 20:00:22 +03:00
Andrey Tretyakov	f20c9c42d2	[SPIRV] Add tests to improve test coverage Differential Revision: https://reviews.llvm.org/D132903	2022-09-02 13:19:28 +03:00
WANG Xuerui	2dd434c3ee	[LoongArch] Support lowering br_jt Jump tables cannot be generated yet, due to missing support for emitting local addresses. Differential Revision: https://reviews.llvm.org/D132653	2022-09-02 17:57:50 +08:00
Andrey Tretyakov	13453c9861	[SPIRV] Add tests to improve test coverage Differential Revision: https://reviews.llvm.org/D132817	2022-09-02 11:59:18 +03:00
Nuno Lopes	858fe8664e	Expand Div/Rem: consider the case where the dividend is zero So we can't use ctlz in poison-producing mode	2022-09-01 17:04:26 +01:00
Matt Devereau	b9062ceffc	[AArch64][SVE] Add floating-point repeated complex pattern llc tests	2022-09-01 15:04:59 +00:00
Nikita Popov	5134bd432f	[DwarfEhPrepare] Assign dummy debug location for inserted _Unwind_Resume calls (PR57469) DwarfEhPrepare inserts calls to _Unwind_Resume into landing pads. If _Unwind_Resume happens to be defined in the same module and debug info is used, then this leads to a verifier error: inlinable function call in a function with debug info must have a !dbg location call void @_Unwind_Resume(ptr %exn.obj) #0 Fix this by assigning a dummy location to the call. (As this happens in the backend, inlining is not actually relevant here.) Fixes https://github.com/llvm/llvm-project/issues/57469. Differential Revision: https://reviews.llvm.org/D133095	2022-09-01 16:35:49 +02:00
Ilia Diachkov	698c800142	[SPIRV] support builtin types and ExtInsts selection The patch adds the support of OpenCL and SPIR-V built-in types. It also implements ExtInst selection and adds spv_unreachable and spv_alloca intrinsics which improve the generation of the corresponding SPIR-V code. Five LIT tests are included to demonstrate the improvement. Differential Revision: https://reviews.llvm.org/D132648 Co-authored-by: Aleksandr Bezzubikov <zuban32s@gmail.com> Co-authored-by: Michal Paszkowski <michal.paszkowski@outlook.com> Co-authored-by: Andrey Tretyakov <andrey1.tretyakov@intel.com> Co-authored-by: Konrad Trifunovic <konrad.trifunovic@intel.com>	2022-09-01 16:44:54 +03:00
Amara Emerson	4cf3db41da	[GlobalISel] Add sdiv exact (X, constant) -> mul combine. This port of the SDAG optimization is only for exact sdiv case. Differential Revision: https://reviews.llvm.org/D130517	2022-09-01 13:34:00 +01:00
gonglingqin	6e47ebdcec	[LoongArch] Support ISD::BR_CC and branch according to condition flag register Use bceqz/bcnez instead of movcf2gr + bnez/beqz for branch jumps. Differential Revision: https://reviews.llvm.org/D132824	2022-09-01 10:43:16 +08:00
Sam Clegg	349a2c37f9	[WebAssembly][MC] Update tests after recent removal of .size directives for functions These were missing from https://reviews.llvm.org/D132929	2022-08-31 14:54:13 -07:00
Nikita Popov	ab6876a40d	reland: [Local] Allow creating callbr with duplicate successors Since D129288, callbr is allowed to have duplicate successors. This patch removes a limitation which prevents optimizations from actually producing such callbrs. This is probably the riskiest of all the recent callbr changes, because code with incorrect assumptions might be lurking somewhere. I fixed the one case I encountered ahead of time in `8201e3ef5c`. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D129997 Originally landed as commit 08860f525a23 ("[Local] Allow creating callbr with duplicate successors") Reverted in commit 1cf6b93df168 ("Revert "[Local] Allow creating callbr with duplicate successors"")	2022-08-31 13:23:00 -07:00
Nick Desaulniers	d7474bef77	[llvm][TailDuplicator] don't taildup isInlineAsmBrIndirectTargets This fixes a crash observed after https://reviews.llvm.org/D129997. Similar to D88823. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D130127	2022-08-31 13:07:10 -07:00
Nick Desaulniers	86b6b39411	[llvm][test] precommit test for D130127 Add a FIXME that will be immediately removed in D130127. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D132609	2022-08-31 12:40:52 -07:00
Simon Pilgrim	eaede4b5b7	[DAG] extractShiftForRotate - replace assertion for shift opcode with an early-out We feed the result from the first extractShiftForRotate call into the second, and that result might no longer be a shift op (usually due to constant folding). NOTE: We REALLY need to stop creating nodes on the fly inside extractShiftForRotate! Fixes Issue #57474	2022-08-31 15:50:48 +01:00
Hassnaa Hamdi	a6d9c944df	[AArch64 - SVE]: Use SVE to lower reduce.fadd. Differential Revision: https://reviews.llvm.org/D132573 skip custom-lowering for v1f64 to be expanded instead, because it has only one lane Differential Revision: https://reviews.llvm.org/D132959	2022-08-31 12:31:06 +00:00
Hassnaa Hamdi	d8655bdeb4	[AArch64-SVE-fixed]: change vscale_range<2,0> to vscale_range<1,0> for 64/128-bit vectors of fadda tests	2022-08-31 11:40:46 +00:00
Simon Pilgrim	9d22800275	[DAG] visitFreeze - account for operand depth when calling isGuaranteedNotToBeUndefOrPoison (PR57402) We were calling isGuaranteedNotToBeUndefOrPoison on operands (with Depth = 0), but wasn't accounting for the fact that a later isGuaranteedNotToBeUndefOrPoison assertion will call from the new node (with Depth = 0 as well) - which will then recursively call isGuaranteedNotToBeUndefOrPoison for its operands with Depth = 1 Fixes #57402	2022-08-31 12:20:30 +01:00
gonglingqin	fb9d67636a	[LoongArch] Support floating-point number reciprocal Differential Revision: https://reviews.llvm.org/D132847	2022-08-31 14:20:46 +08:00
Xiang Li	6917799e37	[DirectX backend] change MinVectorRegisterBitWidth to 32. This is to avoid vector-combine generate vector4 on float. Reviewed By: beanz Differential Revision: https://reviews.llvm.org/D132826	2022-08-30 23:20:12 -07:00
Kai Luo	ad2f7fd286	[AtomicExpand] Make floating point conversion happen before fence insertion IIUC, the conversion part is not part of atomic operations and fences should be put around converted atomic operations. This also fixes atomic loads of floating point values, which require a fence on PowerPC. Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D127609	2022-08-31 09:54:58 +08:00
Markus Böck	2fdf963daf	[GlobalISel] Explicitly fail trying to translate `gc.statepoint` and related intrinsics The provided testcase would previously fail with an assertion due to later down below trying to allocate registers for `token` return types and arguments. This is especially problematic as the process would then exit instead of falling back to using FastIsel. This patch fixes that by simply explicitly failing translation if either of these intrinsics are encountered. Fixes https://github.com/llvm/llvm-project/issues/57349 Differential Revision: https://reviews.llvm.org/D132974	2022-08-31 00:47:17 +02:00
Mingming Liu	4df696fbe9	[NFC] Move a test case across files. The test case is about a pmull2 instruction being generated rather than a SIMD ldr, so aarch64-pmull2.ll is a better test file. Differential Revision: https://reviews.llvm.org/D132277	2022-08-30 14:16:28 -07:00
Craig Topper	893f5e95e2	[RISCV] Improve isel of AND with shiftedMask containing 32 leading zeros and some trailing zeros. We can use srliw to shift out the trailing bits and slli to shift back in zeros. The sign extend of srliw will 0 the upper 32 bits since we will be shifting a 0 into bit 31.	2022-08-30 12:22:46 -07:00
Stanislav Mekhanoshin	fd1f8c85f2	[AMDGPU] Limit TID / wavefrontsize uniformness to 1D kernels If a kernel has uneven dimensions, the value of workitem-id-x divided by the wavefrontsize can be non-uniform. For example, dimensions (65, 2) will have workitems with addresses (64, 0) and (0, 1) packed into the same wave, which gives 1 and 0 respectively after the division by 64. Unfortunately, this limits the optimization to OpenCL only, and only if the reqd_work_group_size attribute is set. This patch limits it to 1D kernels, although it should be possible to perform this optimization if the size of the X dimension is a power of 2; we just do not currently have the infrastructure to query it. Note that the presence of the amdgpu-no-workitem-id-y attribute does not help, as it only hints at the lack of a workitem-id-y query, not the absence of an actual 2nd dimension, and therefore affects just the SGPR allocation. Differential Revision: https://reviews.llvm.org/D132879	2022-08-30 12:22:08 -07:00
Justin Bogner	f9433161f5	[AMDGPU] Precommit two tests showing missed combines to v_med3	2022-08-30 11:56:09 -07:00
Stephen Long	40999cbd93	[SVE] Fix SVEDup0 matching -0.0f Because of D128669, CPY is being used to zero active lanes even in the case of -0.0f. This patch checks for floating point positive zero. That way SVEDup0 won't match -0.0f. Fixes https://github.com/llvm/llvm-project/issues/57428 Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D132880	2022-08-30 11:07:17 -07:00
Jon Chesterfield	9b0b912e15	[amdgpu][nfc] Add test case showing false aliasing in LDS lowering	2022-08-30 15:33:57 +01:00
Thomas Symalla	d26dd37149	[NFC][AMDGPU] Pre-commit tests for D132837. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D132930	2022-08-30 13:55:36 +02:00
Tomas Matheson	050dad57f7	[AArch64][GISel] constrain regclass for 128->64 copy When selecting G_EXTRACT to COPY for extracting a 64-bit GPR from a 128-bit register pair (XSeqPair) we know enough to constrain the destination register class to gpr64. Without this it may have only a register bank and some copy elimination code would assert while assuming that a register class existed. The register class has to be set explicitly because we might hit the COPY -> COPY case where register class can't be inferred. This would cause the following to crash in selection, where the store is commented (otherwise the store constrains the register class): define dso_local i128 @load_atomic_i128_unordered(i128* %p) { %pair = cmpxchg i128* %p, i128 0, i128 0 acquire acquire %val = extractvalue { i128, i1 } %pair, 0 ; store i128 %val, i128* %p ret i128 %val } Differential Revision: https://reviews.llvm.org/D132665	2022-08-30 11:02:51 +01:00
Tomas Matheson	9a390d6692	[AArch64][GISel] fix G_ADD/G_SUB legalization widenScalarDst updates the insert point to after MI, so widenScalarSrc must be called before widenScalarDst. Otherwise the updated Src values will appear after MI and break SSA. e.g.: %14:_(s64), %15:_(s1) = G_UADDE %9:_, %11:_, %13:_ becomes %14:_(s64), %16:_(s32) = G_UADDE %9:_, %11:_, %17:_ %15:_(s1) = G_TRUNC %16:_(s32) %17:_(s32) = G_ZEXT %13:_(s1) Differential Revision: https://reviews.llvm.org/D132547 Change-Id: Ie3458747a6879433f4d5ab9939d2bd102dd0f2db	2022-08-30 10:59:32 +01:00
Ting Wang	710923cdc8	[PowerPC] CTRLoop pseudo instructions should not be duplicated Add isNotDuplicable to CTRLoop pseudo instructions, to prevent other passes such as early-tailduplication from breaking the loop structure by duplicating pseudo instructions. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D132738	2022-08-30 04:32:29 -04:00
Ting Wang	f908cbc36f	[NFC][PowerPC] Add test case to show ctrloop mi shall not be duplicated Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D132899	2022-08-30 01:57:22 -04:00
wanglian	e2bb9774b1	[LegalizeTypes] Support widen result for VECTOR_REVERSE. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D132359	2022-08-30 10:01:26 +08:00
Craig Topper	e25eb61d03	[RISCV] Enable (srl (and X, C2), C) to form SRLIW in more cases. Don't require that the AND has one use and don't depend on targetShrinkDemandedConstant turning C2 into 0xffffffff. Instead, check that the constant is 0xffffffff after replacing any bits that will be shifted out with 1s. Another way to fix this might be to prevent SimplifyDemandedBits from destroying the ANDI after type legalization using targetShrinkDemandedBits. That would prevent the CSE that created this mess. targetShrinkDemandedBits is currently only enabled after legalize ops. A quick experiment shows we can't just change when it runs; we would need to try a different heuristic for post type legalization.	2022-08-29 15:52:08 -07:00


44749 Commits