llvm-project

Author	SHA1	Message	Date
Amara Emerson	65f946cba4	[RISCV] Fix some GlobalISel tests using -march instead of -mtriple. This caused llc to assume the wrong target triple and broke some internal AS sanitizer bots.	2023-10-19 16:30:47 -07:00
Min-Yih Hsu	e353cd8173	[RISCV] Apply `IsSignExtendingOpW = 1` on `fcvtmod.w.d` (#69633 ) Such that RISCVOptWInstrs can eliminate the redundant sign extend.	2023-10-19 14:55:33 -07:00
Tobias Stadler	b1a6b2cc40	[AArch64][GlobalISel] Fix miscompile on carry-in selection (#68840 ) Eliding the vReg to NZCV conversion instruction for G_UADDE/... is illegal if it causes the carry generating instruction to become dead because ISel will just remove the dead instruction. I accidentally introduced this here: https://reviews.llvm.org/D153164. As far as I can tell, this is not exposed on the default clang settings, because on O0 there is always a G_AND between boolean defs and uses, so the optimization doesn't apply. Thus, when I tried to commit https://reviews.llvm.org/D159140, which removes these G_ANDs on O0, I broke some UBSan tests. We fix this by recursively selecting the previous (NZCV-setting) instruction before continuing selection for the current instruction.	2023-10-19 19:50:46 +02:00
Caroline Concatto	200a92520c	[Clang][SVE2.1] Add builtins and intrinsics for SVBFMLSLB/T As described in: https://github.com/ARM-software/acle/pull/257 Patch by: Kerry McLaughlin <kerry.mclaughlin@arm.com> Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D151461	2023-10-19 16:44:39 +00:00
Matt Arsenault	3e49ce6ea1	InlineSpiller: Delete assert that implicit_def has no implicit operands (#69087 ) It's not a verifier enforced property that implicit_def may only have one operand. Fixes assertions after the coalescer adds implicit-defs to preserve super register liveness to arbitrary instructions. For some reason I'm unable to reproduce this as a MIR test running only the allocator for the x86 test. Not sure it's worth keeping around.	2023-10-20 00:51:12 +09:00
Jay Foad	21e1b13f33	[TwoAddressInstruction] Handle physical registers with LiveIntervals (#66784 ) Teach the LiveIntervals path in isPlainlyKilled to handle physical registers, to get equivalent functionality with the LiveVariables path. Test this by adding -early-live-intervals RUN lines to a handful of tests that would fail without this.	2023-10-19 16:26:30 +01:00
Pierre van Houtryve	40a426fac6	[AMDGPU] Constant fold FMAD_FTZ (#69443 ) Solves #68315	2023-10-19 16:05:51 +02:00
Simon Pilgrim	8505c3b15b	[DAG] canCreateUndefOrPoison - remove AssertSext/AssertZext assumption that they never create undef/poison. We need to assume that we generate poison if the assertions failed. Fixes #66603	2023-10-19 13:28:53 +01:00
Simon Pilgrim	309e41dd13	[DAG] Add test coverage for Issue #66603	2023-10-19 13:28:52 +01:00
Momchil Velikov	d15fff6c69	Re-apply '[AArch64] Enable "sink-and-fold" in MachineSink by default (#67432 )' This reverts revert 19505072123e43eccf528b660973067b5c9b4a26. An issue was fixed in bea3684944c0d7962cd53ab77aad756cfee76b7c and some newly appeared tests updated.	2023-10-19 13:18:25 +01:00
Ramkumar Ramachandra	98c90a13c6	ISel: introduce vector ISD::LRINT, ISD::LLRINT; custom RISCV lowering (#66924 ) The issue #55208 noticed that std::rint is vectorized by the SLPVectorizer, but a very similar function, std::lrint, is not. std::lrint corresponds to ISD::LRINT in the SelectionDAG, and std::llrint is a familiar cousin corresponding to ISD::LLRINT. Now, neither ISD::LRINT nor ISD::LLRINT have a corresponding vector variant, and the LangRef makes this clear in the documentation of llvm.lrint.* and llvm.llrint.*. This patch extends the LangRef to include vector variants of llvm.lrint.* and llvm.llrint.*, and lays the necessary ground-work of scalarizing it for all targets. However, this patch would be devoid of motivation unless we show the utility of these new vector variants. Hence, the RISCV target has been chosen to implement a custom lowering to the vfcvt.x.f.v instruction. The patch also includes a CostModel for RISCV, and a trivial follow-up can potentially enable the SLPVectorizer to vectorize std::lrint and std::llrint, fixing #55208. The patch includes tests, obviously for the RISCV target, but also for the X86, AArch64, and PowerPC targets to justify the addition of the vector variants to the LangRef.	2023-10-19 13:05:04 +01:00
Pierre-Andre Saulais	0b80288e9e	[NVPTX] Preserve v16i8 vector loads when legalizing This is done by lowering v16i8 loads into LoadV4 operations with i32 results instead of letting ReplaceLoadVector split it into smaller loads during legalization. This is done at dag-combine1 time, so that vector operations with i8 elements can be optimised away instead of being needlessly split during legalization, which involves storing to the stack and loading it back.	2023-10-19 12:34:25 +01:00
Simon Pilgrim	c43ac32bca	[DAG] Expand vXi1 add/sub overflow operations as xor/and (#69191 ) Similar to what we already do for add/sub + saturation variants. Scalar support will be added in a future patch covering the other variants at the same time. Alive2: https://alive2.llvm.org/ce/z/rBDrNE Fixes #69080	2023-10-19 11:48:51 +01:00
Pierre van Houtryve	d2edff839d	[AMDGPU] PeepholeSDWA: Don't assume inst srcs are registers (#69576 ) To fix that ticket we only needed to address the V_LSHLREV_B16 case, but I did it for all insts just in case. Fixes #66899	2023-10-19 12:13:45 +02:00
Yeting Kuo	5341d5465d	[RISCV] Combine (and (select cond, x, -1), c) to (select cond, x, (and x, c)) with Zicond. (#69563 ) It's only beneficial when cond is a setcc with an integer equality condition code. For other cases, it has the same instruction count as the original.	2023-10-19 16:11:11 +08:00
Freddy Ye	278e533ee9	[X86] Support -march=pantherlake,clearwaterforest (#69277 )	2023-10-19 15:11:15 +08:00
Wang Pengcheng	f4231bf446	[RISCV] Replace PostRAScheduler with PostMachineScheduler (#68696 ) Just like what other targets have done. And this will make DAG mutations like MacroFusion take effect.	2023-10-19 13:30:41 +08:00
Craig Topper	d51855f700	[RISCV] Fix assertion failure from performBUILD_VECTORCombine when the binop is a shift. (#69349 ) The RHS of a shift can have a different type than the LHS. If there are undefs in the vector, we need the undef added to the RHS to match the type of any shift amounts that are also added to the vector. For now just don't add shifts if their RHS and LHS don't match.	2023-10-18 21:40:28 -07:00
Michal Paszkowski	817519058a	[SPIR-V] Emit proper pointer type for OpenCL kernel arguments (#67726 )	2023-10-18 20:51:53 -07:00
Wang Pengcheng	654a3a3cbc	[OpenCL][RISCV] Support SPIR_KERNEL calling convention (#69282 ) X86 supports this calling convention but I don't find any special handling, so I think we can just handle it via CC_RISCV. This should fix #69197.	2023-10-19 11:00:39 +08:00
Lu Weining	78abc45c44	[LoongArch] Improve codegen for atomic cmpxchg ops (#69339 ) PR #67391 improved atomic codegen by handling memory ordering specified by the `cmpxchg` instruction. An acquire barrier needs to be generated when memory ordering includes an acquire operation. This PR improves the codegen further by only handling the failure ordering.	2023-10-19 09:21:51 +08:00
wanglei	271087e3a0	[LoongArch] Implement COPY instruction between CFRs (#69300 ) With this patch, all CFRs can be used for register allocation.	2023-10-19 09:20:27 +08:00
Craig Topper	e103515ced	[RISCV][GISel] Support passing arguments through the stack. (#69289 ) This is needed when we run out of registers.	2023-10-18 17:48:58 -07:00
Arthur Eubanks	f3ea73133f	[ELF] Set large section flag for globals with an explicit section (#69396 ) An oversight in https://reviews.llvm.org/D148836 since this is a different code path.	2023-10-18 16:24:23 -07:00
Min-Yih Hsu	5f5faf407b	[RISCV][GISel] Add ISel support for SHXADD from Zba extension (#67863 ) This patch consists of porting SDISel patterns of SHXADD instructions to GISel. Note that `non_imm12`, a predicate that was implemented with `PatLeaf`, is now turned into a `PatFrag` of `<op>_with_non_imm12` where `op` is the operator that uses the `non_imm12` operand, as GISel doesn't have an equivalent of `PatLeaf` at this moment.	2023-10-18 15:55:19 -07:00
Craig Topper	040df124a2	[RISCV] Don't let performBUILD_VECTORCombine form a division or remainder with undef elements. (#69482 ) Division/remainder by undef is immediate UB across the entire vector.	2023-10-18 13:51:22 -07:00
Stanislav Mekhanoshin	98e95a0055	[AMDGPU] Make S_MOV_B64_IMM_PSEUDO foldable (#69483 ) With the legality checks in place it is now safe to do. S_MOV_B64 shall not be used with wide literals, thus updating the test.	2023-10-18 13:38:20 -07:00
David Green	8a701024f3	[ARM] Lower i1 concat via MVETRUNC The MVETRUNC operation can perform the same truncate of two vectors, without requiring lane inserts/extracts from every vector lane. This moves the concat i1 lowering to use it for v8i1 and v16i1 result types, trading a bit of extra stack space for fewer instructions.	2023-10-18 19:40:11 +01:00
Stanislav Mekhanoshin	84f398af74	[AMDGPU] Add missing test checks. NFC. (#69484 )	2023-10-18 11:26:39 -07:00
Ilya Leoshkevich	8e810dc7d9	[SystemZ] Support builtin_{frame,return}_address() with non-zero argument (#69405 ) When the code is built with -mbackchain, it is possible to retrieve the caller's frame and return addresses. GCC already can do this, add this support to Clang as well. Use RISCVTargetLowering and GCC's s390_return_addr_rtx() as inspiration. Add tests based on what GCC is emitting.	2023-10-18 19:05:31 +02:00
Stanislav Mekhanoshin	47ed921985	[AMDGPU] Add legality check when folding short 64-bit literals (#69391 ) We can only fold it if it can fit into 32-bit. I believe it did not trigger yet because we do not select 64-bit literals generally.	2023-10-18 09:22:23 -07:00
Sirish Pande	28e4f97320	[AMDGPU] Save/Restore SCC bit across waterfall loop. (#68363 ) Waterfall loop is overwriting SCC bit of status register. Make sure SCC bit is saved and restored across. We need to save/restore only in cases where SCC is live across waterfall loop. Co-authored-by: Sirish Pande <sirish.pande@amd.com>	2023-10-18 08:43:29 -05:00
David Green	c060757bcc	[ARM] Correct v2i1 concat extract types. For two v2i1 concat into a v4i1, we cannot extract each i64 element as an i32. This casts to a v4i32 instead and extracts the correct vector lanes.	2023-10-18 13:40:38 +01:00
pvanhout	868abf0961	Revert "[AMDGPU] Remove Code Object V3 (#67118 )" This reverts commit 544d91280c26fd5f7acd70eac4d667863562f4cc.	2023-10-18 12:55:36 +02:00
Jay Foad	104db26004	[AMDGPU] Fix image intrinsic optimizer on loads from different resources (#69355 ) The image intrinsic optimizer pass was neglecting to check any arguments of the load intrinsic after the VAddr arguments. For example multiple loads from different resources should not have been combined but were, because the pass was not checking the resource argument.	2023-10-18 11:08:01 +01:00
Paul Walker	675231eb09	[SVE ACLE] Allow default zero initialisation for svcount_t. (#69321 ) This matches the behaviour of the other SVE ACLE types.	2023-10-18 10:40:07 +01:00
Amara Emerson	e93bddb287	[AArch64][GlobalISel] Precommit indexed sextload/zextload tests.	2023-10-18 00:23:20 -07:00
Shao-Ce SUN	f48dab5237	Add RV64 constraint to SRLIW (#69416 ) Fixes #69408	2023-10-18 15:01:17 +08:00
Noah Goldstein	112e49b381	[DAGCombiner] Transform `(icmp eq/ne (and X,C0),(shift X,C1))` to use rotate or to get better constants. If `C0` is a mask and `C1` shifts out all the masked bits (to essentially compare two subsets of `X`), we can arbitrarily re-order shift as `srl` or `shl`. If `C1` (shift amount) is a power of 2, we can replace the and+shift with a rotate. Otherwise, based on target preference we can arbitrarily swap `srl` and `shl` in/out to get better constants. On x86 we can use this re-ordering to: 1) get better `and` constants for `C0` (zero extended moves or avoid imm64). 2) convert `srl` to `shl` if `shl` will be implementable with `lea` or `add` (both of which can be preferable). Proofs: https://alive2.llvm.org/ce/z/qzGM_w Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D152116	2023-10-18 01:16:55 -05:00
Noah Goldstein	0c2d28a448	[X86] Add tests for transform `(icmp eq/ne (and X, C0), (shift X, C1))`; NFC Differential Revision: https://reviews.llvm.org/D152115	2023-10-18 01:16:55 -05:00
Pierre van Houtryve	c464fea779	[DAG] Constant fold FMAD (#69324 ) This has very little effect on codegen in practice, but is a nice to have I think. See #68315	2023-10-18 07:46:24 +02:00
Kai Luo	b42738805a	[PowerPC] Auto gen test checks for #69299 . NFC.	2023-10-18 02:21:22 +00:00
Nitin John Raj	ae3ba725b7	[RISCV][GlobalISel] Select G_FRAME_INDEX (#68254 ) This patch is a bandage to get G_FRAME_INDEX working. We could import the SelectionDAG patterns for the ComplexPattern FrameAddrRegImm, and perhaps we will do that in the future. For now we just select it as an addition with 0.	2023-10-17 17:56:42 -07:00
Mircea Trofin	ab91e05e48	[mlgo] Fix tests post 760e7d0	2023-10-17 12:19:54 -07:00
Artem Belevich	b33723710f	[NVPTX] Fixed few more corner cases for v4i8 lowering. (#69263 ) Fixes https://github.com/llvm/llvm-project/issues/69124	2023-10-17 11:06:11 -07:00
Stanislav Mekhanoshin	a22a1fe151	[AMDGPU] support 64-bit immediates in SIInstrInfo::FoldImmediate (#69260 ) This is a part of https://github.com/llvm/llvm-project/issues/67781. Until we select more 64-bit move immediates the impact is minimal.	2023-10-17 10:53:22 -07:00
David Green	4266815f4d	[AArch64] Convert negative constant aarch64_neon_sshl to VASHR (#68918 ) In replacing shifts by splat with constant shifts, we can handle negative shifts by flipping the sign and using a VASHR or VLSHR.	2023-10-17 18:41:23 +01:00
David Green	658ed58de6	[AArch64] Add additional tests for fptosi/fptoui. NFC	2023-10-17 18:39:37 +01:00
akirchhoff-modular	4480e650b3	[YAMLParser] Improve plain scalar spec compliance (#68946 ) The `YAMLParser.h` header file claims support for YAML 1.2 with a few deviations, but our plain scalar parsing failed to parse some valid YAML according to the spec. This change puts us more in compliance with the YAML spec, now letting us parse plain scalars containing additional special characters in cases where they are not ambiguous.	2023-10-17 11:28:14 -06:00
Guozhi Wei	760e7d00d1	[X86, Peephole] Enable FoldImmediate for X86 Enable FoldImmediate for X86 by implementing X86InstrInfo::FoldImmediate. Also enhanced peephole by deleting identical instructions after FoldImmediate. Differential Revision: https://reviews.llvm.org/D151848	2023-10-17 16:22:42 +00:00
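
Commit c43ac32bca above expands vXi1 add/sub overflow operations as xor/and. The identities behind such an expansion can be checked per lane with a small brute-force sketch. This is an illustration only, not the code from the patch: the helper names are hypothetical, each lane is modelled as a Python int 0 or 1, and the signed interpretation of an i1 bit is assumed to be 0 or -1.

```python
# Reference semantics for 1-bit (i1) overflow arithmetic, written the slow way,
# then compared against xor/and forms. All names here are hypothetical.

def uaddo_i1(a, b):
    """Unsigned add on i1: (wrapped result, overflow flag)."""
    total = a + b
    return total & 1, int(total > 1)

def usubo_i1(a, b):
    """Unsigned sub on i1: (wrapped result, overflow flag)."""
    diff = a - b
    return diff & 1, int(diff < 0)

def saddo_i1(a, b):
    """Signed add on i1, where bit 1 means -1 and the range is [-1, 0]."""
    total = (-a) + (-b)
    return total & 1, int(not (-1 <= total <= 0))

def ssubo_i1(a, b):
    """Signed sub on i1, where bit 1 means -1 and the range is [-1, 0]."""
    diff = (-a) - (-b)
    return diff & 1, int(not (-1 <= diff <= 0))

def check_i1_overflow_identities():
    """Exhaustively compare the reference semantics with xor/and forms."""
    for a in (0, 1):
        for b in (0, 1):
            add_form = (a ^ b, a & b)        # result = xor, overflow = and
            sub_form = (a ^ b, (a ^ 1) & b)  # result = xor, overflow = ~a & b
            assert uaddo_i1(a, b) == add_form
            assert saddo_i1(a, b) == add_form
            assert usubo_i1(a, b) == sub_form
            assert ssubo_i1(a, b) == sub_form
    return True

check_i1_overflow_identities()
```

Under these assumptions the signed and unsigned variants share the same xor/and form for each operation, which is why a single bitwise expansion can serve both.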

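Commit 112e49b381 above rewrites `(icmp eq/ne (and X, C0), (shift X, C1))` to use a rotate in some cases. The simplest instance, where the mask covers the low half of the value and the shift amount is half the bit width, can be checked exhaustively for small widths. The sketch below is an illustration under those assumptions and does not reproduce the conditions the DAGCombiner actually tests; the function names are hypothetical.

```python
def rotl(x, amount, width):
    """Rotate the width-bit value x left by amount bits."""
    mask = (1 << width) - 1
    amount %= width
    return ((x << amount) | (x >> (width - amount))) & mask

def halves_equal_via_mask_shift(x, width):
    """(x & C0) == (x >> C1) with C0 = low-half mask and C1 = width / 2."""
    half = width // 2
    low_mask = (1 << half) - 1
    return (x & low_mask) == (x >> half)

def halves_equal_via_rotate(x, width):
    """The same comparison expressed as x == rotl(x, width / 2)."""
    return x == rotl(x, width // 2, width)

def forms_agree(width):
    """Check every width-bit value; True if both forms always agree."""
    return all(
        halves_equal_via_mask_shift(x, width) == halves_equal_via_rotate(x, width)
        for x in range(1 << width)
    )

print(forms_agree(8), forms_agree(16))
```

Rotating by half the width swaps the two halves, so the value is unchanged exactly when the halves are equal, which is the same condition the mask-and-shift form tests.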

52796 Commits