llvm-project

Author	SHA1	Message	Date
Nicola Lancellotti	43fe14c056	[AArch64] Canonicalize ZERO_EXTEND to VSELECT Differential Revision: https://reviews.llvm.org/D135596	2022-10-17 15:42:46 +01:00
Simon Pilgrim	efd0d66269	[AMDGPU] Add regression test cases reported on D136042	2022-10-17 14:54:27 +01:00
Simon Pilgrim	0aa9a7f8d9	[AMDGPU] Regenerate bfe-combine.ll and bfe-patterns.ll	2022-10-17 14:41:14 +01:00
Jay Foad	0c22f4f5fe	[AMDGPU] Common up some generated checks in fnearbyint.ll Also remove -mattr=-flat-for-global which is not needed for generated checks.	2022-10-17 11:02:19 +01:00
Roman Lebedev	3c5a164994	[NFC][X86] Test commit, add test with bad mask vector legalization Inspired by codegen of `@test` from `llvm/test/Analysis/CostModel/X86/masked-interleaved-*-i16.ll`.	2022-10-16 22:22:10 +03:00
Simon Pilgrim	986ca95e06	[BPF] Add (failing) testcase for Issue #57872	2022-10-16 18:16:18 +01:00
Amara Emerson	13792ba417	[AArch64][GlobalISel] When lowering signext i1 parameters, don't zero-extend to s8 first. Fixes https://github.com/llvm/llvm-project/issues/57181	2022-10-15 20:25:43 -07:00
Peter Rong	c2e7c9cb33	[CodeGen] Using ZExt for extractelement indices. In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`. This is because IRTranslator uses SExt for indices. In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt. This change includes both documentation, SelectionDAG and IRTranslator. We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86 This patch fixes issue #57452. Differential Revision: https://reviews.llvm.org/D132978	2022-10-15 15:45:35 -07:00
Simon Pilgrim	0b36d1ef1f	[Mips] Regenerate unalignedload.ll	2022-10-15 18:29:54 +01:00
Simon Pilgrim	1901bd0404	[Mips] Regenerate return-struct.ll	2022-10-15 18:21:55 +01:00
Simon Pilgrim	f2c4204d8a	[Mips] Regenerate load-store-left-right.ll	2022-10-15 18:21:54 +01:00
wanglei	506e936871	[LoongArch] Fix wrong VariantKind for MO_GOT_PC_{HI/LO} flags Differential Revision: https://reviews.llvm.org/D135946	2022-10-15 17:45:08 +08:00
Kazushi (Jam) Marukawa	0278c9ceb6	[VE] Change the way to lower select Change to use VEISD::CMOV in combineSelect for better optimization. Support VEISD::CMOV in combineTRUNCATE also to optimize truncate. Merge functions to handle condition codes to VE.h. And add basic CMOV patterns to VEInstrInfo.td. Update regression tests also. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D135878	2022-10-15 08:49:36 +09:00
Krzysztof Parzyszek	361a27c155	[Hexagon] Recognize idioms for fixed-point vector multiplication Recognize Q.15Q.15 and Q.31Q.31, with and without rounding.	2022-10-14 15:22:25 -07:00
Philip Reames	d91b0d6816	[RISCV] Merge rv32 and rv64 fixed vector stepvector tests	2022-10-14 14:54:37 -07:00
Martin Storsjö	6eb205b257	Reapply [AArch64] Fix aligning the stack after calling __chkstk Whenever a call to __chkstk was made, the frame lowering previously omitted the aligning (as NumBytes was reset to zero before doing alignment). This fixes https://github.com/llvm/llvm-project/issues/56182. The initial version of this produced invalid code for small functions with no local stack allocations, if those functions were marked with the "stackrealign" attribute. If building with -mstack-alignment=16 (which otherwise mostly would be a no-op), this attribute is added on the main function. Differential Revision: https://reviews.llvm.org/D135687	2022-10-15 00:40:13 +03:00
Krzysztof Parzyszek	705e77abed	[Hexagon] Lower funnel shifts for HVX HVX v62+ has bidirectional shifts, which do not mask the shift amount to the bit width. Instead, the shift amount is sign-extended from the log(BW) bit value, and a negative value causes a shift in the other direction. For the shift amount being -log(BW), this reversed shift will shift all bits out, inserting 0s or sign bits depending on the type and direction.	2022-10-14 14:13:18 -07:00
Filipp Zhinkin	ef774bec63	[AArch64] Support SETCCCARRY lowering Support SETCCCARRY lowering to SBCS instruction. Related issue: https://github.com/llvm/llvm-project/issues/44629 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D135302	2022-10-14 22:29:31 +03:00
Krzysztof Parzyszek	7f4ce3f1eb	[Hexagon] Introduce PS_vsplat[ir][bhw] pseudo instructions HVX v60 only has splats that take a 32-bit word as input, while v62+ has splats that take 8- or 16-bit value. This makes writing output patterns that need to use a splat annoying, because the entire output pattern needs to be replicated for various versions of HVX. To avoid this, the patterns will always use the pseudos, and then the pseudos will be handled using a post-ISel hook.	2022-10-14 12:03:13 -07:00
Chris Bieneman	e530a1188e	[DX] Add pass to pretty-print DXIL metadata in asm When DXC prints IR output it adds a bunch of IR comments in a header that describe the DXIL metadata in a more human-readable format. This pass will serve that purpose for LLVM by printing out ahead of the IR printer. Reviewed By: python3kgae Differential Revision: https://reviews.llvm.org/D135802	2022-10-14 13:32:59 -05:00
Anshil Gandhi	94ac8f3a8c	[BranchRelaxation] Fix test for duplicate branch instruction This patch is a follow up for D134557, inserting a check for a duplicate unconditional branch to fall through. Differential Revision: https://reviews.llvm.org/D135975	2022-10-14 12:21:26 -06:00
Caroline Concatto	60e2aad109	[AArch64]Change printVectorList to print SVE vector range This patch changes the preferred disassembly for SVE vector lists. For instance, instead of printing this assembly: ld4d { z1.d, z2.d, z3.d, z4.d }, p0/z, [x0] it will print this: ld4d { z1.d-z4.d }, p0/z, [x0] Differential Revision: https://reviews.llvm.org/D135952	2022-10-14 18:59:56 +01:00
Hassnaa Hamdi	2c72d90ecc	[AArch64-SVE]: Force generating code compatible to streaming mode. Add a compile-time flag for enabling streaming mode. When streaming mode is enabled, lower basic loads and stores of fixed-width vectors; to generate code that is compatible to streaming mode. Differential Revision: https://reviews.llvm.org/D133433	2022-10-14 17:46:56 +00:00
chenglin.bi	c1909d7337	[DAGCombiner] Fix crash for the merge stores with different value type The crash case comes from #58350. It has two stores: one store is of type f32 and the other is v1f32. When we try to merge these two stores on v1f32, the memVT is a vector type, so the old code will use ISD::EXTRACT_SUBVECTOR for type f32 as well, and the compiler crashes. So this patch inserts a build_vector for the f32 store to generate v1f32 as well when memVT is v1f32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135954	2022-10-15 01:16:35 +08:00
Amy Kwan	22e4203df8	[PowerPC][NFC] Pre-commit case for lowering vector shuffles to xxsplti32dx (64 bit) This patch adds a test case for lowering vector shuffles to xxsplti32dx in preparation for D135024. The test case added in this patch only adds the 64-bit CHECKs, as the 32-bit CHECKs cannot be generated (in which D135024 aims to fix).	2022-10-14 10:15:34 -05:00
Sander de Smalen	02df03c5b7	[AArch64][SME] Add support for arm_locally_streaming functions. Functions with `aarch64_sme_pstatesm_body` will emit a SMSTART at the start of the function, and a SMSTOP at the end of the function, such that all operations use the right value for vscale. Because the placement of these nodes is critically important (i.e. no vscale-dependent operations should be done before SMSTART has been issued), we require glueing the CopyFromReg to the Entry node such that we can insert the SMSTART as part of that glued chain. More details about the SME attributes and design can be found in D131562. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D131582	2022-10-14 13:47:53 +00:00
chenglin.bi	85e41fcaac	[AArch64] Select to CCMN when the CCMP's second operator is negative constant CCMP/CCMN's second operator support const from 0 to 31. When the CCMP's second operator is in the range [-31, -1] we can replace it with CCMN to avoid extra mov. Fix: #57034 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D135939	2022-10-14 21:41:25 +08:00
Martin Storsjö	f309f095e7	Revert "[AArch64] Fix aligning the stack after calling __chkstk" This reverts commit 50e0aced4521260af842dba73f1d8c50d36314ea. This could accidentally start producing invalid code in some cases (in particular, if compiling with -mstack-alignment=16, which one could expect to be a no-op for a target where the stack always is aligned to 16 bytes anyway).	2022-10-14 11:55:59 +03:00
gonglingqin	e632bb6543	[LoongArch] Add codegen support for atomicrmw umin/umax operation on LA64 Furthermore, use `beqz $rd, .BB` instead of `beq $rd, $zero, .BB`. Differential Revision: https://reviews.llvm.org/D135525	2022-10-14 15:24:43 +08:00
Leon Clark	6370bc2435	Add f16 nearbyint support. Enable lowering of FNEARBYINT for f16 and extend existing tests. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D135124	2022-10-14 08:05:24 +01:00
Matt Arsenault	99dff82118	AMDGPU: Fix failing test with expensive checks Fixes failure after d383adec4d3914492e67267462e6f00fdd4934af	2022-10-13 23:34:20 -07:00
Anshil Gandhi	d383adec4d	[BranchRelaxation] Fall through only if block has no unconditional branches Prior to inserting an unconditional branch from X to its fall through basic block, check if X has any terminators to avoid inserting additional branches. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134557	2022-10-13 22:48:41 -06:00
chenglin.bi	07c5270043	[AArch64] add tests for ccmp with negative constant op1; NFC	2022-10-14 12:07:43 +08:00
Xiang1 Zhang	aad013de41	[InlineAsm][bugfix] Correct function addressing in inline asm In the Linux PIC model, there are 4 cases of value/label addressing: Case 1: Function call or label jmp inside the module. Case 2: Data access (such as global variable, static variable) inside the module. Case 3: Function call or label jmp outside the module. Case 4: Data access (such as global variable) outside the module. Because the current llvm inline asm architecture is designed not to "recognize" the asm code, it is quite troublesome for us to treat memory addressing differently for the same value/address used in different instructions. For example, in the PIC model, calling a function may go through the PLT or be directly PC-relative, but lea/mov of a function address may use the GOT. This patch fixes/refines case 1 and case 2 in inline asm. Because inline asm currently does not support jmp to an outside label, this patch mainly focuses on fixing the function call addressing bugs in inline asm. Reviewed By: Pengfei, RKSimon Differential Revision: https://reviews.llvm.org/D133914	2022-10-14 09:47:26 +08:00
Nemanja Ivanovic	0d253bbd33	[PowerPC] Change CRNOT to a code gen single operand instruction Inputs to crnor can come from operands with chains so if it is being used simply to negate such an operand, the repeated input cannot be CSE'd. This patch just adds a code-gen only instruction for this that takes a single input and duplicates it in the encoding of the underlying crnor. Differential revision: https://reviews.llvm.org/D133577	2022-10-13 20:09:44 -05:00
Michal Paszkowski	14ea4f5bf2	[SPIRV] Fix formatting of function tests Differential Revision: https://reviews.llvm.org/D135624	2022-10-14 01:55:27 +02:00
Jakub Chlanda	8407fdbd69	[NVPTX] Support neg{.ftz} for f16 and f16x2 Differential Revision: https://reviews.llvm.org/D135428	2022-10-13 10:48:33 -07:00
Craig Topper	e68b0d5875	[RISCV] Match (select C, -1, X)->(or -C, X) during lowerSelect Same with (select C, X, -1), (select C, 0, X), and (select C, X, 0). There's a DAGCombine after we turn the select into select_cc, but that may introduce a setcc that didn't previously exist. We could add more DAGCombines to remove the extra setcc, but this seemed lower effort. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D135833	2022-10-13 09:06:12 -07:00
David Green	16e4e4ab87	[CodeGenPrep] Handle constants in ConvertPhiType This is a simple addition to the convertPhiTypes in CodeGenPrepare to consider and convert constants as it converts the phi type. Someone fixed the bug in the motivating example, so the undef is now a constant 0. This does mean converting between integer and floating point constants, which may have different materialization. Differential Revision: https://reviews.llvm.org/D135561	2022-10-13 16:41:44 +01:00
David Green	1e80201f7f	[AArch64] Add ConvertPhiType constant tests. NFC	2022-10-13 16:23:34 +01:00
Nemanja Ivanovic	a77a70fa3c	[PowerPC] Stash GPR to VSR if emergency spill slot is not reachable When removing frame indices on PowerPC, we need to scavenge a GPR to materialize a large constant if the stack offset for the spill/reload cannot be reached by a D-Form instruction. However, in a perfect storm of conditions, we may not have GPR's available to scavenge, thereby requiring an emergency spill. If such an emergency spill also needs to be spilled to a location with a large offset, it would itself require register scavenging thereby creating an infinite loop. This patch detects when the scavenger cannot scavenge a register and the spill/reload is to a location with a large offset. It then stashes a GPR into a VSR so that it can use the GPR to materialize the constant (rather than scavenging a GPR). Fixes: https://github.com/llvm/llvm-project/issues/52894 Differential revision: https://reviews.llvm.org/D124841	2022-10-13 09:06:37 -05:00
Simon Pilgrim	fa9c12ed96	[X86] Attempt to combine binary shuffles where both operands come from the same larger vector Allows us to use combineX86ShuffleChainWithExtract to combine targetshuffle(low_subvector(x),high_subvector(x)) -> low_subvector(targetshuffle(x)) style patterns This is currently very limited (it must have a v2i64/v2f64 result), but while triaging I noticed we might be able to extend this to allow more types for targets with suitable variable cross lane shuffle support. Fixes #58339	2022-10-13 14:34:11 +01:00
WANG Xuerui	f017e92c1c	[LoongArch] Add support for llvm.trap and llvm.debugtrap Similar to D69390 for RISCV, use a guaranteed non-existing insn for llvm.trap and the break insn for llvm.debugtrap. Differential Revision: https://reviews.llvm.org/D134365	2022-10-13 19:27:47 +08:00
WANG Xuerui	4e2dfd3589	[LoongArch] Updates for the LoongArch ELF psABI v2.01 revision The e_flags of existing object files are all 0x3 which happens to be compatible. From this commit on, all LoongArch objects produced with upstream LLVM will be of object file ABI v1, which is already supported by binutils' master branch (to be released as 2.40), and is allowed by the same binutils version to interlink with v0 objects so the existing distributions have time to migrate. Differential Revision: https://reviews.llvm.org/D134601	2022-10-13 19:12:26 +08:00
Sheng	62fc58a61d	[AArch64] Improve codegen for "trunc <4 x i64> to <4 x i8>" for all cases To achieve this, we need this observation: `uzp1` is just a `xtn` that operates on two registers For example, given the following register with type v2i64: LSB_______MSB x0 x1 x2 x3 Applying xtn on it we get: x0 x2 This is equivalent to bitcast it to v4i32, and then applying uzp1 on it: x0 x1 x2 x3 \| uzp1 v x0 x2 <value from other register> We can transform xtn to uzp1 by this observation, and vice versa. This observation only works on little endian target. Big endian target has a problem: the uzp1 cannot be replaced by xtn since there is a discrepancy in the behavior of uzp1 between the little endian and big endian. To illustrate, take the following for example: LSB____________________MSB x0 x1 x2 x3 On little endian, uzp1 grabs x0 and x2, which is right; on big endian, it grabs x3 and x1, which doesn't match what I saw on the document. But, since I'm new to AArch64, take my word with a pinch of salt. This behavior is observed on gdb, maybe there's an issue in the order of the value printed by it? Whatever the reason is, the execution result given by qemu just doesn't match. So I disable this on big endian target temporarily until we find the crux. Fixes #57502 Reviewed By: dmgreen, mingmingl Co-authored-by: Mingming Liu <mingmingl@google.com> Differential Revision: https://reviews.llvm.org/D133850	2022-10-13 19:08:33 +08:00
Simon Pilgrim	7055751115	[X86][AVX2] Add shuffle test case where we fail to merge vpunpcklqdq(vextracti128(x,0),vextracti128(x,1)) -> vpermq These are likely to appear during truncation	2022-10-13 11:47:37 +01:00
Archibald Elliott	7d15212b8c	[ARM] Support fp16/bf16 using w constraint fp16 and bf16 values can be used in GCC's inline assembly using the "w" constraint, which means "VFP floating-point registers d0-d31" - fp16 and bf16 values are stored in S registers (which alias the D registers). This change ensures that LLVM is compatible with GCC for programs that use fp16 and the 'w' constraint. Differential Revision: https://reviews.llvm.org/D135662	2022-10-13 10:32:06 +01:00
Simon Tatham	526ce9c929	Propagate tied operands when copying a MachineInstr. MachineInstr's copy constructor works by calling the addOperand method to add each operand of the old MachineInstr to the new one, one by one. But addOperand deliberately avoids trying to replicate ties between operands, on the grounds that the tie refers to operands by index, and the indices aren't necessarily finalized yet. This led to a code generation fault when the machine pipeliner cloned an Arm conditional instruction, and lost the tie between the output register and the input value to be used when the condition failed to execute. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D135434	2022-10-13 09:40:35 +01:00
Leon Clark	98852a0f3d	Precommit for SWDEV-353076: Add check directives to existing tests. Add FileCheck directives to existing tests in preparation for new tests. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D135788	2022-10-13 08:02:37 +01:00
Martin Storsjö	cbd8464595	[MC] [Win64EH] Check that ARM64 prologs and epilogs have the right matching number of instructions This matches what was done for the ARM implementation (where getting the instruction sizes right is even more tricky, and hence needed tighter testing). This will allow catching any future cases where prologs and epilogs don't match the instructions within them. Differential Revision: https://reviews.llvm.org/D131394	2022-10-13 09:47:39 +03:00

45278 Commits