llvm-project

Author	SHA1	Message	Date
Luo, Yuanke	614c63bec6	[X86] Create extra prolog/epilog for stack realignment [part 2] This patch is to support D145650 for elf target as well. Differential Revision: https://reviews.llvm.org/D146489	2023-03-21 13:43:39 +08:00
Congcong Cai	d9661d79f4	[Webassembly][multivalue] update libcall signature when multivalue feature enabled fixed: #59095 Update libcall signatures to use multivalue return rather than returning via a pointer when the multivalue feature is enabled in the WebAssembly backend. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D146271	2023-03-21 12:10:51 +08:00
Ben Shi	4fa9dc9482	[AVR] Fix incorrect expansion of the pseudo 'ELPMBRdZ' instruction The 'ELPM' instruction has three forms (form: required feature): ELPM: hasELPM; ELPM Rd, Z: hasELPMX; ELPM Rd, Z+: hasELPMX. The second form is always used in the expansion of the pseudo instruction 'ELPMBRdZ'. But for devices without ELPMX but only with ELPM, only the first form can be emitted. Reviewed By: jacquesguan Differential Revision: https://reviews.llvm.org/D141221	2023-03-21 11:33:56 +08:00
Luo, Yuanke	e4c1dfed38	[X86] Create extra prolog/epilog for stack realignment The base pointer register is reserved by the compiler when a function has both a dynamically sized alloca and stack realignment. However, the base pointer register is not defined in the X86 ABI, so users can use this register in inline assembly. The inline assembly would then clobber the base pointer register without the user being aware of it. This patch creates an extra prolog that saves the stack pointer to a scratch register and uses this register to reference arguments on the stack. For some calling conventions (e.g. regcall), there may be few scratch registers. Below is example code for such a case. ``` extern int bar(void *p); long long foo(size_t size, char c, int id) { __attribute__((__aligned__(64))) int a; char *p = (char *)alloca(size); asm volatile ("nop"::"S"(405):); asm volatile ("movl %0, %1"::"r"(id), "m"(a):); p[2] = 8; memset(p, c, size); return bar(p); } ``` And the prolog/epilog below will be emitted for this case. ``` leal 4(%esp), %ebx .cfi_def_cfa %ebx, 0 andl $-128, %esp pushl -4(%ebx) ... leal 4(%ebx), %esp .cfi_def_cfa %esp, 4 ``` Differential Revision: https://reviews.llvm.org/D145650	2023-03-21 08:09:56 +08:00
Nemanja Ivanovic	6ee4ea8e2f	[PowerPC][NFC] Test needs to include constant pool values	2023-03-20 16:43:59 -05:00
Nemanja Ivanovic	da40f7e8b1	[PowerPC][NFC] Pre-commit a test case for upcoming patch	2023-03-20 15:42:07 -05:00
David Green	cd22e7c3ad	[AArch64] Regenerate neon-vcmla.ll tests and add tests for combining fadd with vcmla. NFC See D146407.	2023-03-20 16:29:28 +00:00
Muhammad Omair Javaid	8d6ab7d519	Revert "Revert "[SVE] Add patterns for shift intrinsics with FalseLanesZero mode"" This reverts commit 32bd1f562f835044d11b7ecfb36362a29eb00a02.	2023-03-20 15:33:20 +05:00
Muhammad Omair Javaid	32bd1f562f	Revert "[SVE] Add patterns for shift intrinsics with FalseLanesZero mode" This reverts commit 22c3ba4bb519e12395c676ffe436ea4b8400234a. Breaks buildbot https://lab.llvm.org/buildbot/#/builders/197/builds/4272 Differential Revision: https://reviews.llvm.org/D145551	2023-03-20 12:39:39 +05:00
lizhijin	22c3ba4bb5	[SVE] Add patterns for shift intrinsics with FalseLanesZero mode This patch adds patterns to reduce redundant mov and sel instructions for shift intrinsics with FalseLanesZero mode, when FeatureExperimentalZeroingPseudos is supported. For example, before: mov z1.b, #0 sel z0.b, p0, z0.b, z1.b asr z0.b, p0/m, z0.b, #7 After: movprfx z0.b, p0/z, z0.b asr z0.b, p0/m, z0.b, #7 Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D145551	2023-03-19 13:49:01 +08:00
Austin Kerbow	864a2b25be	[AMDGPU] Reserve extra SGPR blocks with XNACK "any" TID Setting ASMPrinter was relying on feature bits to set up extra SGPRs in the kernel descriptor for the xnack_mask. This was broken for the dynamic XNACK "any" TID setting, which could cause user SGPRs to be clobbered if the number of SGPRs reserved was near a granulated block boundary. When XNACK was enabled this worked correctly in the ASMParser, which meant some kernels were only failing without "-save-temps". Fixes: SWDEV-382764 Reviewed By: kzhuravl Differential Revision: https://reviews.llvm.org/D145401	2023-03-17 20:26:23 -07:00
Heejin Ahn	4e844a1498	[WebAssembly] Replace Bugzilla links with Github issues Reviewed By: dschuff, asb Differential Revision: https://reviews.llvm.org/D145966	2023-03-17 20:13:00 -07:00
Pavel Kopyl	7adacaa098	[NVPTX] Report fatal error on empty argument type. Differential Revision: https://reviews.llvm.org/D146331	2023-03-18 01:27:43 +01:00
Craig Topper	101cf0b8ab	[RISCV] Add isReMaterializable to FLI instructions. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D146321	2023-03-17 12:16:37 -07:00
Craig Topper	0a895c39ad	[RISCV] Add isAsCheapAsAMove to FLI instructions. This can prevent unnecessarily hoisting out of loops. Test case cribbed from AArch64. I also intend to make them rematerializable. Differential Revision: https://reviews.llvm.org/D146314	2023-03-17 12:16:14 -07:00
Craig Topper	f36ec414c9	[RISCV] Add test case showing fli being hoisted out of a loop and creating extra copies/spills. Test case for D146314. Differential Revision: https://reviews.llvm.org/D146315	2023-03-17 12:16:14 -07:00
Matt Arsenault	9356ec1516	CodeGen: Reorder case handling for is.fpclass legalization Subnormal and zero checks can be combined into one, so move the code closer to reduce the diff in a future change.	2023-03-17 11:29:50 -04:00
Krzysztof Parzyszek	0eac3c5004	[Hexagon] Ensure proper ordering of instructions in HVC::AlignVectors The shuffle reduction creates a dependency chain. Make sure that the inputs to the next instruction are placed ahead of the instruction itself.	2023-03-17 08:13:49 -07:00
Nikita Popov	687b5b9a0c	[SCEVExpander] Always use scevgep as name With opaque pointers the scevgep / uglygep distinction no longer makes sense -- GEPs are always emitted in offset-based representation.	2023-03-17 14:27:03 +01:00
Craig Topper	4063369fd4	[RISCV] Add MULW to RISCVStripWSuffix. This converts MULW to MUL if the upper bits aren't used. This will give more opportunities to use c.mul with Zcb.	2023-03-16 19:42:33 -07:00
Vitaly Buka	aa15fe98b6	Revert "[AMDGPUUnifyDivergentExitNodes] Add NewPM support" Introduces nullptr dereference. This reverts commit a5455e32b364dabe499ec11722626d4bbaf047ba.	2023-03-16 19:03:46 -07:00
zhijian	49bc3077cb	[AIX] unset bit "IsBackChainStored" of traceback table for leaf functions with no stack frame Summary: In function PPCAIXAsmPrinter::emitTracebackTable(), the "IsBackChainStored" bit of the traceback table is always set to true. This causes the AIX debug tool "dbx" to emit the error libdebug assertion "(framep->getGpr(STKP, &addr) == DB_SUCCESS && *nextStkpp == addr)" when debugging a leaf function with no stack frame. For a leaf function with no stack frame, the IsBackChainStored bit should be unset. Reviewers: ChenZheng Differential Revision: https://reviews.llvm.org/D146071	2023-03-16 15:26:12 -04:00
Mirko Brkusanin	d5c0c1b6f0	[AMDGPU] Select flat atomic fmin/fmax Also disables global atomic fmin/fmax x2 patterns on gfx11 Differential Revision: https://reviews.llvm.org/D146137	2023-03-16 18:07:26 +01:00
Mikhail R. Gadelha	185ea867eb	[RISCV] Fix missing addi in test to validate lower inline asm m with offset	2023-03-16 13:30:53 -03:00
Anshil Gandhi	a5455e32b3	[AMDGPUUnifyDivergentExitNodes] Add NewPM support Meanwhile, use UniformityAnalysis instead of LegacyDivergenceAnalysis to collect divergence info. Reviewed By: arsenm, sameerds Differential Revision: https://reviews.llvm.org/D141355	2023-03-16 16:13:29 +00:00
Mikhail R. Gadelha	4bbee03d8a	[RISCV] Added tests to validate lower inline asm m and A with offsets	2023-03-16 13:12:39 -03:00
Tim Northover	2d690684f6	Recommit DwarfEHPrepare: insert extra unwind paths for stack protector to instrument This is a mitigation patch for https://bugs.chromium.org/p/llvm/issues/detail?id=30, where existing stack protection is skipped if a function is returned through by an unwinder rather than the normal call/return path. The recent patch D139254 added the ability to instrument a visible unwind path, at least in the IR case (I'm working on the SelectionDAG instrumentation too) but there are still invisible unwinds it can't reach. So this patch adds logic to DwarfEHPrepare that goes through a function, converting any call that might throw into an invoke to a simple resume cleanup, and adding cleanup clauses to existing landingpads that lack them. Obviously we don't really want to do this if it's wasted effort, so I also exposed requiresStackProtector from the actual StackProtector code to skip the extra paths if they won't be used. Changes: * Move test to AArch64 directory as it relies on target presence. * Re-add Dominator-tree maintenance. Accidentally cherry-picked wrong patch. * Skip adding paths on Windows EH functions. https://reviews.llvm.org/D143637	2023-03-16 13:43:17 +00:00
Tim Northover	e4b352a0b9	Revert "DwarfEHPrepare: insert extra unwind paths for stack protector to instrument" It's caused more failures than are trivially fixable. This reverts commit 203b6f31bb71ce63488eb96b303e000e91aee376.	2023-03-16 11:55:53 +00:00
Tim Northover	203b6f31bb	DwarfEHPrepare: insert extra unwind paths for stack protector to instrument This is a mitigation patch for https://bugs.chromium.org/p/llvm/issues/detail?id=30, where existing stack protection is skipped if a function is returned through by an unwinder rather than the normal call/return path. The recent patch D139254 added the ability to instrument a visible unwind path, at least in the IR case (I'm working on the SelectionDAG instrumentation too) but there are still invisible unwinds it can't reach. So this patch adds logic to DwarfEHPrepare that goes through a function, converting any call that might throw into an invoke to a simple resume cleanup, and adding cleanup clauses to existing landingpads that lack them. Obviously we don't really want to do this if it's wasted effort, so I also exposed requiresStackProtector from the actual StackProtector code to skip the extra paths if they won't be used. https://reviews.llvm.org/D143637	2023-03-16 11:32:45 +00:00
Nikita Popov	bbfb13a5ff	[ConstExpr] Remove select constant expression This removes the select constant expression, as part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179. Uses of this expression have already been removed in advance, so this just removes related infrastructure and updates tests. Differential Revision: https://reviews.llvm.org/D145382	2023-03-16 10:32:08 +01:00
WANG Xuerui	19e2ebbf45	[LoongArch] Emit bytepick for picking from concatenation of two values It seems the ISA manual's pseudo-code description for the `BYTEPICK.[WD]` instructions is inaccurate; the behavior described here should be correct though. The instructions' names are misleading too (they pick full GRLen-wide words instead of bytes; they just index by bytes) but let's stick to the official names for now. Reviewed By: SixWeining Differential Revision: https://reviews.llvm.org/D143880	2023-03-16 15:07:06 +08:00
WANG Xuerui	ff475a0dd9	[LoongArch] Add baseline tests for `bytepick` codegen. NFC Reviewed By: SixWeining Differential Revision: https://reviews.llvm.org/D143879	2023-03-16 15:07:06 +08:00
LiaoChunyu	fc9730376c	[RISCV]Optimize (riscvisd::select_cc x, 0, ne, x, 1) This patch reduces the number of unpredictable branches. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D146117	2023-03-16 10:56:26 +08:00
WANG Xuerui	db5dfec9d4	[Clang][LoongArch] Implement patchable function entry Similar to D98610 for RISCV. This is going to be required by the upcoming Linux/LoongArch [[ https://git.kernel.org/linus/4733f09d88074 \| support for dynamic ftrace ]]. Reviewed By: SixWeining, MaskRay Differential Revision: https://reviews.llvm.org/D141785	2023-03-16 09:33:58 +08:00
Jon Roelofs	aba4e4d6c1	[AArch64] Add hex comments to mov-imm spellings in the InstPrinter Differential Revision: https://reviews.llvm.org/D146105	2023-03-15 14:29:44 -07:00
Jon Roelofs	cdee83b015	Revert "[AArch64] Add hex comments to mov-imm spellings in the InstPrinter" This reverts commit 1def3141135c072a1d3e51e82e113dd67b0def97.	2023-03-15 14:21:08 -07:00
Jon Roelofs	1def314113	[AArch64] Add hex comments to mov-imm spellings in the InstPrinter Differential Revision: https://reviews.llvm.org/D146105	2023-03-15 14:08:51 -07:00
Zain Jaffal	4b09d7a8ac	[AArch64] Change GeneratePerfectShuffle to return one destination operand for zip and transpose operations. The tests added were crashing because the zip instruction was returning two destination operands. According to Arm, ZIP returns only one destination operand. Reviewed By: dmgreen, fhahn Differential Revision: https://reviews.llvm.org/D146055	2023-03-15 21:05:18 +00:00
Simon Pilgrim	5be5510098	[X86] lzcnt-cmp.ll - enable CMOV on 32-bit LZCNT tests There are no 32-bit targets that have LZCNT but not CMOV, and this allows us to test the straight line i64 pattern - otherwise we're doing the same branchy code as the 32-bit BSR test	2023-03-15 18:14:53 +00:00
Simon Pilgrim	28a0d0e85a	[DAG] Don't fold zext(logicalshift(zext(x),c)) -> logicalshift(zext(x),c) if the outer zext is free Avoid widening the shift to a bigger type if the zext would be free anyway Pulled out of D146121	2023-03-15 17:45:12 +00:00
Simon Pilgrim	2281286eb7	[X86] Add more thorough testing of the zext(logicalshift(zext(x),c)) -> logicalshift(zext(x),c) fold Add tests for more extension combos, 64-bit targets and some illegal types	2023-03-15 17:20:42 +00:00
Paul Kirth	ade336d6e1	[codegen][riscv] Emit CFI directives when using shadow call stack Currently we don't emit any CFI instructions for the SCS register when enabling SCS on RISCV. This causes problems when unwinding, since the SCS register isn't being handled properly. Reviewed By: mcgrathr Differential Revision: https://reviews.llvm.org/D145205	2023-03-15 17:10:23 +00:00
Konstantina Mitropoulou	6bc5aa592a	[AMDGPU] Update mul.ll with auto-generated checks Reviewed By: foad Differential Revision: https://reviews.llvm.org/D145990	2023-03-15 08:16:28 -07:00
Simon Pilgrim	4ead58914c	[X86] add-and-not.ll - add 32-bit test coverage	2023-03-15 15:13:02 +00:00
pvanhout	723a53caaf	[AMDGPU] Avoid constant bus limitation on V_BFE GISel pattern For D141247 - if that pattern was used by GISel it could cause constant bus limitation failures. Just use inline immediates instead of S_MOV to avoid the issue. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D146131	2023-03-15 15:01:33 +01:00
Sander de Smalen	93b89bee47	[AArch64][SVE] Fix the indexed addressing mode when FI = 0. This is an alternative fix to D145497, which also addresses https://github.com/llvm/llvm-project/issues/60918 In D124457 which added the original code for this, @efriedma pointed out that it wasn't safe to assume that FI #0 would be allocated at offset 0, but that part of the patch went in without any changes. The downside of this solution is that any access to an object on the stack that has been allocated at SP + 0, still gets moved to a separate register first, which degrades performance. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D146056	2023-03-15 13:39:43 +00:00
Simon Pilgrim	c1f81e7604	[DAG] mergeStore - peek through truncates when finding dead store(trunc(load())) patterns Extend the existing store(load()) removal code to account for intermediate truncates that some targets won't remove with canCombineTruncStore - we only care about the load/store MemoryVT. Fixes regression from D146121	2023-03-15 11:54:13 +00:00
Simon Pilgrim	70562607ab	[DAG] Fold multiple insert_vector_elt of zero values into an AND mask This also allows us to make use of the existing isVectorClearMaskLegal shuffle canonicalization Differential Revision: https://reviews.llvm.org/D145939	2023-03-15 09:56:26 +00:00
Kito Cheng	cf40b8a4dd	[RISCV] Pass vector argument by stack correctly. We have argument lowering logic to avoid passing floating-point values with bit-conversion, but that rule should not be applied to vector arguments. --- How to pass arguments to `foo`: ``` tail call void @foo(i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, <vscale x 16 x float> zeroinitializer, <vscale x 16 x float> zeroinitializer, <vscale x 16 x float> zeroinitializer) ``` `foo` takes 11 arguments: the first 8 are passed in GPRs and the next 2 LMUL 8 vector arguments are passed in v8-v23. At that point both the GPR and vector argument registers are exhausted, so the last LMUL 8 vector argument must be passed on the stack, which means we should reserve `vlenb * 8` bytes of stack for it. Reviewed By: craig.topper, asb Differential Revision: https://reviews.llvm.org/D145938	2023-03-15 17:22:47 +08:00
Kito Cheng	ba1c7731f1	[RISCV] Precommit test to show wrong way to pass scalable FP vector on stack Test case demonstrating that passing a scalable vector on the stack causes stack corruption. Detailed explanation of what happens: ``` tail call void @foo(i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, i32 0, <vscale x 16 x float> zeroinitializer, <vscale x 16 x float> zeroinitializer, <vscale x 16 x float> zeroinitializer) ``` `foo` takes 11 arguments: the first 8 are passed in GPRs and the next 2 LMUL 8 vector arguments are passed in v8-v23. At that point both the GPR and vector argument registers are exhausted, so the last LMUL 8 vector argument must be passed on the stack. However, LLVM reserves only 8 bytes on the stack for the LMUL 8 vector argument, which causes stack corruption when we try to store it to the stack. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D145934	2023-03-15 17:21:07 +08:00

47405 Commits