llvm-project

Author	SHA1	Message	Date
Simon Tatham	b09c575975	[AArch64] Add Defs=[NZCV] to MTE loop pseudos. The `STGloop` family of pseudo-instructions all expand to a loop which iterates over a region of memory setting all its MTE tags to a given value. The loop writes to the flags in order to check termination. But the unexpanded pseudo-instructions were not marked as modifying the flags. Therefore it was possible for one to end up in a location where the flags were live, and then the loop would corrupt them. We spotted the effect of this in a libc++ test involving a lot of complicated inlining, and haven't been able to construct a smaller test case that demonstrates actual incorrect output code. So my test here is just checking that `implicit-def $nzcv` shows up on the pseudo-instructions as they're output from isel. Reviewed By: DavidSpickett Differential Revision: https://reviews.llvm.org/D158262	2023-08-21 09:17:25 +01:00
chenli	0c76f46ca6	[LoongArch] Add testcases of LSX intrinsics with immediates The testcases mainly cover three situations: - the arguments which should be immediates are non immediates. - the immediate is out of upper limit of the argument type. - the immediate is out of lower limit of the argument type. Depends on D155829 Reviewed By: SixWeining Differential Revision: https://reviews.llvm.org/D157570	2023-08-21 11:04:19 +08:00
Neumann Hon	43207225b6	Revert "[SystemZ][z/OS] Fix the entry point marker for leaf functions" This reverts commit 8af297bbb8e97de8908b857eae1a44f46a0d5afe. Testcase LLVM :: MC/GOFF/ppa1.ll needs to be updated to account for this.	2023-08-20 22:04:02 -04:00
Neumann Hon	8af297bbb8	[SystemZ][z/OS] Fix the entry point marker for leaf functions The function emitFunctionEntryLabel does not look at whether or not a function is a leaf when setting the entry flags, and instead blindly marks all functions as non-leaf routines. Change it to check if a function is a leaf function and mark it accordingly.	2023-08-20 21:53:13 -04:00
Freddy Ye	6acff5390d	[X86] Support -march=gracemont gracemont has some different tuning features from alderlake. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D158046	2023-08-21 08:49:01 +08:00
Sameer Sahasrabuddhe	ef38e6d97f	[GlobalISel] introduce MIFlag::NoConvergent Some opcodes in MIR are defined to be convergent by the target by setting IsConvergent in the corresponding TD file. For example, in AMDGPU, the opcodes G_SI_CALL and G_INTRINSIC* are marked as convergent. But this is too conservative, since calls to functions that do not execute convergent operations should not be marked convergent. This information is available in LLVM IR. The new flag MIFlag::NoConvergent now allows the IR translator to mark an instruction as not performing any convergent operations. It is relevant only on occurrences of opcodes that are marked isConvergent in the target. Differential Revision: https://reviews.llvm.org/D157475	2023-08-20 21:14:46 +05:30
Nico Weber	3d22dac6c3	Revert "[clang][test] Refine clang machine-function-split tests." This reverts commit b9d079d6188b50730e0a67267b7fee36008435ce. Breaks tests on Windows, see https://reviews.llvm.org/D157565#4600939	2023-08-20 10:38:29 -04:00
Simon Pilgrim	2c090e9e67	[X86] Add test case for Issue #64655	2023-08-20 15:34:47 +01:00
Simon Pilgrim	9405b67a9e	[X86] Add test coverage for PR33879 (Issue #33226) Ensure we only use the eflags results from shift instructions when it won't cause stalls. A shift by a variable amount causes stalls because it has to preserve eflags when the shift amount is zero, so we're better off using a separate test.	2023-08-20 15:32:46 +01:00
Simon Pilgrim	95865e5138	[DAG] SimplifyDemandedBits - if we're only demanding the signbit, a SMIN/SMAX node can be simplified to a OR/AND node respectively. Alive2: https://alive2.llvm.org/ce/z/MehvFB REAPPLIED from 54d663d5896008 with fix for using the correct DemandedBits mask.	2023-08-20 14:20:49 +01:00
Simon Pilgrim	ca10a6caee	[X86] Add test coverage for min/max signbit simplification If we're only demanding the signbit from a min/max then we can simplify this to a logic op	2023-08-20 14:20:49 +01:00
Filipp Zhinkin	08d0b558f5	[SwiftError] Use IMPLICIT_DEF as a definition for unreachable VReg uses SwiftErrorValueTracking creates vregs at swifterror use sites and then connects them with appropriate definitions after instruction selection. To propagate swifterror values, SwiftErrorValueTracking::propagateVRegs iterates over basic blocks in RPO, but some vregs previously created at use sites may be located in blocks that became unreachable after instruction selection. Because of that, there will be no definition for such vregs, which may cause issues down the pipeline. To ensure that all vregs created by SwiftErrorValueTracking are defined, propagateVRegs was updated to insert IMPLICIT_DEF at the beginning of unreachable blocks containing swifterror uses. Related issue: https://github.com/llvm/llvm-project/issues/59751 Reviewed By: compnerd Differential Revision: https://reviews.llvm.org/D141053	2023-08-20 13:00:31 +02:00
Simon Pilgrim	1b95661616	[AArch64] Regenerate sve-fixed-length-fp-minmax.ll Should remove the D158053 diffs	2023-08-20 11:46:44 +01:00
Craig Topper	d6cd49dd9a	[RISCV][GISel] Add legalizer tests for G_SEXT/ZEXT from s32 to s64 for rv64.	2023-08-19 21:28:48 -07:00
Craig Topper	b41e75c8a4	[RISCV][GISel] Make s32 a legal type for RV64 for any operation that has a W version. My thought is that we can directly select W instructions using s32. This will likely require combines and other optimizations eventually, but this makes a simple starting point. I'm slowly prototyping a similar approach for SelectionDAG. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D157770	2023-08-19 11:20:42 -07:00
chenli	82bbf7003c	[LoongArch] Add testcases of LASX intrinsics with immediates The testcases mainly cover three situations: - the arguments which should be immediates are non immediates. - the immediate is out of upper limit of the argument type. - the immediate is out of lower limit of the argument type. Depends on D155830 Reviewed By: SixWeining Differential Revision: https://reviews.llvm.org/D157571	2023-08-19 17:14:16 +08:00
chenli	83311b2b5d	[LoongArch] Add LASX intrinsic testcases Depends on D155830 Reviewed By: SixWeining Differential Revision: https://reviews.llvm.org/D155835	2023-08-19 17:12:31 +08:00
chenli	f3aa441631	[LoongArch] Add LSX intrinsic testcases Depends on D155829 Reviewed By: SixWeining Differential Revision: https://reviews.llvm.org/D155834	2023-08-19 17:10:46 +08:00
Jim Lin	18f5ada244	[DAGCombiner] Don't reduce BUILD_VECTOR to BITCAST before LegalizeTypes if VT is legal. Targets may lose some optimization opportunities for certain vector operations if we reduce BUILD_VECTOR to BITCAST early. If VT is not legal, reducing BUILD_VECTOR to BITCAST before LegalizeTypes is still beneficial, because the type legalizer often scalarizes illegal vector types. Reviewed By: sebastian-ne Differential Revision: https://reviews.llvm.org/D156645	2023-08-19 12:53:50 +08:00
Craig Topper	92464ccbad	[RISCV][GISel] Initial legalization support for G_LOAD and G_STORE. This patch focuses on power of 2 bytes up to 2x XLen with and without alignment. Other cases will be handled in future patches. Reviewed By: nitinjohnraj Differential Revision: https://reviews.llvm.org/D157828	2023-08-18 20:17:19 -07:00
Han Shen	b9d079d618	[clang][test] Refine clang machine-function-split tests. This CL includes two changes: 1. moved clang backend-warnings test cases from Driver/ to CodeGen/. 2. removed multiple `cd "$(dirname "%t")"` and replaced with `-o %t`. Reviewed By: maskray (Fangrui Song) Differential Revision: https://reviews.llvm.org/D157565	2023-08-18 18:05:47 -07:00
Craig Topper	3e569883fa	[RISCV][GISel] Lower G_UADDE, G_UADDO, G_USUBE, and G_USUBO RISC-V doesn't have flag registers, we need to implement these with add/sub and compares. Remove the untested legalization for the signed versions. We can add it back when we write tests. Reviewed By: nitinjohnraj Differential Revision: https://reviews.llvm.org/D157772	2023-08-18 17:22:30 -07:00
Philip Reames	92e0c0dc1a	[DAG] Restrict insert_subvector undef, splat_vector, dontcare transform On the extract_subvector side, we already have the restriction. With D158201, we'd start getting unprofitable splat combines unless we add the same one on the insert_subvector side. Differential Revision: https://reviews.llvm.org/D158202	2023-08-18 12:44:09 -07:00
Philip Reames	67b71ad04a	[DAG] Fold insert_subvector undef, (extract_subvector X, 0), 0 with non-matching types We have an existing DAG combine for when an insert/extract subvector pair is entirely a nop, but we hadn't handled the case where the net result was either an insert or an extract (but not both). The transform is restricted to index = 0 to avoid having to adjust indices after the transform. Differential Revision: https://reviews.llvm.org/D158201	2023-08-18 12:28:27 -07:00
Craig Topper	bbbb93eb48	Revert "[DAG] Fold insert_subvector undef, (extract_subvector X, 0), 0 with non-matching types" This reverts commit 770be43f6782dab84d215d01b37396d63a9c2b6e. Forgot to remove from my tree while experimenting.	2023-08-18 12:00:07 -07:00
Craig Topper	0a5347f40d	[DAG] SimplifyDemandedBits - Use DemandedBits instead of OriginalDemandedBits when simplifying UMIN/UMAX to AND/OR. DemandedBits is forced to all ones if there are multiple users. The changed X86 test cases look like they were miscompiles before. The value of eax/rax from the cmov is returned from the function in addition to being used by the sar. That usage needs all bits even though the sar doesn't.	2023-08-18 11:59:18 -07:00
Craig Topper	770be43f67	[DAG] Fold insert_subvector undef, (extract_subvector X, 0), 0 with non-matching types We have an existing DAG combine for when an insert/extract subvector pair is entirely a nop, but we hadn't handled the case where the net result was either an insert or an extract (but not both). The transform is restricted to index = 0 to avoid having to adjust indices after the transform. Reviewers, a couple of comments on the test changes: * Mostly RISCV, mostly schedule reordering. * One real regression in splats-with-mixed-vl.ll due to a different overly aggressive combine, fix in a follow up patch. * The test/CodeGen/X86/vector-replicaton-i1-mask.ll diff looked concerning at first, but note the mask size is at most 4 i1s. I think the type changes on the mask loads are correct, but would welcome a second opinion from someone more familiar with AVX512 codegen. Differential Revision: https://reviews.llvm.org/D158201	2023-08-18 11:59:18 -07:00
Thurston Dang	29b2009061	Revert "[DAG] SimplifyDemandedBits - if we're only demanding the signbit, a SMIN/SMAX node can be simplified to a OR/AND node respectively." This reverts commit 54d663d5896008c09c938f80357e2a056454bc65, which breaks the test CodeGen/SystemZ/ctpop-01.ll for stage2-ubsan check (see https://lab.llvm.org/buildbot/#/builders/85/builds/18410) I manually confirmed that the test had been passing immediately prior to that commit (BUILDBOT_REVISION=4772c66cfb00d60f8f687930e9dd3aa1b6872228 llvm-zorg/zorg/buildbot/builders/sanitizers/buildbot_bootstrap_ubsan.sh)	2023-08-18 18:08:10 +00:00
Pravin Jagtap	c931f2e6fd	[AMDGPU] Autogenerate & pre-commit tests for D156301 and D157388 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D157712	2023-08-18 09:50:44 -04:00
Simon Pilgrim	4cd1c07491	[DAG] SimplifyDemandedBits - if we're only demanding the msb, a UMIN/UMAX node can be simplified to a AND/OR node respectively. Alive2: https://alive2.llvm.org/ce/z/qnvmc6	2023-08-18 12:12:22 +01:00
Simon Pilgrim	54d663d589	[DAG] SimplifyDemandedBits - if we're only demanding the signbit, a SMIN/SMAX node can be simplified to a OR/AND node respectively. Alive2: https://alive2.llvm.org/ce/z/MehvFB	2023-08-18 11:35:34 +01:00
Carl Ritson	ad9eed1e77	[MachineVerifier] Verify LiveIntervals for PHIs Implement basic support for verifying LiveIntervals for PHIs. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D156872	2023-08-18 18:14:22 +09:00
Kazushi (Jam) Marukawa	2e2395651e	[VE] Change the way of lowering store Lower a store iff its data operand is legalized. This way, LLVM can lower the operands first and lower the store instruction later. Reviewed By: efocht Differential Revision: https://reviews.llvm.org/D158253	2023-08-18 17:13:55 +09:00
David Green	42b3419339	[AArch64] Split LSLFast into Addr and ALU parts As far as I can tell FeatureLSLFast was originally added to specify that an lsl of <= 3 was cheap when folded into an addressing operand, so should override the one-use checks usually intended to make sure we don't perform redundant work. At a later point it also came to mean that add x0, x1, x2, lsl N with N <= 4 was cheap, in that it took a single cycle rather than the multiple cycles that more complex adds usually take. This patch splits those two concepts out into separate subtarget features. The biggest change is to AArch64DAGToDAGISel::isWorthFoldingALU, making ALU operations now produce an ADDWrs if the shift is <= 4. Otherwise the patch is mostly an NFC as it tries to keep the subtarget features the same for each cpu. I believe that the Arm OoO CPUs should eventually be changed to a new subtarget feature that specifies that a shift of 2 or 3 with any extend should be treated as cheap (just not shifts of 1 or 4). Differential Revision: https://reviews.llvm.org/D157982	2023-08-18 08:59:24 +01:00
XinWang10	b7cf9bbfde	Fix regression of D157680 Test cases in D157680 should be target specific, but miss some limit, add them back to make buildbot pass. Reviewed By: skan, Hahnfeld Differential Revision: https://reviews.llvm.org/D158252	2023-08-18 00:12:10 -07:00
XinWang10	993bdb047c	[X86]Support options -mno-gather -mno-scatter Gather instructions could lead to security issues; for details, please refer to https://www.intel.com/content/www/us/en/developer/articles/technical/software-security-guidance/technical-documentation/gather-data-sampling.html. This patch supports the options -mno-gather and -mno-scatter, which avoid generating gather/scatter instructions in the backend except when intrinsics or inline asm are used. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D157680	2023-08-17 23:02:25 -07:00
4vtomat	29f11e4fb7	[RISCV] Bump vector crypto to v1.0 RC2 Differential Revision: https://reviews.llvm.org/D158067	2023-08-17 21:19:59 -07:00
Craig Topper	f64eb69d96	[RISCV][GISel] Swap lo/hi register names in legalizer tests. This makes "lo" refer to the least significant bits and "hi" refer to the most significant bits. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158228	2023-08-17 20:37:42 -07:00
Craig Topper	c6dee6982f	[GlobalISel][Mips] Sync G_UADDE and G_USUBE legalization with LegalizeDAG. This modifies the G_UADDE legalization to a version that looks shorter on Mips and RISC-V when feeding the equivalent IR to SelectionDAG. This also removes the boolean select from G_USUBE. Comments taken from LegalizeDAG and tweaked. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158232	2023-08-17 20:36:55 -07:00
laichunfeng	13454a6e87	[RISCV] Compress stack insts by adjust offset. Callee save/restore operations mostly use the following instruction patterns: sw rs2, offset(x2); sd rs2, offset(x2); fsw rs2, offset(x2); fsd rs2, offset(x2); lw rd, offset(x2); ld rd, offset(x2); flw rd, offset(x2); fld rd, offset(x2). The offset decides whether the instructions can be compressed. Currently, offset 2032 is set by default to save and restore callee-saved registers if the stack size is bigger than 2^12-1, which prevents all of the callee save/restore stack instructions from being compressed. Allocating a proper offset for the stack instructions allows them to be compressed. Reviewed By: craig.topper, wangpc Differential Revision: https://reviews.llvm.org/D157373	2023-08-18 10:49:53 +08:00
Kito Cheng	0816b3efbf	[RISCV] Check floating point vector instruction with SEW=64 is valid when vsetvl insertion Scalar move and splat instructions only demand that the SEW is greater than or equal to their own needs, but a floating point vector instruction with SEW=64 is not always valid even when SEW=64 is valid, because we have a special configuration: zve64f. So we need to check that a floating point vector instruction with SEW=64 is valid when computing the demands of floating point scalar move and splat instructions. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D158086	2023-08-18 10:31:01 +08:00
Kito Cheng	b00b4697ae	[RISCV] Precommit test for D158086 Test case demonstrating an invalid vsetvli insertion Differential Revision: https://reviews.llvm.org/D158087	2023-08-18 10:14:23 +08:00
Craig Topper	846fbb06b8	[DAGCombiner][RISCV] Return SDValue(N, 0) instead of SDValue() after 2 calls to CombineTo in visitSTORE. RISC-V found a case where the CombineTo caused N to be CSEd with an existing node and then deleted. The top level DAGCombiner loop was surprised to find a node was deleted, but SDValue() was returned from the visit function. We need to return SDValue(N, 0) to tell the top level loop that a change was made, but the worklist updates were already handled. Fixes #64772. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158208	2023-08-17 15:13:36 -07:00
Craig Topper	ebb2e5ebb2	[GlobalISel][Mips] Correct corner case in G_UADDE legalization. If carryin was 1, and RHS is 0xffffffff we were not giving a carry out. In that case Res would be equal to LHS, so Res <u LHS would be false. But there should be a carry out since carryin+RHS wraps around to 0. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D157943	2023-08-17 15:06:16 -07:00
Nitin John Raj	b5c106e873	[RISCV][GlobalISel] Legalize division and remainder Legalize division and remainder. We test for (s7, s8, s16, s32, s48, s64) on rv32 and (s8, s15, s16, s32, s64, s72, s128) on rv64, with and without the +m, +zmmul extensions. We do not handle types with size > 2 x XLen -- these ought to be handled in the IR pass. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157422	2023-08-17 14:41:40 -07:00
Hiroshi Yamauchi	3406934e4d	[MC][COFF][AArch64] Fix the storage class for private linkage symbols. Use IMAGE_SYM_CLASS_STATIC like X86. Differential Revision: https://reviews.llvm.org/D158122	2023-08-17 13:54:12 -07:00
Nitin John Raj	638865c8f9	[RISCV][GlobalISel] Legalize multiplication Legalize multiplication with the +m, +zmmul extensions and without extensions. With extensions, we test for (s7, s8, s16, s32, s48, s64, s96) on rv32 and (s8, s15, s32, s64, s72, s128, s192) on rv64. Without extensions, test (s7, s8, s16, s32) on rv32 and (s8, s15, s16, s32, s64) on rv64. Does not yet work for the type which is 2 times XLen without extensions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157416	2023-08-17 12:59:34 -07:00
Nitin John Raj	b03c7efe9a	[RISCV][GlobalISel] Test legalization for bitshifting with wider types We test for (s48, s64, s96) on rv32 and (s72, s128, s192) on rv64. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D157415	2023-08-17 11:53:32 -07:00
Joe Nash	6aab000874	[AMDGPU] Convert fmul-2-combine-multi-use test to auto-gen NFC. Deletes the unused SI runline. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D158198	2023-08-17 14:23:20 -04:00
Keith Walker	2d9c6e699a	[Thumb1] Use callee-saved register to adjust stack pointer When adjusting the Stack Pointer at the end of the function epilogue, use a callee-saved register, rather than explicitly using R4 which may not have been saved. Differential Revision: https://reviews.llvm.org/D157500	2023-08-17 18:29:50 +01:00
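
The carry-out corner case described in ebb2e5ebb2 can be checked with a small model. The sketch below is illustrative only and is not LLVM's legalizer code: the function names and the fixed 32-bit width are assumptions chosen for the example, and the two-compare variant is just one way to obtain a correct carry, not necessarily the sequence the patch emits.

```python
MASK32 = 0xFFFFFFFF  # assumed 32-bit register width for this example


def uadde_reference(lhs, rhs, carry_in):
    """Ground truth: add in unbounded integers, then split result and carry."""
    total = lhs + rhs + carry_in
    return total & MASK32, total >> 32


def uadde_single_compare(lhs, rhs, carry_in):
    """Carry derived from one compare, Res <u LHS (the problematic form)."""
    res = (lhs + rhs + carry_in) & MASK32
    return res, int(res < lhs)


def uadde_two_compares(lhs, rhs, carry_in):
    """Hypothetical fix: detect overflow of each addition separately."""
    tmp = (lhs + rhs) & MASK32
    carry1 = int(tmp < lhs)    # overflow from lhs + rhs
    res = (tmp + carry_in) & MASK32
    carry2 = int(res < tmp)    # overflow from adding the carry-in
    return res, carry1 | carry2


if __name__ == "__main__":
    lhs, rhs, carry_in = 5, 0xFFFFFFFF, 1
    # carry_in + rhs wraps to 0, so Res == LHS and Res <u LHS is false.
    print(uadde_single_compare(lhs, rhs, carry_in))
    print(uadde_reference(lhs, rhs, carry_in))
    print(uadde_two_compares(lhs, rhs, carry_in))
```

With carry-in 1 and RHS 0xffffffff, the single-compare version returns the right sum but reports no carry, while the reference and the two-compare version both report a carry out.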

52796 Commits