llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	b2ffc867ad	[DAG] getNode() - begin generalizing the (zext (trunc (assertzext x))) -> (assertzext x) fold. We'll need to generalize this fold to check for any zero upperbits to address some of the D155472 regressions, but this exposes a number of issues. For now, just use the general MaskedValueIsZero test instead of the assertzext.	2023-09-18 15:32:31 +01:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Simon Pilgrim	e6b85c3027	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case (REAPPLIED) Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Reapplied after reversion at e1e3c75c7dad72 with a tweak to the pseudo-probe-peep.ll test Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 12:33:39 +01:00
Simon Pilgrim	e1e3c75c7d	Revert rG6c56cf71ee82ec3a28e0dfc2b751bd10c16929da "[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case" Need to address a missed test change	2023-09-13 11:27:47 +01:00
Simon Pilgrim	6c56cf71ee	[DAG] FoldSetCC - add missing icmp(X,undef) -> isTrueWhenEqual case Followup to D59363 which failed to handle the icmp(X,undef) -> isTrueWhenEqual case - similar to llvm::ConstantFoldCompareInstruction As discussed on the review, this is affecting some previously reduced test cases, but will also prevent reductions from relying on this inconsistent behaviour in the future. Differential Revision: https://reviews.llvm.org/D158068	2023-09-13 11:01:58 +01:00
Mohamed Atef	741c127817	[SelectionDAG] Add computeOverflowForSignedMul / computeOverflowForUnsignedMul overflow handlers Support signed multiplication Support unsigned multiplication Differential Revision: https://reviews.llvm.org/D159406	2023-09-07 10:03:18 +01:00
Simon Pilgrim	84447c044f	[DAG] Add SelectionDAG::isADDLike helper. NFC. Make the DAGCombine helper global so we can more easily reuse it.	2023-09-06 16:54:25 +01:00
Ting Wang	71be020dda	[SelectionDAG][PowerPC] Memset reuse vector element for tail store On PPC there are instructions to store element from vector(e.g. stxsdx/stxsiwx), and these instructions can be leveraged to avoid tail constant in memset and constant splat array initialization. This patch tries to explore these opportunities. Reviewed By: shchenz Differential Revision: https://reviews.llvm.org/D138883	2023-09-06 01:52:38 -04:00
Simon Pilgrim	5463503ae1	[DAG] Move scalar BITCAST constant folds from getNode to FoldConstantArithmetic	2023-09-05 13:11:20 +01:00
Matt Arsenault	b14e83d1a4	IR: Add llvm.exp10 intrinsic We currently have log, log2, log10, exp and exp2 intrinsics. Add exp10 to fix this asymmetry. AMDGPU already has most of the code for f32 exp10 expansion implemented alongside exp, so the current implementation is duplicating nearly identical effort between the compiler and library which is inconvenient. https://reviews.llvm.org/D157871	2023-09-01 19:45:03 -04:00
Simon Pilgrim	15b561ed38	[DAG] Move STEP_VECTOR constant fold from getNode to FoldConstantArithmetic	2023-09-01 15:47:37 +01:00
Simon Pilgrim	1d47d5d67c	[DAG] Move F16<->FP constant folds from getNode to FoldConstantArithmetic	2023-09-01 15:47:36 +01:00
Simon Pilgrim	4b9c2cf0a7	[DAG] Move INT<->FP constant folds from getNode to FoldConstantArithmetic	2023-09-01 14:02:02 +01:00
Daniel Paoliello	0c5c7b52f0	Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table; it contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what a jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: `0fe89a942f/cvdump/dumpsym7.cpp (L5518)` This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367	2023-08-31 12:06:50 -07:00
Simon Pilgrim	376050db9f	[DAG] Move some unary constant folds from getNode() to FoldConstantArithmetic() We need to clean up some type handling before the remainder (int<->fp and bitcasts) can be moved over.	2023-08-30 13:59:28 +01:00
Craig Topper	299b1b4071	[SelectionDAG][RISCV] Teach getConstant to use SPLAT_VECTOR_PARTS if vXi64 SPLAT_VECTOR is legal but i64 scalars are not. That matches how such a SPLAT_VECTOR would have been type legalized so assume it is ok to use for creating constants after type legalization. Still need some improvements to SPLAT_VECTOR lowering. This overlaps with some of what D158742 was trying to fix. Reviewed By: luke Differential Revision: https://reviews.llvm.org/D158870	2023-08-29 09:22:17 -07:00
Luke Lau	8f1d1e2b61	[SDAG] Add computeKnownBits support for ISD::SPLAT_VECTOR_PARTS We can work out the known bits for a given lane by concatenating the known bits of each scalar operand. In the description of ISD::SPLAT_VECTOR_PARTS in ISDOpcodes.h it says that the total size of the scalar operands must cover the output element size, but I've added a stricter assertion here that the total width of the scalar operands must be exactly equal to the element size. It doesn't seem to trigger, and I'm not sure if there are any targets that use SPLAT_VECTOR_PARTS for anything other than v4i32 -> v2i64 splats. We also need to include it in isTargetCanonicalConstantNode, otherwise returning the known bits introduces an infinite combine loop. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D158852	2023-08-28 10:35:58 +01:00
Arthur Eubanks	0a4fc4ac1c	Revert "Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables" This reverts commit 8d0c3db388143f4e058b5f513a70fd5d089d51c3. Causes crashes, see comments in https://reviews.llvm.org/D149367. Some follow-up fixes are also reverted: This reverts commit 636269f4fca44693bfd787b0a37bb0328ffcc085. This reverts commit 5966079cf4d4de0285004eef051784d0d9f7a3a6. This reverts commit e7294dbc85d24a08c716d9babbe7f68390cf219b.	2023-08-25 18:34:15 -07:00
Daniel Paoliello	8d0c3db388	Emit the CodeView `S_ARMSWITCHTABLE` debug symbol for jump tables The CodeView `S_ARMSWITCHTABLE` debug symbol is used to describe the layout of a jump table; it contains the following information: * The address of the branch instruction that uses the jump table. * The address of the jump table. * The "base" address that the values in the jump table are relative to. * The type of each entry (absolute pointer, a relative integer, a relative integer that is shifted). Together this information can be used by debuggers and binary analysis tools to understand what a jump table indirect branch is doing and where it might jump to. Documentation for the symbol can be found in the Microsoft PDB library dumper: `0fe89a942f/cvdump/dumpsym7.cpp (L5518)` This change adds support to LLVM to emit the `S_ARMSWITCHTABLE` debug symbol as well as to dump it out (for testing purposes). Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D149367	2023-08-25 10:19:17 -07:00
Kazu Hirata	134115618a	[CodeGen] Use isAllOnesConstant and isNullConstant (NFC)	2023-08-20 22:56:40 -07:00
Jeffrey Byrnes	d26a06728d	[DAG] NFC: Add getBitcastedExtOrTrunc Simple function which scalarizes Ops then ExtOrTruncs them according to function parameters Differential Revision: https://reviews.llvm.org/D157733 Change-Id: Ie5215069228f7bf530cd2dbb4bd17cbf409e046a	2023-08-17 14:29:17 -07:00
Paul Walker	566065207b	[SelectionDAG] Use TypeSize variant of ComputeValueVTs to compute correct offsets for scalable aggregate types. Differential Revision: https://reviews.llvm.org/D157872	2023-08-16 11:56:31 +00:00
Noah Goldstein	2549ec1866	[SelectionDAG] Improve `isKnownToBeAPowerOfTwo` Add additional cases for: select, vselect, {u,s}{min,max}, and, casts, rotl, rotr And improve handling of constants and shifts. Differential Revision: https://reviews.llvm.org/D156778	2023-08-16 02:00:15 -05:00
Noah Goldstein	ac485e4072	[SelectionDAG] Add/Improve cases in `isKnownNeverZero` 1) Handle casts a bit more cleanly just with a loop rather than with recursion. 2) Add additional cases for smin/smax 3) For shifts we can also deduce non-zero if the maximum shift amount on the known 1s is non-zero. Differential Revision: https://reviews.llvm.org/D156777	2023-08-16 02:00:15 -05:00
David Green	de775f264d	[DAG] Add constant SPLAT handling in getNodes SIGN_EXTEND_INREG This helps simplify constant splats a little. Without this the code in llvm/lib/CodeGen/SelectionDAG/DAGCombiner.cpp#L14072 always returns the existing node. Differential Revision: https://reviews.llvm.org/D157259	2023-08-08 10:27:55 +01:00
Matt Arsenault	0efdf3baf5	DAG: Remove getTargetIndex as it's unused Fixes #29973	2023-08-05 09:20:09 -04:00
Bjorn Pettersson	4ce7c4a92a	[llvm] Drop some typed pointer handling/bitcasts Differential Revision: https://reviews.llvm.org/D157016	2023-08-03 22:54:33 +02:00
Simon Pilgrim	076bee1020	[DAG] getNode() - fold (zext (trunc (assertzext x))) -> (assertzext x) If the pre-truncated value was the same width as the extension, and the assertzext guarantees that the extended bits are already zero, then skip the zext/trunc 'zero_extend_inreg' pattern. Addresses several regressions noticed in D155472	2023-07-31 10:43:11 +01:00
Zhongyunde	05aae0839f	Reland [AArch64][NFC] Call the API getVScaleRange directly Use the maximum 64 for BitWidth of getVScaleRange to avoid returning an empty range. the previous changes bring in a Buildbot failure because MinSVEVectorSize = MinSVEVectorSize. error: explicitly assigning value of variable of type 'unsigned int' to itself [-Werror,-Wself-assign] Reviewed By: sdesmalen, nikic, dmgreen Differential Revision: https://reviews.llvm.org/D155708	2023-07-26 18:55:31 +08:00
Zhongyunde	ebaac2b2d6	Revert "[AArch64][NFC] Call the API getVScaleRange directly" This reverts commit 67005c8e6fa9464f8bc436305a422071013ae499.	2023-07-26 16:44:14 +08:00
Zhongyunde	67005c8e6f	[AArch64][NFC] Call the API getVScaleRange directly Use the maximum 64 for BitWidth of getVScaleRange to avoid returning an empty range. Reviewed By: sdesmalen, nikic, dmgreen Differential Revision: https://reviews.llvm.org/D155708	2023-07-26 15:54:04 +08:00
David Green	0c41c59dee	[DAG][AArch64] Fix truncated vscale constant types It appears that vscale values truncated to i1 cause mismatches in the constant types when created in getNode. https://godbolt.org/z/TaaTo86ne. Differential Revision: https://reviews.llvm.org/D155626	2023-07-20 09:12:05 +01:00
Matt Arsenault	296e24cd2e	DAG: Constant fold frexp nodes Special casing the nonfinite exponent value everywhere is kind of annoying.	2023-07-17 17:34:29 -04:00
Simon Pilgrim	4f95821f58	[DAG] SelectionDAG::getNode() - consistently use N1 for first operand. NFCI. This has been annoying me for years - rename Operand to N1 so it matches all the other getNode() calls, and simplifies my debug watch windows!	2023-07-17 17:17:40 +01:00
Noah Goldstein	a4c461c063	[SelectionDAG] Fill in some more cases in `isKnownNeverZero` This mostly copies cases that already exist in ValueTracking, although it skips the more complex ones. Those can be filled in as needed. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D149199	2023-07-12 17:17:53 -05:00
Matt Arsenault	003b58f65b	IR: Add llvm.frexp intrinsic Add an intrinsic which returns the two pieces as multiple return values. Alternatively could introduce a pair of intrinsics to separately return the fractional and exponent parts. AMDGPU has native instructions to return the two halves, but could use some generic legalization and optimization handling. For example, we should be able to handle legalization of f16 on older targets, and for bf16. Additionally antique targets need a hardware workaround which would be better handled in the backend rather than in library code where it is now.	2023-06-28 14:50:16 -04:00
Simon Pilgrim	64d01432d2	Fix "for for" duplicate typo in comment. NFC.	2023-06-27 11:43:09 +01:00
Alex MacLean	17aa37dd30	[SelectionDAG] Add memory size for CSEMap ID calculation In NVPTX `ReplaceVectorLoad()`, i1 and i8 types are promoted to i16, followed by a truncate operation. Thus, v2i8 (or v2i1) and v2i16 will have the same VTList, which causes a collision in CSEMap. To differentiate the original VTList, let's add the size in generating an ID. Otherwise the compiler crashes in refineAlignment: `MMO->getSize() == getSize() && "Size mismatch!"` Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D153712	2023-06-26 16:12:48 -07:00
Craig Topper	eea865bd4a	Recommit "[SelectionDAG][RISCV] Add very basic PromoteIntegerResult/Op support for VP_SIGN/ZERO_EXTEND." I have fixed an existing DAGCombiner bug that caused the previous assertion failure. See 7163539466d7e8930416e55dd9fd29891f8239f2. Original message We don't have VP_ANY_EXTEND or VP_SIGN_EXTEND_INREG yet so I've deviated a little from the non-VP lowering. My goal was to fix the crashes that occur on these test cases without this patch. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D152854	2023-06-15 12:03:25 -07:00
Alan Zhao	222d73ff7a	Revert "[SelectionDAG][RISCV] Add very basic PromoteIntegerResult/Op support for VP_SIGN/ZERO_EXTEND." This reverts commit 6bf79fb09416b02b3f8589a4998610d70c185dae. Reason: causes Clang to crash during Chrome debug builds: https://crbug.com/1455144	2023-06-15 10:20:03 -07:00
Craig Topper	6bf79fb094	[SelectionDAG][RISCV] Add very basic PromoteIntegerResult/Op support for VP_SIGN/ZERO_EXTEND. We don't have VP_ANY_EXTEND or VP_SIGN_EXTEND_INREG yet so I've deviated a little from the non-VP lowering. My goal was to fix the crashes that occur on these test cases without this patch. Reviewed By: fakepaper56 Differential Revision: https://reviews.llvm.org/D152854	2023-06-14 08:52:56 -07:00
Craig Topper	a5cd198181	[SelectionDAG] Don't allow type legalization to create noop VP_TRUNCATE. Type legalization may need to promote the result to the same type as the input. Instead of forming a vp_truncate with the same source and dest type, don't create any vp_truncate. Handle this in getNode, as is done for ISD::TRUNCATE.	2023-06-13 12:51:24 -07:00
Anna Thomas	26bfbec5d2	[Intrinsic] Introduce reduction intrinsics for minimum/maximum This patch introduces the reduction intrinsic for floating point minimum and maximum which has the same semantics (for NaN and signed zero) as llvm.minimum and llvm.maximum. Reviewed-By: nikic Differential Revision: https://reviews.llvm.org/D152370	2023-06-13 12:29:58 -04:00
Phoebe Wang	1c6fd98ffb	[SelectionDAG] Do not salvage with vector node rG2eb7cbf987f21 added this code, which results in a crash for vector nodes. This patch solves it by skipping vector nodes. Thanks Steve for helping reduce the test case. Co-authored-by: Steve Merritt <steve.merritt@intel.com> Reviewed By: goldstein.w.n Differential Revision: https://reviews.llvm.org/D152492	2023-06-09 14:55:16 +08:00
Matt Arsenault	eece6ba283	IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics AMDGPU has native instructions and target intrinsics for this, but these really should be subject to legalization and generic optimizations. This will enable legalization of f16->f32 on targets without f16 support. Implement a somewhat horrible inline expansion for targets without libcall support. This could be better if we could introduce control flow (GlobalISel version not yet implemented). Support for strictfp legalization is less complete but works for the simple cases.	2023-06-06 17:07:18 -04:00
Serge Pavlov	eecaeb6f10	[FPEnv] Intrinsics for access to FP environment The change implements intrinsics 'get_fpenv', 'set_fpenv' and 'reset_fpenv'. They are used to read floating-point environment, set it or reset to some default state. They do the same actions as C library functions 'fegetenv' and 'fesetenv'. By default these intrinsics are lowered to calls to these functions. The new intrinsics specify FP environment as a value of integer type, which is convenient for most targets where the FP state is the content of some register. Some targets however use long representations. On X86 the size of FP environment is 256 bits, and even half of this size is not a legal integer type. To facilitate legalization in such cases, two sets of DAG nodes are used. Nodes GET_FPENV and SET_FPENV are used when FP environment may be represented by a legal integer type. Nodes GET_FPENV_MEM and SET_FPENV_MEM consider FP environment as a region in memory, much like `fesetenv` and `fegetenv` do. They are used when target has long representation for floating-point state. Differential Revision: https://reviews.llvm.org/D71742	2023-06-05 13:10:01 +07:00
Nikita Popov	e506bfa7ae	[SDAG] Fix incorrect use of undef for boolean contents (PR63055) FoldSetCC() returns UNDEF in a number of cases. However, the SetCC result must follow BooleanContents. Unless the type is a pre-legalization i1 or we have UndefinedBooleanContents, the use of UNDEF will not uphold the requirement that the top bits are either zero or match the low bit. In such cases, return zero instead. Fixes https://github.com/llvm/llvm-project/issues/63055. Differential Revision: https://reviews.llvm.org/D151883	2023-06-01 15:19:22 +02:00
David Green	7740216f2e	[DAG] Combine insert(shuffle(load), load, 0) into a single load Given an insert of a scalar load into a vector shuffle with mask u,0,1,2,3,4,5,6 or 1,2,3,4,5,6,7,u (depending on the insert index), it can be more profitable to convert to a single load and avoid the shuffles. This adds a DAG combine for it, providing the new load is still fast. Differential Revision: https://reviews.llvm.org/D151029	2023-05-31 19:48:57 +01:00
Dhruv Chawla	3b3912e9b8	Reapply [SelectionDAG] Handle NSW for ADD/SUB in computeKnownBits() This exposed a miscompile due to incorrect flag preservation in integer type legalization, which has been fixed in D151472. ----- This patch is a continuation of D150110. It separates the cases for ADD and SUB into their own cases so that computeForAddSub can be directly called and the NSW flag passed. This allows better optimization when the NSW flag is enabled, and allows fixing up the TODO that was there previously in SimplifyDemandedBits. Differential Revision: https://reviews.llvm.org/D150769	2023-05-31 12:25:41 +02:00
Craig Topper	a4f437f012	SelectionDAG: Teach ComputeKnownBits about VSCALE This reverts commit 9b92f70d4758f75903ce93feaba5098130820d40. The issue with the re-applied change was an implicit truncation due to the multiplication. Although the operations were converted to `APInt`, the values were implicitly converted to `long` due to the typing rules. Fixes: #59594 Differential Revision: https://reviews.llvm.org/D140347	2023-05-26 10:48:49 -07:00
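
The newest entry in this log (b2ffc867ad) generalizes the (zext (trunc x)) -> x fold from an assertzext check to a test that the upper bits are zero. The following is a minimal C++ sketch of why that condition is sufficient, using plain 64-bit integers rather than SelectionDAG nodes. The helper names `upperBitsAreZero` and `zextOfTrunc` are hypothetical and are not part of LLVM; the real fold reasons about bits that are provably zero for every input, not about one concrete value.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: models zext(trunc(x)) by truncating a 64-bit value to
// `narrowBits` bits and zero-extending it back to 64 bits.
inline std::uint64_t zextOfTrunc(std::uint64_t x, unsigned narrowBits) {
  // Mask selecting the low `narrowBits` bits (all 64 bits if nothing is cut).
  const std::uint64_t lowMask =
      narrowBits >= 64 ? ~std::uint64_t{0}
                       : ((std::uint64_t{1} << narrowBits) - 1);
  return x & lowMask;
}

// Hypothetical helper: true when every bit that the truncate would discard is
// already zero in `x`. This plays the role of the "upper bits are zero" test.
inline bool upperBitsAreZero(std::uint64_t x, unsigned narrowBits) {
  const std::uint64_t highMask =
      narrowBits >= 64 ? std::uint64_t{0}
                       : ~((std::uint64_t{1} << narrowBits) - 1);
  return (x & highMask) == 0;
}

// The fold replaces zext(trunc(x)) with x. It preserves the value exactly
// when the discarded bits were zero to begin with.
inline bool foldPreservesValue(std::uint64_t x, unsigned narrowBits) {
  return zextOfTrunc(x, narrowBits) == x;
}
```

With a narrow width of 8, the value 0xFF passes `upperBitsAreZero` and `zextOfTrunc` returns it unchanged, while 0x1FF fails the check and would be altered to 0xFF, so the fold must not fire in that case.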

2454 Commits