llvm-project

Author	SHA1	Message	Date
Graham Hunter	fbb37e9606	[AArch64] Add an all-in-one histogram intrinsic Based on discussion from https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788 Current interface is: llvm.experimental.histogram(<vecty> ptrs, <intty> inc_amount, <vecty> mask) The integer type used by 'inc_amount' needs to match the type of the buckets in memory. The intrinsic covers the following operations: * Gather load * histogram on the elements of 'ptrs' * multiply the histogram results by 'inc_amount' * add the result of the multiply to the values loaded by the gather * scatter store the results of the add Supports lowering to histcnt instructions for AArch64 targets, and scalarization for all others at present.	2024-05-13 11:35:28 +01:00
Min-Yih Hsu	f8063ffe73	[VP][RISCV] Add vp.reduce.fmaximum/fminimum and its RISC-V codegen (#91782 ) `vp.reduce.fmaximum/fminimum` are the VP version of `vector.reduce.fmaximum/fminimum`.	2024-05-10 16:01:47 -07:00
David Green	8fc9e3d577	[DAG] Lower frem of power-2 using div/trunc/mul+sub (#91148 ) If we are lowering a frem and the divisor is known to be an integer power-2, we can use the formula 'frem = x - trunc(x / d) * d'. This avoids the more expensive call to fmod. The results are identical to fmod so long as d is a power-2 (so the mul does not round incorrectly), and the sign of the return is either always positive or not important for zeroes (nsz). Unfortunately Alive2 does not handle this well at the moment. I was using exhaustive checking to test this: (https://gist.github.com/davemgreen/6078015f30d3bacd1e9572f8db5d4b64). I found this in CPython's implementation of float_pow. I currently added it as a DAG combine for frem with power-2 fp constants.	2024-05-10 14:58:48 +01:00
Simon Pilgrim	caacf8685a	[DAG] Fold freeze(shuffle(x,y,m)) -> shuffle(freeze(x),freeze(y),m) (#90952 ) If the shuffle mask contains no undef elements, then we can move the freeze through a shuffle node. This requires special case handling to create a new ShuffleVectorSDNode. Includes VECTOR_SHUFFLE support for isGuaranteedNotToBeUndefOrPoison / canCreateUndefOrPoison.	2024-05-04 12:03:10 +01:00
zxc12523	171aeb20ad	[DAG] SelectionDAG.computeKnownBits - add NSW/NUW flags support to ISD::SHL handling (#89877 ) fix #89414	2024-05-02 10:31:56 +01:00
Craig Topper	a03eeb0e98	[SelectionDAG][X86] Add a NoWrap flag to SelectionDAG::isAddLike. NFC (#90681 ) If this flag is set, Xor will not be considered AddLike. If an Xor were treated as an Add it may wrap. If we can prove there would be no carry out and thus no wrap, the Xor would be turned into a disjoint Or by DAGCombine. Use this new flag to fix a bug in X86 where an Xor is incorrectly being treated as an NUWAdd. Fixes #90668.	2024-04-30 16:52:56 -07:00
Bjorn Pettersson	55c6bda01e	Revert "Revert "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921 )" and more..." This reverts commit 16bd10a38730fed27a3bf111076b8ef7a7e7b3ee. Re-applies: b3c55b707110084a9f50a16aade34c3be6fa18da - "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)" 8e2f6495c0bac1dd6ee32b6a0d24152c9c343624 - "[DAGCombiner] Do not always fold FREEZE over BUILD_VECTOR (#85932)" 73472c5996716cda0dbb3ddb788304e0e7e6a323 - "[SelectionDAG] Treat CopyFromReg as freezing the value (#85932)" with a fix in DAGCombiner::visitFREEZE.	2024-04-29 13:08:52 +02:00
David Spickett	16bd10a387	Revert "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921 )" and more... This reverts: b3c55b707110084a9f50a16aade34c3be6fa18da - "[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921)" (because it updates a test case that I don't know how to resolve the conflict for) 8e2f6495c0bac1dd6ee32b6a0d24152c9c343624 - "[DAGCombiner] Do not always fold FREEZE over BUILD_VECTOR (#85932)" 73472c5996716cda0dbb3ddb788304e0e7e6a323 - "[SelectionDAG] Treat CopyFromReg as freezing the value (#85932)" Due to a test suite failure on AArch64 when compiling for SVE. https://lab.llvm.org/buildbot/#/builders/197/builds/13955 clang: ../llvm/llvm/include/llvm/CodeGen/ValueTypes.h:307: MVT llvm::EVT::getSimpleVT() const: Assertion `isSimple() && "Expected a SimpleValueType!"' failed.	2024-04-29 09:47:41 +01:00
Björn Pettersson	b3c55b7071	[SelectionDAG] Handle more opcodes in canCreateUndefOrPoison (#84921 ) [SelectionDAG] Handle more opcodes in canCreateUndefOrPoison Handle SELECT_CC similarly as SETCC. Handle these operations that only propagate poison/undef based on the input operands: SADDSAT, UADDSAT, SSUBSAT, USUBSAT, MULHU, MULHS, SMIN, SMAX, UMIN, UMAX These operations may create poison based on shift amount and exact flag being violated: SRL, SRA One goal here is to allow pushing freeze through these operations when allowed, as well as letting analyses such as isGuaranteedNotToBeUndefOrPoison to not break on such operations. Since some problems have been observed with pushing freeze through SRA/SRL we block that explicitly in DAGCombiner::visitFreeze now. That way we can still model SRA/SRL properly in SelectionDAG::canCreateUndefOrPoison, e.g. when used by isGuaranteedNotToBeUndefOrPoison, even if we do not want to push freeze through those instructions.	2024-04-29 07:56:49 +02:00
Bjorn Pettersson	73472c5996	[SelectionDAG] Treat CopyFromReg as freezing the value (#85932 ) The description of CopyFromReg in ISDOpcodes.h says that the input value is defined outside the scope of the current SelectionDAG. I think that means that we basically can treat it as a FREEZE in the sense that it can be seen as neither being undef nor poison. Being able to fold freeze(CopyFromReg) into CopyFromReg seems useful to avoid regressions if we start to introduce freeze instruction in DAGCombiner/foldBoolSelectToLogic, e.g. to solve https://github.com/llvm/llvm-project/issues/84653 Things _not_ dealt with in this patch: - Depending on calling convention an input argument can be passed also on the stack and not in a register. If it is allowed to treat an argument received in a register as not being poison, then I think we want to treat arguments received on the stack the same way. But then we need to attribute load instructions, or add explicit FREEZE when lowering formal arguments. - A common pattern is that there is an AssertZext or AssertSext just after CopyFromReg. I think that if we treat CopyFromReg as never being poison, then it should be allowed to fold freeze(AssertZext(CopyFromReg)) -> AssertZext(CopyFromReg)	2024-04-26 13:41:21 +02:00
Philip Reames	f4e3daa562	[DAG] Early exit for flags in canCreateUndefOrPoison [nfc] (#89834 ) This matches the style used in the Analysis version of this routine, and makes it less likely we'll miss a poison generating flag in future changes. Unlike IR, the check for poison generating flags doesn't need to switch over opcode since all nodes have the SDFlags storage.	2024-04-25 09:12:59 -07:00
Craig Topper	c5dcb5239e	[SelectionDAG] Move GlobalAddressSDNode and AddrSpaceCastSDNode constructors into header. NFC These constructors are no more complicated than any of the other *SDNode constructors that are already in the header.	2024-04-24 13:11:57 -07:00
Craig Topper	fc538b070d	[SelectionDAG] Pass SDVTList instead of VTs to SDNode constructors. NFC (#89880 ) All of these constructors were creating a SDVTList using an EVT created by SDNode::getValueTypeList. This EVT needs to live at least as long as the SDNode that uses it. To do this, SDNode::getValueTypeList contains several function scoped static variables that hold the memory for the EVT. So the EVT lives until global destructors run. This is problematic since an EVT contains a Type* that points to memory allocated by an LLVMContext. If multiple LLVMContexts are used that don't have overlapping lifetimes, we can end up with stale or incorrect pointers cached in the EVTs owned by SDNode::getValueTypeList. I want to try to make the EVTs be owned by SelectionDAG instead. This is already done for SDVTLists with more than 1 VT. The single value case is a very old optimization that should be re-evaluated. In order to do this, I need the SDVTLists to be created by SelectionDAG rather than by the SDNode itself. This patch doesn't change how the allocation is done yet. It just moves the code around. This patch does reduce the number of calls to getVTList since we now share with the call needed for the SDNode FoldingSet. Part of fixing #88233.	2024-04-24 12:31:14 -07:00
Simon Pilgrim	9f2a068bff	[DAG] Add getValid*ShiftAmountConstant wrappers without DemandedElts Simplify callers which don't have their own DemandedElts mask. Noticed while reviewing #88801	2024-04-24 13:26:43 +01:00
Pierre van Houtryve	cf328ff96d	[IR] Memory Model Relaxation Annotations (#78569 ) Implements the core/target-agnostic components of Memory Model Relaxation Annotations. RFC: https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5	2024-04-24 08:52:25 +02:00
Craig Topper	b82a4bfb54	[SelectionDAG] Remove unnecessary cast of nullptr in std::fill call. NFC	2024-04-23 22:51:20 -07:00
AtariDreams	a4bacb0f42	[SelectionDAG] Remove redundant KnownBits smin and smax operations (#89519 ) It turns out that if any of the operations can be zero, and neither of the operands can be proven to be positive, it is possible for smax to be zero, and KnownBits cannot prove otherwise even with KnownBits::smax. In fact, proving it based on the KnownBits itself at that point without increasing the depth is actually, provably impossible. Same with smin. This covers all the possible cases and is proven to be complete.	2024-04-22 09:48:31 +02:00
Craig Topper	ce48f43f05	[SelectionDAG] Require UADDO_CARRY carryin and carryout to have the same type. (#89255 ) This requires type legalization to keep them the same. This means we no longer need to legalize the operand since it will be legalized when we legalize the second result.	2024-04-19 12:38:53 -07:00
Craig Topper	823eb1a325	[SelectionDAG] Add some validation of (S/U)(ADD/SUB)O_CARRY nodes. (#89133 )	2024-04-17 16:52:02 -07:00
Paul Walker	7089c359a3	[LLVM][SelectionDAG] Allow verification of target ISD nodes. (#88121 ) Patch includes an initial implementation for AArch64 that covers a handful of nodes where I've observed bogus nodes within the DAG.	2024-04-15 13:50:09 +01:00
fengfeng	7177dc2ef7	[SDAG] Apply or-disjoint in SelectionDAG::isBaseWithConstantOffset (#88493 ) Signed-off-by: feng.feng <feng.feng@iluvatar.com>	2024-04-15 13:59:48 +09:00
Craig Topper	6a85cf8fc0	[SelectionDAG] Verify SPLAT_VECTOR nodes when they are created. (#88305 ) This applies the same rules we have for the scalar operands of a BUILD_VECTOR where the scalar type must match the element type or for integer vectors we allow the scalar type to be larger than the element type. Hexagon uses i32 for an FP zero vector so we allow that as an exception.	2024-04-12 10:22:21 -07:00
SahilPatidar	ab037c4ff3	[DAG] computeKnownBits - add ISD::ABDU/ISD::ABDS handling #84905 (#88253 ) Resolve #84905	2024-04-12 13:04:54 +01:00
AtariDreams	5d6b00929b	[NFC] Replace m_Sub(m_Zero(), X) with m_Neg(X) (#88461 )	2024-04-12 18:24:03 +09:00
Simon Pilgrim	2d0087424f	[DAG] Remove extract_vector_elt(freeze(x)), idx -> freeze(extract_vector_elt(x), idx) fold (#87480 ) Reverse the fold with handling inside canCreateUndefOrPoison for cases where we know that the extract index is in bounds. This exposed a number of regressions, and required some initial freeze handling of SCALAR_TO_VECTOR, which will require us to properly improve demandedelts support to handle its undef upper elements. There is still one outstanding regression to be addressed in the future - how do we want to handle folds involving frozen loads? Fixes #86968	2024-04-04 11:10:55 +01:00
Luke Lau	3a7b5223a6	[DAGCombiner][RISCV] Handle truncating splats in isNeutralConstant (#87338 ) On RV64, we legalize zexts of i1s to (vselect m, (splat_vector i64 1), (splat_vector i64 0)), where the splat_vectors are implicitly truncating. When the vselect is used by a binop we want to pull the vselect out via foldSelectWithIdentityConstant. But because vectors with an element size < i64 will truncate, isNeutralConstant will return false. This patch handles truncating splats by getting the APInt value and truncating it. We almost don't need to do this since most of the neutral elements are either one/zero/all ones, but it will make a difference for smax and smin. I wasn't able to figure out a way to write the tests in terms of select, since we need the i1 zext legalization to create a truncating splat_vector. This supersedes #87236. Fixed vectors are unfortunately not handled by this patch (since they get legalized to _VL nodes), but they don't seem to appear in the wild.	2024-04-04 12:36:15 +08:00
Atousa Duprat	4aba595f09	[ADT] Add signed and unsigned mulh to APInt (#84719 ) Fixes #84207	2024-04-02 17:07:56 +01:00
Sizov Nikita	6654235594	[SelectionDAG] implement computeKnownBits for add AVG* instructions (#86754 ) knownBits calculation for AVGFLOORU / AVGFLOORS / AVGCEILU / AVGCEILS instructions Prerequisite for #76644	2024-04-02 10:39:49 +01:00
AtariDreams	f5a067bb90	[SelectionDAG]: Deduce KnownNeverZero from SMIN and SMAX (#85722 )	2024-03-25 10:35:28 +00:00
Harvin Iriawan	57146daeaa	[CodeGen] Update for scalable MemoryType in MMO (#70452 ) Remove getSizeOrUnknown call when MachineMemOperand is created. For Scalable TypeSize, the MemoryType created becomes a scalable_vector. 2 MMOs that have scalable memory access can then use the updated BasicAA that understands scalable LocationSize. Original Patch by Harvin Iriawan Co-authored-by: David Green <david.green@arm.com>	2024-03-23 12:56:25 +00:00
Yingwei Zheng	6c1932ffd8	[LLVM] Pass APInt by const reference. NFC. (#86278 ) This patch adjusts argument passing for `APInt` to improve the compile-time. Compile-time improvement: https://llvm-compile-time-tracker.com/compare.php?from=d1f182c895728d89c5c3d198b133e212a5d9d4a3&to=32d6611af69bf4e76373f9bc7d9649650f760e48&stat=instructions:u	2024-03-23 14:57:35 +08:00
Simon Pilgrim	e4fa2e3562	[DAG] isGuaranteedNotToBeUndefOrPoisonForTargetNode - add fallback implementation (#86125 ) Allow targets to rely on TargetLowering::isGuaranteedNotToBeUndefOrPoisonForTargetNode to test nodes for canCreateUndefOrPoisonForTargetNode + all arguments are isGuaranteedNotToBeUndefOrPoison. Targets can still perform this themselves for specific special case nodes (e.g. target shuffles). Matches the fallback in SelectionDAG::isGuaranteedNotToBeUndefOrPoison	2024-03-21 15:11:59 +00:00
Simon Pilgrim	2377b9773d	[DAG] SimplifyShift - shift i1/vXi1 X, Y --> X (any non-zero shift amount is undefined). Alive2: https://alive2.llvm.org/ce/z/SdESbg Fixes #85681	2024-03-19 20:18:37 +00:00
Jonas Paulsson	8b8e1adbde	[SystemZ] Don't lower ATOMIC_LOAD/STORE to LOAD/STORE (#75879 ) - Instead of lowering float/double ISD::ATOMIC_LOAD / ISD::ATOMIC_STORE nodes to regular LOAD/STORE nodes, make them legal and select those nodes properly instead. This avoids exposing them to the DAGCombiner. - AtomicExpand pass no longer casts float/double atomic load/stores to integer (FP128 is still casted).	2024-03-18 17:21:50 -04:00
David Green	18da51b2b2	[CodeGen] More uses of LocationSize::beforeOrAfterPointer(). As an extension to #84751, this adds some extra uses of beforeOrAfterPointer() instead of UnknownSize.	2024-03-18 20:18:49 +00:00
David Green	601e102bdb	[CodeGen] Use LocationSize for MMO getSize (#84751 ) This is part of #70452 that changes the type used for the external interface of MMO to LocationSize as opposed to uint64_t. This means the constructors take LocationSize, and convert ~UINT64_C(0) to LocationSize::beforeOrAfter(). The getSize methods return a LocationSize. This allows us to be more precise with unknown sizes, not accidentally treating them as unsigned values, and in the future should allow us to add proper scalable vector support but none of that is included in this patch. It should mostly be an NFC. Global ISel is still expected to use the underlying LLT as it needs, and are not expected to see unknown sizes for generic operations. Most of the changes are hopefully fairly mechanical, adding a lot of getValue() calls and protecting them with hasValue() where needed.	2024-03-17 18:15:56 +00:00
Atousa Duprat	aff0570891	[ADT] Add implementations for avgFloor and avgCeil to APInt (#84431 ) Supports both signed and unsigned expansions. SelectionDAG now calls the APInt implementation of these functions. Fixes #84211.	2024-03-14 10:00:08 +00:00
Simon Pilgrim	f18d78b477	[DAG] isKnownToBeAPowerOfTwo - use sd_match to match both commutations of `x & -x` pattern. NFC. Allows us to remove some tricky commutation matching	2024-03-13 14:47:36 +00:00
Simon Pilgrim	3358838446	[ADT] Add APIntOps::abds signed absolute difference and rename absdiff -> abdu (#84791 ) When I created APIntOps::absdiff, I totally missed that we already have ISD::ABDS/ABDU nodes, and we use this term in other places/targets as well. I've added the APIntOps::abds implementation and renamed APIntOps::absdiff to APIntOps::abdu. Given that APIntOps::absdiff is so young I don't think we need to create a deprecation wrapper, but I can if anyone thinks it important. I'll do a KnownBits rename patch after this.	2024-03-12 10:41:59 +00:00
Noah Goldstein	a9d913ebcd	[KnownBits] Add API support for `exact` in `lshr`/`ashr`; NFC	2024-03-11 15:51:06 -05:00
Craig Topper	6b270358c7	[SelectionDAG] Allow FREEZE to be hoisted before FP SETCC. (#84358 ) No nans/infs in SelectionDAG is complicated. Hopefully I've captured all of the cases. I've only applied ConsiderFlags to the SDNodeFlags since those are the only ones that will be dropped by hoisting. The condition code and TargetOptions would still be in effect. Recovers some regression from #84232.	2024-03-08 17:21:21 -08:00
Craig Topper	a456885efc	[SelectionDAG] Allow FREEZE to be hoisted before integer SETCC. (#84241 ) Teach canCreateUndefOrPoison that ISD::SETCC with integer operands can never create undef/poison. FP SETCC is more complicated and will be handled in a future patch. Teach isGuaranteedNotToBeUndefOrPoison that ISD::CONDCODE is not poison/undef. It's a special constant only used by setcc/select_cc like nodes. This is needed since the hoisting will only hoist if exactly one operand might be poison. setcc has 3 operands including the condition code. Recovers some regression from #84232.	2024-03-08 10:17:54 -08:00
Noah Goldstein	61c06775c9	[KnownBits] Add API for `nuw` flag in `computeForAddSub`; NFC	2024-03-05 12:59:58 -06:00
David Green	dbca8a49b6	[DAG] Improve known bits of Zext/Sext loads with range metadata (#80829 ) This extends the known bits for extending loads which have range metadata, handling the range metadata on the original memory type, extending that to the correct BitWidth.	2024-02-29 12:53:13 +00:00
Craig Topper	e7a303e3cf	[SelectionDAG] Remove unused getIndexedStridedLoadVP/getIndexedStridedStoreVP functions. NFC (#82847 ) These appear to have been copied from getIndexedLoadVP/getIndexedStoreVP which in turn were copied from the non-VP versions.	2024-02-28 15:02:48 -08:00
Noah Goldstein	15a7de697a	[SelectionDAG] Support sign tracking through `{S|U}INT_TO_FP` Just a minimal amount of easily provable tracking. Proofs: https://alive2.llvm.org/ce/z/RQYbdw Closes #82808 Alive2 has an issue with `(sitofp i1)`, but it can be verified by hand: https://godbolt.org/z/qKr7hT7s9	2024-02-26 15:35:38 -06:00
Craig Topper	962a6970f2	[SelectionDAG] Remove unused VP strided load/store creation functions that build an MMO. (#82676 ) The base case of these call InferPtrInfo. This is dangerous due to #82657, but it turns out none of these are used. It seemed best to reduce the surface area until these are needed.	2024-02-23 10:15:49 -08:00
Craig Topper	f8cbb67b10	[DAGCombiner] Preserve nneg flag from inner zext when we combine (z/s/aext (zext X)) (#82199 )	2024-02-19 12:21:17 -08:00
Simon Pilgrim	d30e941a03	[DAG] Add SelectionDAG::getShiftAmountConstant APInt variant (#81484 ) Asserts that the shift amount is in range and update ExpandShiftByConstant to use getShiftAmountConstant (and legal shift amount types).	2024-02-13 08:06:16 +00:00
Luke Lau	ece66dbc60	[SelectionDAG] Add computeKnownBits support for ISD::STEP_VECTOR (#80452 ) This handles two cases where we can work out some known-zero bits for ISD::STEP_VECTOR. The first case handles when we know the low bits are zero because the step amount is a power of two. This is taken from https://reviews.llvm.org/D128159, and even though the original patch didn't end up landing this case due to it not having any test difference, I've included it here for completeness's sake. The second case handles the case when we have an upper bound on vscale_range. We can use this to work out the upper bound on the number of elements, and thus what the maximum step will be. From the maximum step we then know which hi bits are zero. On its own, computing the known hi bits results in some small improvements for RVV with -mrvv-vector-bits=zvl across the llvm-test-suite. However I'm hoping to be able to use this later to reduce the LMUL in index calculations for vrgather/indexed accesses. --------- Co-authored-by: Philip Reames <preames@rivosinc.com>	2024-02-08 10:04:55 +08:00
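
The formula quoted in commit 8fc9e3d577, 'frem = x - trunc(x / d) * d', can be sketched in plain C++ as below. This only illustrates the arithmetic and is not the DAG combine itself. The helper name `fremPow2` is hypothetical, and the sketch assumes `d` is a finite power of two and that `x / d` neither overflows nor underflows.

```cpp
#include <cmath>

// Hypothetical helper (not part of LLVM): remainder of x by d, assuming d is
// a finite power of two. Under that assumption the division and the multiply
// only adjust the exponent, so no rounding error is introduced.
double fremPow2(double x, double d) {
  double quotient = std::trunc(x / d); // round the quotient toward zero
  return x - quotient * d;             // subtract the whole multiples of d
}
```

For instance, `fremPow2(7.5, 2.0)` and `std::fmod(7.5, 2.0)` both give 1.5. The two differ only in the sign of a zero result for negative `x`: `std::fmod(-4.0, 2.0)` is -0.0 while the sketch yields +0.0, which matches the nsz caveat in the commit message.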


2544 Commits