llvm-project

Author	SHA1	Message	Date
Craig Topper	f139bde8d8	[SelectionDAG] Move SDNode::use_iterator::getOperandNo to SDUse. (#120536 ) This allows us to write more range based for loops because we no longer need the iterator. It also matches IR's Use class.	2024-12-19 09:07:42 -08:00
Craig Topper	e6b2495545	[SelectionDAG] Split SDNode::use_iterator into user_iterator and use_iterator. (#120531 ) SDNode::use_iterator now returns an SDUse& when dereferenced. SDNode::user_iterator returns SDNode*. SDNode::use_begin/use_end/uses work on use_iterator. SDNode::user_begin/user_end/users work on user_iterator. We can now write range based for loops using SDUse& and SDNode::uses(). I've converted many of these in this patch. I didn't update loops that have additional variables updated in their for statement. Some loops use SDNode::use_iterator::getOperandNo() which also prevents using range based for loops. I plan to move this into SDUse in a follow up patch.	2024-12-19 08:35:32 -08:00
Craig Topper	bd261ecc5a	[SelectionDAG] Add SDNode::user_begin() and use it in some places (#120509 ) Most of these are just places that want the first user and aren't iterating over the whole list. While there I changed some use_size() == 1 to hasOneUse() which is more efficient. This is part of an effort to rename use_iterator to user_iterator and provide a use_iterator that dereferences to SDUse&. This patch helps reduce the diff on later patches.	2024-12-18 22:13:04 -08:00
Craig Topper	4ca4287da4	[SelectionDAG] Replace findGlueUse in SelectionDAGISel with SDNode::getGluedUser. NFC (#120512 )	2024-12-18 21:46:52 -08:00
Craig Topper	104ad9258a	[SelectionDAG] Rename SDNode::uses() to users(). (#120499 ) This function is most often used in range based loops or algorithms where the iterator is implicitly dereferenced. The dereference returns an SDNode * of the user rather than SDUse * so users() is a better name. I've long been annoyed that we can't write a range based loop over SDUse when we need getOperandNo. I plan to rename use_iterator to user_iterator and add a use_iterator that returns SDUse& on dereference. This will make it more like IR.	2024-12-18 20:09:33 -08:00
Zhaoxin Yang	f334db92be	[llvm][CodeGen] Intrinsic `llvm.powi.*` code gen for vector arguments (#118242 ) Scalarize vector FPOWI instead of promoting the type. This allows the scalar FPOWIs to be visited and converted to libcalls before promoting the type. FIXME: This should be done in LegalizeVectorOps/LegalizeDAG, but call lowering needs the unpromoted EVT. Without this patch, in some backends, such as RISCV64 and LoongArch64, the i32 type is illegal and will be promoted. This causes exponent type check to fail when ISD::FPOWI node generates a libcall. Fix https://github.com/llvm/llvm-project/issues/118079	2024-12-19 08:57:31 +08:00
Benjamin Maxwell	a7dafea384	[SDAG] Allow folding stack slots into sincos/frexp in more cases (#118117 ) This adds a new helper `canFoldStoreIntoLibCallOutputPointers()` to check that it is safe to fold a store into a node that will expand to a library call that takes output pointers. This requires checking for two (independent) properties: 1. The store is not within a CALLSEQ_START..CALLSEQ_END pair * If it is, the expansion would lead to nested call sequences (which is invalid) 2. The node does not appear as a predecessor to the store * If it does, attempting to merge the store into the call would result in a cycle in the DAG These two properties are checked as part of the same traversal in `canFoldStoreIntoLibCallOutputPointers()`	2024-12-17 10:54:17 +00:00
Simon Pilgrim	0954c67d7a	[DAG] visitFREEZE - only fold integer types to an all ones constant ISD::isBuildVectorAllOnes can peek through bitcasts, so this can match against FP NAN (ish) data (e.g. double (bitcast i64 -1)) under certain circumstances - bail if the type isn't an integer and let bitcast folding handle it first. Fixes #120093	2024-12-16 16:46:38 +00:00
Björn Pettersson	3ad2399148	[DAGCombiner] Refactor and improve ReduceLoadOpStoreWidth (#119564 ) This patch makes a couple of improvements to ReduceLoadOpStoreWidth. When determining the minimum size of "NewBW" we now take byte boundaries into account. If, for example, we touch bits 6-10 we shouldn't accept NewBW=8, because we would fail later when detecting that we can't access bits from two different bytes in memory using a single load. Instead we make sure to align LSB/MSB according to byte size boundaries up front before searching for a viable "NewBW". In the past we only tried to find a "ShAmt" that was a multiple of "NewBW", but now we use a sliding window technique to scan for a viable "ShAmt" that is a multiple of the byte size. This can help find more opportunities for optimization (especially if the original type isn't byte sized, and for big-endian targets when the original load/store is aligned on the most significant bit).	2024-12-16 12:15:11 +01:00
David Green	a35db2880a	[NFC] Remove some unnecessary semicolons All inside LLVM_DEBUG, some of which have been cleaned up by adding block scopes to allow them to format more nicely.	2024-12-16 08:48:57 +00:00
Matt Arsenault	ea632e1b34	Reapply "DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm" (#119575 ) (#119634 ) This reverts commit 40986feda8b1437ed475b144d5b9a208b008782a. Reapply with fix to prevent temporary Twine from going out of scope.	2024-12-11 16:01:48 -08:00
Vitaly Buka	40986feda8	Revert "DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm" (#119575 ) Reverts llvm/llvm-project#119485 Breaks builders, details in llvm/llvm-project#119485	2024-12-11 07:51:36 -08:00
Bjorn Pettersson	22780f808a	[DAGCombiner] Fix to avoid writing outside original store in ReduceLoadOpStoreWidth (#119203 ) DAGCombiner::ReduceLoadOpStoreWidth could replace memory accesses with narrower loads/stores, although sometimes the new load/store would touch memory outside the original object. That seemed wrong and this patch simply avoids doing the DAG combine in such situations. Also simplifying the expression used to align ShAmt down to a multiple of NewBW. Subtracting (ShAmt % NewBW) should do the same thing as the old more complicated expression. Intention is to follow up with a patch that makes more attempts, trying to align the memory accesses at other offsets, allowing the transform to trigger in more situations. The current strategy for deciding size (NewBW) and offset (ShAmt) for the narrowed operations is a bit ad hoc, and does not really consider big-endian memory order in the same way as little-endian.	2024-12-11 15:07:16 +01:00
Bjorn Pettersson	bc1f3eb593	[DAGCombiner] Pre-commit test case for ReduceLoadOpStoreWidth. NFC Adding test cases related to narrowing of load-op-store sequences. ReduceLoadOpStoreWidth isn't careful enough, so it may end up creating load/store operations that access memory outside the region touched by the original load/store. Using ARM as a target for the test cases to show what happens for both little-endian and big-endian. This patch also adds a way to override the TLI.isNarrowingProfitable check in DAGCombiner::ReduceLoadOpStoreWidth by using the option -combiner-reduce-load-op-store-width-force-narrowing-profitable. The idea is that it should be simpler to, for example, add lit tests verifying that the code is correct for big-endian (which is otherwise difficult since there are no in-tree big-endian targets that override TLI.isNarrowingProfitable). This is a pre-commit for https://github.com/llvm/llvm-project/pull/119203	2024-12-11 15:07:15 +01:00
Sergei Barannikov	3057ac1c9a	[SelectionDAG] Fix "unused variable" warnings after #119268 (NFC) (#119550 )	2024-12-11 15:15:42 +03:00
Sergei Barannikov	6c7e5827ed	[SelectionDAG] Don't call ComputeValueVTs for "demote register" (NFC) (#119268 ) `ComputeValueVTs` only breaks down aggregate types. For pointer types it is equivalent to calling `TargetLoweringBase::getPointerTy`.	2024-12-11 14:46:12 +03:00
Matt Arsenault	884f2ad6f9	DiagnosticInfo: Clean up usage of DiagnosticInfoInlineAsm (#119485 ) Currently LLVMContext::emitError emits any error as an "inline asm" error which does not make any sense. InlineAsm appears to be special, in that it uses a "LocCookie" from srcloc metadata, which looks like a parallel mechanism to ordinary source line locations. This meant that other types of failures had degraded source information reported when available. Introduce some new generic error types, and only use inline asm in the appropriate contexts. The DiagnosticInfo types are still a bit of a mess, and I'm not sure why DiagnosticInfoWithLocationBase exists instead of just having an optional DiagnosticLocation in the base class. DK_Generic is for any error that derives from an IR level instruction, and thus can pull debug locations directly from it. DK_GenericWithLoc is functionally the generic codegen error, since it does not depend on the IR and instead can construct a DiagnosticLocation from the MI debug location.	2024-12-11 17:16:07 +09:00
Craig Topper	839c8217b9	[LegalizeTypes][RISCV][X86] Legalize FP_ROUND to libcall in SoftPromoteHalfRes_FP_ROUND if the input type is softened. (#119481 ) Previously we created an FP_TO_FP16 and legalized it in SoftenFloatOp_FP_ROUND. This caused i16 to be sent to call lowering instead of f16. This results in the ABI not being followed if f16 is supposed to be passed in a different register than i16. Looking at the libgcc binary for the library function it appears the value is returned in xmm0 so the X86 test was being miscompiled before. Fixes #107607.	2024-12-10 22:21:49 -08:00
Craig Topper	5797ed660a	[GISel][SDAG] Avoid push_back in loops for some shuffle mask handling. (#119434 ) Each call to push_back contains a check to see if the vector needs to grow. Using resize or giving the size to the constructor can reduce the number of checks for growing.	2024-12-10 22:18:46 -08:00
Dan Gohman	e665e781dc	[SelectionDAG] Use the nuw flag when expanding loads. (#119288 ) When expanding a load into two loads, use nuw for the add that computes the offset from the base of the second load, because the original load doesn't straddle the address space. It turns out there's already a dedicated helper function for doing this, `getObjectPtrOffset`. This is in target-independent code, however in practice it only seems to affect WebAssembly code, because WebAssembly load and store instructions' constant offsets don't perform wrapping, so constant folding often depends on the nuw flag being present. This was noticed in the development of #119204.	2024-12-10 06:28:09 -08:00
LiqinWeng	3083acc215	[DAGCombine] Remove oneuse restrictions for RISCV in folding (shl (add_nsw x, c1)), c2) and folding (shl(sext(add x, c1)), c2) in some scenarios (#101294 ) This patch removes the restriction for folding (shl (add_nsw x, c1)), c2) and folding (shl(sext(add x, c1)), c2). The test case is from dhrystone; see these links: riscv32: https://godbolt.org/z/o8GdMKrae riscv64: https://godbolt.org/z/Yh5bPz56z	2024-12-10 11:17:54 +08:00
Sergei Barannikov	e55c167777	[TargetLowering] Return Align from getByValTypeAlignment (NFC) (#119233 )	2024-12-09 23:39:19 +03:00
David Sherwood	8630a7ba7c	Reapply "[DAGCombiner] Add support for scalarising extracts of a vector setcc (#117566 )" (#118823 ) [Reverts d57892a2a153ab71a796f07e39d939eae6910c21] For IR like this: %icmp = icmp ult <4 x i32> %a, splat (i32 5) %res = extractelement <4 x i1> %icmp, i32 1 where there is only one use of %icmp we can take a similar approach to what we already do for binary ops such as add, sub, etc. and convert this into %ext = extractelement <4 x i32> %a, i32 1 %res = icmp ult i32 %ext, 5 For AArch64 targets at least the scalar boolean result will almost certainly need to be in a GPR anyway, since it will probably be used by branches for control flow. I've tried to reuse existing code in scalarizeExtractedBinop to also work for setcc. NOTE: The optimisations don't apply for tests such as extract_icmp_v4i32_splat_rhs in the file CodeGen/AArch64/extract-vector-cmp.ll because scalarizeExtractedBinOp only works if one of the input operands is a constant. --------- Co-authored-by: Paul Walker <paul.walker@arm.com>	2024-12-09 10:56:44 +00:00
abhishek-kaushik22	d20731ce6b	[CGData][GlobalIsel][Legalizer][DAG][MC][AsmParser][X86][AMX] Use `std::move` to avoid copy (#118068 )	2024-12-06 09:46:15 +08:00
Craig Topper	1d3f9f8862	[SelectionDAG] Stop storing EVTs in a function scoped static std::set. (#118715 ) EVTs potentially contain a Type * that points into memory owned by an LLVMContext. Storing them in a function scoped static means they may outlive the LLVMContext they point to. This std::set is used to unique single element VT lists containing a single extended EVT. Single element VT list with a simple EVT are uniqued by a separate cache indexed by the MVT::SimpleValueType enum. VT lists with more than one element are uniqued by a FoldingSet owned by the SelectionDAG object. This patch moves the single element cache into SelectionDAG so that it will be destroyed when SelectionDAG is destroyed. Fixes #88233	2024-12-05 12:56:36 -08:00
Alex MacLean	6018820c48	[NVPTX] Fix lowering of i1 SETCC (#115035 ) Add DAG legalization support for expanding i1 SETCC nodes using appropriate logical operations to simulate integer comparisons. Use these expansions to handle i1 SETCC in NVPTX. fixes #58428 and #57405	2024-12-05 12:54:24 -08:00
Vitaly Buka	d57892a2a1	Revert "[DAGCombiner] Add support for scalarising extracts of a vector setcc" (#118693 ) Reverts llvm/llvm-project#117566 Breaks libc++ tests with HWASAN https://lab.llvm.org/buildbot/#/builders/55/builds/3959	2024-12-04 12:36:46 -08:00
Oliver Stannard	99b862efba	[DAGISel][ARM] Fix vector truncate combine for big-endian (#118101 ) This DAG combine was incorrect for big-endian targets, because it assumes that when a bitcast changes the lane width, the least-significant bits of the wider lanes are in the lower-numbered lanes of the smaller type, which is only true for little-endian.	2024-12-04 14:32:15 +00:00
David Sherwood	4675db5f39	[DAGCombiner] Add support for scalarising extracts of a vector setcc (#117566 ) For IR like this: %icmp = icmp ult <4 x i32> %a, splat (i32 5) %res = extractelement <4 x i1> %icmp, i32 1 where there is only one use of %icmp we can take a similar approach to what we already do for binary ops such as add, sub, etc. and convert this into %ext = extractelement <4 x i32> %a, i32 1 %res = icmp ult i32 %ext, 5 For AArch64 targets at least the scalar boolean result will almost certainly need to be in a GPR anyway, since it will probably be used by branches for control flow. I've tried to reuse existing code in scalarizeExtractedBinop to also work for setcc. NOTE: The optimisations don't apply for tests such as extract_icmp_v4i32_splat_rhs in the file CodeGen/AArch64/extract-vector-cmp.ll because scalarizeExtractedBinOp only works if one of the input operands is a constant.	2024-12-04 10:26:51 +00:00
Sam Elliott	73731d6873	[llvm-tblgen] Increase Coverage Index Size (#118329 )	2024-12-04 09:19:13 +00:00
Simon Pilgrim	b1a48af56a	[DAG] SimplifyDemandedVectorElts - add handling for INT<->FP conversions (#117884 )	2024-12-04 07:37:01 +00:00
Craig Topper	b076fbb844	[TargetLowering] Use Type* instead of EVT in shouldSignExtendTypeInLibCall. (#118587 ) I want to use this function for GISel too so Type * is a better common interface. All of the callers already convert EVT to Type * as needed by calling lowering anyway.	2024-12-03 22:06:55 -08:00
Brandon Wu	109e4a147f	[RISCV] Handle zeroinitializer of vector tuple Type (#113995 ) It doesn't make sense to add a new generic ISD to handle riscv tuple type. Instead we use `SPLAT_VECTOR` for ISD and further lower to `VMV_V_X`. Note: If there's `visitSPLAT_VECTOR` in generic DAG combiner, it needs to skip riscv vector tuple type. Stack on https://github.com/llvm/llvm-project/pull/114329	2024-12-04 13:40:02 +08:00
Craig Topper	caa8aa551b	[SelectionDAG] Rename CallOptions::IsSExt to IsSigned. NFC (#118574 ) This is eventually passed to shouldSignExtendTypeInLibCall which calls it IsSigned.	2024-12-03 18:25:44 -08:00
Nikita Popov	b2df007413	[FastISel] Support unreachable with NoTrapAfterNoReturn (#118296 ) Currently FastISel triggers a fallback if there is an unreachable terminator and the TrapUnreachable option is enabled (the ISD::TRAP selection does not actually work). Add handling for NoTrapAfterNoReturn, in which case we don't actually need to emit a trap. The test is just there to make sure there is no FastISel fallback (which is why I'm not testing the case without noreturn). We have other tests that check the actual unreachable codegen variations.	2024-12-03 12:54:26 +01:00
fengfeng	7907292daa	[DAG] Apply Disjoint flag. (#118045 ) or disjoint (or disjoint (x, c0), c1) --> or disjoint x, or (c0, c1) Alive2: https://alive2.llvm.org/ce/z/3wPth5 --------- Signed-off-by: feng.feng <feng.feng@iluvatar.com>	2024-12-03 09:21:03 +08:00
Craig Topper	73186546f0	[LegalizeTypes][RISCV] Call setTypeListBeforeSoften from ExpandIntRes_FP_TO_XINT if the FP type needs to be softened (#118269 ) This avoids an unnecessary sext.w before the libcall.	2024-12-02 09:06:08 -08:00
Simon Pilgrim	31b7d4333a	[DAG] Extend extract_element(bitcast(scalar_to_vector(X))) -> trunc(srl(X,C)) (#117900 ) When extracting a smaller integer from a scalar_to_vector source, we were limited to only folding/truncating the lowest bits of the scalar source. This patch extends the fold to handle extraction of any other element, by right shifting the source before truncation. Fixes a regression from #117884	2024-11-29 17:24:38 +00:00
antangelo	dd4844722d	[SelectionDAG] Add generic implementation for @llvm.expect.with.probability when optimizations are disabled (#117459 ) Handle \@llvm.expect.with.probability in SelectionDAGBuilder, FastISel, and IntrinsicLowering in the same way \@llvm.expect is handled, where the value is passed through as-is. This can be reached if the intrinsic is used without optimizations, where it would otherwise be properly transformed out. Fixes #115411 for SelectionDAG. A similar patch is likely needed for GlobalISel.	2024-11-26 20:22:25 -05:00
Nikita Popov	3e1b55cafc	[SDAG] Don't allow implicit trunc in getConstant() (#117558 ) Assert that the passed value is a valid unsigned integer value for the specified type. For signed values getSignedConstant() / getSignedTargetConstant() should be used instead.	2024-11-26 10:36:00 +01:00
Craig Topper	bc282605df	[SelectionDAG] Require last operand of (STRICT_)FP_ROUND to be a TargetConstant. (#117639 ) Fix all the places I could find that didn't do this. We were already mostly correct for FP_ROUND after 9a976f36615dbe15e76c12b22f711b2e597a8e51, but not STRICT_FP_ROUND.	2024-11-25 21:36:33 -08:00
Craig Topper	c2bb056482	[SelectionDAG][RISCV][AArch64] Allow f16 STRICT_FLDEXP to be promoted. Fix integer promotion of STRICT_FLDEXP in type legalizer. (#117633 ) A special case in type legalization wasn't accounting for different operand numbering between FLDEXP and STRICT_FLDEXP. AArch64 already asked STRICT_FLDEXP to be promoted, but had no test for it.	2024-11-25 16:12:45 -08:00
David Sherwood	9b76e7fc60	Revert "[DAGCombiner] Add support for scalarising extracts of a vector setcc (#116031 )" (#117556 ) This reverts commit 22ec44f509ff266b581dbb490d7b040473b7c31a.	2024-11-25 13:49:21 +00:00
David Sherwood	22ec44f509	[DAGCombiner] Add support for scalarising extracts of a vector setcc (#116031 ) For IR like this: %icmp = icmp ult <4 x i32> %a, splat (i32 5) %res = extractelement <4 x i1> %icmp, i32 1 where there is only one use of %icmp we can take a similar approach to what we already do for binary ops such as add, sub, etc. and convert this into %ext = extractelement <4 x i32> %a, i32 1 %res = icmp ult i32 %ext, 5 For AArch64 targets at least the scalar boolean result will almost certainly need to be in a GPR anyway, since it will probably be used by branches for control flow. I've tried to reuse existing code in scalarizeExtractedBinop to also work for setcc. NOTE: The optimisations don't apply for tests such as extract_icmp_v4i32_splat_rhs in the file CodeGen/AArch64/extract-vector-cmp.ll because scalarizeExtractedBinOp only works if one of the input operands is a constant.	2024-11-25 09:25:01 +00:00
Nikita Popov	3317c9ceac	[AMDGPU] Use getSignedConstant() where necessary (#117328 ) Create signed constant using getSignedConstant(), to avoid future assertion failures when we disable implicit truncation in getConstant(). This also touches some generic legalization code, which apparently only AMDGPU tests.	2024-11-25 09:49:34 +01:00
Félix-Antoine Constantin	7a56dc7245	[Clang] Attribute NoFPClass should not prevent tail call optimization. (#116741 ) Fixes #111950	2024-11-22 17:28:45 -08:00
Jonathan Cohen	00d383ee9d	[DAGCombiner] Limit steps in shouldCombineToPostInc (#116030 ) Currently the function will walk the entire DAG to find other candidates to perform a post-inc store. This leads to very long compilation times on large functions. Added a MaxSteps limit to avoid this, which is also aligned to how hasPredecessorHelper is used elsewhere in the code.	2024-11-21 11:58:37 +02:00
abhishek-kaushik22	a23260087d	[SDAG] [X86] Extend SplitVecOp_VSETCC for STRICT_FSETCCS (#116768 ) Closes #116767	2024-11-21 17:43:01 +08:00
Benjamin Maxwell	0a1795f781	[SDAG] Generalize FSINCOS type legalization (NFC) (#116848 ) There's nothing that specific to FSINCOS about these; they could be used for similar nodes in the future.	2024-11-20 10:56:39 +00:00
Lei Huang	ed8ebad6eb	[SelectionDAG] Support integer promotion for VP_LOAD and VP_STORE (#81299 ) Add integer promotion support for VP_LOAD and VP_STORE via legalization of extend and truncate of each form. Patch commandeered from: https://reviews.llvm.org/D109377	2024-11-18 13:32:58 -05:00
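
Illustration for the ReduceLoadOpStoreWidth entries above (3ad2399148 and 22780f808a): a minimal sketch of the two pieces of arithmetic those messages describe, namely widening a touched bit range to byte boundaries and aligning ShAmt down by subtracting (ShAmt % NewBW). This is not the DAGCombiner code. The names BitRange, alignToByteBoundaries, widthInBits and alignShAmtDown are hypothetical, and an 8-bit byte is assumed.

```cpp
// Hypothetical stand-alone sketch; none of these names exist in LLVM.
struct BitRange {
  unsigned Lsb; // lowest touched bit (inclusive)
  unsigned Msb; // highest touched bit (inclusive)
};

constexpr unsigned BitsPerByte = 8; // assumption: 8-bit bytes

// Widen [Lsb, Msb] so it starts on the first bit of a byte and ends on the
// last bit of a byte.
constexpr BitRange alignToByteBoundaries(unsigned Lsb, unsigned Msb) {
  return BitRange{Lsb - Lsb % BitsPerByte,
                  Msb - Msb % BitsPerByte + (BitsPerByte - 1)};
}

// Number of bits covered by an inclusive range.
constexpr unsigned widthInBits(BitRange R) { return R.Msb - R.Lsb + 1; }

// Align a shift amount down to a multiple of NewBW by subtracting the
// remainder, as the second commit message describes.
constexpr unsigned alignShAmtDown(unsigned ShAmt, unsigned NewBW) {
  return ShAmt - ShAmt % NewBW;
}

// Bits 6-10 straddle two bytes, so the aligned range is bits 0-15 and an
// 8-bit access cannot cover it.
static_assert(alignToByteBoundaries(6, 10).Lsb == 0, "start of first byte");
static_assert(alignToByteBoundaries(6, 10).Msb == 15, "end of second byte");
static_assert(widthInBits(alignToByteBoundaries(6, 10)) == 16, "two bytes");
static_assert(alignShAmtDown(13, 8) == 8, "13 rounds down to 8");
```

The checks are static_assert so the sketch verifies itself at compile time. The sliding-window search for a viable ShAmt mentioned in 3ad2399148 is not modelled here.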

13947 Commits