This function looks for (seteq (and (reduce_or), mask), 0). If
the mask is the sign bit, InstCombine will have turned it into (setgt
(reduce_or), -1). We should handle that case too.
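A rough sketch of the two equivalent forms (illustrative only; the helper
name and surrounding code are hypothetical, not the upstream check):
```
// Hypothetical helper: recognise both spellings of "the OR-reduction has its
// sign bit clear" so the later combine fires for either form.
static bool isReduceOrSignBitTest(SDValue SetCC) {
  if (SetCC.getOpcode() != ISD::SETCC)
    return false;
  ISD::CondCode CC = cast<CondCodeSDNode>(SetCC.getOperand(2))->get();
  SDValue LHS = SetCC.getOperand(0), RHS = SetCC.getOperand(1);

  // InstCombine-canonicalized form: (setgt (reduce_or X), -1).
  if (CC == ISD::SETGT && isAllOnesConstant(RHS))
    return LHS.getOpcode() == ISD::VECREDUCE_OR;

  // Original form: (seteq (and (reduce_or X), Mask), 0). Only the sign-bit
  // mask is shown here for symmetry; the real check accepts other masks.
  if (CC == ISD::SETEQ && isNullConstant(RHS) && LHS.getOpcode() == ISD::AND) {
    auto *Mask = dyn_cast<ConstantSDNode>(LHS.getOperand(1));
    return Mask && Mask->getAPIntValue().isSignMask() &&
           LHS.getOperand(0).getOpcode() == ISD::VECREDUCE_OR;
  }
  return false;
}
```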
I'm looking into adding the same canonicalization to SimplifySetCC and
this change is needed to prevent test regressions.
Unlike CVTTP2SI, CVTTP2UI is only available on AVX512 targets, so there is
no AVX1 variant to fall back to when we split a 512-bit vector, and we can
only use the 128/256-bit variants if we have AVX512VL.
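A minimal sketch of the constraint, assuming the usual X86 subtarget feature
queries (illustrative, not the actual lowering code):
```
// Unlike CVTTP2SI there is no AVX1 fallback, so after splitting a 512-bit
// vector the narrower CVTTP2UI forms are only usable with AVX512VL.
bool CanUseCVTTP2UI =
    Subtarget.hasAVX512() && (VT.is512BitVector() || Subtarget.hasVLX());
```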
Fixes#154492
This patch makes the current behavior explicit to prepare for adding VTs
for v[567]f16.
Right now these types are EVTs and hence don't fall under
getPreferredVectorAction; they are simply widened to the next legal
power-of-two vector type, which for SSE2 is v8f16.
Without the preparatory patch, however, the behavior would change after
adding these types: getPreferredVectorAction would try to split them,
because that is the current behavior for any f16 vector type that is not
legal.
There is a lot more detail at
https://github.com/llvm/llvm-project/issues/152150, in particular how
splitting these new types leads to an inconsistency between
NumRegistersForVT and getTypeAction.
The patch ensures that after the new types are added they will continue
to be widened rather than split. Once the patch to enable v[567]f16
lands, this change will be NFC for x86.
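A hypothetical sketch of what making the behavior explicit could look like,
using the TargetLoweringBase hook named above (the exact predicate in the
patch may differ):
```
TargetLoweringBase::LegalizeTypeAction
X86TargetLowering::getPreferredVectorAction(MVT VT) const {
  // Keep widening narrow, non-power-of-two f16 vectors (v5f16/v6f16/v7f16)
  // to the next power-of-two type (v8f16 on SSE2) instead of splitting them.
  if (VT.getVectorElementType() == MVT::f16 && !VT.isPow2VectorType() &&
      VT.getVectorNumElements() < 8)
    return TypeWidenVector;
  return TargetLoweringBase::getPreferredVectorAction(VT);
}
```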
When lowering using sublane shuffles, we can sometimes end up with the
same mask as we started with. We already bail out in these cases, but we
weren't fully simplifying the new shuffle mask before testing whether it
matched.
Fixes#153457
If there are enough sign bits in a 64-bit value, we can just compare the bottom 32 bits.
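A minimal sketch of the idea, assuming a setcc combine where N is the setcc
node with i64 operands LHS/RHS and condition code CC (illustrative only):
```
// If both i64 operands already have more than 32 sign bits, the comparison
// is fully determined by the low 32 bits, so compare those instead.
if (DAG.ComputeNumSignBits(LHS) > 32 && DAG.ComputeNumSignBits(RHS) > 32) {
  SDLoc DL(N);
  SDValue LHS32 = DAG.getNode(ISD::TRUNCATE, DL, MVT::i32, LHS);
  SDValue RHS32 = DAG.getNode(ISD::TRUNCATE, DL, MVT::i32, RHS);
  return DAG.getSetCC(DL, N->getValueType(0), LHS32, RHS32, CC);
}
```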
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
As we use getTargetConstantBitsFromNode in a wider variety of places, we can't guarantee that all the sources match (or are legal) anymore, so it's better to early out than to assert.
Fixes#150117
Fixes#148238.
When GFNI is present, custom bit reversal lowerings for scalar integers
become active. They work by swapping the bytes in the scalar value and
then reversing bits in a vector of bytes. However, the custom bit
reversal lowering for a vector of bytes is disabled if GFNI is present
in isolation, resulting in messed-up code.
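A simplified sketch of the scalar path described above (the real lowering
widens into a legal 128-bit vector rather than bitcasting directly; this is
only the shape of the idea):
```
// bitreverse(x) == per-byte bitreverse of bswap(x): swap the bytes, then
// reverse the bits within each byte via the byte-vector lowering.
SDValue Swapped = DAG.getNode(ISD::BSWAP, DL, VT, X);
MVT ByteVT = MVT::getVectorVT(MVT::i8, VT.getSizeInBits() / 8);
SDValue AsBytes = DAG.getBitcast(ByteVT, Swapped);
SDValue RevBytes = DAG.getNode(ISD::BITREVERSE, DL, ByteVT, AsBytes);
SDValue Result = DAG.getBitcast(VT, RevBytes);
```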
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
This allows truncated splat / buildvector values in isBoolConstant, so that
certain 'not' instructions can be recognized post-legalization and vselect
can optimize.
An override for x86 AVX512 predicated vectors is required to avoid
infinite recursion from the code that detects zero vectors. From:
```
// Check if the first operand is all zeros and Cond type is vXi1.
// If this an avx512 target we can improve the use of zero masking by
// swapping the operands and inverting the condition.
```
This patch refactors the add(psadbw(x, 0), psadbw(y, 0)) -> psadbw(x + y, 0) combine to use SDPatternMatch matchers instead of manually checking opcodes and operands.
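A rough sketch of the matcher-based shape, assuming SDPatternMatch's
sd_match/m_Add/m_Node/m_Zero/m_Value helpers; this is not the exact upstream
code, and the real combine also has to guard against the per-byte additions
overflowing:
```
using namespace SDPatternMatch;
SDValue X, Y;
if (sd_match(N, m_Add(m_Node(X86ISD::PSADBW, m_Value(X), m_Zero()),
                      m_Node(X86ISD::PSADBW, m_Value(Y), m_Zero())))) {
  SDValue Sum = DAG.getNode(ISD::ADD, DL, X.getValueType(), X, Y);
  return DAG.getNode(X86ISD::PSADBW, DL, N->getValueType(0), Sum,
                     DAG.getConstant(0, DL, X.getValueType()));
}
```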
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
We were previously limited to abs(sub(zext(),zext())) patterns, but this adds
handling for a number of other abdu patterns until a topologically sorted
DAG allows us to rely on an ABDU node having already been created.
Now that we don't just match zext() sources, I've generalised the
createPSADBW helper to explicitly zext/truncate to the expected vXi8
source type - it still assumes the sources are correct for a PSADBW
node.
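A hypothetical sketch of the generalised helper's behaviour (names and
parameters are illustrative, not the upstream implementation):
```
static SDValue createPSADBW(SelectionDAG &DAG, const SDLoc &DL, SDValue Op0,
                            SDValue Op1, MVT ByteVT /*e.g. MVT::v16i8*/) {
  // Sources are no longer guaranteed to be zext() values, so explicitly
  // zero-extend or truncate them to the vXi8 type PSADBW expects.
  Op0 = DAG.getZExtOrTrunc(Op0, DL, ByteVT);
  Op1 = DAG.getZExtOrTrunc(Op1, DL, ByteVT);
  // PSADBW produces one i64 sum per 64-bit lane of the byte vector.
  MVT SadVT = MVT::getVectorVT(MVT::i64, ByteVT.getVectorNumElements() / 8);
  return DAG.getNode(X86ISD::PSADBW, DL, SadVT, Op0, Op1);
}
```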
Fixes#143456
detectZextAbsDiff had already been simplified a great deal when it was
converted to SDPatternMatch, and a future patch should allow us to match
to ISD::ABDU directly, making it entirely redundant.
This PR resolves https://github.com/llvm/llvm-project/issues/144513.
The modification includes five patterns:
1. vselect Cond, 0, 0 → 0
2. vselect Cond, -1, 0 → bitcast Cond
3. vselect Cond, -1, x → or Cond, x
4. vselect Cond, x, 0 → and Cond, x
5. vselect Cond, 000..., X → andn Cond, X
Patterns 1-4 have been migrated to DAGCombine; pattern 5 stays in the x86
code. The reason is that the andn instruction cannot be used directly in
DAGCombine, only and+xor, which introduces optimization order issues. For
example, in the x86 backend, for select Cond, 0, x → (~Cond) & x, the
backend will first check whether the Cond node feeding (~Cond) is a setcc
node and, if so, modify the comparison operator of the condition. So the
x86 backend cannot complete the andn optimization. In short, I think it is
a better choice to keep the vselect Cond, 000..., X pattern instead of
and+xor in DAGCombine.
Regarding the commits: the first contains the code changes and the x86
tests (note 1), and the second contains the tests for other backends
(note 2).
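A rough sketch of how patterns 2-4 above can look as a generic DAGCombine,
assuming Cond has already been bitcast/sign-extended to the result type VT
(illustrative only; the real code handles the type and legality checks):
```
// Cond, TVal, FVal are the vselect operands; VT is the result type.
if (ISD::isBuildVectorAllOnes(TVal.getNode()) &&
    ISD::isBuildVectorAllZeros(FVal.getNode()))
  return DAG.getBitcast(VT, Cond);                   // pattern 2: -1/0 mask
if (ISD::isBuildVectorAllOnes(TVal.getNode()))
  return DAG.getNode(ISD::OR, DL, VT, Cond, FVal);   // pattern 3
if (ISD::isBuildVectorAllZeros(FVal.getNode()))
  return DAG.getNode(ISD::AND, DL, VT, Cond, TVal);  // pattern 4
```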
---------
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
In most contexts the pointer type is implied by the operation
and should be propagated; getPointerTy is for niche cases where
there is a synthesized value.
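For illustration only (hypothetical snippet, not from the patch): prefer
propagating an existing pointer operand's type, and reserve getPointerTy
for values the backend synthesizes itself:
```
// Common case: the pointer type comes from the operation's own operand.
EVT PtrVT = Ptr.getValueType();
// Niche case: no pointer value exists yet, so ask the target for one.
EVT SynthPtrVT = TLI.getPointerTy(DAG.getDataLayout());
```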
Move the OR(PSHUFB(),PSHUFB()) fold to reuse an existing
createShuffleMaskFromVSELECT result and ensure it is performed before
the combineX86ShufflesRecursively combine to prevent some hasOneUse
failures noticed in #133947 (combineX86ShufflesRecursively still
unnecessarily widens vectors in several locations).
This reverts #140919 / f1d03dedfbe87119cfcafb07e0e0f90ec291cb97, which
could result in another fold trying to split the concatenation apart
again before it was folded to a SUBV_BROADCAST_LOAD.
When reducing v4f64/v8f32 non-lane-crossing X86ISD::VPERMV nodes, we use X86ISD::VPERMILPV nodes for the 128-bit case, but these are only available for fp types.
Fixes#145046
We should only concat logic ops if at least one operand will freely
concatenate. We've now addressed the remaining regressions on AVX2
targets, but still have a number on AVX512 targets, which can
aggressively use VPTERNLOG in many cases.