llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	8e77458578	[DAG] visitShiftByConstant - replace constant detection with FoldConstantArithmetic Instead of checking that an operand is constant/opaque before calling getNode() and then checking that the result is a constant, just use FoldConstantArithmetic which will just early-out if the operands are not constant foldable.	2022-10-17 16:19:10 +01:00
Simon Pilgrim	af5942cc09	Remove trailing whitespace. NFC.	2022-10-17 15:20:26 +01:00
Peter Rong	c2e7c9cb33	[CodeGen] Using ZExt for extractelement indices. In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`. This is because IRTranslator uses SExt for indices. In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt. This change covers the documentation, SelectionDAG and IRTranslator. We also included a test for AMDGPU and updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86. This patch fixes issue #57452. Differential Revision: https://reviews.llvm.org/D132978	2022-10-15 15:45:35 -07:00
Filipp Zhinkin	ef774bec63	[AArch64] Support SETCCCARRY lowering Support SETCCCARRY lowering to SBCS instruction. Related issue: https://github.com/llvm/llvm-project/issues/44629 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D135302	2022-10-14 22:29:31 +03:00
chenglin.bi	c1909d7337	[DAGCombiner] Fix crash when merging stores with different value types The crash case comes from #58350. It has two stores, one of type f32 and the other of type v1f32. When we try to merge these two stores as v1f32, the memVT is a vector type, so the old code also uses ISD::EXTRACT_SUBVECTOR for the f32 store, and the compiler crashes. This patch inserts a build_vector for the f32 store so that it also produces v1f32 when memVT is v1f32. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135954	2022-10-15 01:16:35 +08:00
Nicola Lancellotti	ce1a2ccf94	[NFC] Fix typo in DAGCombiner	2022-10-14 17:47:25 +01:00
Sander de Smalen	02df03c5b7	[AArch64][SME] Add support for arm_locally_streaming functions. Functions with `aarch64_sme_pstatesm_body` will emit a SMSTART at the start of the function, and a SMSTOP at the end of the function, such that all operations use the right value for vscale. Because the placement of these nodes is critically important (i.e. no vscale-dependent operations should be done before SMSTART has been issued), we require glueing the CopyFromReg to the Entry node such that we can insert the SMSTART as part of that glued chain. More details about the SME attributes and design can be found in D131562. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D131582	2022-10-14 13:47:53 +00:00
Matt Arsenault	d0750ec475	AtomicExpand: Avoid some operations if the atomic is overaligned Let some of the pointer bithacking fold away if we know the LSB are 0.	2022-10-13 23:31:00 -07:00
Anshil Gandhi	d383adec4d	[BranchRelaxation] Fall through only if block has no unconditional branches Prior to inserting an unconditional branch from X to its fall through basic block, check if X has any terminators to avoid inserting additional branches. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134557	2022-10-13 22:48:41 -06:00
Matt Arsenault	c427ee9798	AsmPrinter: Remove pointless code in inline asm emission This was scanning through def operands looking for the symbol operand. This is pointless because the symbol is always the first operand as enforced by the verifier, and all operands are implicit.	2022-10-13 21:12:11 -07:00
Xiang1 Zhang	aad013de41	[InlineAsm][bugfix] Correct function addressing in inline asm In the Linux PIC model, there are 4 cases of value/label addressing: Case 1: Function call or label jmp inside the module. Case 2: Data access (such as a global or static variable) inside the module. Case 3: Function call or label jmp outside the module. Case 4: Data access (such as a global variable) outside the module. Because the current LLVM inline asm architecture is designed not to "recognize" the asm code, it is difficult to treat memory addressing differently for the same value/address used in different instructions. For example, in the PIC model, a function call may go through the PLT or be directly PC-relative, but lea/mov of a function address may use the GOT. This patch fixes/refines case 1 and case 2 in inline asm. Because inline asm currently does not support jmp to an outside label, this patch mainly focuses on fixing the function call addressing bugs in inline asm. Reviewed By: Pengfei, RKSimon Differential Revision: https://reviews.llvm.org/D133914	2022-10-14 09:47:26 +08:00
David Green	16e4e4ab87	[CodeGenPrep] Handle constants in ConvertPhiType This is a simple addition to the convertPhiTypes in CodeGenPrepare to consider and convert constants as it converts the phi type. Someone fixed the bug in the motivating example, so the undef is now a constant 0. This does mean converting between integer and floating point constants, which may have different materialization. Differential Revision: https://reviews.llvm.org/D135561	2022-10-13 16:41:44 +01:00
Anton Sidorenko	4431e705cc	[NFC] Use forward decl of MachineCombinerPattern enum to reduce dependencies Differential Revision: https://reviews.llvm.org/D135776	2022-10-13 14:56:14 +01:00
Simon Tatham	526ce9c929	Propagate tied operands when copying a MachineInstr. MachineInstr's copy constructor works by calling the addOperand method to add each operand of the old MachineInstr to the new one, one by one. But addOperand deliberately avoids trying to replicate ties between operands, on the grounds that the tie refers to operands by index, and the indices aren't necessarily finalized yet. This led to a code generation fault when the machine pipeliner cloned an Arm conditional instruction, and lost the tie between the output register and the input value to be used when the condition failed to execute. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D135434	2022-10-13 09:40:35 +01:00
Mirko Brkusanin	8b8463ef6c	[SelectionDAG] Use consistent type sizes for opcode	2022-10-12 17:33:04 +02:00
Craig Topper	ac9209751a	Revert "[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y))" This reverts commit 0148df8157f05ecf3b1064508e6f012aefb87dad. Getting lit test failures on AMDGPU, but I can't reproduce them so far. Reverting to investigate.	2022-10-11 16:30:40 -07:00
Craig Topper	0148df8157	[DAGCombiner] Fold (mul (sra X, BW-1), Y) -> (neg (and (sra X, BW-1), Y)) (sra X, BW-1) is either 0 or -1. So the multiply is a conditional negate of Y. This pattern shows up when type legalizing wide multiplies involving a sign extended value. Fixes PR57549. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D133399	2022-10-11 16:20:55 -07:00
Jessica Paquette	0f1a51e173	[GlobalISel] Allow vectors in redundant or + add combines We support KnownBits for vectors, so we can enable these. https://godbolt.org/z/r9a9W4Gj1 Differential Revision: https://reviews.llvm.org/D135719	2022-10-11 15:31:09 -07:00
Jessica Paquette	036a13065b	[GlobalISel] Combine (X op Y) == X --> Y == 0 This matches patterns of the form `(X op Y) == X` and transforms them to `Y == 0` where appropriate. Example: https://godbolt.org/z/hfW811c7W Differential Revision: https://reviews.llvm.org/D135380	2022-10-11 09:52:48 -07:00
Philip Reames	487695e7c9	[SDAG] Treat DemandedElts argument to isSplatVector as splat for scalable vectors [nfc] The previous code used an APInt(1, 0) to represent the demanded elts of a scalable vector, and then ignored that argument if the type was scalable. This was inconsistent with the UndefElts parameter, which is set to either APInt(1, 0) or APInt(1, 1), that is, implicitly broadcast across all lanes. Particularly since the undef code relied on the DemandedElts parameter having bitwidth 1 to achieve that result! This change switches the demanded parameter to APInt(1, 1), documents the broadcast semantics, and takes advantage of it to remove one special case for scalable vectors which is no longer required.	2022-10-11 09:49:28 -07:00
Philip Reames	ac4f3fff8c	[SDAG] Clarify behavior of scalable demanded/undef elts in isSplatValue [nfc] Update comment, and add an assertion to check property expected by sole (non-test) caller. Remove tests which appear to have been copied from fixed vector tests, and whose demanded bits don't correspond to the way this interface is otherwise used.	2022-10-11 07:28:34 -07:00
Craig Topper	0121b1a4ac	Revert "[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant." This reverts commit d4facda414b6b9b8b1a34bc7e6b7c15172775318. This has been reported to cause failures. Reverting while I investigate.	2022-10-10 14:53:29 -07:00
Craig Topper	d4facda414	[TargetLowering][RISCV][X86] Support even divisors in expandDIVREMByConstant. If the divisor is even, we can first shift the dividend and divisor right by the number of trailing zeros. Now the divisor is odd and we can do the original algorithm to calculate a remainder. Then we shift that remainder left by the number of trailing zeros and add the bits that were shifted out of the dividend. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D135541	2022-10-10 11:02:22 -07:00
wanglei	730ee6568c	[LoongArch] Set correct encodings for DWARF exception handling This patch sets correct encodings for DWARF exception handling for LoongArch. Differential Revision: https://reviews.llvm.org/D134710	2022-10-08 11:53:48 +08:00
Craig Topper	9f67047cf0	[VP][RISCV] Add vp.smax/smin/umax/umin intrinsics Differential Revision: https://reviews.llvm.org/D135418	2022-10-07 17:14:31 -07:00
Amaury Séchet	62ea6c5be7	[DAGCombine] Deduplicate addcarry node using commutativity. The first two parameters of addcarry are commutative. We may face a situation where both variants are present in the DAG, in which case we benefit from using just one. Depends on D57302 and D33587 Reviewed By: RKSimon, chfast Differential Revision: https://reviews.llvm.org/D57317	2022-10-08 00:55:14 +02:00
eopXD	dbc681c98e	[VP][RISCV] Add vp.roundtozero and its RISC-V support The scalar instruction of this is `llvm.trunc`. However the naming of ISD::VP_TRUNC is already taken by `trunc` of the LLVM IR. Naming this as `vp.ftrunc` would likely cause confusion with `vp.fptrunc`. So adding `vp.roundtozero` that will look similar to `vp.roundeven`. Reviewed By: frasercrmck Differential Revision: https://reviews.llvm.org/D135233	2022-10-07 02:15:23 -07:00
Pierre van Houtryve	36c3833783	[GISel] Add Trunc/Lshr/BuildVector Folding Similar to the current "Trunc/BuildVector" folding - which folds low element extracts of BuildVectors, folds hi element extracts done using bitshifts. For D134354 Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D135148	2022-10-07 08:44:03 +00:00
Pierre van Houtryve	a34977c4d0	[GISel] Handle G_TRUNC in `matchExtractVecEltBuildVec` Spotted some cases in D134354 where this was an issue. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D135147	2022-10-07 08:37:18 +00:00
Mike Hommey	d3b0e745e8	[CodeView] Avoid NULL deref of Scope Regression from D131400: cross-language LTO causes a crash in the compiler on the NULL deref of Scope in `isa` call when Rust IR is involved. Presumably, this might affect other languages too, and even Rust itself without cross-language LTO when the Rust compiler switched to LLVM 16. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D134616	2022-10-07 08:34:57 +02:00
Philip Reames	04bb32e58a	[DAG] Extract helper for (neg x) [nfc] This is a frequently reoccurring pattern, let's factor it out. Differential Revision: https://reviews.llvm.org/D135301	2022-10-06 13:23:52 -07:00
Pierre van Houtryve	3ec0085c3f	[DAG] Update `isKnownNeverNaN` for `FMA/FMAD` We can still get a NaN even if none of the operands are NaN, e.g. from +inf/-inf. D50804 didn't catch that. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D134854	2022-10-06 06:52:36 +00:00
Ellis Hoag	549773f9e9	[Dwarf] Reference the correct CU when inlining Sometimes when a function is inlined into a different CU, `llvm-dwarfdump --verify` would find an inlined subroutine with an invalid abstract origin. This is because `DwarfUnit::addDIEEntry()` will incorrectly assume the inlined subroutine and the abstract origin are from the same CU if it can't find the CU for the inlined subroutine. In the added test, the inlined subroutine for `bar()` is created before the CU for `B.swift` is created, so it tries to point to `goo()` in the wrong CU. Interestingly, if we swap the order of the two functions then we don't see a crash since the module for `goo()` is created first. The fix is to give a parent DIE to `ScopeDIE` before calling `addDIEEntry()` so that its CU can be found. Luckily, `constructInlinedScopeDIE()` is only called once so we can pass it the DIE of the scope's parent and give it a child just after it's created. `constructInlinedScopeDIE()` should always return a DIE, so assert that it is not null. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D135114	2022-10-05 09:19:12 -07:00
Fraser Cormack	08497a785b	[VP] Fix unused variable in release configurations	2022-10-05 10:33:07 +01:00
Fraser Cormack	a3a9b0743e	[VP][NFC] Remove \brief commands from doxygen comments Following a precedent set in D46861.	2022-10-05 08:08:30 +01:00
Fraser Cormack	3362e2d57f	[VP] Add IR expansion for vp.icmp and vp.fcmp These intrinsics are simply expanded to regular icmp/fcmp instructions. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D121594	2022-10-05 08:07:39 +01:00
Serguei Katkov	d330731f94	[RegAllocFast] Clean-up. Remove redundant operations. NFC. Reviewed By: MatzeB, arsenm Subscribers: llvm-commits Differential Revision: https://reviews.llvm.org/D109213	2022-10-05 11:38:54 +07:00
Amara Emerson	c5cebf78bd	[GlobalISel] Add computeNumSignBits() support for compares. Doing so allows G_SEXT_INREG to be combined away for many vector cases. Differential Revision: https://reviews.llvm.org/D135168	2022-10-05 00:28:08 +01:00
Amara Emerson	8055aa8e8a	[AArch64][GlobalISel] Make vector G_SEXT_INREG legal and allow combining. As a result of making these legal, and tweaking the combine to allow vectors, we generate vector G_SEXT_INREG during legalization. The reason we want to make these legal in the first place is to allow for more combine opportunities. Once those have been done, we can just lower them back to shifts in the post-legalizer lowering. This needs to be one commit otherwise we start causing tests to fail due to incomplete support for selection etc.	2022-10-05 00:28:08 +01:00
jeff	cebec42089	[DAGCombiner] [AMDGPU] Allow vector loads in MatchLoadCombine Since SROA chooses promotion based on reaching load / stores of allocas, we may run into scenarios in which we alloca a vector, but promote it to an integer. The result of which is the familiar LoadCombine pattern (i.e. ZEXT, SHL, OR). However, instead of coming directly from distinct loads, the elements to be combined are coming from ExtractVectorElements which stem from a shared load. This patch identifies such a pattern and combines it into a load. Change-Id: I0bc06588f11e88a0a975cde1fd71e9143e6c42dd	2022-10-04 12:16:00 -07:00
Sanjay Patel	17dcbd8165	[SDAG] don't hoist div/rem through a select with neutral constant This bug was introduced with D134966.	2022-10-04 13:15:01 -04:00
Jay Foad	af947d9fcb	[ISel] Fix crash in new FMA DAG combine Fix a crash in the FMA combine added by D132837 and amended by D134810. In cases where the newly created node could be folded, the combiner would fail this assertion: llc: DAGCombiner.cpp:268: void (anonymous namespace)::DAGCombiner::AddToWorklist(llvm::SDNode *): Assertion `N->getOpcode() != ISD::DELETED_NODE && "Deleted Node added to Worklist"' failed. Differential Revision: https://reviews.llvm.org/D135150	2022-10-04 15:13:18 +01:00
Amara Emerson	07ccf651b9	[AArch64][GlobalISel] Enable vector support for G_SELECT->G_FMAXIMUM/MINIMUM. Vector support seems to work immediately, as long as we run the combine before legalization (so the vector SELECTs don't get lowered) and the legalizer rules are there to enable generation. Differential Revision: https://reviews.llvm.org/D135047	2022-10-03 21:39:52 +01:00
Philip Reames	a200b0fc25	[DAG] Introduce getSplat utility for common dispatch pattern [nfc] We have a very common pattern of dispatching between BUILD_VECTOR and SPLAT_VECTOR creation repeated in many places in the code. Factor the pattern out into a utility function.	2022-10-03 12:49:39 -07:00
Jessica Paquette	970cb99e0a	[GlobalISel] Combine `(x + y) - y -> x` and friends This adds a combine that handles `(x + y) - y -> x`, `(x + y) - x -> y`, `x - (y + x) -> 0 - y`, and `x - (x + z) -> 0 - z`. On AArch64, we get added benefit for `0 - y` because it can be selected to a `neg` instruction. Differential Revision: https://reviews.llvm.org/D135010	2022-10-03 10:06:48 -07:00
Philip Reames	21f97fdc97	[DAG] Use getSplatBuildVector in a couple more places [nfc]	2022-10-03 09:48:49 -07:00
Markus Böck	36af4c8418	[SelectionDAG] Fix use-after-free introduced in D130881 The code introduced in https://reviews.llvm.org/D130881 has a bug, as it may cause a use-after-free error that can be caught by ASAN. The bug essentially boils down to iterator invalidation of `DenseMap`. The expression `SDEI[To] = I->second;` may cause `SDEI` to grow if `To` is inserted for the very first time. When that happens, all existing iterators to the map are invalidated as their backing storage has been freed. Accessing `I->second` is then invalid and attempts to access freed memory (as `I` is an iterator of `SDEI`). This patch fixes that quite simply by first making a copy of `I->second`, and then moving into the possibly newly inserted KV of the `DenseMap`. No test attached as I am not sure it is practical to test. Differential revision: https://reviews.llvm.org/D135019	2022-10-03 15:09:14 +02:00
Petar Avramovic	1fa2019828	[SelectionDAG] Add check for BUILD_VECTOR in isKnownNeverNaN Includes handling of constants with vector type in isKnownNeverNaN. For AMDGPU results in not making fcanonicalize during legalization for vector inputs to fmaxnum_ieee and fminnum_ieee. Does not affect end result since there is a combine that eliminates fcanonicalize. Differential Revision: https://reviews.llvm.org/D88573	2022-10-03 12:47:07 +02:00
Amara Emerson	3daf7ddaef	[GlobalISel] Allow prelegalizer combiners to have access to LegalizerInfo. Before, the isPreLegalize() query in CombinerHelper only checked for the presence of a LegalizerInfo object. This is problematic when we want to have a combine actually check for legality in a pre-legalizer combine pass, since if we pass a LegalizerInfo object to the constructor it causes the combines to think that we're running post legalizer, which isn't true. This change fixes it to instead check an explicit bool that passes to signal whether the pass will be run before or after legalization. Doing so exposed a bug in the extending loads combine, which tried to check for legality of candidate extending loads if LegalizerInfo was present. Since we only ran it pre-legalizer and therefore with a null LegalizerInfo, it never actually ran. Also fixes the legality checks to keep the tests passing. Differential Revision: https://reviews.llvm.org/D135044	2022-10-03 07:36:18 +01:00
David Green	3651635eca	[ARM][DAG] BF16 constant handling. Much like f16 and f32, we shouldn't try to shrink bf16 to smaller fp constant. The code may not be optimal, but this allows us to legalize bf16 constants under Arm without errors.	2022-10-02 11:51:08 +01:00
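
Two of the arithmetic rewrites described in the log above can be checked on plain 32-bit integers: the `(mul (sra X, BW-1), Y)` fold (0148df8157) and the even-divisor handling in expandDIVREMByConstant (d4facda414). The following is a minimal C++ sketch, not LLVM code. The function names are hypothetical, BW is assumed to be 32, and the built-in `%` stands in for the existing odd-divisor algorithm that the commit message refers to.

```cpp
#include <cstdint>

// Models (sra X, 31): all-ones if X is negative, zero otherwise.
// Assumption: right-shifting a negative signed value is arithmetic, which
// holds on mainstream compilers and is guaranteed since C++20.
constexpr uint32_t signMask(int32_t x) {
  return static_cast<uint32_t>(x >> 31);
}

// Left-hand side of the fold: (mul (sra X, 31), Y), computed modulo 2^32.
constexpr uint32_t mulBySignMask(int32_t x, uint32_t y) {
  return signMask(x) * y;
}

// Right-hand side of the fold: (neg (and (sra X, 31), Y)).
// The mask is 0 or -1, so this is a conditional negate of Y.
constexpr uint32_t negAndSignMask(int32_t x, uint32_t y) {
  return 0u - (signMask(x) & y);
}

// Number of trailing zero bits of a non-zero value.
constexpr uint32_t countTrailingZeroBits(uint32_t v) {
  uint32_t k = 0;
  while ((v & 1u) == 0u) {
    v >>= 1;
    ++k;
  }
  return k;
}

// Unsigned remainder by a non-zero divisor d, following the commit message:
// shift dividend and divisor right by the trailing zero count of d, take the
// remainder by the now-odd divisor, shift that remainder back left, and add
// the dividend bits that were shifted out.
constexpr uint32_t uremViaOddDivisor(uint32_t n, uint32_t d) {
  const uint32_t k = countTrailingZeroBits(d);
  const uint32_t oddDivisor = d >> k;
  const uint32_t shiftedOutBits = n & ((1u << k) - 1u);
  // Stand-in for the odd-divisor algorithm (hypothetical simplification).
  const uint32_t oddRemainder = (n >> k) % oddDivisor;
  return (oddRemainder << k) + shiftedOutBits;
}
```

Because every function is `constexpr` (C++14 or later), the identities can be spot-checked at compile time, for example with `static_assert(uremViaOddDivisor(100u, 12u) == 100u % 12u, "")`.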

33066 Commits