llvm-project

Author	SHA1	Message	Date
Matt Arsenault	bc7d88faf1	CodeGen: Disable isCopyInstrImpl if there are implicit operands This is a conservative workaround for broken liveness tracking of SUBREG_TO_REG to speculatively fix all targets. The current reported failures are on X86 only, but this issue should appear for all targets that use SUBREG_TO_REG. The next minimally correct refinement would be to disallow only implicit defs. The coalescer now introduces implicit-defs of the super register to track the dependency on other subregisters. If we see such an implicit operand, we cannot simply treat the subregister def as the result operand in case downstream users depend on the implicitly defined parts. Really target implementations should be considering the implicit defs and trying to interpret them appropriately (maybe with some generic helpers). The full implicit def could possibly be reported as the move result, rather than the subregister def but that requires additional work. Hopefully fixes #64060 as well. This needs to be applied to the release branch. https://reviews.llvm.org/D156346	2023-10-02 15:16:40 +03:00
Simon Pilgrim	6741dd0696	Fix MSVC "cannot convert from 'llvm::Register' to 'llvm::MCRegister'" build error. NFCI.	2023-10-02 12:41:08 +01:00
JP Lehr	e816c89c84	Revert "InlineSpiller: Consider if all subranges are the same when avoiding redundant spills" This reverts commit d8127b2ba8a87a610851b9a462f2fc2526c36e37.	2023-10-02 06:26:33 -05:00
Matt Arsenault	414ff812d6	RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG Currently coalescing with SUBREG_TO_REG introduces an invisible load bearing undef. There is liveness for the super register not represented in the MIR. This is part 1 of a fix for regressions that appeared after b7836d856206ec39509d42529f958c920368166b. The allocator started recognizing undef-def subregister MOVs as copies. Since there was no representation for the dependency on the high bits, different undef segments of the super register ended up disconnected and downstream users ended up observing different undefs than they did previously. This does not yet fix the regression. The isCopyInstr handling needs to start handling implicit-defs on any instruction. I wanted to include an end to end IR test since the actual failure only appeared with an interaction between the coalescer and the allocator. It's a bit bigger than I'd like but I'm having a bit of trouble reducing it to something which definitely shows a diff that's meaningful. The same problem likely exists everywhere trying to do anything with SUBREG_TO_REG. I don't understand how this managed to be broken for so long. This needs to be applied to the release branch. https://reviews.llvm.org/D156345	2023-10-02 13:57:09 +03:00
Jie Fu	2214026e95	[CodeGen] Fix -Wunused-variable in RegisterCoalescer.cpp (NFC) /llvm-project/llvm/lib/CodeGen/RegisterCoalescer.cpp:1320:18: error: unused variable 'DefSubIdx' [-Werror,-Wunused-variable] const unsigned DefSubIdx = DefMI->getOperand(0).getSubReg(); ^ 1 error generated.	2023-10-02 18:38:45 +08:00
Matt Arsenault	e28708d4f0	RegisterCoalescer: Avoid redundant implicit-def on rematerialize If this was coalescing a def of a subregister with a def of the super register, it was introducing a redundant super-register def and marking the subregister def as dead. Resulting in something like: dead $eax = MOVr0, implicit-def $rax, implicit-def $rax Avoid this by checking if the new instruction already has the super def, so we end up with this instead: dead $eax = MOVr0, implicit-def $rax The dead flag looks suspicious to me, seems like it's easy to buggily interpret dead def of subreg and a non-dead def of an aliasing register. It seems to be intentional though. https://reviews.llvm.org/D156343	2023-10-02 13:33:52 +03:00
Matt Arsenault	b1295dd5c9	RegisterCoalescer: Handle implicit-def of a super register when rematerializing Permit an implicit-def of a virtual register when rematerializing if it defines a super register of a subregister def. The rematerialization pre-legality check should really have been checking the implicit operands, but that should be fixed separately. https://reviews.llvm.org/D156331	2023-10-02 13:11:22 +03:00
Matt Arsenault	32a23aecf8	RegisterCoalescer: Forcibly leave SSA to avoid MIR test errors Not sure how to produce a test that demonstrates the problem today. The coalescer would have to introduce a verifier caught SSA violation, like multiple defs of a virtual register. I'm not sure what would do that now, but an upcoming patch will. https://reviews.llvm.org/D156271	2023-10-02 12:10:06 +03:00
elhewaty	9103b1d68d	[DAG] Extend the computeOverflowForSignedSub/computeOverflowForUnsignedSub implementations with ConstantRange (#67890 ) - Add tests for computeOverflowFor*Sub functions - extend the computeOverflowForSignedSub/computeOverflowForUnsignedSub implementations with ConstantRange (#37109)	2023-10-01 14:57:34 +01:00
Matt Arsenault	d8127b2ba8	InlineSpiller: Consider if all subranges are the same when avoiding redundant spills This avoids some redundant spills of subranges, and avoids a compile failure. This greatly reduces the numbers of spills in a loop. The main range is not informative when multiple instructions are needed to fully define a register. A common scenario is a lowered reg_sequence where every subregister is sequentially defined, but each def changes the main range's value number. If we look at specific lanes at the use index, we can see the value is actually the same. In this testcase, there are a large number of materialized 64-bit constant defs which are hoisted outside of the loop by MachineLICM. These are feeding REG_SEQUENCES, which is not considered rematerializable inside the loop. After coalescing, the split constant defs produce main ranges with an apparent phi def. There's no phi def if you look at each individual subrange, and only half of the register is really redefined to a constant. Fixes: SWDEV-380865 https://reviews.llvm.org/D147079	2023-10-01 11:37:53 +03:00
Matt Arsenault	7252787dd9	RegAllocGreedy: Fix detection of lanes read by a bundle SplitKit creates questionably formed bundles of copies when it needs to copy a subset of live lanes and can't do it with a single subregister index. These are merely marked as part of a bundle, and don't start with a BUNDLE instruction. Queries for the slot index would give the first copy in the bundle, and we need to inspect the operands of all the other bundled copies. Also fix and simplify detection of read lane subsets. This causes some RISCV test regressions, but these look like accidentally beneficial splits. I don't see a subrange based reason to perform these splits. Avoids some really ugly regressions in a future patch. https://reviews.llvm.org/D146859	2023-10-01 11:37:48 +03:00
Christian Sigg	5b7a7ec5a2	[NVPTX] Fix code generation for `trap-unreachable`. (#67478 ) https://reviews.llvm.org/D152789 added an `exit` op before each `unreachable`. This means we never get to the `trap` instruction. This change limits the insertion of `exit` instructions to the cases where `unreachable` is not lowered to `trap`. Trap itself is changed to be emitted as `trap; exit;` to convey to `ptxas` that it exits the CFG.	2023-10-01 07:59:24 +02:00
XinWang10	fef1bec396	[X86]Remove X86-specific dead code in ScheduleDAGRRList.cpp (#67629 ) After patch https://github.com/llvm/llvm-project/pull/67288 landed, unfoldMemoryOperand would not return NewMIs whose size ==3. So the removed line is useless.	2023-09-30 15:49:37 +08:00
JOE1994	204883623e	[NFC] Replace uses of Type::getPointerTo Replace some uses of `Type::getPointerTo` via 2 ways * Remove entirely if it's only used to support an unnecessary bitcast (remove the bitcast as well). * Replace with `PointerType::get`/`PointerType::getUnqual` NFC opaque pointer clean-up effort.	2023-09-29 21:38:53 -04:00
Mircea Trofin	f179486204	[AsmPrint] Correctly factor function entry count when dumping MBB frequencies (#67826 ) The goal in #66818 was to capture function entry counts, but those are not the same as the frequency of the entry (machine) basic block. This fixes that, and adds explicit profiles to the test. We also increase the precision of `MachineBlockFrequencyInfo::getBlockFreqRelativeToEntryBlock` to double. Existing code uses it as float so should be unaffected.	2023-09-29 18:06:53 -07:00
Alexey Bataev	ebcb5d59fc	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit 9f5960e004ff54082ccfa9396522e07358f5b66b to fix buildbots reported here https://lab.llvm.org/buildbot/#/builders/230/builds/19412.	2023-09-29 15:03:46 -07:00
Alexey Bataev	9f5960e004	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-29 13:16:03 -07:00
Jay Foad	6e3d2a4b38	[ISel] Fix another crash in new FMA DAG combine (#67818 ) Following on from D135150, this patch fixes another crash caused by this DAG combine: fadd (fma A, B, (fmul C, D)), E --> fma A, B, (fma C, D, E) The combine calls ReplaceAllUsesOfValueWith to replace (fmul C, D) with (fma C, D, E). This can cause nodes to get CSEd. In D135150 the problem was that the (fma C, D, E) node got CSEd away. In this new case, the problem is that the outer fadd node gets CSEd away. To fix it we have to return SDValue(N, 0) from the combine and be careful not to add a deleted node to the worklist.	2023-09-29 17:18:23 +01:00
Hans Wennborg	eee1f7cef8	Revert "[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7)" This caused asserts: llvm/lib/CodeGen/AsmPrinter/DwarfDebug.cpp:2331: virtual void llvm::DwarfDebug::endFunctionImpl(const llvm::MachineFunction *): Assertion `LScopes.getAbstractScopesList().size() == NumAbstractSubprograms && "getOrCreateAbstractScope() inserted an abstract subprogram scope"' failed. See comment on the code review for reproducer. > RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 > > Similar to imported declarations, the patch tracks function-local types in > DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with > the aforementioned metadata change and provided a support of function-local > types scoped within a lexical block. > > The patch assumes that DICompileUnit's 'enums field' no longer tracks local > types and DwarfDebug would assert if any locally-scoped types get placed there. > > Reviewed By: jmmartinez > > Differential Revision: https://reviews.llvm.org/D144006 This reverts commit f8aab289b5549086062588fba627b0e4d3a5ab15.	2023-09-29 14:23:31 +02:00
Nikita Popov	739c86df80	[llvm] Use more explicit cast methods (NFC) Instead of ConstantExpr::getCast() with a fixed opcode, use the corresponding getXYZ methods instead. For the one place creating a pointer bitcast drop it entirely, as this is redundant with opaque pointers.	2023-09-29 11:21:13 +02:00
Momchil Velikov	b454b04d68	[AArch64] Fix a compiler crash in MachineSink (#67705 ) There were a couple of issues with maintaining register def/uses held in `MachineRegisterInfo`: * when an operand is changed from one register to another, the corresponding instruction must already be inserted into the function, or MRI won't be updated * when traversing the set of all uses of a register, that set must not change	2023-09-29 09:29:20 +01:00
Mikael Holmen	23b8a19a1b	[DwarfDebug] Add forward declarations of "<" operators [NFC] The operators are defined in DwarfDebug.cpp but are referenced in the struct definitions of FrameIndexExpr and EntryValueInfo in DwarfDebug.h, and since they weren't declared before, gcc warned with [694/5646] Building CXX object lib/CodeGen/AsmPrinter/CMakeFiles/LLVMAsmPrinter.dir/DwarfDebug.cpp.o ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:273:6: warning: 'bool llvm::operator<(const llvm::FrameIndexExpr&, const llvm::FrameIndexExpr&)' has not been declared within 'llvm' 273 \| bool llvm::operator<(const FrameIndexExpr &LHS, const FrameIndexExpr &RHS) { \| ^~~~ In file included from ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:13: ../lib/CodeGen/AsmPrinter/DwarfDebug.h:112:15: note: only here as a 'friend' 112 \| friend bool operator<(const FrameIndexExpr &LHS, const FrameIndexExpr &RHS); \| ^~~~~~~~ ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:278:6: warning: 'bool llvm::operator<(const llvm::EntryValueInfo&, const llvm::EntryValueInfo&)' has not been declared within 'llvm' 278 \| bool llvm::operator<(const EntryValueInfo &LHS, const EntryValueInfo &RHS) { \| ^~~~ In file included from ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:13: ../lib/CodeGen/AsmPrinter/DwarfDebug.h:121:15: note: only here as a 'friend' 121 \| friend bool operator<(const EntryValueInfo &LHS, const EntryValueInfo &RHS); \| ^~~~~~~~	2023-09-29 09:17:56 +02:00
Tobias Stadler	305fbc1b32	Revert "[GlobalISel] LegalizationArtifactCombiner: Elide redundant G_AND" This reverts commit 3686a0b611c65f0d7190345b8e3e73cdca9fa657. This seems to have broken some sanitizer tests: https://lab.llvm.org/buildbot/#/builders/184/builds/7721	2023-09-29 03:35:40 +02:00
Tobias Stadler	3686a0b611	[GlobalISel] LegalizationArtifactCombiner: Elide redundant G_AND The legalizer currently generates lots of G_AND artifacts. For example between boolean uses and defs there is always a G_AND with a mask of 1, but when the target uses ZeroOrOneBooleanContents, this is unnecessary. Currently these artifacts have to be removed using post-legalize combines. Omitting these artifacts at their source in the artifact combiner has a few advantages: - We know that the emitted G_AND is very likely to be useless, so our KnownBits call is likely worth it. - The G_AND and G_CONSTANT can interrupt e.g. G_UADDE/... sequences generated during legalization of wide adds which makes it harder to detect these sequences in the instruction selector (e.g. useful to prevent unnecessary reloading of AArch64 NZCV register). - This cleans up a lot of legalizer output and even improves compilation-times. AArch64 CTMark geomean: `O0` -5.6% size..text; `O0` and `O3` ~-0.9% compilation-time (instruction count). Since this introduces KnownBits into code-paths used by `O0`, I reduced the default recursion depth. This doesn't seem to make a difference in CTMark, but should prevent excessive recursive calls in the worst case. Reviewed By: aemerson Differential Revision: https://reviews.llvm.org/D159140	2023-09-29 02:11:57 +02:00
Alexey Bataev	3204f88a8b	Revert "[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst." This reverts commit c88c281cf1ac1a01c55231b93826d7c8ae83985b to fix the crash revealed by https://lab.llvm.org/buildbot/#/builders/230/builds/19353.	2023-09-28 11:57:32 -07:00
Noah Goldstein	de7881ebf5	[DAGCombiner] Combine `(select c, (and X, 1), 0)` -> `(and (zext c), X)` The middle end canonicalizes: `(and (zext c), X)` -> `(select c, (and X, 1), 0)` But the `and` + `zext` form gets better codegen.	2023-09-28 13:46:46 -05:00
Alexey Bataev	c88c281cf1	[IR]Add NumSrcElts param to is..Mask static function in ShuffleVectorInst. Need to add NumSrcElts param to is..Mask functions in ShuffleVectorInstruction class for better mask analysis. Mask.size() not always matches the sizes of the permuted vector(s). Allows to better estimate the cost in SLP and fix uses of the functions in other cases. Differential Revision: https://reviews.llvm.org/D158449	2023-09-28 11:03:21 -07:00
Nick Desaulniers	e0a48c065b	[InlineAsm] add comments for NumOperands and ConstraintType (#67474 ) Splitting up patches for #20571. I found these comments generally useful to add and not predicated on those changes. Hopefully they help future travelers.	2023-09-28 08:24:56 -07:00
Nikita Popov	17d276a6b8	[TypePromotion] Avoid use of ConstantExpr::getZExt() (NFC) Instead work on APInt.	2023-09-28 16:19:08 +02:00
Karl-Johan Karlsson	fa3a685926	[MachineLICM] Clear subregister kill flags (#67240) When hoisting a loop invariant instruction the resulting register must be live in all the basic blocks of the loop body and the killed flags of the register must be cleared. Before this patch, killed flags of subregisters of a hoisted superregister were not cleared in the loop body. This was found in an out of tree target, but the testcase mlicm-stack-write-check.mir was modified to trigger the case.	2023-09-28 07:26:39 +02:00
Daniel Paoliello	9aa378d89e	[llvm] Fix 32bit build after change to implement S_INLINEES debug symbol (#67607) https://github.com/llvm/llvm-project/pull/67490 broke 32bit builds by having mismatched types in a call to `std::min`. This change standardizes on using `size_t` to avoid the mismatch.	2023-09-27 14:46:45 -07:00
Daniel Paoliello	050bb26174	[llvm] Implement S_INLINEES debug symbol (#67490 ) The `S_INLINEES` debug symbol is used to record all the functions that are directly inlined within the current function (nested inlining is ignored). This change implements support for emitting the `S_INLINEES` debug symbol in LLVM, and cleans up how the `S_INLINEES` and `S_CALLEES` debug symbols are dumped.	2023-09-27 14:06:22 -07:00
Jay Foad	21c2ba4bdb	[GlobalISel] Remove TargetLowering::isConstantUnsignedBitfieldExtractLegal Use LegalizerInfo::isLegalOrCustom instead. Differential Revision: https://reviews.llvm.org/D116807	2023-09-27 15:58:01 +01:00
Alexey Lapshin	08136d822c	[DWARFLinkerParallel] Add support of accelerator tables to DWARFLinkerParallel. This patch is extracted from D96035, it adds support for the accelerator tables to the DWARFLinkerParallel functionality. Differential Revision: https://reviews.llvm.org/D154793	2023-09-27 13:37:35 +02:00
lennyxiao	13c603a41f	[ScheduleDAG] Fix false assert target In SUnit::removePred, N->WeakSuccsLeft is reduced but WeakSuccsLeft is checked. Reviewed By: kerbowa Differential Revision: https://reviews.llvm.org/D151311	2023-09-27 16:30:16 +08:00
Jianjian Guan	435da4ef55	[RISCV] Promote SETCC and VP_SETCC of f16 vectors when only have zvfhmin (#66866 ) This patch implements the promotion of fp16 vectors SETCC and VP_SETCC when we only have zvfhmin but no zvfh.	2023-09-27 11:00:19 +08:00
Nick Desaulniers	35a364fa5c	[TargetLowering] fix index OOB (#67494 ) I accidentally introduced this in commit 330fa7d2a4e0 ("[TargetLowering] Deduplicate choosing InlineAsm constraint between ISels (#67057)") Fix forward.	2023-09-26 15:50:26 -07:00
Vladislav Dzhidzhoev	f8aab289b5	[DebugMetadata][DwarfDebug] Support function-local types in lexical block scopes (4/7) RFC https://discourse.llvm.org/t/rfc-dwarfdebug-fix-and-improve-handling-imported-entities-types-and-static-local-in-subprogram-and-lexical-block-scopes/68544 Similar to imported declarations, the patch tracks function-local types in DISubprogram's 'retainedNodes' field. DwarfDebug is adjusted in accordance with the aforementioned metadata change and provided a support of function-local types scoped within a lexical block. The patch assumes that DICompileUnit's 'enums field' no longer tracks local types and DwarfDebug would assert if any locally-scoped types get placed there. Reviewed By: jmmartinez Differential Revision: https://reviews.llvm.org/D144006	2023-09-26 23:07:29 +04:00
weiguozhi	31f81e96a4	[RA] Don't split a register generated from another split (#67351) Splitting a register generated from another split usually doesn't bring much benefit. It may also cause an infinite loop, as pr67188 shows, if the heuristic cost always satisfies the split condition. So prevent such splitting. This fixes pr67188.	2023-09-26 08:38:18 -07:00
David Green	03647e2e4b	[AArch64] Handle scalable vectors in combineFMulOrFDivWithIntPow2. The transform will still not trigger as takeInexpensiveLog2 will bail out for any scalable vector, but this guards against a scalable typesize error.	2023-09-26 15:34:34 +01:00
Jingu Kang	ff68e43c81	[MachineLICM] Handle Subloops It is a re-commit from reverted commit 3454cf67bd0a650097dc6ca99874a34e1d59b500. Following discussion on https://reviews.llvm.org/D154205, make MachineLICM pass handle subloops with only visiting outermost loop's blocks once. Differential Revision: https://reviews.llvm.org/D154205	2023-09-26 14:25:11 +01:00
Sam McCall	679c3a1791	[TargetLowering] use stable_sort to avoid nondeterminism After 330fa7d2a4e0cfbb4b078 we were seeing nondeterministic failures of llvm/test/CodeGen/ARM/thumb-big-stack.ll, with different code being generated in different runs. Switching sort -> stable_sort fixes this. It looks like the old algorithm picked the first best option, and using stable_sort restores that behavior.	2023-09-26 15:16:09 +02:00
Amara Emerson	ea7157ff4f	[GlobalISel] Propagate mem operands for indexed load/stores. There isn't a test for this yet since the combines aren't used atm, but it will be tested as part of a future commit. I'm just making this a separate change for tidiness reasons.	2023-09-25 14:41:20 -07:00
Nick Desaulniers	330fa7d2a4	[TargetLowering] Deduplicate choosing InlineAsm constraint between ISels (#67057 ) Given a list of constraints for InlineAsm (ex. "imr") I'm looking to modify the order in which they are chosen. Before doing so, I noticed a fair amount of logic is duplicated between SelectionDAGISel and GlobalISel for this. That is because SelectionDAGISel is also trying to lower immediates during selection. If we detangle these concerns into: 1. choose the preferred constraint 2. attempt to lower that constraint Then we can slide down the list of constraints until we find one that can be lowered. That allows the implementation to be shared between instruction selection frameworks. This makes it so that later I might only need to adjust the priority of constraints in one place, and have both selectors behave the same.	2023-09-25 08:53:03 -07:00
Momchil Velikov	c649fd34e9	[MachineSink][AArch64] Sink instruction copies when they can replace copy into hard register or folded into addressing mode This patch adds a new code transformation to the `MachineSink` pass, that tries to sink copies of an instruction, when the copies can be folded into the addressing modes of load/store instructions, or replace another instruction (currently, copies into a hard register). The criteria for performing the transformation is that: * the register pressure at the sink destination block must not exceed the register pressure limits * the latency and throughput of the load/store or the copy must not deteriorate * the original instruction must be deleted Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D152828	2023-09-25 10:49:44 +01:00
Amara Emerson	7e5c2672cb	[GlobalISel][NFC] Clean up and modernize the indexed load/store combines. Use wrappers and helpers to tidy it up, and remove some debug prints.	2023-09-25 00:32:36 -07:00
Amara Emerson	bc6e7f0573	[GlobalISel][NFC] Remove unused method CombinerHelper::tryCombine() The combines were ported to the tablegen combiner a long time ago so this manual method isn't needed.	2023-09-24 22:20:21 +08:00
Simon Pilgrim	8b36d082c4	[DAG] getNode() - fold (zext (trunc x)) -> x iff the upper bits are known zero - add SRL support This is part of the work to address the D155472 regressions, there's a number of issues with generalizing this fold which is why I'm just adding SRL support atm. Differential Revision: https://reviews.llvm.org/D159533	2023-09-24 13:40:07 +01:00
Noah Goldstein	bc38c427d4	[DAGCombiner][AArch64] Fix incorrect cast VT in `takeInexpensiveLog2` Previously, we were taking `CurVT` before finalizing `ToCast`, which meant potentially returning an `SDValue` with an illegal `ValueType` for the operation. Fix is to just take `CurVT` after we have finalized `ToCast` with `PeekThroughCastsAndTrunc`.	2023-09-23 09:50:42 -05:00
Kazu Hirata	ce8c22856e	Use llvm::drop_begin and llvm::drop_end (NFC)	2023-09-22 17:29:10 -07:00

34723 Commits