llvm-project

Author	SHA1	Message	Date
Róbert Ágoston	cd4ed08b5a	[GlobalISel] Don't combine instructions which are fed by memory instructions using different size Memory instructions like extending loads from the same address are not equal if their size is not equal. This fixes https://github.com/llvm/llvm-project/issues/53524. Differential Revision: https://reviews.llvm.org/D118805	2022-02-04 15:00:47 -08:00
John Brawn	0d8092dd48	[AArch64] Fix legalization of v1f64 strict_fsetcc and strict_fsetccs These operations are scalarized but the result type v1i1 isn't which needs special handling (the same as is done for the non-strict versions of these operations). Differential Revision: https://reviews.llvm.org/D118258	2022-02-04 12:55:38 +00:00
serge-sans-paille	ffe8720aa0	Reduce dependencies on llvm/BinaryFormat/Dwarf.h This header is very large (3M lines once expanded) and was included in locations where DWARF-specific information was not needed. More specifically, this commit suppresses the dependencies on llvm/BinaryFormat/Dwarf.h in two headers: llvm/IR/IRBuilder.h and llvm/IR/DebugInfoMetadata.h. As these headers (esp. the former) are widely used, this has a decent impact on the number of preprocessed lines generated during compilation of LLVM, as showcased below. This is achieved by moving some definitions back to the .cpp file, no performance impact implied[0]. As a consequence of that patch, downstream users may need to manually include some extra files: llvm/IR/IRBuilder.h no longer includes llvm/BinaryFormat/Dwarf.h llvm/IR/DebugInfoMetadata.h no longer includes llvm/BinaryFormat/Dwarf.h In some situations, code may be relying on the fact that llvm/BinaryFormat/Dwarf.h was including llvm/ADT/Triple.h; this hidden dependency now needs to be explicit. $ clang++ -E -Iinclude -I../llvm/include ../llvm/lib/Transforms/Scalar/*.cpp -std=c++14 -fno-rtti -fno-exceptions | wc -l after: 10978519 before: 11245451 Related Discourse thread: https://llvm.discourse.group/t/include-what-you-use-include-cleanup [0] https://llvm-compile-time-tracker.com/compare.php?from=fa7145dfbf94cb93b1c3e610582c495cb806569b&to=995d3e326ee1d9489145e20762c65465a9caeab4&stat=instructions Differential Revision: https://reviews.llvm.org/D118781	2022-02-04 11:44:03 +01:00
Bjorn Pettersson	3db39e7479	[DAGCombiner] Fix dependency analysis in checkMergeStoreCandidatesForDependencies In the aftermath of D116895 a problem was found in the analysis of dependencies between store merge candidates in checkMergeStoreCandidatesForDependencies, which is needed to avoid introducing cycles in the DAG. In the past it has been enough (or assumed to be enough) to start scanning from non-chain operands when analysing the store merge candidates for dependencies, assuming that the analysis of chain dependencies performed when finding the candidates would cover potential dependencies involving the chain operands. It was however discovered that one could end up with scenarios such as the one described in the aarch64-checkMergeStoreCandidatesForDependencies.ll test case, where the dependency between two stores is given by a mix of chain operand dependencies and non-chain operand dependencies. The fix in this patch makes sure that we also account for chain operand dependencies when doing the more elaborate analysis in checkMergeStoreCandidatesForDependencies, no longer relying on the earlier check involving chain operands being enough. Differential Revision: https://reviews.llvm.org/D118943	2022-02-04 08:53:01 +01:00
Mircea Trofin	91a33ad32b	[nfc][mlgo][regalloc] Cache live interval feature components Lazily cache the feature components of a LiveInterval. Differential Revision: https://reviews.llvm.org/D118674	2022-02-03 17:01:42 -08:00
Jessica Paquette	9a61e731ff	[GlobalISel] Combine (G_ADDO x, 0) -> x + no carry out Similar to the G_MULO change. The code for checking if a constant is legal/pre-legalize is shared between these, and is kind of hairy. So, factor it out into a new function: `isConstantLegalOrBeforeLegalizer`. To make the refactoring clean, further refactor `isLegalOrBeforeLegalizer` into a wrapper for two functions: - `isPreLegalize` - `isLegal` This is a bit easier to read in general. https://godbolt.org/z/KW7oszP1o Differential Revision: https://reviews.llvm.org/D118655	2022-02-03 14:25:15 -08:00
Jessica Paquette	c636899dc1	[GlobalISel] Combine: (G_MULO x, 0) -> 0 + no carry out Similar to the following combine in `DAGCombiner::visitMULO`: ``` // fold (mulo x, 0) -> 0 + no carry out if (isNullOrNullSplat(N1)) return CombineTo(N, DAG.getConstant(0, DL, VT), DAG.getConstant(0, DL, CarryVT)); ``` This fixes some generally poor codegen for `mulo`: https://godbolt.org/z/eTxYsvz8f Differential Revision: https://reviews.llvm.org/D118635	2022-02-03 14:23:58 -08:00
Mircea Trofin	592f52de33	[nfc][regalloc] const LiveIntervals within the allocator Once built, LiveIntervals are immutable. This patch captures that. Differential Revision: https://reviews.llvm.org/D118918	2022-02-03 12:35:36 -08:00
Bjorn Pettersson	0352ee1a22	[CodeGenPrepare] Avoid out-of-bounds shift AddressingModeMatcher::matchOperationAddr may attempt to shift a variable by the same number of steps as found in the IR in a SHL instruction. This was done without considering that there could be undefined behavior in the IR, so the shift performed when compiling could end up having undefined behavior as well. This patch avoids UB in CodeGenPrepare by making sure that we limit the shift amount used, in a similar way to what is already done in CodeGenPrepare::optimizeLoadExt. Differential Revision: https://reviews.llvm.org/D118602	2022-02-03 21:03:58 +01:00
Mircea Trofin	79b98f0a07	Revert "[nfc][mlgo] De-const a parameter" This reverts commit bc3b372161716a4c4845d47a877e4892df0d08da. The planned change that would have needed non-const MachineFunction refs isn't needed after all.	2022-02-03 09:20:36 -08:00
John Brawn	94843ea7d7	[AArch64] Make machine combiner patterns preserve MIFlags This is mainly done so that we don't lose the nofpexcept flag once we start emitting it. Differential Revision: https://reviews.llvm.org/D118621	2022-02-03 11:58:59 +00:00
Sander de Smalen	01bfe9729a	[ISEL] Canonicalize STEP_VECTOR to LHS if RHS is a splat. This helps recognise patterns where we're trying to match STEP_VECTOR patterns to INDEX instructions that take a GPR for the Start/Step. The reason for canonicalising this operation to the LHS is because it will already be canonicalised to the LHS if the RHS is a constant splat vector. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D118459	2022-02-03 09:31:46 +00:00
Jeremy Morse	4654fa89ea	Follow up to 6e03a68b776dc, squelch another leak This patch is a sticking-plaster until D118774 solves the situation with unique_ptrs. I'm certainly wishing I'd focused on that first X_X.	2022-02-02 21:02:11 +00:00
Jeremy Morse	6e03a68b77	[DebugInfo] Re-enable instruction referencing for x86_64 After discussion in D116821 this was turned off in 74db5c8c95e, 14aaaa12366f7 applied to limit the maximum memory consumption in rare conditions, plus some performance patches.	2022-02-02 19:41:59 +00:00
Matt Arsenault	a96dbb9035	CodeGen: Use asm register names in warning message This was using the ugly tablegenerated register enum names, which are really hideous for register tuples on AMDGPU. Use the prettier names which are recognized by the asm parser.	2022-02-02 14:20:12 -05:00
Jeremy Morse	206cafb680	Follow up to 9fd9d56dc6b, avoid a memory leak Gaps in the basic block number range (from blocks being deleted or folded) get block-value-tables allocated but never ejected, leading to a memory leak, currently tripping up the asan buildbots. Fix this up by manually freeing that memory. As suggested elsewhere, if these things were owned by a unique_ptr then cleanup would happen automagically. D118774 should eliminate the need for this dance.	2022-02-02 16:01:11 +00:00
Masoud Ataei	256d253332	[PowerPC] Scalar IBM MASS library conversion pass This patch introduces the conversions from math function calls to MASS library calls. To resolve calls generated by these conversions, one needs to link the libxlopt.a library. This patch is tested on PowerPC Linux and AIX. Differential: https://reviews.llvm.org/D101759 Reviewer: bmahjour	2022-02-02 07:54:19 -08:00
Mircea Trofin	660ff655c8	Fix buildbreak introduced in ed2deab5956fea9e8f64ef6020fe0b4e19734ecc	2022-02-02 07:34:51 -08:00
Mircea Trofin	ed2deab595	[nfc][regalloc] Make the max inference cutoff configurable Added a flag to make configurable the number of interferences after which we 'bail out' and treat a set of intervals as un-evictable. Also using it on the ML side, as it turns out to be a good control for compile-time. With this configurable, we can do a bit of trial and error and see if bumping it has any effect on heuristic/policy quality. Differential Revision: https://reviews.llvm.org/D118707	2022-02-02 07:29:34 -08:00
Jeremy Morse	43de305704	[DebugInfo][InstrRef] Fix a tombstone-in-DenseMap crash from D117877 This is a follow-up to D117877: variable assignments of DBG_VALUE $noreg, or DBG_INSTR_REFs where no value can be found, are represented by a DbgValue object with Kind "Undef", explicitly meaning "there is no value". In D117877 I added a special case to make some assignment accounting faster, without considering this scenario. It causes variables to be given the value ValueIDNum::EmptyValue, which then ends up being a DenseMap key. The DenseMap asserts, because EmptyValue is the tombstone key. Fix this by handling the assign-undef scenario in the special case, to match what happens in the general case: the variable has no value if it's only ever assigned $noreg / undef. Differential Revision: https://reviews.llvm.org/D118715	2022-02-02 15:08:49 +00:00
Jeremy Morse	9fd9d56dc6	[DebugInfo][InstrRef][NFC] Use depth-first scope search for variable locs This patch aims to reduce max-rss from instruction referencing, by avoiding keeping variable value information in memory for too long. Instead of computing all the variable values then emitting them to DBG_VALUE instructions, this patch tries to stream the information out through a depth first search: * Make use of the fact LexicalScopes gives a depth-number to each lexical scope, * Produce a map that identifies the last lexical scope to make use of a block, * Enumerate each scope in LexicalScopes' DFS order, solving the variable value problem, * After each scope is processed, look for any blocks that won't be used by any other scope, and emit all the variable information to DBG_VALUE instructions. Differential Revision: https://reviews.llvm.org/D118460	2022-02-02 14:09:54 +00:00
Jeremy Morse	a80181a81e	[DebugInfo][InstrRef][NFC] Free resources at an earlier stage This patch releases some memory from InstrRefBasedLDV earlier than it would otherwise. The underlying problem is: * We store a big table of "live in values for each block", * We translate that into DBG_VALUE instructions in each block, And both exist in memory at the same time, which needlessly doubles that information. Most of what this patch does is: as we progressively translate live-in information into DBG_VALUEs, we free the variable-value / machine-value tracking information as we go, which significantly reduces peak memory. While I'm here, also add a clear method to wipe variable assignments that have been accumulated into VLocTracker objects, and turn a DenseMap into a SmallDenseMap to avoid an initial allocation. Differential Revision: https://reviews.llvm.org/D118453	2022-02-02 12:58:15 +00:00
Jeremy Morse	d556eb7e27	[DebugInfo][InstrRef][NFC] Cache some PHI resolutions Install a cache of DBG_INSTR_REF -> ValueIDNum resolutions, for scenarios where the value has to be reconstructed from several DBG_PHIs. Whenever this happens, it's because branch folding + tail duplication has messed with the SSA form of the program, and we have to solve a mini SSA problem to find the variable value. This is always called twice, so it makes sense to cache the value. This gives a ~0.5% geomean compile-time-performance improvement on CTMark. Differential Revision: https://reviews.llvm.org/D118455	2022-02-02 12:21:28 +00:00
Simon Pilgrim	5aa2acc86b	[DAG] SimplifyDemandedVectorElts - remove KnownZero/KnownUndef from DCI helper wrapper None of the external users actually touch these (they're purely used internally down the recursive call) - it's trivial to add another wrapper if anything ever does want to track known elements.	2022-02-02 12:04:49 +00:00
Jeremy Morse	14aaaa1236	Re-apply 3fab2d138e30, now with a triple added Was reverted in 1c1b670a73a9 as it broke all non-x86 bots. Original commit message: [DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out In certain circumstances with things like autogenerated code and asan, you can end up with thousands of Values live at the same time, causing a large working set and a lot of information spilled to the stack. Unfortunately InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory when there are many many stack slots. See the reproducer in D116821. It seems very unlikely that a developer would be able to reason about hundreds of live named local variables at the same time, so a huge working set and many stack slots is an indicator that we're likely analysing autogenerated or instrumented code. In those cases: gracefully degrade by setting an upper bound on the number of stack slots to track. This limits peak memory consumption, at the cost of dropping some variable locations, but in a rare scenario where it's unlikely someone is actually going to use them. In terms of the patch, this adds a cl::opt for max number of stack slots to track, and has the stack-slot-numbering code optionally return None. That then filters through a number of code paths, which can then choose to not track a spill / restore if it touches an untracked spill slot. The added test checks that we drop variable locations that are on the stack, if we set the limit to zero. Differential Revision: https://reviews.llvm.org/D118601	2022-02-02 11:04:00 +00:00
Sam Parker	281d29b8fe	[TypePromotion] Avoid some unnecessary truncs Check for legal zext 'sinks' before inserting a trunc. Differential Revision: https://reviews.llvm.org/D115451	2022-02-02 10:05:15 +00:00
Simon Moll	7d926b7177	[VE] LEGALAVL and staged VVP legalization The new LEGALAVL node annotates that the AVL refers to packs of 64bit. We use a two-stage lowering approach with LEGALAVL: First, standard SDNodes are translated into illegal VVP layer nodes. Regardless of source (VP or standard), all VVP nodes have a mask and AVL parameter. The AVL parameter refers to the element position (just as in VP intrinsics). Second, we legalize the AVL usage in VVP layer nodes. If the element size is < 64bit, the EVL parameter has to be adjusted to refer to packs of 64bits. We wrap the legalized AVL in a LEGALAVL node to track this. Reviewed By: kaz7 Differential Revision: https://reviews.llvm.org/D118321	2022-02-02 09:11:41 +01:00
Kevin Athey	1c1b670a73	Revert "[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out" This reverts commit 3fab2d138e30c65249e1eaea6cc68b2b7f50955a. Breaking PPC sanitizer build: https://lab.llvm.org/buildbot/#/builders/105/builds/20857	2022-02-01 18:37:02 -08:00
David Blaikie	f69f23396d	Revert "DebugInfo: Don't put types in type units if they reference internal linkage types" This reverts commit ab4756338c5b2216d52d9152b2f7e65f233c4dac. Breaks some cases, including this: namespace { template <typename> struct a {}; } // namespace class c { c(); }; class b { b(); a<c> ax; }; b::b() {} c::c() {} By producing a reference to a type unit for "c" but not producing the type unit.	2022-02-01 16:13:07 -08:00
David Green	c89cfbd4dd	Revert "[DAG] Extend SearchForAndLoads with any_extend handling" This reverts commit 100763a88fe97b22cd5e3f69d203669aac3ed48f as it was making incorrect assumptions about implicit zero_extends.	2022-02-01 20:18:40 +00:00
Jeremy Morse	8e75536e51	[DebugInfo][InstrRef][NFC] Bypass a frequently-noop loop Bypass this loop if it would do nothing -- if there are no register masks to be examined, there's no point looking at each location to see if the location has been def'd. Awkwardly, this was responsible for almost an entire half a percent of performance improvement on CTMark. Differential Revision: https://reviews.llvm.org/D118613	2022-02-01 19:39:09 +00:00
Jeremy Morse	3fab2d138e	[DebugInfo][InstrRef] Add a max-stack-slots-to-track cut-out In certain circumstances with things like autogenerated code and asan, you can end up with thousands of Values live at the same time, causing a large working set and a lot of information spilled to the stack. Unfortunately InstrRefBasedLDV doesn't cope well with this and consumes a lot of memory when there are many many stack slots. See the reproducer in D116821. It seems very unlikely that a developer would be able to reason about hundreds of live named local variables at the same time, so a huge working set and many stack slots is an indicator that we're likely analysing autogenerated or instrumented code. In those cases: gracefully degrade by setting an upper bound on the number of stack slots to track. This limits peak memory consumption, at the cost of dropping some variable locations, but in a rare scenario where it's unlikely someone is actually going to use them. In terms of the patch, this adds a cl::opt for max number of stack slots to track, and has the stack-slot-numbering code optionally return None. That then filters through a number of code paths, which can then choose to not track a spill / restore if it touches an untracked spill slot. The added test checks that we drop variable locations that are on the stack, if we set the limit to zero. Differential Revision: https://reviews.llvm.org/D118601	2022-02-01 19:25:29 +00:00
Jeremy Morse	91fb66cf91	[DebugInfo][InstrRef][NFC] Don't build a map of un-needed values When finding locations for variable values at the start of a block, we build a large map of every value to every location, and then pick out the locations for values that are desired. This takes up quite a lot of time, because, unsurprisingly, there are usually more values in registers and stack slots than there are variables. This patch instead creates a map of desired values to their locations, which are initially illegal locations. Then, as we examine every available value, we can select locations for values we care about, and ignore those that we don't. This substantially reduces the amount of work done (i.e., building a map up of values to locations that nothing wants or needs). Geomean performance improvement of 1% on CTMark, woo. Differential Revision: https://reviews.llvm.org/D118597	2022-02-01 18:58:06 +00:00
Mircea Trofin	22d3bbdf4e	[nfc][regalloc] Move DefaultEvictionAdvisor::* to RegAllocEvictionAdvisor.cpp This is leftover from the advisor refactoring. Straight-forward copy and paste.	2022-02-01 07:59:25 -08:00
Simon Pilgrim	904395ab8f	[DAG] SimplifyMultipleUseDemandedBits - add default Depth = 0 argument. Simplifies an upcoming change.	2022-02-01 12:34:38 +00:00
Simon Pilgrim	d83a96f59f	[DAG] Make it clear mul(x,x) knownbits bit[1] == 0 check should be for x is undef only As raised on rGffd0e464b4b9, if x is poison, this fold is still ok.	2022-02-01 11:32:14 +00:00
Bjorn Pettersson	3885879046	[DAGCombine] Add simple folds for SSHLSAT/USHLSAT Do "simplifyShift" and "FoldConstantArithmetic" folds for the SSHLSAT and USHLSAT DAG nodes. This includes folds such as: (shlsat undef/poison, x) -> 0 (shlsat x, undef/poison) -> undef (shlsat x, too_large_shamt) -> undef (shlsat 0, x) -> 0 (shlsat x, 0) -> x (shlsat c1, c2) -> c3 Differential Revision: https://reviews.llvm.org/D118603	2022-02-01 10:51:35 +01:00
David Sherwood	daa80339df	[CodeGen] Support folds of not(cmp(cc, ...)) -> cmp(!cc, ...) for scalable vectors I have updated TargetLowering::isConstTrueVal to also consider SPLAT_VECTOR nodes with constant integer operands. This allows the optimisation to also work for targets that support scalable vectors. Differential Revision: https://reviews.llvm.org/D117210	2022-02-01 09:50:00 +00:00
Mircea Trofin	a3f1491849	[nfc][mlgo][regalloc] 'hasPreferredPhys' out of feature components It isn't cacheable, it can be updated by other events than live interval resizing.	2022-01-31 18:59:47 -08:00
Mircea Trofin	9aa2c914b9	[mlgo][regalloc] Factor live interval feature calculation Factoring it out so we can subsequently cache it. This should be a NFC, however, for the float quantities, we see small errors in the least significant digits. This is because, before, we were summing up one by one. Now, we sum up results of sums. This shouldn't matter for ML, and will require rework when we do quantization (avoiding floats altogether), but meanwhile, it did require an update to the reference file used for testing. The patch also bumps the precision of the variables involved in this, to reduce the error (note they are casted back to float at the end by the SET macro, since we only work with float and not double in TF) Differential Revision: https://reviews.llvm.org/D118659	2022-01-31 15:19:15 -08:00
Mircea Trofin	d46305e22d	[NFC][regalloc] Move evict advisor initialization before VRAI This is because a subsequent patch will propose obtaining the VRAI from the advisor, which will enable feature caching for the ML advisor, for better compile time. Making this change first as it's both innocuous and keeps the future patch to be reviewed small.	2022-01-31 14:04:59 -08:00
Mircea Trofin	bc3b372161	[nfc][mlgo] De-const a parameter We plan to pass the MachineFunction& to APIs that expect it non-const (for legitimate reasons). The advisor still holds the ref as a const ref, though, so we keep most of the maintainability value of that.	2022-01-31 13:44:33 -08:00
Philip Reames	57cf29ac1b	[Statepoint] Remove another use of getActualReturnType [NFC] For the cross block gc.result projection case, we only care about the return type if there is a cross block gc.result, and if there is one, we can take the type from the gc.result. At the moment, this makes little difference, but for opaque pointers we need a means to get result typing without relying on pointee types.	2022-01-31 09:57:46 -08:00
Adrian Prantl	f85c6b79f3	Fix a fragment overflow problem when composing super-registers. Addresses https://github.com/llvm/llvm-project/issues/53342 Differential Revision: https://reviews.llvm.org/D118412	2022-01-31 09:47:29 -08:00
Philip Reames	6e4f7c0823	[Statepoints] Take result type from gc.result [NFC] When lowering a gc.result, we can assume that the result type of the gc.result matches the type of the underlying call. This is explicitly required in LangRef. At the moment, this makes little difference, but for opaque pointers we need a means to get result typing without relying on pointee types.	2022-01-31 09:42:34 -08:00
Philip Reames	093b43f48d	Sink getGCResultLocality to sole use [NFC]	2022-01-31 09:33:57 -08:00
Jeremy Morse	4a2cb01370	[DebugInfo][InstrRef][NFC] Refactor ahead of further optimisations This patch shuffles some functions around so that some blocks of code can be reused. In particular, * Move the determination of "which blocks are in scope" to its own function, as it's non-trivial to solve. Delete the "InScopeBlocks" collection too, which nothing reads from. * Split transfer emission (i.e., installing DBG_VALUEs into blocks) into its own function. * Name some useful types. * Rename "ScopeToBlocks" to "ScopeToAssignBlocks", as that's what the collection contains, blocks where assignments happen. Differential Revision: https://reviews.llvm.org/D118454	2022-01-31 16:45:53 +00:00
Jeremy Morse	e9739f116d	Revert "[DebugInfo][InstrRef][NFC] Add a missing assignment operator" This reverts commit f18429372f12b571aef539855c4dbef23a96f494. Bitten by -Werror,-Wdeprecated-copy on a buildbot, alas!	2022-01-31 16:15:21 +00:00
Jeremy Morse	f18429372f	[DebugInfo][InstrRef][NFC] Add a missing assignment operator ValueIDNum is supposed to be a value type that boils down to a uint64_t, that has some bitfields for convenience. If we use the default operator=, we end up with each bit field being individually assigned, which is un-necessarily slow. Implement the assignment operator by just copying the uint64_t value of the object. This is quicker, and matches how the comparison operators work already. Doing so is 0.1% faster on the compile-time-tracker.	2022-01-31 16:08:38 +00:00
Kerry McLaughlin	002b944dfa	[SVE] Fix TypeSize->uint64_t implicit conversion in visitAlloca() Fixes a crash ('Invalid size request on a scalable vector') in visitAlloca() when we call this function for a scalable alloca instruction, caused by the implicit conversion of TySize to uint64_t. This patch changes TySize to a TypeSize as returned by getTypeAllocSize() and ensures the allocation size is multiplied by vscale for scalable vectors. Reviewed By: sdesmalen, david-arm Differential Revision: https://reviews.llvm.org/D118372	2022-01-31 14:37:23 +00:00
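
The `(G_MULO x, 0) -> 0 + no carry out` combine from c636899dc1 rests on a simple arithmetic fact, which the following sketch tries to make concrete. It is a standalone scalar model, not LLVM code: `MulOResult` and `umulo32` are hypothetical names introduced here for illustration, and only the unsigned 32-bit case is modelled.

```cpp
#include <cstdint>

// Hypothetical scalar model of an unsigned multiply-with-overflow
// operation (illustration only, not an LLVM API).
struct MulOResult {
  uint32_t Value;  // low 32 bits of the product
  bool Overflow;   // true if the full product does not fit in 32 bits
};

inline MulOResult umulo32(uint32_t A, uint32_t B) {
  // Compute the product in 64 bits so the overflow is directly visible.
  const uint64_t Wide = static_cast<uint64_t>(A) * B;
  return {static_cast<uint32_t>(Wide), Wide > UINT32_MAX};
}
```

For any `x`, `umulo32(x, 0)` yields a zero value with the overflow flag clear, so both results can be replaced by constants without looking at `x`.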
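
Some of the SSHLSAT/USHLSAT folds listed in 3885879046 can be checked against a small scalar model. The sketch below assumes an 8-bit unsigned saturating shift; `ushlsat8` is a hypothetical helper written for this illustration, not a DAGCombiner function, and it deliberately leaves the undef/poison and too-large-shift-amount cases out.

```cpp
#include <cstdint>

// Hypothetical 8-bit model of an unsigned saturating left shift
// (illustration only). Requires Sh < 8; larger amounts correspond to the
// "too_large_shamt" case, which is not modelled here.
constexpr uint8_t ushlsat8(uint8_t X, unsigned Sh) {
  // Widen first so that bits shifted past bit 7 remain visible,
  // then clamp to the 8-bit maximum if any were lost.
  return (static_cast<unsigned>(X) << Sh) > 0xFFu
             ? static_cast<uint8_t>(0xFF)
             : static_cast<uint8_t>(static_cast<unsigned>(X) << Sh);
}

// (shlsat 0, x) -> 0
static_assert(ushlsat8(0, 5) == 0, "shifting zero stays zero");
// (shlsat x, 0) -> x
static_assert(ushlsat8(0x5A, 0) == 0x5A, "shift by zero is the identity");
// (shlsat c1, c2) -> c3, without and with saturation
static_assert(ushlsat8(3, 2) == 12, "in-range result is the plain shift");
static_assert(ushlsat8(0x40, 2) == 0xFF, "overflow saturates to the maximum");
```

The `static_assert` lines mirror the constant-only folds: when both operands are known, the result is a compile-time constant.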
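
The note in 9aa2c914b9 that summing "results of sums" differs slightly from summing one by one is a general property of floating-point addition. The sketch below is not the regalloc feature code; `sumSequential` and `sumGrouped` are hypothetical helpers, and the input values are chosen only to make the effect visible in single precision.

```cpp
#include <vector>

// Adds each part to the running total one by one.
inline float sumSequential(float Start, const std::vector<float> &Parts) {
  float Acc = Start;
  for (float P : Parts)
    Acc += P;
  return Acc;
}

// Sums the parts first, then adds that partial sum to the start value.
inline float sumGrouped(float Start, const std::vector<float> &Parts) {
  float Partial = 0.0f;
  for (float P : Parts)
    Partial += P;
  return Start + Partial;
}
```

With `Start = 1e8f` and eight parts of `1.0f`, each sequential addition is smaller than half the spacing between adjacent floats near 1e8 (which is 8) and is rounded away, whereas the grouped sum adds 8 in one step. Assuming IEEE-754 single precision with round-to-nearest, the two functions therefore return different values for the same inputs, which is the kind of least-significant-digit drift the commit message describes.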

31897 Commits