llvm-project

Author	SHA1	Message	Date
Kazu Hirata	627b5bda11	[llvm] Add missing header guards (NFC) Identified with llvm-header-guard.	2021-01-30 09:53:42 -08:00
Kazu Hirata	1a2d67fa23	[llvm] Use llvm::lower_bound and llvm::upper_bound (NFC)	2021-01-29 23:23:36 -08:00
Sriraman Tallam	c32f399802	Detect Source Drift with Propeller. Source Drift happens when the sources are updated after profiling the binary but before building the final optimized binary. If the source has changed since the profiles were obtained, optimizing basic blocks might be sub-optimal. This only applies to BasicBlockSection::List as it creates clusters of basic blocks using basic block ids. Source drift can invalidate these groupings leading to sub-optimal code generation with regards to performance. PGO source drift for a particular function can be detected using function metadata added in D95495. When source drift is detected, disable basic block clusters by default which can be re-enabled with -mllvm option bbsections-detect-source-drift=false. Differential Revision: https://reviews.llvm.org/D95593	2021-01-29 18:47:26 -08:00
Roman Lebedev	ddc4b56eef	[ExpandMemCmpPass] Preserve Dominator Tree, if available This finishes getting rid of all the avoidable Dominator Tree recalculations in X86 optimized codegen pipeline.	2021-01-30 01:14:51 +03:00
Roman Lebedev	c2534a7097	[ShadowStackGCLowering] Preserve Dominator Tree, if available This doesn't help avoid any Dominator Tree recalculations just yet, there's one more pass to go..	2021-01-30 01:14:51 +03:00
Jessica Paquette	d6656c3b25	[GlobalISel] Remove hint instructions in generic InstructionSelect code. I think every target will want to remove these in the same way. Rather than making them all implement the same code, let's just put this in InstructionSelect. Differential Revision: https://reviews.llvm.org/D95652	2021-01-29 11:20:07 -08:00
Jay Foad	5cf6412a27	[GlobalISel] Fix modifying a G_OR without notifying the observer Remove the call to setFlags in favour of creating the instruction with the correct flags in the first place, so we don't have to explicitly notify the observer. Differential Revision: https://reviews.llvm.org/D95681	2021-01-29 16:32:24 +00:00
Sjoerd Meijer	f03f3a8474	[MachineLICM] Fix wrong and confusing comment. NFC.	2021-01-29 13:39:07 +00:00
Florian Hahn	f3a710cade	[LTO] Update splitCodeGen to take a reference to the module. (NFC) splitCodeGen does not need to take ownership of the module, as it currently clones the original module for each split operation. There is an ~4 year old fixme to change that, but until this is addressed, the function can just take a reference to the module. This makes the transition of LTOCodeGenerator to use LTOBackend a bit easier, because under some circumstances, LTOCodeGenerator needs to write the original module back after codegen. Reviewed By: tejohnson Differential Revision: https://reviews.llvm.org/D95222	2021-01-29 11:53:11 +00:00
Kazu Hirata	7925aa091d	[llvm] Populate SmallVector at construction time (NFC)	2021-01-28 22:21:14 -08:00
Wei Mi	e15ae67a0a	[LiveDebugVariables] Add cache for SkipPHIsLabelsAndDebug to prevent iterating the same PHI/LABEL/Debug instructions repeatedly. We run into a compiling timeout problem when building a target after its SampleFDO profile is updated. It is because of some very large blocks with a bunch of PHIs at the beginning. LiveDebugVariables::emitDebugValues called during VirtRegRewriter phase searches the insertion point for those large BBs repeatedly in SkipPHIsLabelsAndDebug, and each time SkipPHIsLabelsAndDebug needs to go through the same set of PHIs before it can find the first non PHI/Label/Debug instruction. This patch adds a cache to save the last position for the sequence which has been checked in the previous call of SkipPHIsLabelsAndDebug. Differential Revision: https://reviews.llvm.org/D94981	2021-01-28 21:58:17 -08:00
Christudasan Devadasan	892e4567e1	Support a list of CostPerUse values This patch allows targets to define multiple cost values for each register so that the cost model can be more flexible and better used during the register allocation as per the target requirements. For AMDGPU the VGPR allocation will be more efficient if the register cost can be associated dynamically based on the calling convention. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D86836	2021-01-29 10:14:52 +05:30
Jessica Paquette	d5736a2746	[GlobalISel] Implement regbankselect for G_ASSERT_ZEXT This adds generic regbankselect support for G_ASSERT_ZEXT. It inherits whatever register bank the source was given, always, on all targets. I think that at the point where we run into these, the source register bank should be decided. This also adds some AArch64-specific code which makes sure we can handle G_ASSERT_ZEXT when deciding on register banks for G_STORE, G_PHI, ... etc. Differential Revision: https://reviews.llvm.org/D95649	2021-01-28 16:56:14 -08:00
Jessica Paquette	f19971d1de	[GlobalISel] Implement computeKnownBits for G_ASSERT_ZEXT It's the same as the ZEXT/TRUNC case, except SrcBitWidth is given by the immediate operand. Update KnownBitsTest.cpp and a MIR test for a concrete example. Differential Revision: https://reviews.llvm.org/D95566	2021-01-28 16:34:34 -08:00
Jessica Paquette	daffab1985	Recommit "[GlobalISel] Walk through hints in getDefIgnoringCopies et al" Recommit of 4580acf6752ea3cc884657b5aa3e174bed86fc8c `Opc = DefMI->getOpcode()` was in the wrong place.	2021-01-28 14:43:00 -08:00
Jessica Paquette	dcb5b5f1f2	Revert "[GlobalISel] Walk through hints in getDefIgnoringCopies et al" This reverts commit 4580acf6752ea3cc884657b5aa3e174bed86fc8c. Reverting while looking into some test failures.	2021-01-28 14:37:57 -08:00
Jessica Paquette	4580acf675	[GlobalISel] Walk through hints in getDefIgnoringCopies et al Treat hint instructions like G_ASSERT_ZEXT like COPY instructions in helpers which walk through copies. This ensures that instructions like G_ASSERT_ZEXT won't impact any optimizations that rely on these helpers. Differential Revision: https://reviews.llvm.org/D95577	2021-01-28 14:27:00 -08:00
Cassie Jones	f22f4557a7	[GlobalISel] Implement widenScalar for carry-in add/sub These are widened to a wider UADDE/USUBE, with the overflow value unused, and with the same synthesis of a new overflow value as for the O operations. Reviewed By: paquette Differential Revision: https://reviews.llvm.org/D95326	2021-01-28 17:06:24 -05:00
Jessica Paquette	24261729a4	[GlobalISel] Add G_ASSERT_ZEXT This adds a generic opcode which communicates that a type has already been zero-extended from a narrower type. This is intended to be similar to AssertZext in SelectionDAG. For example, ``` %x_was_extended:_(s64) = G_ASSERT_ZEXT %x, 16 ``` Signifies that the top 48 bits of %x are known to be 0. This is useful in cases like this: ``` define i1 @zeroext_param(i8 zeroext %x) { %cmp = icmp ult i8 %x, -20 ret i1 %cmp } ``` In AArch64, `%x` must use a 32-bit register, which is then truncated to a 8-bit value. If we know that `%x` is already zero-ed out in the relevant high bits, we can avoid the truncate. Currently, in GISel, this looks like this: ``` _zeroext_param: and w8, w0, #0xff ; We don't actually need this! cmp w8, #236 cset w0, lo ret ``` While SDAG does not produce the truncation, since it knows that it's unnecessary: ``` _zeroext_param: cmp w0, #236 cset w0, lo ret ``` This patch - Adds G_ASSERT_ZEXT - Adds MIRBuilder support for it - Adds MachineVerifier support for it - Documents it It also puts G_ASSERT_ZEXT into its own class of "hint instruction." (There should be a G_ASSERT_SEXT in the future, maybe a G_ASSERT_ALIGN as well.) This allows us to skip over hints in the legalizer etc. These can then later be selected like COPY instructions or removed. Differential Revision: https://reviews.llvm.org/D95564	2021-01-28 13:58:37 -08:00
David Blaikie	85b7b5625a	Fix memory leak in 4318028cd2d7633a0cdeb0b5d4d2ed81fab87864	2021-01-28 12:08:23 -08:00
David Blaikie	4318028cd2	DebugInfo: Add a DWARF FORM extension for addrx+offset references to reduce relocations This is an alternative to the use of complex DWARF expressions for addresses - shaving off a few extra bytes of expression overhead.	2021-01-28 10:20:02 -08:00
Shaurya Gupta	e29552c5af	Revert "[DWARF] Create subprogram's DIE in DISubprogram's unit" This reverts commit ef0dcb506300dc9644e8000c6028d14214be9d97. This change is causing a lot of compiler crashes inside, sorry I don't have a small repro/stacktrace with symbols to share right now. Differential Revision: https://reviews.llvm.org/D95622	2021-01-28 16:39:01 +00:00
Roman Lebedev	6617529a1d	[CodeGen][DwarfEHPrepare] Preserve Dominator Tree Now that D94827 has flipped the switch, and SimplifyCFG is officially marked as production-ready regarding Dominator Tree preservation, we can update this user pass to also preserve Dominator Tree. This is a geomean compile-time win of `-0.05%`..`-0.08%`. https://llvm-compile-time-tracker.com/compare.php?from=51a25846c198cff00abad0936f975167357afa6f&to=082499aac236a5c141e50a9e77870d5be2de5f0b&stat=instructions Differential Revision: https://reviews.llvm.org/D95548	2021-01-28 14:11:34 +03:00
Tomas Matheson	b9ed8ebe0e	[ARM][RegisterScavenging] Don't consider LR liveout if it is not reloaded https://bugs.llvm.org/show_bug.cgi?id=48232 When PrologEpilogInserter writes callee-saved registers to the stack, LR is not reloaded but is instead loaded directly into PC. This was not taken into account when determining if each callee-saved register was liveout for the block. When frame elimination inserts virtual registers, and the register scavenger tries to scavenge LR, it considers it liveout and tries to spill again. However there is no emergency spill slot to use, and it fails with an error: fatal error: error in backend: Error while trying to spill LR from class GPR: Cannot scavenge register without an emergency spill slot! This patch prevents any callee-saved registers which are not reloaded (including LR) from being marked liveout. They are therefore available to scavenge without requiring an extra spill.	2021-01-28 09:22:55 +00:00
Kazu Hirata	0da15ea581	[llvm] Use append_range (NFC)	2021-01-27 23:25:41 -08:00
David Blaikie	dd7297e1bf	DebugInfo: Fix bug in addr+offset exprloc to use DWARFv5 addrx op instead of DWARFv4 GNU extension	2021-01-27 18:39:44 -08:00
Roman Lebedev	7e88942d25	[CodeGen] IndirectBrExpandPass: preserve Dominator Tree, if available This fully de-pessimizes the common case of no indirectbr's, (where we don't actually need to do anything to preserve domtree) and avoids domtree recomputation in the case there were indirectbr's. Note that two indirectbr's could have a common successor, and not all successors of an indirectbr's are meant to survive the expansion. Though, the code assumes that an indirectbr's doesn't have duplicate successors, those should have been deduplicated by simplifycfg or something already.	2021-01-28 01:58:53 +03:00
David Blaikie	7e6c87ee04	DebugInfo: Deduplicate addresses in debug_addr Experimental, using non-existent DWARF support to use an expr for the location involving an addr_index (to compute address + offset so addresses can be reused in more places). The global variable debug info had to be deferred until the end of the module (so bss variables would all be emitted first - so their labels would have the relevant section). Non-bss variables seemed to not have their label assigned to a section even at the end of the module, so I didn't know what to do there. Also, the hashing code is broken - doesn't know how to hash these expressions (& isn't hashing anything inside subprograms, which seems problematic), so for test purposes this change just skips the hash computation. (GCC's actually overly sensitive in its hash function, it seems - I'm forgetting the specific case right now - anyway, we might want to just use the frontend-known file hash and give up on optimistic .dwo/.dwp reuse)	2021-01-27 14:00:43 -08:00
Craig Topper	0b50fa9945	[FaultsMaps][llvm-objdump] Move FaultMapParser to Object/. Remove CodeGen dependency from llvm-objdump FaultsMapParser lived in CodeGen and was forcing llvm-objdump to link CodeGen and everything CodeGen depends on. This was previously attempted in r240364 to fix a link failure. The CodeGen dependency was independently added to fix the same link failure, and that ended up being kept. Removing the dependency seems like the correct layering for llvm-objdump. Reviewed By: MaskRay, jhenderson Differential Revision: https://reviews.llvm.org/D95414	2021-01-27 10:39:59 -08:00
Simon Pilgrim	5ded5ab78f	ExecutionDomainFix.cpp - use const refs in for-range loops. NFCI. Avoid unnecessary copies. Reported by clang-tidy.	2021-01-27 15:39:32 +00:00
Roman Lebedev	51a25846c1	[CodeGen] SafeStack: preserve DominatorTree if it is available While this is mostly NFC right now, because only ARM happens to run this pass with DomTree available before it, and required after it, more backends will be affected once the SimplifyCFG's switch for domtree preservation is flipped, and DwarfEHPrepare also preserves the domtree.	2021-01-27 18:32:35 +03:00
Roman Lebedev	4de3bdd65f	[NFC] StackProtector: be consistent and initialize DominatorTreeWrapperPass We already ask for it, so it might be good to ensure that it is actually initialized before us. Doesn't seem to matter in practice though.	2021-01-27 18:32:35 +03:00
Jeremy Morse	ef0dcb5063	[DWARF] Create subprogram's DIE in DISubprogram's unit This is a fix for PR48790. Over in D70350, subprogram DIEs were permitted to be shared between CUs. However, the creation of a subprogram DIE can be triggered early, from other CUs. The subprogram definition is then created in one CU, and when the function is actually emitted children are attached to the subprogram that expect to be in another CU. This breaks internal CU references in the children. Fix this by redirecting the creation of subprogram DIEs in getOrCreateContextDIE to the CU specified by its DISubprogram definition. This ensures that the subprogram DIE is always created in the correct CU. Differential Revision: https://reviews.llvm.org/D94976	2021-01-27 12:36:14 +00:00
Sjoerd Meijer	48ecba350e	[MachineLICM][MachineSink] Move SinkIntoLoop to MachineSink. This moves SinkIntoLoop from MachineLICM to MachineSink. The motivation for this work is that hoisting is a canonicalisation transformation, but we do not really have a good story to sink instructions back if that is better, e.g. to reduce live-ranges, register pressure and spilling. This has been discussed a few times on the list, the latest thread is: https://lists.llvm.org/pipermail/llvm-dev/2020-December/147184.html There it was pointed out that we have the LoopSink IR pass, but that works on IR, lacks register pressure information, and is focused on profile guided optimisations, and then we have MachineLICM and MachineSink that both perform sinking. MachineLICM is more about hoisting and CSE'ing of hoisted instructions. It also contained a very incomplete and disabled-by-default SinkIntoLoop feature, which we now move to MachineSink. Getting loop-sinking to do something useful is going to be at least a 3-step approach: 1) This is just moving the code and is almost a NFC, but contains a bug fix. This uses helper function `isLoopInvariant` that was factored out in D94082 and added to MachineLoop. 2) A first functional change to make loop-sink a little bit less restrictive, which it really is at the moment, is the change in D94308. This lets it do more (alias) analysis using functions in MachineSink, making it a bit more powerful. Nothing changes much: still off by default. But it shows that MachineSink is a better home for this, and it starts using its functionality like `hasStoreBetween`, and in the next step we can use `isProfitableToSinkTo`. 3) This is going to be the interesting step: decision making when and how many instructions to sink. This will be driven by the register pressure, and deciding if reducing live-ranges and loop sinking will help in better performance. 4) Once we are happy with 3), this should be enabled by default, that should be the end goal of this exercise. Differential Revision: https://reviews.llvm.org/D93694	2021-01-27 10:49:56 +00:00
Jessica Paquette	f36007e811	[GlobalISel] Implement computeKnownBits for G_SEXT_INREG Just use the existing `Known.sextInReg` implementation. - Update KnownBitsTest.cpp. - Update combine-redundant-and.mir for a more concrete example. Differential Revision: https://reviews.llvm.org/D95484	2021-01-26 15:01:38 -08:00
Amara Emerson	cbed865e1e	[GlobalISel][IRTranslator] Ignore the llvm.experimental.noalias.scope.decl intrinsic. These don't generate any code.	2021-01-26 13:04:11 -08:00
Fangrui Song	34b60d8a56	Add -fbinutils-version= to gate ELF features on the specified binutils version There are two use cases. Assembler We have accrued some code gated on MCAsmInfo::useIntegratedAssembler(). Some features are supported by latest GNU as, but we have to use MCAsmInfo::useIntegratedAs() because the newer versions have not been widely adopted (e.g. SHF_LINK_ORDER 'o' and 'unique' linkage in 2.35, --compress-debug-sections= in 2.26). Linker We want to use features supported only by LLD or very new GNU ld, or don't want to work around older GNU ld. We currently can't represent that "we don't care about old GNU ld". You can find such workarounds in a few other places, e.g. Mips/MipsAsmprinter.cpp PowerPC/PPCTOCRegDeps.cpp X86/X86MCInstrLower.cpp AArch64 TLS workaround for R_AARCH64_TLSLD_MOVW_DTPREL_* (PR ld/18276), R_AARCH64_TLSLE_LDST8_TPREL_LO12 (https://bugs.llvm.org/show_bug.cgi?id=36727 https://sourceware.org/bugzilla/show_bug.cgi?id=22969) Mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER components (supported by LLD in D84001; GNU ld feature request https://sourceware.org/bugzilla/show_bug.cgi?id=16833 may take a while before available). This feature allows to garbage collect some unused sections (e.g. fragmented .gcc_except_table). This patch adds `-fbinutils-version=` to clang and `-binutils-version` to llc. It changes one codegen place in SHF_MERGE to demonstrate its usage. `-fbinutils-version=2.35` means the produced object file does not care about GNU ld<2.35 compatibility. When `-fno-integrated-as` is specified, the produced assembly can be consumed by GNU as>=2.35, but older versions may not work. `-fbinutils-version=none` means that we can use all ELF features, regardless of GNU as/ld support. Both clang and llc need `parseBinutilsVersion`. Such command line parsing is usually implemented in `llvm/lib/CodeGen/CommandFlags.cpp` (LLVMCodeGen), however, ClangCodeGen does not depend on LLVMCodeGen. So I add `parseBinutilsVersion` to `llvm/lib/Target/TargetMachine.cpp` (LLVMTarget). Differential Revision: https://reviews.llvm.org/D85474	2021-01-26 12:28:23 -08:00
Freddy Ye	b3b0acdc6f	[NFC] Refine some uninitialized used variables. These warnings are reported by static code analysis tool: Klocwork Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D95421	2021-01-26 16:51:05 +08:00
Amara Emerson	03bce0bf4e	[GlobalISel][Localizer] Don't localize phi operands which are used more than once in the phi. The current algorithm just tries to localize defs as far as they can go, and in the case of G_PHI operands, it clones the def into the predecessor block for each incoming edge. When multiple edges have the same register value, this can cause unnecessary code bloat, and inhibit later optimizations. This change checks if a given phi operand is unique in the phi, if not the def of that register is not localized to the predecessor. Differential Revision: https://reviews.llvm.org/D95406	2021-01-25 17:48:04 -08:00
Craig Topper	ea87cf2acd	[TargetLowering][RISCV] Don't transform (seteq/ne (sext_inreg X, VT), C1) -> (seteq/ne (zext_inreg X, VT), C1) if the sext_inreg is cheaper RISCV has to use 2 shifts for (i64 (zext_inreg X, i32)), but we can use addiw rd, rs1, x0 for sext_inreg. We already understood this when type legalizing i32 seteq/ne on rv64. But this transform in SimplifySetCC would sometimes undo it. Reviewed By: luismarques Differential Revision: https://reviews.llvm.org/D95289	2021-01-25 16:37:21 -08:00
David Blaikie	70e251497c	DebugInfo: Generalize the .debug_addr minimization flag to pave the way for including other strategies	2021-01-25 16:24:35 -08:00
Mitch Phillips	c9466ede7e	Revert "Revert "[GlobalISel] LegalizerHelper - Extract widenScalarAddoSubo method"" This reverts commit 554b3211fefd09b56b64357b9edd66c78ae200b5. Differential Revision: https://reviews.llvm.org/D95035	2021-01-25 16:22:22 -08:00
Cassie Jones	aa8f3677f7	Recommit "[AArch64][GlobalISel] Implement widenScalar for signed overflow" Implement widening for G_SADDO and G_SSUBO. Add legalize-add/sub tests for narrow overflowing add/sub on AArch64. Differential Revision: https://reviews.llvm.org/D95034	2021-01-25 16:57:20 -05:00
Fraser Cormack	fde2466171	[SelectionDAG] Support scalable-vector splats in more cases This patch adds support for scalable-vector splats in DAGCombiner's `isConstantOrConstantVector` and `ISD::matchUnaryPredicate` functions, which enable the SelectionDAG div/rem-by-constant optimizations for scalable vector types. It also fixes up one case where the UDIV optimization was generating a SETCC without first consulting the target for its preferred SETCC result type. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D94501	2021-01-25 10:58:15 +00:00
Fangrui Song	d745b82de1	[XRay] Support DW_TAG_call_site and delete unneeded PATCHABLE_EVENT_CALL/PATCHABLE_TYPED_EVENT_CALL lowering	2021-01-25 00:49:18 -08:00
Fangrui Song	d5bbaaaf95	[XRay] Make __xray_customevent support non-Linux	2021-01-25 00:48:21 -08:00
QingShan Zhang	ffc3e800c6	[NFC] [DAGCombine] Correct the result for sqrt even the iteration is zero For now, we correct the result for sqrt if iteration > 0. This doesn't make sense as they are not strictly related. Reviewed By: dmgreen, spatel, RKSimon Differential Revision: https://reviews.llvm.org/D94480	2021-01-25 04:02:44 +00:00
Chen Zheng	0ed4cf4bf3	[PowerPC] support register pressure reduction in machine combiner. Reassociating some patterns to generate more fma instructions to reduce register pressure. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D92071	2021-01-24 21:28:21 -05:00
Kazu Hirata	16baad8f4e	[llvm] Use pop_back_val (NFC)	2021-01-24 12:18:57 -08:00
Kazu Hirata	d44ca0cf2f	[CodeGen] Forward-declare TargetMachine (NFC) InstrEmitter.h needs TargetMachine but relies on a forward declaration of TargetMachine in MachineOperand.h. This patch adds a forward declaration right in InstrEmitter.h. While we are at it, this patch removes the one in MachineOperand.h, where it is unnecessary.	2021-01-24 12:18:54 -08:00

32115 Commits