llvm-project

Author	SHA1	Message	Date
serge-sans-paille	a494ae43be	Cleanup includes: TransformsUtils Estimation on the impact on preprocessor output: before: 1065307662 after: 1064800684 Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup Differential Revision: https://reviews.llvm.org/D120741	2022-03-01 21:00:07 +01:00
Craig Topper	bf8054644d	[DAGCombiner] Don't expand (neg (abs x)) if the abs has an additional user. If the types aren't legal, the expansions may get type legalized in a different way preventing code sharing. If the type is legal, we will share some instructions between the two expansions, but we will need an extra register. Since we don't appear to fold (neg (sub A, B)) if the sub has an additional user, I think it makes sense not to expand NABS. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120513	2022-03-01 07:32:07 -08:00
Jeremy Morse	ab49dce01f	[DebugInfo][InstrRef][NFC] Use unique_ptr instead of raw pointers InstrRefBasedLDV allocates some big tables of ValueIDNum, to store live-in and live-out block values in, that then get passed around as pointers everywhere. This patch wraps the allocation in a std::unique_ptr, names some types based on unique_ptr, and passes references to those around instead. There's no functional change, but it makes it clearer to the reader that references to these tables are borrowed rather than owned, and we get some extra validity assertions too. Differential Revision: https://reviews.llvm.org/D118774	2022-03-01 12:49:50 +00:00
Sam Parker	20d75059a2	Revert "[TypePromotion] Avoid some unnecessary truncs" This reverts commit 281d29b8fed369c6ebe33ed85c518fc1ed81f44a. Report of a miscompilation and awaiting a reproducer.	2022-03-01 08:59:52 +00:00
Phoebe Wang	e03d216c28	[X86] Use bit test instructions to optimize some logic atomic operations This is to match GCC's optimizations: https://gcc.godbolt.org/z/3odh9e7WE Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D120199	2022-03-01 09:57:08 +08:00
Sanjay Patel	69684b84c6	[SDAG] fold (rotate X) eq/ne (0/-1) This is the SDAG equivalent of an instcombine transform added with: fd807601a78 This is another step towards solving #49541 and part of an alternative set of more general transforms than what is proposed in D111530. https://alive2.llvm.org/ce/z/ToxaE8	2022-02-27 11:31:19 -05:00
Sanjay Patel	acb96ffd14	[SDAG] fold bitwise logic with shifted operands LOGIC (LOGIC (SH X0, Y), Z), (SH X1, Y) --> LOGIC (SH (LOGIC X0, X1), Y), Z https://alive2.llvm.org/ce/z/QmR9rR This is a reassociation + factoring fold. The common shift operation is moved after a bitwise logic op on 2 input operands. We get simpler cases of these patterns in IR, but I suspect we would miss all of these exact tests in IR too. We also handle the simpler form of this plus several other folds in DAGCombiner::hoistLogicOpWithSameOpcodeHands(). This is a partial implementation of a transform suggested in D111530 (only handles 'or' bitwise logic as a first step - need to stamp out more tests for other opcodes). Several of the same tests added for D111530 are altered here (but not fully optimized). I'm not sure yet if this would help/hinder that patch, but this should be an improvement for all tests added with ecf606cb4329ae since it removes a shift operation in those examples. Differential Revision: https://reviews.llvm.org/D120516	2022-02-27 09:54:12 -05:00
Simon Pilgrim	fadd20f80d	[DAG] Ensure type is legal for bswap(shl(x,c)) -> zext(bswap(trunc(shl(x,c-bw/2)))) fold As reported on D120192	2022-02-27 11:25:22 +00:00
Benjamin Kramer	1de11fe360	Use RegisterInfo::regsOverlaps instead of checking aliases This is both less code and faster since it doesn't have to expand all the sub & superreg sets. NFCI.	2022-02-26 20:32:12 +01:00
Jameson Nash	c4b1a63a1b	mark getTargetTransformInfo and getTargetIRAnalysis as const Seems like this can be const, since Passes shouldn't modify it. Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D120518	2022-02-25 14:30:44 -05:00
Rong Xu	ccbbb4f6c7	[Sample-PGO] Emit FS discriminators only when -fdebug-info-for-profiling is set IR level addDiscriminator pass is guarded by DebugInfoForProfiling (set by option -fdebug-info-for-profiling). This patch syncs the logic for the MIR and IR level implementations. Differential Revision: https://reviews.llvm.org/D120536	2022-02-25 09:41:17 -08:00
Nikita Popov	87ebd9a36f	[IR] Use CallBase::getParamElementType() (NFC) As this method now exists on CallBase, use it rather than the one on AttributeList.	2022-02-25 10:01:58 +01:00
Rahman Lavaee	aeec9671fb	Revert "Encode address offsets of basic blocks relative to the end of the previous basic blocks." This reverts commit 029283c1c0d8d06fbf000f5682c56b8595a1101f. The code in `ELFFile::decodeBBAddrMap` was not changed in the submitted patch. Differential Revision: https://reviews.llvm.org/D120457	2022-02-24 13:31:15 -08:00
Simon Pilgrim	370ebc9d9a	[DAG] Attempt to fold bswap(shl(x,c)) -> zext(bswap(trunc(shl(x,c-bw/2)))) If the shl is at least half the bitwidth (i.e. the lower half of the bswap source is zero), then we can reduce the shift and perform the bswap at half the bitwidth and just zero extend. Based off PR51391 + PR53867 Differential Revision: https://reviews.llvm.org/D120192	2022-02-24 19:33:51 +00:00
Sanjay Patel	4a3708cd6b	[SDAG] remove shift that is redundant with part of funnel shift This is the SDAG translation of D120253 : https://alive2.llvm.org/ce/z/qHpmNn The SDAG nodes can have different operand types than the result value. We can see an example of that with AArch64 - the funnel shift amount is an i64 rather than i32. We may need to make that match even more flexible to handle post-legalization nodes, but I have not stepped into that yet. Differential Revision: https://reviews.llvm.org/D120264	2022-02-24 11:25:46 -05:00
Jay Foad	719bac55df	[MIRParser] Diagnose too large align values in MachineMemOperands When parsing MachineMemOperands, MIRParser treated the "align" keyword the same as "basealign". Really "basealign" should specify the alignment of the MachinePointerInfo base value, and "align" should specify the alignment of that base value plus the offset. This worked OK when the specified alignment was no larger than the alignment of the offset, but in cases like this it just caused confusion: STW killed %18, 4, %stack.1.ap2.i.i :: (store (s32) into %stack.1.ap2.i.i + 4, align 8) MIRPrinter would never have printed this, with an offset of 4 but an align of 8, so it must have been written by hand. MIRParser would interpret "align 8" as "basealign 8", but I think it is better to give an error and force the user to write "basealign 8" if that is what they really meant. Differential Revision: https://reviews.llvm.org/D120400 Change-Id: I7eeeefc55c2df3554ba8d89f8809a2f45ada32d8	2022-02-24 15:32:08 +00:00
Matthias Braun	6a383369f9	PGOInstrumentation, GCOVProfiling: Split indirectbr critical edges regardless of PHIs The `SplitIndirectBrCriticalEdges` function was originally designed for `CodeGenPrepare` and skipped splitting of edges when the destination block didn't contain any `PHI` instructions. This only makes sense when reducing COPYs like `CodeGenPrepare`. In the case of `PGOInstrumentation` or `GCOVProfiling` it would result in missed counters and wrong result in functions with computed goto. Differential Revision: https://reviews.llvm.org/D120096	2022-02-23 16:27:37 -08:00
Craig Topper	c7d6448d03	[DAGCombiner][TargetLowering] Pass SDValue by value to isMulAddWithConstProfitable. Internally to DAGCombiner the SDValues were passed by non-const reference despite not being modified. They were then passed by const reference to TLI. This patch passes them by value which is consistent with the vast majority of code. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120420	2022-02-23 12:40:45 -08:00
Paweł Bylica	afdaa86b77	[DAGCombine] Extend combineCarryDiamond() In combineCarryDiamond() use getAsCarry() to find more candidates for being a carry flag. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D118362	2022-02-23 21:37:49 +01:00
Jessica Paquette	68c718c8f4	Revert "[MachineOutliner][AArch64] NFC: Split MBBs into "outlinable ranges"" This reverts commit d97f997eb79d91b2872ac13619f49cb3a7120781. This commit was not NFC. (See: https://reviews.llvm.org/rGd97f997eb79d91b2872ac13619f49cb3a7120781)	2022-02-23 10:35:52 -08:00
Sanjay Patel	21d7c3bcc6	[DAG] try to convert multiply to shift via demanded bits This is a fix for a regression discussed in: https://github.com/llvm/llvm-project/issues/53829 We cleared more high multiplier bits with 995d400, but that can lead to worse codegen because we would fail to recognize the now disguised multiplication by neg-power-of-2 as a shift-left. The problem exists independently of the IR change in the case that the multiply already had cleared high bits. We also convert shl+sub into mul+add in instcombine's negator. This patch fills in the high-bits to see the shift transform opportunity. Alive2 attempt to show correctness: https://alive2.llvm.org/ce/z/GgSKVX The AArch64, RISCV, and MIPS diffs look like clear wins. The x86 code requires an extra move register in the minimal examples, but it's still an improvement to get rid of the multiply on all CPUs that I am aware of (because multiply is never as fast as a shift). There's a potential follow-up noted by the TODO comment. We should already convert that pattern into shl+add in IR, so it's probably not common: https://alive2.llvm.org/ce/z/7QY_Ga Fixes #53829 Differential Revision: https://reviews.llvm.org/D120216	2022-02-23 12:09:32 -05:00
Rainer Orth	365be7ac72	[MC][ELF] Use SHF_SUNW_NODISCARD instead of SHF_GNU_RETAIN on Solaris As requested in D107955 <https://reviews.llvm.org/D107955>, this patch splits off the `MC` and `CodeGen` parts and adds a testcase. Tested on `sparcv9-sun-solaris2.11`, `amd64-pc-solaris2.11`, and `x86_64-pc-linux-gnu`. Differential Revision: https://reviews.llvm.org/D120318	2022-02-23 15:43:12 +01:00
Bill Wendling	a5bbc6ef99	[NFC] Remove unnecessary "#include"s from header files	2022-02-23 01:20:48 -08:00
Rahman Lavaee	029283c1c0	Encode address offsets of basic blocks relative to the end of the previous basic blocks. Conceptually, the new encoding emits the offsets and sizes as label differences between each two consecutive basic block begin and end label. When decoding, the offsets must be aggregated along with basic block sizes to calculate the final relative-to-function offsets of basic blocks. This encoding uses smaller values compared to the existing one (offsets relative to function symbol). Smaller values tend to occupy fewer bytes in ULEB128 encoding. As a result, we get about 25% reduction in the size of the bb-address-map section (reduction from about 9MB to 7MB). Reviewed By: tmsriram, jhenderson Differential Revision: https://reviews.llvm.org/D106421	2022-02-22 15:46:46 -08:00
Jay Foad	b47e2dc91f	[StableHashing] Hash machine basic blocks and functions This adds very basic support for hashing MachineBasicBlock and MachineFunction, for use in MachineFunctionPass to detect passes that modify the MachineFunction wrongly. Differential Revision: https://reviews.llvm.org/D120122	2022-02-22 17:38:47 +00:00
Joseph Huber	456ffd7a22	[OpenMP] Ensure offloading sections do not have SHF_ALLOC flag We use offloading sections in the new Clang driver scheme to embed device code into the host. We later use these sections to link the device image, after which point they are completely unused and should not be loaded into memory if they are still in the executable. Reviewed By: JonChesterfield Differential Revision: https://reviews.llvm.org/D120275	2022-02-21 21:35:17 -05:00
Jessica Paquette	d97f997eb7	[MachineOutliner][AArch64] NFC: Split MBBs into "outlinable ranges" We found a case in the Swift benchmarks where the MachineOutliner introduces about a 20% compile time overhead in comparison to building without the MachineOutliner. The origin of this slowdown is that the benchmark has long blocks which incur lots of LRU checks for lots of candidates. Imagine a case like this: ``` bb: i1 i2 i3 ... i123456 ``` Now imagine that all of the outlining candidates appear early in the block, and that something like, say, NZCV is defined at the end of the block. The outliner has to check liveness for certain registers across all candidates, because outlining from areas where those registers are used is unsafe at call boundaries. This is fairly wasteful because in the previously-described case, the outlining candidates will never appear in an area where those registers are live. To avoid this, precalculate areas where we will consider outlining from. Anything outside of these areas is mapped to illegal and not included in the outlining search space. This allows us to reduce the size of the outliner's suffix tree as well, giving us a potential memory win. By precalculating areas, we can also optimize other checks too, like whether or not LR is live across an outlining candidate. Doing all of this is about a 16% compile time improvement on the case. This is likely useful for other targets (e.g. ARM + RISCV) as well, but for now, this only implements the AArch64 path. The original "is the MBB safe" method still works as before.	2022-02-21 15:29:16 -08:00
Paweł Bylica	df0c16ce00	[NFC][DAGCombine] Use isOperandOf() in combineCarryDiamond Pre-commit for https://reviews.llvm.org/D118362.	2022-02-21 21:41:31 +01:00
Matt Arsenault	9c7ca51b2c	MIR: Start diagnosing too many operands on an instruction Previously this would just assert which was annoying and didn't point to the specific instruction/operand.	2022-02-21 10:36:39 -05:00
Simon Pilgrim	46f1e8359e	[DAG] visitBSWAP - pull out repeated SDLoc. NFC Cleanup for D120192	2022-02-21 13:08:01 +00:00
Jay Foad	9a547e7009	[StableHashing] Hash vregs with multiple defs This allows stableHashValue to be used on Machine IR that is not in SSA form. Differential Revision: https://reviews.llvm.org/D120121	2022-02-21 10:26:34 +00:00
Craig Topper	440c4b705a	[SelectionDAG][RISCV][ARM][PowerPC][X86][WebAssembly] Change default abs expansion to use sra (X, size(X)-1); sub (xor (X, Y), Y). Previously we used sra (X, size(X)-1); xor (add (X, Y), Y). By placing sub at the end, we allow RISCV to combine sign_extend_inreg with it to form subw. Some X86 tests for Z - abs(X) seem to have improved as well. Other targets look to be a wash. I had to modify ARM's abs matching code to match from sub instead of xor. Maybe instead ISD::ABS should be made legal. I'll try that in parallel to this patch. This is an alternative to D119099 which was focused on RISCV only. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D119171	2022-02-20 21:11:23 -08:00
Chen Zheng	efe5b8ad90	[ISEL] remove unnecessary getNode(); NFC Reviewed By: RKSimon, craig.topper Differential Revision: https://reviews.llvm.org/D120049	2022-02-20 21:08:49 -05:00
Luo, Yuanke	67ef63138b	[SDAG] enable binop identity constant folds for sub This patch extracts the sub folding from D119654 and leaves only add folding in that patch. Differential Revision: https://reviews.llvm.org/D120116	2022-02-21 09:37:36 +08:00
David Blaikie	323c672789	DebugInfo: Add an assert about cross-unit references in dwo units This is helping me debug some issues with simplified template names	2022-02-20 14:53:17 -08:00
Amara Emerson	b09e63bad1	[AArch64][GlobalISel] Implement combines for boolean G_SELECT->bitwise ops. Differential Revision: https://reviews.llvm.org/D117160	2022-02-20 00:53:09 -08:00
Craig Topper	24bfa24355	[SelectionDAGBuilder] Simplify visitShift. NFC This code was detecting whether the value returned by getShiftAmountTy can represent all shift amounts. If not, it would use MVT::i32 as a placeholder. getShiftAmountTy was updated last year to return i32 if the type returned by the target couldn't represent all values. This means the MVT::i32 case here is dead and the logic can be simplified. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D120164	2022-02-19 12:40:59 -08:00
Craig Topper	1df8efae56	[SelectionDAG][X86] Support f16 in getReciprocalOpName. If the "reciprocal-estimates" attribute is present and it doesn't contain "all", "none", or "default", we previously crashed on f16 operations. This patch adds an 'h' suffix to prevent the crash. I've added simple tests that just enable the estimate for all vec-sqrt and one test case that explicitly tests the new 'h' suffix to override the default steps. There may be some frontend change needed too, but I haven't checked that yet. Reviewed By: pengfei Differential Revision: https://reviews.llvm.org/D120158	2022-02-18 21:55:49 -08:00
Craig Topper	8e7247a377	[SelectionDAG] Fix off by one error in range check in DAGTypeLegalizer::ExpandShiftByConstant. The code was considering shifts by an amount larger than the number of bits in the original VT to be out of range. Shifts exactly equal to the original bit width are also out of range. I don't know how to test this. DAGCombiner should usually fold this away. I just noticed while looking for something else in this code. The llvm-cov report shows that we don't have coverage for out of range shifts here. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D120170	2022-02-18 18:42:20 -08:00
Craig Topper	0d59a54cea	Revert "[SelectionDAG][X86] Support f16 in getReciprocalOpName." This reverts commit 86b5e256628ae49193ad9962626a73bafeda2883. This wasn't supposed to be committed yet	2022-02-18 15:39:50 -08:00
Craig Topper	04f815c26f	[SelectionDAGBuilder] Remove LegalTypes=false from a call to getShiftAmountConstant. getShiftAmountTy will return MVT::i32 if the shift amount coming from the target's getScalarShiftAmountTy can't represent all possible values. That should eliminate the need to use the pointer type which is what we do when LegalTypes is false. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D120165	2022-02-18 15:36:35 -08:00
Craig Topper	86b5e25662	[SelectionDAG][X86] Support f16 in getReciprocalOpName. If the "reciprocal-estimates" attribute is present and it doesn't contain "all", "none", or "default", we previously crashed on f16 operations. This patch adds an 'h' suffix to prevent the crash. I've added simple tests that just enable the estimate for all vec-sqrt and one test case that explicitly tests the new 'h' suffix to override the default steps. There may be some frontend change needed too, but I haven't checked that yet. Differential Revision: https://reviews.llvm.org/D120158	2022-02-18 15:36:35 -08:00
Sanjay Patel	a2963d871e	[SDAG] fold sub-of-shift to add-of-shift This fold is done in IR: https://alive2.llvm.org/ce/z/jWyFrP There is an x86 test that shows an improvement from the added flexibility of using add (commutative). The other diffs are presumed neutral. Note that this could also be folded to an 'xor', but I'm not sure if that would be universally better (eg, x86 can convert adds more easily into LEA). This helps prevent regressions from a potential fold for issue #53829.	2022-02-18 11:55:50 -05:00
Jay Foad	074d1e2536	[CodeGen] Return better Changed status from PostRAHazardRecognizer Differential Revision: https://reviews.llvm.org/D119954	2022-02-18 09:46:24 +00:00
Jessica Paquette	12389e3758	[MachineOutliner] Add statistics for unsigned vector size Useful for debugging + evaluating improvements to the outliner. Stats are the number of illegal, legal, and invisible instructions in the unsigned vector, and its total length.	2022-02-17 18:25:51 -08:00
Heejin Ahn	4f9b839772	[WebAssembly] Make EH/SjLj vars unconditionally thread local This makes three thread local variables (`__THREW__`, `__threwValue`, and `__wasm_lpad_context`) unconditionally thread local. If the target doesn't support TLS, they will be downgraded to normal variables in `stripThreadLocals`. This makes the object not linkable with other objects using shared memory, which is what we intend here; these variables should be thread local when used with shared memory. This is what we initially tried in D88262. But D88323 changed this: It only created these variables when threads were supported, because `__THREW__` and `__threwValue` were always generated even if Emscripten EH/SjLj was not used, making all objects built without threads not linkable with shared memory, which was too restrictive. But sometimes this is not safe. If we build an object using variables such as `__THREW__` without threads, it can be linked to other objects using shared memory, because the original object's `__THREW__` was not created thread local to begin with. So this CL basically reverts D88323 with some additional improvements: - This checks each of the functions and global variables created within `LowerEmscriptenEHSjLj` pass and removes it if it's not used at the end of the pass. So only modules using those variables will be affected. - Moves `CoalesceFeaturesAndStripAtomics` and `AtomicExpand` passes after all other IR passes that can create thread local variables. It is not sufficient to move them to the end of `addIRPasses`, because `__wasm_lpad_context` is created in `WasmEHPrepare`, which runs inside `addPassesToHandleExceptions`, which runs before `addISelPrepare`. So we override `addISelPrepare` and move atomic/TLS stripping and expanding passes there. This also merges `TLS` and `NO-TLS` FileCheck lines into one `CHECK` line, because in the bitcode level we always create them as thread local. Also some function declarations are deleted from `CHECK` lines because they are unused. Reviewed By: tlively, sbc100 Differential Revision: https://reviews.llvm.org/D120013	2022-02-17 16:04:18 -08:00
Matt Arsenault	c46aab01c0	RegAllocGreedy: Fix last chance recolor assert in impossible case This example is not compilable without handling eviction of specific subregisters. Last chance recoloring was deciding it could try evicting an overlapping superregister, which doesn't help make any progress. The LiveIntervalUnion would then assert due to an overlapping / identical range when trying the new assignment. Unfortunately this is also producing a verifier error after the allocation fails. I've seen a number of these, and not sure if we should just start deleting the function on error rather than trying to figure out how to put together valid MIR. I'm not super confident this is the right place to fix this. I also have a number of failing testcases I need to fix by handling partial evictions of superregisters.	2022-02-17 18:30:56 -05:00
Paul Walker	6457f42bde	[DAGCombiner] Extend ISD::ABDS/U combine to handle more cases. The current ABD combine doesn't quite work for SVE because only a single scalable vector per scalar integer type is legal (e.g. for i32, <vscale x 4 x i32> is the only legal scalable vector type). This patch extends the combine to also trigger for the cases when operand extension must be retained. Differential Revision: https://reviews.llvm.org/D115739	2022-02-17 13:32:20 +00:00
Bjorn Pettersson	1a8bdf95a3	[DAG] Fix in ReplaceAllUsesOfValuesWith When doing SelectionDAG::ReplaceAllUsesOfValuesWith a worklist is prepared containing all users that should be updated. Then we use the RemoveNodeFromCSEMaps/AddModifiedNodeToCSEMaps helpers to handle recursive CSE updates while doing the replacements. This patch aims at solving a problem that could arise if the recursive CSE updates would result in an SDNode present in the worklist being removed as a side effect of morphing a prior user in the worklist. To exemplify such a scenario, imagine that we have these nodes in the DAG t12: i64 = add t8, t11 t13: i64 = add t12, t8 t14: i64 = add t11, t11 t15: i64 = add t14, t8 t16: i64 = sub t13, t15 and that the t8 uses should be replaced by t11. An initial worklist (listing the users that should be morphed) could be [t12, t13, t15]. When updating t12 we get t12: i64 = add t11, t11 which results in a CSE update that replaces t14 by t12, so we get t15: i64 = add t12, t8 which results in a CSE update that replaces t13 by t12, so we get t16: i64 = sub t12, t15 and then t13 is removed given that it was the last use of t13. So when being done with the updates triggered by rewriting the use of t8 in t12 the t13 node no longer exists. And we used to end up hitting an assertion when continuing with the worklist aiming at replacing the t8 uses in t13. The solution is based on using a DAGUpdateListener, making sure that we prune a user from the worklist if it is removed during the recursive CSE updates. The bug was found using an OOT target. I think the problem is quite old, even if the particular intree target reproducer added in this patch seems to pass when using LLVM 13.0.0. Differential Revision: https://reviews.llvm.org/D119088	2022-02-17 14:29:59 +01:00
Jay Foad	50ddb5d2d1	[CodeGen] Return better Changed status from LocalStackSlotAllocation Differential Revision: https://reviews.llvm.org/D119942	2022-02-17 09:31:41 +00:00
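
Several of the folds described in the messages above are plain bit-level identities, so they can be spot-checked outside the compiler. The following is a minimal sketch, not LLVM code: it assumes a 32-bit integer width, and every helper name (`abs_sub_form`, `bswap_shl_narrow`, `or_shift_after`, and so on) is hypothetical and chosen only for illustration. It checks the two abs expansions from 440c4b705a, the bswap(shl(x,c)) narrowing from 370ebc9d9a for shift amounts of at least half the bit width, and the 'or' case of the shifted-operand fold from acb96ffd14 with shl as the shift.

```python
import random

BITS = 32  # assumed width; the commits apply to other integer types too
MASK = (1 << BITS) - 1


def to_signed(x, bits=BITS):
    """Interpret the low `bits` bits of x as a two's-complement integer."""
    x &= (1 << bits) - 1
    return x - (1 << bits) if x >> (bits - 1) else x


def sra(x, amount, bits=BITS):
    """Arithmetic shift right on a `bits`-wide bit pattern."""
    return (to_signed(x, bits) >> amount) & ((1 << bits) - 1)


def abs_sub_form(x):
    """Y = sra(X, size(X)-1); sub(xor(X, Y), Y)."""
    y = sra(x, BITS - 1)
    return ((x ^ y) - y) & MASK


def abs_xor_form(x):
    """Y = sra(X, size(X)-1); xor(add(X, Y), Y)."""
    y = sra(x, BITS - 1)
    return ((x + y) & MASK) ^ y


def bswap(x, bits):
    """Reverse the byte order of a `bits`-wide value."""
    return int.from_bytes(x.to_bytes(bits // 8, "little"), "big")


def bswap_shl_wide(x, c):
    """bswap(shl(x, c)) computed at full width."""
    return bswap((x << c) & MASK, BITS)


def bswap_shl_narrow(x, c):
    """zext(bswap(trunc(shl(x, c - bw/2)))); only meaningful for c >= bw/2."""
    half = BITS // 2
    truncated = (x << (c - half)) & ((1 << half) - 1)
    return bswap(truncated, half)  # a non-negative int is already zero-extended


def or_shift_before(x0, x1, y, z):
    """or(or(shl(X0, Y), Z), shl(X1, Y))."""
    return (((x0 << y) & MASK) | z) | ((x1 << y) & MASK)


def or_shift_after(x0, x1, y, z):
    """or(shl(or(X0, X1), Y), Z)."""
    return (((x0 | x1) << y) & MASK) | z


def check(trials=2000, seed=0):
    """Compare each pair of forms on edge cases and random bit patterns."""
    rng = random.Random(seed)
    samples = [0, 1, MASK, 1 << (BITS - 1), (1 << (BITS - 1)) - 1]
    samples += [rng.getrandbits(BITS) for _ in range(trials)]
    for x in samples:
        expected_abs = abs(to_signed(x)) & MASK
        assert abs_sub_form(x) == expected_abs
        assert abs_xor_form(x) == expected_abs
        for c in range(BITS // 2, BITS):
            assert bswap_shl_wide(x, c) == bswap_shl_narrow(x, c)
        x1, z = rng.getrandbits(BITS), rng.getrandbits(BITS)
        y = rng.randrange(BITS)
        assert or_shift_before(x, x1, y, z) == or_shift_after(x, x1, y, z)
    return True


if __name__ == "__main__":
    print("all identities hold:", check())
```

Random testing of this kind only builds confidence; the Alive2 links cited in the individual commit messages remain the authoritative correctness arguments for the folds.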

32115 Commits