llvm-project

Author	SHA1	Message	Date
Jun Ma	00eef4f7c3	[SelectionDAG] Fix mismatched truncate when combine BUILD_VECTOR with EXTRACT_SUBVECTOR Just use correct type for truncation. Fixes PR59625 Differential Revision: https://reviews.llvm.org/D145757	2023-03-13 08:59:52 +08:00
Simon Pilgrim	82dc04befd	[DAG] visitZERO_EXTEND - pull out the repeated SDLoc(N) variables	2023-03-12 15:18:46 +00:00
Simon Pilgrim	4d7da0e711	[DAG] Cleanup the (zext (shl (zext x), cst)) -> (shl (zext x), cst) fold. NFC. Preliminary cleanup before adding some additional legality and value tracking handling.	2023-03-12 15:01:33 +00:00
Simon Pilgrim	b53ea2b9c5	[DAG] visitAND - fold (and (any_ext V), c) -> (zero_ext (and (trunc V), c)) if profitable. Try to more aggressively narrow masks of extended values. This is mainly for cases where the mask is trying to zero out any_extended upper bits, assuming we can zext/trunc the values for free. This catches a few actual missed folds, as well as helps canonicalize a number of other cases which were being caught in isel etc. Differential Revision: https://reviews.llvm.org/D145866	2023-03-12 13:25:23 +00:00
Simon Pilgrim	fad852efe4	[DAG] combineShiftAnd1ToBitTest - improve support for peeking through truncations Allows us to handle shift amounts that exceed the original bitwidth	2023-03-11 16:37:47 +00:00
Yuanfang Chen	9aae408d55	[NFC] fix typo `funciton` -> `function` credits to @jmagee	2023-03-10 18:05:25 -08:00
Tim Northover	5c18444289	MachO: support custom section names on global variables These attributes have been accepted in ELF for a while, and are generated by Clang in some places, so it makes sense to support them on MachO too. https://reviews.llvm.org/D143173	2023-03-10 18:23:25 +00:00
Sameer Sahasrabuddhe	fd98416d37	[llvm][Uniformity] consistently handle always-uniform instructions An instruction that is "always uniform" is so even if it occurs in an irreducible cycle. The output produced by such an instruction may depend on the implementation defined cycle hierarchy, but that does not affect the uniformity of the output. In other words, an "always uniform" instruction is uniform even if it is not m-converged. Reviewed By: ruiling, ronlieb Differential Revision: https://reviews.llvm.org/D145572	2023-03-10 14:23:40 +05:30
Rong Xu	ebe09e2a95	[FSAFDO] Improve FS discriminator encoding This change improves FS discriminators in the following ways: (1) use call-stack debug information to generate discriminators: the same (src/line) DILs can now have the same discriminator value if they come from different call-stacks. This effectively increases the usable discriminator values for each round of FS discriminator pass. (2) don't generate the FS discriminator for meta instructions (i.e. instructions not emitted). This reduces the number of discriminator conflicts (for the case we run out of discriminator bits for that pass). (3) use less expensive hashing of xxHash64. These improvements should bring better performance for FSAFDO and they should be used by default. But this change creates incompatible FS discriminators. For the iterative profile users, they might see a performance drop in the first release with this change (due to the fact that the profiles have the old discriminators and the compiler uses the new discriminator). We have measured that this is not more than 1.5% on several benchmarks. Note the degradation should be gone in the second release and one should expect a performance gain over the binary without this change. One possible solution to the iterative profile issue would be separating discriminators for profile-use and the ones emitted to the binary. This would require a mechanism to allow two sets of discriminators to be maintained and then phasing out the first approach. This is too much churn in the compiler and the performance implications do not seem to be worth the effort. Instead, we put the changes under an option so iterative profile users can do a gradual rollout of this change. We will change the option's default value to true in a later patch and eventually purge this option from the code base. Differential Revision: https://reviews.llvm.org/D145171	2023-03-09 23:18:48 -08:00
Yeting Kuo	b2c48559c8	[IR][DAG][RISCV] Allow scalable vector ISD::STRICT_FP_EXTEND and RISC-V supports for vector ISD::STRICT_FP_EXTEND. The patch mainly does two things. The first is allowing scalable vector ISD::STRICT_FP_EXTEND. The second is making RISC-V customized lower strict_fpextend to riscv_strict_fpextend_vl, the strict version of riscv_fpextend_vl. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D145548	2023-03-09 17:37:59 +08:00
Felipe de Azevedo Piovezan	c0967995d2	[CodeGen] Prevent nullptr deref in genAlternativeCodeSequence A pointer dereference was added (D141302) above an assert that checks whether the pointer is null. This commit moves the assert above the dereference and transforms it into an llvm_unreachable to better express the intent that certain switch cases should never be reached. Differential Revision: https://reviews.llvm.org/D145599	2023-03-08 13:41:32 -05:00
Juneyoung Lee	a66bc1c4a3	[DAGCombiner] Avoid converting (x or/xor const) + y to (x + y) + const if benefit is unclear This patch resolves suboptimal code generation reported by https://github.com/llvm/llvm-project/issues/60571 . DAGCombiner currently converts `(x or/xor const) + y` to `(x + y) + const` if this is valid. However, if `.. + const` is broken down into a sequence of adds with carries, the benefit is not clear, introducing two more add(-with-carry) ops (total 6) in the case of the reported issue whereas the optimal sequence must only have 4 add(-with-carry)s. This patch resolves this issue by allowing this conversion only when (1) `.. + const` is legal or promotable, or (2) `const` is a sign bit because it does not introduce more adds. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D144116	2023-03-08 18:13:57 +00:00
Paul Walker	adbdf273ef	[CodeGenPrepare] Stop llvm.vscale() -> getelementptr(null, 1) transformation. I've pulled this change from D145404 to land in isolation because I'm concerned the code might be more important than the test coverage might suggest (NOTE: the code has no test coverage).	2023-03-08 15:47:03 +00:00
Xiang1 Zhang	eed31bbb37	[NFC] Remove dead code in ExtAddrMode::print flagged by the Coverity tool	2023-03-08 15:01:28 +08:00
Chen Zheng	fc26ab36a2	[DAGCombiner] don't use the pointer info for widen store The merged store touches memory for other underlying objects, so mapping the merged store to the first underlying object is not correct. For example in https://github.com/llvm/llvm-project/issues/60744, the merged store is not correctly analyzed as dependent with memory operations which are also part of the merged store. Fixes #60744 Reviewed By: foad Differential Revision: https://reviews.llvm.org/D144711	2023-03-07 20:31:09 -05:00
Nikita Popov	ffe8f47d72	[IR] Add operator<< overload for CmpInst::Predicate (NFC) I regularly try and fail to use this while debugging.	2023-03-07 15:10:56 +01:00
Jay Foad	0265dd9925	Fix "compatiable" typos	2023-03-07 12:57:39 +00:00
Noah Goldstein	c1ecd0a3f4	[DAGCombiner] Add fold for `~x + x` -> `-1` This is generally done by the InstCombine, but can be emitted as an intermediate step and is cheap to handle. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D145177	2023-03-06 20:30:27 -06:00
Noah Goldstein	d4b24b4a55	[DAGCombiner] Add fold for `~x & x` -> `0` This is generally done by the InstCombine, but can be emitted as an intermediate step and is cheap to handle. Differential Revision: https://reviews.llvm.org/D145143	2023-03-06 20:30:20 -06:00
Marco Elver	bdb4353ae0	[SelectionDAG] Optimize copyExtraInfo deep copy It turns out that there are relatively trivial, albeit rare, cases that require a MaxDepth of more than 16 (see added test). However, we want to avoid having to rely on a large fixed MaxDepth. Since these cases are relatively rare, apply the following strategy: 1. Start with a low MaxDepth of 16 - if the entry node was not reached, we can return (the common case). 2. If the entry node was reached, exponentially increase MaxDepth up to some large limit that should cover all cases and guard against stack exhaustion. This retains the better performance with a low MaxDepth in the common case, and in complex cases backs off and retries. On the whole, this is preferable to starting with a large MaxDepth, which would unnecessarily penalize the common case where a low MaxDepth is sufficient. Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D145386	2023-03-06 17:29:53 +01:00
Caroline Concatto	204800ad0a	[IR][Legalization] Promote illegal deinterleave and interleave vectors To make legalization easier, the operands and outputs have the same size for these ISD Nodes. When legalizing the results in PromoteIntegerResult the operands are legalized to the same size as the outputs. The ISD Node has two output/results, therefore the legalizing functions update both results/outputs. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D144846	2023-03-03 10:54:52 +00:00
Craig Topper	01487f384a	[TypePromotion] Dereference pointer before printing it in a debug message. Without dereferencing it just prints the value of the pointer which isn't meaningful. Dereferencing prints the operand.	2023-03-02 23:43:36 -08:00
Marco Elver	7ecd2a23f5	[SelectionDAG] Fix missing lambda capture Move MaxDepth into the lambda, since it is not needed outside. This fixes some compilers that complain about missing capture: error C3493: 'MaxDepth' cannot be implicitly captured because no default capture mode has been specified Fixes: f693932fbea7 ("[SelectionDAG] Transitively copy NodeExtraInfo on RAUW")	2023-03-02 23:47:36 +01:00
Aditya Nandakumar	00e55531df	[GISel][CSE][NFC]: Handle mutual recursion when inserting node GISel's CSE mechanism lazily inserts instructions into the CSE List to improve on efficiency as well as efficacy of CSE (for allowing partially built instructions to be fully built). There's unfortunately a mutual recursion via `handleRecordedInsts -> handleRecordedInst -> insertNode-> handleRecordedInsts`. So this change simply records that we're already draining this list so we can just bail out on the recursion. No changes to codegen are expected as we're still draining/handling the temporary list via pop_back and we should get the same sequence of instructions whether we call pop_back in a loop at the top level or recursive. https://reviews.llvm.org/D145006 reviewed by: dsanders	2023-03-02 14:42:38 -08:00
Marco Elver	f693932fbe	[SelectionDAG] Transitively copy NodeExtraInfo on RAUW During legalization of the SelectionDAG, some nodes are replaced with arch-specific nodes. These may be complex nodes, where the root node no longer corresponds to the node that should carry the extra info. Fix the issue by copying extra info to the new node and all its new transitive operands during RAUW. See code comments for more details. This fixes the remaining pcsections-atomics.ll tests on X86. v2: Optimize copyExtraInfo() deep copy. For now we assume that only NodeExtraInfo that have PCSections set require deep copy. Furthermore, limit the depth of graph search while pre-populating the visited set, assuming the to-be-replaced subgraph 'From' has limited complexity. An assertion catches if the maximum depth needs to be increased. Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D144677	2023-03-02 23:07:19 +01:00
Craig Topper	06c6b787b2	[SelectionDAG][AArch64] Constant fold in SelectionDAG::getVScale if VScaleMin==VScaleMax. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D145113	2023-03-02 12:02:38 -08:00
Craig Topper	c546f13f1f	[DAGCombiner] Replace LegalOperations check in visitSIGN_EXTEND with LegalTypes. This is guarding a check for isTypeLegal so it should check LegalTypes. Fixes PR61111. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D145139	2023-03-02 07:52:53 -08:00
Sander de Smalen	170e7a0ec2	[AArch64][SME2] Add CodeGen support for target("aarch64.svcount"). This patch adds AArch64 CodeGen support such that the type can be passed and returned to/from functions, and also adds support to use this type in load/store operations and PHI nodes. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D136862	2023-03-02 12:07:41 +00:00
J. Ryan Stinnett	22b8e82c12	[DebugInfo] Remove `dbg.addr` from CodeGen As part of this work, removing `SDDbgValue::clearIsEmitted` originally added for `dbg.addr` in 045c67769d7fe577fc38cccb6fb40fd814437447 was attempted, but it appears some tests for `DBG_INSTR_REF` now depend on that behaviour as well, so it was kept and comments were updated instead. Part of `dbg.addr` removal Discussed in https://discourse.llvm.org/t/what-is-the-status-of-dbg-addr/62898 Differential Revision: https://reviews.llvm.org/D144800	2023-03-02 09:29:43 +00:00
J. Ryan Stinnett	f5b85c02e9	[DebugInfo][NFC] Remove `FuncArgumentDbgValueKind::Addr` from SelectionDAG This removes the unused `FuncArgumentDbgValueKind::Addr` value originally added by e24f5348798605a799c63ff09169d177d262cd37. The intent was to signal the original intrinsic that marked a function argument, but the `Addr` part was never used. Part of `dbg.addr` removal Discussed in https://discourse.llvm.org/t/what-is-the-status-of-dbg-addr/62898 Differential Revision: https://reviews.llvm.org/D144794	2023-03-02 09:29:42 +00:00
Marco Elver	e0bc779000	Revert "[SelectionDAG] Transitively copy NodeExtraInfo on RAUW" This reverts commit 7f635b90e7bdf1378fd9a65fc62b99e8e07d4aaf. The current implementation causes pathological slowdowns in certain cases: https://github.com/llvm/llvm-project/issues/61108	2023-03-02 09:39:44 +01:00
Yashwant Singh	5230f6c1c2	[llvm][GenericUniformity] Prevent assert while calculating temporal divergence analyzeTemporalDivergence() was missing the check for always-uniform before evaluating whether an instruction depends on a value defined in the cycle. Fix for #60638 https://github.com/llvm/llvm-project/issues/60638 Reviewed By: sameerds, foad, #amdgpu Differential Revision: https://reviews.llvm.org/D144070	2023-03-02 12:42:35 +05:30
Nick Desaulniers	9cec2b246e	[RegAllocFast] insert additional spills along indirect edges of INLINEASM_BR When generating spills (stores) for values produced by INLINEASM_BR instructions, make sure to insert one spill per indirect target. Otherwise the reload generated may load from a stack slot that has not yet been stored to (resulting in a load of an uninitialized stack slot). Link: https://github.com/llvm/llvm-project/issues/53562 Fixes: https://github.com/llvm/llvm-project/issues/60855 Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D144907	2023-03-01 15:21:11 -08:00
Simon Pilgrim	73cdccad55	[DAG] expandIntMINMAX - attempt to match existing SETCC node As noticed on D144789, when we have pairs of min/max nodes we often end up with multiple comparisons which we could reuse with commuted select ops, so check to see if a suitable SETCC already exists. This also allowed us to remove a similar X86 peephole. There are other getSETCC cases where we could safely reuse other CondCodes as well - I've been trying to think of how we could reuse this logic in SelectionDAG but haven't found anything that always works well. An alternative would be to have a TLI callback that returns a preferred CondCode from a list of options, I've noticed this helped fpclamptosat tests on some other targets (MVE + WebAssembly), but other tests suffered. Differential Revision: https://reviews.llvm.org/D145065	2023-03-01 19:04:03 +00:00
David Green	337215ddf9	[DAG] ABD is not reassociative I'm not sure how I missed this in the testing, but as far as I understand whilst ABDS and ABDU are commutative they are not associative. This patch disables reassociateOps from visitABD, fixing the problems found in #61069. ABDU: https://alive2.llvm.org/ce/z/eiT5QG ABDS: https://alive2.llvm.org/ce/z/HzE29l Differential Revision: https://reviews.llvm.org/D145064	2023-03-01 16:22:13 +00:00
Nikita Popov	ddccc5ba44	[CodeGen] Always expand division larger than i128 Default MaxDivRemBitWidthSupported to 128, so that divisions larger than 128 bits are always expanded, without requiring additional configuration from the target. Note that this may still emit calls to __udivti3 on 32-bit targets, which likely don't have an implementation of that builtin. However, I believe this is sufficient to fix https://github.com/llvm/llvm-project/issues/60531, because Zig must already be defining those builtins. Differential Revision: https://reviews.llvm.org/D144871	2023-03-01 15:33:45 +01:00
Ben Shi	0d25418273	[NFC] Fix incorrect comment in VLIW packetizer Reviewed By: bcain Differential Revision: https://reviews.llvm.org/D145050	2023-03-01 21:19:06 +08:00
Caroline Concatto	cb96eba27c	[IR][Legalization] Split illegal deinterleave and interleave vectors To make legalization easier, the operands and outputs have the same size for these ISD Nodes. When legalizing the results in SplitVectorResult the operands are legalized to the same size as the outputs. The ISD Node has two output/results, therefore the legalizing functions update both results/outputs. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D144744	2023-03-01 08:30:16 +00:00
Wei Xiao	3fd533fd33	[COFF][X86_64] Put jump table in .rdata for Windows Put jump table in .rdata for Windows to align with that for Linux. It can avoid loading the same code page into I$ and D$ simultaneously and thus favor performance. Differential Revision: https://reviews.llvm.org/D144701	2023-03-01 10:35:38 +08:00
Craig Topper	bf9e0ed1e6	[CodeGen] Use LLVM_ATTRIBUTE_UNUSED instead of LLVM_DUMP_METHOD on a raw_ostream operator<<. LLVM_DUMP_METHOD includes ATTRIBUTE_NOINLINE. operator<< isn't what we normally consider a dump method so it should be ok to inline. This fixes a warning from gcc that some other declaration for some other class was inline but this one is noinline. Seems like a bogus warning from gcc really.	2023-02-27 18:12:18 -08:00
Vladislav Dzhidzhoev	3a51eed948	[AArch64][GlobalISel] Legalize G_SHUFFLE_VECTOR with smaller dest size Legalize G_SHUFFLE_VECTOR having destination vector length smaller than source vector length by reshaping destination vector. Differential Revision: https://reviews.llvm.org/D144670	2023-02-27 23:46:44 +01:00
Michal Paszkowski	5ac69674bf	[SPIR-V] Support TargetExtType for SPIR-V builtin types This patch adds support for TargetExtType/target(...) representing SPIR-V builtin types. After D135202, target(...) is the preferred way for representing SPIR-V builtin types in LLVM IR and the only working in the opaque pointer mode. In order to maintain compatibility with LLVM IR generated by older versions of Clang and LLVM/SPIR-V Translator, pointers-to-opaque-structs denoting SPIR-V/OpenCL builtin types will be translated to equivalent SPIR-V target extension types. This translation is only available in the typed pointer mode (-opaque-pointers=0). The relevant LIT tests with SPIR-V builtins were converted to use the new target(...) notation. Differential Revision: https://reviews.llvm.org/D144494	2023-02-27 21:39:25 +01:00
David Green	06daa515b2	[AArch64] Don't remove free sext_inreg(vector_extract(x)) if it leads to multiple extracts If we have sext_inreg(vector_extract(x)) but the top bits are not used, DAG will try to remove the sext_inreg, using vector_extract(x) directly. This can lead to multiple uses of both sext_inreg(vector_extract(x)) and vector_extract(x), leading to the generation of both umov and smov extracts. This adds a target hook to prevent that under AArch64 where the sext_inreg can be considered free if there are multiple uses of the sext and no uses of the vector_extract. This helps fix a small regression from D144550. Differential Revision: https://reviews.llvm.org/D144850	2023-02-27 19:20:10 +00:00
Marco Elver	7f635b90e7	[SelectionDAG] Transitively copy NodeExtraInfo on RAUW During legalization of the SelectionDAG, some nodes are replaced with arch-specific nodes. These may be complex nodes, where the root node no longer corresponds to the node that should carry the extra info. Fix the issue by copying extra info to the new node and all its new transitive operands during RAUW. See code comments for more details. This fixes the remaining pcsections-atomics.ll tests on X86. Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D144677	2023-02-27 12:16:14 +01:00
Amara Emerson	4bc6434624	[GlobalISel] Fix an assertion failure in matchHoistLogicOpWithSameOpcodeHands(). We use this combine in the AArch64 postlegalizer combiner, which causes this function to query the legalizer rules for the action for an invalid opcode/type combination (G_AND and p0). Moving the legalizer query until after the validity check in matchHoistLogicOpWithSameOpcodeHands() fixes this.	2023-02-26 15:42:57 -08:00
Noah Goldstein	e981e6d10e	Add transform for `(and/or (icmp eq/ne A,-1),(icmp eq/ne A,-1+C))`->`(and/or (icmp eq/ne (and ~A,-1+C),0))` This works if `-1+C` is a negative power of 2. This can be more useful than the `AddAnd` case as `~A` does not necessarily require materializing a constant. This makes the transform worth it for X86 vector types. Alive2 Links: EQ: https://alive2.llvm.org/ce/z/P6u8cq NE: https://alive2.llvm.org/ce/z/_Kkqp1 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D144284	2023-02-24 15:22:09 -06:00
Noah Goldstein	8c74c5402f	Make `(and/or (icmp eq/ne A,C0), (icmp eq/ne A,C1))` where `IsPow(dif(C0,C1))` work for more patterns. `(and/or (icmp eq/ne A,C0), (icmp eq/ne A,C1))` can be lowered to `(icmp eq/ne (and (sub A, (smin C0, C1)), (not (sub (smax C0, C1), (smin C0, C1)))), 0)` generically if `(sub (smax C0, C1), (smin C0,C1))` is a power of 2. This covers the existing case of `(and/or (icmp eq/ne A, C_Pow2),(icmp eq/ne A, -C_Pow2))` as well as other cases. Alive2 Links: EQ: https://alive2.llvm.org/ce/z/mLJiUW NE: https://alive2.llvm.org/ce/z/TKnzUr Differential Revision: https://reviews.llvm.org/D144283	2023-02-24 15:22:09 -06:00
Steve Merritt	750a6870eb	[Codeview] Fix incorrect size determination for complex types. In Codeview, the basic type of a complex represents the size of an individual component rather than the sum of the real and imaginary components. Differential Revision: https://reviews.llvm.org/D143760	2023-02-24 09:20:52 -05:00
Serge Pavlov	7f81dd4dd6	[NFC] Make FPClassTest a bitmask enumeration This is a recommit of 2e416cdd52, fixed to be acceptable to GCC. The original commit message is below. With this change bitwise operations are allowed for the FPClassTest enumeration, which should simplify using this type. Also, some functions were changed to take an argument of type FPClassTest instead of unsigned. Differential Revision: https://reviews.llvm.org/D144241	2023-02-24 15:12:16 +07:00
Jez Ng	865c2b0d15	[MC][nfc] Don't use a value after it has been std::move()'d Reviewed By: serge-sans-paille Differential Revision: https://reviews.llvm.org/D144662	2023-02-23 15:15:24 -05:00
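
Several DAGCombiner entries in this log rest on small integer identities: `~x + x` -> `-1` (c1ecd0a3f4), `~x & x` -> `0` (d4b24b4a55), ABDU being commutative but not associative (337215ddf9), and the single-compare form of `(icmp eq A,C0) or (icmp eq A,C1)` when the constants differ by a power of 2 (8c74c5402f). The sketch below brute-forces them in Python as a sanity check. It is not LLVM code: the 6-bit width and every helper name (`bnot`, `abdu`, `folded_eq`, `check_all`) are assumptions made for illustration.

```python
from itertools import product

BITS = 6                  # assumed small width so the exhaustive search stays cheap
MASK = (1 << BITS) - 1    # all-ones value, i.e. -1 in two's complement


def bnot(x):
    """Bitwise NOT truncated to BITS bits."""
    return ~x & MASK


def abdu(a, b):
    """Unsigned absolute difference (a model of ABDU on unsigned values)."""
    return abs(a - b)


def to_signed(x):
    """Reinterpret a BITS-wide unsigned value as signed."""
    return x - (1 << BITS) if x & (1 << (BITS - 1)) else x


def is_pow2(x):
    return x != 0 and x & (x - 1) == 0


def folded_eq(a, c0, c1):
    """Single-compare form of (a == c0) or (a == c1).

    Mirrors (icmp eq (and (sub A, smin), (not (sub smax, smin))), 0) and is
    only meaningful when smax - smin is a power of two.
    """
    lo = min(c0, c1, key=to_signed)
    hi = max(c0, c1, key=to_signed)
    diff = (hi - lo) & MASK
    return ((a - lo) & MASK) & bnot(diff) == 0


def check_all():
    """Check each identity exhaustively; return an ABDU associativity counterexample."""
    values = range(1 << BITS)
    for x in values:
        assert (bnot(x) + x) & MASK == MASK   # ~x + x == -1
        assert bnot(x) & x == 0               # ~x & x == 0
    for a, b in product(values, repeat=2):
        assert abdu(a, b) == abdu(b, a)       # ABDU is commutative
    for c0, c1 in product(values, repeat=2):
        lo = min(c0, c1, key=to_signed)
        hi = max(c0, c1, key=to_signed)
        if not is_pow2((hi - lo) & MASK):
            continue                          # fold precondition not met
        for a in values:
            assert folded_eq(a, c0, c1) == (a == c0 or a == c1)
    # First triple for which ABDU fails to be associative.
    return next(
        (a, b, c)
        for a, b, c in product(values, repeat=3)
        if abdu(abdu(a, b), c) != abdu(a, abdu(b, c))
    )


if __name__ == "__main__":
    print("ABDU associativity counterexample:", check_all())
```

An exhaustive search at a small width is no substitute for the Alive2 proofs linked in the commit messages; it only makes the preconditions (for example, the power-of-2 difference) concrete.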

33746 Commits