llvm-project

Author	SHA1	Message	Date
Florian Hahn	17bad1a9da	[LV] Bail out on header phis in shouldConsiderInvariant. This fixes an infinite recursion in rare cases. Fixes https://github.com/llvm/llvm-project/issues/113794.	2024-11-01 20:51:25 +00:00
Han-Kuan Chen	a795a18bba	[SLP][REVEC] VF should be scaled when ScalarTy is FixedVectorType. (#114551 )	2024-11-02 03:03:52 +08:00
Simon Pilgrim	718d50d6d0	[VectorCombine] foldPermuteOfBinops - prefer the new fold for matching costs. Minor tweak to #114101 - as we're reducing the instruction count, we should prefer the fold if the old/new costs are the same.	2024-11-01 17:28:37 +00:00
Min-Yih Hsu	64314dedeb	[InlineCost] Print inline cost for invoke call sites as well (#114476 ) Previously InlineCostAnnotationPrinter only prints inline cost for call instructions. I don't think there is any reason not to analyze invoke and its callee, and this patch adds such support.	2024-11-01 09:55:17 -07:00
Yingwei Zheng	a77dedcacb	[InstSimplify][InstCombine][ConstantFold] Move vector div/rem by zero fold to InstCombine (#114280 ) Previously we fold `div/rem X, C` into `poison` if any element of the constant divisor `C` is zero or undef. However, it is incorrect when threading udiv over an vector select: https://alive2.llvm.org/ce/z/3Ninx5 ``` define <2 x i32> @vec_select_udiv_poison(<2 x i1> %x) { %sel = select <2 x i1> %x, <2 x i32> <i32 -1, i32 -1>, <2 x i32> <i32 0, i32 1> %div = udiv <2 x i32> <i32 42, i32 -7>, %sel ret <2 x i32> %div } ``` In this case, `threadBinOpOverSelect` folds `udiv <i32 42, i32 -7>, <i32 -1, i32 -1>` and `udiv <i32 42, i32 -7>, <i32 0, i32 1>` into `zeroinitializer` and `poison`, respectively. One solution is to introduce a new flag indicating that we are threading over a vector select. But it requires to modify both `InstSimplify` and `ConstantFold`. However, this optimization doesn't provide benefits to real-world programs: https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/IR/ConstantFold.cpp.html#L908 https://dtcxzyw.github.io/llvm-opt-benchmark/coverage/data/zyw/opt-ci/actions-runner/_work/llvm-opt-benchmark/llvm-opt-benchmark/llvm/llvm-project/llvm/lib/Analysis/InstructionSimplify.cpp.html#L1107 This patch moves the fold into InstCombine to avoid breaking numerous existing tests. Fixes #114191 and #113866 (only poison-safety issue).	2024-11-01 22:56:22 +08:00
Yingwei Zheng	e577f14b67	[InstCombine] Use `m_NotForbidPoison` when folding `(X u< Y) ? -1 : (~X + Y) --> uadd.sat(~X, Y)` (#114345 ) Alive2: https://alive2.llvm.org/ce/z/mTGCo- We cannot reuse `~X` if `m_AllOnes` matches a vector constant with some poison elts. An alternative solution is to create a new not instead of reusing `~X`. But it doesn't worth the effort because we need to add a one-use check. Fixes https://github.com/llvm/llvm-project/issues/113869.	2024-11-01 22:18:44 +08:00
David Green	0f919444ad	[ValueTracking] Handle recursive phis in knownFPClass (#114008 ) As a follow-on to 113686, this breaks the recursion between phi nodes that have p1 = phi(x, p2) and p2 = phi(y, p1). The knownFPClass can be calculated from the classes of p1 and p2.	2024-11-01 13:38:29 +00:00
Han-Kuan Chen	e4aeeba84c	[SLP][REVEC] When ScalarTy is FixedVectorType, the insertion index should consider the number of elements of ScalarTy. (#114526 )	2024-11-01 21:17:57 +08:00
Nuno Lopes	344d972736	AssumeBundleBuilder: switch placeholder from undef to poison [NFC]	2024-11-01 10:12:10 +00:00
Yingwei Zheng	f16bff1261	[GVN][NewGVN][Local] Handle attributes for function calls after CSE (#114011 ) This patch intersects attributes of two calls to avoid introducing UB. It also skips incompatible call pairs in GVN/NewGVN. However, I cannot provide negative tests for these changes. Fixes https://github.com/llvm/llvm-project/issues/113997.	2024-11-01 12:44:33 +08:00
Lei Wang	bef3b54ea1	[InstrPGO] Avoid using global variable to fix potential data race (#114364 ) In https://github.com/llvm/llvm-project/pull/109837, it sets a global variable(`PGOInstrumentColdFunctionOnly`) in PassBuilderPipelines.cpp which introduced a data race detected by TSan. To fix this, I decouple the flag setting, the flags are now set separately(`instrument-cold-function-only-path` is required to be used with `--pgo-instrument-cold-function-only`).	2024-10-31 21:28:13 -07:00
Yingwei Zheng	96b14f2ccb	[Reland][InstCombine] Fix FMF propagation in `foldSelectIntoOp` (#114499 ) Relands #114356. Compared to the last version, this patch only merges poison-generating/nsz flags from the select to fix LV regression in `llvm/test/Transforms/PhaseOrdering/AArch64/predicated-reduction.ll`.	2024-11-01 12:22:57 +08:00
c8ef	cf0b6cc711	Revert "[ConstantFold] Fold `tgamma` and `tgammaf` when the input parameter is a constant value." (#114496 ) Reverts llvm/llvm-project#114065	2024-11-01 09:26:11 +08:00
c8ef	1f07f995cc	[ConstantFold] Fold `tgamma` and `tgammaf` when the input parameter is a constant value. (#114065 ) This patch adds support for constant folding for the `tgamma` and `tgammaf` libc functions.	2024-11-01 09:07:55 +08:00
Ruiling, Song	54d31bde32	Reapply "StructurizeCFG: Optimize phi insertion during ssa reconstruction (#101301 )" (#114347 ) This reverts commit be40c723ce2b7bf2690d22039d74d21b2bd5b7cf.	2024-11-01 08:29:59 +08:00
Florian Hahn	b021464d35	[VPlan] Introduce scalar loop header in plan, remove VPLiveOut. (#109975 ) Update VPlan to include the scalar loop header. This allows retiring VPLiveOut, as the remaining live-outs can now be handled by adding operands to the wrapped phis in the scalar loop header. Note that the current version only includes the scalar loop header, no other loop blocks and also does not wrap it in a region block. PR: https://github.com/llvm/llvm-project/pull/109975	2024-10-31 21:36:44 +01:00
Igor Kudrin	454abad7b0	[CFI][LowerTypeTests] Fix indirect call with alias (#113987 ) This is a fixed version of #106185, which was reverted in #113978 due to a buildbot failure. Motivation example: ``` > cat test.cpp extern "C" [[gnu::weak]] void f() {} void alias() __attribute__((alias("f"))); int main() { auto p = alias; p(); } > clang test.cpp -fsanitize=cfi-icall -flto=thin -fuse-ld=lld > ./a.out [1] 1868 illegal hardware instruction ./a.out ``` If the address of a function was only taken through its alias, the function was not considered exported and therefore was not included in the CFI jumptable. This resulted in `@llvm.type.test()` being lowered to `false`, and consequently the indirect call to the function was eventually optimized to `ubsantrap()`.	2024-10-31 13:29:07 -07:00
gulfemsavrun	d183dc7c24	Revert "[InstCombine] Fix FMF propagation in `foldSelectIntoOp`" (#114458 ) Reverts llvm/llvm-project#114356 because it caused test failures. https://lab.llvm.org/buildbot/#/builders/190/builds/8601 https://luci-milo.appspot.com/ui/p/fuchsia/builders/toolchain.ci/clang-base-linux-x64/b8732549597609293617/overview	2024-10-31 13:21:52 -07:00
Matt Arsenault	9cc298108a	AtomicExpand: Copy metadata from atomicrmw to cmpxchg (#109409 ) When expanding an atomicrmw with a cmpxchg, preserve any metadata attached to it. This will avoid unwanted double expansions in a future commit. The initial load should also probably receive the same metadata (which for some reason is not emitted as an atomic).	2024-10-31 11:54:07 -07:00
Matt Arsenault	e3222e6f80	AMDGPU: Add baseline tests for cmpxchg custom expansion (#109408 ) We need a non-atomic path if flat may access private.	2024-10-31 11:46:13 -07:00
Alexey Bataev	e05def081e	[SLP]Do not vectorize code in EH and non-returning blocks The code in EH and non-returning blocks can be skipped by the vectorizer, since it does not add to the perfromance, just consumes compile/link time. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/112221	2024-10-31 13:50:02 -04:00
Alexey Bataev	19a34dded7	[SLP]Do not account external uses in EH block and in non-returning blocks No need to account the cost of the external uses in EH and non-returning basic blocks. Reviewers: RKSimon Reviewed By: RKSimon Pull Request: https://github.com/llvm/llvm-project/pull/112045	2024-10-31 13:23:43 -04:00
Alexey Bataev	e7080fd735	[SLP]Extra check if the intruction matked for removal, must be replaced in reduction ops If the instruction is vectorized and it is a part of the reduced values gather/buildvector node, it should replaced in reduced operation instructions before removal properly, to avoid compiler crash. Fixes #114371	2024-10-31 09:59:35 -07:00
Artem Belevich	8129b6b53b	[NVPTX, InstCombine] instcombine known pointer AS checks. (#114325 ) The change improves the code in general and, as a side effect, avoids crashing on an impossible address space casts guarded by `__isGlobal/__isShared`, which partially fixes https://github.com/llvm/llvm-project/issues/112760 It's still possible to trigger the issue by using explicit AS casts w/o AS checks, but LLVM should no longer crash on valid code. This is #112964 + a small fix for the crash on unintended argument access which was the root cause to revers the earlier version of the patch.	2024-10-31 09:24:51 -07:00
Yingwei Zheng	cf1963afad	[InstCombine] Fix FMF propagation in `foldSelectIntoOp` (#114356 ) Closes https://github.com/llvm/llvm-project/issues/113423.	2024-10-31 23:26:45 +08:00
Matt Arsenault	1d0370872f	AMDGPU: Expand flat atomics that may access private memory (#109407 ) If the runtime flat address resolves to a scratch address, 64-bit atomics do not work correctly. Insert a runtime address space check (which is quite likely to be uniform) and select between the non-atomic and real atomic cases. Consider noalias.addrspace metadata and avoid this expansion when possible (we also need to consider it to avoid infinitely expanding after adding the predication code).	2024-10-31 08:08:48 -07:00
Kenji Mouri / 毛利研二	7e877fc0ac	[Reland][TLI] Add support for hypot libcall. (#114343 ) This patch adds basic support for `hypot`. Constant folding support will be submitted in a subsequent patch. Related issue: https://github.com/llvm/llvm-project/issues/113711 Note: It's my first time contributing to the LLVM with encouragement from one of my friends, @fawdlstty. I learned a lot from https://github.com/llvm/llvm-project/pull/99611, and thanks for that. Note: I had created the same PR and merged (https://github.com/llvm/llvm-project/pull/113724), but reverted caused by the merging issue. (The CI issue happened in 3 A.M. at my timezone. So, I need to fall asleep again after I replied about why issue happened.) So, I rebased to the latest main branch and recreate the PR and hope I won't have the third time to create the same PR. I hope @arsenm can help me review the code again. I’m sorry for that. Kenji Mouri	2024-10-31 07:50:29 -07:00
goldsteinn	1e072ae289	[CGP] [CodeGenPrepare] Folding `urem` with loop invariant value plus offset (#104724 ) This extends the existing fold: ``` for(i = Start; i < End; ++i) Rem = (i nuw+- IncrLoopInvariant) u% RemAmtLoopInvariant; ``` -> ``` Rem = (Start nuw+- IncrLoopInvariant) % RemAmtLoopInvariant; for(i = Start; i < End; ++i, ++rem) Rem = rem == RemAmtLoopInvariant ? 0 : Rem; ``` To work with a non-zero `IncrLoopInvariant`. This is a common usage in cases such as: ``` for(i = 0; i < N; ++i) if ((i + 1) % X) == 0) do_something_occasionally_but_not_first_iter(); ``` Alive2 w/ i4/unrolled 6x (needs to be ran locally due to timeout): https://alive2.llvm.org/ce/z/6tgyN3 Exhaust proof over all uint8_t combinations in C++: https://godbolt.org/z/WYa561388	2024-10-31 09:14:33 -05:00
Hari Limaye	b396921d0c	[SCCP] Handle llvm.vscale intrinsic calls (#114033 ) Teach SCCP to compute a constant range for calls to llvm.vscale intrinsics.	2024-10-31 12:22:15 +00:00
Simon Pilgrim	92af82a48d	[VectorCombine] Fold "shuffle (binop (shuffle, shuffle)), undef" --> "binop (shuffle), (shuffle)" (#114101 ) Add foldPermuteOfBinops - to fold a permute (single source shuffle) through a binary op that is being fed by other shuffles. Fixes #94546 Fixes #49736	2024-10-31 10:58:09 +00:00
Dmitry Chernenkov	d924a9ba03	Revert "[InstrPGO] Support cold function coverage instrumentation (#109837 )" This reverts commit e517cfc531886bf6ed64b4e7109bb3141ac7f430.	2024-10-31 10:55:17 +00:00
Ami-zhang	1897bf61f0	[LoongArch] Enable FeatureExtLSX for generic-la64 processor (#113421 ) This commit makes the `generic` target to support FP and LSX, as discussed in #110211. Thereby, it allows 128-bit vector to be enabled by default in the loongarch64 backend.	2024-10-31 15:58:15 +08:00
David Green	9735c05186	[ValueTracking] Compute KnownFP state from recursive select/phi. (#113686 ) Given a recursive phi with select: %p = phi [ 0, entry ], [ %sel, loop] %sel = select %c, %other, %p The fp state can be calculated using the knowledge that the select/phi pair can only be the initial state (0 here) or from %other. This adds a short-cut into computeKnownFPClass for PHI to detect that the select is recursive back to the phi, and if so use the state from the other operand. This helps to address a regression from #83200.	2024-10-31 07:50:44 +00:00
Paul Kirth	b01e2a8b56	[llvm] Allow always dropping all llvm.type.test sequences Currently, the `DropTypeTests` parameter only fully works with phi nodes and llvm.assume instructions. However, we'd like CFI to work in conjunction with FatLTO, in so far as the bitcode section should be able to contain the CFI instrumentation, while any incompatible bits are dropped when compiling the object code. To do that, we need to drop the llvm.type.test instructions everywhere, and not just their uses in phi nodes. This patch updates the LowerTypeTest pass so that uses are removed, and replaced with `true` in all cases, and not just in phi nodes. Addressing this will allow us to fix #112053 by modifying the FatLTO pipeline. Reviewers: pcc, nikic Reviewed By: pcc Pull Request: https://github.com/llvm/llvm-project/pull/112787	2024-10-30 16:56:30 -07:00
Artem Belevich	04e876e6c6	Revert "[NVPTX] instcombine known pointer AS checks." (#114319 ) Reverts llvm/llvm-project#112964 Crashes MLIR: https://lab.llvm.org/buildbot/#/builders/138/builds/5665	2024-10-30 15:34:08 -07:00
Artem Belevich	1cecc58c3f	[NVPTX] instcombine known pointer AS checks. (#112964 ) The change improves the code in general and, as a side effect, avoids crashing on an impossible address space casts guarded by `__isGlobal/__isShared`, which partially fixes https://github.com/llvm/llvm-project/issues/112760 It's still possible to trigger the issue by using explicit AS casts w/o AS checks, but LLVM should no longer crash on valid code.	2024-10-30 15:13:06 -07:00
gulfemsavrun	36d5692570	Revert "[TLI] Add support for hypot libcall." (#114312 ) Reverts llvm/llvm-project#113724	2024-10-30 15:10:29 -07:00
Luke Lau	14045de250	[RISCV] Account for factor in interleave memory op costs (#111511 ) Currently we cost an interleaved memory op as if it were a load/store of the widened vector type, but this was undercosting in all cases when compared to the measured performance of todays hardware. On the x280 at NF=2 and spacemit-x60 at NF=2,3 and 4, a segmented load is carried out as a wide load and NF LMUL shuffle ops: https://github.com/preames/bp3-microarch#vlseg_lmul_x_sew_throughput All other NFs go through a slow path. On the spacemit-x60 this is proportional to VLMAX * NF, and on the x280 proportional to the number of segments. This patch increases the cost by implementing a wide load + NF LMUL shuffle op cost for the lowest common denominator NF=2, and then a slower cost proportional to VL for the other NFs. In a follow up patch we can add a tuning flag to use the faster cost model for NF=3 and 4 on the spacemit-x60. Note that the FIXME about illegal vectors seems to have been fixed in #100436	2024-10-31 05:36:46 +08:00
Kenji Mouri / 毛利研二	feb2d867fa	[TLI] Add support for hypot libcall. (#113724 ) This patch adds basic support for `hypot`. Constant folding support will be submitted in a subsequent patch. Related issue: https://github.com/llvm/llvm-project/issues/113711 Note: It's my first time contributing to the LLVM with encouragement from one of my friends, @fawdlstty. I learned a lot from https://github.com/llvm/llvm-project/pull/99611, and thanks for that. Kenji Mouri	2024-10-30 10:34:32 -07:00
Steven Perron	f405c683ba	[OPT] Search whole BB for convergence token. (#112728 ) The spec for llvm.experimental.convergence.entry says that is must be in the entry block for a function, and must preceed any other convergent operation. It does not have to be the first instruction in the entry block. Inlining assumes that the call to llvm.experimental.convergence.entry will be the first instruction after any phi instructions. This commit modifies inlining to search the entire block for the call.	2024-10-30 11:19:23 -04:00
David Sherwood	7f498a865f	[CostModel][LoopVectorize] Move some loop vectoriser tests (#113702 ) Many tests that were in test/Analysis/CostModel were actually loop vectoriser tests. I've moved them as follows: Analysis/CostModel/X86 -> Transforms/LoopVectorize/X86/CostModel Analysis/CostModel/AArch64/arith-fp-frem.ll -> Transforms/LoopVectorize/AArch64/arith-fp-frem-costs.ll	2024-10-30 13:50:02 +00:00
Simon Pilgrim	80c8ecd565	[VectorCombine] Add baseline "shuffle (binop (shuffle, shuffle)), undef" tests for #114101	2024-10-30 13:42:58 +00:00
Simon Pilgrim	bc999ee57a	[PhaseOrdering][X86] Add test coverage for #94546	2024-10-30 11:55:04 +00:00
Simon Pilgrim	2de1fc8286	[PhaseOrdering][X86] Add additional test coverage for #49736 I've kept the old PR50392 tag since this is such an old issue....	2024-10-30 11:10:48 +00:00
David Green	f358422268	[Attributor] Add nofpclass test for phi+select recurrences. NFC	2024-10-30 08:10:35 +00:00
Teresa Johnson	bb3915149a	[MemProf] Support for random hotness when writing profile (#113998 ) Add support for generating random hotness in the memprof profile writer, to be used for testing. The random seed is printed to stderr, and an additional option enables providing a specific seed in order to reproduce a particular random profile.	2024-10-29 22:10:33 -07:00
Matthias Braun	255e441613	X86: Do not return invalid cost for fp16 conversion (#114128 ) Returning invalid instruction costs when converting from/to fp16 in `X86TTIImpl::getCastInstrCost` when there is no hardware support available was triggering asserts. This changes the code to return a large (arbitrary) number to model the fact that libcalls are used to implement the conversion. This also simplifies the code by only reporting costs for the scalar fp16 conversion; vectorized costs being left to the fallback assuming scalarization. This is a follow-up to assertion issues reported for the changes in #113195	2024-10-29 17:16:17 -07:00
Hari Limaye	e19a5fc6d3	[FuncSpec] Improve accounting of specialization codesize growth (#113448 ) Only accumulate the codesize increase of functions that are actually specialized, rather than for every candidate specialization that we analyse. This fixes a subtle bug where prior analysis of candidate specializations that were deemed unprofitable could prevent subsequent profitable candidates from being recognised.	2024-10-29 11:53:12 +00:00
Hari Limaye	06664fdc76	[FuncSpec] Enable SpecializeLiteralConstant by default (#113442 ) Enable specialization on literal constant arguments by default in Function Specialization. --------- Co-authored-by: Alexandros Lamprineas <alexandros.lamprineas@arm.com>	2024-10-29 11:41:25 +00:00
Rohit Aggarwal	dfb60bb919	Adding more vector calls for -fveclib=AMDLIBM (#109662 ) AMD has it's own implementation of vector calls. New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos Please refer [https://github.com/amd/aocl-libm-ose] --------- Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>	2024-10-29 10:09:55 +00:00

1 2 3 4 5 ...

30252 Commits