llvm-project

Author	SHA1	Message	Date
Paschalis Mpeis	08513281bd	[BOLT][test] Drop toolname from X86/perf2bolt-spe.test (#145515 )	2025-06-24 15:12:16 +01:00
Kazu Hirata	63f30d7d82	[mlir] Migrate away from {TypeRange,ValueRange}(std::nullopt) (NFC) (#145445 ) ArrayRef has a constructor that accepts std::nullopt. This constructor dates back to the days when we still had llvm::Optional. Since the use of std::nullopt outside the context of std::optional is kind of an abuse and not intuitive to newcomers, I would like to move away from the constructor and eventually remove it. This patch migrates away from TypeRange(std::nullopt) and ValueRange(std::nullopt).	2025-06-24 07:03:59 -07:00
Nicolas Vasilache	6dad1e87fb	[mlir][transform][Linalg] NFC - DCE unused options in PadTilingInterfaceOptions	2025-06-24 15:46:15 +02:00
David	8fec6d1177	llvm-c: Introduce 'LLVMDISubprogramReplaceType' (#143461 ) The C API does not provide a way to replace the subroutine type after creating a subprogram. This functionality is useful for creating a subroutine type composed of types which have the subprogram as scope	2025-06-24 14:42:06 +01:00
Nikita Popov	68f09370f9	[Module] Use getDeclarationIfExists() (NFC) Don't insert declarations in order to immediately remove them again.	2025-06-24 15:37:27 +02:00
Orlando Cazalet-Hyams	75cf826849	[KeyInstr][Clang] Fix atomic ops atoms test Fixup test added in #141624 (ddecfa696c4929ac364053f3eef66fefe4873448).	2025-06-24 14:37:21 +01:00
Ellis Hoag	b77c7138a8	[lld][BP] Fix duplicate section size measurement (#145384 )	2025-06-24 06:31:23 -07:00
Pavel Labath	3e98d2b031	[lldb] Fix windows build for #145293	2025-06-24 15:25:10 +02:00
Tobias Stadler	9186df9b08	[InlineCost] Simplify extractvalue across callsite (#145054 ) Motivation: When using libc++, `std::bitset<64>::count()` doesn't optimize to a single popcount instruction on AArch64, because we fail to inline the library code completely. Inlining fails, because the internal bit_iterator struct is passed as a [2 x i64] %arg value on AArch64. The value is built using insertvalue instructions and only one of the array entries is constant. If we know that this entry is constant, we can prove that half the function becomes dead. However, InlineCost only considers operands for simplification if they are Constants, which %arg is not. Without this simplification the function is too expensive to inline. Therefore, we had to teach InlineCost to support non-Constant simplified values (PR #145083). Now, we enable this for extractvalue, because we want to simplify the extractvalue with the insertvalues from the caller function. This is enough to get bitset::count fully optimized. There are similar opportunities we can explore for BinOps in the future (e.g. cmp eq %arg1, %arg2 when the caller passes the same value into both arguments), but we need to be careful here, because InstSimplify isn't completely safe to use with operands owned by different functions.	2025-06-24 14:15:27 +01:00
Balázs Benics	e04c938cc0	[analyzer][NFC] Add xrefs to a test case that has poor git blame (#145501 )	2025-06-24 14:50:14 +02:00
Balázs Benics	6fe8543a2a	[analyzer][docs] Mention perfetto for visualizing trace JSONs (#145500 )	2025-06-24 14:49:43 +02:00
Simon Pilgrim	db4dc88d06	[X86] combineEXTRACT_SUBVECTOR - remove unnecessary bitcast handling. (#145496 ) We already aggressively fold extract_subvector(bitcast()) -> bitcast(extract_subvector())	2025-06-24 13:47:03 +01:00
Darren Wihandi	9f3931b659	[AMDGPU] Fold fmed3 when inputs include infinity (#144824 )	2025-06-24 21:44:17 +09:00
Ross Brunton	4785832144	[Offload] Fix cmake warning (#145488 ) Cmake was unhappy that there was no space between arguments, now it is.	2025-06-24 13:42:03 +01:00
Kareem Ergawy	9aebfde1e7	[flang] Allow `cycle` in `target teams distribute [simd]` (#145462 ) flang incorrectly issues a semantic error when a `cycle` statement is used inside a `target teams distribute [simd]` associated loop. This is not prevented by the spec, therefore this PR allows such a construct.	2025-06-24 14:21:06 +02:00
Orlando Cazalet-Hyams	352baa386c	[RemoveDIs] Resolve RemoveRedundantDbgInstrs fwd scan FIXME (#144718 ) These FIXMEs were added to keep the dbg_record implementation identical to the dbg intrinsic versions, which have since been removed. I don't think there's any reason for the old behaviour; my understanding is it was a minor bug no one got round to fixing. I've upgraded the test to be written with dbg_records while I'm here.	2025-06-24 13:09:49 +01:00
David Green	825ad86aea	[DAG] Fold nested add(add(reduce(a), b), add(reduce(c), d)) (#115150 ) This patch reassociates `add(add(vecreduce(a), b), add(vecreduce(c), d))` into `add(vecreduce(add(a, c)), add(b, d))`, to combine the reductions into a single node. This comes up after unrolling vectorized loops. There is another small change to move reassociateReduction inside fadd outside of an AllowNewConst block, as new constants will not be created and it should be OK to perform the combine later after legalization.	2025-06-24 13:08:59 +01:00
Orlando Cazalet-Hyams	db72f6cbe6	[RemoveDIs][NFC] Remove dbg intrinsic handling code from AssignmentTrackingAnalysis (#144674 ) See PR for breakdown into individual commits.	2025-06-24 13:07:31 +01:00
Fabian Mora	8f4da2cbf0	[mlir][affine] Fix min simplification in makeComposedAffineApply (#145376 ) This patch fixes a bug discovered in the `affine::makeComposedFoldedAffineApply` function when `composeAffineMin == true`. The bug happened because the simplification assumed the symbols appearing in the `affine.apply` op corresponded to symbols in the `affine.min` op, and that's not always the case. For example: ```mlir #map = affine_map<()[s0, s1] -> (s1)> #map1 = affine_map<()[s0, s1] -> (s0 ceildiv s1)> module { func.func @min_max_full_simplify() -> index { %0 = test.value_with_bounds {max = 64 : index, min = 32 : index} %1 = test.value_with_bounds {max = 64 : index, min = 32 : index} %2 = affine.min #map()[%0, %1] %3 = affine.apply #map1()[%2, %0] return %3 : index } } ``` This patch also introduces the test `make_composed_folded_affine_apply` transform operation to test this simplification. It also adds tests ensuring we get correct behavior. --------- Co-authored-by: Nicolas Vasilache <nico.vasilache@amd.com>	2025-06-24 07:55:12 -04:00
Orlando Cazalet-Hyams	1dc46d45fc	[RemoveDIs] Fix rotten --implicit-check-not lines (#144711 )	2025-06-24 12:32:50 +01:00
Orlando Cazalet-Hyams	ddecfa696c	[KeyInstr][Clang] Atomic ops atoms (#141624 ) This patch is part of a stack that teaches Clang to generate Key Instructions metadata for C and C++. The feature is only functional in LLVM if LLVM is built with CMake flag LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed. RFC: https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668	2025-06-24 12:20:44 +01:00
David Spickett	fa5d7c926f	[lldb][lldb-dap] Fix runInTerminal test program on Windows	2025-06-24 11:07:45 +00:00
Florian Hahn	b8769104f1	[LAA] Address follow-up suggestions for #128061 . Adjust naming and add argument comments as suggested.	2025-06-24 12:00:17 +01:00
yronglin	e8976e92f6	[clang][Preprocessor] Add peekNextPPToken, makes look ahead next token without side-effects (#143898 ) This PR introduces a new function `peekNextPPToken`. It is an extension of `isNextPPTokenLParen` and can look ahead one token in the preprocessor without side effects. It is also the first part of https://github.com/llvm/llvm-project/pull/107168, where it is used to look ahead at the next token and determine whether the pp directive currently being lexed is a pp-import or pp-module directive. At the start of phase 4 an import or module token is treated as starting a directive and are converted to their respective keywords iff: - After skipping horizontal whitespace are - at the start of a logical line, or - preceded by an export at the start of the logical line. - Are followed by an identifier pp token (before macro expansion), or - <, ", or : (but not ::) pp tokens for import, or - ; for module Otherwise the token is treated as an identifier. --------- Signed-off-by: yronglin <yronglin777@gmail.com>	2025-06-24 18:55:21 +08:00
Pavel Labath	4d2b79b04a	[lldb] Fix build for #145017 Mid-flight collision with #145293.	2025-06-24 12:45:44 +02:00
Chris Jackson	bfde147761	[NFC][AMDGPU] Update and.ll test and automate check line generation (#145371 ) - Convert the test to use update_llc_test_checks.py. - Use different check prefixes for the different -mcpu options. - Remove unused xnack 'off' flag.	2025-06-24 11:42:49 +01:00
Pavel Labath	24438aa488	[lldb] Use Socket::CreatePair for launching debugserver (#145017 ) This lets us get rid of platform-specific code in ProcessGDBRemote and use the same code path (modulo differences in socket types) everywhere. It also unlocks further cleanups in the debugserver launching code. The main effect of this change is that lldb on windows will now use the `--fd` lldb-server argument for "local remote" debug sessions instead of having lldb-server connect back to lldb. This is the same method used by lldb on non-windows platforms (for many years) and "lldb-server platform" on windows for truly remote debug sessions (for ~one year). Depends on #145015.	2025-06-24 12:39:24 +02:00
Michael Buch	371f12f96d	Revert "[lldb] Add count for number of DWO files loaded in statistics" (#145494 ) Reverts llvm/llvm-project#144424 Caused CI failures. macOS CI failure was: ``` 10:20:36 FAIL: test_dwp_dwo_file_count (TestStats.TestCase) 10:20:36 Test "statistics dump" and the loaded dwo file count. 10:20:36 ---------------------------------------------------------------------- 10:20:36 Traceback (most recent call last): 10:20:36 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 639, in test_dwp_dwo_file_count 10:20:36 self.assertEqual(debug_stats["totalDwoFileCount"], 2) 10:20:36 AssertionError: 0 != 2 10:20:36 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang 10:20:36 ====================================================================== 10:20:36 FAIL: test_no_debug_names_eager_loads_dwo_files (TestStats.TestCase) 10:20:36 Test the eager loading behavior of DWO files when debug_names is absent by 10:20:36 ---------------------------------------------------------------------- 10:20:36 Traceback (most recent call last): 10:20:36 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 566, in test_no_debug_names_eager_loads_dwo_files 10:20:36 self.assertEqual(debug_stats["totalDwoFileCount"], 2) 10:20:36 AssertionError: 0 != 2 10:20:36 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang 10:20:36 ====================================================================== 10:20:36 FAIL: test_split_dwarf_dwo_file_count (TestStats.TestCase) 10:20:36 Test "statistics dump" and the dwo file count. 
10:20:36 ---------------------------------------------------------------------- 10:20:36 Traceback (most recent call last): 10:20:36 File "/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/llvm-project/lldb/test/API/commands/statistics/basic/TestStats.py", line 588, in test_split_dwarf_dwo_file_count 10:20:36 self.assertEqual(len(debug_stats["modules"]), 1) 10:20:36 AssertionError: 42 != 1 10:20:36 Config=arm64-/Users/ec2-user/jenkins/workspace/llvm.org/as-lldb-cmake/lldb-build/bin/clang ```	2025-06-24 11:33:00 +01:00
Kerry McLaughlin	61b99ca512	[AArch64] Consider StreamingSVE in shouldExpandGetActiveLaneMask (#144722 ) If StreamingSVE is available, we may be able to lower the intrinsic to the GET_ACTIVE_LANE_MASK node instead of expanding it. Also adds the node to addTypeForFixedLengthSVE to ensure we lower to the SVE instruction when useSVEForFixedLengthVectors is true.	2025-06-24 11:08:48 +01:00
David Truby	049d61ad65	[flang][AArch64] Always link compiler-rt to flang after libgcc (#144710 ) This patch fixes an issue where the __trampoline_setup symbol is missing with some programs compiled with flang. This symbol is present only in compiler-rt and not in libgcc. This patch adds compiler-rt to the link line after libgcc if libgcc is being used, so that only this symbol will be picked from compiler-rt. Fixes #141147	2025-06-24 11:08:13 +01:00
Simon Pilgrim	594ebe6340	[X86] combineSelect - move vselect(cond, pshufb(x), pshufb(y)) -> or(pshufb(x), pshufb(y)) fold (#145475 ) Move the OR(PSHUFB(),PSHUFB()) fold to reuse an existing createShuffleMaskFromVSELECT result and ensure it is performed before the combineX86ShufflesRecursively combine to prevent some hasOneUse failures noticed in #133947 (combineX86ShufflesRecursively still unnecessarily widens vectors in several locations).	2025-06-24 10:50:29 +01:00
Diana Picus	54b522f6fd	[AMDGPU] Fixup a201f8872a63 (#145486 ) Fix test lines based on old revision for main.	2025-06-24 11:43:28 +02:00
Benjamin Kramer	e4b9aa6192	[bazel] Port d31ba5256327d30f264c2f671bf197877b242cde	2025-06-24 11:37:50 +02:00
Benjamin Kramer	45c5eb168f	[bazel] mlir_copts doesn't exist	2025-06-24 11:31:55 +02:00
Pavel Labath	cf9546b826	[lldb] Remove GDBRemoteCommunication::ConnectLocally (#145293 ) Originally added for reproducers, it is now only used for test code. While we could make it a test helper, I think that after #145015 it is simple enough to not be needed. Also squeeze in a change to make ConnectionFileDescriptor accept a unique_ptr<Socket>.	2025-06-24 11:11:35 +02:00
Pavel Labath	46e1e9f104	Reapply "[lldb/cmake] Plugin layering enforcement mechanism (#144543 )" (#145305 ) The only difference from the original PR are the added BRIEF and FULL_DOCS arguments to define_property, which are required for cmake<3.23.	2025-06-24 11:10:35 +02:00
Diana Picus	a201f8872a	[AMDGPU] Replace dynamic VGPR feature with attribute (#133444 ) Use a function attribute (amdgpu-dynamic-vgpr) instead of a subtarget feature, as requested in #130030.	2025-06-24 11:09:36 +02:00
Lang Hames	6cfa03f1f1	[ORC] Drop unused LinkGraphLinkingLayer::Plugin::notifyLoaded method. (#145457 ) This method was included in the original Plugin API as a counterpart to JITEventListener::notifyLoaded but was never used.	2025-06-24 19:00:24 +10:00
antoine moynault	5fa55b2dfc	Revert "[flang][OpenMP] Skip runtime mapping with no offload targets (#144534 )" (#145478 ) And also revert 6ba1955 "[flang][OpenMP] Fix ignore-target-data.f90 test" As it causes several bot failures https://github.com/llvm/llvm-project/pull/144534#issuecomment-2995303224	2025-06-24 10:51:26 +02:00
Matt Arsenault	73e4f8a71f	ARM: Use member initializer list (#145459 )	2025-06-24 17:47:34 +09:00
Kazu Hirata	8d9911e4a0	[Option] Use a range-based for loop (NFC) (#145446 )	2025-06-24 00:46:17 -07:00
Aviad Cohen	d5c8024dae	[mlir][bazel]: Add FuncUtil rule in bazel files (#145463 )	2025-06-24 10:40:57 +03:00
Nikita Popov	0112f12eb6	[EarlyCSE] Remove void return restriction for call CSE (#145320 ) For readonly/readnone calls returning void we can't CSE the return value. However, making these participate in CSE is still useful, because it allows DCE of calls that are not willreturn/nounwind (something no other part of LLVM is capable of removing). The more interesting use-case is CSE for writeonly calls (not yet supported), but I figured this change makes sense independently. There is no impact on compile-time.	2025-06-24 09:20:03 +02:00
Juan Manuel Martinez Caamaño	8ec0552a7f	Reapply "[CUDA][HIP] Add a __device__ version of std::__glibcxx_assert_fail() (#144886 ) Modifications to reapply the commit: * Add noexcept only after C++11 on __glibcxx_assert_fail * Remove vararg version of __glibcxx_assert_fail	2025-06-24 09:13:13 +02:00
Kazu Hirata	f704738781	[verify-uselistorder] Use llvm::is_sorted (NFC) (#145444 ) We can pass a range to llvm::is_sorted.	2025-06-24 00:10:22 -07:00
Antonio Frighetto	1247fddf36	[SimplifyCFG] Relax `cttz` cost check in `simplifySwitchOfPowersOfTwo` We should be able to allow `simplifySwitchOfPowersOfTwo` transform to take place, as, on recent X86 targets, the weighted latency-size appears to be 2. This favours computing trailing zeroes and indexing into a smaller value table, over generating a jump table with an indirect branch, which overall should be more efficient.	2025-06-24 09:06:18 +02:00
Matthias Springer	c5972da34a	[mlir][Transforms] Dialect Conversion: Simplify block-inline handling (#145308 ) When a block is getting inlined, the destination block does not have to be legalized. That's because the signature of the destination block does not change by inlining. This commit makes the implementation consistent with this comment: ``` // If the pattern moved or created any blocks, make sure the types of block // arguments get legalized. ```	2025-06-24 08:52:13 +02:00
Fabian Ritter	3e1e368824	[AMDGPU][SDAG] Add tests for ISD::PTRADD DAG combines (#142738 ) Pre-committing tests to show improvements in a follow-up PR with the combines.	2025-06-24 08:43:54 +02:00
Feng Zou	b1dcf78378	[X86][APX] Fix issue of push2/pop2 instr with stack clash protection (#145303 ) When the -stack-clash-protection option is specified and APX push2pop2 is enabled, there will be two calls to the emitSPUpdate function, which emit two STACKALLOC_W_PROBING pseudo instructions. The pseudo instruction for push2 padding is silently ignored, which leaves the stack misaligned with respect to the 16-byte boundary and causes a GP exception at runtime. Fixed by directly emitting a "push %rax" instruction for push2 padding, instead of calling emitSPUpdate. There was a similar issue on https://reviews.llvm.org/D150033.	2025-06-24 14:22:14 +08:00
Durgadoss R	ef048471f7	[NVPTX][NFC] Rearrange the TMA-S2G intrinsics (#144903 ) This patch moves the TMA S2G intrinsics into their own set of loops. This is in preparation for adding im2colw/w128 modes support to the G2S intrinsics (but the S2G ones do not support those modes). Signed-off-by: Durgadoss R <durgadossr@nvidia.com>	2025-06-24 11:47:21 +05:30

542054 Commits