llvm-project

Author	SHA1	Message	Date
Sergei Barannikov	b7d6f484c8	[RISCV] Remove non-existent operand of nds.vfwcvt/nds.vfncvt instructions (#153865 ) Mask operand is likely a copy-paste error, they don't have one.	2025-08-16 00:46:19 +03:00
Peter Collingbourne	6beb6f34bc	dfsan: Fix test with gcc 15. With gcc 15 we end up emitting a reference to the std::__glibcxx_assert_fail function because of this change: `361d230fd7` combined with assertion checks in the std::atomic implementation. This reference is undefined with dfsan causing the test to fail. Fix it by defining the macro that disables assertions. Pull Request: https://github.com/llvm/llvm-project/pull/153873	2025-08-15 14:44:27 -07:00
Peter Collingbourne	19cfc30b33	compiler-rt: Make the tests pass on AArch64 and with page size != 4096. This makes the tests pass on my AArch64 machine with 16K pages. Not sure why some of the AArch64-specific test failures don't seem to occur on sanitizer-aarch64-linux. I could also reproduce them by running buildbot_cmake.sh on my machine. Pull Request: https://github.com/llvm/llvm-project/pull/153860	2025-08-15 14:44:27 -07:00
Haibo Jiang	21a5729b87	[BOLT] Do not use HLT as split point when build the CFG (#150963 ) For x86, the halt instruction is defined as a terminator instruction. When building the CFG, the instruction sequence following the hlt instruction is treated as an independent MBB. Since there is no jump information, the predecessor of this MBB cannot be identified, and it is considered an unreachable MBB that will be removed. With this fix, the instruction sequences before and after hlt are no longer placed in different blocks.	2025-08-15 14:35:13 -07:00
Aiden Grossman	d0b19cf792	[Github][CI] Set CC and CXX in CI Container We set these explicitly in a bunch of places. That is annoying and it is nice to get them picked up by default rather than needing to remember.	2025-08-15 21:31:17 +00:00
Stanislav Mekhanoshin	1f25c4883e	[AMDGPU] Mitigate DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 bug (#153872 ) DS_ATOMIC_ASYNC_BARRIER_ARRIVE_B64 shall not be claused (we already do not clause DS instructions) and needs waits before and after.	2025-08-15 14:17:54 -07:00
Chenguang Wang	eecbaac5c6	[bazel] Add yaml2obj to mlir/Test/Target/BUILD.bazel (#153875 ) https://github.com/llvm/llvm-project/pull/152131 uses yaml2obj, which is not listed as a dependency of the lit tests in bazel. This is causing LLVM CI failures, e.g. [1]. [1]: https://buildkite.com/llvm-project/upstream-bazel/builds/146788/steps/canvas?sid=0198af37-f624-470f-aac1-d9e0b42fab56	2025-08-15 21:16:03 +00:00
Slava Zakharin	25285b3476	[flang] Lower EOSHIFT into hlfir.eoshift. (#153106 ) Straightforward lowering of EOSHIFT intrinsic into the new hlfir.eoshift operation.	2025-08-15 13:55:05 -07:00
Slava Zakharin	4c6afc7993	[flang] Lower hlfir.eoshift to the runtime call. (#153107 ) Straightforward lowering of hlfir.eoshift to the runtime call in LowerHLFIRIntrinsics pass.	2025-08-15 13:54:49 -07:00
Stanislav Mekhanoshin	e3154559ef	[AMDGPU] Select mul_lohi to V_MAD_NC_{I\|U}64_I32 on gfx1250 (#153851 )	2025-08-15 13:53:08 -07:00
gulfemsavrun	334e9bf2dd	Revert "RuntimeLibcalls: Generate table of libcall name lengths (#153210)" (#153864 ) This reverts commit 9a14b1d254a43dc0d4445c3ffa3d393bca007ba3. Revert "RuntimeLibcalls: Return StringRef for libcall names (#153209)" This reverts commit cb1228fbd535b8f9fe78505a15292b0ba23b17de. Revert "TableGen: Emit statically generated hash table for runtime libcalls (#150192)" This reverts commit 769a9058c8d04fc920994f6a5bbb03c8a4fbcd05. Reverted three changes because of a CMake error while building llvm-nm as reported in the following PR: https://github.com/llvm/llvm-project/pull/150192#issuecomment-3192223073	2025-08-15 13:32:27 -07:00
Matheus Izvekov	5c51a88f19	[clang] fix DependentNameType -> UnresolvedUsingType transforms (#153862 )	2025-08-15 17:21:55 -03:00
Sterling-Augustine	5b0619e79b	Move function info word into its own data structure (#153627 ) The sframe generator needs to construct this word separately from FDEs themselves, so split them into a separate data structure.	2025-08-15 13:16:34 -07:00
Slava Zakharin	95d4362521	[flang] Added hlfir.eoshift operation definition. (#153105 ) This is a basic definition of the operation corresponding to the Fortran's EOSHIFT transformational intrinsic.	2025-08-15 13:15:35 -07:00
Craig Topper	c84a43ff3b	[RISCV] Fold (sext_inreg (xor (setcc), -1), i1) -> (add (setcc), -1). (#153855 ) This improves all 3 vendor extensions that make sext_inreg i1 legal Fixes #153781.	2025-08-15 12:55:18 -07:00
Aiden Grossman	ca8ee49c1f	[MLIR] Set LLVM_LIT_ARGS in Standalone Example CMake (#152423 ) Setting LLVM_LIT_ARGS to include --quiet and then running check-mlir in a standard checkout will otherwise cause test failures here because LLVM_LIT_ARGS gets propagated into this project.	2025-08-15 12:40:32 -07:00
Augusto Noronha	c61fb5ca69	[NFC][lldb] Make C++ symbols in CPlusPlusLanguageTest.cpp valid (#153857 )	2025-08-15 19:40:24 +00:00
Alexey Bataev	b157599156	[SLP]Do not include copyable data to the same user twice If the copyable schedule data is created and the user is used several times in the user node, there is no need to count the same data for the same user several times; it only needs to be included once. Fixes #153754	2025-08-15 12:36:45 -07:00
David Green	732eb5427c	[AArch64] Replace SIMDLongThreeVectorBHSabd with SIMDLongThreeVectorBHS. (#152987 ) We just need to use a BinOpFrag to share the patterns. This also moves UABDL to where it belongs in with similar instructions, and removes some patterns that are now handled by abd nodes. This is mostly NFC except for GISel, which will catch back up when it handles abd nodes in the same way.	2025-08-15 20:35:27 +01:00
Florian Hahn	2ed727f3f6	[VPlan] Move SCEV invalidation to ::executePlan. (NFCI) Move SCEV invalidation from legacy ILV code-path directly to ::executePlan.	2025-08-15 20:32:41 +01:00
Chenguang Wang	b3e3a2090b	[bazel] Add missing test inputs inclusion on mlir/test/Target. (#153854 ) https://github.com/llvm/llvm-project/pull/152131 added a few tests that depend on `mlir/test/Target/Wasm/inputs/*`, e.g. `mlir/test/Target/Wasm/import.mlir` reads `inputs/import.yaml.wasm`. These inputs should be included as data dependency.	2025-08-15 12:32:15 -07:00
CatherineMoore	49e28d77b8	[OpenMP] Update ompdModule.c printf to match argument type (#152785 ) Update printf format string to match argument list --------- Co-authored-by: Joachim <protze@rz.rwth-aachen.de> Co-authored-by: Joachim Jenke <jenke@itc.rwth-aachen.de>	2025-08-15 14:30:47 -05:00
Augusto Noronha	c6ea7d72d1	[lldb] Fix CXX's SymbolNameFitsToLanguage matching other languages (#153685 ) The current implementation of CPlusPlusLanguage::SymbolNameFitsToLanguage will return true if the symbol is mangled for any language that lldb knows about.	2025-08-15 12:30:21 -07:00
Bill Wendling	139bde2035	[llvm] Ignore coding assistant artifacts (#153853 ) Now that "vibe coding" is a thing, ignore the documentation artifacts that coding assistants, like Claude and Gemini, use to retain coding workflows and other metadata.	2025-08-15 12:27:54 -07:00
Alexey Bataev	09f5b9ab0a	Revert "[SLP]Do not include copyable data to the same user twice" This reverts commit 758c6852c3ffe6b5e259cafadd811e60d8c276fb to fix buildbot https://lab.llvm.org/buildbot/#/builders/195/builds/13298	2025-08-15 12:08:31 -07:00
Jasmine Tang	d7a29e5d56	[WebAssembly] Reapply #149461 with correct CondCode in combine of SETCC (#153703 ) This PR reapplies https://github.com/llvm/llvm-project/pull/149461 In the original `combineVectorSizedSetCCEquality`, the result of setcc is being negated by returning setcc with the same cond code, leading to wrong logic. For example, with ```llvm %cmp_16 = call i32 @memcmp(ptr %a, ptr %b, i32 16) %res = icmp eq i32 %cmp_16, 0 ``` the original PR produces all_true and then also compares the result for equality with 0 (using the same SETEQ in the returned setcc), which is semantically an icmp ne. Instead, the PR should have used SETNE in the returned setcc: all_true returns 1, which is then compared ne 0, which is equivalent to icmp eq.	2025-08-15 12:06:47 -07:00
Abhinav Gaba	79cf877627	[Offload] Introduce dataFence plugin interface. (#153793 ) The purpose of this fence is to ensure that any `dataSubmit`s inserted into a queue before a `dataFence` finish before any `dataSubmit`s inserted after it begin. This is a no-op for most queues, since they are in-order, and by design any operations inserted into them occur in order. But the interface is supposed to be functional for out-of-order queues. The addition of the interface means that any operations that rely on such ordering (like ATTACH map-type support in #149036) can invoke it, without worrying about whether the underlying queue is in-order or out-of-order. Once a plugin supports out-of-order queues, the plugin can implement this function, without requiring any change at the libomptarget level. --------- Co-authored-by: Alex Duran <alejandro.duran@intel.com>	2025-08-15 11:49:35 -07:00
zGoldthorpe	82caa251d4	[InstCombine] Fold integer unpack/repack patterns through ZExt (#153583 ) This patch explicitly enables the InstCombiner to fold integer unpack/repack patterns such as ```llvm define i64 @src_combine(i32 %lower, i32 %upper) { %base = zext i32 %lower to i64 %u.0 = and i32 %upper, u0xff %z.0 = zext i32 %u.0 to i64 %s.0 = shl i64 %z.0, 32 %o.0 = or i64 %base, %s.0 %r.1 = lshr i32 %upper, 8 %u.1 = and i32 %r.1, u0xff %z.1 = zext i32 %u.1 to i64 %s.1 = shl i64 %z.1, 40 %o.1 = or i64 %o.0, %s.1 %r.2 = lshr i32 %upper, 16 %u.2 = and i32 %r.2, u0xff %z.2 = zext i32 %u.2 to i64 %s.2 = shl i64 %z.2, 48 %o.2 = or i64 %o.1, %s.2 %r.3 = lshr i32 %upper, 24 %u.3 = and i32 %r.3, u0xff %z.3 = zext i32 %u.3 to i64 %s.3 = shl i64 %z.3, 56 %o.3 = or i64 %o.2, %s.3 ret i64 %o.3 } ; => define i64 @tgt_combine(i32 %lower, i32 %upper) { %base = zext i32 %lower to i64 %upper.zext = zext i32 %upper to i64 %s.0 = shl nuw i64 %upper.zext, 32 %o.3 = or disjoint i64 %s.0, %base ret i64 %o.3 } ``` Alive2 proofs: [YAy7ny](https://alive2.llvm.org/ce/z/YAy7ny)	2025-08-15 12:48:32 -06:00
Alexey Bataev	758c6852c3	[SLP]Do not include copyable data to the same user twice If the copyable schedule data is created and the user is used several times in the user node, there is no need to count the same data for the same user several times; it only needs to be included once. Fixes #153754	2025-08-15 11:47:35 -07:00
Erich Keane	dcdbd5b55d	[OpenACC][NFCI] Implement 'recipe' generation for firstprivate copy (#153622 ) The 'firstprivate' clause requires that we do a 'copy' operation, so this patch creates some AST nodes from which we can generate the copy operation, including a 'temporary' and array init. For the most part this is pretty similar to what 'private' does other than the fact that the source is copy (and not default init!), and that there is a temporary from which to copy. --------- Co-authored-by: Andy Kaylor <akaylor@nvidia.com>	2025-08-15 18:42:40 +00:00
Stanislav Mekhanoshin	29976f2e58	[AMDGPU] Handle S_GETREG_B32 hazard on gfx1250 (#153848 ) GFX1250 SPG says: S_GETREG_B32 does not wait for idle before executing. The user must S_WAIT_ALU 0 before S_GETREG_B32 on: STATUS, STATE_PRIV, EXCP_FLAG_PRIV, or EXCP_FLAG_USER.	2025-08-15 11:38:22 -07:00
XChy	3a4a60deff	[VectorCombine] Apply InstSimplify in scalarizeOpOrCmp to avoid infinite loop (#153069 ) Fixes #153012 As we tolerate unfoldable constant expressions in `scalarizeOpOrCmp`, we may fold ```llvm define void @bug(ptr %ptr1, ptr %ptr2, i64 %idx) #0 { entry: %158 = insertelement <2 x i64> <i64 5, i64 ptrtoint (ptr @val to i64)>, i64 %idx, i32 0 %159 = or disjoint <2 x i64> splat (i64 2), %158 store <2 x i64> %159, ptr %ptr2 ret void } ``` to ```llvm define void @bug(ptr %ptr1, ptr %ptr2, i64 %idx) { entry: %.scalar = or disjoint i64 2, %idx %0 = or <2 x i64> splat (i64 2), <i64 5, i64 ptrtoint (ptr @val to i64)> %1 = insertelement <2 x i64> %0, i64 %.scalar, i64 0 store <2 x i64> %1, ptr %ptr2, align 16 ret void } ``` And it would be folded back in `foldInsExtBinop`, resulting in an infinite loop. This patch forces scalarization iff InstSimplify can fold the constant expression.	2025-08-15 18:38:04 +00:00
Dave Lee	1dc0005d6d	Revert "[lldb] Fallback to expression eval when Dump of variable fails in dwim-print" (#153824 ) Reverts llvm/llvm-project#151374 Superseded by https://github.com/llvm/llvm-project/pull/152417	2025-08-15 11:29:31 -07:00
Stanislav Mekhanoshin	5d28284dbb	[AMDGPU] gfx1250 does not need nop before VGPR dealloc (#153844 ) This has no impact as the dealloc is now practically disabled.	2025-08-15 11:29:02 -07:00
Valentin Clement (バレンタインクレメン)	3720d8b52d	[flang][cuda] Update some bind name to fast version and add __sincosf (#153744 ) Use the fast version in the bind name and reorder these fast math functions. Add missing __sincosf interface.	2025-08-15 11:07:15 -07:00
Aaron Ballman	ed6d505fab	[C][Docs] Add backported language features (#153837 ) We've backported a lot more features from C to previous C standards than we were documenting. I took a pass over the c_status page for Clang and pulled more entries to add to our documentation.	2025-08-15 13:59:41 -04:00
Kaitlin Peng	0bb1af478a	[DirectX] Add GlobalDCE pass after finalize linkage pass in DirectX backend (#151071 ) Fixes #139023. This PR essentially removes unused global variables: - Restores the `GlobalDCE` Legacy pass and adds it to the DirectX backend after the finalize linkage pass - Converts external global variables with no usage to internal linkage in the finalize linkage pass - (so they can be removed by `GlobalDCE`) - Makes the `dxil-finalize-linkage` pass usable using the new pass manager flag syntax - Adds tests to `finalize_linkage.ll` that make sure unused global variables are removed - Adds a use for variable `@CBV` in `opaque-value_as_metadata.ll` so it isn't removed - Changes the `scalar-data.ll` run command to avoid removing its global variables --------- Co-authored-by: Farzon Lotfi <farzonlotfi@microsoft.com>	2025-08-15 10:45:34 -07:00
Aiden Grossman	069f8121e0	[X86] Add RCU for Skylake Models (#153832 ) We cannot actually retire an infinite number of uops per cycle. This patch adds a RCU to the skylake scheduling model to fix this. I'm purposefully using a loose upper bound here. We're unlikely to actually get four fused uops per cycle, but this is better than not setting anything. Most realistic code I've put through uiCA will retire up to ~6 uops per cycle. Information taken from https://en.wikichip.org/wiki/intel/microarchitectures/skylake_(client). This requires modification of the two zero idiom tests because we do not currently model the CPU frontend which would likely be the actual bottleneck in that case. Related to #153747.	2025-08-15 10:33:26 -07:00
Valentin Clement (バレンタインクレメン)	115f816069	[flang][cuda] Add missing bind name for __int2double_rn (#153720 )	2025-08-15 10:27:19 -07:00
Valentin Clement (バレンタインクレメン)	0e4af726cb	[flang][cuda] Add interface for __fdividef (#153742 )	2025-08-15 10:26:40 -07:00
Valentin Clement (バレンタインクレメン)	0e8c964c21	[flang][cuda] Add interfaces for double_as_longlong and longlong_as_double (#153719 )	2025-08-15 17:26:11 +00:00
Alex MacLean	bc77363235	[NVPTX] Do not mark move of global address as cheap enabling more CSE (#153730 )	2025-08-15 10:17:34 -07:00
Valentin Clement (バレンタインクレメン)	fd3f052aeb	[flang][cuda] Add interfaces for int_as_float and float_as_int (#153716 )	2025-08-15 10:00:53 -07:00
Simon Pilgrim	92cb0414ca	[X86] avx512vnni-builtins.c / avx512vlvnni-builtins.c - add C/C++ and 32/64-bit test coverage	2025-08-15 17:55:33 +01:00
asraa	b045729eb4	[mlir][presburger] add functionality to compute local mod in IntegerRelation (#153614 ) Similar to `IntegerRelation::addLocalFloorDiv`, this adds a utility `IntegerRelation::addLocalModulo` that adds and returns a local variable equal to the remainder of an affine function of the variables modulo some constant modulus. The function returns the absolute index of the new var in the relation. This is computed by first finding the floordiv of `exprs // modulus = q` and then computing the remainder `result = exprs - q * modulus`. Signed-off-by: Asra Ali <asraa@google.com>	2025-08-15 09:55:13 -07:00
zGoldthorpe	a8d25683ee	[PatternMatch] Allow `m_ConstantInt` to match integer splats (#153692 ) When matching integers, `m_ConstantInt` is a convenient alternative to `m_APInt` for matching unsigned 64-bit integers, allowing one to simplify ```cpp const APInt *IntC; if (match(V, m_APInt(IntC))) { if (IntC->ule(UINT64_MAX)) { uint64_t Int = IntC->getZExtValue(); // ... } } ``` to ```cpp uint64_t Int; if (match(V, m_ConstantInt(Int))) { // ... } ``` However, this simplification is only true if `V` is a scalar type. Specifically, `m_APInt` also matches integer splats, but `m_ConstantInt` does not. This patch ensures that the matching behaviour of `m_ConstantInt` parallels that of `m_APInt`, and also incorporates it in some obvious places.	2025-08-15 10:43:54 -06:00
keinflue	af96ed6bf6	[clang] Inject IndirectFieldDecl even if name conflicts. (#153140 ) This modifies InjectAnonymousStructOrUnionMembers to inject an IndirectFieldDecl and mark it invalid even if its name conflicts with another name in the scope. This resolves a crash on a further diagnostic diag::err_multiple_mem_union_initialization which via findDefaultInitializer relies on these declarations being present. Fixes #149985	2025-08-15 09:43:29 -07:00
Simon Pilgrim	2c20a9bfb3	[X86] avx512bf16-builtins.c / avx512vlbf16-builtins.c - add C/C++ and 32/64-bit test coverage	2025-08-15 17:38:43 +01:00
CatherineMoore	3a8f579a23	[OpenMP] Update printf statement with missing argument. (#153704 )	2025-08-15 16:34:00 +00:00
Valentin Clement (バレンタインクレメン)	583499a8cf	[flang][cuda] Add missing bind name for __hiloint2double, __double2loint and __double2hiint (#153713 )	2025-08-15 09:32:59 -07:00
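
The [InstCombine] entry above (82caa251d4) claims that repacking the four bytes of `%upper` one at a time is equivalent to a single zero-extend and shift. As a rough illustration only, the following Python sketch models both IR functions from that commit message on plain integers and spot-checks that they agree. The Python function names and the random-sampling helper are hypothetical and not part of LLVM; the Alive2 proof linked in the commit is the actual verification.

```python
import random

MASK32 = 0xFFFFFFFF  # models an i32 value held in a Python int


def src_combine(lower: int, upper: int) -> int:
    """Byte-by-byte repack, mirroring @src_combine in the commit message."""
    result = lower & MASK32                   # %base = zext i32 %lower to i64
    for i in range(4):
        byte = (upper >> (8 * i)) & 0xFF      # lshr by 8*i, then and with 0xff
        result |= byte << (32 + 8 * i)        # zext, shl by 32 + 8*i, or
    return result


def tgt_combine(lower: int, upper: int) -> int:
    """Folded form, mirroring @tgt_combine: zext %upper, shl 32, or with %base."""
    return ((upper & MASK32) << 32) | (lower & MASK32)


def agree_on_random_inputs(trials: int = 10_000, seed: int = 0) -> bool:
    """Spot-check (not a proof) that both forms match on random i32 pairs."""
    rng = random.Random(seed)
    for _ in range(trials):
        lower = rng.getrandbits(32)
        upper = rng.getrandbits(32)
        if src_combine(lower, upper) != tgt_combine(lower, upper):
            return False
    return True
```

Calling `agree_on_random_inputs()` compares the two forms on sampled inputs. The low 32 bits come only from `%lower` and the high 32 bits only from `%upper`, so the two operands of the final `or` share no set bits, which is consistent with the `or disjoint` flag in the folded IR.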


548981 Commits