This patch does the bare minimum to start setting up reduction recipe
support, including adding a type to the AST to store it. No real
additional work is done, and a number of static_asserts are left in
place so we can implement this properly in follow-ups.
The Generic_GCC::GCCInstallationDetector class picks the GCC
installation directory with the largest version number. Since the
location of the libstdc++ include directories is tied to the GCC
version, this can break C++ compilation if the libstdc++ headers for
this particular GCC version are not available. Linux distributions tend
to package the libstdc++ headers separately from GCC. This frequently
leads to situations in which a newer version of GCC gets installed as a
dependency of another package without installing the corresponding
libstdc++ package. Clang then fails to compile C++ code because it
cannot find the libstdc++ headers. Because libstdc++ headers are in fact
installed on the system and the GCC installation itself continues to work,
because the user may not be aware of the details of the GCC detection, and
because the compiler does not recognize the situation and emit a warning,
this behavior can be hard to understand, as witnessed by many related bug
reports over the years.
The goal of this work is to change the GCC detection to prefer GCC
installations that contain libstdc++ include directories over those
which do not. This should happen regardless of the input language since
picking different GCC installations for a build that mixes C and C++
might lead to incompatibilities.
Any change to the GCC installation detection will probably have a
negative impact on some users. For instance, for a C user who relies on
using the GCC installation with the largest version number, it might
become necessary to use the --gcc-install-dir option to ensure that this
GCC version is selected.
This seems like an acceptable trade-off given that the situation for
users who do not have any special demands on the particular GCC
installation directory would be improved significantly.
This patch does not yet change the automatic GCC installation directory
choice. Instead, it does introduce a warning that informs the user about
the future change if the chosen GCC installation directory differs from
the one that would be chosen if the libstdc++ headers were taken into
account.
See also this related Discourse discussion:
https://discourse.llvm.org/t/rfc-take-libstdc-into-account-during-gcc-detection/86992.
This is equivalent to `cuOccupancyMaxPotentialBlockSize`. It is currently
only implemented on CUDA; AMDGPU and Host return unsupported.
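For reference, a minimal sketch of how the corresponding CUDA driver API is typically used; the function name and signature are the CUDA driver API's, while the surrounding wrapper is illustrative only:
```c++
#include <cuda.h>

// Query a block size that maximizes occupancy for `func`, assuming no
// dynamic shared memory and no upper limit on the block size.
int queryMaxPotentialBlockSize(CUfunction func) {
  int minGridSize = 0, blockSize = 0;
  cuOccupancyMaxPotentialBlockSize(&minGridSize, &blockSize, func,
                                   /*blockSizeToDynamicSMemSize=*/nullptr,
                                   /*dynamicSMemSize=*/0,
                                   /*blockSizeLimit=*/0);
  return blockSize;
}
```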
---------
Co-authored-by: Callum Fare <callum@codeplay.com>
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
```c++
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};
```
We only have 30 instances that rely on this "redirection". Since the
redirection doesn't improve readability, this patch replaces SmallSet
with SmallPtrSet for pointer element types.
I'm planning to remove the redirection eventually.
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
```c++
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};
```
We only have 30 instances that rely on this "redirection", with about
half of them under clang/. Since the redirection doesn't improve
readability, this patch replaces SmallSet with SmallPtrSet for pointer
element types.
I'm planning to remove the redirection eventually.
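A minimal before/after illustration of the substitution; the variable and element type are hypothetical, not taken from a particular file in the patch:
```c++
#include "llvm/ADT/SmallPtrSet.h"
#include "llvm/ADT/SmallSet.h"

namespace llvm { class Instruction; }

// Before: relies on the SmallSet -> SmallPtrSet redirection for pointer keys.
llvm::SmallSet<llvm::Instruction *, 8> VisitedBefore;

// After: names the pointer set directly; behavior is unchanged.
llvm::SmallPtrSet<llvm::Instruction *, 8> VisitedAfter;
```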
The immediate evaluation context needs the lambda scope info (LSI) to
propagate some flags; however, that LSI was removed in
ActOnFinishFunctionBody, which happens before the lambda expression is
rebuilt.
This also converts the wrapper function to default arguments as a
drive-by fix.
Fixes https://github.com/llvm/llvm-project/issues/145776
This patch is the last of the 'firstprivate' clause lowering patches. It
takes the already generated 'copy' init from Sema and uses it to
generate the IR for the copy section of the recipe.
However, one thing this patch had to do was come up with a way to
hijack the decl registration in CIRGenFunction. Because these decls are
being created in a 'different' place, we need to remove the things we've
added. We could alternatively generate these 'differently', but it seems
worth a little extra effort here to avoid having to re-implement
variable initialization.
## Description
This change introduces a new canonicalization pattern for the MLIR
Vector dialect that optimizes chains of insertions. The optimization
identifies when a vector is **completely** initialized through a series
of vector.insert operations and replaces the entire chain with a
single `vector.from_elements` operation.
Please be aware that the new pattern **doesn't** work for poison vectors
where only **some** elements are set, as MLIR doesn't support partial
poison vectors for now.
**New Pattern: InsertChainFullyInitialized**
* Detects chains of vector.insert operations.
* Validates that all insertions are at static positions, and all
intermediate insertions have only one use.
* Ensures the entire vector is **completely** initialized.
* Replaces the entire chain with a
single vector.from_elements operation.
**Refactored Helper Function**
* Extracted `calculateInsertPosition` from
`foldDenseElementsAttrDestInsertOp` to avoid code duplication.
## Example
```
// Before:
%v1 = vector.insert %c10, %v0[0] : i64 into vector<2xi64>
%v2 = vector.insert %c20, %v1[1] : i64 into vector<2xi64>
// After:
%v2 = vector.from_elements %c10, %c20 : vector<2xi64>
```
It also works for multidimensional vectors.
```
// Before:
%v1 = vector.insert %cv0, %v0[0] : vector<3xi64> into vector<2x3xi64>
%v2 = vector.insert %cv1, %v1[1] : vector<3xi64> into vector<2x3xi64>
// After:
%0:3 = vector.to_elements %cv0 : vector<3xi64>
%1:3 = vector.to_elements %cv1 : vector<3xi64>
%v2 = vector.from_elements %0#0, %0#1, %0#2, %1#0, %1#1, %1#2 : vector<2x3xi64>
```
---------
Co-authored-by: Yang Bai <yangb@nvidia.com>
Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com>
The previous debug output only showed numeric IDs for origins, making it
difficult to understand what each origin represented. This change makes
the debug output more informative by showing what kind of entity each
origin refers to (declaration or expression) and additional details like
declaration names or expression class names. This improved output makes
it easier to debug and understand the lifetime safety analysis.
Add the following properties to Offload device info:
* VENDOR_ID
* NUM_COMPUTE_UNITS
* [SINGLE|DOUBLE|HALF]_FP_CONFIG
* NATIVE_VECTOR_WIDTH_[CHAR|SHORT|INT|LONG|FLOAT|DOUBLE|HALF]
* MAX_CLOCK_FREQUENCY
* MEMORY_CLOCK_RATE
* ADDRESS_BITS
* MAX_MEM_ALLOC_SIZE
* GLOBAL_MEM_SIZE
Add a bitfield option to enumerators, allowing the values to be
bit-shifted instead of incremented. Generate the per-type enums using
`foreach` to reduce code duplication.
Use macros in unit test definitions to reduce code duplication.
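A schematic C++ illustration of the difference between incremented and bit-shifted enumerator values; the enum names and members below are made up for illustration and are not the generated Offload definitions:
```c++
// Incremented: each enumerator is the previous value plus one.
enum class DeviceInfo : unsigned {
  VendorId,         // 0
  NumComputeUnits,  // 1
  AddressBits,      // 2
};

// Bitfield: each enumerator is the previous value shifted left by one,
// so values can be OR'd together into a capability mask.
enum class FpConfig : unsigned {
  Denorm = 1u << 0,         // 0x1
  InfNan = 1u << 1,         // 0x2
  RoundToNearest = 1u << 2, // 0x4
};
```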
This is the continuation of #152131
This PR adds support for parsing the global initializer and function
body, and support for decoding scalar numerical instructions and
variable related instructions.
---------
Co-authored-by: Ferdinand Lemaire <ferdinand.lemaire@woven-planet.global>
Co-authored-by: Jessica Paquette <jessica.paquette@woven-planet.global>
Co-authored-by: Luc Forget <luc.forget@woven.toyota>
Implement the P2255R2 changes to the [tuple.apply] wording for `std::make_from_tuple`.
```
Mandates: If tuple_size_v<remove_reference_t<Tuple>> is 1, then reference_constructs_from_temporary_v<T, decltype(get<0>(declval<Tuple>()))> is false.
```
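For illustration, a case that becomes ill-formed under this wording because the reference would bind to a materialized temporary; the example is illustrative and not taken from the test suite:
```c++
#include <string>
#include <tuple>

int main() {
  std::tuple<const char *> t{"hello"};
  // T = const std::string&, get<0> yields a const char*&, and
  // reference_constructs_from_temporary_v<const std::string&, const char*&>
  // is true, so this call is now ill-formed instead of silently creating a
  // dangling reference to a temporary std::string.
  // auto &&r = std::make_from_tuple<const std::string &>(t); // error
  return 0;
}
```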
Fixes #154274
---------
Signed-off-by: yronglin <yronglin777@gmail.com>
This makes the optimization in optimizeStringLength for
`strlen(gep @glob, %x)` -> `sub endof@glob, %x` a little more resilient,
and maybe a bit more correct for geps with non-array types.
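A source-level illustration of the shape of the fold; the constraints the optimizer must prove (such as the offset staying within the string) are only summarized in the comment, and this is not the pass's code:
```c++
#include <cstring>

// The only nul byte in `msg` is the terminator, so for an in-bounds offset x
// strlen(msg + x) can be folded to (5 - x), i.e. "end of the global minus
// the offset", avoiding the runtime scan.
static const char msg[] = "hello";

std::size_t tailLength(std::size_t x) {
  return std::strlen(msg + x);
}
```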
According to the instruction manual, when `vr0` is changed, the high 128
bits of `xr0` are undefined.
Using `vinsgr2vr.b/h` to insert an `i8/i16` into the low 128 bits of a
256-bit vector may therefore cause undefined behavior when the high 128 bits
are used by later instructions.
Support the following BCD format conversion builtins for PowerPC.
- `__builtin_bcdcopysign` – Conversion that returns the decimal value of
the first parameter combined with the sign code of the second parameter.
- `__builtin_bcdsetsign` – Conversion that sets the sign code of the
input parameter in packed decimal format.
> Note: This built-in function is valid only when all of the following
conditions are met:
> - `-qarch` is set to utilize POWER9 technology.
> - The `bcd.h` file is included.
## Prototypes
```c
vector unsigned char __builtin_bcdcopysign(vector unsigned char, vector unsigned char);
vector unsigned char __builtin_bcdsetsign(vector unsigned char, unsigned char);
```
## Usage Details
`__builtin_bcdsetsign`: Returns the packed decimal value of the first
parameter combined with the sign code.
The sign code is set according to the following rules:
- If the packed decimal value of the first parameter is positive, the
following rules apply:
- If the second parameter is 0, the sign code is set to 0xC.
- If the second parameter is 1, the sign code is set to 0xF.
- If the packed decimal value of the first parameter is negative, the
sign code is set to 0xD.
> Notes:
> The second parameter can only be 0 or 1.
> You can determine whether a packed decimal value is positive or
negative as follows:
> - Packed decimal values with sign codes **0xA, 0xC, 0xE, or 0xF** are
interpreted as positive.
> - Packed decimal values with sign codes **0xB or 0xD** are interpreted
as negative.
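A hypothetical usage sketch; it assumes a POWER9-capable PowerPC target with vector support enabled, and uses `__vector` so the snippet does not depend on a particular header:
```c++
typedef __vector unsigned char bcd_t;

// Copy the sign code of `sign_src` onto the digits of `digits`.
bcd_t copySign(bcd_t digits, bcd_t sign_src) {
  return __builtin_bcdcopysign(digits, sign_src);
}

// With a second argument of 0, a positive packed value gets sign code 0xC
// and a negative value gets 0xD, per the rules above.
bcd_t setPreferredSign(bcd_t value) {
  return __builtin_bcdsetsign(value, 0);
}
```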
---------
Co-authored-by: Aditi-Medhane <aditi.medhane@ibm.com>
This PR aims to resolve #152144.
In SelectionDAG::canCreateUndefOrPoison, the ISD::SCMP/UCMP cases are
added to always return false, as these nodes cannot generate poison or undef.
The `freeze-binary.ll` file now tests the SCMP/UCMP cases.
---------
Co-authored-by: Temperz87 <temperz871@gmail.com>
Co-authored-by: Simon Pilgrim <llvm-dev@redking.me.uk>
When HwModes are involved, an instruction encoding that does not belong to
any HwMode can end up duplicated multiple times. We can do better by mapping
each HwMode to the list of encoding IDs it contains (that is, by duplicating
IDs instead of encodings).
The encodings that were duplicated are still processed multiple times
(e.g., we call an expensive populateInstruction() on each instance).
This is going to be fixed in subsequent patches.
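A rough sketch of the data-structure change being described; the types are illustrative and are not the emitter's actual classes:
```c++
#include <map>
#include <vector>

struct EncodingInfo { /* decoded fields for one instruction encoding */ };

// Each encoding is stored once; HwModes refer to encodings by cheap integer
// IDs instead of holding duplicated copies of the encodings themselves.
std::vector<EncodingInfo> AllEncodings;
std::map<unsigned, std::vector<unsigned>> HwModeToEncodingIDs;
```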
## Short Summary
This patch adds a new pass `aarch64-machine-sme-abi` to handle the ABI
for ZA state (e.g., lazy saves and agnostic ZA functions). This is
currently not enabled by default (but aims to be by LLVM 22). The goal
is for this new pass to more optimally place ZA saves/restores and to
work with exception handling.
## Long Description
This patch reimplements management of ZA state for functions with
private and shared ZA state. Agnostic ZA functions will be handled in a
later patch. For now, this is under the flag `-aarch64-new-sme-abi`,
however, we intend for this to replace the current SelectionDAG
implementation once complete.
The approach taken here is to mark instructions as needing ZA to be in a
specific state ("ACTIVE" or "LOCAL_SAVED"). Machine instructions implicitly
defining or using ZA registers (such as $zt0 or $zab0) require the
"ACTIVE" state. Function calls may need the "LOCAL_SAVED" or "ACTIVE"
state depending on the callee (having shared or private ZA).
We already add ZA register uses/definitions to machine instructions, so
no extra work is needed to mark these.
Calls need to be marked by gluing AArch64ISD::INOUT_ZA_USE or
AArch64ISD::REQUIRES_ZA_SAVE to the CALLSEQ_START.
These markers are then used by the MachineSMEABIPass to find
instructions where there is a transition between required ZA states.
These are the points we need to insert code to set up or restore a ZA
save (or initialize ZA).
To handle control flow between blocks (which may have different ZA state
requirements), we bundle the incoming and outgoing edges of blocks.
Bundles are formed by assigning each block an incoming and outgoing
bundle (initially, all blocks have their own two bundles). Bundles are
then combined by joining the outgoing bundle of a block with the
incoming bundle of all successors.
These bundles are then assigned a ZA state based on the blocks that
participate in the bundle. Blocks whose incoming edges are in a bundle
"vote" for a ZA state that matches the state required at the first
instruction in the block, and likewise, blocks whose outgoing edges are
in a bundle vote for the ZA state that matches the last instruction in
the block. The ZA state with the most votes is used, which aims to
minimize the number of state transitions.
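The following is a minimal, hypothetical sketch of the bundling-and-voting idea described above; the names and data structures are illustrative and are not the pass's actual code:
```c++
#include <algorithm>
#include <map>
#include <vector>

enum class ZAState { Active, LocalSaved };

struct BlockInfo {
  int InBundle = 0, OutBundle = 0;        // bundle IDs for the block's edges
  ZAState FirstInstState, LastInstState;  // state required at entry and exit
  std::vector<int> Succs;                 // successor block indices
};

// Simple union-find used to join bundles across CFG edges.
struct Bundles {
  std::vector<int> Parent;
  int find(int X) { return Parent[X] == X ? X : Parent[X] = find(Parent[X]); }
  void join(int A, int B) { Parent[find(A)] = find(B); }
};

std::map<int, ZAState> assignBundleStates(std::vector<BlockInfo> &Blocks) {
  // Initially every block owns two bundles: one incoming, one outgoing.
  Bundles B;
  B.Parent.resize(2 * Blocks.size());
  for (int I = 0, E = (int)B.Parent.size(); I != E; ++I)
    B.Parent[I] = I;
  for (int I = 0, E = (int)Blocks.size(); I != E; ++I) {
    Blocks[I].InBundle = 2 * I;
    Blocks[I].OutBundle = 2 * I + 1;
  }
  // Join the outgoing bundle of each block with the incoming bundle of every
  // successor, so all edges meeting at the same point share one bundle.
  for (BlockInfo &BI : Blocks)
    for (int S : BI.Succs)
      B.join(BI.OutBundle, Blocks[S].InBundle);
  // Each block votes for the state its entry/exit requires; the state with
  // the most votes is used for the whole bundle, minimizing transitions.
  std::map<int, std::map<ZAState, int>> Votes;
  for (BlockInfo &BI : Blocks) {
    ++Votes[B.find(BI.InBundle)][BI.FirstInstState];
    ++Votes[B.find(BI.OutBundle)][BI.LastInstState];
  }
  std::map<int, ZAState> Result;
  for (auto &[Bundle, Counts] : Votes)
    Result[Bundle] = std::max_element(Counts.begin(), Counts.end(),
                                      [](const auto &A, const auto &C) {
                                        return A.second < C.second;
                                      })->first;
  return Result;
}
```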
SCCP can use PredicateInfo to constrain ranges based on assume and
branch conditions. Currently, this is only enabled during IPSCCP.
This enables it for SCCP as well, which runs after functions have
already been simplified, while IPSCCP runs pre-inline. To a large
degree, CVP already handles range-based optimizations, but SCCP is more
reliable for the cases it can handle. In particular, SCCP works reliably
inside loops, which is something that CVP struggles with due to LVI
cycles.
I have made various optimizations to make PredicateInfo more efficient,
but unfortunately this still has significant compile-time cost (around
0.1-0.2%).
This attempts https://github.com/llvm/llvm-project/issues/132795 again.
Last time we tried this, we didn't have enough infra capacity, so we had to
revert. According to recent communication from the Infrastructure Area
Team, we should now have enough capacity to re-enable the LLDB tests.