llvm-project

Author	SHA1	Message	Date
Kazu Hirata	035dd1d854	[ADT] Fix a warning This patch fixes: third-party/unittest/googletest/include/gtest/gtest.h:1379:11: error: comparison of integers of different signs: 'const unsigned long' and 'const int' [-Werror,-Wsign-compare]	2025-08-21 08:49:56 -07:00
Chaitanya Koparkar	ad63a70d6d	[ADT] Add fshl/fshr operations to APInt (#153790 ) These operations are required for #153151.	2025-08-21 12:52:18 +01:00
Steven Wu	deab049b5c	[CAS] Add ActionCache to LLVMCAS Library (#114097 ) ActionCache is used to store a mapping from CASID to CASID. The current implementation of the ActionCache can only be used to associate the key/value from the same hash context. ActionCache has two operations: `put` to store the key/value and `get` to look up the key/value mapping. ActionCache uses the same TrieRawHashMap data structure to store the mapping, where the CASID of the key is the hash used to index the map. While CASIDs for key/value are often associated with an actual CAS ObjectStore, it doesn't guarantee that such an object exists in any ObjectStore.	2025-08-20 14:42:44 -07:00
David Majnemer	0a7eabcc56	Reapply "[APFloat] Fix getExactInverse for DoubleAPFloat" The previous implementation of getExactInverse used the following check to identify powers of two: // Check that the number is a power of two by making sure that only the // integer bit is set in the significand. if (significandLSB() != semantics->precision - 1) return false; This condition verifies that the only set bit in the significand is the integer bit, which is correct for normal numbers. However, this logic is not correct for subnormal values. APFloat represents subnormal numbers by shifting the significand right while holding the exponent at its minimum value. For a power of two in the subnormal range, its single set bit will therefore be at a position lower than precision - 1. The original check would consequently fail, causing the function to determine that these numbers do not have an exact multiplicative inverse. The new logic calculated this correctly but it seems that test/CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll expected the old behavior. Seeing as how getExactInverse does not have tests or documentation, we conservatively maintain (and document) this behavior. This reverts commit 47e62e846beb267aad50eb9195dfd855e160483e.	2025-08-20 14:02:36 -07:00
David Green	4875553f4c	[AArch64][GlobalISel] Port unmerge KnownBits tests to print<gisel-value-tracking>. NFC This takes the known-bits tests added in #112172 and ports them over to be a new print<gisel-value-tracking> test.	2025-08-20 20:57:14 +01:00
David Tenty	63195d3d7a	[NFC][CMake] quote ${CMAKE_SYSTEM_NAME} consistently (#154537 ) A CMake change included in CMake 4.0 makes `AIX` into a variable (similar to `APPLE`, etc.) `ff03db6657` However, `${CMAKE_SYSTEM_NAME}` unfortunately also expands exactly to `AIX` and `if` auto-expands variable names in CMake. That means you get a double expansion if you write: `if (${CMAKE_SYSTEM_NAME} MATCHES "AIX")` which becomes: `if (AIX MATCHES "AIX")` which is as if you wrote: `if (ON MATCHES "AIX")` You can prevent this by quoting the expansion of "${CMAKE_SYSTEM_NAME}", due to policy [CMP0054](https://cmake.org/cmake/help/latest/policy/CMP0054.html#policy:CMP0054) which is on by default in 4.0+. Most of the LLVM CMake already does this, but this PR fixes the remaining cases where we do not.	2025-08-20 12:45:41 -04:00
Steven Wu	2cfba9678d	[FileSystem] Allow exclusive file lock (#114098 ) Add a parameter to the file lock API to allow an exclusive file lock. Both Unix and Windows support locking a file exclusively for writing by one process, and LLVM OnDiskCAS uses an exclusive file lock to coordinate CAS creation.	2025-08-20 08:32:18 -07:00
Mehdi Amini	8b2028ced6	Update log_level for LLVM_DEBUG and associated macros (#154525 ) During the review of #150855 we switched from 0 to 1 for the default log level used, but this macro wasn't updated.	2025-08-20 13:31:13 +00:00
Zhaoxuan Jiang	2738828c0e	[Reland] [CGData] Lazy loading support for stable function map (#154491 ) This is an attempt to reland #151660 by including a missing STL header found by a buildbot failure. The stable function map could be huge for a large application. Fully loading it is slow and consumes a significant amount of memory, which is unnecessary and drastically slows down compilation especially for non-LTO and distributed-ThinLTO setups. This patch introduces an opt-in lazy loading support for the stable function map. The detailed changes are: - `StableFunctionMap` - The map now stores entries in an `EntryStorage` struct, which includes offsets for serialized entries and a `std::once_flag` for thread-safe lazy loading. - The underlying map type is changed from `DenseMap` to `std::unordered_map` for compatibility with `std::once_flag`. - `contains()`, `size()` and `at()` are implemented to only load requested entries on demand. - Lazy Loading Mechanism - When reading indexed codegen data, if the newly-introduced `-indexed-codegen-data-lazy-loading` flag is set, the stable function map is not fully deserialized up front. The binary format for the stable function map now includes offsets and sizes to support lazy loading. - The safety of lazy loading is guarded by the once flag per function hash. This guarantees that even in a multi-threaded environment, the deserialization for a given function hash will happen exactly once. The first thread to request it performs the load, and subsequent threads will wait for it to complete before using the data. For single-threaded builds, the overhead is negligible (a single check on the once flag). For multi-threaded scenarios, users can omit the flag to retain the previous eager-loading behavior.	2025-08-20 06:15:04 -07:00
Qihan Cai	5f0515debd	[RISCV] Support Remaining P Extension Instructions for RV32/64 (#150379 ) This patch implements pages 15-17 from jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf Documentation: jhauser.us/RISCV/ext-P/RVP-baseInstrs-014.pdf jhauser.us/RISCV/ext-P/RVP-instrEncodings-015.pdf	2025-08-20 22:54:07 +10:00
Steven Wu	30c5c48d87	[CAS][Tests] Fix unit tests that hang on two cores (#154151 )	2025-08-19 08:21:34 -07:00
Benjamin Maxwell	eb764040bc	[AArch64][SME] Implement the SME ABI (ZA state management) in Machine IR (#149062 ) ## Short Summary This patch adds a new pass `aarch64-machine-sme-abi` to handle the ABI for ZA state (e.g., lazy saves and agnostic ZA functions). This is currently not enabled by default (but aims to be by LLVM 22). The goal is for this new pass to more optimally place ZA saves/restores and to work with exception handling. ## Long Description This patch reimplements management of ZA state for functions with private and shared ZA state. Agnostic ZA functions will be handled in a later patch. For now, this is under the flag `-aarch64-new-sme-abi`; however, we intend for this to replace the current SelectionDAG implementation once complete. The approach taken here is to mark instructions as needing ZA to be in a specific state ("ACTIVE" or "LOCAL_SAVED"). Machine instructions implicitly defining or using ZA registers (such as $zt0 or $zab0) require the "ACTIVE" state. Function calls may need the "LOCAL_SAVED" or "ACTIVE" state depending on the callee (having shared or private ZA). We already add ZA register uses/definitions to machine instructions, so no extra work is needed to mark these. Calls need to be marked by glueing AArch64ISD::INOUT_ZA_USE or AArch64ISD::REQUIRES_ZA_SAVE to the CALLSEQ_START. These markers are then used by the MachineSMEABIPass to find instructions where there is a transition between required ZA states. These are the points where we need to insert code to set up or restore a ZA save (or initialize ZA). To handle control flow between blocks (which may have different ZA state requirements), we bundle the incoming and outgoing edges of blocks. Bundles are formed by assigning each block an incoming and outgoing bundle (initially, all blocks have their own two bundles). Bundles are then combined by joining the outgoing bundle of a block with the incoming bundle of all successors. 
These bundles are then assigned a ZA state based on the blocks that participate in the bundle. Blocks whose incoming edges are in a bundle "vote" for a ZA state that matches the state required at the first instruction in the block, and likewise, blocks whose outgoing edges are in a bundle vote for the ZA state that matches the last instruction in the block. The ZA state with the most votes is used, which aims to minimize the number of state transitions.	2025-08-19 10:00:28 +01:00
Mehdi Amini	89abccc9a6	[MLIR] Update GreedyRewriter to use the LDBG() debug log mechanism (NFC) (#153961 ) Also improve the LDBG() implementation a bit.	2025-08-18 21:05:34 +00:00
Damyan Pepper	cc49f3b3e1	[NFC][HLSL] Remove confusing enum aliases / duplicates (#153909 ) Remove: * DescriptorType enum - this almost exactly shadowed the ResourceClass enum * ClauseType aliased ResourceClass Although these were introduced to make the HLSL root signature handling code a bit cleaner, they were ultimately causing confusion as they appeared to be unique enums that needed to be converted between each other. Closes #153890	2025-08-18 08:58:33 -07:00
Benjamin Maxwell	81c06d198e	Reland "[AArch64][SME] Port all SME routines to RuntimeLibcalls" (#153417 ) This updates everywhere we emit/check an SME routine to use RuntimeLibcalls to get the function name and calling convention.	2025-08-18 14:53:40 +01:00
Nikita Popov	ba45ac61b6	[CAS] Temporarily disable broken test This test hangs forever if executed with fewer than three cores available, see: https://github.com/llvm/llvm-project/pull/114096#issuecomment-3196698403	2025-08-18 15:09:08 +02:00
林克	6842cc5562	[RISCV] Add SpacemiT XSMTVDot (SpacemiT Vector Dot Product) extension. (#151706 ) The full spec can be found at spacemit-x60 processor support scope: Section 2.1.2.2 (Features): https://developer.spacemit.com/documentation?token=BWbGwbx7liGW21kq9lucSA6Vnpb#2.1 This patch only supports assembler.	2025-08-18 18:03:17 +08:00
Carl Ritson	97d5d483ec	[MsgPack] Add code for floating point assignment and writes (#153544 ) Allow assignment of float to DocType and support output of float in the writeToBlob method. Expand test coverage to various missing basic I/O operations. Co-authored-by: Xavi Zhang <Xavi.Zhang@amd.com>	2025-08-18 10:03:40 +09:00
Matt Arsenault	3e5d8a1439	Reapply "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864 ) This reverts commit 334e9bf2dd01fbbfe785624c0de477b725cde6f2. Check if llvm-nm exists before building the benchmark.	2025-08-16 09:53:50 +09:00
Craig Topper	e67ec12640	[RISCV] Remove experimental from Smctr and Ssctr. (#153903 ) These extensions were ratified in November 2024.	2025-08-15 17:18:09 -07:00
gulfemsavrun	334e9bf2dd	Revert "RuntimeLibcalls: Generate table of libcall name lengths (#153… (#153864 ) …210)" This reverts commit 9a14b1d254a43dc0d4445c3ffa3d393bca007ba3. Revert "RuntimeLibcalls: Return StringRef for libcall names (#153209)" This reverts commit cb1228fbd535b8f9fe78505a15292b0ba23b17de. Revert "TableGen: Emit statically generated hash table for runtime libcalls (#150192)" This reverts commit 769a9058c8d04fc920994f6a5bbb03c8a4fbcd05. Reverted three changes because of a CMake error while building llvm-nm as reported in the following PR: https://github.com/llvm/llvm-project/pull/150192#issuecomment-3192223073	2025-08-15 13:32:27 -07:00
Sterling-Augustine	5b0619e79b	Move function info word into its own data structure (#153627 ) The sframe generator needs to construct this word separately from FDEs themselves, so split them into a separate data structure.	2025-08-15 13:16:34 -07:00
Matt Arsenault	cb1228fbd5	RuntimeLibcalls: Return StringRef for libcall names (#153209 ) Does not yet fully propagate this down into the TargetLowering uses, many of which are relying on null checks on the returned value.	2025-08-15 09:55:39 +09:00
Steven Wu	7e46f5db21	[Support] Add mapped_file_region::sync(), equivalent to msync (#153632 )	2025-08-14 17:05:33 -07:00
Matt Arsenault	769a9058c8	TableGen: Emit statically generated hash table for runtime libcalls (#150192 ) a96121089b9c94e08c6632f91f2dffc73c0ffa28 reverted a change to use a binary search on the string name table because it was too slow. This replaces it with a static string hash table based on the known set of libcall names. Microbenchmarking shows this is similarly fast to using DenseMap. It's possibly slightly slower than using StringSet, though these aren't an exact comparison. This also saves on the one-time-use construction of the map, so it could be better in practice. This search isn't a simple set check, since it does find the range of possible matches with the same name. There's also an additional check for whether the current target supports the name. The runtime constructed set doesn't require this, since it only adds the symbols live for the target. Followed algorithm from this post http://0x80.pl/notesen/2023-04-30-lookup-in-strings.html I'm also thinking the 2 special case global symbols should just be added to RuntimeLibcalls. There are also other global references emitted in the backend that aren't tracked; we probably should just use this as a centralized database for all compiler selected symbols.	2025-08-15 09:02:56 +09:00
Kyungwoo Lee	07d3a73d70	Revert "[CGData] Lazy loading support for stable function map (#151660 )" This reverts commit 76dd742f7b32e4d3acf50fab1dbbd897f215837e.	2025-08-14 16:56:54 -07:00
Zhaoxuan Jiang	76dd742f7b	[CGData] Lazy loading support for stable function map (#151660 ) The stable function map could be huge for a large application. Fully loading it is slow and consumes a significant amount of memory, which is unnecessary and drastically slows down compilation especially for non-LTO and distributed-ThinLTO setups. This patch introduces an opt-in lazy loading support for the stable function map. The detailed changes are: - `StableFunctionMap` - The map now stores entries in an `EntryStorage` struct, which includes offsets for serialized entries and a `std::once_flag` for thread-safe lazy loading. - The underlying map type is changed from `DenseMap` to `std::unordered_map` for compatibility with `std::once_flag`. - `contains()`, `size()` and `at()` are implemented to only load requested entries on demand. - Lazy Loading Mechanism - When reading indexed codegen data, if the newly-introduced `-indexed-codegen-data-lazy-loading` flag is set, the stable function map is not fully deserialized up front. The binary format for the stable function map now includes offsets and sizes to support lazy loading. - The safety of lazy loading is guarded by the once flag per function hash. This guarantees that even in a multi-threaded environment, the deserialization for a given function hash will happen exactly once. The first thread to request it performs the load, and subsequent threads will wait for it to complete before using the data. For single-threaded builds, the overhead is negligible (a single check on the once flag). For multi-threaded scenarios, users can omit the flag to retain the previous eager-loading behavior.	2025-08-14 13:49:09 -07:00
Florian Hahn	177f27d220	[VPlan] Add incoming_[blocks,values] iterators to VPPhiAccessors (NFC) (#138472 ) Add 3 new iterator ranges to VPPhiAccessors * incoming_values(): returns a range over the incoming values of a phi * incoming_blocks(): returns a range over the incoming blocks of a phi * incoming_values_and_blocks: returns a range over pairs of incoming values and blocks. Depends on https://github.com/llvm/llvm-project/pull/124838. PR: https://github.com/llvm/llvm-project/pull/138472	2025-08-14 16:47:04 +01:00
Jakub Kuderski	1633e0ba8b	[ADT] Add `from_range` constructor for (Small)DenseMap (#153515 ) This follows how we support range construction for (Small)DenseSet.	2025-08-14 08:53:52 -04:00
Lang Hames	3bc3b4cf5f	[ORC] Add cloneExternalModuleToContext API. cloneExternalModuleToContext can be used to clone an LLVM module onto a given ThreadSafeContext. Callers of this function are responsible for ensuring exclusive access to the source module and its LLVMContext.	2025-08-14 21:21:17 +10:00
Aiden Grossman	47e62e846b	Revert "[APFloat] Fix getExactInverse for DoubleAPFloat" This reverts commit f4941319cba19d7691baa6ec783c84be4d847637. This broke llvm/test/CodeGen/Thumb2/mve-vcvt-fixed-to-float.ll which took out a ton of buildbots and also broke premerge.	2025-08-14 04:39:50 +00:00
Pedro Lobo	08eff57444	[ADT] Add signed and unsigned mulExtended to APInt (#153399 ) Adds `mulsExtended` and `muluExtended` methods to `APInt`, as suggested in #153293. These are based on the `MULDQ` and `MULUDQ` x86 intrinsics.	2025-08-13 22:37:33 +01:00
David Majnemer	f4941319cb	[APFloat] Fix getExactInverse for DoubleAPFloat Some background: getExactInverse()'s callers expect that the result is not subnormal. DoubleAPFloat implemented getExactInverse() by going through semPPCDoubleDoubleLegacy. This means that numbers like 0x1p1022 which would have a normal inverse in semPPCDoubleDouble would not in semPPCDoubleDoubleLegacy. This commit refactors the logic into a single method on APFloat which uses getExactLog2Abs() and scalbn() to calculate the inverse without having to compute a reciprocal and test if it is inexact. This approach works for both IEEEFloat and DoubleAPFloat.	2025-08-13 11:39:53 -07:00
Steven Wu	7350112c40	[CAS] Disable CAS unittests that require threads (#153434 ) Disable parallel CAS tests when LLVM is configured without threads.	2025-08-13 11:11:56 -07:00
Steven Wu	3ecd331808	[CAS] Fix MSVC warning after #114096 (#153430 ) Correct use of `SCOPED_TRACE` and fix MSVC warning.	2025-08-13 09:51:05 -07:00
Orlando Cazalet-Hyams	f316009997	[RemoveDIs][NFC] Remove more dbg.assign intrinsics code paths (#153371 )	2025-08-13 16:37:04 +01:00
Nikita Popov	48beed5b71	Revert "[AArch64][SME] Port all SME routines to RuntimeLibcalls" (#153392 ) This introduced a 5% compile-time regression on AArch64, see https://llvm-compile-time-tracker.com/compare.php?from=b9138bde3562de5c28a239dbd303caf2406678c6&to=271688b87abe7cf45aceaff8266270a25eb7b436&stat=instructions:u. Reverts llvm/llvm-project#152505.	2025-08-13 11:54:39 +00:00
Benjamin Maxwell	271688b87a	[AArch64][SME] Port all SME routines to RuntimeLibcalls (#152505 ) This updates everywhere we emit/check an SME routine to use RuntimeLibcalls to get the function name and calling convention. Note: RuntimeLibcallEmitter had some issues with emitting non-unique variable names for sets of libcalls, so I tweaked the output to avoid the need for variables.	2025-08-13 08:48:59 +01:00
David Majnemer	acef1db3b2	[APFloat] Remove some overly optimistic assertions An earlier draft of DoubleAPFloat::convertToSignExtendedInteger had arranged for overflow to be handled in a different way. However, these assertions are now possible if Hi+Lo are out of range and Lo != 0. A test has been added to defend against a regression.	2025-08-12 18:32:58 -07:00
David Majnemer	f6d143fd1f	[APFloat] Properly implement frexp(DoubleAPFloat) The prior implementation did not consider that the Lo component may underflow when it undergoes scaling. This means that we need to carefully handle things like binade crossings or how to handle roundTowardZero when Hi and Lo have different signs. Particularly annoying is roundTiesToAway when Hi and Lo have different signs. It basically requires us to implement roundTiesTowardZero.	2025-08-12 17:03:27 -07:00
David Majnemer	e722ef4956	Reapply "[APFloat] Properly implement DoubleAPFloat::convertToSignExtendedInteger" This reverts commit 8b44945a9231d4d7be0858a1c5d9c13d397bc512. The compilation failure under !NDEBUG has been fixed.	2025-08-12 17:01:49 -07:00
Philip Reames	4d629f9744	[MIR] Remove std::variant from multiple save/restore point handling [nfc] (#153226 ) In review of bbde6b, I had originally proposed that we support the legacy text format. As review evolved, it became clear this had been a bad idea (too much complexity), but in order to let that patch finally move forward, I approved the change with the variant. This change undoes the variant, and updates all the tests to just use the array form.	2025-08-12 11:23:05 -07:00
Steven Wu	dda996b875	[CAS] Add LLVMCAS library with InMemoryCAS implementation (#114096 ) Add llvm::cas::ObjectStore abstraction and InMemoryCAS as an in-memory CAS object store implementation. The ObjectStore models its objects as: * Content: An array of bytes for the data to be stored. * Refs: An array of references to other objects in the ObjectStore. And each CAS Object can be identified with a unique ID/Hash. ObjectStore supports the following general actions: * Expected<ID> store(Content, ArrayRef<Ref>) * Expected<Ref> get(ID) It also introduces the following types to interact with a CAS ObjectStore: * CASID: Hash representation for a CAS Object with its context to help print/compare CASIDs. * ObjectRef: A light-weight ref for an object in the ObjectStore. It is implementation defined so it can be optimized for read/store/references depending on the implementation. * ObjectProxy: A proxy for the users of CAS to interact with the data inside CAS Object. It bundles an ObjectHandle and an ObjectStore instance.	2025-08-12 10:25:43 -07:00
Nikita Popov	9d96d01b42	[IR] Add offset stripping test with mixed const/variable offsets (NFC) Regression test for: `a7edc95c79 (commitcomment-163691175)`	2025-08-12 15:12:16 +02:00
Andreas Jonson	ca7ffaaeeb	[ConstantRange] add nuw support to truncate (NFC) (#152990 )	2025-08-12 12:26:35 +02:00
Kazu Hirata	f90ded5b94	[ADT] Reduce memory allocation in SmallPtrSet::reserve() (#153126 ) Previously, reserve() allocated double the required number of buckets. For example, for NumEntries in the range [49, 96], it would reserve 256 buckets when only 128 are needed to maintain the load factor. This patch removes "+ 1" in the NewSize calculation.	2025-08-11 22:51:32 -07:00
Helena Kotas	fb1035cfb4	[DirectX] Fix resource binding analysis incorrectly removing duplicates (#152253 ) The resource binding analysis was incorrectly reducing the size of the `Bindings` vector by one element after sorting and de-duplication. This led to an inaccurate setting of the `HasOverlappingBinding` flag in the `DXILResourceBindingInfo` analysis, as the truncated vector no longer reflected the true binding state. This update corrects the shrink logic and introduces an `assert` in the `DXILPostOptimizationValidation` pass. The assertion will trigger if `HasOverlappingBinding` is set but no corresponding error is detected, helping catch future inconsistencies. The bug surfaced when the `srv_metadata.hlsl` and `uav_metadata.hlsl` tests were updated to include unbounded resource arrays as part of https://github.com/llvm/llvm-project/issues/145422. These updated test files are included in this PR, as they would cause the new assertion to fire if the original issue remained unresolved. Depends on #152250	2025-08-11 10:53:00 -07:00
Yingwei Zheng	84b31581f8	Revert "[PatternMatch] Add `m_[Shift]OrSelf` matchers." (#152953 ) Reverts llvm/llvm-project#152924 According to `f67668b586`, it is not an NFC change.	2025-08-11 09:35:16 +02:00
Nikita Popov	35bad229c1	[PredicateInfo] Use bitcast instead of ssa.copy (#151174 ) PredicateInfo needs some no-op to which the predicate can be attached. Currently this is an ssa.copy intrinsic. This PR replaces it with a no-op bitcast. Using a bitcast is more efficient because we don't have the overhead of an overloaded intrinsic. It also makes things slightly simpler overall.	2025-08-11 09:25:01 +02:00
Yingwei Zheng	1c499351d6	[PatternMatch] Add `m_[Shift]OrSelf` matchers. (#152924 ) Address the comment https://github.com/llvm/llvm-project/pull/147414/files#r2228612726. As they are usually used to match integer packing patterns, it is enough to handle constant shamts.	2025-08-11 09:58:16 +08:00
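
A worked check of the bucket arithmetic quoted in f90ded5b94 ([ADT] Reduce memory allocation in SmallPtrSet::reserve()). This is a minimal sketch, not the LLVM source: it assumes a maximum load factor of 3/4 and rounding up to a power of two, which is consistent with the numbers in that commit message but is not stated there. The names `log2Ceil`, `scaledForLoadFactor`, `bucketsBefore`, and `bucketsAfter` are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>

// Smallest k such that 2^k >= x, for x >= 1.
static unsigned log2Ceil(uint32_t x) {
  unsigned k = 0;
  while ((uint64_t{1} << k) < x)
    ++k;
  return k;
}

// Assumption: the table may be at most 3/4 full, so the bucket count must be
// at least numEntries * 4 / 3 (computed here as n + n / 3).
static uint32_t scaledForLoadFactor(uint32_t numEntries) {
  return numEntries + numEntries / 3;
}

// Hypothetical model of the old calculation: round up to a power of two,
// then double once more (the "+ 1" the commit message says was removed).
uint32_t bucketsBefore(uint32_t numEntries) {
  return uint32_t{1} << (log2Ceil(scaledForLoadFactor(numEntries)) + 1);
}

// Hypothetical model of the new calculation: round up to a power of two only.
uint32_t bucketsAfter(uint32_t numEntries) {
  return uint32_t{1} << log2Ceil(scaledForLoadFactor(numEntries));
}
```

Under these assumptions, both ends of the quoted range [49, 96] map to 256 buckets with the extra `+ 1` and to 128 buckets without it, and 96 entries in 128 buckets is exactly 3/4 full.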

11516 Commits