This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:
```
template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {};
```
We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
This reverts commit 14cd1339318b16e08c1363ec6896bd7d1e4ae281. The
buildbot failure seems to have been a cmake issue which has been
discussed in more detail in this Discourse post:
https://discourse.llvm.org/t/cmake-doesnt-regenerate-all-tablegen-target-files/87901
If any buildbots fail to select arbitrary intrinsics with this patch,
it's worth considering using clean builds with ccache instead of
incremental builds, as recommended here:
https://llvm.org/docs/HowToAddABuilder.html#:~:text=Use%20CCache%20and%20NOT%20incremental%20builds
The original commit message for this patch:
Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave
functions. This will take as its first argument the callee with the
amdgpu_gfx_whole_wave calling convention, followed by the call
parameters which must match the signature of the callee except for the
first function argument (the i1 original EXEC mask, which doesn't need
to be passed in). Indirect calls are not allowed.
Make direct calls to amdgpu_gfx_whole_wave functions a verifier error.
Tail calls are handled in a future patch.
This introduces a new `ptrtoaddr` instruction which is similar to
`ptrtoint` but has two differences:
1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance
2) `ptrtoaddr` only extracts (and then extends/truncates) the low
index-width bits of the pointer
For most architectures, difference 2) does not matter since index (address)
width and pointer representation width are the same, but this does make a
difference for architectures that have pointers that aren't just plain
integer addresses such as AMDGPU fat pointers or CHERI capabilities.
This commit introduces textual and bitcode IR support as well as basic code
generation, but optimization passes do not handle the new instruction yet
so it may result in worse code than using ptrtoint. Follow-up changes will
update capture tracking, etc. for the new instruction.
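A minimal sketch of the difference, assuming a hypothetical target where pointers in addrspace(1) have a 128-bit representation but only a 64-bit index (address) width:
```
%addr = ptrtoaddr ptr addrspace(1) %p to i64   ; yields only the low 64 address bits, without capturing provenance
%bits = ptrtoint  ptr addrspace(1) %p to i128  ; yields the full 128-bit representation and captures provenance
```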
RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54
Reviewed By: nikic
Pull Request: https://github.com/llvm/llvm-project/pull/139357
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
This removes the ability to mark only a prefix of an alloca as alive/dead.
We never used that capability, so removing it avoids the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect it).
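A minimal before/after sketch of the change (the alloca is hypothetical):
```
%buf = alloca [32 x i8]
; before: call void @llvm.lifetime.start.p0(i64 32, ptr %buf)
; after: the size is implied by the alloca
call void @llvm.lifetime.start.p0(ptr %buf)
```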
Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave
functions. This will take as its first argument the callee with the
amdgpu_gfx_whole_wave calling convention, followed by the call
parameters which must match the signature of the callee except for the
first function argument (the i1 original EXEC mask, which doesn't need
to be passed in). Indirect calls are not allowed.
Make direct calls to amdgpu_gfx_whole_wave functions a verifier error.
Unspeakable horrors happen around calls from whole wave functions; the
plan is to improve the handling of caller/callee-saved registers in
a future patch.
Tail calls are also handled in a future patch.
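A rough sketch of what a call might look like (the callee name and parameters are hypothetical, and the exact call syntax is as defined in this patch):
```
; Whole wave callee; its leading i1 %active is the original EXEC mask.
declare amdgpu_gfx_whole_wave i32 @wwf(i1 %active, i32 %x)

define void @caller(i32 %x, ptr addrspace(1) %out) {
  ; Only the remaining parameters are passed; %active is not.
  %r = call i32 (ptr, ...) @llvm.amdgcn.call.whole.wave(ptr @wwf, i32 %x)
  store i32 %r, ptr addrspace(1) %out
  ret void
}
```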
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.
It's worth noting that this does not require any conservative
assumptions: lifetimes with poison arguments can simply be skipped.
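A minimal sketch of the pattern this allows (shown with the size operand; the exact signature depends on whether the size argument has been dropped yet):
```
; After an alloca is erased and its uses are RAUW'd with poison, leftover
; markers like this are now accepted and simply skipped:
call void @llvm.lifetime.start.p0(i64 8, ptr poison)
```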
Fixes https://github.com/llvm/llvm-project/issues/151119.
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.
An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
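A rough sketch of the metadata shape (the exact operand encoding is as defined in this patch and may differ from this sketch):
```
loop.latch:
  br i1 %cont, label %loop, label %exit, !prof !0, !llvm.loop !1

!0 = !{!"branch_weights", i32 99, i32 1}
!1 = distinct !{!1, !2}
!2 = !{!"llvm.loop.estimated_trip_count", i32 100}  ; estimated from !0
; when the loop's form prevents an estimate, the metadata is still attached
; but carries no value, e.g. !{!"llvm.loop.estimated_trip_count"}
```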
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.
However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
the mere fact that this is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.
* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776).
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.
I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.
This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders dead various code that
handles lifetime markers on non-allocas. I plan to clean up that kind of
code after dropping the size argument as well.)
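A minimal sketch of what the verifier now accepts and rejects (`@get_buffer` is a hypothetical function returning an arbitrary pointer):
```
%a = alloca [8 x i8]
call void @llvm.lifetime.start.p0(i64 8, ptr %a)   ; ok: operates directly on an alloca
%p = call ptr @get_buffer()
call void @llvm.lifetime.start.p0(i64 8, ptr %p)   ; now a verifier error: not an alloca
```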
In practice, I've only found a few places that currently produce
lifetimes on non-allocas:
* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* Similarly for AddressSanitizer: it also moves allocas into separate
memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
Whole wave functions are functions that will run with a full EXEC mask.
They will not be invoked directly, but instead will be launched by way
of a new intrinsic, `llvm.amdgcn.call.whole.wave` (to be added in
a future patch). These functions are meant as an alternative to the
`llvm.amdgcn.init.whole.wave` or `llvm.amdgcn.strict.wwm` intrinsics.
Whole wave functions will set EXEC to -1 in the prologue and restore the
original value of EXEC in the epilogue. They must have a special first
argument, `i1 %active`, that is going to be mapped to EXEC. They may
have either the default calling convention or amdgpu_gfx. The inactive
lanes need to be preserved for all registers that are used; the active
lanes only need to be preserved for the CSRs.
At the IR level, arguments to a whole wave function (other than
`%active`) contain poison in their inactive lanes. Likewise, the return
value for the inactive lanes is poison.
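A minimal sketch of a whole wave function at the IR level (names and body are hypothetical):
```
define amdgpu_gfx_whole_wave i32 @wwf(i1 %active, i32 %x) {
  ; %active is true exactly in the lanes that were active at the call site;
  ; the other argument values are poison in the inactive lanes.
  %v = select i1 %active, i32 %x, i32 0
  ret i32 %v
}
```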
This patch contains the following work:
* 2 new pseudos, SI_SETUP_WHOLE_WAVE_FUNC and SI_WHOLE_WAVE_FUNC_RETURN
used for managing the EXEC mask. SI_SETUP_WHOLE_WAVE_FUNC will return
a SReg_1 representing `%active`, which needs to be passed into
SI_WHOLE_WAVE_FUNC_RETURN.
* SelectionDAG support for generating these 2 new pseudos and the
special handling of %active. Since the return may be in a different
basic block, it's difficult to add the virtual reg for %active to
SI_WHOLE_WAVE_FUNC_RETURN, so we initially generate an IMPLICIT_DEF
which is later replaced via a custom inserter.
* Expansion of the 2 pseudos during prolog/epilog insertion. PEI also
marks any used VGPRs as WWM registers, which are then spilled and
restored with the usual logic.
Future patches will include the `llvm.amdgcn.call.whole.wave` intrinsic
and a lot of optimization work (especially in order to reduce spills
around function calls).
---------
Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Co-authored-by: Shilei Tian <i@tianshilei.me>
We no longer produce debug-intrinsics, and whenever they're spotted in
bitcode or textual IR they get autoupgraded. We could quite reasonably
reject them out of hand as a construct that shouldn't be present.
However, the DXIL folks are likely to be converting records back to
intrinsics for years to come, and there's no need to make that an error.
There's no value in verifying them IMO.
The verifier check was in the wrong place, meaning it wasn't actually
checking many instructions.
Fixing that causes a test failure (coro-dwarf-key-instrs.cpp) because
coros turn off the feature but still annotate instructions with the
metadata (which is a supported situation, but the verifier doesn't like
it, and it's hard to teach the verifier to like it).
Fix that by avoiding emitting any key instruction metadata if the
DISubprogram has opted out of key instructions.
Update isDereferenceableAndAlignedPointer to make use of dereferenceable
assumptions with variable sizes via SCEV.
To do so, factor out the logic to check via an assumption to a helper,
and use SE to check if the access size is less than the dereferenceable
size.
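A minimal sketch of the kind of assumption this can now use, with a variable size (%p and %n are hypothetical; SCEV must be able to prove the access size is no larger than %n):
```
call void @llvm.assume(i1 true) [ "align"(ptr %p, i64 8), "dereferenceable"(ptr %p, i64 %n) ]
```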
PR: https://github.com/llvm/llvm-project/pull/128436
This applies the same check as llvm.vector.splice, which checks that the immediate is in the range [-VL, VL-1], where VL is the minimum vector length. If vscale_range is available, the lower bound is used to increase the known minimum vector length for this check. This ensures the immediate is in range for any possible value of vscale that satisfies the vscale_range.
This PR is part of https://discourse.llvm.org/t/rfc-profile-information-propagation-unittesting/73595
In a slight departure from the RFC, instead of a brand-new `MD_prof_unknown` kind, this adds a first operand to `MD_prof` metadata. This makes it easy to replace with valid metadata (only one `MD_prof`), otherwise sites inserting valid `MD_prof` would also have to check to remove the `unknown` one.
The patch just introduces the notion and fixes the verifier accordingly. Existing APIs that work with (esp. read) `MD_prof` will be updated subsequently.
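A rough sketch of the shape (the exact operands of the "unknown" form are as defined in this patch and its follow-ups):
```
br i1 %c, label %then, label %else, !prof !0
!0 = !{!"unknown"}                       ; profile explicitly marked unknown
!1 = !{!"branch_weights", i32 7, i32 1}  ; ordinary, valid branch weights
```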
Patch 1/4 adding bitcode support.
Store whether or not a function is using Key Instructions in its DISubprogram so
that we don't need to rely on the -mllvm flag -dwarf-use-key-instructions to
determine whether or not to interpret Key Instructions metadata to decide
is_stmt placement at DWARF emission time. This makes bitcode support simple and
enables well-defined mixing of non-key-instruction and key-instruction
functions in an LTO context.
This patch adds the bit (using DISubprogram::SubclassData1).
PRs 144104 and 144103 use it during DWARF emission. PR 44102 adds
bitcode support.
See pull request for overview of alternative attempts.
In Ada, a record type can have a non-constant size, and a field can
appear at a non-constant bit offset in a record.
To support this, this patch changes DIType to record the size and offset
using metadata, rather than plain integers. In addition to a constant
offset, both DIVariable and DIExpression are now supported here.
One thing of note in this patch is the choice of how exactly to
represent a non-constant bit offset, with the difficulty being that
DWARF 5 does not support this. DWARF 3 did have a way to support a
non-constant byte offset, combined with a constant bit offset within the
byte, but this was deprecated in DWARF 4 and removed from DWARF 5.
This patch takes a simple approach: a DWARF extension allowing the use
of an expression with DW_AT_data_bit_offset. There is a corresponding
DWARF issue, see https://dwarfstd.org/issues/250501.1.html. The main
reason for this approach is that it keeps API simplicity: just a single
value is needed, rather than having separate data describing the byte
offset and the bit within the byte.
For some reason, some of the checks for specific assume bundle elements
exit early if the check passes, meaning we don't verify other entries.
Replace the early returns with early continues.
This also requires removing some tests that are currently rejected. They will
be added back as part of https://github.com/llvm/llvm-project/pull/128436.
PR: https://github.com/llvm/llvm-project/pull/145586
Also fix the LangRef to match the implementation. This was checking
against the alloca address space size rather than the default address
space.
The check was also more permissive than the LangRef. The error
check permitted any size less than the pointer size; follow the
stricter wording of the LangRef.
This patch attaches the range attribute to the setmaxnreg
and fence.proxy.tensormap.* intrinsics. The range checks
are now handled generically in the Verifier. So, this patch
removes the per-intrinsic error-handling for range-checks
from the Verifier.
This patch also adds more coverage tests for these cases.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
This flag was used to let us incrementally introduce debug records
into LLVM, however everything is now using records. It serves no
purpose now, so delete it.
By chance, two things have prevented the autoupgrade path from being
exercised much so far:
* LLParser setting the debug-info mode to "old" on seeing intrinsics,
* The test in AutoUpgrade.cpp wanting to upgrade into a "new" debug-info
block.
In practice, this appears to mean this code path hasn't seen the various
invalid inputs that can come its way. This commit does a number of
things:
* Tolerates the various illegal inputs that can be written with
debug-intrinsics, and that must be tolerated until the Verifier runs,
* Printing illegal/null DbgRecord fields must succeed,
* Verifier errors need to localise the function/block where the error
is,
* Tests that now see debug records will print debug-record errors,
Plus a few new tests for other intrinsic-to-debug-record failure modes
I found. There are also two edge cases:
* Some of the unit tests switch back and forth between intrinsic and
record modes at will; I've deleted coverage and some assertions to
tolerate this as intrinsic support is now Gone (TM),
* In sroa-extract-bits.ll, the order of debug records flips. This is
because the autoupgrader upgrades in the opposite order to the basic
block conversion routines... which doesn't change the record order, but
_does_ change the use list order in Metadata! This should (TM) have no
consequence to the correctness of LLVM, but will change the order of
various records and the order of DWARF record output too.
I tried to reduce this patch to a smaller collection of changes, but
they're all intertwined, sorry.
Currently, GlobalObject has an "alignment" property... but it's
basically nonsense: alignment doesn't mean the same thing for variables
and functions, and it's completely meaningless for ifuncs.
This "removes" (actually marking protected) the methods from
GlobalObject, adds the relevant methods to Function and GlobalVariable,
and adjusts the code appropriately.
This should make future alignment-related cleanups easier.
Add an `alloc-variant-zeroed` function attribute which can be used to
inform folding allocation+memset. This addresses
https://github.com/rust-lang/rust/issues/104847, where LLVM does not
know how to perform this transformation for non-C languages.
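A minimal sketch, assuming a hypothetical allocator pair where the attribute value names the zeroed variant:
```
declare ptr @my_alloc(i64) "alloc-variant-zeroed"="my_alloc_zeroed"
declare ptr @my_alloc_zeroed(i64)
; With this, a call to @my_alloc followed by a memset of the result to zero
; can be folded into a single call to @my_alloc_zeroed.
```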
Co-authored-by: Jamie <jamie@osec.io>
This patch implements verifier checks for range
attributes on ImmArg. This enables validation
of the range of ImmArg operands in intrinsics,
when the intrinsic definition includes the range
information.
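A minimal sketch of the idea (the intrinsic name and bounds are hypothetical; real intrinsics get the range from their definitions):
```
; The Verifier now rejects calls whose immarg constant falls outside the range.
declare void @llvm.example.intrinsic(i32 immarg range(i32 0, 64))
```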
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
Thread-local globals live, by default, in the default globals address
space, which may not be 0, so we need to overload @llvm.thread.pointer
to support other address spaces, and use the default globals address
space in Clang.
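A minimal sketch of the overload, assuming a hypothetical target whose default globals address space is 1 (mangled names follow LLVM's usual overload suffixes):
```
declare ptr @llvm.thread.pointer.p0()                ; existing AS0 form
declare ptr addrspace(1) @llvm.thread.pointer.p1()   ; overload for AS1
```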
This adds basic support for objc_claimAutoreleasedReturnValue, which is
mostly equivalent to objc_retainAutoreleasedReturnValue, with the
difference that it doesn't require the marker nop to be emitted between
it and the call it was attached to.
To achieve that, this also teaches the AArch64 attachedcall bundle
lowering to pick whether the marker should be emitted or not based on
whether the attachedcall target is claimARV or retainARV.
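A minimal sketch of the bundle form (the called function is hypothetical; the bundle tag and ARC runtime function are the existing ones):
```
%obj = call ptr @make_object() [ "clang.arc.attachedcall"(ptr @objc_claimAutoreleasedReturnValue) ]
```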
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Currently, each variant in the variant part of a structure type can only
contain a single member. This was sufficient for Rust, where each
variant is represented as its own type.
However, this isn't really enough for Ada, where a variant can have
multiple members.
This patch adds support for this scenario. This is done by allowing the
use of DW_TAG_variant by DICompositeType, and then changing the DWARF
generator to recognize when a DIDerivedType representing a variant holds
one of these. In this case, the fields from the DW_TAG_variant are
inlined into the variant, like so:
```
<4><7d>: Abbrev Number: 9 (DW_TAG_variant)
<7e> DW_AT_discr_value : 74
<5><7f>: Abbrev Number: 7 (DW_TAG_member)
<80> DW_AT_name : (indirect string, offset: 0x43): field0
<84> DW_AT_type : <0xa7>
<88> DW_AT_alignment : 8
<89> DW_AT_data_member_location: 0
<5><8a>: Abbrev Number: 7 (DW_TAG_member)
<8b> DW_AT_name : (indirect string, offset: 0x4a): field1
<8f> DW_AT_type : <0xa7>
<93> DW_AT_alignment : 8
<94> DW_AT_data_member_location: 8
```
Note that the intermediate DIDerivedType is still needed in this
situation, because that is where the discriminants are stored.
Migrate their usage to the `AnyMem*Inst` family, and add an isAtomic()
query on the base class for that hierarchy. This matches the idioms we
use for e.g. isAtomic on load, store, etc. instructions, the existing
isVolatile idioms on mem* routines, and allows us to more easily share
code between atomic and non-atomic variants.
As with #138568, the goal here is to simplify the class hierarchy and
make it easier to reason about. I'm moving from easiest to hardest, and
will stop at some point when I hit "good enough". Longer term, I'd sorta
like to merge or reverse the naming on the plain Mem*Inst and the
AnyMem*Inst, but that's a much larger and more risky change. Not sure
I'm going to actually do that.
While external globals can be unsized, I don't think an unsized
initializer makes sense.
It seems like the backend currently ends up treating this as a zero-size
global. If people want that behavior, they should declare it as such.
We currently accept label arguments to inline asm calls. This support
predates both blockaddresses and callbr and is only covered by one X86
test. Remove it in favor of callbr (or at least blockaddress, though
that cannot guarantee correct codegen, just like using block labels
directly can't).
I didn't bother implementing bitcode upgrade support for this, but I can
add it if desired.
This PR updates the `Verifier` to enforce that `alloca` instructions on
AMDGPU must be in AS5. This prevents hitting a misleading backend error
like "unable to select FrameIndex," which makes it look like a backend
bug when it's actually an IR-level issue.
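A minimal sketch of what the Verifier now enforces for AMDGPU targets:
```
%ok  = alloca i32, align 4, addrspace(5)  ; private (AS5) alloca: accepted
%bad = alloca i32, align 4                ; non-AS5 alloca: now a verifier error
```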
In #132722 spills of ZT0 were disabled around all SME ABI routines to
avoid a case where ZT0 is spilled before ZA is enabled (resulting in a
crash).
It turns out that the ABI does not promise that routines will preserve
ZT0 (however in practice they do), so generally disabling ZT0 spills for
ABI routines is not correct.
The case where a crash was possible was "aarch64_new_zt0" functions with
ZA disabled on entry and a ZT0 spill around __arm_tpidr2_save. In this
case, ZT0 will be undefined at the call to __arm_tpidr2_save, so this
patch avoids the ZT0 spill by marking the callsite with
"aarch64_zt0_undef". This attribute only applies to callsites and marks
that at the point the call is made ZT0 is not defined, so does not need
preserving.
Fixes #126417.
Currently, assignment tracking recognizes allocas, stores, and mem
intrinsics as valid instructions to tag with DIAssignID, with allocas
representing the allocation for a variable and the others representing
instructions that may assign to the variable. There are other intrinsics
that can perform these assignments however, and if we transform a store
instruction into one of these intrinsics and correctly transfer the
DIAssignID over, this results in a verifier error. The
AssignmentTrackingAnalysis pass also does not know how to handle these
intrinsics if they are untagged, as it does not know how to extract
assignment information (base address, offset, size) from them.
This patch adds _some_ support for some intrinsics that may perform
assignments: masked store/scatter, and vp store/strided store/scatter.
This patch does not add support for extracting assignment information
from these, as they may store with either non-constant size or to
non-contiguous blocks of memory; instead it adds support for recognizing
untagged stores with "unknown" assignment info, for which we assume that
the memory location of the associated variable should not be used, as we
can't determine which fragments of it should or should not be used.
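A minimal sketch of a tagged masked store (values are hypothetical; the DIAssignID metadata attaches to the call just as it does to a plain store):
```
call void @llvm.masked.store.v4i32.p0(<4 x i32> %v, ptr %p, i32 4, <4 x i1> %m), !DIAssignID !10
!10 = distinct !DIAssignID()
```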
In principle, it should be possible to handle the more complex cases
mentioned above, but it would require more substantial changes to
AssignmentTrackingAnalysis, and it is mostly only needed as a fallback
if the DIAssignID is not preserved on these alternative stores.