This introduces a new `ptrtoaddr` instruction which is similar to
`ptrtoint` but has two differences:
1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance
2) `ptrtoaddr` only extracts (and then extends/truncates) the low
index-width bits of the pointer
For most architectures, difference 2) does not matter since index (address)
width and pointer representation width are the same, but this does make a
difference for architectures that have pointers that aren't just plain
integer addresses such as AMDGPU fat pointers or CHERI capabilities.
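A minimal sketch of the two instructions side by side, assuming a 64-bit
index width:
```
; ptrtoint captures provenance and exposes all representation bits;
; ptrtoaddr yields only the low index-width bits, without capturing
; provenance.
%int  = ptrtoint ptr %p to i64
%addr = ptrtoaddr ptr %p to i64
```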
This commit introduces textual and bitcode IR support as well as basic code
generation, but optimization passes do not handle the new instruction yet
so it may result in worse code than using ptrtoint. Follow-up changes will
update capture tracking, etc. for the new instruction.
RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54
Reviewed By: nikic
Pull Request: https://github.com/llvm/llvm-project/pull/139357
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.
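For example (the alloca and its size are illustrative):
```
%a = alloca [8 x i8]
; Before: explicit size argument.
call void @llvm.lifetime.start.p0(i64 8, ptr %a)
; After: the size is implied by the alloca.
call void @llvm.lifetime.start.p0(ptr %a)
```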
This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
This slightly relaxes the invariant established in #149310, by also
allowing the lifetime argument to be poison. This is to support the
typical pattern of RAUWing with poison when removing an instruction.
It's worth noting that this does not require any conservative
assumptions: lifetimes with poison arguments can simply be skipped.
Fixes https://github.com/llvm/llvm-project/issues/151119.
This patch hyphenates words that are used as adjectives, such as:
- architecture specific
- human readable
- implementation defined
- language independent
- language specific
- machine readable
- machine specific
- target independent
- target specific
This patch implements the `llvm.loop.estimated_trip_count` metadata
discussed in [[RFC] Fix Loop Transformations to Preserve Block
Frequencies](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785).
As [suggested in the RFC
comments](https://discourse.llvm.org/t/rfc-fix-loop-transformations-to-preserve-block-frequencies/85785/4),
it adds the new metadata to all loops at the time of profile ingestion
and estimates each trip count from the loop's `branch_weights` metadata.
As [suggested in the PR #128785
review](https://github.com/llvm/llvm-project/pull/128785#discussion_r2151091036),
it does so via a new `PGOEstimateTripCountsPass` pass, which creates the
new metadata for each loop but omits the value if it cannot estimate a
trip count due to the loop's form.
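For illustration, the metadata on a loop might look like this (the exact
operand encoding is my assumption; 8 is an example estimate):
```
loop.latch:
  br i1 %cond, label %loop.header, label %exit, !llvm.loop !0

!0 = distinct !{!0, !1}
!1 = !{!"llvm.loop.estimated_trip_count", i32 8}
```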
An important observation not previously discussed is that
`PGOEstimateTripCountsPass` *often* cannot estimate a loop's trip count,
but later passes can sometimes transform the loop in a way that makes it
possible. Currently, such passes do not necessarily update the metadata,
but eventually that should be fixed. Until then, if the new metadata has
no value, `llvm::getLoopEstimatedTripCount` disregards it and tries
again to estimate the trip count from the loop's current
`branch_weights` metadata.
Split out from https://github.com/llvm/llvm-project/pull/150248:
Specify that the argument of lifetime.start/lifetime.end is ignored and
will be removed in the future.
Remove lifetime size handling from SDAG. The size was previously
discarded during isel, so was always ignored for stack coloring anyway.
Where necessary, obtain the size of the full frame index.
This enables the (reasonably common) pattern of using `mmap` to reserve
but not actually map a wide range of pages, and then only adding in more
pages as memory is actually needed. Effectively, that region of memory
is one big allocated object for LLVM, but crucially, that allocated
object *changes its size*.
Having an allocated object grow seems entirely compatible with what LLVM
optimizations assume, *except* that when LLVM sees an `alloca` or
similar instruction, it will assume that a pointer that has been
`getelementptr inbounds` by more than the size of the allocated object
cannot alias that `alloca`. But for allocated objects that are created
e.g. by `mmap`, where LLVM does not know their size, this cannot happen
anyway.
The other main point to be concerned about is having a `getelementptr
inbounds` that is moved up across an operation that grows an allocated
object: this should be legal as `getelementptr` is freely reorderable.
We achieve that by saying that for allocated objects that change their
size, "inbounds" means "inbounds of their maximal size", not "inbounds
of their current size".
It would be nice to also allow shrinking allocations (e.g. by
`munmap`ing pages at the end), but that is trickier. Consider an
example like this (sketched in IR after the list):
- load 4 bytes from `ptr`
- call some function
- load 1 byte from `ptr`
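A minimal IR sketch of that pattern (`@some_fn` is a placeholder):
```
%v0 = load i32, ptr %p      ; 4-byte load
call void @some_fn()        ; could this call shrink the allocation?
%v1 = load i8, ptr %p       ; 1-byte load
```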
Right now, LLVM could argue that since `ptr` clearly has not been
deallocated, there must be at least 4 bytes of dereferenceable memory
behind `ptr` after the call. If allocations can shrink, this kind of
reasoning is no longer valid. I don't know if LLVM actually does
reasoning like that -- I think it should not, since I think it should be
possible to have allocations that shrink -- but to remain conservative I
am not proposing that as part of this patch.
lifetime.start and lifetime.end are primarily intended for use on
allocas, to enable stack coloring and other liveness optimizations. This
is necessary because all (static) allocas are hoisted into the entry
block, so lifetime markers are the only way to convey the actual
lifetimes.
However, lifetime.start and lifetime.end are currently *allowed* to be
used on non-alloca pointers. We don't actually do this in practice, but
just the mere fact that this is possible breaks the core purpose of the
lifetime markers, which is stack coloring of allocas. Stack coloring can
only work correctly if all lifetime markers for an alloca are
analyzable.
* If a lifetime marker may operate on multiple allocas via a select/phi,
we don't know which lifetime actually starts/ends and handle it
incorrectly (https://github.com/llvm/llvm-project/issues/104776); see
the sketch after this list.
* Stack coloring operates on the assumption that all lifetime markers
are visible, and not, for example, hidden behind a function call or
escaped pointer. It's not possible to change this, as part of the
purpose of lifetime markers is that they work even in the presence of
escaped pointers, where simple use analysis is insufficient.
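For example (using the current form with a size argument), stack
coloring cannot tell which lifetime starts here:
```
%a = alloca i64
%b = alloca i64
%p = select i1 %c, ptr %a, ptr %b
call void @llvm.lifetime.start.p0(i64 8, ptr %p) ; starts %a or %b?
```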
I don't think there is any way to have coherent semantics for lifetime
markers on allocas, while also permitting them on arbitrary pointer
values.
This PR restricts lifetimes to operate on allocas only. As a followup, I
will also drop the size argument, which is superfluous if we always
operate on an alloca. (This change also renders dead various code that
handles lifetime markers on non-alloca pointers. I plan to clean up that
kind of code after dropping the size argument as well.)
In practice, I've only found a few places that currently produce
lifetimes on non-allocas:
* CoroEarly replaces the promise alloca with the result of an intrinsic,
which will later be replaced back with an alloca. I think this is the
only place where there is some legitimate loss of functionality, but I
don't think this is particularly important (I don't think we'd expect
the promise in a coroutine to admit useful lifetime optimization.)
* SafeStack moves unsafe allocas onto a separate frame. We can safely
drop lifetimes here, as SafeStack performs its own stack coloring.
* AddressSanitizer is similar: it also moves allocas into separate
memory.
* LSR sometimes replaces the lifetime argument with a GEP chain of the
alloca (where the offsets ultimately cancel out). This is just
unnecessary. (Fixed separately in
https://github.com/llvm/llvm-project/pull/149492.)
* InferAddrSpaces sometimes makes lifetimes operate on an addrspacecast
of an alloca. I don't think this is necessary.
This fixes llvm.experimental.constrained.lrint and friends to use the
same semantics as llvm.lrint and friends, as defined in
ba5c26da7ce85dbdcee3d964282e5f0981792702.
According to the current LangRef, atomics of sizes larger than a
target-dependent limit are non-conformant IR. Presumably, that size
limit is `TargetLoweringBase::getMaxAtomicSizeInBitsSupported()`. As a
consequence, one would not even know whether IR is valid without
instantiating the Target backend.
To get around this, Clang's CGAtomic uses a constant "16 bytes" for the
maximum supported atomic size. The verifier only checks the power-of-two
requirement.
Per a discussion with jyknight, the intention is rather that
AtomicExpandPass will just lower everything larger than the
target-supported atomic sizes to a libcall (such as `__atomic_load`).
According to this interpretation, the size limit need only be known by
the lowering and does not affect the IR specification.
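Under this interpretation, an oversized atomic like the following is
valid IR, and AtomicExpandPass turns it into a libcall when the target
cannot lower it inline (a sketch):
```
%v = load atomic i256, ptr %p seq_cst, align 32
```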
The original "target-specific size limit" had been added in
59b66883eacbc62a09c09f08bcbfdce7af46cf31. The LangRef change is needed
for #134455 because otherwise frontends need to pass a TargetLowering
object to the helper functions just to know what the target-specific
limit is.
This also changes the LangRef for atomicrmw. Are there libatomic
fallbacks for these? If not, LLVM-IR validity still depends on
instantiating the actual backend. There are also some intrinsics such as
`llvm.memcpy.element.unordered.atomic` that have this constraint but do
not change in this PR.
RFC on discourse:
https://discourse.llvm.org/t/rfc-debug-info-for-coroutine-suspension-locations-take-2/86606
With this commit, we add `DILabel` debug info to the resume points of a
coroutine. Those labels can be used by debugging scripts to figure out
the exact line and column at which a coroutine was suspended, by looking
up the current `__coro_index` value inside the coroutine's frame and
then searching for the corresponding label inside the coroutine's resume
function.
The DWARF information generated for such a label looks like:
```
0x00000f71: DW_TAG_label
DW_AT_name ("__coro_resume_1")
DW_AT_decl_file ("generator-example.cpp")
DW_AT_decl_line (5)
DW_AT_decl_column (3)
DW_AT_artificial (true)
DW_AT_LLVM_coro_suspend_idx (0x01)
DW_AT_low_pc (0x00000000000019be)
```
The labels can be mapped to their corresponding `__coro_index` values
either via their naming convention `__coro_resume_<N>` or via the new
`DW_AT_LLVM_coro_suspend_idx` attribute. In gdb, those line numbers can
be looked up using `info line -function my_coroutine -label
__coro_resume_1`. LLDB unfortunately does not understand DW_TAG_label
debug information yet.
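At the IR level, such a label might be described roughly like this (the
exact DILabel field names are my assumption, inferred from the DWARF
attributes above):
```
!10 = !DILabel(scope: !5, name: "__coro_resume_1", file: !1, line: 5,
               column: 3, isArtificial: true, coroSuspendIdx: 0)
```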
Given this is an artificial compiler-generated label, I applied the
DW_AT_artificial attribute to it. The DWARFv5 standard only allows that
attribute on type and variable definitions, but this is a natural
extension and was also blessed in the RFC on discourse.
Also, this commit adds `DW_AT_decl_column` to labels, not only for
coroutines but also for normal C and C++ labels. While not strictly
necessary, I am doing so now because it would be harder to do later
without breaking the binary LLVM-IR format.
Drive-by fixes: while reading the existing test cases to understand how
to write my own, I fixed a couple of small typos and improved some
comments.
Add the `dead_on_return` attribute, which is meant to be taken advantage
of by frontends and states that the memory pointed to by the argument is
dead upon function return. As with `byval`, it is intended for passing
aggregates by value. The difference lies in the ABI: `byval` implies
that the pointer is explicitly passed as an argument to the callee
(during codegen the copy is emitted as per the byval contract), whereas
a `dead_on_return`-marked argument implies that the copy already exists
in the IR and is located at a specific stack offset within the caller;
the caller will not read this memory again after the callee returns, and
any read of it before it is written again yields poison.
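A minimal sketch of the attribute on a declaration:
```
; The caller will not read %agg's memory after the call returns.
declare void @consume(ptr dead_on_return %agg)
```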
RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
Also fix the LangRef to match the implementation: the check used the
pointer size of the alloca address space rather than that of the default
address space.
The check was also more permissive than the LangRef. The error
check permitted any size less than the pointer size; follow the
stricter wording of the LangRef.
This patch extends the llvm.histogram intrinsic to support additional
update operations beyond the existing add. Specifically, the new
supported operations are:
* umax: unsigned maximum
* umin: unsigned minimum
* uadd.sat: unsigned saturated addition
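For example, a masked unsigned-maximum histogram update might look like
this (the concrete type-mangling suffix is my assumption):
```
call void @llvm.experimental.vector.histogram.umax.nxv2p0.i64(
    <vscale x 2 x ptr> %buckets, i64 %val, <vscale x 2 x i1> %mask)
```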
Based on the discussion from:
https://discourse.llvm.org/t/rfc-expanding-the-experimental-histogram-intrinsic/84673
This change adds support for target extension types in vectors, by
allowing sized target extension types to opt in to being valid vector
elements.
Allowing target extension types as vector elements will allow backends
to use vector operations such as `insertelement` and `extractelement` on
their target types with minimal changes.
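As a sketch, with `target("example.token")` standing in for some sized
target extension type:
```
%v1 = insertelement <4 x target("example.token")> %v0,
          target("example.token") %x, i32 0
%e  = extractelement <4 x target("example.token")> %v1, i32 2
```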
RFC:
https://discourse.llvm.org/t/rfc-supporting-sized-target-extension-types-in-vector/86431
Add an `alloc-variant-zeroed` function attribute which can be used to
enable folding of allocation+memset-zero sequences. This addresses
https://github.com/rust-lang/rust/issues/104847, where LLVM does not
know how to perform this transformation for non-C languages.
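A sketch of the intended use, with hypothetical allocator names:
```
; "alloc-variant-zeroed" names a variant returning zeroed memory, so
; @my_alloc followed by a zero memset can fold into @my_alloc_zeroed.
declare ptr @my_alloc(i64) "alloc-variant-zeroed"="my_alloc_zeroed"
declare ptr @my_alloc_zeroed(i64)
```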
Co-authored-by: Jamie <jamie@osec.io>
In 'asm goto' statements ('callbr' in LLVM IR), you can specify one or
more labels / basic blocks in the containing function which the assembly
code might jump to. If you're also compiling with branch target
enforcement via BTI, then previously listing a basic block as a possible
jump destination of an asm goto would cause a BTI instruction to be
placed at the start of the block, in case the assembly code used an
_indirect_ branch instruction (i.e. to a destination address read from a
register) to jump to that location. Now it doesn't do that any more:
branches to destination labels from the assembly code are assumed to be
direct branches (to a relative offset encoded in the instruction), which
don't require a BTI at their destination.
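In LLVM IR, such a statement is a `callbr` whose extra destinations are
the possible jump targets (a minimal sketch):
```
define i32 @f() {
entry:
  ; 'asm goto' with a NOP body and one destination label.
  callbr void asm sideeffect "nop", "!i"()
      to label %fallthrough [label %patch_target]
fallthrough:
  ret i32 0
patch_target:
  ret i32 1
}
```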
This change was proposed in https://discourse.llvm.org/t/85845 and there
seemed to be no disagreement. The rationale is:
1. it brings clang's handling of asm goto in Arm and AArch64 in line
with gcc's, which didn't generate BTIs at the target labels in the first
place.
2. it improves performance in the Linux kernel, which uses a lot of 'asm
goto' in which the assembly language just contains a NOP, and the
label's address is saved elsewhere to let the kernel self-modify at run
time to swap between the original NOP and a direct branch to the label.
This allows hot code paths to be instrumented for debugging, at only the
cost of a NOP when the instrumentation is turned off, instead of the
larger cost of an indirect branch. In this situation a BTI is
unnecessary (if the branch happens, it's direct), and since the code
paths are hot, its cost is also noticeable.
Implementation:
`SelectionDAGBuilder::visitCallBr` is the place where 'asm goto' target
labels are handled. It calls `setIsInlineAsmBrIndirectTarget()` on each
target `MachineBasicBlock`. Previously it also called
`setMachineBlockAddressTaken()`, which made `hasAddressTaken()` return
true, which caused a BTI to be added in the Arm backends.
Now `visitCallBr` doesn't call `setMachineBlockAddressTaken()` any more
on asm goto targets, but `hasAddressTaken()` also checks the flag set by
`setIsInlineAsmBrIndirectTarget()`. So call sites that were using
`hasAddressTaken()` don't need to be modified. But the Arm backends
don't call `hasAddressTaken()` any more: instead they test two more
specific query functions that cover all the reasons `hasAddressTaken()`
might have returned true _except_ being an asm goto target.
Testing:
The new test `AArch64/callbr-asm-label-bti.ll` is testing the actual
change, where it expects not to see a `bti` instruction after
`[[LABEL]]`. The rest of the test changes are all churn, due to the
flags on basic blocks changing. Actual output code hasn't changed in any
of the existing tests, only comments and diagnostics.
Further work:
`RISCVIndirectBranchTracking.cpp` and `X86IndirectBranchTracking.cpp`
also call `hasAddressTaken()` in a way that might benefit from using the
same more specific check I've put in `ARMBranchTargets.cpp` and
`AArch64BranchTargets.cpp`. But I'm not sure of that, so in this commit
I've only changed the Arm backends, and left those alone.
This adds [de]interleave intrinsics for factors of 4,6,8, so that every
interleaved memory operation supported by the in-tree targets can be
represented by a single intrinsic.
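For instance, a factor-4 interleave of scalable vectors can now be
expressed directly (the exact name mangling is my assumption):
```
%v = call <vscale x 8 x i32> @llvm.vector.interleave4.nxv8i32(
         <vscale x 2 x i32> %a, <vscale x 2 x i32> %b,
         <vscale x 2 x i32> %c, <vscale x 2 x i32> %d)
```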
For context, [de]interleaves of fixed-length vectors are represented by
a series of shufflevectors. The intrinsics are needed for scalable
vectors, and we don't currently scalably vectorize all possible factors
of interleave groups supported by RISC-V/AArch64.
The underlying reason for this is that higher factors are currently
represented by interleaving multiple interleaves themselves, which made
sense at the time in the discussion in
https://github.com/llvm/llvm-project/pull/89018.
But after trying to integrate these for higher factors on RISC-V I think
we should revisit this design choice:
- Matching these in InterleavedAccessPass is non-trivial: we currently
only support factors that are a power of 2, and detecting this requires
a good chunk of code.
- The shufflevector masks used for [de]interleaves of fixed-length
vectors are much easier to pattern match, as they are strided patterns,
but the intrinsics are much more complicated to match as the structure
is a tree.
- Unlike shufflevectors, no optimisation happens on [de]interleave2
intrinsics.
- For non-power-of-2 factors, e.g. 6, there are multiple possible ways a
[de]interleave could be represented; see the discussion in #139373.
- We already have intrinsics for factors 2, 3, 5 and 7, so by avoiding
4, 6 and 8 we're not really saving much.
By representing these higher factors as interleaved interleaves, we can
in theory support arbitrarily high interleave factors. However I'm not
sure this is actually needed in practice: SVE only has instructions
for factors 2,3,4, whilst RVV only supports up to factor 8.
This patch would make it much easier to support scalable interleaved
accesses in the loop vectorizer for RISC-V for factors 3,5,6 and 7, as
the loop vectorizer and InterleavedAccessPass wouldn't need to
construct and match trees of interleaves.
For interleave factors above 8, for which there are no hardware memory
operations to match in the InterleavedAccessPass, we can still keep the
wide load + recursive interleaving in the loop vectorizer.
Some hardware (for example, certain AVR chips) has peripheral registers
mapped to data space address 0. Although a volatile load/store on
`ptr null` already generates expected code, the wording in the LangRef
makes operations on null seem like undefined behavior in all cases. This
commit adds a comment that, for volatile operations, it may be defined
behavior to access the address null, if the architecture permits it. The
intended use case is MMIO registers with hard-coded addresses that
include bit-value 0. A simple CodeGen test is included for AVR, as an
architecture known to have this quirk, that does `load volatile` and
`store volatile` to `ptr null`, expecting to generate `lds <reg>, 0` and
`sts 0, <reg>`.
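The tested pattern, roughly:
```
; On AVR, expected to lower to 'lds <reg>, 0' and 'sts 0, <reg>'.
%v = load volatile i8, ptr null
store volatile i8 %v, ptr null
```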
See [this
thread](https://rust-lang.zulipchat.com/#narrow/channel/213817-t-lang/topic/Adding.20the.20possibility.20of.20volatile.20access.20to.20address.200)
and [the
RFC](https://discourse.llvm.org/t/rfc-volatile-access-to-non-dereferenceable-memory-may-be-well-defined/86303)
for discussion and context.
This function can be used to retrieve the number of bits that can be used
for arithmetic in a given address space (i.e. the range of the address
space). For most in-tree targets this should not make any difference
but differentiating between the size of a pointer in bits and the address
range is extremely important e.g. for CHERI-enabled targets, where pointers
carry additional metadata such as bounds and permissions and only a subset
of the pointer bits is used as the address.
The address size is defined to be the same as the index size.
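In datalayout terms this is the index-size field of the pointer
specification; a hypothetical CHERI-style fragment:
```
; 128-bit pointer representation, 64-bit index (= address) width in
; address space 200.
target datalayout = "p200:128:128:128:64"
```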
We considered adding a separate property, since targets exist where
indexing and address range actually use different sizes (AMDGPU fat
pointers with a 160-bit representation, 48-bit address and 32-bit
index), but for the purposes of LLVM semantics, differentiating them
does not add much value and introduces a lot of complexity in ensuring
the correct bits are used. See the reasoning by @nikic in
https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/38
and
https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/49
Originally uploaded as https://reviews.llvm.org/D135158
Reviewed By: davidchisnall, krzysz00
Pull Request: https://github.com/llvm/llvm-project/pull/139347
Make this consistent with other operations with respect to signaling
NaN quieting. This was specifying that quieting is required, which is
true for IEEE. As with other IR operations, signaling NaN quieting is
now possible but optional in the case where there are two NaN inputs.
This permits directly selecting the intrinsic to the hardware
instruction in the default floating-point environment for shaders.
Thread-local globals live, by default, in the default globals address
space, which may not be 0, so we need to overload @llvm.thread.pointer
to support other address spaces, and use the default globals address
space in Clang.
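A sketch of the overloaded form (the mangling suffix is my assumption):
```
declare ptr addrspace(1) @llvm.thread.pointer.p1()
```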
Currently, each variant in the variant part of a structure type can only
contain a single member. This was sufficient for Rust, where each
variant is represented as its own type.
However, this isn't really enough for Ada, where a variant can have
multiple members.
This patch adds support for this scenario. This is done by allowing the
use of DW_TAG_variant by DICompositeType, and then changing the DWARF
generator to recognize when a DIDerivedType representing a variant holds
one of these. In this case, the fields from the DW_TAG_variant are
inlined into the variant, like so:
```
<4><7d>: Abbrev Number: 9 (DW_TAG_variant)
<7e> DW_AT_discr_value : 74
<5><7f>: Abbrev Number: 7 (DW_TAG_member)
<80> DW_AT_name : (indirect string, offset: 0x43): field0
<84> DW_AT_type : <0xa7>
<88> DW_AT_alignment : 8
<89> DW_AT_data_member_location: 0
<5><8a>: Abbrev Number: 7 (DW_TAG_member)
<8b> DW_AT_name : (indirect string, offset: 0x4a): field1
<8f> DW_AT_type : <0xa7>
<93> DW_AT_alignment : 8
<94> DW_AT_data_member_location: 8
```
Note that the intermediate DIDerivedType is still needed in this
situation, because that is where the discriminants are stored.
While external globals can be unsized, I don't think an unsized
initializer makes sense.
It seems like the backend currently ends up treating this as a zero-size
global. If people want that behavior, they should declare it as such.
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.
These mirror the `llvm.maximum.*` and `llvm.minimum.*` intrinsics, but
are atomic and use IEEE 754-2019 handling for NaNs, which differs from
`fmax` and `fmin`. See:
https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.
Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
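For example, a sketch of the new instruction form:
```
%old = atomicrmw fmaximum ptr %p, float %val seq_cst
```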
So far, the Language Reference said that the alloca address space from
the datalayout is used if no explicit address space is provided, which
is not what the LLParser and the AsmWriter implement. This patch adjusts
the documentation to match the implementation: The default AS 0 is used
if none is explicitly specified.
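For example, assuming a datalayout with "A5":
```
%a = alloca i32               ; still address space 0
%b = alloca i32, addrspace(5) ; explicit alloca address space
```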
This is an alternative to PR #135786, which would change the parser's
behavior to match the Language Reference instead.
- _Float16 is now accepted by Clang.
- The half IR type is fully handled by the backend.
- These values are passed in FP registers and converted to/from float around
each operation.
- Compiler-rt conversion functions are now built for s390x, including
the previously missing extendhfdf2.
Fixes #50374