llvm-project

Author	SHA1	Message	Date
Mehdi Amini	6a045c29a9	Revert "[GlobalISel][LLT] Introduce FPInfo for LLT (Enable bfloat, ppc128float and others in GlobalISel) (#155107 )" (#188344 ) This reverts commit b1aa6a45060bb9f89efded9e694503d6b4626a4a and commit ce44d63e0d14039f1e8f68e6b7c4672457cabd4e. This fails the build with some older gcc: llvm/include/llvm/CodeGenTypes/LowLevelType.h:501:35: error: call to non-constexpr function ‘static llvm::LLT llvm::LLT::integer(unsigned int)’ return integer(getSizeInBits()); ^	2026-03-24 21:40:36 +00:00
Denis.G	b1aa6a4506	[GlobalISel][LLT] Introduce FPInfo for LLT (Enable bfloat, ppc128float and others in GlobalISel) (#155107 ) Added extra information in LLT to support ambiguous fp types during GlobalISel. Original idea by @tgymnich Main differences from https://github.com/llvm/llvm-project/pull/122503 are: * Do not deprecate LLT::scalar * Allow targets to enable/disable IR translation with extended LLT via `TargetOption::EnableGlobalISelExtendedLLT` (disabled by default) * `IRTranslator` uses `TargetLoweringInfo` for appropriate `LLT` generation. * For this reason added flag in `GlobalISelMatchTable` to allow switching between legacy and new extended LLT names * Revert using stubs like `LLT::float32` for float types as they are real now. Added `TODO` for such cases. Also MIRParser now may parse new type identifiers. --------- Co-authored-by: Tim Gymnich <tim@gymni.ch> Co-authored-by: Ryan Cowan <ryan.cowan@arm.com>	2026-03-24 08:40:39 -04:00
gonzalobg	ea8fb06f24	[atomicrmw] fminimumnum/fmaximumnum support (#187030 ) Adds support for `atomicrmw` `fminimumnum`/`fmaximumnum` operations. These were added to C++ in P3008, and are exposed in libc++ in #186716 . Adding LLVM IR support for these unblocks work in both backends with HW support, and frontends.	2026-03-18 09:35:49 +01:00
Alexis Engelke	4fd826d1f9	[IR] Split Br into UncondBr and CondBr (#184027 ) BranchInst currently represents both unconditional and conditional branches. However, these are quite different operations that are often handled separately. Therefore, split them into separate opcodes and classes to allow distinguishing these operations in the type system. Additionally, this also slightly improves compile-time performance.	2026-03-11 12:31:10 +00:00
Jameson Nash	d762cc2f03	[GlobalISel] Add SVE support for alloca (#178976 ) Complementary to the same handling code in SelectionDAG: `f3d81d4110/llvm/lib/CodeGen/SelectionDAG/FunctionLoweringInfo.cpp (L160-L165)` `f3d81d4110/llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp (L4613-L4623)` Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-02-05 14:00:34 -05:00
Nicolai Hähnle	af836ff60c	[CodeGen] Add getTgtMemIntrinsic overload for multiple memory operands (NFC) (#175843 ) There are target intrinsics that logically require two MMOs, such as llvm.amdgcn.global.load.lds, which is a copy from global memory to LDS, so there's both a load and a store to different addresses. Add an overload of getTgtMemIntrinsic that produces intrinsic info in a vector, and implement it in terms of the existing (now protected) overload. GlobalISel and SelectionDAG paths are updated to support multiple MMOs. The main part of this change is supporting multiple MMOs in MemIntrinsicNodes. Converting the backends to using the new overload is a fairly mechanical step that is done in a separate change in the hope that that allows reducing merging pains during review and for downstreams. A later change will then enable using multiple MMOs in AMDGPU.	2026-02-02 21:58:42 +00:00
Ryan Cowan	ad1a45b903	[AArch64] Use GISel for optnone functions (#174746 ) Currently, when SDAG is run on AArch64 and an `optnone` function is encountered, the selector is chosen as FastISel. AArch64 makes use of GlobalISel at O0 and this patch aims to align `optnone` with this functionality. A flag is exposed to enable this functionality for a given backend but, as AArch64 is currently the only backend I could find using GlobalISel at O0 this is the only one with it implemented. This flag is set when the target supports GlobalISel & GlobalISel hasn't been forced by the user, the target machine or by being at an optimisation level lower than `EnableGlobalISelAtO`. If this happens, the GlobalISel passes are included as shown in `llvm/test/CodeGen/AArch64/O3-pipeline.ll` and skipped by IRTranslator for functions not marked as `optnone`. In updating the tests based on this functionality, I found some unused check lines or run lines that mixed SDAG with GlobalISel pass names which have been fixed. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2026-01-29 16:30:22 +00:00
Jameson Nash	b7c1a6f8b4	[CodeGen] Only use actual alloca alignment (#178361 ) Remove getPrefTypeAlign calls and use only the alloca's explicit alignment, since the type may not be semantically useful, there is no useful reason to change alignment to support it. The alloca's explicit alignment (from getAlign()) is already optimally correct; we don't need to derive alignment from the allocated type. Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-28 22:49:19 -05:00
Jameson Nash	218d0c2ed1	[NFC][CodeGen] Use getAllocationSize instead of manual size computation (#178360 ) Replace manual alloca size computation with `getAllocationSize` API. This reduces dependency on `getAllocatedType` when just needed for size and vscale queries. Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>	2026-01-28 22:48:13 -05:00
Matt Arsenault	0d4a35d560	IR: Remove llvm.convert.to.fp16 and llvm.convert.from.fp16 intrinsics (#174484 ) These are long overdue for removal. These were originally a hack to support loading half values before there was any / decent support for the half type through the backend. There's no reason to continue supporting these, they're equivalent to fpext/fptrunc with a bitcast. SelectionDAG stopped translating these directly, and used the bitcast + fp cast since f7a02c17628e825, so there's been no reason to use these since 2014.	2026-01-21 09:50:28 +00:00
Matt Arsenault	aa57ee958d	CodeGen: Use LibcallLoweringInfo for stack protector insertion (#176829 ) Thread LibcallLoweringInfo into the TargetLowering hooks used by the stack protector passes.	2026-01-20 12:37:31 +01:00
Matt Arsenault	5d6d1d9e6c	GlobalISel: Use LibcallLoweringInfo in IRTranslator for real (#176824 )	2026-01-20 09:15:49 +01:00
Matt Arsenault	f24eafa655	GlobalISel: Use LibcallLoweringInfo more in IRTranslator (#176412 )	2026-01-17 08:18:45 +01:00
Justin Stitt	4049208388	[CodeGen] Check BlockAddress users before marking block as taken (#174480 )	2026-01-15 11:17:15 -08:00
Victor Chernyakin	c438773432	[LLVM][ADT] Migrate users of `make_scope_exit` to CTAD (#174030 ) This is a followup to #173131, which introduced the CTAD functionality.	2026-01-02 20:42:56 -08:00
Leandro Lupori	25acd42fcc	Revert "[aarch64] Mix the frame pointer with the stack cookie when protecting the stack (#161114 )" (#173987 ) This reverts commit b6bfa856860bb4304e635102872a4c994af101b4. This commit broke Windows on Arm bots.	2025-12-30 10:58:01 -03:00
Pan Tao	b6bfa85686	[aarch64] Mix the frame pointer with the stack cookie when protecting the stack (#161114 ) This strengthens the guard and matches MSVC. Fixes #156573 .	2025-12-17 12:52:28 -08:00
Robert Imschweiler	e933ccdd9d	[AMDGPU][GlobalISel] Fix / workaround amdgcn.kill/.unreachable lowering (#170639 ) cf. https://github.com/llvm/llvm-project/pull/133907#issuecomment-3611354688	2025-12-04 13:42:53 +01:00
Robert Imschweiler	e84fdbe1ef	[IR] Add CallBr intrinsics support (#133907 ) This commit adds support for using intrinsics with callbr. The uses of this will most of the time look like this example: ```llvm callbr void @llvm.amdgcn.kill(i1 %c) to label %cont [label %kill] kill: unreachable cont: ... ```	2025-12-04 10:21:00 +01:00
Petar Avramovic	25b6a15dfd	GlobalISel: Stop using TPC to check if GlobalISelAbort is enabled (#169917 ) New pass manager does not use TargetPassConfig. GlobalISel requires TargetPassConfig to reportGISelFailure, and its only actual use is to check if GlobalISelAbort is enabled. TargetPassConfig uses TargetMachine to check if GlobalISelAbort is enabled, but TargetMachine is also available from MachineFunction.	2025-12-02 17:12:10 +01:00
Peter Collingbourne	6227eb90da	Add IR and codegen support for deactivation symbols. Deactivation symbols are a mechanism for allowing object files to disable specific instructions in other object files at link time. The initial use case is for pointer field protection. For more information, see the RFC: https://discourse.llvm.org/t/rfc-deactivation-symbols/85556 Reviewers: ojhunt, nikic, fmayer, arsenm, ahmedbougacha Reviewed By: fmayer Pull Request: https://github.com/llvm/llvm-project/pull/133536	2025-11-26 12:37:09 -08:00
Sergei Barannikov	4eea157301	[GlobalISel] Return byte offsets from computeValueLLTs (NFC) (#166747 ) To avoid scaling offsets back and forth. This is also what SelectionDAG equivalent (ComputeValueVTs) does, and will allow to reuse ComputeValueTypes with less effort.	2025-11-15 00:23:26 +00:00
Juan Manuel Martinez Caamaño	5b56816dff	[NFC][SPIRV][IRTranslator] Replace leftover `MF->getTarget().getTargetTriple().isSPIRV()` with `targetSupportsBF16Type(MF)` (#167704 )	2025-11-12 17:59:38 +01:00
Alex Voicu	6ef32188b5	[SPIRV] Add support for `bfloat16` atomics via the `SPV_INTEL_16bit_atomics` extension (#166257 ) This enables support for atomic RMW ops (add, sub, min and max to be precise) with `bfloat16` operands, via the [SPV_INTEL_16bit_atomics extension](https://github.com/intel/llvm/pull/20009). It's logically a successor to #166031 (I should've used a stack), but I'm putting it up for early review. --------- Co-authored-by: Matt Arsenault <arsenm2@gmail.com>	2025-11-09 17:26:14 +00:00
Daniel Thornburgh	5f08fb4d72	[IR] llvm.reloc.none intrinsic for no-op symbol references (#147427 ) This intrinsic emits a BFD_RELOC_NONE relocation at the point of call, which allows optimizations and languages to explicitly pull in symbols from static libraries without there being any code or data that has an effectual relocation against such a symbol. See issue #146159 for context.	2025-11-06 08:52:46 -08:00
Robert Imschweiler	cad96ad703	[NFC] Refactor target intrinsic call lowering (#153204 ) Refactor intrinsic call handling in SelectionDAGBuilder and IRTranslator to prepare the addition of intrinsic support to the callbr instruction, which should then share code with the handling of the normal call instruction.	2025-11-06 10:51:44 +01:00
Alex Voicu	2286118e6f	[SPIRV] Enable `bfloat16` arithmetic (#166031 ) Enable the `SPV_INTEL_bfloat16_arithmetic` extension, which allows arithmetic, relational and `OpExtInst` instructions to take `bfloat16` arguments. This patch only adds support to arithmetic and relational ops. The extension itself is rather fresh, but `bfloat16` is ubiquitous at this point and not supporting these ops is limiting.	2025-11-04 18:10:26 +02:00
David Green	a1e59bdc17	[GlobalISel] Make scalar G_SHUFFLE_VECTOR illegal. (#140508 ) I'm not sure if this is the best way forward or not, but we have a lot of issues with forgetting that shuffle_vectors can be scalar again and again. (There is another example in the recently added known-bits code). As a scalar-dst shuffle vector is just an extract, and a scalar-source shuffle vector is just a build vector, this patch makes scalar shuffle vector illegal and adjusts the irbuilder to create the correct node as required. Most targets do this already through lowering or combines. Making scalar shuffles illegal simplifies gisel as a whole, it just requires transforms that create shuffles of new sizes to account for the scalar shuffle being illegal (mostly IRBuilder and LessElements).	2025-10-24 08:21:35 +01:00
Matt Arsenault	1d9f9ad531	CodeGen: Fix crash when no libcall is available for stackguard (#164211 ) Not all the paths appear to be implemented for GlobalISel	2025-10-23 10:40:40 +09:00
Kazu Hirata	6bee6b2090	[CodeGen] Add "override" where appropriate (NFC) (#164571 ) Note that "override" makes "virtual" redundant. Identified with modernize-use-override.	2025-10-22 06:51:08 -07:00
Ryan Cowan	eb803df502	[AArch64][GlobalISel] Add `G_FMODF` instruction (#160061 ) This commit adds the intrinsic `G_FMODF` to GMIR & enables its translation, legalization and instruction selection in AArch64.	2025-10-02 10:30:31 +01:00
JaydeepChauhan14	0c1087b377	[X86][GlobalISel] Added support for llvm.set.rounding (#156591 ) - This implementation is adapted from SDAG X86TargetLowering::LowerSET_ROUNDING.	2025-09-25 22:44:47 +09:00
YixingZhang007	f91e0bf160	[SPIRV] Add support for the SPIR-V extension SPV_KHR_bfloat16 (#155645 ) This PR introduces the support for the SPIR-V extension `SPV_KHR_bfloat16`. This extension extends the `OpTypeFloat` instruction to enable the use of bfloat16 types with cooperative matrices and dot products. TODO: Per the `SPV_KHR_bfloat16` extension, there are a limited number of instructions that can use the bfloat16 type. For example, arithmetic instructions like `FAdd` or `FMul` can't operate on `bfloat16` values. Therefore, a future patch should be added to either emit an error or fall back to FP32 for arithmetic in cases where bfloat16 must not be used. Reference Specification: https://github.com/KhronosGroup/SPIRV-Registry/blob/main/extensions/KHR/SPV_KHR_bfloat16.asciidoc	2025-09-22 14:52:57 +02:00
Matt Arsenault	2331fbb019	CodeGen: Remove MachineFunction argument from getPointerRegClass (#158185 ) getPointerRegClass is a layering violation. Its primary purpose is to determine how to interpret an MCInstrDesc's operands RegClass fields. This should be context free, and only depend on the subtarget. The model of this is also wrong, since this should be an instruction / operand specific property, not a global pointer class. Remove the function argument to help stage removal of this hook and avoid introducing any new obstacles to replacing it. The remaining uses of the function were to get the subtarget, which TargetRegisterInfo already belongs to. A few targets needed new subtarget derived properties copied there.	2025-09-12 09:18:50 +00:00
Kazu Hirata	11b4f110e0	[llvm] Remove unused includes of SmallSet.h (NFC) (#154893 ) We just replaced SmallSet<T , N> with SmallPtrSet<T , N>, bypassing the redirection found in SmallSet.h. With that, we no longer need to include SmallSet.h in many files.	2025-08-22 10:33:46 -07:00
David Green	03912a1de5	[GlobalISel] Translate scalar sequential vecreduce.fadd/fmul as fadd/fmul. (#153966 ) A llvm.vector.reduce.fadd(float, <1 x float>) will be translated to G_VECREDUCE_SEQ_FADD with two scalar operands, which is illegal according to the verifier. This makes sure we generate a fadd/fmul instead.	2025-08-18 14:59:44 +00:00
Kazu Hirata	07eb7b7692	[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068 ) This patch replaces SmallSet<T , N> with SmallPtrSet<T , N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType, N> : public SmallPtrSet<PointeeType, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.	2025-08-18 07:01:29 -07:00
Diana Picus	ac005e16f6	Reapply "[AMDGPU] Intrinsic for launching whole wave functions" (#153584 ) This reverts commit 14cd1339318b16e08c1363ec6896bd7d1e4ae281. The buildbot failure seems to have been a cmake issue which has been discussed in more detail in this Discourse post: https://discourse.llvm.org/t/cmake-doesnt-regenerate-all-tablegen-target-files/87901 If any buildbots fail to select arbitrary intrinsics with this patch, it's worth considering using clean builds with ccache instead of incremental builds, as recommended here: https://llvm.org/docs/HowToAddABuilder.html#:~:text=Use%20CCache%20and%20NOT%20incremental%20builds The original commit message for this patch: Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave functions. This will take as its first argument the callee with the amdgpu_gfx_whole_wave calling convention, followed by the call parameters which must match the signature of the callee except for the first function argument (the i1 original EXEC mask, which doesn't need to be passed in). Indirect calls are not allowed. Make direct calls to amdgpu_gfx_whole_wave functions a verifier error. Tail calls are handled in a future patch.	2025-08-15 10:12:47 +02:00
Daniel Paoliello	c430e06fb5	[win][arm64ec] Fix duplicate errors with the dontcall attribute (#152810 ) Since the `dontcall-` attributes are checked both by `FastISel`/`GlobalISel` and `SelectionDAGBuilder`, and both `FastISel` and `GlobalISel` bail for calls on Arm64EC AFTER doing the check, we ended up emitting duplicate copies of this error. This change moves the checking for `dontcall-` in `FastISel` and `GlobalISel` to after the call has been successfully lowered.	2025-08-12 11:05:07 -07:00
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248 ) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Diana Picus	14cd133931	Revert "[AMDGPU] Intrinsic for launching whole wave functions" (#152286 ) Reverts llvm/llvm-project#145859 because it broke a HIP test: ``` [34/59] Building CXX object External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o FAILED: External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/llvm/bin/clang++ -DNDEBUG -O3 -DNDEBUG -w -Werror=date-time --rocm-path=/opt/botworker/llvm/External/hip/rocm-6.3.0 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 --offload-arch=gfx1100 -xhip -mfma -MD -MT External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o -MF External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o.d -o External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o -c /home/botworker/bbot/clang-hip-vega20/llvm-test-suite/External/HIP/workload/ray-tracing/TheNextWeek/main.cc fatal error: error in backend: Cannot select: intrinsic %llvm.amdgcn.readfirstlane ```	2025-08-06 12:24:52 +02:00
Diana Picus	0461cd3d1d	[AMDGPU] Intrinsic for launching whole wave functions (#145859 ) Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave functions. This will take as its first argument the callee with the amdgpu_gfx_whole_wave calling convention, followed by the call parameters which must match the signature of the callee except for the first function argument (the i1 original EXEC mask, which doesn't need to be passed in). Indirect calls are not allowed. Make direct calls to amdgpu_gfx_whole_wave functions a verifier error. Unspeakable horrors happen around calls from whole wave functions, the plan is to improve the handling of caller/callee-saved registers in a future patch. Tail calls are also handled in a future patch.	2025-08-06 10:25:53 +02:00
Fabian Ritter	95191d5460	[GISel] Set more MIFlags when translating GEPs (#151708 ) The IRTranslator sets the flags now more consistently with `SelectionDAGBuilder::visitGetElementPtr()`. This affects `nuw` and `nusw`, as well as the recently introduced `inbounds` MIFlag (see PR #150900). This PR also adds more tests to `AArch64/GlobalISel/irtranslator-gep-flags.ll` to cover all points in `IRTranslator::translateGetElementPtr` that set flags. For SWDEV-516125.	2025-08-04 13:25:33 +02:00
Nikita Popov	86727fe9a1	[IR] Allow poison argument to lifetime markers (#151148 ) This slightly relaxes the invariant established in #149310, by also allowing the lifetime argument to be poison. This is to support the typical pattern of RAUWing with poison when removing an instruction. It's worth noting that this does not require any conservative assumptions, lifetimes with poison arguments can simply be skipped. Fixes https://github.com/llvm/llvm-project/issues/151119.	2025-08-04 10:02:04 +02:00
Fabian Ritter	d64240b5c6	[GISel] Introduce MachineIRBuilder::(build\|materialize)ObjectPtrOffset (#150392 ) These functions are for building G_PTR_ADDs when we know that the base pointer and the result are both valid pointers into (or just after) the same object. They are similar to SelectionDAG::getObjectPtrOffset. This PR also changes call sites of the generic (build\|materialize)PtrAdd functions that implement pointer arithmetic to split large memory accesses to the new functions. Since memory accesses have to fit into an object in memory, pointer arithmetic to an offset into a large memory access also yields an address in that object. Currently, these (build\|materialize)ObjectPtrOffset functions only add "nuw" to the generated G_PTR_ADD, but I intend to introduce an "inbounds" MIFlag in a later PR (analogous to a concurrent effort in SDAG: #131862, related: #140017, #141725) that will also be set in the (build\|materialize)ObjectPtrOffset functions. Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where offsets are now folded into scratch instructions, and cases where the behavior of the check regeneration script changed, resulting, e.g., in better checks for "nusw G_PTR_ADD" instructions, matched empty lines, and the use of "CHECK-NEXT" in MIPS tests. For SWDEV-516125.	2025-07-29 13:04:04 +02:00
Nikita Popov	a7a1df8f72	[CodeGen] Remove handling for lifetime.start/end on non-alloca (#149838 ) After https://github.com/llvm/llvm-project/pull/149310 we are guaranteed that the argument is an alloca, so we don't need to look at underlying objects (which was not a correct thing to do anyway). This also drops the offset argument for lifetime nodes in SDAG. The offset is fixed to zero now. (Peculiarly, while SDAG pretended to have an offset, it just gets silently dropped during selection.)	2025-07-22 09:44:59 +02:00
JaydeepChauhan14	0f0079c29d	[X86][GlobalISel] Added support for llvm.get.rounding (#147716 ) - This implementation is adapted from SDAG X86TargetLowering::LowerGET_ROUNDING. - llvm.set.rounding will be added later because it involves MXCSR updates currently unsupported.	2025-07-11 15:48:18 +02:00
Kazu Hirata	16435a87b6	[CodeGen] Remove an unnecessary cast (NFC) (#147155 ) Offset is already of int64_t.	2025-07-05 12:26:35 -07:00
Nikita Popov	e56384ff54	[IRTranslator] Remove unnecessary isIntrinsic() check (NFC) Directly call getIntrinsicID(), there is no need to check for isIntrinsic() first.	2025-06-23 12:43:19 +02:00
Rahul Joshi	1fdf02ad5a	[LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties (#140002 ) Add per-property has<Prop>/set<Prop>/reset<Prop> functions to MachineFunctionProperties.	2025-05-22 08:07:52 -07:00

685 Commits