llvm-project

Author	SHA1	Message	Date
David Green	03912a1de5	[GlobalISel] Translate scalar sequential vecreduce.fadd/fmul as fadd/fmul. (#153966 ) A llvm.vector.reduce.fadd(float, <1 x float>) will be translated to G_VECREDUCE_SEQ_FADD with two scalar operands, which is illegal according to the verifier. This makes sure we generate a fadd/fmul instead.	2025-08-18 14:59:44 +00:00
Kazu Hirata	07eb7b7692	[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068 ) This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>. Note that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer element types: template <typename PointeeType, unsigned N> class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N> {}; We only have 140 instances that rely on this "redirection", with the vast majority of them under llvm/. Since relying on the redirection doesn't improve readability, this patch replaces SmallSet with SmallPtrSet for pointer element types.	2025-08-18 07:01:29 -07:00
Diana Picus	ac005e16f6	Reapply "[AMDGPU] Intrinsic for launching whole wave functions" (#153584 ) This reverts commit 14cd1339318b16e08c1363ec6896bd7d1e4ae281. The buildbot failure seems to have been a cmake issue which has been discussed in more detail in this Discourse post: https://discourse.llvm.org/t/cmake-doesnt-regenerate-all-tablegen-target-files/87901 If any buildbots fail to select arbitrary intrinsics with this patch, it's worth considering using clean builds with ccache instead of incremental builds, as recommended here: https://llvm.org/docs/HowToAddABuilder.html#:~:text=Use%20CCache%20and%20NOT%20incremental%20builds The original commit message for this patch: Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave functions. This will take as its first argument the callee with the amdgpu_gfx_whole_wave calling convention, followed by the call parameters which must match the signature of the callee except for the first function argument (the i1 original EXEC mask, which doesn't need to be passed in). Indirect calls are not allowed. Make direct calls to amdgpu_gfx_whole_wave functions a verifier error. Tail calls are handled in a future patch.	2025-08-15 10:12:47 +02:00
Daniel Paoliello	c430e06fb5	[win][arm64ec] Fix duplicate errors with the dontcall attribute (#152810 ) Since the `dontcall-` attributes are checked both by `FastISel`/`GlobalISel` and `SelectionDAGBuilder`, and both `FastISel` and `GlobalISel` bail for calls on Arm64EC AFTER doing the check, we ended up emitting duplicate copies of this error. This change moves the checking for `dontcall-` in `FastISel` and `GlobalISel` to after the call has been successfully lowered.	2025-08-12 11:05:07 -07:00
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248 ) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Diana Picus	14cd133931	Revert "[AMDGPU] Intrinsic for launching whole wave functions" (#152286 ) Reverts llvm/llvm-project#145859 because it broke a HIP test: ``` [34/59] Building CXX object External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o FAILED: External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o /home/botworker/bbot/clang-hip-vega20/botworker/clang-hip-vega20/llvm/bin/clang++ -DNDEBUG -O3 -DNDEBUG -w -Werror=date-time --rocm-path=/opt/botworker/llvm/External/hip/rocm-6.3.0 --offload-arch=gfx908 --offload-arch=gfx90a --offload-arch=gfx1030 --offload-arch=gfx1100 -xhip -mfma -MD -MT External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o -MF External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o.d -o External/HIP/CMakeFiles/TheNextWeek-hip-6.3.0.dir/workload/ray-tracing/TheNextWeek/main.cc.o -c /home/botworker/bbot/clang-hip-vega20/llvm-test-suite/External/HIP/workload/ray-tracing/TheNextWeek/main.cc fatal error: error in backend: Cannot select: intrinsic %llvm.amdgcn.readfirstlane ```	2025-08-06 12:24:52 +02:00
Diana Picus	0461cd3d1d	[AMDGPU] Intrinsic for launching whole wave functions (#145859 ) Add the llvm.amdgcn.call.whole.wave intrinsic for calling whole wave functions. This will take as its first argument the callee with the amdgpu_gfx_whole_wave calling convention, followed by the call parameters which must match the signature of the callee except for the first function argument (the i1 original EXEC mask, which doesn't need to be passed in). Indirect calls are not allowed. Make direct calls to amdgpu_gfx_whole_wave functions a verifier error. Unspeakable horrors happen around calls from whole wave functions, the plan is to improve the handling of caller/callee-saved registers in a future patch. Tail calls are also handled in a future patch.	2025-08-06 10:25:53 +02:00
Fabian Ritter	95191d5460	[GISel] Set more MIFlags when translating GEPs (#151708 ) The IRTranslator sets the flags now more consistently with `SelectionDAGBuilder::visitGetElementPtr()`. This affects `nuw` and `nusw`, as well as the recently introduced `inbounds` MIFlag (see PR #150900). This PR also adds more tests to `AArch64/GlobalISel/irtranslator-gep-flags.ll` to cover all points in `IRTranslator::translateGetElementPtr` that set flags. For SWDEV-516125.	2025-08-04 13:25:33 +02:00
Nikita Popov	86727fe9a1	[IR] Allow poison argument to lifetime markers (#151148 ) This slightly relaxes the invariant established in #149310, by also allowing the lifetime argument to be poison. This is to support the typical pattern of RAUWing with poison when removing an instruction. It's worth noting that this does not require any conservative assumptions: lifetimes with poison arguments can simply be skipped. Fixes https://github.com/llvm/llvm-project/issues/151119.	2025-08-04 10:02:04 +02:00
Fabian Ritter	d64240b5c6	[GISel] Introduce MachineIRBuilder::(build|materialize)ObjectPtrOffset (#150392 ) These functions are for building G_PTR_ADDs when we know that the base pointer and the result are both valid pointers into (or just after) the same object. They are similar to SelectionDAG::getObjectPtrOffset. This PR also changes call sites of the generic (build|materialize)PtrAdd functions that implement pointer arithmetic to split large memory accesses to the new functions. Since memory accesses have to fit into an object in memory, pointer arithmetic to an offset into a large memory access also yields an address in that object. Currently, these (build|materialize)ObjectPtrOffset functions only add "nuw" to the generated G_PTR_ADD, but I intend to introduce an "inbounds" MIFlag in a later PR (analogous to a concurrent effort in SDAG: #131862, related: #140017, #141725) that will also be set in the (build|materialize)ObjectPtrOffset functions. Most test changes just add "nuw" to G_PTR_ADDs. Exceptions are AMDGPU's call-outgoing-stack-args.ll, flat-scratch.ll, and freeze.ll tests, where offsets are now folded into scratch instructions, and cases where the behavior of the check regeneration script changed, resulting, e.g., in better checks for "nusw G_PTR_ADD" instructions, matched empty lines, and the use of "CHECK-NEXT" in MIPS tests. For SWDEV-516125.	2025-07-29 13:04:04 +02:00
Nikita Popov	a7a1df8f72	[CodeGen] Remove handling for lifetime.start/end on non-alloca (#149838 ) After https://github.com/llvm/llvm-project/pull/149310 we are guaranteed that the argument is an alloca, so we don't need to look at underlying objects (which was not a correct thing to do anyway). This also drops the offset argument for lifetime nodes in SDAG. The offset is fixed to zero now. (Peculiarly, while SDAG pretended to have an offset, it just gets silently dropped during selection.)	2025-07-22 09:44:59 +02:00
JaydeepChauhan14	0f0079c29d	[X86][GlobalISel] Added support for llvm.get.rounding (#147716 ) - This implementation is adapted from SDAG X86TargetLowering::LowerGET_ROUNDING. - llvm.set.rounding will be added later because it involves MXCSR updates currently unsupported.	2025-07-11 15:48:18 +02:00
Kazu Hirata	16435a87b6	[CodeGen] Remove an unnecessary cast (NFC) (#147155 ) Offset is already of int64_t.	2025-07-05 12:26:35 -07:00
Nikita Popov	e56384ff54	[IRTranslator] Remove unnecessary isIntrinsic() check (NFC) Directly call getIntrinsicID(), there is no need to check for isIntrinsic() first.	2025-06-23 12:43:19 +02:00
Rahul Joshi	1fdf02ad5a	[LLVM][CodeGen] Add convenience accessors for MachineFunctionProperties (#140002 ) Add per-property has<Prop>/set<Prop>/reset<Prop> functions to MachineFunctionProperties.	2025-05-22 08:07:52 -07:00
Pierre van Houtryve	b5e2a236b9	[CodeGen] Add SSID & Atomic Ordering to IntrinsicInfo (#140896 ) getTgtMemIntrinsic should be able to propagate such information to the MMO	2025-05-22 11:42:01 +02:00
Matt Arsenault	8c61befff8	GlobalISel: Translate minimumnum and maximumnum (#139106 )	2025-05-08 20:03:34 +02:00
Philip Reames	c0a264e6a9	[IntrinsicInst] Remove MemCpyInlineInst and MemSetInlineInst [nfc] (#138568 ) I'm looking for ways to simplify the Mem*Inst class structure, and these two seem to have fairly minimal justification, so let's remove them.	2025-05-05 14:07:31 -07:00
Kazu Hirata	cdc9a4b5f8	[CodeGen] Use range-based for loops (NFC) (#138488 ) This is a reland of #138434 except that: - the bits for llvm/lib/CodeGen/RenameIndependentSubregs.cpp have been dropped because they caused a test failure under asan, and - the bits for llvm/lib/CodeGen/SelectionDAG/ScheduleDAGFast.cpp have been improved with structured bindings.	2025-05-05 10:08:49 -07:00
Nico Weber	1d955489c3	Revert "[CodeGen] Use range-based for loops (NFC) (#138434 )" This reverts commit a9699a334bc9666570418a3bed9520bcdc21518b. Breaks CodeGen/AMDGPU/collapse-endcf.ll in several configs (sanitizer builds; macOS; possibly more), see comments on https://github.com/llvm/llvm-project/pull/138434	2025-05-04 17:36:52 -04:00
Kazu Hirata	a9699a334b	[CodeGen] Use range-based for loops (NFC) (#138434 )	2025-05-04 00:26:19 -07:00
Kazu Hirata	aa613777af	[llvm] Remove redundant control flow (NFC) (#138304 )	2025-05-02 10:34:25 -07:00
Jonathan Thackray	6e49f73825	Reland [llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#137701 ) This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.` and `llvm.minimum.` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details. Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.	2025-04-30 22:06:37 +01:00
Jonathan Thackray	7ee0097b48	Revert "[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions" (#137657 ) Reverts llvm/llvm-project#136759 due to bad interaction with c792b25e4	2025-04-28 16:53:36 +01:00
Jonathan Thackray	ba420d8122	[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#136759 ) This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum` instructions. These mirror the `llvm.maximum.` and `llvm.minimum.` instructions, but are atomic and use IEEE754 2019 handling for NaNs, which is different to `fmax` and `fmin`. See: https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic for more details. Future changes will allow this LLVM IR to be lowered to specialised assembler instructions on suitable targets, such as AArch64.	2025-04-28 15:31:44 +01:00
Paul Walker	be82be281d	[LLVM][GlobalISel] Ensure G_{F}CONSTANT only store references to scalar Constant{Int,FP}. (#137319 )	2025-04-28 11:40:39 +01:00
Kazu Hirata	47d8fec9b8	[llvm] Use llvm::append_range (NFC) (#136066 ) This patch replaces: llvm::copy(Src, std::back_inserter(Dst)); with: llvm::append_range(Dst, Src); for brevity. One side benefit is that llvm::append_range eventually calls llvm::SmallVector::reserve if Dst is of llvm::SmallVector.	2025-04-16 19:30:01 -07:00
Kazu Hirata	e3a3f78f35	[CodeGen] Use llvm::append_range (NFC) (#133603 )	2025-03-29 16:53:02 -07:00
yonghong-song	0ffe83feac	[SelectionDAG] Not issue TRAP node if naked function (#132147 ) In [1], Nikita Popov suggested that during lowering, the 'unreachable' insn should not generate extra code for naked functions, and that this applies to all architectures. Note that for naked functions, the 'unreachable' insn is necessary in IR since the basic block needs a terminator to end. This patch checks whether a function is a naked function. If it is, the 'unreachable' insn will not generate ISD::TRAP. [1] https://github.com/llvm/llvm-project/pull/131731 Co-authored-by: Yonghong Song <yonghong.song@linux.dev>	2025-03-20 18:18:03 -07:00
David Green	bd1be8a242	[CodeGen][GlobalISel] Add a getVectorIdxWidth and getVectorIdxLLT. (#131526 ) From #106446, this adds a variant of getVectorIdxTy that returns an LLT. Many uses only look at the width, so a getVectorIdxWidth was added as the common base.	2025-03-18 08:31:11 +00:00
Kazu Hirata	a5bbfcf0c9	[GlobalISel] Avoid repeated hash lookups (NFC) (#129653 )	2025-03-04 00:08:40 -08:00
Rahul Joshi	0f674cce82	[NFC][LLVM] Remove unused `TargetIntrinsicInfo` class (#126003 ) Remove the `TargetIntrinsicInfo` class, as it's practically unused (it's pure virtual with no subclasses), along with its references in the code.	2025-02-10 14:56:30 -08:00
Robert Imschweiler	21560fe6b9	GlobalISel: Fix defined register of invariant.start (#125664 ) In contrast to SelectionDAG, GlobalISel created a new virtual register for the return value of invariant.start, leaving subsequent users of the invariant.start value with an undefined reference. A minimal example: ``` %tmp = alloca i32, align 4, addrspace(5) %tmpI = call ptr @llvm.invariant.start.p5(i64 4, ptr addrspace(5) %tmp) #3 call void @llvm.invariant.end.p5(ptr %tmpI, i64 4, ptr addrspace(5) %tmp) #3 store i32 %i, ptr %tmpI, align 4 ``` Although the return value of invariant.start might not be intended for any use beyond invariant.end (the fuzzer might not have created a sensible situation here), an implicit definition of the corresponding virtual register avoids a segfault in the target instruction selector later. This LLVM defect was identified via the AMD Fuzzing project.	2025-02-04 23:59:03 +07:00
David Green	5a81a559d6	[GISel] Explicitly disable BF16 tablegen patterns. (#124113 ) We currently have an issue where bf16 patterns can be used to match fp16 types, as GISel does not know about the difference between the two. This patch explicitly disables them to make sure that they are never used. The opposite can also happen, where fp16 patterns are used for operators that should be bf16. So this also changes any operations with bf16 types to now cause a fallback to SDAG. The pass setup for GISel has been slightly adjusted to make sure that a verify pass does not get added between AMD-SDAG and SIFixSGPRCopiesPass, which otherwise can cause verifier issues when falling back.	2025-01-27 22:21:12 +00:00
Jeremy Morse	6292a808b3	[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737 ) As part of the "RemoveDIs" project, BasicBlock::iterator now carries a debug-info bit that's needed when getFirstNonPHI and similar feed into instruction insertion positions. Call-sites where that's necessary were updated a year ago; however, to ensure some type safety, we'd like to have all calls to getFirstNonPHI use the iterator-returning version. This patch changes a bunch of call-sites calling getFirstNonPHI to use getFirstNonPHIIt, which returns an iterator. All these call sites are where it's obviously safe to fetch the iterator then dereference it. A follow-up patch will contain less-obviously-safe changes. We'll eventually deprecate and remove the instruction-pointer getFirstNonPHI, but not before adding concise documentation of what considerations are needed (very few). --------- Co-authored-by: Stephen Tozer <Melamoto@gmail.com>	2025-01-24 13:27:56 +00:00
antangelo	b9ac390cc7	[GISel] Add generic implementation for @llvm.expect.with.probability when optimizations are disabled (#117835 ) Handle @llvm.expect.with.probability in GlobalISel in the same way @llvm.expect is handled, passing the value through as-is. This can be encountered if the intrinsic is used without optimizations, which would otherwise transform it out. Fixes #115411 for GlobalISel	2024-11-29 22:30:13 -05:00
Kazu Hirata	4048c64306	[llvm] Remove redundant control flow statements (NFC) (#115831 ) Identified with readability-redundant-control-flow.	2024-11-12 10:09:42 -08:00
Thorsten Schütt	e399322d5e	[GlobalISel] Import llvm.stepvector (#115721 )	2024-11-11 21:35:22 +01:00
David Green	a4e507df7a	[AArch64][GlobalISel] Do not create LIFETIME instructions in functions. (#115669 ) For the same reason that we do not translate lifetime markers at -O0, we should not translate them for optnone functions either.	2024-11-11 09:27:40 +00:00
Kazu Hirata	b83399eab6	[GlobalISel] Remove unused includes (NFC) (#115429 ) Identified with misc-include-cleaner.	2024-11-08 22:28:47 -08:00
Thorsten Schütt	9061e6e58a	[GlobalISel][AArch64] Legalize G_EXTRACT_VECTOR_ELT for SVE (#115161 ) AArch64InstrGISel.td defines: def : GINodeEquiv<G_EXTRACT_VECTOR_ELT, vector_extract>; There are many patterns for SVE. Let's exploit that fact.	2024-11-08 07:58:17 +01:00
Thorsten Schütt	b3bb6f18bb	[GlobalISel] Import samesign flag (#114267 ) Credits: https://github.com/llvm/llvm-project/pull/111419 Fixes icmp-flags.mir First attempt: https://github.com/llvm/llvm-project/pull/113090 Revert: https://github.com/llvm/llvm-project/pull/114256	2024-10-30 19:56:25 +01:00
Thorsten Schütt	4b028773b2	Revert "[GlobalISel] Import samesign flag" (#114256 ) Reverts llvm/llvm-project#113090	2024-10-30 17:03:17 +01:00
Thorsten Schütt	72b115301d	[GlobalISel] Import samesign flag (#113090 ) Credits: https://github.com/llvm/llvm-project/pull/111419	2024-10-30 16:34:01 +01:00
Benjamin Maxwell	c3260c65e8	[IR] Add `llvm.sincos` intrinsic (#109825 ) This adds the `llvm.sincos` intrinsic, legalization, and lowering. The `llvm.sincos` intrinsic takes a floating-point value and returns both the sine and cosine (as a struct). ``` declare { float, float } @llvm.sincos.f32(float %Val) declare { double, double } @llvm.sincos.f64(double %Val) declare { x86_fp80, x86_fp80 } @llvm.sincos.f80(x86_fp80 %Val) declare { fp128, fp128 } @llvm.sincos.f128(fp128 %Val) declare { ppc_fp128, ppc_fp128 } @llvm.sincos.ppcf128(ppc_fp128 %Val) declare { <4 x float>, <4 x float> } @llvm.sincos.v4f32(<4 x float> %Val) ``` The lowering is built on top of the existing FSINCOS ISD node, with additional type legalization to allow for f16, f128, and vector values.	2024-10-29 10:52:20 +00:00
Keith Packard	44b020a381	[PowerPC][ISelLowering] Support -mstack-protector-guard=tls (#110928 ) Add support for using a thread-local variable with a specified offset for holding the stack guard canary value. This supports both 32- and 64-bit PowerPC targets. This mirrors changes from #108942 but targeting PowerPC instead of RISCV. Because both of these PRs modify the same driver functions, this series is stacked on top of the RISC-V one. --------- Signed-off-by: Keith Packard <keithp@keithp.com>	2024-10-17 19:06:47 -07:00
duk	464a7ee79e	[CodeGen] Generalize trap emission after SP check fail (#109744 ) Generalize and improve some target-specific code that emits traps after stack protector failure in SelectionDAG & GlobalISel.	2024-10-12 20:01:22 -04:00
Stephen Tozer	d826b0c90f	[LLVM] Add HasFakeUses to MachineFunction (#110097 ) Following the addition of the llvm.fake.use intrinsic and corresponding MIR instruction, two further changes are planned: to add an -fextend-lifetimes flag to Clang that emits these intrinsics, and to have -Og enable this flag by default. Currently, some logic for handling fake uses is gated by the optdebug attribute, which is intended to be switched on by -fextend-lifetimes (and by extension -Og later on). However, the decision was made that a general optdebug attribute should be incompatible with other opt_ attributes (e.g. optsize, optnone), since they all express different intents for how to optimize the program. We would still like to allow -fextend-lifetimes with optsize however (i.e. -Os -fextend-lifetimes should be legal), since it may be a useful configuration and there is no technical reason to not allow it. This patch resolves this by tracking MachineFunctions that have fake uses, allowing us to run passes that interact with them and skip passes that clash with them.	2024-10-04 13:13:30 +01:00
Thorsten Schütt	53943de73a	[GlobalISel] Import extract/insert subvector (#110287 ) Test: AArch64/GlobalISel/irtranslator-subvector.ll Reference: https://llvm.org/docs/LangRef.html#llvm-vector-extract-intrinsic https://llvm.org/docs/LangRef.html#llvm-vector-insert-intrinsic	2024-09-30 22:12:06 +02:00
Tex Riddell	139688a699	[SPIRV] Add atan2 function lowering (p2) (#110037 ) This change is part of this proposal: https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294 - Add generic opcode for atan2 - Add SPIRV lowering for atan2 Part 2 for Implement the atan2 HLSL Function #70096.	2024-09-26 15:00:59 -07:00
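
The `llvm::append_range` cleanup described in 47d8fec9b8 can be illustrated without LLVM headers. The following is a minimal sketch, not LLVM's implementation: `appendRange`, `copyWithBackInserter`, and `checkSpellingsAgree` are hypothetical names introduced here, `std::copy` stands in for `llvm::copy`, and `std::vector` stands in for `llvm::SmallVector`.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <vector>

// Hypothetical stand-in for llvm::append_range, written against the standard
// library only so the sketch compiles without LLVM headers.
template <typename Container, typename Range>
void appendRange(Container &Dst, const Range &Src) {
  // Hand the whole source range to the container in a single insert call,
  // so the container sees the full range at once instead of one element
  // at a time.
  Dst.insert(Dst.end(), std::begin(Src), std::end(Src));
}

// The older spelling from the commit message, with std::copy substituted for
// llvm::copy (an assumption made to keep the example self-contained).
template <typename Container, typename Range>
void copyWithBackInserter(Container &Dst, const Range &Src) {
  // Each element is pushed individually through the back_insert_iterator.
  std::copy(std::begin(Src), std::end(Src), std::back_inserter(Dst));
}

// Self-check: both spellings should leave the destination in the same state.
inline void checkSpellingsAgree() {
  const std::vector<int> Src = {3, 4, 5};
  std::vector<int> A = {1, 2};
  std::vector<int> B = {1, 2};
  copyWithBackInserter(A, Src);
  appendRange(B, Src);
  assert(A == B);
}
```

Calling `checkSpellingsAgree()` from a test or from `main` exercises both forms. The single range insert is what gives the container the chance to size its storage once, which is the side benefit the commit message attributes to `llvm::append_range` when the destination is an `llvm::SmallVector`.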

650 Commits