We need to use the minimum size of the scalable type and the correct
stack ID.
The code in the PR is still invalid because the instruction used doesn't
have a pointer operand. This is diagnosed later when the assembler
parses it.
Fixes #99782
This reverts commit 740161a9b98c9920dedf1852b5f1c94d0a683af5.
I moved the `ISD` dependencies into the CodeGen portion of the handling. It's a little awkward, but it's the easiest solution I can think of for now.
MachineFunctions probably should not include a backreference to
the owning MachineModuleInfo. Most of these references were used
just to query the MCContext, which MachineFunction already stores
directly. Other uses were to query the LLVMContext, which
can already be accessed through the IR function reference.
This PR adds a new vector intrinsic `@llvm.experimental.vector.compress`
to "compress" data within a vector based on a selection mask, i.e., it
moves all selected values (i.e., where `mask[i] == 1`) to consecutive
lanes in the result vector. A `passthru` vector can be provided, from
which remaining lanes are filled.
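For example, a minimal sketch of the intended semantics (lane values and the exact type mangling are illustrative):
```ll
; Selected lanes of %vec are packed to the front of the result; the remaining
; lanes are filled from %passthru.
%res = call <4 x i32> @llvm.experimental.vector.compress.v4i32(<4 x i32> %vec, <4 x i1> %mask, <4 x i32> %passthru)
; e.g. vec = <1, 2, 3, 4>, mask = <1, 0, 1, 0>, passthru = <7, 7, 7, 7>
;  =>  res = <1, 3, 7, 7>
```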
The main reason for this is that the existing
`@llvm.masked.compressstore` has very strong constraints in that it can
only write values that were selected, resulting in guard branches for
all targets except AVX-512 (and even there the AMD implementation is
_very_ slow). More instruction sets support "compress" logic, but only
within registers. So to store the values, an additional store is needed.
But this combination is likely significantly faster on many targets as it
avoids branches.
In follow up PRs, my plan is to add target-specific lowerings for x86,
SVE, and possibly RISCV. I also want to combine this with a store
instruction, as this is probably a common case and we can avoid some
memory writes in that case.
See [discussion in
forum](https://discourse.llvm.org/t/new-intrinsic-for-masked-vector-compress-without-store/78663)
for initial discussion on the design.
Well, not quite that simple. We can tail-call memset since it returns its
first argument, but bzero doesn't do that, and therefore we can end up
miscompiling.
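A minimal IR sketch of the difference (the declarations and helper functions here are illustrative, not from the patch):
```ll
declare ptr @memset(ptr, i32, i64)
declare void @bzero(ptr, i64)

; memset returns its first argument, so a caller that returns the same pointer
; can make this a genuine tail call.
define ptr @zero_with_memset(ptr %p, i64 %n) {
  %r = tail call ptr @memset(ptr %p, i32 0, i64 %n)
  ret ptr %r
}

; bzero returns void, so the caller still has to produce %p itself after the
; call, which means the call is not in a real tail-call position.
define ptr @zero_with_bzero(ptr %p, i64 %n) {
  call void @bzero(ptr %p, i64 %n)
  ret ptr %p
}
```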
This patch also refactors the logic out of isInTailCallPosition() into the callers.
As a result memcpy and memmove are also modified to do the same thing
for consistency.
rdar://131419786
Summary:
The LTO pass and LLD linker have logic in them that forces extraction
and prevents internalization of needed runtime calls. However, these
currently take all RTLibcalls into account, even if the target does not
support them. A target opts out of a libcall by setting its name to
nullptr. This patch pulls this logic out into a class in the header so
that LTO / lld can use it to determine whether a symbol actually needs
to be kept.
This is important for targets like AMDGPU that want to be able to use
`lld` to perform the final link step but do not want the overhead of
uncalled functions. (This easily adds about a second to the link time.)
This tries to turn indirect ptrauth calls into direct calls, using
`ConstantPtrAuth::isKnownEquivalent` to compare the `ConstantPtrAuth`
target with the ptrauth call bundle.
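A rough sketch of the transform this enables (key and discriminator values are illustrative):
```ll
; The callee is a ptrauth constant whose key/discriminator match the call's
; "ptrauth" bundle, so authentication is known to succeed and the call can be
; made direct.
call void ptrauth (ptr @callee, i32 0, i64 1234)() [ "ptrauth"(i32 0, i64 1234) ]
;   ==>
call void @callee()
```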
This should be straightforward, other than the somewhat awkward GISel
handling, which has a handshake between CallLowering and IRTranslator to
elide the ptrauth when possible.
So far the branch protection, sign return address, and guarded control stack
attributes are only emitted as module flags to indicate that the functions
need to be generated with those features.
The problem is that in an LTO build the module flags are merged with the
`min` rule, which means that if one of the modules is not built with sign
return address, the feature is turned off for all functions, because the
functions take the branch-protection and sign-return-address features from
the module flags. Sign-return-address is a function-level option, so
functions from files compiled with -mbranch-protection=pac-ret are expected
to be protected.
The inliner might also inline functions with a different set of flags, as it
doesn't consider the module flags.
This patch adds the attributes to all functions and drops the checking of
the module flags for code generation. The module flags are still used for
generating the ELF markers.
It also drops the "true"/"false" values from the
branch-protection-enforcement, branch-protection-pauth-lr, and
guarded-control-stack attributes: presence of the attribute means the
feature is on, absence means it is off, and there is no other option.
Reland with test fixes.
So far the branch protection, sign return address, and guarded control stack
attributes are only emitted as module flags to indicate that the functions
need to be generated with those features.
The problem is that in an LTO build the module flags are merged with the
`min` rule, which means that if one of the modules is not built with sign
return address, the feature is turned off for all functions, because the
functions take the branch-protection and sign-return-address features from
the module flags. Sign-return-address is a function-level option, so
functions from files compiled with -mbranch-protection=pac-ret are expected
to be protected.
The inliner might also inline functions with a different set of flags, as it
doesn't consider the module flags.
This patch adds the attributes to all functions and drops the checking of
the module flags for code generation. The module flags are still used for
generating the ELF markers.
It also drops the "true"/"false" values from the
branch-protection-enforcement, branch-protection-pauth-lr, and
guarded-control-stack attributes: presence of the attribute means the
feature is on, absence means it is off, and there is no other option.
With `--trap-unreachable`, `clang` can emit double `TRAP` instructions
for code that contains a call to `__builtin_trap()`:
```
> cat test.c
void test() { __builtin_trap(); }
> clang test.c --target=x86_64 -mllvm --trap-unreachable -O1 -S -o -
...
test:
...
ud2
ud2
...
```
`SimplifyCFGPass` inserts `unreachable` after a call to a `noreturn`
function, and later this instruction causes `TRAP/G_TRAP` to be emitted
in `SelectionDAGBuilder::visitUnreachable()` or
`IRTranslator::translateUnreachable()` if
`TargetOptions.TrapUnreachable` is set.
The patch checks the instruction before `unreachable` and avoids
inserting an additional trap.
This re-applies #94241 after fixing a buildbot failure, see
https://lab.llvm.org/buildbot/#/builders/51/builds/570
According to the standard, `constexpr` variables and `const` variables
initialized with constant expressions can be used in lambdas without
capturing - see https://en.cppreference.com/w/cpp/language/lambda.
However, MSVC used on buildkite seems to ignore that rule and does not
allow using such uncaptured variables in lambdas: we have "error C3493:
'Mask16' cannot be implicitly captured because no default capture mode
has been specified" - see
https://buildkite.com/llvm-project/github-pull-requests/builds/73238
Explicitly capturing such a variable, however, makes buildbot fail with
"error: lambda capture 'Mask16' is not required to be captured for this
use [-Werror,-Wunused-lambda-capture]" - see
https://lab.llvm.org/buildbot/#/builders/51/builds/570.
Fix both cases by using the `0xffff` value directly instead of giving it a
name.
Original PR description below.
Depends on #94240.
Define the following pseudos for lowering ptrauth constants in code:
- non-`extern_weak`:
  - no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added PAC;
  - GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but uses a
special stub slot named `sym$auth_ptr$key$disc`, filled by the dynamic
linker during relocation resolution instead of a GOT slot.
---------
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
1. Add TTI interface for conditional load/store.
2. Mark 1 x i16/i32/i64 masked load/store legal so that it's not
legalized in the scalarize-masked-mem-intrin pass.
3. Visit 1 x i16/i32/i64 masked load/store to build a target-specific
CLOAD/CSTORE node to avoid an error in
`DAGTypeLegalizer::ScalarizeVectorResult` (see the sketch after this list).
4. Combine DAG to simplify the nodes for CLOAD/CSTORE.
5. Lower CLOAD/CSTORE to CFCMOV by pattern match.
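As a rough illustration (types and alignment chosen arbitrarily), this is the kind of single-element masked load from items 2 and 3 that can now select to CFCMOV instead of being scalarized:
```ll
%v = call <1 x i64> @llvm.masked.load.v1i64.p0(ptr %p, i32 8, <1 x i1> %m, <1 x i64> %passthru)
```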
This is CodeGen part of #95515
Depends on #94240.
Define the following pseudos for lowering ptrauth constants in code:
- non-`extern_weak`:
  - no GOT load needed: `MOVaddrPAC` - similar to `MOVaddr`, with added PAC;
  - GOT load needed: `LOADgotPAC` - similar to `LOADgot`, with added PAC;
- `extern_weak`: `LOADauthptrstatic` - similar to `LOADgot`, but uses a
special stub slot named `sym$auth_ptr$key$disc`, filled by the dynamic
linker during relocation resolution instead of a GOT slot.
---------
Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
As far as I can tell, this pull request was not approved, and
did not go through an RFC on discourse.
This reverts commit 89881480030f48f83af668175b70a9798edca2fb.
This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.
Currently, the behavior of llvm.minnum differs across platforms if one
operand is sNaN. When we compare sNaN vs NUM:
- ARM/AArch64/PowerPC: follow IEEE754-2008's minNum: return qNaN.
- RISC-V/Hexagon: follow IEEE754-2019's minimumNumber: return NUM.
- X86: returns NUM, but does not match IEEE754-2019's minimumNumber, as
+0.0 is not always treated as greater than -0.0.
- MIPS/LoongArch/Generic: return NUM.
- LIBCALL: returns qNaN.
So, let's introduce llvm.minimumnum/llvm.maximumnum, which always follow
IEEE754-2019's minimumNumber/maximumNumber.
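A usage sketch of the new intrinsics (f32 overload shown; the result is the numeric operand even when the other operand is an sNaN):
```ll
%lo = call float @llvm.minimumnum.f32(float %a, float %b)
%hi = call float @llvm.maximumnum.f32(float %a, float %b)
```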
Half-fix: #93033
The `rethrow` instruction is a terminator, but when its DAG was built in
`SelectionDAGBuilder` in a custom routine, it was NOT treated as such.
```ll
rethrow: ; preds = %catch.start
invoke void @llvm.wasm.rethrow() #1 [ "funclet"(token %1) ]
to label %unreachable unwind label %ehcleanup
ehcleanup: ; preds = %rethrow, %catch.dispatch
%tmp = phi i32 [ 10, %catch.dispatch ], [ 20, %rethrow ]
...
```
In this bitcode, because of the `phi`, a `CONST_I32` will be created in
the `rethrow` BB. Without this patch, the DAG for the `rethrow` BB looks
like this:
```
t0: ch,glue = EntryToken
t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
t5: ch = llvm.wasm.rethrow t0, TargetConstant:i32<12161>
t6: ch = TokenFactor t3, t5
t8: ch = br t6, BasicBlock:ch<unreachable 0x562532e43c50>
```
Note that `CopyToReg` and `llvm.wasm.rethrow` have no dependence between
them, so either can come first in the selected code, which can result in
code like
```mir
bb.3.rethrow:
RETHROW 0, implicit-def dead $arguments
%9:i32 = CONST_I32 20, implicit-def dead $arguments
BR %bb.6, implicit-def dead $arguments
```
After this patch, `llvm.wasm.rethrow` is treated as a terminator, and
the DAG will look like
```
t0: ch,glue = EntryToken
t3: ch = CopyToReg t0, Register:i32 %9, Constant:i32<20>
t5: ch = llvm.wasm.rethrow t3, TargetConstant:i32<12161>
t7: ch = br t5, BasicBlock:ch<unreachable 0x5555e3d32c70>
```
Note that now `rethrow` takes a token from `CopyToReg`, so `rethrow` has
to come after `CopyToReg`. And the resulting code will be
```mir
bb.3.rethrow:
%9:i32 = CONST_I32 20, implicit-def dead $arguments
RETHROW 0, implicit-def dead $arguments
BR %bb.6, implicit-def dead $arguments
```
I'm not very familiar with the internals of `getRoot` vs.
`getControlRoot`, but other terminator instructions seem to use the
latter, and using it for `rethrow` too worked.
This PR adds initial support for the `scmp`/`ucmp` 3-way comparison
intrinsics in the SelectionDAG. Some of the expansions/lowerings
are not optimal yet.
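For reference, a usage sketch of the intrinsics being lowered (result and operand widths are illustrative):
```ll
%c = call i8 @llvm.scmp.i8.i32(i32 %a, i32 %b)   ; signed: -1, 0, or 1
%d = call i8 @llvm.ucmp.i8.i32(i32 %a, i32 %b)   ; unsigned: -1, 0, or 1
```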
lowerInvokeable wasn't updating the returned chain after emitting the
EH_LABEL in lowerEndEH, which caused the SwiftErrorVal-handling code to
re-set the DAG root, and thus accidentally skip the EH_LABEL node it was
supposed to have added. After fixing that, a few places that assumed the
specific shape of the returned DAG needed to be adjusted.
Fixes: #64826
Fixes: rdar://113994760
Prior to this change, `SelectionDAGBuilder` was producing `SDNode`s of
the form: `f32 = extract_vector_elt <1 x bfloat|half>, i32 0` when
lowering phis of `<1 x bfloat|half>` and running on a target that
promotes this type to `f32` (like some x86 or AMDGPU targets.)
This construct is invalid since this type of node only allows type
extensions for integer types.
It went unnoticed because the `extract_vector_elt` node is later broken
down into a `bitcast` followed by `bf16_to_fp|fp_extend`. However, when the
argument of the phi is a constant, we were crashing because the existing
code would try to constant fold this `extract_vector_elt` into an
`any_ext`.
This patch fixes this by using a proper decomposition for `<1 x
bfloat|half>`:
```
bfloat|half = bitcast <1 x bfloat|half>
float = fp_extend bfloat|half
```
This change should be NFC for the non-constant-folding cases and fix the
SDISel crashes (reported in
https://github.com/llvm/llvm-project/issues/94449) for the folding
cases.
Note: the change in the ARM test is due to a missing fp16-to-f32 constant
folding exposed by this patch. I'll push a separate improvement for that.
`SelectionDAGBuilder::handleDebugValue` has a parameter `Order` which
represents the insert-at position for the new DBG_VALUE. Prior to this patch
`SelectionDAGBuilder::SDNodeOrder` is used instead of the `Order` parameter.
The only code-paths where `Order != SDNodeOrder` are the two calls to
`handleDebugValue` from `salvageUnresolvedDbgValue`.
`salvageUnresolvedDbgValue` is called from `resolveOrClearDbgInfo` and
`dropDanglingDebugInfo`. The former is called after SelectionDAG completes one
block.
Some dbg.values can't be lowered to DBG_VALUEs right away. These get recorded
as 'dangling' - their order-number is saved - and get salvaged later through
`dropDanglingDebugInfo`, or if we've still got dangling debug info once the
whole block has been emitted, through `resolveOrClearDbgInfo`. Their saved
order-number is passed to `handleDebugValue`.
Prior to this patch, DBG_VALUEs inserted using these functions are inserted at
the "current" `SDNodeOrder` rather than the intended position that is passed to
the function.
Fix and add test.
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics, which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
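At the IR level the new intrinsic is used like the other unary FP intrinsics (f32 overload shown as a sketch):
```ll
%t = call float @llvm.tan.f32(float %x)
```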
Much of this change was following how G_FSIN and G_FCOS were used.
Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
- `llvm/docs/LangRef.rst` - Document the tan intrinsic
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan
intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to
`ISD::FTAN`
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for
`FTAN` and `STRICT_FTAN`
- `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall
mappings
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map
`G_FTAN` to `ftan`
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`,
`strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN`
and `STRICT_FTAN`
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a
vector intrinsic
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` - Map the tan intrinsic
to the `G_FTAN` Opcode
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to
the list of floating point math operations, and associate `G_FTAN` with
the `TAN_F` runtime libcall.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math
operation common behaviors.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp` - List the function
expansion operations for `FTAN` and `STRICT_FTAN`. Also define both
opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN`
and `STRICT_FTAN` handling in the legalizer
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define
`SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN`
as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define
`FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - Define tan as an
intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` - Map
`LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map
`Intrinsic::tan` to `ISD::FTAN` and add SelectionDAG handling for
`Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan`
and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` - Define a Tan128 libcall and
ISD::FTAN as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for
tan intrinsic
resolves https://github.com/llvm/llvm-project/issues/70082
Remove support for the icmp and fcmp constant expressions.
This is part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
As usual, many of the updated tests will no longer test what they were
originally intended to -- this is hard to preserve when constant
expressions get removed, and in many cases just impossible as the
existence of a specific kind of constant expression was the cause of the
issue in the first place.
Since pointers in memory and the index type are both 32 bits, but pointers
in registers are 64 bits, the mask generated by llvm.ptrmask needs to be
zero-extended.
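A sketch of the shape involved (address space and mask value are illustrative): the mask matches the 32-bit index type, while the pointer itself lives in a 64-bit register, so the mask has to be zero-extended before it is applied.
```ll
%aligned = call ptr @llvm.ptrmask.p0.i32(ptr %p, i32 -16)
```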
Fixes: #94075
Fixes: rdar://125263567
This adds codegen support for the "ptrauth" operand bundles, which can
be used to augment indirect calls with the equivalent of an
`@llvm.ptrauth.auth` intrinsic call on the call target (possibly
preceded by an `@llvm.ptrauth.blend` on the auth discriminator if
applicable.)
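For reference, a call carrying the bundle looks like this (key and discriminator values are illustrative):
```ll
call void %fptr() [ "ptrauth"(i32 0, i64 %disc) ]
```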
This allows the generation of combined authenticating calls
on AArch64 (using the BLRA* PAuth instructions), while preventing
the raw just-authenticated function pointer from being
exposed to attackers.
This is done by threading a PtrAuthInfo descriptor through
the call lowering infrastructure, eventually selecting a BLRA
pseudo. The pseudo encapsulates the safe discriminator
computation, which, together with the real BLRA* call, gets emitted
during late pseudo expansion in the AsmPrinter.
Note that this also applies to the other forms of indirect calls,
notably invokes, rvmarker, and tail calls. Tail-calls in particular
bring some additional complexity, with the intersecting register
constraints of BTI and PAC discriminator computation.
However this doesn't currently support PAuth_LR tail-call variants.
This also adopts an x8+ allocation order for GPR64noip, matching
GPR64.
The current way of lowering `llvm.clear_cache` is a bit unusual. As
suggested by Matt Arsenault, we are better off using an ISD node.
This change introduces a new `ISD::CLEAR_CACHE` node, registers a new
libcall named `__clear_cache` by default, and makes the default
legalisation a libcall.
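The intrinsic itself is unchanged; only its lowering path changes, going through the new node and, by default, ending up as the `__clear_cache` libcall:
```ll
call void @llvm.clear_cache(ptr %begin, ptr %end)
```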
This is preparatory work for a custom lowering of `ISD::CLEAR_CACHE`
needed by RISC-V on some platforms.
Based on discussion from
https://discourse.llvm.org/t/rfc-vectorization-support-for-histogram-count-operations/74788
The current interface is:
`llvm.experimental.histogram(<vecty> ptrs, <intty> inc_amount, <vecty> mask)`
The integer type used by 'inc_amount' needs to match the type of the buckets in memory.
The intrinsic covers the following operations:
* Gather load
* histogram on the elements of 'ptrs'
* multiply the histogram results by 'inc_amount'
* add the result of the multiply to the values loaded by the gather
* scatter store the results of the add
Supports lowering to histcnt instructions for AArch64 targets, and scalarization for all others at present.
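A usage sketch against the interface above (the type-suffix mangling on the name is an assumption here, not part of the patch):
```ll
; For each active lane, the i32 bucket addressed by that lane of %buckets is
; incremented by %inc once per active lane that points at it.
call void @llvm.experimental.histogram.v4p0.i32(<4 x ptr> %buckets, i32 %inc, <4 x i1> %mask)
```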
In PR #88385 I've added support for auto-vectorisation of some early
exit loops, which requires using the experimental.cttz.elts intrinsic to
calculate final indices in the early exit block. We need a more accurate cost
model for this intrinsic to better reflect the cost of work required in
the early exit block. I've tried to accurately represent the expansion
code for the intrinsic when the target does not have efficient lowering
for it. It's quite tricky to model because you need to first figure out
what types will actually be used in the expansion. The type used can
have a significant effect on the cost if you end up using illegal vector
types.
Tests added here:
Analysis/CostModel/AArch64/cttz_elts.ll
Analysis/CostModel/RISCV/cttz_elts.ll
This patch moves the following intrinsics out of the experimental namespace:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice
All these intrinsics have existed in LLVM for more than a year now and are
widely used, so they should no longer be considered experimental.
This is a fix for miscompiles reported in
https://github.com/llvm/llvm-project/issues/89060
After argument copy elision the IR value for the eliminated alloca
aliases the fixed stack object. This patch makes sure
that we mark the fixed stack object as being aliased with IR values,
to avoid, for example, schedulers reordering accesses to
the fixed stack object. This could otherwise happen when there is a
mix of MemOperands referring to the shared fixed stack slot via both
the IR value for the elided alloca and via a fixed stack pseudo
source value (as would be the case when lowering the arguments).