llvm-project

Author	SHA1	Message	Date
Youngsuk Kim	d22a236ae7	[llvm] Replace use of Type::getPointerTo() (NFC) Partial progress towards replacing in-tree uses of `Type::getPointerTo()`. If `getPointerTo()` is used solely to support an unnecessary bitcast, remove the bitcast. Reviewed By: barannikov88, nikic Differential Revision: https://reviews.llvm.org/D153307	2023-06-23 22:32:29 -04:00
Fangrui Song	f9fd0062b6	[XRay][AArch64] Support __xray_customevent/__xray_typedevent `__xray_customevent` and `__xray_typedevent` are built-in functions in Clang. With -fxray-instrument, they are lowered to intrinsics llvm.xray.customevent and llvm.xray.typedevent, respectively. These intrinsics are then lowered to TargetOpcode::{PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL}. The target is responsible for generating a code sequence that calls either `__xray_CustomEvent` (with 2 arguments) or `__xray_TypedEvent` (with 3 arguments). Before patching, the code sequence is prefixed by a branch instruction that skips the rest of the code sequence. After patching (compiler-rt/lib/xray/xray_AArch64.cpp), the branch instruction becomes a NOP and the function call will take effect. This patch implements the lowering process for {PATCHABLE_EVENT_CALL,PATCHABLE_TYPED_EVENT_CALL} and implements the runtime. ``` // Lowering of PATCHABLE_EVENT_CALL .Lxray_sled_N: b #24 stp x0, x1, [sp, #-16]! x0 = reg of op0 x1 = reg of op1 bl __xray_CustomEvent ldp x0, x1, [sp], #16 ``` As a result, two updated tests in compiler-rt/test/xray/TestCases/Posix/ now pass on AArch64. Reviewed By: peter.smith Differential Revision: https://reviews.llvm.org/D153320	2023-06-23 09:24:18 -07:00
Nikita Popov	81ec494c36	[SDAGBuilder] Handle multi-part arguments in argument copy elision (PR63430) When eliding an argument copy, we need to update the chain to ensure the argument reads are performed before later writes. However, the code doing this only handled this for the first part of the argument. If the argument had multiple parts, the chains of the later parts were dropped. Make sure we preserve all chains. Fixes https://github.com/llvm/llvm-project/issues/63430.	2023-06-22 17:04:56 +02:00
Simon Pilgrim	43ad2e9c8b	[DAG] Add getExtOrTrunc helper. NFC. Wrap the getSExtOrTrunc/getZExtOrTrunc calls behind an IsSigned argument.	2023-06-20 16:03:18 +01:00
Matt Arsenault	cdcbef1b14	DAG: Fix typo in GET_FPENV legality check This made GET_FPENV unusable since the DAG builder would always emit the mem version.	2023-06-13 20:10:21 -04:00
Anna Thomas	26bfbec5d2	[Intrinsic] Introduce reduction intrinsics for minimum/maximum This patch introduces the reduction intrinsic for floating point minimum and maximum which has the same semantics (for NaN and signed zero) as llvm.minimum and llvm.maximum. Reviewed-By: nikic Differential Revision: https://reviews.llvm.org/D152370	2023-06-13 12:29:58 -04:00
Serge Pavlov	8d1edae998	Use SelectionDAGBuilder::getRoot instead of SelectionDAG::getRoot	2023-06-13 18:59:39 +07:00
Phoebe Wang	7634905a73	[X86][BF16] Share FP16 vector ABI with BF16 The ABI of BF16 is identical to FP16 rather than i16. Fixes #62997 Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D151710	2023-06-09 09:04:56 +08:00
Matt Arsenault	eece6ba283	IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics AMDGPU has native instructions and target intrinsics for this, but these really should be subject to legalization and generic optimizations. This will enable legalization of f16->f32 on targets without f16 support. Implement a somewhat horrible inline expansion for targets without libcall support. This could be better if we could introduce control flow (GlobalISel version not yet implemented). Support for strictfp legalization is less complete but works for the simple cases.	2023-06-06 17:07:18 -04:00
Serge Pavlov	eecaeb6f10	[FPEnv] Intrinsics for access to FP environment The change implements intrinsics 'get_fpenv', 'set_fpenv' and 'reset_fpenv'. They are used to read the floating-point environment, set it, or reset it to some default state. They do the same actions as C library functions 'fegetenv' and 'fesetenv'. By default these intrinsics are lowered to calls to these functions. The new intrinsics specify the FP environment as a value of integer type, which is convenient for most targets, where the FP state is the content of some register. Some targets however use long representations. On X86 the size of the FP environment is 256 bits, and even half of this size is not a legal integer type. To facilitate legalization in such cases, two sets of DAG nodes are used. Nodes GET_FPENV and SET_FPENV are used when the FP environment may be represented by a legal integer type. Nodes GET_FPENV_MEM and SET_FPENV_MEM consider the FP environment as a region in memory, much like `fesetenv` and `fegetenv` do. They are used when the target has a long representation for floating-point state. Differential Revision: https://reviews.llvm.org/D71742	2023-06-05 13:10:01 +07:00
Dávid Bolvanský	09515f2c20	[SDAG] Preserve unpredictable metadata, teach X86CmovConversion to respect this metadata Sometimes a developer would like to have more control over cmov vs branch. We have unpredictable metadata in LLVM IR, but currently it is ignored by X86 backend. Propagate this metadata and avoid cmov->branch conversion in X86CmovConversion for cmov with this metadata. Example: ``` int MaxIndex(int n, int *a) { int t = 0; for (int i = 1; i < n; i++) { // cmov is converted to branch by X86CmovConversion if (a[i] > a[t]) t = i; } return t; } int MaxIndex2(int n, int *a) { int t = 0; for (int i = 1; i < n; i++) { // cmov is preserved if (__builtin_unpredictable(a[i] > a[t])) t = i; } return t; } ``` Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D118118	2023-06-01 20:56:44 +02:00
Luo, Yuanke	9032a94637	[NFC][DAGISel] Remove dead code.	2023-05-29 09:56:59 +08:00
Craig Topper	c5e6c886aa	[VP][SelectionDAG][RISCV] Add get_vector_length intrinsics and generic SelectionDAG support. The generic implementation is umin(TC, VF * vscale). Lowering to vsetvli for RISC-V will come in a future patch. This patch is a pre-requisite to be able to CodeGen vectorized code from D99750. Reviewed By: reames, frasercrmck Differential Revision: https://reviews.llvm.org/D149916	2023-05-26 09:06:38 -07:00
Felipe de Azevedo Piovezan	aba1bea673	[SelectionDAGBuilder] Handle entry_value dbg.value intrinsics Summary: DbgValue intrinsics whose expression is an entry_value and whose address is described by an llvm::Argument must be lowered to the corresponding livein physical register for that Argument. Depends on D151329 Reviewers: aprantl Subscribers:	2023-05-26 06:55:49 -04:00
Craig Topper	3fb1041165	[SelectionDAGBuilder] Use getPtrExtOrTrunc in place of getZExtOrTrunc. NFC This getZExtOrTrunc seems to have been added when getPtrExtOrTrunc was introduced. getPtrExtOrTrunc is currently equivalent to getZExtOrTrunc, but could be changed for some target in the future. Reviewed By: t.p.northover Differential Revision: https://reviews.llvm.org/D149680	2023-05-19 13:08:39 -07:00
eopXD	c8eb535aed	[1/11][IR] Permit load/store/alloca for struct of the same scalable vector type This patch-set aims to simplify the existing RVV segment load/store intrinsics to use a type that represents a tuple of vectors instead. To achieve this, first we need to relax the current limitation for an aggregate type to be a target of load/store/alloca when the aggregate type contains homogeneous scalable vector types. Then to adjust the prolog of an LLVM function during lowering to clang. Finally we re-define the RVV segment load/store intrinsics to use the tuple types. The pull request under the RVV intrinsic specification is riscv-non-isa/rvv-intrinsic-doc#198 --- This is the 1st patch of the patch-set. This patch is originated from D98169. This patch allows aggregate type (StructType) that contains homogeneous scalable vector types to be a target of load/store/alloca. The RFC of this patch was posted in LLVM Discourse. https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527 The main changes in this patch are: Extend `StructLayout::StructSize` from `uint64_t` to `TypeSize` to accommodate an expression of scalable size. Allow `StructType:isSized` to also return true for homogeneous scalable vector types. Let `Type::isScalableTy` return true when `Type` is `StructType` and contains scalable vectors Extra description is added in the LLVM Language Reference Manual on the relaxation of this patch. Authored-by: Hsiangkai Wang <kai.wang@sifive.com> Co-Authored-by: eop Chen <eop.chen@sifive.com> Reviewed By: craig.topper, nikic Differential Revision: https://reviews.llvm.org/D146872	2023-05-19 09:39:36 -07:00
OCHyams	6c088972d2	[DebugInfo][SelectionDAG] Do not drop dbg intrinsics with empty metadata locs Without this patch SelectionDAG silently drops dbg.values using `!{}` operands. Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value This causes assignment-tracking behaviour to match non-assignment-tracking behaviour after a recent change (see D140990). Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D150767	2023-05-18 10:08:37 +01:00
Felipe de Azevedo Piovezan	3db7d0dffb	[MachineFunction][DebugInfo][nfc] Introduce EntryValue variable kind MachineFunction keeps a table of variables whose addresses never change throughout the function. Today, the only kinds of locations it can handle are stack slots. However, we could expand this for variables whose address is derived from the value a register had upon function entry. One case where this happens is with variables alive across coroutine funclets: these can be placed in a coroutine frame object whose pointer is placed in a register that is an argument to coroutine funclets. ``` define @foo(ptr %frame_ptr) { dbg.declare(%frame_ptr, !some_var, !DIExpression(EntryValue, <ptr_arithmetic>)) ``` This is a patch in a series that aims to improve the debug information generated by the CoroSplit pass in the context of `swiftasync` arguments. Variables stored in the coroutine frame _must_ be described by the entry_value of the ABI-defined register containing a pointer to the coroutine frame. Since these variables have a single location throughout their lifetime, they are candidates for being stored in the MachineFunction table. Differential Revision: https://reviews.llvm.org/D149879	2023-05-11 07:29:57 -04:00
Felipe de Azevedo Piovezan	a524f84780	[SelectionDAG][NFCI] Use common logic for identifying MMI vars After function argument lowering, but prior to instruction selection, dbg declares pointing to function arguments are lowered using special logic. Later, during instruction selection (both "fast" and regular ISel), this logic is "repeated" in order to identify which intrinsics have already been lowered. This is bad for two reasons: 1. The logic is not _really_ repeated, the code is different, which could lead to duplicate lowering of the intrinsic. 2. Even if the logic were repeated properly, this is still code duplication. This patch addresses these issues by storing all preprocessed dbg.declare intrinsics in a set inside FuncInfo; the set is queried upon instruction selection. Differential Revision: https://reviews.llvm.org/D149682	2023-05-03 10:58:31 -04:00
Shengchen Kan	3910a9fcb2	Revert part of D149033 b/c original code is correct This reverts part of D149033 and rG8f966cedea594d9a91e585e88a80a42c04049e6c. The added test case is kept to avoid future regression. Reviewed By: vzakhari, vdonaldson Differential Revision: https://reviews.llvm.org/D149639	2023-05-03 12:20:19 +08:00
Shengchen Kan	8f966cedea	[SelectionDAG] Use int64_t to store the integer power of llvm.powi https://llvm.org/docs/LangRef.html#llvm-powi-intrinsic The max length of the integer power of `llvm.powi` intrinsic is 32, and the value can be negative. If we use `int32_t` to store this value, `-Val` will underflow when it is `INT32_MIN` The issue was reported in D149033.	2023-05-02 14:08:42 +08:00
Shengchen Kan	4e4db6f6c6	Revert "[SelectionDAG] Use logic right shift to avoid loop hang" This reverts commit b73229e55543b4ba2b293adcb8b7d6025f01f7d9. It caused LIT failure on non-X86 targets.	2023-05-02 13:14:47 +08:00
Shengchen Kan	b73229e555	[SelectionDAG] Use logic right shift to avoid loop hang Issue was reported in D149033, `Val` can be negative value and arithmetic right shift always keeps the sign bit. BTW, the redundant code `Val = -Val` is removed by this patch.	2023-05-02 12:47:28 +08:00
Wang, Xin10	9c1e4ee690	[NFC]Fix 2 logic dead code First, in CodeGenPrepare.cpp, line 6891, the VectorCond will always be false because otherwise the function returns at line 6888. Second, in SelectionDAGBuilder.cpp, line 5443, getSExtValue() returns a signed value, but we store it in an unsigned Val, which makes the if condition at line 5452 meaningless. Reviewed By: skan Differential Revision: https://reviews.llvm.org/D149033	2023-04-28 03:02:59 -04:00
OCHyams	2b3c13b716	[DebugInfo] Treat empty metadata operands the same as undef operands in SelectionDAG Without this patch SelectionDAG silently drops dbg.values using `!{}` operands. Related to https://discourse.llvm.org/t/auto-undef-debug-uses-of-a-deleted-value Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D140990	2023-04-26 09:03:07 +01:00
OCHyams	311260a699	[Assignment Tracking][SelectionDAG] Downgrade dbg.assigns to dbg.values if assignment tracking is not enabled We shouldn't be able to reach this code path from source code but this provides a better fail-safe than asserting. The result of the downgrade is a degraded debugging experience, but it is better than nothing. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D148212	2023-04-18 13:03:45 +01:00
Kazu Hirata	972983539b	[llvm] Apply fixes from readability-redundant-control-flow (NFC)	2023-04-16 00:13:46 -07:00
Ellis Hoag	244be0b0de	[InstrProf] Temporal Profiling As described in [0], this extends IRPGO to support //Temporal Profiling//. When `-pgo-temporal-instrumentation` is used we add the `llvm.instrprof.timestamp()` intrinsic to the entry of functions which in turn gets lowered to a call to the compiler-rt function `INSTR_PROF_PROFILE_SET_TIMESTAMP()`. A new field in the `llvm_prf_cnts` section stores each function's timestamp. Then in `llvm-profdata merge` we convert these function timestamps into a //trace// and add it to the indexed profile. Since these traces could significantly increase the profile size, we've added `-max-temporal-profile-trace-length` and `-temporal-profile-trace-reservoir-size` to limit the length of a trace and the number of traces in a profile, respectively. In a future diff we plan to use these traces to construct an optimized function order to reduce the number of page faults during startup. Special thanks to Julian Mestre for helping with reservoir sampling. [0] https://discourse.llvm.org/t/rfc-temporal-profiling-extension-for-irpgo/68068 Reviewed By: snehasish Differential Revision: https://reviews.llvm.org/D147287	2023-04-11 08:30:52 -07:00
Craig Topper	de92a20131	[SelectionDAG] Move variable declaration to its first assignment. NFC We declared this variable and assigned it to true, but then overwrote it before its first use.	2023-04-03 14:03:05 -07:00
Craig Topper	bb64fd571b	[SelectionDAGBuilder] Use SmallVectorImpl& for function arguments. NFC Make the reference const since we aren't modifying the vectors.	2023-04-03 14:03:05 -07:00
Craig Topper	b5f207e5b2	[SelectionDAG] Rename Flag->Glue. NFC	2023-04-02 19:46:51 -07:00
OCHyams	06f28f2451	[Assignment Tracking][NFC] Cache debug-info-assignment-tracking module flag This reduces CTMark LTO-O3-g compile time by a geomean of 0.1%. Reviewed By: jmorse Differential Revision: https://reviews.llvm.org/D146985	2023-03-29 12:51:59 +01:00
Phoebe Wang	0efe111365	Reland "[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2" This reverts commit db6a979ae82410e42430e47afa488936ba8e3025. Reland D102817 without any change. The previous revert was a mistake. Differential Revision: https://reviews.llvm.org/D102817	2023-03-29 08:59:56 +08:00
Kazu Hirata	e844638946	[llvm] Use isIntOrFPConstant (NFC)	2023-03-27 22:32:23 -07:00
OCHyams	7d89437455	[Assignment Tracking][NFC] Use RawLocationWrapper in VarLocInfo [2/x] Use RawLocationWrapper rather than a Value to represent the location operand(s) so that it's possible to represent multiple location operands. AssignmentTrackingAnalysis still converts variadic debug intrinsics to kill locations so this patch is NFC. Reviewed By: StephenTozer Differential Revision: https://reviews.llvm.org/D145911	2023-03-16 09:55:15 +00:00
Sander de Smalen	170e7a0ec2	[AArch64][SME2] Add CodeGen support for target("aarch64.svcount"). This patch adds AArch64 CodeGen support such that the type can be passed and returned to/from functions, and also adds support to use this type in load/store operations and PHI nodes. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D136862	2023-03-02 12:07:41 +00:00
J. Ryan Stinnett	22b8e82c12	[DebugInfo] Remove `dbg.addr` from CodeGen As part of this work, removing `SDDbgValue::clearIsEmitted` originally added for `dbg.addr` in 045c67769d7fe577fc38cccb6fb40fd814437447 was attempted, but it appears some tests for `DBG_INSTR_REF` now depend on that behaviour as well, so it was kept and comments were updated instead. Part of `dbg.addr` removal Discussed in https://discourse.llvm.org/t/what-is-the-status-of-dbg-addr/62898 Differential Revision: https://reviews.llvm.org/D144800	2023-03-02 09:29:43 +00:00
Serge Pavlov	7f81dd4dd6	[NFC] Make FPClassTest a bitmask enumeration This is recommit of 2e416cdd52, fixed to be acceptable by GCC. The original commit message is below. With this change bitwise operations are allowed for FPClassTest enumeration, it must simplify using this type. Also some functions changed to get argument of type FPClassTest instead of unsigned. Differential Revision: https://reviews.llvm.org/D144241	2023-02-24 15:12:16 +07:00
Nikita Popov	8347ca7dc8	[PatternMatch] Don't require DataLayout for m_VScale() The m_VScale() matcher is unusual in that it requires a DataLayout. It is currently used to determine the size of the GEP type. However, I believe it is sufficient to check for the canonical <vscale x 1 x i8> form here -- I don't think there's a need to recognize exotic variations like <vscale x 1 x i4> as a vscale constant representation as well. Differential Revision: https://reviews.llvm.org/D144566	2023-02-23 15:30:29 +01:00
Yeting Kuo	419948fe67	[VP] Reorder is_int_min_poison/is_zero_poison operand before mask for vp.abs/ctlz/cttz. The patch ensures last two operands of vp.abs/ctlz/cttz are mask and evl. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D144536	2023-02-23 13:58:21 +08:00
Serge Pavlov	08a09235b6	Revert "[NFC] Make FPClassTest a bitmask enumeration" This reverts commit e7613c1d9b259bdf2b0b06b4169d9a10dd553406. GCC issues an error: In file included from /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/unittests/ADT/BitmaskEnumTest.cpp:9: /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/include/llvm/ADT/BitmaskEnum.h:66:22: error: explicit specialization of template<class E, class Enable> struct llvm::is_bitmask_enum outside its namespace must use a nested-name-specifier [-fpermissive] 66 \| template <> struct is_bitmask_enum<Enum> : std::true_type {}; \ \| ^~~~~~~~~~~~~~~~~~~~~ /home/buildbot/as-builder-4/lld-x86_64-ubuntu-fast/llvm-project/llvm/unittests/ADT/BitmaskEnumTest.cpp:30:1: note: in expansion of macro LLVM_DECLARE_ENUM_AS_BITMASK 30 \| LLVM_DECLARE_ENUM_AS_BITMASK(Flags2, V4); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~	2023-02-23 12:55:58 +07:00
Serge Pavlov	e7613c1d9b	[NFC] Make FPClassTest a bitmask enumeration This is recommit of 2e416cdd52, reverted in 8555ab2fcd, because GCC complains on extra qualification. The macro LLVM_DECLARE_ENUM_AS_BITMASK does not specify llvm:: anymore, so the macro must occur in the namespace llvm. Documentation updated accordingly. The original commit message is below. With this change bitwise operations are allowed for FPClassTest enumeration, it must simplify using this type. Also some functions changed to get argument of type FPClassTest instead of unsigned. Differential Revision: https://reviews.llvm.org/D144241	2023-02-23 12:38:57 +07:00
Nikita Popov	8555ab2fcd	Revert "[NFC] Make FPClassTest a bitmask enumeration" This reverts commit 2e416cdd52c1079b8c7cb1f7d7e557c889a4fb56. Breaks the GCC build: In file included from /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/FloatingPointMode.h:18, from /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/APFloat.h:20, from /home/npopov/repos/llvm-project/llvm/lib/Support/APFloat.cpp:14: /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/BitmaskEnum.h:66:22: error: extra qualification not allowed [-fpermissive] 66 \| template <> struct llvm::is_bitmask_enum<Enum> : std::true_type {}; \ \| ^~~~ /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/FloatingPointMode.h:223:1: note: in expansion of macro ‘LLVM_DECLARE_ENUM_AS_BITMASK’ 223 \| LLVM_DECLARE_ENUM_AS_BITMASK(FPClassTest, /* LargestValue */ fcPosInf); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/BitmaskEnum.h:67:22: error: extra qualification not allowed [-fpermissive] 67 \| template <> struct llvm::largest_bitmask_enum_bit<Enum> { \ \| ^~~~ /home/npopov/repos/llvm-project/llvm/include/llvm/ADT/FloatingPointMode.h:223:1: note: in expansion of macro ‘LLVM_DECLARE_ENUM_AS_BITMASK’ 223 \| LLVM_DECLARE_ENUM_AS_BITMASK(FPClassTest, /* LargestValue */ fcPosInf); \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~ [43/4396] Building CXX object lib/Supp...iles/LLVMSupport.dir/CommandLine.cpp.o	2023-02-22 08:56:19 +01:00
Serge Pavlov	2e416cdd52	[NFC] Make FPClassTest a bitmask enumeration With this change bitwise operations are allowed for FPClassTest enumeration, it must simplify using this type. Also some functions changed to get argument of type FPClassTest instead of unsigned. Differential Revision: https://reviews.llvm.org/D144241	2023-02-22 14:20:04 +07:00
Caroline Concatto	d515ecca68	[IR] Add new intrinsics interleave and deinterleave vectors This patch adds 2 new intrinsics: ; Interleave two vectors into a wider vector <vscale x 4 x i64> @llvm.vector.interleave2.nxv2i64(<vscale x 2 x i64> %even, <vscale x 2 x i64> %odd) ; Deinterleave the odd and even lanes from a wider vector {<vscale x 2 x i64>, <vscale x 2 x i64>} @llvm.vector.deinterleave2.nxv2i64(<vscale x 4 x i64> %vec) The main motivator for adding these intrinsics is to support vectorization of complex types using scalable vectors. The intrinsics are kept simple by only supporting a stride of 2, which makes them easy to lower and type-legalize. A stride of 2 is sufficient to handle complex types which only have a real/imaginary component. The format of the intrinsics matches how `shufflevector` is used in LoopVectorize. For example: using cf = std::complex<float>; void foo(cf * dst, int N) { for (int i=0; i<N; ++i) dst[i] += cf(1.f, 2.f); } For this loop, LoopVectorize: (1) Loads a wide vector (e.g. <8 x float>) (2) Extracts odd lanes using shufflevector (leading to <4 x float>) (3) Extracts even lanes using shufflevector (leading to <4 x float>) (4) Performs the addition (5) Interleaves the two <4 x float> vectors into a single <8 x float> using shufflevector (6) Stores the wide vector. In this example, we can 1-1 replace shufflevector in (2) and (3) with the deinterleave intrinsic, and replace the shufflevector in (5) with the interleave intrinsic. The SelectionDAG nodes might be extended to support higher strides (3, 4, etc) as well in the future. Similar to what was done for vector.splice and vector.reverse, the intrinsic is lowered to a shufflevector when the type is fixed width, so to benefit from existing code that was written to recognize/optimize shufflevector patterns. Note that this approach does not prevent us from adding new intrinsics for other strides, or adding a more generic shuffle intrinsic in the future. It just solves the immediate problem of being able to vectorize loops with complex math. Reviewed By: paulwalker-arm Differential Revision: https://reviews.llvm.org/D141924	2023-02-20 12:21:59 +00:00
Nick Desaulniers	5cc1016a57	[llvm][SelectionDAGBuilder] codegen callbr.landingpad intrinsic Given a CallBrInst, retain its first virtual register in SelectionDagBuilder's FunctionLoweringInfo if there's corresponding landingpad. Walk the list of COPY MachineInstr to find the original virtual and physical registers defined by the INLINEASM_BR MachineInst. Test cases from https://reviews.llvm.org/D139565. Link: https://github.com/llvm/llvm-project/issues/59538 Part 3 from https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Follow up patches still need to wire up CallBrPrepare into the pass pipelines. Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D140160	2023-02-16 17:58:34 -08:00
Archibald Elliott	62c7f035b4	[NFC][TargetParser] Remove llvm/ADT/Triple.h I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.	2023-02-07 12:39:46 +00:00
Marco Elver	98f0e4f611	Revert "[SelectionDAG] Add pcsections recursively on SDNode values" Revert "[SelectionDAG] Add missing setValue calls in visitIntrinsicCall" This reverts commit 0c64e1b68f36640ffe82fc90e6279c50617ad1cc. This reverts commit 1142e6c7c795de7f80774325a07ed49bc95a48c9. It spuriously added !pcsections where they shouldn't be. See added test case in test/CodeGen/X86/pcsections.ll as an example. The reason is that the SelectionDAG chains operations in a basic block as "operands" pointing to preceding instructions. This resulted in setting the metadata on _all_ instructions preceding the one that should have the metadata. Reverting for now because the semantics of !pcsections was completely buggy now.	2023-02-03 18:57:34 +01:00
Sanjay Patel	fb3e3ef62e	[SDAG] fix miscompiles caused by using ValueTracking matchSelectPattern to create FMINIMUM/FMAXIMUM ValueTracking attempts to match compare+select patterns to FP min/max operations, but it was created before the newer IEEE-754-2019 minimum/maximum ops were defined. Ie, matchSelectPattern() does not account for the -0.0/+0.0 behavior that is specified in the newer standard. FMINIMUM/FMAXIMUM nodes were created to map to the newer standard: /// FMINIMUM/FMAXIMUM - NaN-propagating minimum/maximum that also treat -0.0 /// as less than 0.0. While FMINNUM_IEEE/FMAXNUM_IEEE follow IEEE 754-2008 /// semantics, FMINIMUM/FMAXIMUM follow IEEE 754-2018 draft semantics. We could adjust ValueTracking to deal with signed zero, but it seems like a moot point given the divergent NaN behavior discussed in D143056, so just delete this possibility to avoid bugs when converting IR to SDAG. Differential Revision: https://reviews.llvm.org/D143106	2023-02-03 09:53:47 -05:00
Matt Arsenault	8bdb149c0a	DAG: Remove redundant check for return alignment This is already what the CallBase getRetAlign does	2023-01-31 08:43:56 -04:00
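
Note on d515ecca68: the stride-2 lane semantics described in that commit message can be modelled on fixed-length vectors. The following is a minimal sketch, not LLVM code; `interleave2` and `deinterleave2` here are hypothetical helper names that only mirror the intrinsic names, and `std::vector` stands in for the scalable vector types.

```cpp
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical model of llvm.vector.interleave2: result lane 2*i comes from
// Even[i] and lane 2*i+1 comes from Odd[i].
template <typename T>
std::vector<T> interleave2(const std::vector<T> &Even,
                           const std::vector<T> &Odd) {
  assert(Even.size() == Odd.size() && "operands must have the same length");
  std::vector<T> Wide;
  Wide.reserve(Even.size() * 2);
  for (std::size_t I = 0; I < Even.size(); ++I) {
    Wide.push_back(Even[I]);
    Wide.push_back(Odd[I]);
  }
  return Wide;
}

// Hypothetical model of llvm.vector.deinterleave2: splits a wide vector into
// its even-indexed lanes (first) and odd-indexed lanes (second).
template <typename T>
std::pair<std::vector<T>, std::vector<T>>
deinterleave2(const std::vector<T> &Wide) {
  assert(Wide.size() % 2 == 0 && "wide vector must have an even lane count");
  std::vector<T> Even, Odd;
  Even.reserve(Wide.size() / 2);
  Odd.reserve(Wide.size() / 2);
  for (std::size_t I = 0; I < Wide.size(); I += 2) {
    Even.push_back(Wide[I]);
    Odd.push_back(Wide[I + 1]);
  }
  return {Even, Odd};
}
```

For an array of complex numbers stored as real/imaginary pairs, `deinterleave2` yields the real parts and the imaginary parts as two separate vectors, and `interleave2` is its inverse.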


1905 Commits
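
Note on 8f966cedea: the hazard named in that commit message is that negating a 32-bit exponent equal to `INT32_MIN` is not representable in `int32_t`. The following is a minimal sketch of the widening idea, not the SelectionDAG code itself; `exponentMagnitude` and `powiSketch` are hypothetical names, and the square-and-multiply loop is an assumed stand-in for the real expansion.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: magnitude of a 32-bit powi exponent. Widening to
// int64_t before negating keeps the result representable for INT32_MIN.
inline std::int64_t exponentMagnitude(std::int32_t Exp) {
  std::int64_t Val = Exp; // widen first, then negate
  if (Val < 0)
    Val = -Val;
  assert(Val >= 0 && "magnitude must be non-negative after widening");
  return Val;
}

// Hypothetical reference powi: square-and-multiply over the magnitude, then
// take the reciprocal for negative exponents.
inline double powiSketch(double X, std::int32_t Exp) {
  std::int64_t Val = exponentMagnitude(Exp);
  double Result = 1.0;
  double Base = X;
  while (Val > 0) {
    if (Val & 1)
      Result *= Base;
    Base *= Base;
    Val >>= 1; // Val is non-negative, so the shift always terminates
  }
  return Exp < 0 ? 1.0 / Result : Result;
}
```

Because the magnitude is non-negative, the right shift reaches zero after at most 32 iterations.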