204 Commits

Author SHA1 Message Date
Vitaly Buka
5bc166728a
Revert "Reland [EquivClasses] Introduce members iterator-helper" (#130380)
Reverts llvm/llvm-project#130319

Multiple bot failures.
2025-03-07 17:46:53 -08:00
Ramkumar Ramachandra
21d973dbb3
Reland [EquivClasses] Introduce members iterator-helper (#130319)
Changes: Fix the expectations in EquivalenceClassesTest.MemberIterator,
also fixing a build failure.
2025-03-07 21:09:31 +00:00
Ramkumar Ramachandra
86dfd90193
Revert "[EquivClasses] Introduce members iterator-helper" (#130313)
This reverts commit 259624bf6d, as it causes a build failure.
2025-03-07 17:38:38 +00:00
Ramkumar Ramachandra
259624bf6d
[EquivClasses] Introduce members iterator-helper (#130139) 2025-03-07 17:24:14 +00:00
Mel Chen
9b4ad2fe50
[LV][EVL] Support fixed-order recurrence idiom with EVL tail folding. (#124093)
This patch converts the llvm.vector.splice intrinsic to
llvm.experimental.vp.splice, ensuring that fixed-order recurrences
execute correctly when tail folding by EVL is enable.
Due to the non-VFxUF penultimate EVL issue, the EVL from the previous
iteration will be preserved and used in llvm.experimental.vp.splice.
2025-03-03 21:27:13 +08:00
Philip Reames
248be98418 Reapply "[RISCV][TTI] Add shuffle costing for masked slide lowering (#128537)"
With a fix for fully undef masks.  These can't reach the lowering code, but
can reach the costing code via e.g. SLP.

This change adds the TTI costing corresponding to the recently added
isMaskedSlidePair lowering for vector shuffles. However, since the
existing costing code hadn't covered either slideup, slidedown, or the
(now removed) isElementRotate, the impact is larger in scope than just
that new lowering.

---------

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
Co-authored-by: Luke Lau <luke_lau@icloud.com>
2025-02-28 08:02:27 -08:00
Benjamin Maxwell
89e7f4d31b
[LV] Teach the vectorizer to cost and vectorize modf and sincospi intrinsics (#129064)
Follow on to #128035. It is a small extension to support vectorizing
`llvm.modf.*` and `llvm.sincospi.*` too.

This renames the test files from `sincos.ll` ->
`multiple-result-intrinsics.ll` to group together the similar tests
(which make up most of this PR).
2025-02-28 12:56:12 +00:00
Philip Reames
b2152823e0 Revert "[RISCV][TTI] Add shuffle costing for masked slide lowering (#128537)"
This reverts commit 4904728cab8596320a77a895cb712fba07ea7bb1.  Downstream
test failed, reverting during investigation.
2025-02-27 22:03:18 -08:00
Philip Reames
4904728cab
[RISCV][TTI] Add shuffle costing for masked slide lowering (#128537)
This change adds the TTI costing corresponding to the recently added
isMaskedSlidePair lowering for vector shuffles. However, since the
existing costing code hadn't covered either slideup, slidedown, or the
(now removed) isElementRotate, the impact is larger in scope than just
that new lowering.

---------

Co-authored-by: Alexey Bataev <a.bataev@gmx.com>
Co-authored-by: Luke Lau <luke_lau@icloud.com>
2025-02-27 18:58:42 -08:00
Benjamin Maxwell
3307b0374a
[LV] Teach the loop vectorizer llvm.sincos is trivially vectorizable (#128035)
Depends on #123210
2025-02-27 09:37:06 +00:00
Deric Cheung
37ed2e6b33
[Scalarizer] Make *_with_overflow intrinsics scalarizable (#126815)
Addresses issue #126809

- Made `uadd_with_overflow`, `sadd_with_overflow`, `usub_with_overflow`,
`ssub_with_overflow`, `umul_with_overflow`, and `smul_with_overflow`
trivially scalarizable in `isTriviallyScalarizable()` from
`VectorUtils.cpp`
- Renamed and updated the test `Scalarizer/uadd_overflow.ll` to
`Scalarizer/uadd_with_overflow.ll` to check that `uadd_with_overflow`
gets scalarized
- Added a test `Scalarizer/sincos.ll` to ensure the bug fix #113625
still works
2025-02-13 13:31:39 -08:00
Alexey Bataev
bab7920fd7
[RISCV][CG]Use processShuffleMasks for per-register shuffles
Patch adds usage of processShuffleMasks in in codegen
in lowerShuffleViaVRegSplitting. This function is already used for X86
shuffles estimations and in DAGTypeLegalizer::SplitVecRes_VECTOR_SHUFFLE
functions, unifies the code.

Reviewers: topperc, wangpc-pp, lukel97, preames

Reviewed By: preames

Pull Request: https://github.com/llvm/llvm-project/pull/121765
2025-01-13 17:06:25 -05:00
Philip Reames
24bb180e8a
[RISCV] Attempt to widen SEW before generic shuffle lowering (#122311)
This takes inspiration from AArch64 which does the same thing to assist
with zip/trn/etc.. Doing this recursion unconditionally when the mask
allows is slightly questionable, but seems to work out okay in practice.

As a bit of context, it's helpful to realize that we have existing logic
in both DAGCombine and InstCombine which mutates the element width of in
an analogous manner. However, that code has two restriction which
prevent it from handling the motivating cases here. First, it only
triggers if there is a bitcast involving a different element type.
Second, the matcher used considers a partially undef wide element to be
a non-match. I considered trying to relax those assumptions, but the
information loss for undef in mid-level opt seemed more likely to open a
can of worms than I wanted.
2025-01-10 07:12:24 -08:00
Finn Plummer
45c01e8a33
[NFC][TargetTransformInfo][VectorUtils] Consolidate isVectorIntrinsic... api (#117635)
- update `VectorUtils:isVectorIntrinsicWithScalarOpAtArg` to use TTI for
all uses, to allow specifiction of target specific intrinsics
- add TTI to the `isVectorIntrinsicWithStructReturnOverloadAtField` api
- update TTI api to provide `isTargetIntrinsicWith...` functions and
  consistently name them
- move `isTriviallyScalarizable` to VectorUtils
  
- update all uses of the api and provide the TTI parameter

Resolves #117030
2024-12-19 11:54:26 -08:00
LiqinWeng
b759020cc8
[LV][EVL] Support cast instruction with EVL-vectorization (#108351) 2024-12-11 10:01:41 +08:00
Alexey Bataev
b9aa155d26
[TTI][X86]Fix detection of the shuffles from the second shuffle operand only
If the shuffle mask uses only indices from the second shuffle operand,
processShuffleMasks function misses it currently, which prevents correct
cost estimation in this corner case. To fix this, need to raise the
limit to 2 * VF rather than just VF and adjust processing
correspondingly. Will allow future improvements for 2 sources
permutations.

Reviewers: RKSimon

Reviewed By: RKSimon

Pull Request: https://github.com/llvm/llvm-project/pull/118972
2024-12-06 12:27:00 -05:00
LiqinWeng
4a3f46de50
[LV][EVL] Support call instruction with EVL-vectorization (#110412) 2024-11-28 10:05:08 +08:00
Finn Plummer
8663b8777e
[NFC][VectorUtils][TargetTransformInfo] Add isVectorIntrinsicWithOverloadTypeAtArg api (#114849)
This changes allows target intrinsics to specify and overwrite overloaded types.

- Updates `ReplaceWithVecLib` to not provide TTI as there most probably won't be a use-case
- Updates `SLPVectorizer` to use available TTI
- Updates `VPTransformState` to pass down TTI
- Updates `VPlanRecipe` to use passed-down TTI

This change will let us add scalarization for `asdouble`:  #114847
2024-11-21 11:04:25 -08:00
Tex Riddell
818d715989
[Analysis] atan2: isTriviallyVectorizable; add to massv and accelerate veclibs (#113637)
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294

- Return true for atan2 from isTriviallyVectorizable
- Add atan2 to VecFuncs.def for massv and accelerate libraries.
- Add atan2 to hasOptimizedCodeGen
- Add atan2 support in llvm/lib/Analysis/ValueTracking.cpp
llvm::getIntrinsicForCallSite and update vectorization tests
- Add atan2 name check to isLoweredToCall in
llvm/include/llvm/Analysis/TargetTransformInfoImpl.h
- Note: there's no test coverage for these names in isLoweredToCall, except that Transforms/TailCallElim/inf-recursion.ll is impacted by the "fabs" case

Thanks to @jroelofs for the atan2 accelerate veclib and associated test
additions, plus the hasOptimizedCodeGen addition.

Part of: Implement the atan2 HLSL Function #70096.
2024-11-08 16:07:38 -08:00
Rohit Aggarwal
dfb60bb919
Adding more vector calls for -fveclib=AMDLIBM (#109662)
AMD has it's own implementation of vector calls.
New vector calls are introduced in the library for exp10, log10, sincos and finite asin/acos
Please refer [https://github.com/amd/aocl-libm-ose]

---------

Co-authored-by: Rohit Aggarwal <Rohit.Aggarwal@amd.com>
2024-10-29 10:09:55 +00:00
Farzon Lotfi
dcbf2c2ca0
[Scalarizer][DirectX] support structs return types (#111569)
Based on this RFC:
https://discourse.llvm.org/t/rfc-allow-the-scalarizer-pass-to-scalarize-vectors-returned-in-structs/82306

LLVM intrinsics do not support out params. To get around this limitation
implementers will make intrinsics return structs to capture a return
type and an out param. This implementation detail should not impact
scalarization since these cases should be elementwise operations.

## Three changes are needed. 
- The CallInst visitor needs to be updated to handle Structs
- A new visitor is needed for `ExtractValue` instructions
- finsh needs to be update to handle structs so that insert elements are
properly propogated.

## Testing changes
- Add support for `llvm.frexp`
- Add support for `llvm.dx.splitdouble`

fixes https://github.com/llvm/llvm-project/issues/111437
2024-10-21 12:51:01 -04:00
Amr Hesham
4ba1800be6
[LLVM][NFC] Reduce copying of parameter in lambda (#110299)
Reduce redundant copy parameter in lambda

Fixes #95642
2024-10-16 09:55:01 +01:00
Piotr Fusik
cc7b24a4d1
[NFC] Fix typos in comments (#109765) 2024-09-24 11:19:56 +02:00
Yingwei Zheng
a156b5a47d
[SLP] Add vectorization support for [u|s]cmp (#106747)
This patch adds vectorization support for [u|s]cmp intrinsic calls.
2024-09-02 17:06:07 +08:00
Simon Pilgrim
d58d105cda
[Analysis] isTriviallyVectorizable - add vectorization support for acos/asin/atan and cosh/sinh/tanh intrinsics (#106584)
Show fallback cases in amdlibm tests where it doesn't have that specific op
2024-08-30 16:49:23 +01:00
mskamp
b22fa9093b
[ValueTracking][X86] Compute KnownBits for phadd/phsub (#92429)
Add KnownBits computations to ValueTracking and X86 DAG lowering.
    
These instructions add/subtract adjacent vector elements in their operands. Example: phadd [X1, X2] [Y1, Y2] = [X1 + X2, Y1 + Y2]. This means that, in this example, we can compute the KnownBits of the operation by computing the KnownBits of [X1, X2] + [X1, X2] and [Y1, Y2] + [Y1, Y2] and intersecting the results. This approach also generalizes to all x86 vector types.
    
There are also the operations phadd.sw and phsub.sw, which perform saturating addition/subtraction. Use sadd_sat and ssub_sat to compute the KnownBits of these operations.
    
Also adjust the existing test case pr53247.ll because it can be transformed to a constant using the new KnownBits computation.
    
Fixes #82516.
2024-07-16 15:50:21 +01:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Nikita Popov
d42b392696 [VectorUtils] Use SmallPtrSet::remove_if() (NFC) 2024-06-26 14:55:06 +02:00
Simon Pilgrim
5b4000dc58
[VectorUtils] Add llvm::scaleShuffleMaskElts wrapper for narrowShuffleMaskElts/widenShuffleMaskElts, NFC. (#96646)
Using the target number of vector elements, scaleShuffleMaskElts will try to use narrowShuffleMaskElts/widenShuffleMaskElts to scale the shuffle mask accordingly.

Working on #58895 I didn't want to create yet another case where we have to handle both re-scaling cases.
2024-06-26 10:43:58 +01:00
Nikita Popov
605e18479c [VectorUtils] Use poison instead of undef in findScalarElement()
Out-of-range extractelement returns poison, and so do poison elements
in the shufflevector mask.
2024-06-24 16:20:40 +02:00
Farzon Lotfi
1d87433593
[x86] Add tan intrinsic part 4 (#90503)
This change is an implementation of #87367's investigation on supporting
IEEE math operations as intrinsics.
Which was discussed in this RFC:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294


Much of this change was following how G_FSIN and G_FCOS were used.

Changes:
- `llvm/docs/GlobalISel/GenericOpcode.rst` - Document the `G_FTAN`
opcode
-  `llvm/docs/LangRef.rst` - Document the tan intrinsic
- `llvm/include/llvm/Analysis/VecFuncs.def` - Associate the tan
intrinsic as a vector function similar to the tanf libcall.
- `llvm/include/llvm/CodeGen/BasicTTIImpl.h` - Map the tan intrinsic to
`ISD::FTAN`
- `llvm/include/llvm/CodeGen/ISDOpcodes.h` - Define ISD opcodes for
`FTAN` and `STRICT_FTAN`
-  `llvm/include/llvm/IR/Intrinsics.td` - Create the tan intrinsic
- `llvm/include/llvm/IR/RuntimeLibcalls.def` - Define tan libcall
mappings
- `llvm/include/llvm/Target/GenericOpcodes.td` - Define the `G_FTAN`
Opcode
- `llvm/include/llvm/Support/TargetOpcodes.def` - Create a `G_FTAN`
Opcode handler
- `llvm/include/llvm/Target/GlobalISel/SelectionDAGCompat.td` - Map
`G_FTAN` to `ftan`
- `llvm/include/llvm/Target/TargetSelectionDAG.td` - Define `ftan`,
`strict_ftan`, and `any_ftan` and map them to the ISD opcodes for `FTAN`
and `STRICT_FTAN`
- `llvm/lib/Analysis/VectorUtils.cpp` - Associate the tan intrinsic as a
vector intrinsic
- `llvm/lib/CodeGen/GlobalISel/IRTranslator.cpp` Map the tan intrinsic
to `G_FTAN` Opcode
- `llvm/lib/CodeGen/GlobalISel/LegalizerHelper.cpp` - Add `G_FTAN` to
the list of floating point math operations also associate `G_FTAN` with
the `TAN_F` runtime lib.
- `llvm/lib/CodeGen/GlobalISel/Utils.cpp` - More floating point math
operation common behaviors.
- llvm/lib/CodeGen/SelectionDAG/LegalizeDAG.cpp - List the function
expansion operations for `FTAN` and `STRICT_FTAN`. Also define both
opcodes in `PromoteNode`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeFloatTypes.cpp` - More `FTAN`
and `STRICT_FTAN` handling in the legalizer
- `llvm/lib/CodeGen/SelectionDAG/LegalizeTypes.h` - Define
`SoftenFloatRes_FTAN` and `ExpandFloatRes_FTAN`.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorOps.cpp` - Define `FTAN`
as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/LegalizeVectorTypes.cpp` - Define
`FTAN` as a legal vector operation.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAG.cpp` - define tan as an
intrinsic that doesn't return NaN.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGBuilder.cpp` Map
`LibFunc_tan`, `LibFunc_tanf`, and `LibFunc_tanl` to `ISD::FTAN`. Map
`Intrinsic::tan` to `ISD::FTAN` and add selection dag handling for
`Intrinsic::tan`.
- `llvm/lib/CodeGen/SelectionDAG/SelectionDAGDumper.cpp` - Define `ftan`
and `strict_ftan` names for the equivalent ISD opcodes.
- `llvm/lib/CodeGen/TargetLoweringBase.cpp` -Define a Tan128 libcall and
ISD::FTAN as a target lowering action.
- `llvm/lib/Target/X86/X86ISelLowering.cpp` - Add x86_64 lowering for
tan intrinsic

resolves https://github.com/llvm/llvm-project/issues/70082
2024-06-05 15:01:33 -04:00
Pierre van Houtryve
cf328ff96d
[IR] Memory Model Relaxation Annotations (#78569)
Implements the core/target-agnostic components of Memory Model
Relaxation Annotations.

RFC:
https://discourse.llvm.org/t/rfc-mmras-memory-model-relaxation-annotations/76361/5
2024-04-24 08:52:25 +02:00
Yingwei Zheng
a1a590ef12
[InstCombine] Fix miscompilation in PR83947 (#83993)
762f762504/llvm/lib/Transforms/InstCombine/InstCombineCalls.cpp (L394-L407)

Comment from @topperc:
> This transforms assumes the mask is a non-zero splat. We only know its
a splat and not provably all 0s. The mask is a constexpr that includes
the address of the global variable. We can't resolve the constant
expression to an exact value.

Fixes #83947.
2024-03-05 22:34:04 +08:00
Alexandros Lamprineas
92289db82f
[VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513)
This fixes #71892 allowing us to check magled names in the IR verifier.
2024-01-17 09:55:30 +00:00
Alexandros Lamprineas
e512df3ecc
[LV] Fix crash when vectorizing function calls with linear args. (#76274)
llvm/lib/IR/Type.cpp:694:
    Assertion `isValidElementType(ElementType) && "Element type of a
    VectorType must be an integer, floating point, or pointer type."'
    failed.
Stack dump:
    llvm::FixedVectorType::get(llvm::Type*, unsigned int)
    llvm::VPWidenCallRecipe::execute(llvm::VPTransformState&)
    llvm::VPBasicBlock::execute(llvm::VPTransformState*)
    llvm::VPRegionBlock::execute(llvm::VPTransformState*)
    llvm::VPlan::execute(llvm::VPTransformState*)
    ...

Happens with function calls of void return type.
2024-01-02 18:14:16 +00:00
Paschalis Mpeis
ddb6db4d09
[VFABI] Create FunctionType for vector functions (#75058)
`createFunctionType` returns a FunctionType  that may contain a mask,
which is currently placed as the last parameter to the Function.
The placement happens according to `VFParameters` of `VFInfo`, and it
should be able to handle VFABI specification changes.

Regarding the return type, it uses the scalar type of the input instruction,
as the specification does not encode in the mangled name such information.
If that ever happens, that information should be available from `VFInfo`.
2023-12-19 12:05:28 +00:00
Paschalis Mpeis
7b83f69db4
[NFC] Replace CallInst with FunctionType in VFABI, VFShape API (#74569)
Minor simplification applied to VFShape::getScalarShape,
VFShape::get, and VFABI::tryDemangleForVFABI methods.

Also, remove unnecessary `static_cast` in `SLPVectorizer.cpp`
2023-12-06 17:14:58 +00:00
Graham Hunter
b1fba568f6
[SVE] Don't require lookup when demangling vector function mappings (#72260)
We can determine the VF from a combination of the mangled name (which
indicates the arguments that take vectors) and the element sizes of
the arguments for the scalar function the mapping has been established
for.

The assert when demangling fails has been removed in favour of just
not adding the mapping, which prevents the crash seen in
https://github.com/llvm/llvm-project/issues/71892

This patch also stops using _LLVM_ as an ISA for scalable vector tests,
since there aren't defined rules for the way vector arguments should be
handled (e.g. packed vs. unpacked representation).
2023-11-23 17:15:48 +00:00
Ramkumar Ramachandra
2302e4c327
Reland "VectorUtils: mark xrint as trivially vectorizable" (#71416)
With the recent change 98c90a13 (ISel: introduce vector ISD::LRINT,
ISD::LLRINT; custom RISCV lowering), it is now possible for
SLPVectorizer, LoopVectorize, and Scalarizer to operate on llvm.lrint
and llvm.llrint, with vector codegen for the RISC-V target. Make a
trivial change to VectorUtils, and update the corresponding tests.

A couple of important fixes have been landed since the original patch
was landed and reverted, and it is now safe to re-land the patch:
5e1d81a (LegalizeIntegerTypes: implement PromoteIntRes for xrint) and
fd887a3 (LegalizeVectorTypes: fix bug in widening of vec result in
xrint). See also #71399, which proves that lrint and llrint will indeed
produce vector codegen on RISC-V.

Fixes #55208.
2023-11-06 18:49:49 +00:00
Ramkumar Ramachandra
ac7c816dc2 Revert "VectorUtils: mark lrint, llrint as trivially vectorizable (#69945)"
This reverts commit 5bfd89bda7c2d5ff167c7bcea0c8d69b0b498f08.

It was causing build failures on ffmpeg on i686.
2023-11-01 09:57:22 +00:00
Ramkumar Ramachandra
5bfd89bda7
VectorUtils: mark lrint, llrint as trivially vectorizable (#69945)
With the recent change 98c90a13 (ISel: introduce vector ISD::LRINT,
ISD::LLRINT; custom RISCV lowering), it is now possible for
SLPVectorizer, LoopVectorize, and Scalarizer to operate on llvm.lrint
and llvm.llrint, with vector codegen for the RISC-V target. Make a
trivial change to VectorUtils, and update the corresponding tests.
2023-10-31 21:29:15 +00:00
Kazu Hirata
3b7bfeb483 [llvm] Stop including llvm/ADT/SmallString.h (NFC)
Identified with misc-include-cleaner.
2023-10-22 10:42:15 -07:00
JolantaJensen
01797dad86
Fix mechanism propagating mangled names for TLI function mappings (#66656)
Currently the mappings from TLI are used to generate the list of
available "scalar to vector" mappings attached to scalar calls as
"vector-function-abi-variant" LLVM IR attribute. Function names from TLI
are wrapped in mangled name following the pattern:
_ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
The problem is the mangled name uses _LLVM_ as the ISA name which
prevents the compiler to compute vectorization factor for scalable
vectors as it cannot make any decision based on the _LLVM_ ISA. If we
use "s" as the ISA name, the compiler can make decisions based on VFABI
specification where SVE spacific rules are described.

This patch is only a refactoring stage where there is no change to the
compiler's behaviour.
2023-10-02 18:58:39 +01:00
Anna Thomas
3cf24dbbdd [LV] Complete load groups and release store groups. Try 2.
This is a complete fix for CompleteLoadGroups introduced in
D154309. We need to check for dependency between A and every member of
the load Group of B.
This patch also fixes another miscompile seen when we incorrectly sink stores
below a depending load (see testcase in
interleaved-accesses-sink-store-across-load.ll). This is fixed by
releasing store groups correctly.

This change was previously reverted (e85fd3cbdd68) due to Asan failure with
use-after-free error. A testcase is added and the bug is fixed in this
version of the patch.

Differential Revision: https://reviews.llvm.org/D155520
2023-08-08 18:10:23 -04:00
Anna Thomas
e85fd3cbdd Revert "[LV] Complete load groups and release store groups in presence of dependency"
This reverts commit eaf6117f3388615f51198e47c0d6be0252729508 (D155520).
There's an ASAN build failure that needs investigation.
2023-07-26 15:07:26 -04:00
Anna Thomas
eaf6117f33 [LV] Complete load groups and release store groups in presence of dependency
This is a complete fix for CompleteLoadGroups introduced in
D154309. We need to check for dependency between A and every member of
the load Group of B.
This patch also fixes another miscompile seen when we incorrectly sink stores
below a depending load (see testcase in
interleaved-accesses-sink-store-across-load.ll). This is fixed by
releasing store groups correctly.

Differential Revision: https://reviews.llvm.org/D155520
2023-07-25 17:32:09 -04:00
Anna Thomas
9675e3fa81 [LV] Address post-commit NFC comments in interleave
Addressed most of post-commit comments in D154309.
2023-07-14 16:24:07 -04:00
Florian Hahn
d7e79bd7d4
[LV] Check if ops can safely be truncated in computeMinimumValueSizes.
Update computeMinimumValueSizes to check if an instruction's operands
can safely be truncated.

If more than MinBW bits are demanded by for the operand or if the
operand is a constant and cannot be safely truncated, it is not safe to
evaluate the instruction in the narrower MinBW. Skip those cases.

Fixes https://github.com/llvm/llvm-project/issues/47927

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D154717
2023-07-11 20:18:55 +01:00
Elliot Goodrich
39d8e6e22c Add missing StringExtras.h includes
In preparation for removing the `#include "llvm/ADT/StringExtras.h"`
from the header to source file of `llvm/Support/Error.h`, first add in
all the missing includes that were previously included transitively
through this header.

This is fixing all files missed in b0abd4893fa1.

Differential Revision: https://reviews.llvm.org/D154543
2023-07-08 10:19:07 +01:00
Florian Hahn
4d847bf4d0
[LV] Do not add load to group if it moves across conflicting store.
This patch prevents invalid load groups from being formed, where a load
needs to be moved across a conflicting store.

Once we hit a store that conflicts with a load with an existing
interleave group, we need to stop adding earlier loads to the group, as
this would force hoisting the previous stores in the group across the
conflicting load.

To detect such cases, add a new CompletedLoadGroups set, which is used
to keep track of load groups to which no earlier loads can be added.

Fixes https://github.com/llvm/llvm-project/issues/63602

Reviewed By: anna

Differential Revision: https://reviews.llvm.org/D154309
2023-07-07 11:06:30 +01:00