162941 Commits

Author SHA1 Message Date
Craig Topper
e94dc58dff [RISCV] Inline scalar ceil/floor/trunc/rint/round/roundeven.
This avoids the call overhead as well as the the save/restore of
fflags and the snan handling in the libm function.

The save/restore of fflags and snan handling are needed to be
correct for -ftrapping-math. I think we can ignore them in the
default environment.

The inline sequence will generate an invalid exception for nan
and an inexact exception if fractional bits are discarded.

I've used a custom inserter to explicitly create the control flow
around the float->int->float conversion.

We can probably avoid the final fsgnj after the conversion for
no signed zeros FMF, but I'll leave that for future work.

Note the comparison constant is slightly different than glibc uses.
They use 1<<53 for double, I'm using 1<<52. I believe either are valid.
Numbers >= 1<<52 can't have any fractional bits. It's ok to do the
float->int->float conversion on numbers between 1<<53 and 1<<52 since
they will all fit in 64. We only have a problem if the double can't fit
in i64

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136508
2022-10-26 14:36:49 -07:00
Félix Cloutier
bca75abc01 Revert "[NFC] Make format() more amenable to format attributes"
This reverts commit fb1e90ef07fec0d64a05c0b6d41117a5ea3e8344.
2022-10-26 12:53:14 -07:00
Félix Cloutier
fb1e90ef07 [NFC] Make format() more amenable to format attributes
This change modifies the implementation of the format() function
so that vendor forks committed to building with compilers that
support __attribute__((format)) on non-variadic functions can
check the format() function with it.

Reviewed By: ahatanak
Differential Revision: https://reviews.llvm.org/D132413
rdar://84571523
2022-10-26 12:10:42 -07:00
James Y Knight
26fdad031c [MIPS] Fix useDeprecatedPositionallyEncodedOperands errors.
This is a follow-on to https://reviews.llvm.org/D134073.

The number of MIPS16 changes here is a bit surprising. Many of the
fields with mismatched names were NOT previously choosing the correct
argument positionally, but instead doing something completely wrong
(e.g. it would encode a register where an immediate was expected).

But, machine-code generation for MIPS16 has never actually functioned.
It's also fully untested, thus, the MIPS16 changes, despite changing
behavior, breaks (and fixes) zero tests. This change does not fix
MIPS16 output, but it ought to be at least incrementally less broken.

Outside MIPS16, I believe the only functional change is to the 'ginvi'
instruction: it was previously encoding garbage into a field which was
specified to be '00'. Fortunately, it was covered by tests -- and the
tests were testing the incorrect behavior. So, fixed.

Differential Revision: https://reviews.llvm.org/D134220
2022-10-26 14:06:08 -04:00
James Y Knight
23394cd810 [Sparc] Fix useDeprecatedPositionallyEncodedOperands errors.
This is a follow-on to https://reviews.llvm.org/D134073.

It renames a few fields to have consistent names, as well as renaming
operands to match the field names.

Behavior is unchanged by this cleanup. (The only generated code change
is for the disassembler for LDSTUB/LDSTUBA, but in both old and new
versions, it fails to add enough operands, and thus triggers a runtime
abort. I will address that bug in a future commit.)

Differential Revision: https://reviews.llvm.org/D134201
2022-10-26 14:06:07 -04:00
Sanjay Patel
54eeadcf44 [SDAG] avoid vector extract/insert around binop
scalar-to-vector (scalar binop (extractelt V, Idx), C) --> shuffle (vector binop V, C'), {Idx, -1, -1...}

We generally try to avoid ad-hoc vectorization in SDAG,
but the motivating case from issue #39482 escapes our
normal vectorization folds in IR. It seems like it should
always be a win to transform this pattern in cases where
we have the same vector type for input and output and the
target supports the vector operation. That avoids
transfers from vector to scalar and back.

In the x86 shift examples, we create the scalar-to-vector
node during legalization. I'm not sure if there's a more
general way to create the pattern for testing. (If so, I
could add tests for other targets.)

Differential Revision: https://reviews.llvm.org/D136713
2022-10-26 14:04:46 -04:00
Matt Arsenault
f85ce1b236 ConstantFold: Reduce code duplication for checking commuted compare 2022-10-26 10:59:58 -07:00
Johannes Rudolf Doerfert
41a278f56a [OpenMP][FIX] Do not add custom state machine eagerly in LTO runs
If we run LTO optimization we migth end up introducing a custom state machine
and later transforming the region into SPMD. This is a problem. While a follow
up will introduce a check for the SPMD conversion, this already prevents the
eager custom state machine generation. Only if the kernel init function is
defined, rather then declared, we will emit a custom state machine. SPMD-zation
can happen eagerly though. Tests are adjusted via a weak definition. The LTO
test was added to verify this works as expected.

Differential Revision: https://reviews.llvm.org/D136740
2022-10-26 10:40:11 -07:00
Alex Brachet
443e2a10f6 Reland "[PGO] Make emitted symbols hidden"
This was reverted because it was breaking when targeting Darwin which
tried to export these symbols which are now hidden. It should be safe
to just stop attempting to export these symbols in the clang driver,
though Apple folks will need to change their TAPI allow list described
in the commit where these symbols were originally exported
f538018562

Then reverted again because it broke tests on MacOS, they should be
fixed now.

Bug: https://github.com/llvm/llvm-project/issues/58265

Differential Revision: https://reviews.llvm.org/D135340
2022-10-26 17:13:05 +00:00
Piyou Chen
7d7940fd77 [RISCV] add svinval extension
1. Add the svinval extension support
2. Add the svinval Predicates for its instruction

Note: the svinval instructions defined in https://reviews.llvm.org/D117654

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136571
2022-10-26 09:45:30 -07:00
Craig Topper
a61b74889f [RISCV] Use vslide1down for i64 insertelt on RV32.
Instead of using vslide1up, use vslide1down and build the other
direction. This avoids the overlap constraint early clobber of
vslide1up.

Reviewed By: reames

Differential Revision: https://reviews.llvm.org/D136735
2022-10-26 09:43:12 -07:00
Michael Maitland
64d5aedd06 [TableGen] Add log bang operator
This patch adds base 2 logarithm that returns integer result. I initially wanted to name it `!log2`,
but numbers are not permitted in the name. The documentation makes sure to clarify that it is
base 2 since it is not explicit in the operator name.

Differential Revision: https://reviews.llvm.org/D134068
2022-10-26 09:16:32 -07:00
Sanjay Patel
3aec021118 [SDAG] add helper for opcodes that are not speculatable
This is not quite NFC because one of the users should
now avoid the DIVREM opcodes too, but I'm not sure
how to test that.

I used the same name as an analysis function in IR
in case we want to expand this to include other
operations.

Another potential use is proposed in D136713.
2022-10-26 11:20:14 -04:00
Momchil Velikov
9901583968 Revert "[FuncSpec] Fix specialisation based on literals"
This reverts commit a8b0f580170089fcd555ade5565ceff0ec60f609 because
of "reverse-iteration" buildbot failure.
2022-10-26 13:54:12 +01:00
Momchil Velikov
2c8a4c6e62 Revert "[FuncSpec][NFC] Refactor finding specialisation opportunities"
This reverts commit a8853924bd3c50deebfbf993c037257ccf9805f4 due to dependency
on a8b0f5801700
2022-10-26 13:54:12 +01:00
Guillaume Chatelet
1a726cfa83 Take memset_inline into account in analyzeLoadFromClobberingMemInst
This appeared in https://reviews.llvm.org/D126903#3884061

Differential Revision: https://reviews.llvm.org/D136752
2022-10-26 09:50:13 +00:00
Momchil Velikov
a8853924bd [FuncSpec][NFC] Refactor finding specialisation opportunities
This patch reorders the traversal of function call sites and function
formal parameters to:

* do various argument feasibility checks (`isArgumentInteresting` ) only once per argument, i.e. doing N-args checks instead of N-calls x N-args checks.

* do hash table lookups only once per call site, i.e. N-calls lookups/inserts instead of N-call x N-args lookups/inserts.

Reviewed By: ChuanqiXu, labrinea

Differential Revision: https://reviews.llvm.org/D135968
2022-10-26 10:18:35 +01:00
Momchil Velikov
606d25e545 [FuncSpec] Compute specialisation gain even when forcing specialisation
When rewriting the call sites to call the new specialised functions, a
single call site can be matched by two different specialisations - a
"less specialised" version of the function and a "more specialised"
version of the function, e.g.  for a function

    void f(int x, int y)

the call like `f(1, 2)` could be matched by either

    void f.1(int x /* int y == 2 */);

or

    void f.2(/* int x == 1, int y == 2 */);

The `FunctionSpecialisation` pass tries to match specialisation in the
order of decreasing gain, so "more specialised" functions are
preferred to "less specialised" functions. This breaks, however, when
using the flag `-force-function-specialization`, in which case the
cost/benefit analysis is not performed and all the specialisations are
equally preferable.

This patch makes the pass calculate specialisation gain and order the
specialisations accordingly even when `-force-function-specialization`
is used, under the assumption that this flag has purely debugging
purpose and it is reasonable to ignore the extra computing effort it
incurs.

Reviewed By: ChuanqiXu, labrinea

Differential Revision: https://reviews.llvm.org/D136180
2022-10-26 10:08:03 +01:00
Momchil Velikov
a8b0f58017 [FuncSpec] Fix specialisation based on literals
The `FunctionSpecialization` pass has support for specialising
functions, which are called with literal arguments. This functionality
is disabled by default and is enabled with the option
`-function-specialization-for-literal-constant` .  There are a few
issues with the implementation, though:

* even with the default, the pass will still specialise based on
   floating-point literals

* even when it's enabled, the pass will specialise only for the `i1`
    type (or `i2` if all of the possible 4 values occur, or `i3` if all
    of the possible 8 values occur, etc)

The reason for this is incorrect check of the lattice value of the
function formal parameter. The lattice value is `overdefined` when the
constant range of the possible arguments is the full set, and this is
the reason for the specialisation to trigger. However, if the set of
the possible arguments is not the full set, that must not prevent the
specialisation.

This patch changes the pass to NOT consider a formal parameter when
specialising a function if the lattice value for that parameter is:

* unknown or undef
* a constant
* a constant range with a single element

on the basis that specialisation is pointless for those cases.

Is also changes the criteria for picking up an actual argument to
specialise if the argument is:

* a LLVM IR constant
* has `constant` lattice value
 has `constantrange` lattice value with a single element.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135893
2022-10-26 09:55:33 +01:00
Haohai Wen
21f23a37c6 [SelectionDAG] Clamp stack alignment for memset, memmove
memcpy has clamped dst stack alignment to NaturalStackAlignment if
hasStackRealignment is false. We should also clamp stack alignment
for memset and memmove. If we don't clamp, SelectionDAG may first
do tail call optimization which requires no stack realignment. Then
memmove, memset in same function may be lowered to load/store with
larger alignment leading to PEI emit stack realignment code which
is absolutely not correct.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D136456
2022-10-26 16:45:31 +08:00
Pierre van Houtryve
63390dccd8 [GlobalISel] Add Predicates to GICombineRule
Small QoL change to allow Predicates to be used in GICombineRule.
Currently only one combine in the AMDGPU backend makes use of it.

The implementation is pretty simple to get started but of course we can expand this later on and optimize predicate checking better if needed.

Reviewed By: dsanders

Differential Revision: https://reviews.llvm.org/D136681
2022-10-26 07:13:40 +00:00
chenglin.bi
9403a8bc37 [GlobalISel][AArch64] Fix miscompile caused by wrong G_ZEXT selection in GISel
The miscompile case's G_ZEXT has a G_FREEZE source.  Similar to D127154, this patch removed isDef32, relying on the AArch64MIPeephole optimizer to remove redundant SUBREG_TO_REG nodes also in GISel.

Fix #58431

Reviewed By: paquette

Differential Revision: https://reviews.llvm.org/D136433
2022-10-26 09:54:13 +08:00
Lang Hames
b26f45e5a4 [ORC] Skip non-SHF_ALLOC sections in DebugObjectManagerPlugin.
We don't need to provide a load-address for non-alloc sections. Skipping them
allows us to avoid some complications, like handling duplicate .group sections.
2022-10-25 18:40:38 -07:00
Guozhi Wei
d24c93cc41 [X86] Enable reassociation for ADD instructions
ADD is an associative and commutative operation, so we can do reassociation for it.

Differential Revision: https://reviews.llvm.org/D136396
2022-10-26 00:46:13 +00:00
Matt Arsenault
a91c17498a GlobalISel: Fix copy paste error
Pretty sure this was harmless since the tablegen
calling convention definitions do not use pointers.

Part of issue 58604
2022-10-25 17:06:00 -07:00
Douglas Yung
fc40c73921 Revert "Update supported features in the generic CPU configuration"
This reverts commit 11afbf396e10e1b1e91a5991e2aec1916e29a910.

There are 10 tests still failing after follow-up fix b5d0bf9b9853, this should get the following bots back to green:
 - https://lab.llvm.org/buildbot/#/builders/183/builds/8194
 - https://lab.llvm.org/buildbot/#/builders/186/builds/9491
 - https://lab.llvm.org/buildbot/#/builders/214/builds/3908
 - https://lab.llvm.org/buildbot/#/builders/93/builds/11740
 - https://lab.llvm.org/buildbot/#/builders/231/builds/4200
 - https://lab.llvm.org/buildbot/#/builders/121/builds/24519
 - https://lab.llvm.org/buildbot/#/builders/230/builds/4466
 - https://lab.llvm.org/buildbot/#/builders/94/builds/11639
 - https://lab.llvm.org/buildbot/#/builders/45/builds/9325
 - https://lab.llvm.org/buildbot/#/builders/124/builds/5219
 - https://lab.llvm.org/buildbot/#/builders/67/builds/8623
 - https://lab.llvm.org/buildbot/#/builders/123/builds/13836
 - https://lab.llvm.org/buildbot/#/builders/109/builds/49355
 - https://lab.llvm.org/buildbot/#/builders/58/builds/27751
 - https://lab.llvm.org/buildbot/#/builders/117/builds/9922
 - https://lab.llvm.org/buildbot/#/builders/16/builds/37012
 - https://lab.llvm.org/buildbot/#/builders/104/builds/9490
 - https://lab.llvm.org/buildbot/#/builders/42/builds/7725
 - https://lab.llvm.org/buildbot/#/builders/196/builds/20077
 - https://lab.llvm.org/buildbot/#/builders/3/builds/15217
 - https://lab.llvm.org/buildbot/#/builders/6/builds/15251
 - https://lab.llvm.org/buildbot/#/builders/9/builds/15247
 - https://lab.llvm.org/buildbot/#/builders/36/builds/26487
 - https://lab.llvm.org/buildbot/#/builders/54/builds/2474
 - https://lab.llvm.org/buildbot/#/builders/74/builds/14536
 - https://lab.llvm.org/buildbot/#/builders/5/builds/28555
2022-10-25 16:34:08 -07:00
Momchil Velikov
1a525dec7f [FuncSpec] Fix missed opportunities for function specialisation
When collecting the possible constant arguments to
specialise a function the compiler will abandon the search
on the first argument that is for some reason unsuitable as
a specialisation constant. Thus, depending on the traversal
order of the functions and call sites, the compiler can end
up with a different set of possible constants, hence with
different set of specialisations.

With this patch, the compiler will skip unsuitable
constants, but nevertheless will continue searching for
more.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135867
2022-10-25 23:19:48 +01:00
Philip Reames
269bc684e7 [LV][RISCV] Disable vectorization of epilogue loops
Epilogue loop vectorization is a feature in the vectorize intended to avoid running fully scalar code when the vector length of the main loop turns out to be either longer than the trip count of the actual loop, or with a huge remainder.

In practice, this feature appears to not have been well tuned. I honestly don't think it should be on by default at all, but it definitely shouldn't be on for RISCV. Note that other targets have also disabled it, but they've done so via disabling interleaving - which is, well, completely unrelated - and we don't want to do that for RISCV.

In the near term, many examples I'm seeing have terrible codegen for epilogue vectorization. We are greatly increasing code size for little value at reasonable VLEN values for small types. In the long term, the cases that epilogue vectorization are intended to handle are likely better handled via tail folding on RISCV.

As an aside, I also don't really trust the correctness of epilogue vectorization. The code structure is such that otherwise straight forward changes sometimes break only epilogue vectorization. The reuse of an existing vplan without careful validation opens significant room for nasty bugs. Given how rarely the code is exercised, that is not a good combination.

As such, this patch introduces a TTI hook, and completely disables epilogue vectorization on RISCV.

Differential Revision: https://reviews.llvm.org/D136695
2022-10-25 14:28:02 -07:00
Arthur Eubanks
ef37504879 [Instrumentation] Remove legacy passes
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D136615
2022-10-25 13:11:07 -07:00
Alina Sbirlea
d1b19da854 [LoopPeeling] Add flag to disable support for peeling loops with non-latch exits
Add a flag to allow disabling the changes in
https://reviews.llvm.org/D134803.

Differential Revision: https://reviews.llvm.org/D136643
2022-10-25 12:19:14 -07:00
Momchil Velikov
c47739b45c [FuncSpec] Consider small noinline functions for specialisation
Small functions with size under a given threshold are not
considered for specialisaion on the presumption that they
are easy to inline. This does not apply to `noinline`
functions, though.

Reviewed By: ChuanqiXu

Differential Revision: https://reviews.llvm.org/D135862
2022-10-25 19:49:04 +01:00
Lang Hames
b5d5813762 [JITLink][AArch64] Add a generic 'createAnonymousPointer' utility.
Adds a generic utility for creating anonymous aarch64 pointer blocks
(automatically adding an edge to initialize the pointer if given an
initial target).

Updates the aarch64 GOTTableManager to use the utility when building
GOT entries.
2022-10-25 11:46:06 -07:00
Dan Gohman
11afbf396e Update supported features in the generic CPU configuration
Accompanying https://reviews.llvm.org/D125728, this updates LLVM
Codegen's "generic" CPU to enable the same new features.

Differential Revision: https://reviews.llvm.org/D125729
2022-10-25 11:42:32 -07:00
Artem Belevich
0e8a414ab3 [CUDA, NVPTX] Added basic __bf16 support for NVPTX.
Recent Clang changes expose _bf16 types for SSE2-enabled host compilations and
that makes those types visible furing GPU-side compilation, where it currently
fails with Sema complaining that __bf16 is not supported.

Considering that __bf16 is a storage-only type, enabling it for NVPTX if it's
enabled on the host should pose no issues, correctness-wise.

Recent NVIDIA GPUs have introduced bf16 support, so we'll likely grow better
support for __bf16 on NVPTX going forward.

Differential Revision: https://reviews.llvm.org/D136311
2022-10-25 11:08:06 -07:00
Fangrui Song
a527bda520 [LegacyPM] Remove DataFlowSanitizerLegacyPass
Using the legacy PM for the optimization pipeline was deprecated in 13.0.0.
Following recent changes to remove non-core features of the legacy
PM/optimization pipeline, remove DataFlowSanitizerLegacyPass.

Differential Revision: https://reviews.llvm.org/D124594
2022-10-25 10:55:29 -07:00
Caroline Concatto
9fbd57fbe2 [AArch64]SME2 single-multi and multi-multi INT dot product instructions[part2]
This patch adds the assembly/disassembly for the following instructions:
        SDOT: (4-way, multiple and single vector): Multi-vector signed integer dot-product by vector.
              SDOT (4-way, multiple vectors): Multi-vector signed integer dot-product.
        UDOT: (4-way, multiple and single vector): Multi-vector unsigned integer dot-product by vector.
              (4-way, multiple vectors): Multi-vector unsigned integer dot-product.
    for groups of 2 and 4 ZA registers

The reference can be found here:
                https://developer.arm.com/documentation/ddi0602/2022-09

Depends on: D135563

Differential Revision: https://reviews.llvm.org/D135760
2022-10-25 18:32:20 +01:00
Caroline Concatto
070f414604 [AArch64]SME2 single-multi and multi-multi INT/FP dot product instructions
This patch adds the assembly/disassembly for the following instruction:
INT:
  SDOT (2-way, multiple and single vector): Multi-vector signed integer dot-product by vector.
       (2-way, multiple vectors): Multi-vector signed integer dot-product.
  UDOT (2-way, multiple and single vector): Multi-vector unsigned integer dot-product by vector.
       (2-way, multiple vectors): Multi-vector unsigned integer dot-product.
  SUDOT (multiple and indexed vector): Multi-vector signed by unsigned integer dot-product by indexed elements.
        (multiple and single vector): Multi-vector signed by unsigned integer dot-product by vector.
  USDOT (multiple and single vector): Multi-vector unsigned by signed integer dot-product by vector.
        (multiple vectors): Multi-vector unsigned by signed integer dot-product.
FP:
  BFDOT(multiple and single vector): Multi-vector BFloat16 floating-point dot-product by vector.
        (multiple vectors): Multi-vector BFloat16 floating-point dot-product.

  FDOT (multiple and single vector): Multi-vector half-precision floating-point dot-product by vector.
       (multiple vectors): Multi-vector half-precision floating-point dot-product.
For set of 2 and 4 ZA registers

The reference can be found here:
        https://developer.arm.com/documentation/ddi0602/2022-09

Depends on:D135455

Differential Revision: https://reviews.llvm.org/D135683
2022-10-25 18:28:11 +01:00
Craig Topper
8c42b5e89e [SelectionDAG] Add missing semicolon after return.
I'm unsure what the code does without the semicolon. On the surface it
seems like the assert below it would be considered part of the if
and thus the assert would only execute if DestReg is 0. But 0 isn't
considered a virtual register so the assert should fail.

Found by PVS Studio.
Reported https://pvs-studio.com/en/blog/posts/cpp/1003/ (N7)
2022-10-25 10:24:01 -07:00
Joe Nash
01b8140d3a [AMDGPU] Fix delay alu for VOPD with src2acc
V_FMAC_F32 and V_DOT2C_F32_F16 have a dummy src2 operand tied to vdst to
inform passes that the instructions read the dst operand. The VOPD
versions of these instructions lacked the dummy operand, which was a
problem for inserting s_delay_alu.
Introduce the dummy src2 operand on the VOPD versions, and fix the VOPD operand
tracking logic to account for it.

Reviewed By: dp

Differential Revision: https://reviews.llvm.org/D136629
2022-10-25 13:11:17 -04:00
Ulrich Weigand
96482ee434 [SystemZInstPrinter] Introduce markup tags emission
SystemZ assembly syntax emission now leverages markup tags, if enabled.

Author: Antonio Frighetto

Differential Revision: https://reviews.llvm.org/D129868
2022-10-25 18:59:50 +02:00
Sanjay Patel
b179351ad4 [SDAG] refactor folds for scalar-to-vector; NFCI
Fix typos, add comments, improve variable names,
rearrange code, add early exits.
2022-10-25 12:53:46 -04:00
Simon Pilgrim
50fe87a5c8 [Transforms] classifyArgUse - don't deference pointer before null test
Reported here: https://pvs-studio.com/en/blog/posts/cpp/1003/ (N11)
2022-10-25 17:24:00 +01:00
Yaxun (Sam) Liu
9d5adc7e49 Revert "reland e5581df60a35 [SimplifyCFG] accumulate bonus insts cost"
This reverts commit bd7949bcd86633bd4203b2ba6f891aea00fce4d1.

Revert this patch since reviwers have different opinions regarding
the approach in post-commit review.

Will open RFC for further discussion.

Differential Revision: https://reviews.llvm.org/D132408
2022-10-25 12:15:39 -04:00
Lang Hames
b3c9ced93c [ORC] Allow EPCDebugObjectRegistrar clients to specify registration fn dylib.
Similar to the EPCEHFrameRegistrar change in c977251ef6f, this allows clients
who have sourced a dylib handle via a side-channel to search that dylib to
find the registration functions.

This patch defaults to the existing behavior in the case where the client does
not specify a handle to use.
2022-10-25 08:50:27 -07:00
Lang Hames
9000ee2224 [ORC] Update SelfExecutorProcessControl to allow user-supplied handles.
SelfExecutorProcessControl no longer requires that handles passed to
lookupSymbols be ones that were previously returned from loadDylib. This brings
SelfExecutorPRocessControl into alignment with SimpleRemoteEPC, which was
updated in 6613f4aff85.
2022-10-25 08:50:27 -07:00
Jan Sjodin
3d0e9edd8e [OpenMP] [OMPIRBuilder] Create a new datatype to hold the unique target region info
This patch puts the individual target region information attributes into a
struct so that the nested mappings are not needed and passing the information
around is simplified.

Reviewed By: jdoerfert, mikerice

Differential Revision: https://reviews.llvm.org/D136601
2022-10-25 11:15:36 -04:00
zhongyunde
620cff096a [InstCombine] Fold series of instructions into mull for more types
Relax the constraint of wide/vectors types.
Address the comment https://reviews.llvm.org/D136015?id=469189#inline-1314520

Reviewed By: spatel, chfast
Differential Revision: https://reviews.llvm.org/D136661
2022-10-25 23:04:46 +08:00
Simon Pilgrim
ed1b0da557 [X86] combineConcatVectorOps - fold v4i64/v8x32 concat(broadcast(),broadcast()) -> permilps(concat())
Extend the existing v4f64 fold to handle v4i64/v8f32/v8i32 as well

Fixes #58585
2022-10-25 15:37:42 +01:00
Juan Manuel MARTINEZ CAAMAÑO
854b1bca60 [DebugInfo] getMergedLocation: Maintain the line number if they match
getMergedLocation returns a 'line 0' DILocaiton if the two locations
being merged don't perfecly match, even if they are in the same line but
a different column.

This commit adds support to keep the line number if it matches (but only
the column differs). The merged column number is the leftmost between the
two.

Reviewed By: dblaikie, orlando

Differential Revision: https://reviews.llvm.org/D135166
2022-10-25 14:10:21 +00:00
Jay Foad
191d70f2f5 [AMDGPU] Use Register in more places in SIInstrInfo. NFC.
Also avoid using AMDGPU::NoRegister when it's not neeeded.
2022-10-25 15:04:58 +01:00