469694 Commits

Author SHA1 Message Date
cabreraam
a033bf242f [flang][hlfir] work towards handling char_convert in hlfir
This patch aims to address the TODO for handling character conversion in HLFIR found [here](1defa78124/flang/lib/Lower/ConvertExprToHLFIR.cpp (L1388)) using [this similar operation but for FIR as inspiration](3ea673a97b/flang/lib/Lower/ConvertExpr.cpp (L1212-L1271)).

Reviewed By: vzakhari, tblah

Differential Revision: https://reviews.llvm.org/D155650
2023-07-31 10:45:10 -04:00
Roger Ferrer Ibanez
896aada3b6 [NFCI][mlir][Tests] Rename identifiers minor/major to avoid clashes with system headers
Identifiers major and minor are often already taken in POSIX systems due
to their presence in <sys/types.h> as part of the makedev library
function.

This causes compilation failures on FreeBSD and Linux systems with glibc
<2.28.

This change renames the identifiers to major_/minor_.
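
A minimal sketch of the rename, assuming a hypothetical `Version` struct and
helpers (none of these names come from the patch): with a trailing underscore,
the member names can no longer be rewritten by `major`/`minor` macros that some
system headers define.

```cpp
#include <cassert>

// Hypothetical version holder; all names here are illustrative only.
// Members spelled `major`/`minor` could be rewritten by macros that some
// system headers define, so the trailing-underscore spelling is used.
struct Version {
  int major_;
  int minor_;
};

// Builds a Version, rejecting negative components.
inline Version makeVersion(int major_, int minor_) {
  assert(major_ >= 0 && minor_ >= 0 && "version components are non-negative");
  return Version{major_, minor_};
}

// Returns true if `a` is an older version than `b`.
inline bool isOlder(const Version &a, const Version &b) {
  if (a.major_ != b.major_)
    return a.major_ < b.major_;
  return a.minor_ < b.minor_;
}
```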

Differential Revision: https://reviews.llvm.org/D156683
2023-07-31 14:36:35 +00:00
Matt Arsenault
fbeda975d2 InstCombine: Drop some typed pointer cast handling 2023-07-31 10:34:31 -04:00
Alexey Bataev
662efdee9b [SLP][NFC]Improve handling of MinBWs container, NFC.
Replaced MapVector with DenseMap (the order is not important, only
lookups are used) and reduced the number of lookups.
2023-07-31 07:26:55 -07:00
Nikita Popov
72ec2c007e [InstCombine] Fix handling of irreducible loops (PR64259)
Fixes a regression introduced by D75362 for irreducible control
flow. In that case, we may visit the predecessor that renders
the current block live only later, and incorrectly determine
that a block is dead.

Instead, switch to using the same DeadEdges based implementation
we also use during the main InstCombine iteration.

This temporarily regresses some cases that need replacement of
dead phi operands with poison, which is currently only done during
the main run, but not worklist population. This will be addressed
in a followup, to keep it separate from the correctness fix here.

Fixes https://github.com/llvm/llvm-project/issues/64259.
2023-07-31 16:20:22 +02:00
Shraiysh Vaishay
2cb6d0c70b [mlir][OpenMP] Translating if and final clauses for task construct
Support for if and final clauses for task construct.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D130704
2023-07-31 19:44:17 +05:30
Matt Arsenault
d6f9428e46 GlobalISel: Pass MachineIRBuilder to applyMappingImpl
The target should not have to construct MachineIRBuilders during
RegBankSelect (we should perhaps hide the constructors for it). The
pass should own the builder setup with the desired CSE configuration
(although currently the pass does not use the CSE builder, which is
what I want to fix).

https://reviews.llvm.org/D156479
2023-07-31 10:03:38 -04:00
Alexey Bataev
85635c7f60 [SLP][NFC]Use ScalarTy consistently in getEntryCost, NFC. 2023-07-31 06:52:56 -07:00
Sergei Barannikov
aeeaadd6ee [SystemZ] Replace OperandMatchResultTy with ParseStatus (NFC)
ParseStatus is slightly more convenient to use due to implicit
conversion from bool, which allows writing something like:
```
  return Error(L, "msg");
```
whereas with OperandMatchResultTy it had to be:
```
  Error(L, "msg");
  return MatchOperand_ParseFail;
```
It also has a more appropriate name, since parse* methods are not only for
parsing operands.
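
Why the implicit conversion shortens the error path can be sketched with a toy
status type. `Status`, `reportError` and `parseToken` below are hypothetical
stand-ins written for illustration, not the LLVM classes; the only assumption
carried over is the convention that an error helper returns `true`.

```cpp
#include <cassert>

// Toy status type that converts implicitly from bool, where `true`
// means "an error was reported". This is not the LLVM class itself.
class Status {
public:
  enum Kind { Success, Failure, NoMatch };
  constexpr Status(Kind Value) : K(Value) {}
  // Implicit on purpose: lets `return reportError(...);` compile.
  constexpr Status(bool Error) : K(Error ? Failure : Success) {}
  constexpr bool isSuccess() const { return K == Success; }
  constexpr bool isFailure() const { return K == Failure; }

private:
  Kind K;
};

// Hypothetical diagnostic helper following the "return true on error"
// convention; a real one would also emit the message somewhere.
inline bool reportError(const char *Msg) {
  assert(Msg && "message must not be null");
  return true;
}

// Hypothetical parse routine: an empty token is treated as an error.
inline Status parseToken(const char *Tok) {
  if (*Tok == '\0')
    return reportError("expected token"); // bool converts to Status
  return Status::Success;
}
```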

Reviewed By: uweigand

Differential Revision: https://reviews.llvm.org/D154316
2023-07-31 16:44:00 +03:00
Matthias Springer
16b75cd2bb [mlir][vector] Use DenseI64ArrayAttr for ExtractOp/InsertOp positions
`DenseI64ArrayAttr` provides a better API than `I64ArrayAttr`. E.g., accessors returning `ArrayRef<int64_t>` (instead of `ArrayAttr`) are generated.

Differential Revision: https://reviews.llvm.org/D156684
2023-07-31 15:25:37 +02:00
Matthias Springer
aba0ef7059 [mlir][bufferization] Support casts in EmptyTensorElimination
EmptyTensorElimination is a pre-bufferization transformation that replaces "tensor.empty" ops with "tensor.extract_slice" ops. This revision adds support for cases where the input IR contains "tensor.cast" ops.

Differential Revision: https://reviews.llvm.org/D156167
2023-07-31 15:20:00 +02:00
Nikita Popov
09156b36c6 [InstCombine] Move worklist preparation into InstCombinerImpl (NFC) 2023-07-31 15:18:12 +02:00
Matthias Springer
933fde3d1c [mlir][tensor][NFC] Simplify extract_slice(cast) folder
The type computation part is not needed.

Differential Revision: https://reviews.llvm.org/D156652
2023-07-31 15:07:49 +02:00
Matthias Springer
b2826c0209 [mlir][NFC] Move offsets/sizes/strides helper to dialect utils and interface header
* Move `foldDynamicIndexList` to `DialectUtils` and simplify function.
* Move `OpWithOffsetSizesAndStridesConstantArgumentFolder` to `ViewLikeInterface` and add documentation.

Differential Revision: https://reviews.llvm.org/D156581
2023-07-31 14:53:14 +02:00
Matt Arsenault
ab6cd2d498 AMDGPU: Simplify early exit handling for libcall simplify
Early exit on intrinsics and don't duplicate indirect call
checks. Also let the IRBuilder constructor figure out the insert point
rather than doing it manually. Also avoid debug print about trying to
simplify calls in more unhandled scenarios.
2023-07-31 08:18:12 -04:00
Matt Arsenault
d74c89fdb4 InstCombine: Drop some typed pointer bitcasts 2023-07-31 08:05:58 -04:00
Matt Arsenault
055a7f2512 AMDGPU: Adjust outdated comment 2023-07-31 08:05:13 -04:00
Matt Arsenault
51ec5a2733 AMDGPU: Use available subtarget member 2023-07-31 08:05:12 -04:00
Matt Arsenault
acc163d4ab Inliner: Regenerate test
Test claims to be autogenerated but some functions are inexplicably
missing checks.
2023-07-31 08:05:12 -04:00
Matt Arsenault
360a5d5612 AMDGPU: Remove some typed pointer handling 2023-07-31 08:05:12 -04:00
Matt Arsenault
d388222be2 InstCombine: Drop some typed pointer bitcast handling 2023-07-31 08:05:12 -04:00
Nimish Mishra
da1f1b2292 Prevent extraneous copy in f752265231c2d15590a53e45bcc850acf2450dfc
Commit f752265231c2d15590a53e45bcc850acf2450dfc makes an
extraneous copy of the loop variable. This patch removes it.
2023-07-31 17:31:19 +05:30
Jonas Hahnfeld
5ea647dea6 [CodeGen] Assert that EmittedDeferredDecls is empty
Its contents are transferred into DeferredDecls in Release(), so it
should be empty in moveLazyEmissionStates(). This matches the code
downstream in Cling.

Differential Revision: https://reviews.llvm.org/D156660
2023-07-31 13:40:00 +02:00
Haojian Wu
dcb28244fa [clangd] Respect IWYU keep pragma for standard headers.
See the issue https://github.com/llvm/llvm-project/issues/64191

Differential Revision: https://reviews.llvm.org/D156650
2023-07-31 13:21:54 +02:00
Nimish Mishra
f752265231 [flang][OpenMP] Support for privatization in common block
This patch provides support for usage of common block
in private/firstprivate and lastprivate clauses.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D156120
2023-07-31 16:46:18 +05:30
Haojian Wu
171868dc2c [Tooling/Inclusion] Add std::range symbols in the mapping.
Fixes https://github.com/llvm/llvm-project/issues/64191

Differential Revision: https://reviews.llvm.org/D156648
2023-07-31 13:05:47 +02:00
Peixin Qiao
b4c54b2027 [flang][OpenMP] Support common block in OpenMP private clause
This supports common blocks in the OpenMP private clause by giving
each common block member a host-associated privatization, and
adds a test case.

Reviewed By: kiranchandramohan

Differential Revision: https://reviews.llvm.org/D127215
2023-07-31 16:24:12 +05:30
Simon Pilgrim
c1c86f9eae [X86] LowerEXTRACT_VECTOR_ELT - match i8 extraction with MVT::i8 instead of getSizeInBits()
Noticed on D156350
2023-07-31 11:37:26 +01:00
Jay Foad
0ef39e33d7 [StackColoring] Fix typo in comment 2023-07-31 11:35:57 +01:00
Sergio Afonso
fcb6a9c07c [Flang][OpenMP][Lower] Refactor implementation of PFT to MLIR lowering
This patch makes the following non-functional changes:
  - Extract OpenMP clause processing into a new internal `ClauseProcessor`
    class. Atomic and reduction-related clauses processing is kept unchanged,
    since atomic clauses are stored in `OmpAtomicClauseList` rather than
    `OmpClauseList` and there are many TODO comments related to the current
    implementation of reduction lowering. This has been left unchanged to avoid
    merge conflicts and work duplication.
  - Reorganize functions into sections in the file to improve readability.
  - Explicitly use mlir:: namespace everywhere, rather than just most places.
  - Spell out uses of `auto` in which the type wasn't explicitly stated as part
    of the initialization expression.
  - Normalize a few function names to match the rest and renamed variables in
    'snake_case' to 'camelCase'.

The main purpose is to reduce code duplication and simplify the implementation
of upcoming work to support loop-attached target constructs and teams/
distribute lowering to MLIR.

Differential Revision: https://reviews.llvm.org/D155981
2023-07-31 10:51:39 +01:00
Ingo Müller
bd17556d55 [mlir][memref][transform][python] Create .td file for bindings.
This patch creates the .td files for the Python bindings of the
transform ops of the MemRef dialect and integrates them into the build
systems (CMake and Bazel).

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D156536
2023-07-31 09:49:28 +00:00
Vedant Paranjape
259d56d41d [LoopAccessAnalysis] Add a const qualifier to getMaxSafeDepDistBytes()
Add a const qualifier to this API call, since this is a member of
MemoryDepChecker and LoopAccessInfo returns an object of this class as a
const, as follows:

const MemoryDepChecker &getDepChecker() const { return *DepChecker; }

If one tries to use the function as follows:

LAI->getDepChecker().getMaxSafeDepDistBytes()

it results in the following error:

passing ‘const llvm::MemoryDepChecker’ as ‘this’ argument discards
qualifiers
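
The situation can be reproduced with two toy classes. `DepChecker`,
`AccessInfo` and `queryMaxSafeDistance` are simplified, hypothetical stand-ins
for illustration, not the LLVM classes; removing the trailing `const` from the
getter should make the last function fail to compile with a "discards
qualifiers" style diagnostic.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for the checker; names and members are illustrative.
class DepChecker {
public:
  explicit DepChecker(uint64_t MaxBytes) : MaxSafeDepDistBytes(MaxBytes) {}
  // The trailing `const` is what allows calls through a const reference.
  uint64_t getMaxSafeDepDistBytes() const { return MaxSafeDepDistBytes; }

private:
  uint64_t MaxSafeDepDistBytes;
};

// Toy stand-in for the analysis result that only hands out a const checker.
class AccessInfo {
public:
  explicit AccessInfo(uint64_t MaxBytes) : Checker(MaxBytes) {}
  const DepChecker &getDepChecker() const { return Checker; }

private:
  DepChecker Checker;
};

// Mirrors the call chain from the message: a const-qualified getter can be
// invoked on the const reference returned by getDepChecker().
inline uint64_t queryMaxSafeDistance(const AccessInfo *LAI) {
  assert(LAI && "expected a valid analysis result");
  return LAI->getDepChecker().getMaxSafeDepDistBytes();
}
```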

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D156304
2023-07-31 09:45:01 +00:00
Simon Pilgrim
076bee1020 [DAG] getNode() - fold (zext (trunc (assertzext x))) -> (assertzext x)
If the pre-truncated value was the same width as the extension, and the assertzext guarantees that the extended bits are already zero, then skip the zext/trunc 'zero_extend_inreg' pattern.
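
The arithmetic behind the fold can be checked on plain integers. This is a
sketch on `uint32_t`/`uint8_t` values with hypothetical helper names, not
SelectionDAG code: when the upper 24 bits are already known to be zero,
truncating to 8 bits and zero-extending back returns the original value.

```cpp
#include <cassert>
#include <cstdint>

// Models (zext (trunc x)) from a 32-bit value through 8 bits and back.
inline uint32_t zextOfTrunc(uint32_t X) {
  uint8_t Truncated = static_cast<uint8_t>(X); // trunc i32 -> i8
  return static_cast<uint32_t>(Truncated);     // zext i8 -> i32
}

// Models the guarantee an assertzext from i8 gives: bits 8..31 are zero.
inline bool upperBitsKnownZero(uint32_t X) { return (X >> 8) == 0; }

// With the guarantee in place, the zext/trunc pair is the identity,
// so the value can be returned unchanged.
inline uint32_t foldZextTrunc(uint32_t X) {
  assert(upperBitsKnownZero(X) && "fold is only valid under the assertion");
  return X;
}
```

Without the guarantee the pair is not a no-op: for `0x1AB` the round trip
yields `0xAB`, which is why the fold depends on the assertzext.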

Addresses several regressions noticed in D155472
2023-07-31 10:43:11 +01:00
Simon Tatham
60b98363c7 Retain all jump table range checks when using BTI.
This modifies the switch-statement generation in SelectionDAGBuilder,
specifically the part that generates case clusters of type CC_JumpTable.

A table-based branch of any kind is at risk of being a JOP gadget, if
it doesn't range-check the offset into the table. For some types of
table branch, such as Arm TBB/TBH, the impact of this is limited
because the value loaded from the table is a relative offset of
limited size; for others, such as a MOV PC,Rn computed branch into a
table of further branch instructions, the gadget is fully general.

When compiling for branch-target enforcement via Arm's BTI system,
many of these table branch idioms use branch instructions of types
that do not require a BTI instruction at the branch destination. This
avoids the need to put a BTI at the start of each case handler,
reducing the number of available gadgets //with// BTIs (i.e. ones
which could be used by a JOP attack in spite of the BTI system). But
without a range check, the use of a non-BTI-requiring branch also
opens up a larger range of followup gadgets for an attacker's use.

A defence against this is to avoid optimising away the range check on
the table offset, even if the compiler believes that no out-of-range
value should be able to reach the table branch. (Rationale: that may
be true for values generated legitimately by the program, but not
those generated maliciously by attackers who have already corrupted
the control flow.)

The effect of keeping the range check and branching to an unreachable
block is that no actual code is generated at that block, so it will
typically point at the end of the function. That may still cause some
kind of unpredictable code execution (such as executing data as code,
or falling through to the next function in the code section), but even
if so, there will only be //one// possible invalid branch target,
rather than giving an attacker the choice of many possibilities.

This defence is enabled only when branch target enforcement is in use.
Without branch target enforcement, the range check is easily bypassed
anyway, by branching in to a location just after it. But with
enforcement, the attacker will have to enter the jump table dispatcher
at the initial BTI and then go through the range check. (Or, if they
don't, it's because they //already// have a general BTI-bypassing
gadget.)
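
As a source-level illustration only (the patch itself works on the code
SelectionDAGBuilder emits, and all names below are hypothetical), a table
dispatch that keeps its range check looks roughly like this: every
out-of-range index is funnelled to one fixed outcome instead of selecting an
arbitrary slot beyond the table.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical case handlers for a three-way switch.
inline int handleA() { return 10; }
inline int handleB() { return 20; }
inline int handleC() { return 30; }

using Handler = int (*)();

// Table-based dispatch that always performs its range check. The single
// fallback value stands in for the one "unreachable" target described
// in the message.
inline int dispatch(std::size_t Index) {
  static constexpr Handler Table[] = {handleA, handleB, handleC};
  constexpr std::size_t NumCases = sizeof(Table) / sizeof(Table[0]);
  if (Index >= NumCases)
    return -1; // the only possible out-of-range outcome
  Handler H = Table[Index];
  assert(H && "table entries are never null");
  return H();
}
```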

Reviewed By: MaskRay, chill

Differential Revision: https://reviews.llvm.org/D155485
2023-07-31 10:39:50 +01:00
Cullen Rhodes
ce6303f0e6 [lli] Fix crash on empty entry-function
Empty entry-function triggers the following assertion:

  llvm/lib/IR/Mangler.cpp:38: void getNameWithPrefixImpl(llvm::raw_ostream
  &, const llvm::Twine &, (anonymous namespace)::ManglerPrefixTy, const
  llvm::DataLayout &, char):

  Assertion `!Name.empty() && "getNameWithPrefix requires non-empty name"' failed.

Throw an error if entry-function is empty.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D156516
2023-07-31 09:29:54 +00:00
Nikita Popov
41895843b5 [InstCombine] Only perform one iteration
InstCombine is a worklist-driven algorithm, which works roughly
as follows:

* All instructions are initially pushed to the worklist.
  The initial order is in RPO program order.
* All newly inserted instructions get added to the worklist.
* When an instruction is folded, its users get added back to the
  worklist.
* When the use-count of an instruction decreases, it gets added
  back to the worklist.
* And a few other heuristics on when we should revisit
  instructions.

On top of the worklist algorithm, InstCombine layers an additional
fix-point iteration: If any fold was performed in the previous
iteration, then InstCombine will re-populate the worklist from
scratch and fold the entire function again. This continues until
a fix-point is reached.
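
The two layers can be sketched on a toy problem. Everything below is a
hypothetical model, not InstCombine code: a "fold" halves an even integer, the
inner pass is worklist-driven and re-queues whatever it folded, and the outer
loop reruns the pass from scratch until a pass changes nothing.

```cpp
#include <cassert>
#include <cstddef>
#include <deque>
#include <vector>

// Toy "fold": halve an even, non-zero value. Returns true on change.
inline bool foldOnce(int &V) {
  if (V != 0 && V % 2 == 0) {
    V /= 2;
    return true;
  }
  return false;
}

// One worklist-driven pass: every item is pushed initially, and an item
// that was folded is pushed again so that it gets revisited.
inline bool runWorklist(std::vector<int> &Values) {
  std::deque<std::size_t> Worklist;
  for (std::size_t I = 0; I < Values.size(); ++I)
    Worklist.push_back(I);
  bool Changed = false;
  while (!Worklist.empty()) {
    std::size_t I = Worklist.front();
    Worklist.pop_front();
    assert(I < Values.size() && "worklist holds valid indices");
    if (foldOnce(Values[I])) {
      Changed = true;
      Worklist.push_back(I);
    }
  }
  return Changed;
}

// Outer fix-point loop: rerun the pass until it reports no change.
// Returns the number of passes that were executed.
inline unsigned countPasses(std::vector<int> Values) {
  unsigned Passes = 0;
  bool Changed = true;
  while (Changed) {
    Changed = runWorklist(Values);
    ++Passes;
  }
  return Passes;
}
```

In this model an input that needs folding takes two passes, the second of which
only confirms that nothing is left to do, while an input with nothing to fold
takes one.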

In the vast majority of cases, InstCombine will reach a fix-point
within a single iteration. However, a second iteration is performed
to verify that this is indeed the fixpoint. We can see this in the
statistics for llvm-test-suite:

    "instcombine.NumOneIteration": 411380,
    "instcombine.NumTwoIterations": 117921,
    "instcombine.NumThreeIterations": 236,
    "instcombine.NumFourOrMoreIterations": 2,

The way to read these numbers is that in 411380 cases, InstCombine
performs no folds. In 117921 cases it performs a fold and reaches
the fix-point within one iteration (the second iteration verifies
the fixpoint). In the remaining 238 cases, more than one iteration
is needed to reach the fixpoint.

In other words, only in 0.04% of cases are additional iterations
needed to reach a fixpoint. Conversely, in 22.3% of cases InstCombine
performs a completely useless extra iteration to verify the fix point.
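
These percentages can be re-derived from the four counters quoted above. The
helper names are made up for this sketch; only the counter values come from
the message.

```cpp
#include <cassert>

// Iteration counts quoted in the llvm-test-suite statistics above.
constexpr unsigned long NumOneIteration = 411380;
constexpr unsigned long NumTwoIterations = 117921;
constexpr unsigned long NumThreeIterations = 236;
constexpr unsigned long NumFourOrMoreIterations = 2;
constexpr unsigned long NumRuns = NumOneIteration + NumTwoIterations +
                                  NumThreeIterations + NumFourOrMoreIterations;

// Share of `Part` in `Total`, expressed as a percentage.
inline double percentOf(unsigned long Part, unsigned long Total) {
  assert(Total != 0 && "cannot take a share of zero runs");
  return 100.0 * static_cast<double>(Part) / static_cast<double>(Total);
}

// Runs that needed more than one iteration to reach the fixpoint.
inline double extraIterationPercent() {
  return percentOf(NumThreeIterations + NumFourOrMoreIterations, NumRuns);
}

// Runs whose second iteration only verified the fixpoint.
inline double verifyOnlyPercent() {
  return percentOf(NumTwoIterations, NumRuns);
}
```

With 529539 runs in total, the first value comes out between 0.04% and 0.05%
and the second at roughly 22.3%, matching the figures in the message.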

This patch removes the fixpoint iteration from InstCombine, and always
only perform a single iteration. This results in a major compile-time
improvement of around 4% at negligible codegen impact.

This explicitly does accept that we will not reach a fixpoint in all
cases. However, this is mitigated by two factors: First, the data
suggests that this happens very rarely in practice. Second,
InstCombine runs many times during the optimization pipeline
(8 times even without LTO), so there are many chances to recover
such cases.

In order to prevent accidental optimization regressions in the
future, this implements a verify-fixpoint option, which is enabled
by default when instcombine is specified in -passes and disabled
when InstCombinePass() is constructed from C++. This means that
test cases need to explicitly use the no-verify-fixpoint option
if they fail to reach a fixed point (for a well-understood reason
we cannot / do not want to avoid).

Differential Revision: https://reviews.llvm.org/D154579
2023-07-31 10:56:49 +02:00
wangpc
19a1b67b6d [RISCV] Fix typo in C9LeftShift
It should be 9 instead of 5.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D156500
2023-07-31 16:49:47 +08:00
Alex Zinenko
b17acc08a8 [mlir][python] more python gpu transform mixins
Add the Python mix-in for MapNestedForallToThreads. Fix typing
annotations in MapForallToBlocks and drop the attribute wrapping
rendered unnecessary by attribute builders.

Reviewed By: ingomueller-net

Differential Revision: https://reviews.llvm.org/D156528
2023-07-31 08:24:18 +00:00
Alex Zinenko
8e4887a12e [mlir] use a thread-local alternative to llvm::nulls
LLVM is not set up in a thread-safe way, which seems to be leading to
race conditions when sending stuff to llvm::nulls in opt builds. Try a
thread-local alternative.
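
The general idea can be sketched with standard library streams. This swaps in
`std::ostream` for `llvm::raw_ostream` and uses hypothetical names
(`NullBuffer`, `threadLocalNulls`, `writeDiscarded`), so it illustrates the
technique rather than the patch: each thread gets its own discard-everything
stream, so concurrent writers never share stream state.

```cpp
#include <cassert>
#include <ostream>
#include <streambuf>

// Stream buffer that discards every character written to it.
class NullBuffer : public std::streambuf {
protected:
  int_type overflow(int_type Ch) override {
    return traits_type::not_eof(Ch); // report success, drop the character
  }
};

// Returns a discarding stream that is private to the calling thread.
inline std::ostream &threadLocalNulls() {
  thread_local NullBuffer Buffer;
  thread_local std::ostream Stream(&Buffer);
  return Stream;
}

// Writes `Text` to the calling thread's null stream and reports whether
// the stream is still in a good state afterwards.
inline bool writeDiscarded(const char *Text) {
  assert(Text && "text must not be null");
  threadLocalNulls() << Text;
  return threadLocalNulls().good();
}
```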

Reviewed By: pzread

Differential Revision: https://reviews.llvm.org/D156421
2023-07-31 08:21:21 +00:00
Francesco Petrogalli
c4b21d57bc [llc] Add the command line option -sched-model-force-enable-intervals.
The option is used to force the use of resource intervals
in the machine scheduler, effectively ignoring the value of
`EnableIntervals` in the instance of the `SchedMachineModel`.

Reviewed By: anemet

Differential Revision: https://reviews.llvm.org/D156540
2023-07-31 10:10:18 +02:00
Takuya Shimizu
e90f4fc6ac [clang][ExprConstant] Print template arguments when describing stack frame
This patch also prints the template argument list when the described function is a template specialization.
This can be useful when handling complex template functions in the constexpr evaluator.

Reviewed By: cjdb, dblaikie
Differential Revision: https://reviews.llvm.org/D154366
2023-07-31 17:05:56 +09:00
Nikita Popov
063b37e7b4 Reapply [IR] Mark and/or constant expressions as undesirable
Reapply after D156401, which stops PatternMatch from recognizing
binop constant expressions, which should avoid the infinite loops
and assertion failures this patch previously exposed.

-----

In preparation for removing support for and/or expressions, mark
them as undesirable. As such, we will no longer implicitly create
such expressions, but they still exist.
2023-07-31 09:54:24 +02:00
Alexandros Lamprineas
893d3a61c0 Reland [FuncSpec] Add Phi nodes to the InstCostVisitor.
This patch allows constant folding of PHIs when estimating the user
bonus. Phi nodes are a special case since some of their inputs may
remain unresolved until all the specialization arguments have been
processed by the InstCostVisitor. Therefore, we keep a list of dead
basic blocks and then lazily visit the Phi nodes once the user bonus
has been computed for all the specialization arguments.

Differential Revision: https://reviews.llvm.org/D154852
2023-07-31 08:25:48 +01:00
Simi Pallipurath
3f75d38a4d [clang] Improve hermeticity of clang header tests.
At the moment the header tests below fail with the multilib error in LLVM Embedded Toolchain for Arm because no corresponding aarch64 big-endian library variant exists. Specifying --sysroot to its own testing directory clang/test/Headers/Inputs (which does not have any dependency library) prevents these header tests from being located in standard library directories.

 1. clang/test/Headers/arm-neon-header.c
 2. clang/test/Headers/arm-fp16-header.c

Reviewed By: michaelplatings

Differential Revision: https://reviews.llvm.org/D156427
2023-07-31 08:25:36 +01:00
Timm Bäder
d37f1e9965 [clang][Interp] Implement __builtin_isnormal
Differential Revision: https://reviews.llvm.org/D155374
2023-07-31 09:14:16 +02:00
Timm Bäder
f444f39686 [clang][Interp] Implement __builtin_isfinite
Differential Revision: https://reviews.llvm.org/D155372
2023-07-31 09:12:32 +02:00
Mel Chen
5962942902 [LV][NFC] Refine comments related to reduction idioms. 2023-07-31 00:06:45 -07:00
Timm Bäder
72450a7793 [clang][Interp] Implement __builtin_isinf
Differential Revision: https://reviews.llvm.org/D155371
2023-07-31 08:49:22 +02:00
Sameer Sahasrabuddhe
d9847cde48 [GlobalISel] convergent intrinsics
Introduced the convergent equivalent of the existing G_INTRINSIC opcodes:

- G_INTRINSIC_CONVERGENT
- G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS

Out of the targets that currently have some support for GlobalISel, the patch
assumes that the convergent intrinsics are only relevant to SPIRV and AMDGPU.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D154766
2023-07-31 12:15:39 +05:30
Jim Lin
f2e44238ee [RISCV] Clean up RISCVInstrInfoXTHead.td to match the style of other td files. NFC.
Unify the indentation rule and add one blank line after each comment block.
2023-07-31 14:38:21 +08:00