213 Commits

Author SHA1 Message Date
Kazu Hirata
a98965d92f [CodeGen] Use llvm::erase_value (NFC) 2022-06-10 22:59:48 -07:00
David Penry
907aedbb3d [NFC] Fix spelling/newlines in comments/debug messages
Just a few spelling mistakes and missing newlines

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D127162
2022-06-07 09:38:53 -07:00
Fangrui Song
95a134254a Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 01:07:51 -07:00
Fangrui Song
d86a206f06 Remove unneeded cl::ZeroOrMore for cl::opt/cl::list options 2022-06-05 00:31:44 -07:00
Kazu Hirata
bcf4fa458a [CodeGen] Use a range-based for loop (NFC) 2022-06-04 22:26:55 -07:00
Fangrui Song
36c7d79dc4 Remove unneeded cl::ZeroOrMore for cl::opt options
Similar to 557efc9a8b68628c2c944678c6471dac30ed9e8e.
This commit handles options where cl::ZeroOrMore is more than one line below
cl::opt.
2022-06-04 00:10:42 -07:00
Thomas Preud'homme
68dee83923 [MachinePipeliner] Fix unscheduled instruction
Prior to ordering instructions to be scheduled, the machine pipeliner
update recurrence node sets in groupRemainingNodes() by adding in a
given node set any node on the dependency path from a node set with
higher priority to the given node set. The function computePath() that
determine what constitutes a path follows artificial dependencies.

However, when ordering the nodes in the resulting node sets,
computeNodeOrder() calls ignoreDependence when looking at dependencies
which ignores artificial dependencies. This can cause a node not to be
scheduled which then causes wrong code generation and in the case of a
debug build will lead to an assert failure in generatePhis() in
ModuloScheduler.cpp.

This commit adds calls to ignoreDependence() in computePath() to not add
any node in groupRemainingNodes() that would not be ordered by
computeNodeOrder().

Reviewed By: sgundapa

Differential Revision: https://reviews.llvm.org/D124267
2022-05-05 16:01:41 +01:00
David Penry
dcb77643e3 Reapply [CodeGen][ARM] Enable Swing Module Scheduling for ARM
Fixed "private field is not used" warning when compiled
with clang.

original commit: 28d09bbbc3d09c912b54a4d5edb32cab7de32a6f
reverted in: fa49021c68ef7a7adcdf7b8a44b9006506523191

------

This patch permits Swing Modulo Scheduling for ARM targets
turns it on by default for the Cortex-M7.  The t2Bcc
instruction is recognized as a loop-ending branch.

MachinePipeliner is extended by adding support for
"unpipelineable" instructions.  These instructions are
those which contribute to the loop exit test; in the SMS
papers they are removed before creating the dependence graph
and then inserted into the final schedule of the kernel and
prologues. Support for these instructions was not previously
necessary because current targets supporting SMS have only
supported it for hardware loop branches, which have no
loop-exit-contributing instructions in the loop body.

The current structure of the MachinePipeliner makes it difficult
to remove/exclude these instructions from the dependence graph.
Therefore, this patch leaves them in the graph, but adds a
"normalization" method which moves them in the schedule to
stage 0, which causes them to appear properly in kernel and
prologues.

It was also necessary to be more careful about boundary nodes
when iterating across successors in the dependence graph because
the loop exit branch is now a non-artificial successor to
instructions in the graph. In additional, schedules with physical
use/def pairs in the same cycle should be treated as creating an
invalid schedule because the scheduling logic doesn't respect
physical register dependence once scheduled to the same cycle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D122672
2022-04-29 10:54:39 -07:00
David Penry
fa49021c68 Revert "[CodeGen][ARM] Enable Swing Module Scheduling for ARM"
This reverts commit 28d09bbbc3d09c912b54a4d5edb32cab7de32a6f
while I investigate a buildbot failure.
2022-04-28 13:29:27 -07:00
David Penry
28d09bbbc3 [CodeGen][ARM] Enable Swing Module Scheduling for ARM
This patch permits Swing Modulo Scheduling for ARM targets
turns it on by default for the Cortex-M7.  The t2Bcc
instruction is recognized as a loop-ending branch.

MachinePipeliner is extended by adding support for
"unpipelineable" instructions.  These instructions are
those which contribute to the loop exit test; in the SMS
papers they are removed before creating the dependence graph
and then inserted into the final schedule of the kernel and
prologues. Support for these instructions was not previously
necessary because current targets supporting SMS have only
supported it for hardware loop branches, which have no
loop-exit-contributing instructions in the loop body.

The current structure of the MachinePipeliner makes it difficult
to remove/exclude these instructions from the dependence graph.
Therefore, this patch leaves them in the graph, but adds a
"normalization" method which moves them in the schedule to
stage 0, which causes them to appear properly in kernel and
prologues.

It was also necessary to be more careful about boundary nodes
when iterating across successors in the dependence graph because
the loop exit branch is now a non-artificial successor to
instructions in the graph. In additional, schedules with physical
use/def pairs in the same cycle should be treated as creating an
invalid schedule because the scheduling logic doesn't respect
physical register dependence once scheduled to the same cycle.

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D122672
2022-04-28 13:01:18 -07:00
Thomas Preud'homme
449ef2fcc6 [Pipeliner] Fix comment typo 2022-04-04 16:10:27 +01:00
serge-sans-paille
989f1c72e0 Cleanup codegen includes
This is a (fixed) recommit of https://reviews.llvm.org/D121169

after:  1061034926
before: 1063332844

Discourse thread: https://discourse.llvm.org/t/include-what-you-use-include-cleanup
Differential Revision: https://reviews.llvm.org/D121681
2022-03-16 08:43:00 +01:00
Nico Weber
a278250b0f Revert "Cleanup codegen includes"
This reverts commit 7f230feeeac8a67b335f52bd2e900a05c6098f20.
Breaks CodeGenCUDA/link-device-bitcode.cu in check-clang,
and many LLVM tests, see comments on https://reviews.llvm.org/D121169
2022-03-10 07:59:22 -05:00
serge-sans-paille
7f230feeea Cleanup codegen includes
after:  1061034926
before: 1063332844

Differential Revision: https://reviews.llvm.org/D121169
2022-03-10 10:00:30 +01:00
Thomas Preud'homme
67c14d5c69 [MachinePipeliner] Fix isPseduo typo. 2022-03-09 15:26:39 +00:00
Kazu Hirata
bb6447a78c [llvm] Use llvm::reverse (NFC) 2021-12-12 16:13:49 -08:00
Mircea Trofin
b012742405 [NFC] Rename MachineFunction::deleteMachineInstr (coding style) 2021-12-08 20:36:13 -08:00
Kazu Hirata
c4a8928b51 [CodeGen] Use range-based for loops (NFC) 2021-12-06 08:49:10 -08:00
Kazu Hirata
ca2f53897a [CodeGen] Use range-based for loops (NFC) 2021-12-04 08:48:05 -08:00
Kazu Hirata
f240e528ce [llvm] Use range-based for loops (NFC) 2021-11-29 09:04:44 -08:00
Kazu Hirata
fd7d40640d [llvm] Use range-based for loops (NFC) 2021-11-28 18:14:49 -08:00
Kazu Hirata
c73fc74ce0 [llvm] Use range-based for loops (NFC) 2021-11-28 10:04:54 -08:00
Kazu Hirata
c3e698e2f5 [CodeGen, Hexagon] Use MachineBasicBlock::phis (NFC) 2021-10-26 09:01:29 -07:00
Arthur Eubanks
d7593ebaee [NFC] Clean up users of AttributeList::hasAttribute()
AttributeList::hasAttribute() is confusing, use clearer methods like
hasParamAttr()/hasRetAttr().

Add hasRetAttr() since it was missing from AttributeList.
2021-08-13 11:59:18 -07:00
dfukalov
d066079728 [NFC][AA] Prepare to convert AliasResult to class with PartialAlias offset.
Main reason is preparation to transform AliasResult to class that contains
offset for PartialAlias case.

Reviewed By: asbirlea

Differential Revision: https://reviews.llvm.org/D98027
2021-04-09 12:54:22 +03:00
Kazu Hirata
d639120983 [llvm] Use set_is_subset (NFC) 2021-02-28 10:59:20 -08:00
Marianne Mailhot-Sarrasin
f0ec9f1bb3 [Pipeliner] Fixed optimization remarks and debug dumps Initiation
Interval value

The II value was incremented before exiting the loop, and therefor when
used in the optimization remarks and debug dumps it did not reflect the
initiation interval actually used in Schedule.

Differential Revision: https://reviews.llvm.org/D95692
2021-02-17 12:28:37 -05:00
Kazu Hirata
3279943adf [CodeGen] Use range-based for loops (NFC) 2021-02-16 23:23:08 -08:00
Kazu Hirata
7dc3575ef2 [llvm] Remove redundant return and continue statements (NFC)
Identified with readability-redundant-control-flow.
2021-01-14 20:30:34 -08:00
Kazu Hirata
b7c5e0b02c [Target, Transforms] Use *Set::contains (NFC) 2021-01-08 18:39:54 -08:00
Kazu Hirata
1e3ed09165 [CodeGen] Use llvm::append_range (NFC) 2020-12-28 19:55:16 -08:00
dfukalov
2ce38b3f03 [NFC] Reduce include files dependency.
1. Removed #include "...AliasAnalysis.h" in other headers and modules.
2. Cleaned up includes in AliasAnalysis.h.

Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D92489
2020-12-03 18:25:05 +03:00
Nikita Popov
4df8efce80 [AA] Split up LocationSize::unknown()
Currently, we have some confusion in the codebase regarding the
meaning of LocationSize::unknown(): Some parts (including most of
BasicAA) assume that LocationSize::unknown() only allows accesses
after the base pointer. Some parts (various callers of AA) assume
that LocationSize::unknown() allows accesses both before and after
the base pointer (but within the underlying object).

This patch splits up LocationSize::unknown() into
LocationSize::afterPointer() and LocationSize::beforeOrAfterPointer()
to make this completely unambiguous. I tried my best to determine
which one is appropriate for all the existing uses.

The test changes in cs-cs.ll in particular illustrate a previously
clearly incorrect AA result: We were effectively assuming that
argmemonly functions were only allowed to access their arguments
after the passed pointer, but not before it. I'm pretty sure that
this was not intentional, and it's certainly not specified by
LangRef that way.

Differential Revision: https://reviews.llvm.org/D91649
2020-11-26 18:39:55 +01:00
Mircea Trofin
c8fcffe775 [NFC][MC] Use MCRegister in Machine{Sink|Pipeliner}.cpp
Differential Revision: https://reviews.llvm.org/D89328
2020-10-14 08:42:17 -07:00
Alon Kom
818cf30b83 [MachinePipeliner] Fix II_setByPragma initialization
II_setByPragma was not reset between 2 calls of the MachinePipleiner pass

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D87088
2020-09-09 13:38:35 +00:00
Vitaly Buka
b0eb40ca39 [NFC] Remove unused GetUnderlyingObject paramenter
Depends on D84617.

Differential Revision: https://reviews.llvm.org/D84621
2020-07-31 02:10:03 -07:00
Vitaly Buka
89051ebace [NFC] GetUnderlyingObject -> getUnderlyingObject
I am going to touch them in the next patch anyway
2020-07-30 21:08:24 -07:00
Jinsong Ji
80b78a47e5 [MachinePipeliner] Add ORE for MachinePipeliner
This patch adds ORE for MachinePipeliner, so that people can anaylyze
their code using opt-viewer or other tools, then optimize the code to
catch more piplining opportunities.

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D79368
2020-05-05 16:04:53 +00:00
Fraser Cormack
c819ef9653 Provide operand indices to adjustSchedDependency
This allows targets to know exactly which operands are contributing to
the dependency, which is required for targets with per-operand
scheduling models.

Differential Revision: https://reviews.llvm.org/D77135
2020-04-17 11:08:44 +01:00
Sumanth Gundapaneni
a04ab2ec08 [Pipeliner] Fix the bug in pragma that disables the pipeliner.
Differential Revision: https://reviews.llvm.org/D76303.
2020-04-10 12:52:16 -05:00
Lama
4a6ebc03ba [MachinePipeliner] Fix a bug in Output Dependency chains
The current implementation collects all Preds/Succs of a Dep of kind Output, creating a long chain and subsequently a schedule with an unnecessarily large II.
Was this done on purpose for a reason I'm missing?

Reviewed By: bcahoon

Differential Revision: https://reviews.llvm.org/D75424
2020-03-24 14:37:50 +00:00
Bevin Hansson
c3f36acc92 [MC] Widen the functional unit type from 32 to 64 bits.
Summary:
The type used to represent functional units in MC is
'unsigned', which is 32 bits wide. This is currently
not a problem in any upstream target as no one seems
to have hit the limit on this yet, but in our
downstream one, we need to define more than 32
functional units.

Increasing the size does not seem to cause a huge
size increase in the binary (an llc debug build went
from 1366497672 to 1366523984, a difference of 26k),
so perhaps it would be acceptable to have this patch
applied upstream as well.

Subscribers: hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71210
2020-02-24 09:37:00 +01:00
Sander de Smalen
8fbc925807 Add OffsetIsScalable to getMemOperandWithOffset
Summary:
Making `Scale` a `TypeSize` in AArch64InstrInfo::getMemOpInfo,
has the effect that all places where this information is used
(notably, TargetInstrInfo::getMemOperandWithOffset) will need
to consider Scale - and derived, Offset - possibly being scalable.

This patch adds a new operand `bool &OffsetIsScalable` to
TargetInstrInfo::getMemOperandWithOffset and fixes up all
the places where this function is used, to consider the
offset possibly being scalable.

In most cases, this means bailing out because the algorithm does not
(or cannot) support scalable offsets in places where it does some
form of alias checking for example.

Reviewers: rovka, efriedma, kristof.beyls

Reviewed By: efriedma

Subscribers: wuzish, kerbowa, MatzeB, arsenm, nemanjai, jvesely, nhaehnle, hiraditya, kbarton, javed.absar, asb, rbar, johnrusso, simoncook, sabuasal, niosHD, jrtc27, MaskRay, zzheng, edward-jones, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, jsji, Jim, lenary, s.egerton, pzheng, sameer.abuasal, apazos, luismarques, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D72758
2020-02-18 15:53:29 +00:00
Sumanth Gundapaneni
7c7e368a7f [Pipeliner] Fix an assertion caused by iterator invalidation. 2019-11-14 13:08:06 -06:00
James Molloy
9026518e73 [ModuloSchedule] Peel out prologs and epilogs, generate actual code
Summary:
This extends the PeelingModuloScheduleExpander to generate prolog and epilog code,
and correctly stitch uses through the prolog, kernel, epilog DAG.

The key concept in this patch is to ensure that all transforms are *local*; only a
function of a block and its immediate predecessor and successor. By defining the problem in this way
we can inductively rewrite the entire DAG using only local knowledge that is easy to
reason about.

For example, we assume that all prologs and epilogs are near-perfect clones of the
steady-state kernel. This means that if a block has an instruction that is predicated out,
we can redirect all users of that instruction to that equivalent instruction in our
immediate predecessor. As all blocks are clones, every instruction must have an equivalent in
every other block.

Similarly we can make the assumption by construction that if a value defined in a block is used
outside that block, the only possible user is its immediate successors. We maintain this
even for values that are used outside the loop by creating a limited form of LCSSA.

This code isn't small, but it isn't complex.

Enabled a bunch of testing from Hexagon. There are a couple of tests not enabled yet;
I'm about 80% sure there isn't buggy codegen but the tests are checking for patterns
that we don't produce. Those still need a bit more investigation. In the meantime we
(Google) are happy with the code produced by this on our downstream SMS implementation,
and believe it generates correct code.

Subscribers: mgorny, hiraditya, jsji, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D68205

llvm-svn: 373462
2019-10-02 12:46:44 +00:00
Changpeng Fang
f5524f0451 Remove the AliasAnalysis argument in function areMemAccessesTriviallyDisjoint
Reviewers:
  arsenm

Differential Revision:
  https://reviews.llvm.org/D58360

llvm-svn: 373024
2019-09-26 22:53:44 +00:00
James Molloy
8a74eca398 [MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
Recommit: fix asan errors.

The way MachinePipeliner uses these target hooks is stateful - we reduce trip
count by one per call to reduceLoopCount. It's a little overfit for hardware
loops, where we don't have to worry about stitching a loop induction variable
across prologs and epilogs (the induction variable is implicit).

This patch introduces a new API:

  /// Analyze loop L, which must be a single-basic-block loop, and if the
  /// conditions can be understood enough produce a PipelinerLoopInfo object.
  virtual std::unique_ptr<PipelinerLoopInfo>
  analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const;

The return value is expected to be an implementation of the abstract class:

  /// Object returned by analyzeLoopForPipelining. Allows software pipelining
  /// implementations to query attributes of the loop being pipelined.
  class PipelinerLoopInfo {
  public:
    virtual ~PipelinerLoopInfo();
    /// Return true if the given instruction should not be pipelined and should
    /// be ignored. An example could be a loop comparison, or induction variable
    /// update with no users being pipelined.
    virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0;

    /// Create a condition to determine if the trip count of the loop is greater
    /// than TC.
    ///
    /// If the trip count is statically known to be greater than TC, return
    /// true. If the trip count is statically known to be not greater than TC,
    /// return false. Otherwise return nullopt and fill out Cond with the test
    /// condition.
    virtual Optional<bool>
    createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB,
                                 SmallVectorImpl<MachineOperand> &Cond) = 0;

    /// Modify the loop such that the trip count is
    /// OriginalTC + TripCountAdjust.
    virtual void adjustTripCount(int TripCountAdjust) = 0;

    /// Called when the loop's preheader has been modified to NewPreheader.
    virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0;

    /// Called when the loop is being removed.
    virtual void disposed() = 0;
  };

The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while
allowing the target to hold its own state across all calls. This API, in
particular the disjunction of creating a trip count check condition and
adjusting the loop, improves the code quality in ModuloSchedule.cpp.

llvm-svn: 372463
2019-09-21 08:19:41 +00:00
Mitch Phillips
72a3d8597d Revert "[MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount"
This commit broke the ASan buildbot. See comments in rL372376 for more
information.

This reverts commit 15e27b0b6d9d51362fad85dbe95ac5b3fadf0a06.

llvm-svn: 372425
2019-09-20 20:25:16 +00:00
James Molloy
15e27b0b6d [MachinePipeliner] Improve the TargetInstrInfo API analyzeLoop/reduceLoopCount
The way MachinePipeliner uses these target hooks is stateful - we reduce trip
count by one per call to reduceLoopCount. It's a little overfit for hardware
loops, where we don't have to worry about stitching a loop induction variable
across prologs and epilogs (the induction variable is implicit).

This patch introduces a new API:

  /// Analyze loop L, which must be a single-basic-block loop, and if the
  /// conditions can be understood enough produce a PipelinerLoopInfo object.
  virtual std::unique_ptr<PipelinerLoopInfo>
  analyzeLoopForPipelining(MachineBasicBlock *LoopBB) const;

The return value is expected to be an implementation of the abstract class:

  /// Object returned by analyzeLoopForPipelining. Allows software pipelining
  /// implementations to query attributes of the loop being pipelined.
  class PipelinerLoopInfo {
  public:
    virtual ~PipelinerLoopInfo();
    /// Return true if the given instruction should not be pipelined and should
    /// be ignored. An example could be a loop comparison, or induction variable
    /// update with no users being pipelined.
    virtual bool shouldIgnoreForPipelining(const MachineInstr *MI) const = 0;

    /// Create a condition to determine if the trip count of the loop is greater
    /// than TC.
    ///
    /// If the trip count is statically known to be greater than TC, return
    /// true. If the trip count is statically known to be not greater than TC,
    /// return false. Otherwise return nullopt and fill out Cond with the test
    /// condition.
    virtual Optional<bool>
    createTripCountGreaterCondition(int TC, MachineBasicBlock &MBB,
                                 SmallVectorImpl<MachineOperand> &Cond) = 0;

    /// Modify the loop such that the trip count is
    /// OriginalTC + TripCountAdjust.
    virtual void adjustTripCount(int TripCountAdjust) = 0;

    /// Called when the loop's preheader has been modified to NewPreheader.
    virtual void setPreheader(MachineBasicBlock *NewPreheader) = 0;

    /// Called when the loop is being removed.
    virtual void disposed() = 0;
  };

The Pipeliner (ModuloSchedule.cpp) can use this object to modify the loop while
allowing the target to hold its own state across all calls. This API, in
particular the disjunction of creating a trip count check condition and
adjusting the loop, improves the code quality in ModuloSchedule.cpp.

llvm-svn: 372376
2019-09-20 08:57:46 +00:00
James Molloy
fef9f59055 [ModuloSchedule] Introduce PeelingModuloScheduleExpander
This is the beginnings of a reimplementation of ModuloScheduleExpander. It works
by generating a single-block correct pipelined kernel and then peeling out the
prolog and epilogs.

This patch implements kernel generation as well as a validator that will
confirm the number of phis added is the same as the ModuloScheduleExpander.

Prolog and epilog peeling will come in a different patch.

Differential Revision: https://reviews.llvm.org/D67081

llvm-svn: 370893
2019-09-04 12:54:24 +00:00