21 Commits

Author SHA1 Message Date
David Green
9872551ca0 [ARM] Skip debug during vpt block creation
Debug info is currently preventing VPT block creation, leading to
different codegen. This patch attempts to skip any debug instructions
during vpt block creation, making sure they do not interfere.

Differential Revision: https://reviews.llvm.org/D103610
2021-06-10 14:49:04 +01:00
David Green
91fb9eac0b [ARM] Remove dead instructions before creating VPT block bundles
We remove VPNOT instructions in VPT blocks as we create them, turning
them into else predicates. We don't remove the dead instructions until
after the block has been created though. Because the VPNOT will have
killed the vpr register it used, this makes finalizeBundle add internal
flags to the vpr uses of any instructions after the VPNOT. These
incorrect flags can then confuse what is alive and what is not, leading
to machine verifier problems.

This patch removes them earlier instead, before the bundle is finalized
so that kill flags remain valid.

Differential Revision: https://reviews.llvm.org/D92227
2020-12-08 14:05:07 +00:00
David Green
c8cd7e2bbf [ARM] Remove MI variable aliasing. NFC
This was accidentally using the same name for two different variables in
the same line. Whilst it seems to work for some compilers, others have
trouble and it is probably not a fantastic idea.
2020-11-09 18:18:43 +00:00
David Green
a0a9e1c798 [ARM] Remove kill flags between VCMP and insertion point
When we fold a VCMP into a VPST instruction any kill flags between the
old VCMP position and the new insertion point need to be removed, in
order to keep the verifier happy.

Differential Revision: https://reviews.llvm.org/D90964
2020-11-09 13:17:53 +00:00
Pierre-vh
24bf8063d6 [Target][ARM] Replace outdated getARMVPTBlockMask function
getARMVPTBlockMask was an outdated function that only handled basic
block masks: T, TT, TTT and TTTT. This worked fine before the MVE
VPT Block Insertion Pass improvements as it was the only kind of
masks that it could generate, but now it can generate more complex
masks that uses E predicates, so it's dangerous to use that function
to calculate VPT/VPST block masks.

I replaced it with 2 different functions:
  - expandPredBlockMask, in ARMBaseInfo. This adds an "E" or "T" at
    the end of an existing PredBlockMask.
  - recomputeVPTBlockMask, in Thumb2InstrInfo. This takes an iterator
    to a VPT/VPST instruction and recomputes its block mask by looking
    at the predicated instructions that follows it. This should be
    used to recompute a block mask after removing/adding a predicated
    instruction to the block.

The expandPredBlockMask function is pretty much imported from the MVE
VPT Blocks pass.

I had to change the ARMLowOverheadLoops and MVEVPTBlocks passes as well
so they could use these new functions.

Differential Revision: https://reviews.llvm.org/D78201
2020-05-12 12:10:15 +01:00
Pierre-vh
13eb890139 [Target][ARM] Fix VPT Block Pass miscompilation
The pass was incorrectly reverting back to a "T" when something wrote
to VPR inside a "E" block. This is not the correct behaviour, the
predicate should stay the same.

Differential Revision: https://reviews.llvm.org/D77798
2020-04-14 15:16:27 +01:00
Matt Arsenault
6011627f51 CodeGen: More conversions to use Register 2020-04-07 18:54:36 -04:00
Benjamin Kramer
b605c56b0f [ARM] Silence warning in Release builds
llvm/lib/Target/ARM/MVEVPTBlockPass.cpp:175:37: error: unused variable 'BlockBeg' [-Werror,-Wunused-variable]
  MachineBasicBlock::instr_iterator BlockBeg = Iter;
                                    ^
2020-04-01 15:29:19 +02:00
Pierre-vh
2effe8f5e7 [Target][ARM] Improvements to the VPT Block Insertion Pass
This allows the MVE VPT Block insertion pass to remove VPNOTs in
order to create more complex VPT blocks such as TE, TEET, TETE, etc.

Differential Revision: https://reviews.llvm.org/D75993
2020-04-01 12:34:20 +01:00
David Green
d50e188a07 Revert "[ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS"
This reverts commit e34801c8e6df and the followup due to multiple
problems.

I've tried to keep the tests and RDA parts where possible, as those
still seem useful.
2020-02-02 13:24:05 +00:00
Sjoerd Meijer
a08c0adee0 [ARM][MVE] VTP Block Pass fix
Fix a missing and broken test: 2 VPT blocks predicated on the same VCMP
instruction that can be folded. The problem was that for each VPT block, we
record the predicate statements with a list, but the same instruction was added
twice. Thus, we were running in an assert trying to remove the same instruction
twice. To avoid this the instructions are now recorded with a set.

Differential Revision: https://reviews.llvm.org/D72699
2020-01-14 16:10:55 +00:00
Sjoerd Meijer
e34801c8e6 [ARM][MVE] VPT Blocks: findVCMPToFoldIntoVPS
This is a recommit of D71330, but with a few things fixed and changed:

1) ReachingDefAnalysis: this was not running with optnone as it was checking
skipFunction(), which other analysis passes don't do. I guess this is a
copy-paste from a codegen pass.
2) VPTBlockPass: here I've added skipFunction(), because like most/all
optimisations, we don't want to run this with optnone.

This fixes the issues with the initial/previous commit: the VPTBlockPass was
running with optnone, but ReachingDefAnalysis wasn't, and so VPTBlockPass was
crashing querying ReachingDefAnalysis.

I've added test case mve-vpt-block-optnone.mir to check that we don't run
VPTBlock with optnone.

Differential Revision: https://reviews.llvm.org/D71470
2020-01-07 13:54:47 +00:00
Sam Parker
4042518335 [ARM][MVE] Tail predicate in the presence of vcmp
Record the discovered VPT blocks while checking for validity and, for
now, only handle blocks that begin with VPST and not VPT. We're now
allowing more than one instruction to define vpr, but each block must
somehow be predicated using the vctp. This leaves us with several
scenarios which need fixing up:
1) A VPT block with is only predicated by the vctp and has no
   internal vpr defs.
2) A VPT block which is only predicated by the vctp but has an
   internal vpr def.
3) A VPT block which is predicated upon the vctp as well as another
   vpr def.
4) A VPT block which is not predicated upon a vctp, but contains it
   and all instructions within the block are predicated upon in.

The changes needed are, for:
1) The easy one, just remove the vpst and unpredicate the
   instructions in the block.
2) Remove the vpst and unpredicate the instructions up to the
   internal vpr def. Need insert a new vpst to predicate the
   remaining instructions.
3) No nothing.
4) The vctp will be inside a vpt and the instruction will be removed,
   so adjust the size of the mask on the vpst.

Differential Revision: https://reviews.llvm.org/D71107
2019-12-20 08:42:11 +00:00
Sjoerd Meijer
049f9672d8 [ARM] Move MVE opcode helper functions to ARMBaseInstrInfo. NFC.
In ARMLowOverheadLoops.cpp, MVETailPredication.cpp, and MVEVPTBlock.cpp we have
quite a few helper functions all looking at the opcodes of MVE instructions.
This moves all these utility functions to ARMBaseInstrInfo.

Diferential Revision: https://reviews.llvm.org/D71426
2019-12-16 09:13:59 +00:00
Sjoerd Meijer
e91420e17d Revert "[ARM][MVE] findVCMPToFoldIntoVPS. NFC."
This reverts commit 9468e3334ba54fbb1b209aaec662d7375451fa1f.

There's a test that doesn't like this change. The RDA analysis
gets invalided by changes in the block, which is not taken into
account. Revert while I work on a fix for this.
2019-12-13 11:56:44 +00:00
Sjoerd Meijer
9468e3334b [ARM][MVE] findVCMPToFoldIntoVPS. NFC.
This adds ReachingDefAnalysis (RDA) to the VPTBlock pass, so that we can
reimplement findVCMPToFoldIntoVPS with just a few calls to RDA.

Differential Revision: https://reviews.llvm.org/D71330
2019-12-12 15:41:20 +00:00
Reid Kleckner
904cd3e06b Prune a LegacyDivergenceAnalysis and MachineLoopInfo include each
Now X86ISelLowering doesn't depend on many IR analyses.

llvm-svn: 375320
2019-10-19 01:31:09 +00:00
Benjamin Kramer
df4b9a3f4f Hide implementation details in namespaces.
llvm-svn: 372113
2019-09-17 12:56:29 +00:00
David Green
ce7328cb61 [ARM] Fold VCMP into VPT
MVE has VPT instructions, which perform the duties of both a VCMP and a VPST in
a single instruction, performing the compare and starting the VPT block in one.
This teaches the MVEVPTBlockPass to fold them, searching back through the
basicblock for a valid VCMP and creating the VPT from its operands.

There are some changes to the VPT instructions to accommodate this, altering
the order of the operands to match the VCMP better, and changing P0 register
defs to be VPR defs, as is used in other places.

Differential Revision: https://reviews.llvm.org/D66577

llvm-svn: 371982
2019-09-16 13:02:41 +00:00
David Green
83a3341246 [ARM] Fixup the creation of VPT blocks
This attempts to just fix the creation of VPT blocks, fixing up the iterating,
which instructions are considered in the bundle, and making sure that we do not
overrun the end of the block.

Differential Revision: https://reviews.llvm.org/D67219

llvm-svn: 371064
2019-09-05 13:37:04 +00:00
David Green
379f6186dd [ARM] Move MVEVPTBlockPass to a separate file. NFC
This just pulls the MVEVPTBlockPass into a separate file, as opposed to being
wrapped up in Thumb2ITBlockPass.

Differential revision: https://reviews.llvm.org/D66579

llvm-svn: 370187
2019-08-28 11:37:31 +00:00