27490 Commits

Author SHA1 Message Date
Thomas Raoux
caabb713ea [ModuloSchedule] Fix data types in ModuloScheduleExpander::isLoopCarried
The cycle values in modulo scheduling results can be negative.
The result of ModuloSchedule::getCycle() must be received as an int type.

Patch by Masaki Arai!

Differential Revision: https://reviews.llvm.org/D71122
2019-12-09 07:37:00 -08:00
Jeremy Morse
00e238896c [DebugInfo] Nerf placeDbgValues, with prejudice
CodeGenPrepare::placeDebugValues moves variable location intrinsics to be
immediately after the Value they refer to. This makes tracking of locations
very easy; but it changes the order in which assignments appear to the
debugger, from the source programs order to the order in which the
optimised program computes values. This then leads to PR43986 and PR38754,
where variable locations that were in a conditional block are made
unconditional, which is highly misleading.

This patch adjusts placeDbgValues to only re-order variable location
intrinsics if they use a Value before it is defined, significantly reducing
the damage that it does. This is still not 100% safe, but the rest of
CodeGenPrepare needs polishing to correctly update debug info when
optimisations are performed to fully fix this.

This will probably break downstream debuginfo tests -- if the
instruction-stream position of variable location changes isn't the focus of
the test, an easy fix should be to manually apply placeDbgValues' behaviour
to the failing tests, moving dbg.value intrinsics next to SSA variable
definitions thus:

  %foo = inst1
  %bar = ...
  %baz = ...
  void call @llvm.dbg.value(metadata i32 %foo, ...

to

  %foo = inst1
  void call @llvm.dbg.value(metadata i32 %foo, ...
  %bar = ...
  %baz = ...

This should return your test to exercising whatever it was testing before.

Differential Revision: https://reviews.llvm.org/D58453
2019-12-09 12:52:10 +00:00
David Stenberg
6965f835b4 [DebugInfo] Make describeLoadedValue() reg aware
Summary:
Currently the describeLoadedValue() hook is assumed to describe the
value of the instruction's first explicit define. The hook will not be
called for instructions with more than one explicit define.

This commit adds a register parameter to the describeLoadedValue() hook,
and invokes the hook for all registers in the worklist.

This will allow us to for example describe instructions which produce
more than two parameters' values; e.g. Hexagon's various combine
instructions.

This also fixes situations in our downstream target where we may pass
smaller parameters in the high part of a register. If such a parameter's
value is produced by a larger copy instruction, we can't describe the
call site value using the super-register, and we instead need to know
which sub-register that should be used.

This also allows us to handle cases like this:

  $ebx = [...]
  $rdi = MOVSX64rr32 $ebx
  $esi = MOV32rr $edi
  CALL64pcrel32 @call

The hook will first be invoked for the MOV32rr instruction, which will
say that @call's second parameter (passed in $esi) is described by $edi.
As $edi is not preserved it will be added to the worklist. When we get
to the MOVSX64rr32 instruction, we need to describe two values; the
sign-extended value of $ebx -> $rdi for the first parameter, and $ebx ->
$edi for the second parameter, which is now possible.

This commit modifies the dbgcall-site-lea-interpretation.mir test case.
In the test case, the values of some 32-bit parameters were produced
with LEA64r. Perhaps we can in general cases handle such by emitting
expressions that AND out the lower 32-bits, but I have not been able to
land in a case where a LEA64r is used for a 32-bit parameter instead of
LEA64_32 from C code.

I have not found a case where it would be useful to describe parameters
using implicit defines, so in this patch the hook is still only invoked
for explicit defines of forwarding registers.

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: djtodoro, vsk

Subscribers: ormris, hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D70431
2019-12-09 10:47:49 +01:00
David Stenberg
f3696533f2 Revert "[DebugInfo] Make describeLoadedValue() reg aware"
This reverts commit 3cd93a4efcdeabeb20cb7bec9fbddcb540d337a1.
I'll recommit with a well-formatted arcanist commit message.
2019-12-09 10:45:13 +01:00
David Stenberg
3cd93a4efc [DebugInfo] Make describeLoadedValue() reg aware
Currently the describeLoadedValue() hook is assumed to describe the
value of the instruction's first explicit define. The hook will not be
called for instructions with more than one explicit define.

This commit adds a register parameter to the describeLoadedValue() hook,
and invokes the hook for all registers in the worklist.

This will allow us to for example describe instructions which produce
more than two parameters' values; e.g. Hexagon's various combine
instructions.

This also fixes a case in our downstream target where we may pass
smaller parameters in the high part of a register. If such a parameter's
value is produced by a larger copy instruction, we can't describe the
call site value using the super-register, and we instead need to know
which sub-register that should be used.

This also allows us to handle cases like this:

  $ebx = [...]
  $rdi = MOVSX64rr32 $ebx
  $esi = MOV32rr $edi
  CALL64pcrel32 @call

The hook will first be invoked for the MOV32rr instruction, which will
say that @call's second parameter (passed in $esi) is described by $edi.
As $edi is not preserved it will be added to the worklist. When we get
to the MOVSX64rr32 instruction, we need to describe two values; the
sign-extended value of $ebx -> $rdi for the first parameter, and $ebx ->
$edi for the second parameter, which is now possible.

This commit modifies the dbgcall-site-lea-interpretation.mir test case.
In the test case, the values of some 32-bit parameters were produced
with LEA64r. Perhaps we can in general cases handle such by emitting
expressions that AND out the lower 32-bits, but I have not been able to
land in a case where a LEA64r is used for a 32-bit parameter instead of
LEA64_32 from C code.

I have not found a case where it would be useful to describe parameters
using implicit defines, so in this patch the hook is still only invoked
for explicit defines of forwarding registers.
2019-12-09 10:44:17 +01:00
Hans Wennborg
a38396939c Revert 393dacacf7e7 "[ARM] Enable TypePromotion by default"
This caused "Too many bits for uint64_t" asserts when building Chromium. See
https://crbug.com/1031978#c2 for a reproducer. I'll follow up on the
llvm-commits thread with a creduced version.

> ARMCodeGenPrepare has already been generalized and renamed to
> TypePromotion. We've had it enabled and tested downstream for a
> while, so enable it by default.
>
> Differential Revision: https://reviews.llvm.org/D70998
2019-12-09 09:39:31 +01:00
rollrat
9fdb7ac503 [NFC][LivePhysRegs] Fix incorrect comment
Reviewers: #llvm, tellenbach

Reviewed By: tellenbach

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71051

Patch by: rollrat <rollrat.cse@gmail.com>
2019-12-08 21:07:28 +01:00
Ulrich Weigand
9db13b5a7d [FPEnv] Constrained FCmp intrinsics
This adds support for constrained floating-point comparison intrinsics.

Specifically, we add:

      declare <ty2>
      @llvm.experimental.constrained.fcmp(<type> <op1>, <type> <op2>,
                                          metadata <condition code>,
                                          metadata <exception behavior>)
      declare <ty2>
      @llvm.experimental.constrained.fcmps(<type> <op1>, <type> <op2>,
                                           metadata <condition code>,
                                           metadata <exception behavior>)

The first variant implements an IEEE "quiet" comparison (i.e. we only
get an invalid FP exception if either argument is a SNaN), while the
second variant implements an IEEE "signaling" comparison (i.e. we get
an invalid FP exception if either argument is any NaN).

The condition code is implemented as a metadata string.  The same set
of predicates as for the fcmp instruction is supported (except for the
"true" and "false" predicates).

These new intrinsics are mapped by SelectionDAG codegen onto two new
ISD opcodes, ISD::STRICT_FSETCC and ISD::STRICT_FSETCCS, again
representing quiet vs. signaling comparison operations.  Otherwise
those nodes look like SETCC nodes, with an additional chain argument
and result as usual for strict FP nodes.  The patch includes support
for the common legalization operations for those nodes.

The patch also includes full SystemZ back-end support for the new
ISD nodes, mapping them to all available SystemZ instruction to
fully implement strict semantics (scalar and vector).

Differential Revision: https://reviews.llvm.org/D69281
2019-12-07 11:28:39 +01:00
Craig Topper
28b573d249 [TargetLowering] Fix another potential FPE in expandFP_TO_UINT
D53794 introduced code to perform the FP_TO_UINT expansion via FP_TO_SINT in a way that would never expose floating-point exceptions in the intermediate steps. Unfortunately, I just noticed there is still a way this can happen. As discussed in D53794, the compiler now generates this sequence:

// Sel = Src < 0x8000000000000000
// Val = select Sel, Src, Src - 0x8000000000000000
// Ofs = select Sel, 0, 0x8000000000000000
// Result = fp_to_sint(Val) ^ Ofs
The problem is with the Src - 0x8000000000000000 expression. As I mentioned in the original review, that expression can never overflow or underflow if the original value is in range for FP_TO_UINT. But I missed that we can get an Inexact exception in the case where Src is a very small positive value. (In this case the result of the sub is ignored, but that doesn't help.)

Instead, I'd suggest to use the following sequence:

// Sel = Src < 0x8000000000000000
// FltOfs = select Sel, 0, 0x8000000000000000
// IntOfs = select Sel, 0, 0x8000000000000000
// Result = fp_to_sint(Val - FltOfs) ^ IntOfs
In the case where the value is already in range of FP_TO_SINT, we now simply compute Val - 0, which now definitely cannot trap (unless Val is a NaN in which case we'd want to trap anyway).

In the case where the value is not in range of FP_TO_SINT, but still in range of FP_TO_UINT, the sub can never be inexact, as Val is between 2^(n-1) and (2^n)-1, i.e. always has the 2^(n-1) bit set, and the sub is always simply clearing that bit.

There is a slight complication in the case where Val is a constant, so we know at compile time whether Sel is true or false. In that scenario, the old code would automatically optimize the sub away, while this no longer happens with the new code. Instead, I've added extra code to check for this case and then just fall back to FP_TO_SINT directly. (This seems to catch even slightly more cases.)

Original version of the patch by Ulrich Weigand. X86 changes added by Craig Topper

Differential Revision: https://reviews.llvm.org/D67105
2019-12-06 14:11:04 -08:00
Hiroshi Yamauchi
2eb30fafa5 Revert "[PGO][PGSO] Instrument the code gen / target passes."
This reverts commit 9a0b5e14075a1f42a72eedb66fd4fde7985d37ac.

This seems to break buildbots.
2019-12-06 12:17:32 -08:00
Hiroshi Yamauchi
9a0b5e1407 [PGO][PGSO] Instrument the code gen / target passes.
Summary:
Split off of D67120.

Add the profile guided size optimization instrumentation / queries in the code
gen or target passes. This doesn't enable the size optimizations in those passes
yet as they are currently disabled in shouldOptimizeForSize (for non-IR pass
queries).

Reviewers: davidxl

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71072
2019-12-06 10:43:39 -08:00
Guozhi Wei
72942459d0 [MBP] Avoid tail duplication if it can't bring benefit
Current tail duplication integrated in bb layout is designed to increase the fallthrough from a BB's predecessor to its successor, but we have observed cases that duplication doesn't increase fallthrough, or it brings too much size overhead.

To overcome these two issues in function canTailDuplicateUnplacedPreds I add two checks:

  make sure there is at least one duplication in current work set.
  the number of duplication should not exceed the number of successors.

The modification in hasBetterLayoutPredecessor fixes a bug that potential predecessor must be at the bottom of a chain.

Differential Revision: https://reviews.llvm.org/D64376
2019-12-06 09:53:53 -08:00
John Brawn
984f1bb3e7 [LegalizeTypes] Add missing case for STRICT_FP_ROUND softening
This fixes a test failure in test/CodeGen/ARM/fp-intrinsics.ll.
2019-12-06 15:54:27 +00:00
Jeremy Morse
c93a9b15ce [DebugInfo][CGP] Update dbg.values when sinking address computations
One of CodeGenPrepare's optimizations is to duplicate address calculations
into basic blocks, so that as much information as possible can be folded
into memory addressing operands. This is great -- but the dbg.value
variable location intrinsics are not updated in the same way. This can lead
to dbg.values referring to address computations in other blocks that will
never be encoded into the DAG, while duplicate address computations are
performed locally that could be used by the dbg.value. Some of these (such
as non-constant-offset GEPs) can't be salvaged past.

Fix this by, whenever we duplicate an address computation into a block,
looking for dbg.value users of the original memory address in the same
block, and redirecting those to the local computation.

Differential Revision: https://reviews.llvm.org/D58403
2019-12-06 11:27:19 +00:00
Ulrich Weigand
daee549b17 [FPEnv][SelectionDAG] Relax chain requirements
This patch implements the following changes:

1) SelectionDAGBuilder::visitConstrainedFPIntrinsic currently treats
each constrained intrinsic like a global barrier (e.g. a function call)
and fully serializes all pending chains. This is actually not required;
it is allowed for constrained intrinsics to be reordered w.r.t one
another or (nonvolatile) memory accesses. The MI-level scheduler already
allows for that flexibility, so it makes sense to allow it at the DAG
level as well.

This patch therefore changes the way chains for constrained intrisincs
are created, and handles them basically like load operations are handled.
This has the effect that constrained intrinsics are no longer serialized
against one another or (nonvolatile) loads. They are still serialized
against stores, but that seems hard to change with the current DAG chain
setup, and it also doesn't seem to be a big problem preventing DAG

2) The OPC_CheckFoldableChainNode check requires that each of the
intermediate nodes in a multi-node pattern match only has a single use.
This check tends to fail if those intermediate nodes are strict operations
as those have a chain output that typically indeed has another use.
However, we don't really need to consider chains here at all, since they
will all be rewritten anyway by UpdateChains later. Other parts of the
matcher therefore already ignore chains, but this hasOneUse check doesn't.

This patch replaces hasOneUse by a custom test that verifies there is no
more than one use of any non-chain output value.

In theory, this change could affect code unrelated to strict FP nodes,
but at least on SystemZ I could not find any single instance of that
happening

3) The SystemZ back-end currently does not allow matching multiply-and-
extend operations (32x32 -> 64bit or 64x64 -> 128bit FP multiply) for
strict FP operations.  This was not possible in the past due to the
problems described under 1) and 2) above.

With those issues fixed, it is now possible to fully support those
instructions in strict mode as well, and this patch does so.

Differential Revision: https://reviews.llvm.org/D70913
2019-12-06 11:02:11 +01:00
Alexey Lapshin
9e8c799e2b [Dsymutil][NFC] Move NonRelocatableStringpool into common CodeGen folder.
That refactoring moves NonRelocatableStringpool into common CodeGen folder.
So that NonRelocatableStringpool could be used not only inside dsymutil.

Differential Revision: https://reviews.llvm.org/D71068
2019-12-06 10:02:27 +03:00
David Blaikie
560ab1f8d3 DebugInfo: Pull out a common expression.
This is for the case where -gmlt -gsplit-dwarf -fsplit-dwarf-inlining
are used together in some but not all units during LTO (or, in the
reduced case, even without LTO) - ensuring that no split dwarf is used
(because split-dwarf-inlining puts the same data in the .o file, so
there's no need to duplicate it into the .dwo file)
2019-12-05 19:51:30 -08:00
Quentin Colombet
2ec71ea7c7 [RegisterCoalescer] Fix the creation of subranges when rematerialization is used
* Context *

During register coalescing, we use rematerialization when coalescing is not
possible. That means we may rematerialize a super register when only a smaller
register is actually used.
E.g.,
0B v1 = ldimm 0xFF
1B v2 = COPY v1.low8bits
2B   = v2
=>
0B v1 = ldimm 0xFF
1B v2 = ldimm 0xFF
2B   = v2.low8bits

Where xB are the slot indexes.
Here v2 grew from a 8-bit register to a 16-bit register.

When that happens and subregister liveness is enabled, we create subranges for
the newly created value.
E.g., before remat, the live range of v2 looked like:
main range: [1r, 2r)
(Reads v2 is defined at index 1 slot register and used before the slot register
of index 2)

After remat, it should look like:
main range: [1r, 2r)
low 8 bits: [1r, 2r)
high 8 bits: [1r, 1d) <-- dead def

I.e., the unsused lanes of v2 should be marked as dead definition.

* The Problem *

Prior to this patch, the live-ranges from the previous exampel, would have the
full live-range for all subranges:
main range: [1r, 2r)
low 8 bits: [1r, 2r)
high 8 bits: [1r, 2r) <-- too long

* The Fix *

Technically, the code that this patch changes is not wrong:
When we create the subranges for the newly rematerialized value, we create only
one subrange for the whole bit mask.
In other words, at this point v2 live-range looks like this:
main range: [1r, 2r)
low & high: [1r, 2r)

Then, it gets wrong when we call LiveInterval::refineSubRanges on low 8 bits:
main range: [1r, 2r)
low 8 bits: [1r, 2r)
high 8 bits: [1r, 2r) <-- too long

Ideally, we would like LiveInterval::refineSubRanges to be able to do the right
thing and mark the dead lanes as such. However, this is not possible, because by
the time we update / refine the live ranges, the IR hasn't been updated yet,
therefore we actually don't have enough information to do the right thing.

Another option to fix the problem would have been to call
LiveIntervals::shrinkToUses after the IR is updated. This is not desirable as
this may have a noticeable impact on compile time.

Instead, what this patch does is when we create the subranges for the
rematerialized value, we explicitly create one subrange for the lanes that were
used before rematerialization and one for the lanes that were not used. The used
one inherits the live range of the main range and the unused one is just created
empty. The existing rematerialization code then detects that the unused one are
not live and it correctly sets dead def intervals for them.

https://llvm.org/PR41372
2019-12-05 16:32:30 -08:00
David Blaikie
decee04e63 DebugInfo: Fix LTO+DWARFv5 loclists
The loclists_table_base was being overwritten for each CU even though
only one loclists contribution is made so everything but the last CU
would have a label that was never defined and fail to assemble.
2019-12-05 12:47:54 -08:00
Volkan Keles
bfa3d260b8 [GlobalISel] Localizer: Allow targets not to run the pass conditionally
Summary:
Previously, it was not possible to skip running the localizer pass
conditionally. This patch adds an input function to the pass which
decides if the pass should run on the given MachineFunction or not.

No test case as there is no upstream target needs this functionality.

Reviewers: qcolombet

Reviewed By: qcolombet

Subscribers: rovka, hiraditya, Petar.Avramovic, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D71038
2019-12-05 11:09:50 -08:00
Jeremy Morse
30e8f80fd5 [DebugInfo] Don't create multiple DBG_VALUEs when sinking
This patch addresses a performance problem reported in PR43855, and
present in the reapplication in in 001574938e5. It turns out that
MachineSink will (often) move instructions to the first block that
post-dominates the current block, and then try to sink further. This
means if we have a lot of conditionals, we can needlessly create large
numbers of DBG_VALUEs, one in each block the sunk instruction passes
through.

To fix this, rather than immediately sinking DBG_VALUEs, record them in
a pass structure. When sinking is complete and instructions won't be
sunk any further, new DBG_VALUEs are added, avoiding lots of
intermediate DBG_VALUE $noregs being created.

Differential revision: https://reviews.llvm.org/D70676
2019-12-05 15:52:20 +00:00
Jeremy Morse
e4cdd62631 [DebugInfo] Don't reorder DBG_VALUEs when sunk
Fix part of PR43855, resolving a problem that comes from the reapplication
in 001574938e5. If we have two DBG_VALUE insts in a block that specify
the location of the same variable, for example:

   %0 = someinst
   DBG_VALUE %0, !123, !DIExpression()
   %1 = anotherinst
   DBG_VALUE %1, !123, !DIExpression()

if %0 were to sink, the corresponding DBG_VALUE would sink too, past the
next DBG_VALUE, effectively re-ordering assignments. To fix this, I've
added a SeenDbgVars set recording what variable locations have been seen in
a block already (working bottom up), and now flag DBG_VALUEs that would
pass a later DBG_VALUE for the same variable.

NB, this only works for repeated DBG_VALUEs in the same basic block, the
general case involving control flow is much harder, which I've written
up in PR44117.

Differential revision: https://reviews.llvm.org/D70672
2019-12-05 15:52:20 +00:00
Jeremy Morse
fca4100196 [DebugInfo] Re-apply two patches to MachineSink
These were:
 * D58386 / f5e1b718a67 / reverted in d382a8a768b
 * D58238 / ee50590e168 / reverted in a8db456b53a

Of which the latter has a performance regression tracked in PR43855,
fixed by D70672 / D70676, which will be committed atomically with this
reapplication.

Contains a minor difference to account for a change in the IsCopyInstr
signature.
2019-12-05 15:52:20 +00:00
Sam Parker
393dacacf7 [ARM] Enable TypePromotion by default
ARMCodeGenPrepare has already been generalized and renamed to
TypePromotion. We've had it enabled and tested downstream for a
while, so enable it by default.

Differential Revision: https://reviews.llvm.org/D70998
2019-12-05 14:21:11 +00:00
Djordje Todorovic
52b231ee84 [LiveDebugValues] Silence the unused var warning; NFC 2019-12-05 12:32:14 +01:00
David Stenberg
54682d871d [DebugInfo] Handle call site values for instructions before call bundle
Summary:
If a call is bundled then the code that looks for instructions that
produce parameter values would break when reaching the call's bundle
header, due to the `ifCall(/*AnyInBundle*/)` invocation returning true.

It is not enough to simply ignore bundle headers in the `isCall()`
invocation, as the bundle header may have defines of parameter registers
due to the call, meaning that such registers would incorrectly be
removed from the worklist. Therefore, do not look at bundle headers at
all.

Reviewers: djtodoro, NikolaPrica, aprantl, vsk

Reviewed By: aprantl, vsk

Subscribers: hiraditya, llvm-commits

Tags: #debug-info, #llvm

Differential Revision: https://reviews.llvm.org/D71024
2019-12-05 11:50:41 +01:00
Djordje Todorovic
4b4ede440a Reland "[LiveDebugValues] Introduce entry values of unmodified params"
Relanding this after resolving the cause of the test failure.
2019-12-05 11:10:49 +01:00
Florian Hahn
76a5c8421e [MCRegInfo] Add forward sub and super register iterators. (NFC)
This patch adds forward iterators mc_difflist_iterator,
mc_subreg_iterator and mc_superreg_iterator, based on the existing
DiffListIterator. Those are used to provide iterator ranges over
sub- and super-register from TRI, which are slightly more convenient
than the existing MCSubRegIterator/MCSuperRegIterator. Unfortunately,
it duplicates a bit of functionality, but the new iterators are a bit
more convenient (and can be used with various existing iterator
utilities)  and should probably replace the old iterators in the future.

This patch updates some existing users.

Reviewers: evandro, qcolombet, paquette, MatzeB, arsenm

Reviewed By: qcolombet

Differential Revision: https://reviews.llvm.org/D70565
2019-12-05 09:29:26 +00:00
Florian Hahn
1b81964586 [MIBundle] Turn MachineOperandIteratorBase into a forward iterator.
This patch turns MachineOperandIteratorBase into a regular forward
iterator, which can be used with iterator_range.

It also adds mi_bundle_ops and const_mi_bundle_ops that return iterator
ranges over all operands in a bundle and updates a use of the old
iterator.

Reviewers: evandro, t.p.northover, paquette, MatzeB, arsenm

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D70561
2019-12-05 09:06:22 +00:00
Kai Luo
b200c5180e Reland [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation.
Fix assertion error
```
bool llvm::MachineOperand::isRenamable() const: Assertion `Register::isPhysicalRegister(getReg()) && "isRenamable should only be checked on physical registers"' failed.
```
by checking if the register is 0 before invoking `isRenamable`.
2019-12-05 14:32:11 +08:00
Kai Luo
3882edbe19 Revert "[MachineCopyPropagation] Extend MCP to do trivial copy backward propagation"
This reverts commit 75b3a1c318ccad0f96c38689279bc5db63e2ad05, since it
breaks bootstrap build.
2019-12-05 12:48:37 +08:00
Kai Luo
75b3a1c318 [MachineCopyPropagation] Extend MCP to do trivial copy backward propagation
Summary:
This patch mainly do such transformation
```
$R0 = OP ...
... // No read/clobber of $R0 and $R1
$R1 = COPY $R0 // $R0 is killed
```
Replace $R0 with $R1 and remove the COPY, we have
```
$R1 = OP ...
```
This transformation can also expose more opportunities for existing
copy elimination in MCP.

Differential Revision: https://reviews.llvm.org/D67794
2019-12-05 10:59:07 +08:00
Amara Emerson
28f5ad5801 [GlobalISel] Fix compiler crash lowering G_LOAD in AArch64.
Patch by Daniel Rodríguez Troitiño.

Differential Revision: https://reviews.llvm.org/D70794
2019-12-04 17:04:54 -08:00
Puyan Lotfi
fdc6f4b97b [llvm] Fixing MIRVRegNamerUtils to properly handle 2+ MachineBasicBlocks.
An interplay of code from D70210, along with code from the
Value-Numbering-esque hash-based namer from D70210, as well as some
crusty code from the original MIR-Canon code lead to multiple causes of
failure when canonicalizing or renaming vregs for MIR with multiple
basic blocks. This patch fixes those issues while deleting some no
longer needed code and adding a nice diamond test case to boot.

Differential Revision: https://reviews.llvm.org/D70478
2019-12-04 18:36:08 -05:00
Alexey Lapshin
789e257ce0 [DWARF5][Debuginfo] Compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) do not match.
That patch fixes incompatible compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) error.

cat split-dwarf.cpp
int main()
{
  int a = 1;
  return 0;
}

clang++ -O -g -gsplit-dwarf -gdwarf-5 split-dwarf.cpp; llvm-dwarfdump --verify ./a.out | grep skeleton
error: Compilation unit type (DW_UT_skeleton) and root DIE (DW_TAG_compile_unit) do not match.

The fix is to change DW_TAG_compile_unit into DW_TAG_skeleton_unit when skeleton file is generated.

Differential Revision: https://reviews.llvm.org/D70880
2019-12-05 00:53:47 +03:00
Amy Huang
9e978bb01c Add support for lowering 32-bit/64-bit pointers
Summary:
This follows a previous patch that changes the X86 datalayout to represent
mixed size pointers (32-bit sext, 32-bit zext, and 64-bit) with address spaces
(https://reviews.llvm.org/D64931)

This patch implements the address space cast lowering to the corresponding
sign extension, zero extension, or truncate instructions.

Related to https://bugs.llvm.org/show_bug.cgi?id=42359

Reviewers: rnk, craig.topper, RKSimon

Subscribers: hiraditya, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D69639
2019-12-04 11:39:03 -08:00
Vedant Kumar
f208b70fbc Revert "[Coverage] Revise format to reduce binary size"
This reverts commit e18531595bba495946aa52c0a16b9f9238cff8bc.

On Windows, there is an error:

http://lab.llvm.org:8011/builders/sanitizer-windows/builds/54963/steps/stage%201%20check/logs/stdio

error: C:\b\slave\sanitizer-windows\build\stage1\projects\compiler-rt\test\profile\Profile-x86_64\Output\instrprof-merging.cpp.tmp.v1.o: Failed to load coverage: Malformed coverage data
2019-12-04 10:35:14 -08:00
Vedant Kumar
e18531595b [Coverage] Revise format to reduce binary size
Revise the coverage mapping format to reduce binary size by:

1. Naming function records and marking them `linkonce_odr`, and
2. Compressing filenames.

This shrinks the size of llc's coverage segment by 82% (334MB -> 62MB)
and speeds up end-to-end single-threaded report generation by 10%. For
reference the compressed name data in llc is 81MB (__llvm_prf_names).

Rationale for changes to the format:

- With the current format, most coverage function records are discarded.
  E.g., more than 97% of the records in llc are *duplicate* placeholders
  for functions visible-but-not-used in TUs. Placeholders *are* used to
  show under-covered functions, but duplicate placeholders waste space.

- We reached general consensus about giving (1) a try at the 2017 code
  coverage BoF [1]. The thinking was that using `linkonce_odr` to merge
  duplicates is simpler than alternatives like teaching build systems
  about a coverage-aware database/module/etc on the side.

- Revising the format is expensive due to the backwards compatibility
  requirement, so we might as well compress filenames while we're at it.
  This shrinks the encoded filenames in llc by 86% (12MB -> 1.6MB).

See CoverageMappingFormat.rst for the details on what exactly has
changed.

Fixes PR34533 [2], hopefully.

[1] http://lists.llvm.org/pipermail/llvm-dev/2017-October/118428.html
[2] https://bugs.llvm.org/show_bug.cgi?id=34533

Differential Revision: https://reviews.llvm.org/D69471
2019-12-04 10:10:55 -08:00
Cullen Rhodes
17e537bc58 [NFC] Use default case in EVT::getEVTString
Summary:
The default case handles the majority of MVTs so most of the individual
cases can be removed. Also added a case for floating point types.

Reviewed By: sdesmalen

Differential Revision: https://reviews.llvm.org/D70955
2019-12-04 11:06:49 +00:00
Ulrich Weigand
c3d05c1b52 [SelectionDAG] Expand nnan FMINNUM/FMAXNUM to select sequence
InstCombine may synthesize FMINNUM/FMAXNUM nodes from fcmp+select
sequences (where the fcmp is marked nnan).  Currently, if the
target does not otherwise handle these nodes, they'll get expanded
to libcalls to fmin/fmax.  However, these functions may reside in
libm, which may introduce a library dependency that was not originally
present in the source code, potentially resulting in link failures.

To fix this problem, add code to TargetLowering::expandFMINNUM_FMAXNUM
to expand FMINNUM/FMAXNUM to a compare+select sequence instead of the
libcall. This is done only if the node is marked as "nnan"; in this case,
the expansion to compare+select is always correct. This also suffices to
catch all cases where FMINNUM/FMAXNUM was synthesized as above.

Differential Revision: https://reviews.llvm.org/D70965
2019-12-04 10:32:35 +01:00
QingShan Zhang
d84b320dfd [MacroFusion] Limit the max fused number as 2 to reduce the dependency
This is the example:

int foo(int a, int b, int c, int d) {
  return a + b + c + d;
}

And this is the Dependency Graph:
+------+       +------+       +------+       +------+
|  A   |       |  B   |       |  C   |       |  D   |
+--+--++       +---+--+       +--+---+       +--+---+
   ^  ^            ^  ^          ^              ^
   |  |            |  |          |              |
   |  |            |  |New1      +--------------+
   |  |            |  |          |
   |  |            |  |       +--+---+
   |  |New2        |  +-------+ ADD1 |
   |  |            |          +--+---+
   |  |            |    Fuse     ^
   |  |            +-------------+
   |  +------------+
   |               |
   |   Fuse     +--+---+
   +----------->+ ADD2 |
   |            +------+
+--+---+
| ADD3 |
+------+

We need also create an artificial edge from ADD1 to A if
https://reviews.llvm.org/D69998 is landed. That will force the Node A scheduled
before the ADD1 and ADD2. But in fact, it is ok to schedule the Node A
in-between ADD3 and ADD2, as ADD3 and ADD2 are NOT a fusion pair because
ADD2 has been matched to ADD1. We are creating these unnecessary dependency
edges that override the heuristics.

Differential Revision: https://reviews.llvm.org/D70066
2019-12-04 05:05:35 +00:00
Craig Topper
f586fd44e4 [FPEnv] [PowerPC] Lowering ppc_fp128 StrictFP Nodes to libcalls
This is an alternative to D64662 that shares more code between
strict and non-strict nodes. It's modeled after the implementation
that I did for softening.

Differential Revision: https://reviews.llvm.org/D70867
2019-12-03 14:11:21 -08:00
Aditya Nandakumar
6da7dbb806 [GlobalISel]: Allow targets to override how to widen constants during legalization
https://reviews.llvm.org/D70922

This adds a hook to allow targets to define exactly what extension
operation should be performed for widening constants. This handles cases
like widening i1 true which would end up becoming -1 which affects code
quality during combines.
Additionally, in order to stay consistent with how DAG is promoting
constants, we now signextend for byte sized types and zero extend
otherwise (by default). Targets can of course override this if
necessary.
2019-12-03 10:41:10 -08:00
Roman Lebedev
9a20c79ddc
[NFC][KnownBits] Add getMinValue() / getMaxValue() methods
As it can be seen from accompanying cleanup, it is not unheard of
to write `~Known.Zero` meaning "what maximal value can this KnownBits
produce". But i think `~Known.Zero` isn't *that* self-explanatory,
as compared to a method with a name.

Note that not all `~Known.Zero` places were cleaned up,
only those where this arguably improves things.
2019-12-03 20:04:51 +03:00
Amaury Séchet
b4980f7781 [SelectionDAG] Reoder ViewXXXDAGs declarations to match execution order. NFC 2019-12-03 16:26:12 +01:00
stozer
269a9afe25 [DebugInfo] Make DebugVariable class available in DebugInfoMetadata
The DebugVariable class is a class declared in LiveDebugValues.cpp which
is used to uniquely identify a single variable, using its source
variable, inline location, and fragment info to do so. This patch moves
this class into DebugInfoMetadata.h, making it available in a much
broader scope.
2019-12-03 15:10:56 +00:00
Sourabh Singh Tomar
8dd17a13b0 [NFCI][DebugInfo] Corrected a comment. 2019-12-03 19:45:37 +05:30
Djordje Todorovic
409350deea Revert "[LiveDebugValues] Introduce entry values of unmodified params"
This reverts commit rG4cfceb910692 due to LLDB test failing.
2019-12-03 13:13:27 +01:00
Sam Parker
bc76dadb3c [CodeGen] Move ARMCodegenPrepare to TypePromotion
Convert ARMCodeGenPrepare into a generic type promotion pass by:
- Removing the insertion of arm specific intrinsics to handle narrow
  types as we weren't using this.
- Removing ARMSubtarget references.
- Now query a generic TLI object to know which types should be
  promoted and what they should be promoted to.
- Move all codegen tests into Transforms folder and testing using opt
  and not llc, which is how they should have been written in the
  first place...

The pass searches up from icmp operands in an attempt to safely
promote types so we can avoid generating unnecessary unsigned extends
during DAG ISel.

Differential Revision: https://reviews.llvm.org/D69556
2019-12-03 11:12:52 +00:00
Jonas Paulsson
f8c0cfc24e ImplicitNullChecks: Don't add a dead definition of DepMI as live-in
This is one of the fixes needed to reapply D68267 which improves verification
of live-in lists.

Review: craig.topper
https://reviews.llvm.org/D70434
2019-12-03 11:02:53 +01:00