32115 Commits

Author SHA1 Message Date
Kazu Hirata
7e75f6fc1d [SelectionDAG] Use range-based for loops (NFC) 2021-02-09 22:14:30 -08:00
Matt Arsenault
b72a23650f GlobalISel: Fix using wrong calling convention for callees
This was taking the calling convention from the parent function,
instead of the callee. Avoids regressions in a future patch when the
caller and callee have different type breakdowns.

For some reason AArch64's lowerFormalArguments seems to intentionally
ignore the parent isVarArg.
2021-02-09 13:48:56 -05:00
Nico Weber
de1966e542 Revert "[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly"
This reverts commit 4a64d8fe392449b205e59031aad5424968cf7446.
Makes clang crash when buildling trivial iOS programs, see comment
after https://reviews.llvm.org/D92808#2551401
2021-02-09 11:06:32 -05:00
Nemanja Ivanovic
a5222aa085 [DAGCombine] Do not remove masking argument to FP16_TO_FP for some targets
As of commit 284f2bffc9bc5, the DAG Combiner gets rid of the masking of the
input to this node if the mask only keeps the bottom 16 bits. This is because
the underlying library function does not use the high order bits. However, on
PowerPC's ELFv2 ABI, it is the caller that is responsible for clearing the bits
from the register. Therefore, the library implementation of __gnu_h2f_ieee will
return an incorrect result if the bits aren't cleared.

This combine is desired for ARM (and possibly other targets) so this patch adds
a query to Target Lowering to check if this zeroing needs to be kept.

Fixes: https://bugs.llvm.org/show_bug.cgi?id=49092

Differential revision: https://reviews.llvm.org/D96283
2021-02-09 06:33:48 -06:00
Thomas Preud'homme
a50ab8672d Revert STRICT_FCMP nonan optimisation
Summary: This reverts commit b7b61a7b5bc63df0d84f3722a1dcfa375c35ba30 which fails on some of the builders: http://lab.llvm.org:8011/#/builders/14/builds/5806

Reviewers:

Subscribers:
2021-02-09 11:27:35 +00:00
Thomas Preud'homme
b7b61a7b5b Improve STRICT_FSETCC codegen in absence of no NaN
As for SETCC, use a less expensive condition code when generating
STRICT_FSETCC if the node is known not to have Nan.

Reviewed By: SjoerdMeijer

Differential Revision: https://reviews.llvm.org/D91972
2021-02-09 11:18:16 +00:00
Matt Arsenault
87e280110d GlobalISel: Use correct calling convention in handleAssignments
This was using the calling convention of the calling function, not the
callee. Avoids regressions in a future patch.
2021-02-08 17:09:28 -05:00
Amara Emerson
ec41ed5b1b [AArch64][GlobalISel] Support the 'returned' parameter attribute.
On AArch64 (which seems to be the only target that supports it), this
attribute allows codegen to avoid saving/restoring the value in x0
across a call.

Gives a 0.1% geomean -Os code size improvement on CTMark.

Differential Revision: https://reviews.llvm.org/D96099
2021-02-08 12:47:39 -08:00
Simon Pilgrim
c5c690a835 [DAG] visitVECTOR_SHUFFLE - move shuffle legality check into MergeInnerShuffle lamda. NFCI.
This is going to be necessary for a future reuse of MergeInnerShuffle
2021-02-08 14:25:16 +00:00
Nicholas Guy
cd880442ae [CodeGen][AArch64] Add TargetInstrInfo hook to modify the TailDuplicateSize default threshold
Different targets might handle branch performance differently, so this patch allows for
targets to specify the TailDuplicateSize threshold. Said threshold defines how small a branch
can be and still be duplicated to generate straight-line code instead.
This patch also specifies said override values for the AArch64 subtarget.

Differential Revision: https://reviews.llvm.org/D95631
2021-02-08 13:28:00 +00:00
Jeremy Morse
c1d45abda5 Revert "Re-land D94976 after revert in e29552c5aff6"
Maskray has reported a fault with .debug_gnu_pubnames in the comments on
D94976, caused by this patch, reverting to investigate.

This reverts commit 8998f5843503773c2f51fd475e2c77c687a65ee6.
2021-02-08 12:41:12 +00:00
Jeremy Morse
6ade2dea7b Revert "DebugInfo: Temporarily work around -gsplit-dwarf + LTO .debug_gnu_pubnames regression after D94976"
Backing out this workaround to focus on fixing whatever's wrong with
.debug_gnu_pubnames, I'll revert the cause, (8998f584) in the next commit.

This reverts commit 56fa34ae3570a34fd0f4c2cf1bfaf095da01a959.
2021-02-08 12:41:01 +00:00
Kazu Hirata
7b9f6c2d42 [SelectionDAG] Drop unnecessary const from a return type (NFC)
Identified with const-return-type.
2021-02-07 09:49:33 -08:00
Simon Pilgrim
86dabf4226 [DAG] SelectionDAG::isSplatValue - handle OR/XOR cases
Add OR/XOR to the basic binops that we support when checking for a splat vector value
2021-02-07 13:27:57 +00:00
Fangrui Song
e44a100942 .gcc_except_table: Set SHF_LINK_ORDER if binutils>=2.36, and drop unneeded unique ID for -fno-unique-section-names
GNU ld>=2.36 supports mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER sections in an
output section, so we can set SHF_LINK_ORDER if -fbinutils-version=2.36 or above.

If -fno-function-sections or older binutils, drop unique ID for -fno-unique-section-names.
The users can just specify -fbinutils-version=2.36 or above to allow GC with both GNU ld and LLD.
(LLD does not support garbage collection of non-group non-SHF_LINK_ORDER .gcc_except_table sections.)
2021-02-05 21:45:21 -08:00
Fangrui Song
853a264916 [AsmPrinter] __patchable_function_entries: Set SHF_LINK_ORDER for binutils 2.36 and above
This matches GCC behavior when the configure-time binutils is new. GNU ld<2.36
did not support mixed SHF_LINK_ORDER and non-SHF_LINK_ORDER sections in an
output section, so we conservatively disable SHF_LINK_ORDER for <2.36.
2021-02-05 19:53:06 -08:00
Sanjay Patel
c981f6f8e1 Revert "[Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls to vector library"
This reverts commit 2303e93e666e13ebf6d24323729c28f520ecca37.
Investigating bot failures.
2021-02-05 15:10:11 -05:00
Lukas Sommer
2303e93e66 [Codegen][ReplaceWithVecLib] add pass to replace vector intrinsics with calls to vector library
This patch adds a pass to replace calls to vector intrinsics
(i.e., LLVM intrinsics operating on vector operands) with
calls to a vector library.

Currently, calls to LLVM intrinsics are only replaced with
calls to vector libraries when scalar calls to intrinsics are
vectorized by the Loop- or SLP-Vectorizer.

With this pass, it is now possible to replace calls to LLVM
intrinsics already operating on vector operands, e.g., if
such code was generated by MLIR. For the replacement,
information from the TargetLibraryInfo, e.g., as specified
via -vector-library is used.

Differential Revision: https://reviews.llvm.org/D95373
2021-02-05 14:25:19 -05:00
Wouter van Oortmerssen
e3c0b0fe09 [WebAssembly] locals can now be indirect in DWARF
This for example to indicate that byval args are represented by a pointer to a struct.
Followup to https://reviews.llvm.org/D94140

Differential Revision: https://reviews.llvm.org/D94347
2021-02-05 11:14:42 -08:00
Amy Huang
34f3249abd [DebugInfo] Fix error from D95893, where I accidentally used an
unsigned int in a loop and it wraps around.

Follow up to https://reviews.llvm.org/D95893
2021-02-05 10:25:21 -08:00
Huihui Zhang
1b81117f88 [DAGCombiner][SVE] Fix invalid use of getVectorNumElements() in visitSRA.
Make sure scalable property is preserved by using getVectorElementCount().

Reviewed By: paulwalker-arm

Differential Revision: https://reviews.llvm.org/D95967
2021-02-05 09:56:49 -08:00
Amy Huang
a740af4de9 [CodeView][DebugInfo] Update the code for removing template arguments from the display name of a codeview function id.
Previously the code split the string at the first '<', which
incorrectly truncated names like `operator<`.

Differential Revision: https://reviews.llvm.org/D95893
2021-02-05 09:49:11 -08:00
Akira Hatanaka
4a64d8fe39 [ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly
emitting retainRV or claimRV calls in the IR

This reapplies 3fe3946d9a958b7af6130241996d9cfcecf559d4 without the
changes made to lib/IR/AutoUpgrade.cpp, which was violating layering.

Original commit message:

Background:

This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end adds operand bundle "clang.arc.rv" to calls, which
  indicates the call is implicitly followed by a marker instruction and
  an implicit retainRV/claimRV call that consumes the call result. In
  addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
  consumes the call result, to prevent the middle-end passes from changing
  the return type of the called function. This is currently done only when
  the target is arm64 and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
  with the operand bundle in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the call with the
  operand bundle. It doesn't remove the operand bundle on the call since
  the backend needs it to emit the marker instruction. The retainRV and
  claimRV calls are emitted late in the pipeline to prevent optimization
  passes from transforming the IR in a way that makes it harder for the
  ARC middle-end passes to figure out the def-use relationship between
  the call and the retainRV/claimRV calls (which is the cause of
  PR31925).

- The function inliner removes an autoreleaseRV call in the callee if
  nothing in the callee prevents it from being paired up with the
  retainRV/claimRV call in the caller. It then inserts a release call if
  the call is annotated with claimRV since autoreleaseRV+claimRV is
  equivalent to a release. If it cannot find an autoreleaseRV call, it
  tries to transfer the operand bundle to a function call in the callee.
  This is important since ARC optimizer can remove the autoreleaseRV
  returning the callee result, which makes it impossible to pair it up
  with the retainRV/claimRV call in the caller. If that fails, it simply
  emits a retain call in the IR if the implicit call is a call to
  retainRV and does nothing if it's a call to claimRV.

Future work:

- Use the operand bundle on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls annotated with the operand bundles.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-02-05 06:09:42 -08:00
Akira Hatanaka
2fbbb18c1d Revert "[ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly"
This reverts commit 3fe3946d9a958b7af6130241996d9cfcecf559d4.

The commit violates layering by including a header from Analysis in
lib/IR/AutoUpgrade.cpp.
2021-02-05 06:00:05 -08:00
Akira Hatanaka
3fe3946d9a [ObjC][ARC] Use operand bundle 'clang.arc.rv' instead of explicitly
emitting retainRV or claimRV calls in the IR

Background:

This patch makes changes to the front-end and middle-end that are
needed to fix a longstanding problem where llvm breaks ARC's autorelease
optimization (see the link below) by separating calls from the marker
instructions or retainRV/claimRV calls. The backend changes are in
https://reviews.llvm.org/D92569.

https://clang.llvm.org/docs/AutomaticReferenceCounting.html#arc-runtime-objc-autoreleasereturnvalue

What this patch does to fix the problem:

- The front-end adds operand bundle "clang.arc.rv" to calls, which
  indicates the call is implicitly followed by a marker instruction and
  an implicit retainRV/claimRV call that consumes the call result. In
  addition, it emits a call to @llvm.objc.clang.arc.noop.use, which
  consumes the call result, to prevent the middle-end passes from changing
  the return type of the called function. This is currently done only when
  the target is arm64 and the optimization level is higher than -O0.

- ARC optimizer temporarily emits retainRV/claimRV calls after the calls
  with the operand bundle in the IR and removes the inserted calls after
  processing the function.

- ARC contract pass emits retainRV/claimRV calls after the call with the
  operand bundle. It doesn't remove the operand bundle on the call since
  the backend needs it to emit the marker instruction. The retainRV and
  claimRV calls are emitted late in the pipeline to prevent optimization
  passes from transforming the IR in a way that makes it harder for the
  ARC middle-end passes to figure out the def-use relationship between
  the call and the retainRV/claimRV calls (which is the cause of
  PR31925).

- The function inliner removes an autoreleaseRV call in the callee if
  nothing in the callee prevents it from being paired up with the
  retainRV/claimRV call in the caller. It then inserts a release call if
  the call is annotated with claimRV since autoreleaseRV+claimRV is
  equivalent to a release. If it cannot find an autoreleaseRV call, it
  tries to transfer the operand bundle to a function call in the callee.
  This is important since ARC optimizer can remove the autoreleaseRV
  returning the callee result, which makes it impossible to pair it up
  with the retainRV/claimRV call in the caller. If that fails, it simply
  emits a retain call in the IR if the implicit call is a call to
  retainRV and does nothing if it's a call to claimRV.

Future work:

- Use the operand bundle on x86-64.

- Fix the auto upgrader to convert call+retainRV/claimRV pairs into
  calls annotated with the operand bundles.

rdar://71443534

Differential Revision: https://reviews.llvm.org/D92808
2021-02-05 05:55:18 -08:00
Guillaume Chatelet
4b15156dca [NFC] inline variable 2021-02-05 10:17:02 +00:00
Kazu Hirata
5438e079b1 [GlobalISel] Use ListSeparator (NFC) 2021-02-04 21:18:04 -08:00
Craig Topper
11ef356d9e [TargetLowering] Use Align in allowsMisalignedMemoryAccesses.
Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D96097
2021-02-04 19:22:06 -08:00
Fangrui Song
56fa34ae35 DebugInfo: Temporarily work around -gsplit-dwarf + LTO .debug_gnu_pubnames regression after D94976
`-flto -gsplit-dwarf -g -O[123]` may create .debug_gnu_pubnames with 0 DIE
offset entries. llvm-dwarfdump -debug-gnu-pubnames/ld.lld --gdb-index errors for that.

```
        .section        .debug_gnu_pubnames,"",@progbits
        .long   .LpubNames_end2-.LpubNames_begin2 # Length of Public Names Info
.LpubNames_begin2:
        .short  2                               # DWARF Version
        .long   .Lcu_begin2                     # Offset of Compilation Unit Info
        .long   57                              # Compilation Unit Length
        .long   0                               # DIE offset
        .byte   16                              # Attributes: TYPE, EXTERNAL
        .asciz  "absl"                          # External Name
        .long   0                               # DIE offset
        .byte   16                              # Attributes: TYPE, EXTERNAL
        .asciz  "absl::base_internal"           # External Name
        .long   0                               # End Mark
```
2021-02-04 17:35:09 -08:00
Craig Topper
8cc9c42a0c [TargetLowering] Use LegalOnly operand to isOperationLegalOrCustom to simplify some code. NFC 2021-02-04 12:30:37 -08:00
Sanjay Patel
056d31dd2a [ExpandReductions] fix FMF requirement for fmin/fmax
The upstream callers (the vectorizers) were fixed with:
bbed5f2f8a04 ( D95690 )
77adbe6a8c71

We should remove this pass entirely now that reduction
legalization/lowering is expected to work just as well,
but we need to confirm that the shuffle ops do not
regress (for x86 in particular).

This should be the last step needed to close:
https://llvm.org/PR23116
2021-02-04 13:32:08 -05:00
Jeremy Morse
8998f58435 Re-land D94976 after revert in e29552c5aff6
This modified patch avoids redirecting the unit in which a subprogram is
created if type units are enabled -- DIEs were getting children allocated
from different units memory pools. Original commit message:

[DWARF] Create subprogram's DIE in DISubprogram's unit

This is a fix for PR48790. Over in D70350, subprogram DIEs were permitted
to be shared between CUs. However, the creation of a subprogram DIE can be
triggered early, from other CUs. The subprogram definition is then created
in one CU, and when the function is actually emitted children are attached
to the subprogram that expect to be in another CU. This breaks internal CU
references in the children.

Fix this by redirecting the creation of subprogram DIEs in
getOrCreateContextDIE to the CU specified by it's DISubprogram definition.
This ensures that the subprogram DIE is always created in the correct CU.

Differential Revision: https://reviews.llvm.org/D94976
2021-02-04 11:17:18 +00:00
Justin Bogner
62ce4b048f [GlobalISel] Combine narrowScalar of G_ADD and G_SUB. NFC
These two cases have identical implementations other than an
unreachable part of `G_ADD` that checks if the scalar we're narrowing
is a vector. Combining them to avoid unnecessary divergence.
2021-02-03 11:06:04 -08:00
Matt Arsenault
39fbb5c3e3 RegisterCoalescer: Fix not setting undef on coalesced subregister uses
This was only adding undef to the use if the copy itself had a
subregister index. It did not consider the subrange liveness if the
use had a subreg index to begin with.
2021-02-03 13:54:43 -05:00
Matt Arsenault
d886da042c RegisterCoalescer: Prune undef subranges from copy pairs in loops
If we had a pair of copies inside a loop which introduced new liveness
to a subregister which was undef before the loop, we would have a
dummy phi-only segment remaining across the loop body. Later, this
false segment would confuse RenameIndependentSubregs causing it to
introduce IMPLICIT_DEFs with broken value numbering.

It seems always adding the lanes to ShrinkMask is OK, so any
conditions should be purely a compile time filter.
2021-02-03 13:42:53 -05:00
Craig Topper
34da12dd1f [DAGCombiner] Remove (sra (shl X, C), C) if X has more than C sign bits.
If sext_inreg is supported, we will turn this into sext_inreg. That
will then remove it if there are enough sign bits. But if sext_inreg
isn't supported, we can still remove the shift pair based on sign
bits.

Split from D95890.
2021-02-03 10:18:40 -08:00
Jeremy Morse
d32deaab4d Revert "[DWARF] Location-less inlined variables should not have DW_TAG_variable"
This reverts commit ddc2f1e3fb4f8f9ae7dd130e40b60cfc775eba24.

A build-bot objected:

  http://lab.llvm.org:8011/#builders/105/builds/5486
2021-02-03 17:54:33 +00:00
Jeremy Morse
ddc2f1e3fb [DWARF] Location-less inlined variables should not have DW_TAG_variable
Discussed in this thread:

  https://lists.llvm.org/pipermail/llvm-dev/2021-January/148139.html

DwarfDebug::collectEntityInfo accidentally distinguishes between variable
locations that never have a location specified, and variable locations that
have an empty location specified. The latter leads to the creation of an
empty variable referring to the abstract origin.

Fix this by seeking a non-empty location before producing a concrete
entity, to guarantee a DW_AT_location will be produced. Other loops in
collectEntityInfo and endFunctionImpl take care of examining the
retainedNodes collection and ensuring optimised-out variables are created.

Differential Revision: https://reviews.llvm.org/D95617
2021-02-03 17:32:31 +00:00
Kazu Hirata
511c9a76fb [AsmPrinter] Use ListSeparator (NFC) 2021-02-02 22:52:48 -08:00
Serguei Katkov
de305b0425 [Statepoint] Handle 'undef' operand tied to def
FixupStatepoints pass does not take into account the undef use
it skips may have a tied def. So when defs are handled pass
considers that tied-use should be spilled and triggers an assert.

FixupStatepoints should skip undef def as well.

Reviewers: reames, dantrushin
Reviewed By: dantrushin
Subscribers: llvm-commits
Differential Revision: https://reviews.llvm.org/D95858
2021-02-03 10:41:14 +07:00
Jessica Paquette
02d4b365bf [GlobalISel] Check if branches use the same MBB in matchOptBrCondByInvertingCond
If the G_BR + G_BRCOND in this combine use the same MBB, then it will infinite
loop. Don't allow that to happen.

Differential Revision: https://reviews.llvm.org/D95895
2021-02-02 15:38:48 -08:00
Craig Topper
4553821815 [SelectionDAG] Prevent scalable vector warning from ComputeNumSignBits on extract_vector_elt on a scalable vector. 2021-02-01 23:42:03 -08:00
Jessica Paquette
4809663334 [GlobalISel] Make sure G_ASSERT_ZEXT's src ends up with the same rc as dst
When replacing the dst reg with the src reg, we need to make sure that we
propagate the dst reg's register class through to the src.

Otherwise, we aren't meeting the requirements for G_ASSERT_ZEXT, and so the
verifier will fail.

Differential Revision: https://reviews.llvm.org/D95708
2021-02-01 09:46:35 -08:00
Kerry McLaughlin
9b4fcfaa9e [SVE][CodeGen] Remove performMaskedGatherScatterCombine
The AArch64 DAG combine added by D90945 & D91433 extends the index
of a scalable masked gather or scatter to i32 if necessary.

This patch removes the combine and instead adds shouldExtendGSIndex, which
is used by visitMaskedGather/Scatter in SelectionDAGBuilder to query whether
the index should be extended before calling getMaskedGather/Scatter.

Reviewed By: david-arm

Differential Revision: https://reviews.llvm.org/D94525
2021-02-01 14:10:00 +00:00
Tim Northover
c2b322fc19 GlobalISel: check type size before getZExtValue()ing it.
Otherwise getZExtValue() asserts.
2021-02-01 12:43:33 +00:00
xgupta
94fac81fcc [Branch-Rename] Fix some links
According to the [[ https://foundation.llvm.org/docs/branch-rename/ | status of branch rename ]], the master branch of the LLVM repository is removed on 28 Jan 2021.

Reviewed By: mehdi_amini

Differential Revision: https://reviews.llvm.org/D95766
2021-02-01 16:43:21 +05:30
Serge Pavlov
bf416d166b [FPEnv] Intrinsic for setting rounding mode
To set non-default rounding mode user usually calls function 'fesetround'
from standard C library. This way has some disadvantages.

* It creates unnecessary dependency on libc. On the other hand, setting
  rounding mode requires few instructions and could be made by compiler.
  Sometimes standard C library even is not available, like in the case of
  GPU or AI cores that execute small kernels.
* Compiler could generate more effective code if it knows that a particular
  call just sets rounding mode.

This change introduces new IR intrinsic, namely 'llvm.set.rounding', which
sets current rounding mode, similar to 'fesetround'. It however differs
from the latter, because it is a lower level facility:

* 'llvm.set.rounding' does not return any value, whereas 'fesetround'
  returns non-zero value in the case of failure. In glibc 'fesetround'
  reports failure if its argument is invalid or unsupported or if floating
  point operations are unavailable on the hardware. Compiler usually knows
  what core it generates code for and it can validate arguments in many
  cases.
* Rounding mode is specified in 'fesetround' using constants like
  'FE_TONEAREST', which are target dependent. It is inconvenient to work
  with such constants at IR level.

C standard provides a target-independent way to specify rounding mode, it
is used in FLT_ROUNDS, however it does not define standard way to set
rounding mode using this encoding.

This change implements only IR intrinsic. Lowering it to machine code is
target-specific and will be implemented latter. Mapping of 'fesetround'
to 'llvm.set.rounding' is also not implemented here.

Differential Revision: https://reviews.llvm.org/D74729
2021-02-01 11:28:14 +07:00
Jun Ma
54842fa0bb [CodeGenPrepare] Also skip lifetime.end intrinsic when check return block in dupRetToEnableTailCallOpts.
Differential Revision: https://reviews.llvm.org/D95424
2021-02-01 08:18:44 +08:00
Craig Topper
70289ea6f5 [RISCV][LegalizeTypes] Try to expand BSWAP before promoting if the promoted BSWAP would expand anyway.
If we're going to end up expanding anyway, we should do it early
so we don't create extra operations to handle the bytes added by
promotion.

This is helfpul on RISCV where we might have to promote i16 all
the way to i64.

Differential Revision: https://reviews.llvm.org/D95756
2021-01-31 14:33:29 -08:00
Matt Arsenault
1801e2aa24 RegAlloc: Fix assert if all registers in class reserved
With a context instruction, this would produce a context
error. However, it would continue on and do an out of bounds access of
the empty allocation order array.
2021-01-31 11:10:04 -05:00