45822 Commits

Author SHA1 Message Date
PeixinQiao
6d952b08bd [NFC] Fix typos
Initial commit test.
2021-08-17 17:16:37 +08:00
Bing1 Yu
bcec4ccd04 [X86] [AMX] Replace bitcast with specific AMX intrinsics with X86 specific cast.
There is some discussion on the bitcast for vector and x86_amx at https://reviews.llvm.org/D99152. This patch is to introduce a x86 specific cast for vector and x86_amx, so that it can avoid some unnecessary optimization by middle-end. On the other way, we have to optimize the x86 specific cast by ourselves. This patch also optimize the cast operation to eliminate redundant code.

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D107544
2021-08-17 17:04:26 +08:00
Hongtao Yu
f27fee623d [SamplePGO][NFC] Dump function profiles in order
Sample profiles are stored in a string map which is basically an unordered map. Printing out profiles by simply walking the string map doesn't enforce an order. I'm sorting the map in the decreasing order of total samples to enable a more stable dump, which is good for comparing two dumps.

Reviewed By: wenlei, wlei

Differential Revision: https://reviews.llvm.org/D108147
2021-08-16 17:22:30 -07:00
Hongtao Yu
a1e21864df [SamplePGO] Fixing a memory issue when creating profiles on-demand
There is a on-dmeand creation of function profile during top-down processing in the sample loader when merging uninlined callees.  During the profile creation, a stack string object is used to store a newly-created MD5 name, which is then used by reference as hash key in the profile map. This makes the hash key a dangling reference when later on the stack string object is deallocated.

The issue only happens with md5 profile use and was exposed by context split work for CS profile. I'm making a fix by storing newly created names in the reader.

Reviewed By: wenlei, wmi, wlei

Differential Revision: https://reviews.llvm.org/D108142
2021-08-16 16:30:26 -07:00
Arthur Eubanks
0d822da2bd [NFC] Remove/replace some confusing attribute getters on Function 2021-08-16 16:12:37 -07:00
Stanislav Mekhanoshin
877572cc19 Allow rematerialization of virtual reg uses
Currently isReallyTriviallyReMaterializableGeneric() implementation
prevents rematerialization on any virtual register use on the grounds
that is not a trivial rematerialization and that we do not want to
extend liveranges.

It appears that LRE logic does not attempt to extend a liverange of
a source register for rematerialization so that is not an issue.
That is checked in the LiveRangeEdit::allUsesAvailableAt().

The only non-trivial aspect of it is accounting for tied-defs which
normally represent a read-modify-write operation and not rematerializable.

The test for a tied-def situation already exists in the
/CodeGen/AMDGPU/remat-vop.mir,
test_no_remat_v_cvt_f32_i32_sdwa_dst_unused_preserve.

The change has affected ARM/Thumb, Mips, RISCV, and x86. For the targets
where I more or less understand the asm it seems to reduce spilling
(as expected) or be neutral. However, it needs a review by all targets'
specialists.

Differential Revision: https://reviews.llvm.org/D106408
2021-08-16 12:42:42 -07:00
Rong Xu
9b8425e42c Reapply commit b7425e956
The commit b7425e956: [NFC] fix typos
is harmless but was reverted by accident. Reapply.
2021-08-16 12:18:40 -07:00
Nikita Popov
735a590471 [MemorySSA] Remove -enable-mssa-loop-dependency option
This option has been enabled by default for quite a while now.
The practical impact of removing the option is that MSSA use
cannot be disabled in default pipelines (both LPM and NPM) and
in manual LPM invocations. NPM can still choose to enable/disable
MSSA using loop vs loop-mssa.

The next step will be to require MSSA for LICM and drop the
AST-based implementation entirely.

Differential Revision: https://reviews.llvm.org/D108075
2021-08-16 20:59:37 +02:00
Kostya Kortchinsky
80ed75e7fb Revert "[NFC] Fix typos"
This reverts commit b7425e956be60a73004d7ae5bb37da85872c29fb.
2021-08-16 11:13:05 -07:00
Rong Xu
b7425e956b [NFC] Fix typos
s/senstive/senstive/g
2021-08-16 10:15:30 -07:00
Renato Golin
a19747ea73 Fix type in DenseMap<SmallBitVector, *> to match V.size()
Differential Revision: https://reviews.llvm.org/D108124
2021-08-16 15:32:04 +01:00
David Blaikie
62a4c2c10e DWARFVerifier: Check section-relative references at the end of the section
This ensures that debug_types references aren't looked for in
debug_info section.

Behavior is still going to be questionable in an unlinked object file -
since cross-cu references could refer to symbols in another .debug_info
(or, in theory, .debug_types) chunk - but if a producer only uses
ref_addr to refer to things within the same .debug_info chunk in an
object file (eg: whole program optimization/LTO - producing two CUs into
a single .debug_info section in an object file - the ref_addrs there
could be resolved relative to that .debug_info chunk, not needing to
consider comdat  (DWARFv5 type units or other creatures) chunks of
.debug_info, etc)
2021-08-15 11:40:24 -07:00
Harald van Dijk
957334382c
[ExecutionEngine] Check for libunwind before calling __register_frame
libgcc and libunwind have different flavours of __register_frame. Both
 flavours are already correctly handled, except that the code to handle
the libunwind flavour is guarded by __APPLE__. This change uses the
presence of __unw_add_dynamic_fde in libunwind instead to detect whether
libunwind is used, rather than hardcoding it as Apple vs. non-Apple.

Fixes PR44074.

Thanks to Albert Jin <albert.jin@gmail.com> and Chris Schafmeister
<chris.schaf@verizon.net> for identifying the problem.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D106129
2021-08-15 13:35:53 +01:00
Wang, Pengfei
f1de9d6dae [X86] AVX512FP16 instructions enabling 2/6
Enable FP16 binary operator instructions.

Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html

Reviewed By: LuoYuanke

Differential Revision: https://reviews.llvm.org/D105264
2021-08-15 08:56:33 +08:00
luxufan
4ec32375bc [JITLink] Unify x86-64 MachO and ELF 's optimize GOT/Stub function
This patch  unify optimizeELF_x86_64_GOTAndStubs and optimizeMachO_x86_64_GOTAndStubs into a pure optimize_x86_64_GOTAndStubs

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D108025
2021-08-15 00:33:09 +08:00
Lang Hames
27ea3f1607 [JITLink][x86-64] Rename *Relaxable edges to *REXRelaxable.
The existing relaxable edges all assume a REX prefix. ELF includes non-REX
relaxations, so rename these edges to make room for the new kinds.
2021-08-14 18:28:49 +10:00
Lang Hames
632135acae [JITLink][x86-64] Rename BranchPCRel32ToPtrJumpStub(Relaxable -> Bypassable).
ELF allows for branch optimizations other than bypass, so rename this edge kind
to avoid any confusion.
2021-08-14 17:49:31 +10:00
Jessica Paquette
50efbf9cbe [GlobalISel] Narrow binops feeding into G_AND with a mask
This is a fairly common pattern:

```
%mask = G_CONSTANT iN <mask val>
%add = G_ADD %lhs, %rhs
%and = G_AND %add, %mask
```

We have combines to eliminate G_AND with a mask that does nothing.

If we combined the above to this:

```
%mask = G_CONSTANT iN <mask val>
%narrow_lhs = G_TRUNC %lhs
%narrow_rhs = G_TRUNC %rhs
%narrow_add = G_ADD %narrow_lhs, %narrow_rhs
%ext = G_ZEXT %narrow_add
%and = G_AND %ext, %mask
```

We'd be able to take advantage of those combines using the trunc + zext.

For this to work (or be beneficial in the best case)

- The operation we want to narrow then widen must only be used by the G_AND
- The G_TRUNC + G_ZEXT must be free
- Performing the operation at a narrower width must not produce a different
  value than performing it at the original width *after masking.*

Example comparison between SDAG + GISel: https://godbolt.org/z/63jzb1Yvj

At -Os for AArch64, this is a 0.2% code size improvement on CTMark/pairlocalign.

Differential Revision: https://reviews.llvm.org/D107929
2021-08-13 18:31:13 -07:00
Matt Arsenault
cc56152f83 GlobalISel: Add helper function for getting EVT from LLT
This can only give an imperfect approximation, but is enough to avoid
crashing in places where we call into EVT functions starting from LLTs.
2021-08-13 21:10:13 -04:00
Arthur Eubanks
dc41c558dd [NFC] Make AttributeList::hasAttribute(AttributeList::ReturnIndex) its own method
AttributeList::hasAttribute() is confusing. In an attempt to change the
name to something that suggests using other methods, fix up some
existing uses.
2021-08-13 16:27:11 -07:00
Arthur Eubanks
f80ae58068 [NFC] Cleanup calls to AttributeList::getAttribute(FunctionIndex)
getAttribute() is confusing, use a clearer method.
2021-08-13 16:27:11 -07:00
Arthur Eubanks
8e9ffa1dc6 [NFC] Cleanup callers of AttributeList::hasAttributes()
AttributeList::hasAttributes() is confusing, use clearer methods like
hasFnAttrs().
2021-08-13 12:16:52 -07:00
Arthur Eubanks
d7593ebaee [NFC] Clean up users of AttributeList::hasAttribute()
AttributeList::hasAttribute() is confusing, use clearer methods like
hasParamAttr()/hasRetAttr().

Add hasRetAttr() since it was missing from AttributeList.
2021-08-13 11:59:18 -07:00
Arthur Eubanks
80ea2bb574 [NFC] Rename AttributeList::getParam/Ret/FnAttributes() -> get*Attributes()
This is more consistent with similar methods.
2021-08-13 11:16:52 -07:00
Arthur Eubanks
92ce6db9ee [NFC] Rename AttributeList::hasFnAttribute() -> hasFnAttr()
This is more consistent with similar methods.
2021-08-13 11:09:18 -07:00
Arthur Eubanks
a0c42ca56c [NFC] Remove AttributeList::hasParamAttribute()
It's the same as AttributeList::hasParamAttr().
2021-08-13 10:58:21 -07:00
Rainer Orth
8738c5b0fe [MC][ELF] Mark Solaris objects as ELFOSABI_SOLARIS
Prompted by D107747 <https://reviews.llvm.org/D107747>, it seems prudent to
mark objects as `ELFOSABI_SOLARIS` on Solaris.

Tested on `amd64-pc-solaris2.11`, `sparcv9-sun-solaris2.11`, and
`x86_64-pc-linux-gnu`.

Differential Revision: https://reviews.llvm.org/D107748
2021-08-13 14:31:32 +02:00
luxufan
ee65938357 [JITLink] Update ELF_x86_64 's edge kind to generic edge kind
This patch uses a switch statement to map the ELF_x86_64's edge kind to generic edge kind, and merge the ELF_x86_64 's applyFixup function to the x86_64 's applyFixup function. Some edge kinds were not have corresponding generic edge kinds, so I added three generic edge kinds asa follows:
1. RequestGOTAndTransformToDelta64, which is similar to RequestGOTAndTransformToDelta32.

2. GOTDelta64. This generic kind is similar to Delta64, except the GOTDelta64 computes the delta relative to GOTSymbol

3. RequestGOTAndTransformToGOTDelta64. This edge kind was used to deal with ELF_x86_64's GOT64 edge kind, it request the fixGOTEdge function to change the target to GOT entry, and set the edge kind to generic edge kind GOTDelta64.

These added generic edge kinds may named haphazardly, or can't express its meaning well.

Reviewed By: lhames

Differential Revision: https://reviews.llvm.org/D107967
2021-08-13 12:53:54 +08:00
Michael Kruse
b1de32d6dd [OMPIRBuilder] Clarify CanonicalLoopInfo. NFC.
Add in-source documentation on how CanonicalLoopInfo is intended to be used. In particular, clarify what parts of a CanonicalLoopInfo is considered part of the loop, that those parts must be side-effect free, and that InsertPoints to instructions outside those parts can be expected to be preserved after method calls implementing loop-associated directives.

CanonicalLoopInfo are now invalidated after it does not describe canonical loop anymore and asserts when trying to use it afterwards.

In addition, rename `createXYZWorkshareLoop` to `applyXYZWorkshareLoop` and remove the update location to avoid that the impression that they insert something from scratch at that location where in reality its InsertPoint is ignored. createStaticWorkshareLoop does not return a CanonicalLoopInfo anymore. First, it was not a canonical loop in the clarified sense (containing side-effects in form of calls to the OpenMP runtime). Second, it is ambiguous which of the two possible canonical loops it should actually return. It will not be needed before a feature expected to be introduced in OpenMP 6.0

Also see discussion in D105706.

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D107540
2021-08-12 21:02:19 -05:00
Lei Huang
8930af45c3 [PowerPC] Implement XL compatibility builtin __addex
Add builtin and intrinsic for `__addex`.

This patch is part of a series of patches to provide builtins for
compatibility with the XL compiler.

Reviewed By: stefanp, nemanjai, NeHuang

Differential Revision: https://reviews.llvm.org/D107002
2021-08-12 16:38:21 -05:00
Vyacheslav Zakharin
50c7e299f1 [NFC] Enumerate LLVMOMPOFFLOAD note types.
It is a NFC to enumerate types of LLVMOMPOFFLOAD ELF notes.

Differential Revision: https://reviews.llvm.org/D99612
2021-08-12 11:01:41 -07:00
Florian Hahn
f999312872
Recommit "[Matrix] Overload stride arg in matrix.columnwise.load/store."
This reverts the revert 28c04794df74ad3c38155a244729d1f8d57b9400.

The failing MLIR test that caused the revert should be fixed  in this
version.

Also includes a PPC test fix previously in 1f87c7c478a6.
2021-08-12 18:31:57 +01:00
Kazu Hirata
45938114b2 [DWARF] Remove getMaxLineIncrementForSpecialOpcode (NFC)
The function was introduced without a use on Sep 16, 2011 in
commit 5acab501de113acb4c0d28b48e76c27631ba0904.
2021-08-12 09:53:49 -07:00
maekawatoshiki
dd3eea6566 [LICM] Support sinking in LNICM
Currently, LNICM pass does not support sinking instructions out of loop nest.
This patch enables LNICM to sink down as many instructions to the exit block of outermost loop as possible.

Reviewed By: Whitney

Differential Revision: https://reviews.llvm.org/D107219
2021-08-13 00:56:26 +09:00
Hongtao Yu
ccb5b9bbfb [CSSPGO] Allow the use of debug-info-for-profiling and pseudo-probe-for-profiling together
Previoulsy debug-info-for-profiling and pseudo-probe-for-profiling are mutual exclusive because they compete the dwarf discrimnator for callsites on the IR. This changes allows to use the two switches together. The side effect is that callsite discriminators will be taken by pseudo probe, while discriminators for other instructions are still available for AutoFDO use. This is less than ideal, however, it still allows us a chance to smoothly transition from AutoFDO to CSSPGO, by collecting both profiles from a CSSPGO binary.

Reviewed By: wenlei, wmi

Differential Revision: https://reviews.llvm.org/D107876
2021-08-12 08:52:49 -07:00
Liqiang Tao
422fc5603a [llvm][Inline] Refactor out InlineOrder
Move InlineOrder to separated file.

Reviewed By: kazu

Differential Revision: https://reviews.llvm.org/D107831
2021-08-12 22:19:53 +08:00
Mehdi Amini
28c04794df Revert "[Matrix] Overload stride arg in matrix.columnwise.load/store."
This reverts commit a1ef81de35a4bac6d3b22e9d7186d880124d7a55.

Broke the MLIR buildbot.
2021-08-12 11:57:19 +00:00
Florian Hahn
a1ef81de35
[Matrix] Overload stride arg in matrix.columnwise.load/store.
This patch adjusts the intrinsics definition of
llvm.matrix.column.major.load and llvm.matrix.column.major.store to
allow overloading the type of the stride. The bitwidth of the stride is
used to perform the offset computation.

This fixes a crash when using __builtin_matrix_column_major_load or
__builtin_matrix_column_major_store on 32 bit platforms. The stride argument
of the builtins are defined as `size_t`, which is 32 bits wide on 32 bit
platforms.

Note that we still perform offset computations with 64 bit width on 32
bit platforms for accesses that do not take a user-specified stride.
This can be fixed separately.

Fixes PR51304.

Reviewed By: erichkeane

Differential Revision: https://reviews.llvm.org/D107349
2021-08-12 10:45:25 +01:00
Kazu Hirata
63c566b1fd [DWARF] Remove extractFast (NFC)
The last use was removed on Dec 13, 2016 in commit
c8c1032c0c52d7f851ccfa29ec850b24047ebcb9.  This patch repurposes the
function comment for the other variant of extractFast.
2021-08-11 09:55:00 -07:00
Yolanda Chen
8fa16cc628 [LTO][lld] Add lto-pgo-warn-mismatch option
When enable CSPGO for ThinLTO, there are profile cfg mismatch warnings that will cause lld-link errors (with /WX)
due to source changes (e.g. `#if` code runs for profile generation but not for profile use)
To disable it we have to use an internal "/mllvm:-no-pgo-warn-mismatch" option.
In contrast clang uses option ”-Wno-backend-plugin“ to avoid such warnings and gcc has an explicit "-Wno-coverage-mismatch" option.

Add "lto-pgo-warn-mismatch" option to lld COFF/ELF to help turn on/off the profile mismatch warnings explicitly when build with ThinLTO and CSPGO.

Differential Revision: https://reviews.llvm.org/D104431
2021-08-11 09:45:55 -07:00
Wang, Pengfei
6c4809825d Revert "[lld] Add lto-pgo-warn-mismatch option"
This reverts commit 0cfb00a1c98f8ca3749b8d829b7301f397efa302.
2021-08-11 16:25:42 +08:00
Yolanda Chen
0cfb00a1c9 [lld] Add lto-pgo-warn-mismatch option
When enable CSPGO for ThinLTO, there are profile cfg mismatch warnings that will cause lld-link errors (with /WX).
To disable it we have to use an internal "/mllvm:-no-pgo-warn-mismatch" option.
In contrast clang uses option ”-Wno-backend-plugin“ to avoid such warnings and gcc has an explicit "-Wno-coverage-mismatch" option.

Add this "lto-pgo-warn-mismatch" option to lld to help turn on/off the profile mismatch warnings explicitly when build with ThinLTO and CSPGO.

Reviewed By: tejohnson

Differential Revision: https://reviews.llvm.org/D104431
2021-08-11 14:43:26 +08:00
Petr Hosek
c0c1c3cf93 Revert "[InstrProfiling] Emit bias variable eagerly"
This reverts commit 6660cec568504df47d9becb0c552c20577880df8 since
it was superseded by https://reviews.llvm.org/D98061.
2021-08-10 23:21:15 -07:00
Christopher Di Bella
c874dd5362 [llvm][clang][NFC] updates inline licence info
Some files still contained the old University of Illinois Open Source
Licence header. This patch replaces that with the Apache 2 with LLVM
Exception licence.

Differential Revision: https://reviews.llvm.org/D107528
2021-08-11 02:48:53 +00:00
Adrian Prantl
d6b6880172 Streamline the API of salvageDebugInfoImpl (NFC)
This patch refactors / simplifies salvageDebugInfoImpl(). The goal
here is to simplify the implementation of coro::salvageDebugInfo() in
a followup patch.

  1. Change the return value to I.getOperand(0). Currently users of
     salvageDebugInfoImpl() assume that the first operand is
     I.getOperand(0). This patch makes this information explicit. A
     nice side-effect of this change is that it allows us to salvage
     expressions such as add i8 1, %a in the future.

  2. Factor out the creation of a DIExpression and return an array of
     DIExpression operations instead. This change allows users that
     call salvageDebugInfoImpl() in a loop to avoid the costly
     creation of temporary DIExpressions and to defer the creation of
     a DIExpression until the end.

This patch does not change any functionality.

rdar://80227769

Differential Revision: https://reviews.llvm.org/D107383
2021-08-10 15:21:18 -07:00
Joseph Huber
640091884f [OpenMP] AlwaysInline __kmpc_parallel_51 to improve inlining hueristics
This patch adds the `AlwaysInline` attribute to the `__kmpc_parallel_51`
device runtime call. This improves inlining heuristics which encourages
the indirect function pointer arguemnt to also be inlined. This greatly
improves performance for a few applications whose outlined regions were
not inlined otherwise.

Reviewed By: jdoerfert

Differential Revision: https://reviews.llvm.org/D107839
2021-08-10 14:41:43 -04:00
Sanjay Patel
e260e10c4a [InstSimplify] fold min/max with limit constant
This is already done within InstCombine:
https://alive2.llvm.org/ce/z/MiGE22

...but leaving it out of analysis makes it
harder to avoid infinite loops there.
2021-08-10 10:57:25 -04:00
Sanjay Patel
188832f419 Revert "[InstSimplify] fold min/max with limit constant; NFC"
This reverts commit f43859b4370f978d2bc625643ccbe03775b99713.
This is not NFC, so I'll try again without that mistake in the commit message.
2021-08-10 10:50:09 -04:00
Sanjay Patel
f43859b437 [InstSimplify] fold min/max with limit constant; NFC
This is already done within InstCombine:
https://alive2.llvm.org/ce/z/MiGE22

...but leaving it out of analysis makes it
harder to avoid infinite loops there.
2021-08-10 10:43:07 -04:00
Kazu Hirata
c1a014c382 [DWARF] Remove isInlinedCStr (NFC)
The last use was removed on Oct 28, 2013 in commit
48cbda5850264671e982ecdd834c1587b1732c15.
2021-08-10 07:24:46 -07:00