401 Commits

Author SHA1 Message Date
Alexandre Ganea
e66500c774 [Support] On Windows 11 and Windows Server 2022, fix an affinity mask issue on large core count machines
Before Windows 11 and Windows Server 2022, only one 'processor group' is assigned by default to a starting process, then the program is responsible for dispatching its own threads on more 'processor groups'. That is what 8404aeb56a73ab24f9b295111de3b37a37f0b841 was doing, allowing LLVM tools to automatically use all hardware threads in the machine.

After Windows 11 and Windows Server 2022, the OS takes care of that. This has an adverse effect reported in #56618 which is that using `GetProcessAffinityMask()` API in some edge cases seems buggy now. That API is used to detect if an affinity mask was set, and adjust accordingly the available threads for a ThreadPool.

With this patch, on one hand, we let the OS dispatch threads on all 'processor groups', but only for Windows 11 & Windows Server 2022 and after. We retain the old behavior for older OS versions. On the other hand, a workaround was added to mitigate the `GetProcessAffinityMask()` issue described above (see Threading.inc, L226).

Differential Revision: https://reviews.llvm.org/D138747
2023-01-06 17:03:43 -05:00
Aaron Ballman
b4993bea29 Remove documentation about the Go bindings
We removed the Go bindings in https://reviews.llvm.org/D135436 but
missed documentation that talks about the bindings.
2023-01-05 14:49:28 -05:00
Freddy Ye
27b8f54f51 [X86] Support -march=emeraldrapids
Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D140950
2023-01-05 20:27:32 +08:00
Florian Hahn
60359f56aa
Revert "[IPSCCP] Enable specialization of functions."
This reverts commit 2656572d485127cc30b8fe9752024d2a0f1c50db.

It looks like CINT2017rate/502.gcc_r gets mis-compiled with LTO + PGO on
AArch64 with function specialization.
2022-12-26 16:02:59 +00:00
Alexandros Lamprineas
2656572d48 [IPSCCP] Enable specialization of functions.
This patch enables Function Specialization by default at all
optimization levels except Os, Oz.

Compilation Time Overhead:
--------------------------
Measured the Instruction Count increase (Geomean) for CTMark from
the llvm-testsuite as in https://llvm-compile-time-tracker.com.
 * {-O3, Non-LTO}: +0.136% Instruction Count
 * {-O3, LTO}: +0.346% Instruction Count

Performance Uplift:
-------------------
Measured +9.121% score increase for 505.mcf_r from SPEC Int 2017
(Tested on Neoverse N1 with -O3 + LTO)

Correctness Testing:
--------------------
 * Passes bootstrap Clang with ASAN + LTO + FuncSpec aggressive options:
   { MaxClonesThreshold=10,
     SmallFunctionThreshold=10,
     AvgLoopIterationCount=30,
     SpecializeOnAddresses=true,
     EnableSpecializationForLiteralConstant=true,
     FuncSpecializationMaxIters=10 }
 * Builds Chromium and passes its unittests with the above options + ThinLTO.

For more info please refer to
https://discourse.llvm.org/t/rfc-should-we-enable-function-specialization/61518

Differential Revision: https://reviews.llvm.org/D140210
2022-12-25 10:05:21 +02:00
Joshua Cranmer
e6b02214c6 [IR] Add a target extension type to LLVM.
Target-extension types represent types that need to be preserved through
optimization, but otherwise are not introspectable by target-independent
optimizations. This patch doesn't add any uses of these types by an existing
backend, it only provides basic infrastructure such that these types would work
correctly.

Reviewed By: nikic, barannikov88

Differential Revision: https://reviews.llvm.org/D135202
2022-12-20 11:02:11 -05:00
Mark de Wever
d40dc41738 [CMake] Warn when the version is older than 3.20.0.
This is a preparation to require CMake 3.20.0 after LLVM 16 has been
released.

This change has been discussed on discourse
https://discourse.llvm.org/t/rfc-upgrading-llvms-minimum-required-cmake-version/66193

Reviewed By: #libc_vendors, MaskRay, ChuanqiXu, to268, thieta, stellaraccident, ldionne, #libc, #libc_abi, phosek

Differential Revision: https://reviews.llvm.org/D137724
2022-12-11 20:19:46 +01:00
Sjoerd Meijer
8250180238 Revert "Recommit "[LoopFlatten] Enable it by default""
This reverts commit 3ea6a9a469fde168c527b1c34c09f6d684ec86af because of the
reported miscompilation in: https://github.com/llvm/llvm-project/issues/59339
2022-12-05 15:14:12 +00:00
Nikita Popov
2796182418 [llvm-c] Remove C API functions that are incompatible with opaque pointers
These deprecated functions are incompatible with opaque pointers,
and have replacements that accept an explicit type. Drop them now
as a final warning to consumers of the C API to migrate their code
(while LLVMGetElementType still exists as a temporary workaround).

Differential Revision: https://reviews.llvm.org/D135271
2022-12-05 08:47:12 +01:00
Sjoerd Meijer
3ea6a9a469 Recommit "[LoopFlatten] Enable it by default"
The problem in 58441 that was reported after enabling this last time was fixed
in 8e9e22f07bcbe2ee95478684cf31948370e4e51e.
2022-11-29 10:45:13 +00:00
Keith Walker
00d98e6572 [AArch64] RME MEC instructions and system registers
This patch adds assembler/disassembler support for
RME MEC (Memory Encryption Contexts).

Cache maintence instructions added:
- DC CIPAPA
- DC CIGDPAPA

System registers added:
- MECIDR_EL2
- MECID_P0_EL2
- MECID_A0_EL2
- MECID_P1_EL2
- MECID_A1_EL2
- VMECID_P_EL2
- VMECID_A_EL2
- MECID_RL_A_EL3

Differential Revision: https://reviews.llvm.org/D137431
2022-11-10 14:05:12 +00:00
Victor Campos
9d1ff787e5 [AArch64] Add support for the Cortex-X3 CPU
Cortex-X3 is an Armv9-A AArch64 CPU.

This patch introduces support for Cortex-X3.

Technical Reference Manual: https://developer.arm.com/documentation/101593/latest

Reviewed By: dmgreen

Differential Revision: https://reviews.llvm.org/D136589
2022-11-09 11:33:48 +00:00
Freddy Ye
84a18a260e [X86] Support -march=sierraforest, grandridge, graniterapids.
Reviewed By: skan, pengfei, MaskRay

Differential Revision: https://reviews.llvm.org/D137153
2022-11-09 16:56:03 +08:00
Nikita Popov
304f1d59ca [IR] Switch everything to use memory attribute
This switches everything to use the memory attribute proposed in
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
The old argmemonly, inaccessiblememonly and inaccessiblemem_or_argmemonly
attributes are dropped. The readnone, readonly and writeonly attributes
are restricted to parameters only.

The old attributes are auto-upgraded both in bitcode and IR.
The bitcode upgrade is a policy requirement that has to be retained
indefinitely. The IR upgrade is mainly there so it's not necessary
to update all tests using memory attributes in this patch, which
is already large enough. We could drop that part after migrating
tests, or retain it longer term, to make it easier to import IR
from older LLVM versions.

High-level Function/CallBase APIs like doesNotAccessMemory() or
setDoesNotAccessMemory() are mapped transparently to the memory
attribute. Code that directly manipulates attributes (e.g. via
AttributeList) on the other hand needs to switch to working with
the memory attribute instead.

Differential Revision: https://reviews.llvm.org/D135780
2022-11-04 10:21:38 +01:00
Freddy Ye
a806fc2767 [X86] Support -march=raptorlake, meteorlake
Reviewed By: pengfei, skan, MaskRay

Differential Revision: https://reviews.llvm.org/D135937
2022-11-04 09:32:17 +08:00
Simi Pallipurath
fa8aeab606 [AArch64] Add support for the Cortex-A715 CPU
Cortex-A715 is an Armv9-A AArch64 CPU.

This patch introduces support for Cortex-A715.

Technical Reference Manual: https://developer.arm.com/documentation/101590/latest.

Reviewed By: vhscampos

Differential Revision: https://reviews.llvm.org/D136957
2022-11-03 09:28:46 +00:00
Freddy Ye
aee2a35ac4 [X86] Add AVX-NE-CONVERT instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei

Differential Revision: https://reviews.llvm.org/D135930
2022-10-31 23:39:38 +08:00
Daniel Thornburgh
cc2457ca1b [llvm-objdump] Set --print-imm-hex by default.
This was previously attempted in 2016 by colinl's D18770, but LLD tests
were missed, which caused the change to be reverted.

Setting --print-imm-hex by default brings llvm-objdump's behavior closer
in line with objdump, and it makes it easier to read addresses and
alignment from the disassembly. It may make non-address immediates
harder to interpret, but it still seems the better default, barring more
context-sensitive base selection logic.

Differential Revision: https://reviews.llvm.org/D136972
2022-10-30 13:36:18 -07:00
Craig Topper
974e2e690b [RISCV] Adjust RV64I data layout by using n32:64 in layout string
Although i32 type is illegal in the backend, RV64I has pretty good support for i32 types by using W instructions.

By adding n32 to the DataLayout string, middle end optimizations will consider i32 to be a native type. One known effect of this is enabling LoopStrengthReduce on loops with i32 induction variables. This can be beneficial because C/C++ code often has loops with i32 induction variables due to the use of `int` or `unsigned int`.

If this patch exposes performance issues, those are better addressed by tuning LSR or other passes.

Reviewed By: asb, frasercrmck

Differential Revision: https://reviews.llvm.org/D116735
2022-10-28 08:27:03 -07:00
Kadir Cetinkaya
b999ac1af6
[llvm] Fix minimum Apple Clang requirement
This was stated as 9.3, but as pointed out in
https://discourse.llvm.org/t/rfc-bump-minimal-requirements-apple-clang-9-3-10-0-0-before-4th-tue-in-january/66156/7?u=kadircet
9.3 doesn't exist, hence this was effectively 10.0.

This patch merely reflects the reality more closely.

Differential Revision: https://reviews.llvm.org/D136609
2022-10-28 15:12:24 +02:00
Freddy Ye
23f02693ec [X86] Add AVX-VNNI-INT8 instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135938
2022-10-28 10:39:54 +08:00
Freddy Ye
0e720e6ada [X86] Add AVX-IFMA instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135932
2022-10-28 09:42:30 +08:00
David Spickett
c6e2de6042 [LLVM] Use DWARFv4 bitfields when tuning for GDB
GDB implemented data_bit_offset in https://sourceware.org/bugzilla/show_bug.cgi?id=12616
which has been present since GDB 8.0.

GCC started using it at GCC 11.

Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D135583
2022-10-27 08:51:07 +00:00
Freddy Ye
fdac4c4e92 [X86] Add CMPCCXADD instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: pengfei, skan

Differential Revision: https://reviews.llvm.org/D135933
2022-10-25 14:33:39 +08:00
Xiang1 Zhang
661881d436 [X86] Add AMX-FP16 instructions.
Differential Revision: https://reviews.llvm.org/D135941
2022-10-22 08:05:22 +08:00
Freddy Ye
3ee58e2f35 [X86] Add WRMSRNS instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D135935
2022-10-19 13:04:11 +08:00
Freddy Ye
e3df4ba9d2 [X86] Add MSRLIST instructions.
For more details about these instructions, please refer to the latest ISE document: https://www.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Reviewed By: skan, RKSimon

Differential Revision: https://reviews.llvm.org/D135934
2022-10-19 10:35:42 +08:00
Sjoerd Meijer
f7c42a278b Revert "Recommit "[LoopFlatten] Enable it by default""
This reverts commit 5b9597f59a445523bd59b5251ab1c2865e74919f.

A miscompilation was reported:
https://github.com/llvm/llvm-project/issues/58441

Reverting this while I look at that.
2022-10-18 23:36:36 +05:30
Sjoerd Meijer
5b9597f59a Recommit "[LoopFlatten] Enable it by default"
The sanitizer bots turned green again after another change went in, i.e.
revert 26dd64ba9cfabe5474bb207f3b7099965f81fed7, so I don't think this
patch was causing the problems.
2022-10-17 23:27:19 +05:30
Sjoerd Meijer
a71c4e4fbb Revert "[LoopFlatten] Enable it by default"
This reverts commit 233659c7ae9b83b64a9f739d340736bca39c3d2e.

I see some sanitizer build bot failures. Not sure if it is change
causing it, but let's see if a revert returns the bots to green...
2022-10-17 22:14:20 +05:30
Paul Kirth
aa123b8c09 [llvm-readobj] Improve JSON output
The current implementation outputs JSON in the following way:

[{'<filename>':{'FileSummary':{},...}}]

Using the filename as a key makes processing the JSON data awkward, and
should be avoided. This patch removes that outer key, since the
'FileSummary' data also includes a 'File' field, and so we lose no data.

Reviewed By: jhenderson, leonardchan

Differential Revision: https://reviews.llvm.org/D134843
2022-10-17 16:42:57 +00:00
Sjoerd Meijer
233659c7ae [LoopFlatten] Enable it by default
LoopFlatten has been in the code base off by default for years, but this
enables it to run by default. Downstream this has been running for
years, so it has been exposed to quite some code. Then around the time
we switched to the NPM, several fixes went in related to updating the
MemorySSA state and we moved it to a loop pass manager, which both
helped preventing rerunning certain analysis passes, and thus helped a
bit with compile-times.

About compile-times, adding a pass isn't free, but this should see only
very minor increases. The pass is relatively simple and there shouldn't
be anything algorithmically expensive because all it does is looking at
inner/outer loops and it checks assumptions on loop increments and
indices. If we see increases, I expect this to mainly come from
invalidation of analysis info, and perhaps subsequent passes to trigger
and do more. Despite its simplicity/restrictions, it triggers in most
code-bases, which makes it worth to enable this by default.

Differential Revision: https://reviews.llvm.org/D109958
2022-10-17 17:11:39 +05:30
Chuanqi Xu
1cedc51ff5 [Coroutines] Don't merge readnone calls in presplit coroutines
Another alternative to fix the thread identification problem in
coroutines.

We plan to fix this problem by unifying memory effecting attributes. See
https://discourse.llvm.org/t/rfc-unify-memory-effect-attributes/65579.
But it may be a long-term project. And it is a pity that the coroutines
can't resume in different threads for years. So this one is temporary
fix. It may cause unnecessary performance regression for coroutines. But
correctness are more important. And this one is planned to be reverted
after we are able to unify the memory effecting attributes actually.

Reviewed By: jdoerfert, rjmccall

Differential Revision: https://reviews.llvm.org/D135550
2022-10-17 10:22:43 +08:00
Craig Topper
cde3de5381 [RISCV] Remove a few remnants of Zbr I misssed. 2022-09-23 21:21:51 -07:00
Craig Topper
52708be182 [RISCV] Remove support for the unratified Zbe, Zbf, and Zbm extensions.
These extensions do not appear to be on their way to ratification.
2022-09-22 13:04:41 -07:00
Craig Topper
182aa0cbe0 [RISCV] Remove support for the unratified Zbp extension.
This extension does not appear to be on its way to ratification.

Still need some follow up to simplify the RISCVISD nodes.
2022-09-21 21:22:42 -07:00
Craig Topper
70a64fe7b1 [RISCV] Remove support for the unratified Zbt extension.
This extension does not appear to be on its way to ratification.

Out of the unratified bitmanip extensions, this one had the
largest impact on the compiler.

Posting this patch to start a discussion about whether we should
remove these extensions. We'll talk more at the RISC-V sync meeting this
Thursday.

Reviewed By: asb, reames

Differential Revision: https://reviews.llvm.org/D133834
2022-09-20 20:26:48 -07:00
David Spickett
e428baf001 [LLVM][ARM] Remove options for armv2, 2A, 3 and 3M
Fixes #57486

These pre v4 architectures are not specifically supported
by codegen. As demonstrated in the linked issue.

GCC has not supported 3M since GCC 9 and presumably
2 and 2A earlier than that. So we are aligned in that sense.

(see https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/commit/?id=2abd6e34fcf3bd9f9ffafcaa47cdc3ed443f9add)

This removes the options and associated testing.

The Pre_v4 build attribute remains mainly because its absence
would be more confusing. It will not be used other than to
complete the list of build attributes as shown in the ABI.

https://github.com/ARM-software/abi-aa/blob/main/addenda32/addenda32.rst#3352the-target-related-attributes

Reviewed By: nickdesaulniers, peter.smith, rengolin

Differential Revision: https://reviews.llvm.org/D133109
2022-09-08 09:49:48 +00:00
Nikita Popov
96cb7c2273 [ConstantExpr] Remove fneg expression
As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179,
this removes the fneg constant expression (which is, incidentally,
the only unary operator expression).

Differential Revision: https://reviews.llvm.org/D133418
2022-09-08 10:24:55 +02:00
Martin Storsjö
c5b3de6745 [COFF] Emit embedded -exclude-symbols: directives for hidden visibility for MinGW
This works with the automatic export of all symbols; in MinGW mode,
when a DLL has no explicit dllexports, it exports all symbols (except
for some that are hardcoded to be excluded, including some toolchain
libraries).

By hooking up the hidden visibility to the -exclude-symbols: directive,
the automatic export of all symbols can be controlled in an easier
way (with a mechanism that doesn't require strict annotation of every
single symbol, but which allows gradually marking more unnecessary
symbols as hidden).

The primary use case is dylib builds of LLVM/Clang. These can be done
in MinGW mode but not in MSVC mode, as MinGW builds can export all
symbols (and the calling code can use APIs without corresponding
dllimport directives). However, as all symbols are exported, it can
easily overflow the max number of exported symbols in a DLL (65536).

In the llvm-mingw distribution, only the X86, ARM and AArch64 backends
are enabled; for the LLVM 13.0.0 release, libLLVM-13.dll ended up with
58112 exported symbols. For LLVM 14.0.0, it was 62015 symbols. Current
builds of the 15.x branch end up at around 64650 symbols - i.e. extremely
close to the limit.

The msys2 packages of LLVM have had to progressively disable more
of their backends in their builds, to be able to keep building with a
dylib.

This allows improving the current mingw dylib situation significantly,
by using the same hidden visibility options and attributes as on Unix.
With those in place, a current build of LLVM git main ends up at 35142
symbols instead of 64650.

For code using hidden visibility, this now requires linking with either
a current git lld or ld.bfd. (Older lld error out on the unknown
directives, older ld.bfd will successfully link, but will print huge
amounts of warnings.)

Differential Revision: https://reviews.llvm.org/D130121
2022-08-11 12:00:08 +03:00
Tobias Hieta
b1356504e6
[LLVM] Update C++ standard to 17
Also make the soft toolchain requirements hard. This allows
us to use C++17 features in LLVM now.

If we find patterns with C++17 that improve readability
it should be recommended in the coding standards.

Reviewed By: jhenderson, cor3ntin, MaskRay

Differential Revision: https://reviews.llvm.org/D130689
2022-08-06 09:42:10 +02:00
tlattner
520d29f381 Update references to mailing lists that have moved to Discourse. 2022-07-28 16:54:58 -07:00
Tom Stellard
809855b56f Bump the trunk major version to 16 2022-07-26 21:34:45 -07:00
David Spickett
2f9fa9ef53 [lldb][AArch64] Add support for memory tags in core files
This teaches ProcessElfCore to recognise the MTE tag segments.

https://www.kernel.org/doc/html/latest/arm64/memory-tagging-extension.html#core-dump-support

These segments contain all the tags for a matching memory segment
which will have the same size in virtual address terms. In real terms
it's 2 tags per byte so the data in the segment is much smaller.

Since MTE is the only tag type supported I have hardcoded some
things to those values. We could and should support more formats
as they appear but doing so now would leave code untested until that
happens.

A few things to note:
* /proc/pid/smaps is not in the core file, only the details you have
  in "maps". Meaning we mark a region tagged only if it has a tag segment.
* A core file supports memory tagging if it has at least 1 memory
  tag segment, there is no other flag we can check to tell if memory
  tagging was enabled. (unlike a live process that can support memory
  tagging even if there are currently no tagged memory regions)

Tests have been added at the commands level for a core file with
mte and without.

There is a lot of overlap between the "memory tag read" tests here and the unit tests for
MemoryTagManagerAArch64MTE::UnpackTagsFromCoreFileSegment, but I think it's
worth keeping to check ProcessElfCore doesn't cause an assert.

Depends on D129487

Reviewed By: omjavaid

Differential Revision: https://reviews.llvm.org/D129489
2022-07-26 08:46:36 +01:00
Nikita Popov
5102084787 [Docs] Add release notes for opaque pointers (NFC) 2022-07-22 14:14:03 +02:00
Nikita Popov
2a721374ae [IR] Don't use blockaddresses as callbr arguments
Following some recent discussions, this changes the representation
of callbrs in IR. The current blockaddress arguments are replaced
with `!` label constraints that refer directly to callbr indirect
destinations:

    ; Before:
    %res = callbr i8* asm "", "=r,r,i"(i8* %x, i8* blockaddress(@test8, %foo))
    to label %asm.fallthrough [label %foo]
    ; After:
    %res = callbr i8* asm "", "=r,r,!i"(i8* %x)
    to label %asm.fallthrough [label %foo]

The benefit of this is that we can easily update the successors of
a callbr, without having to worry about also updating blockaddress
references. This should allow us to remove some limitations:

* Allow unrolling/peeling/rotation of callbr, or any other
  clone-based optimizations
  (https://github.com/llvm/llvm-project/issues/41834)
* Allow duplicate successors
  (https://github.com/llvm/llvm-project/issues/45248)

This is just the IR representation change though, I will follow up
with patches to remove limtations in various transformation passes
that are no longer needed.

Differential Revision: https://reviews.llvm.org/D129288
2022-07-15 10:18:17 +02:00
Tom Stellard
5b0788fef8 Remove left over merge marker from 4b1e3d19370694dd2b2c04a5945f3f9e43917456 2022-07-14 14:51:44 -07:00
Tom Stellard
4b1e3d1937 [gold] Ignore bitcode from sections inside object files
-fembed-bitcode will put bitcode into special sections within object
files, but this is not meant to be used by LTO, so the gold plugin
should ignore it.

https://github.com/llvm/llvm-project/issues/47216

Reviewed By: tejohnson, MaskRay

Differential Revision: https://reviews.llvm.org/D116995
2022-07-14 14:46:15 -07:00
Nikita Popov
4bb7b6fae3 [IR] Remove support for float binop constant expressions
As part of https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179,
this removes support for the floating-point binop constant expressions
fadd, fsub, fmul, fdiv and frem.

As part of this change, the C APIs LLVMConstFAdd, LLVMConstFSub,
LLVMConstFMul, LLVMConstFDiv and LLVMConstFRem are removed.
The LLVMBuild APIs should be used instead.

Differential Revision: https://reviews.llvm.org/D129478
2022-07-12 09:40:49 +02:00
Xiang1 Zhang
a45dd3d814 [X86] Support -mstack-protector-guard-symbol
Reviewed By: nickdesaulniers

Differential Revision: https://reviews.llvm.org/D129346
2022-07-12 10:17:00 +08:00