17 Commits

Author SHA1 Message Date
David Sherwood
bdc0afc871
[CodeGen][AArch64] Set min jump table entries to 13 for AArch64 targets (#71166)
There are some workloads that are negatively impacted by using jump
tables when the number of entries is small. The SPEC2017 perlbench
benchmark is one example of this, where increasing the threshold to
around 13 gives a ~1.5% improvement on neoverse-v1. I chose the minimum
threshold based on empirical evidence rather than science, and just
manually increased the threshold until I got the best performance
without impacting other workloads. For neoverse-v1 I saw around ~0.2%
improvement in the SPEC2017 integer geomean, and no overall change for
neoverse-n1. If we find issues with this threshold later on we can
always revisit this.

The most significant SPEC2017 score changes on neoverse-v1 were:

500.perlbench_r: +1.6%
520.omnetpp_r: +0.6%

and the rest saw changes < 0.5%.

I updated CodeGen/AArch64/min-jump-table.ll to reflect the new
threshold. For most of the affected tests I manually set the min number
of entries back to 4 on the RUN line because the tests seem to rely upon
this behaviour.
2023-11-14 13:00:28 +00:00
Daniel Hoekwater
0982d96186 [CodeGen][AArch64] Don't split inline asm goto blocks or their targets
Machine function splitting + branch relaxation currently don't properly
handle inline asm goto blocks that conditional branch to cold goto
labels. While such inline asm is technically invalid, machine
function splitting is the only thing that exposes it as such.

Since machine function splitting doesn't help too much in these
circumstances anyway, disable it for asm goto blocks and their targets.

Differential Revision: https://reviews.llvm.org/D158647
2023-08-29 20:24:38 +00:00
Daniel Hoekwater
ef1c25eb50 [CodeGen][AArch64] Don't split jump table basic blocks
Jump tables on AArch64 are label-relative rather than table-relative, so
having jump table destinations that are in different sections causes
problems with relocation. Jump table lookups have a max range of 1MB, so
all destinations must be in the same section as the lookup code. Both of
these restrictions can be mitigated with some careful and complex logic,
but doing so doesn't gain a huge performance benefit.

Efficiently ensuring jump tables are correct and can be compressed on
AArch64 is a TODO item. In the meantime, don't split blocks that can
cause problems.

Differential Revision: https://reviews.llvm.org/D157124
2023-08-28 21:47:57 +00:00
Snehasish Kumar
3dbabeadd6 [CodeGen] Remove unused option in MachineFunctionSplitter.
The option was added in github.com/llvm/llvm-project/commit/90ab85a but it doesn't seem to be used. The triple check has been removed so this shouldn't be required going forward.

Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D158885
2023-08-25 21:24:28 +00:00
Daniel Hoekwater
8c249c44d4 [CodeGen][AArch64] Don't split functions with a red zone on AArch64
Because unconditional branch relaxation on AArch64 grows the stack to
spill a register, splitting a function would cause the red zone to be
overwritten. Explicitly disable MFS for such functions.

Differential Revision: https://reviews.llvm.org/D157127
2023-08-24 21:57:35 +00:00
Daniel Hoekwater
c9f328844d Reland "[CodeGen] Fix unconditional branch duplication issue in bbsections"
Reverted in 4c8d056f50342d5401f5930ed60e5e48b211c3fb because it broke
buildbot `llvm-clang-x86_64-expensive-checks-debian` due to the AArch64
test generating invalid code. The issue still exists, but it's fixed in
D156767, so the AArch64 test should be added there.

Differential Revision: https://reviews.llvm.org/D158674
2023-08-24 21:27:55 +00:00
Daniel Hoekwater
4c8d056f50 Revert "[CodeGen] Fix unconditional branch duplication issue in bbsections"
This reverts commit 994eb5adc40cd001d82d0f95d18d1827b57e496c.
Breaks buildbot `llvm-clang-x86_64-expensive-checks-debian`
https://lab.llvm.org/buildbot/#/builders/16/builds/53620
2023-08-24 16:59:17 +00:00
Daniel Hoekwater
994eb5adc4 [CodeGen] Fix unconditional branch duplication issue in bbsections
If an end section basic block ends in an unconditional branch to its
fallthrough, BasicBlockSections will duplicate the unconditional branch.
This doesn't break x86, but it is a (slight) size optimization and more
importantly prevents AArch64 builds from breaking.

Ex:
```
bb1 (bbsections Hot):
  jmp bb2

bb2 (bbsections Cold):
  /* do work... */
```

After running sortBasicBlocksAndUpdateBranches():
```
bb1 (bbsections Hot):
  jmp bb2
  jmp bb2

bb2 (bbsections Cold):
  /* do work... */
```

Differential Revision: https://reviews.llvm.org/D158674
2023-08-24 16:22:55 +00:00
Daniel Hoekwater
90ab85a1b2 Reland "[CodeGen][AArch64] Make MFS testable on AArch64"
Reverted by 3d22dac6c3b97d7bb92f243886dfb0d32a5c42e9 because it depended
on b9d079d6188b50730e0a67267b7fee36008435ce, which broke some tests.
2023-08-22 20:21:33 +00:00
Fangrui Song
77596e6b16 Revert D157750 "[Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary compilation."
This reverts commit 317a0fe5bd7113c0ac9d30b2de58ca409e5ff754.
This reverts commit 30c4b97aec60895a6905816670f493cdd1d7c546.

See post-commit discussions on https://reviews.llvm.org/D157750 that
we should use a different mechanism to handle the error with --cuda-gpu-arch=

The IR/DiagnosticInfo.cpp, warn_drv_for_elf_only, codegne tests in
clang/test/Driver, and the following driver behavior (downgrading error
to warning) changes are undesired.
```
% clang --target=riscv64 -fsplit-machine-functions -c a.c
warning: -fsplit-machine-functions is not valid for riscv64 [-Wbackend-plugin]
```
2023-08-21 13:54:15 -07:00
Nico Weber
3d22dac6c3 Revert "[clang][test] Refine clang machine-function-split tests."
This reverts commit b9d079d6188b50730e0a67267b7fee36008435ce.
Breaks tests on Windows, see https://reviews.llvm.org/D157565#4600939
2023-08-20 10:38:29 -04:00
Han Shen
b9d079d618 [clang][test] Refine clang machine-function-split tests.
This CL includes two changes:
1. moved clang backend-warnings test cases from Driver/ to CodeGen/.
2. removed multiple `cd "$(dirname "%t")"` and replaced with `-o %t`.

Reviewed By: maskray (Fangrui Song)
Differential Revision: https://reviews.llvm.org/D157565
2023-08-18 18:05:47 -07:00
Jonas Hahnfeld
eeac4321c5 Disable two tests without {arm,aarch64}-registered-target 2023-08-17 10:04:38 +02:00
Han Shen
317a0fe5bd [Driver][CodeGen] Properly handle -fsplit-machine-functions for fatbinary compilation.
When building a fatbinary, the driver invokes the compiler multiple
times with different "--target". (For example, with "-x cuda
--cuda-gpu-arch=sm_70" flags, clang will be invoded twice, once with
--target=x86_64_...., once with --target=sm_70) If we use
-fsplit-machine-functions or -fno-split-machine-functions for such
invocation, the driver reports an error.

This CL changes the behavior so:

  - "-fsplit-machine-functions" is now passed to all targets, for non-X86
    targets, the flag is a NOOP and causes a warning.

  - "-fno-split-machine-functions" now negates -fsplit-machine-functions (if
    -fno-split-machine-functions appears after any -fsplit-machine-functions)
    for any target triple, previously, it causes an error.

  - "-fsplit-machine-functions -Xarch_device -fno-split-machine-functions"
    enables MFS on host but disables MFS for GPUS without warnings/errors.

  - "-Xarch_host -fsplit-machine-functions" enables MFS on host but disables
    MFS for GPUS without warnings/errors.

Reviewed by: xur, dhoekwater

Differential Revision: https://reviews.llvm.org/D157750
2023-08-16 23:41:34 -07:00
Daniel Hoekwater
2c43d591c6 [CodeGen] Move function splitting tests from X86 to Generic (NFC)
Machine function splitting will become available for AArch64; since MFS
is no longer X86-only, the tests for generic behavior should live
somewhere other than tests/CodeGen/X86.

MFS implementation doesn't vary much across platforms, and most tests
should be identical between X86 and AArch64 besides instruction
selection, so the tests can live together in tests/CodeGen/Generic.

Differential Revision: https://reviews.llvm.org/D157563
2023-08-16 18:11:23 +00:00
Daniel Hoekwater
e8540723b3 Revert "[CodeGen] Move function splitting tests from X86 to Generic (NFC)"
This reverts commit 1670e0ea076b12b3fcea2ea63f01bc09e0e7f3b2.
Causes https://lab.llvm.org/buildbot/#/builders/188/builds/33943
2023-08-16 01:46:35 +00:00
Daniel Hoekwater
1670e0ea07 [CodeGen] Move function splitting tests from X86 to Generic (NFC)
Machine function splitting will become available for AArch64; since MFS
is no longer X86-only, the tests for generic behavior should live
somewhere other than tests/CodeGen/X86.

MFS implementation doesn't vary much across platforms, and most tests
should be identical between X86 and AArch64 besides instruction
selection, so the tests can live together in tests/CodeGen/Generic.

Differential Revision: https://reviews.llvm.org/D157563
2023-08-16 01:25:54 +00:00