35203 Commits

Author SHA1 Message Date
Kazu Hirata
71ca5c54a2 [CodeGen] Use a range-based for loop with llvm::predecessors (NFC) 2024-01-19 22:24:11 -08:00
paperchalice
ab0d8fc4a6
Reland "[CodeGen] Support start/stop in CodeGenPassBuilder (#70912)" (#78570)
Unfortunately the legacy pass system can't recognize `no-op-module` and
`no-op-function` so it causes test failure in `CodeGenTests`. Add a
workaround in function `PassInfo *getPassInfo(StringRef PassName)`,
`TargetPassConfig.cpp`.
2024-01-20 08:38:22 +08:00
Danila Malyutin
0388ab3e29
[Statepoint][NFC] Use uint16_t and add an assert (#78717)
Use a fixed width integer type and assert that DwarRegNum fits the 16
bits.

This is a follow up to review comments on #78600.
2024-01-20 00:44:00 +03:00
Felipe de Azevedo Piovezan
b6677835fe
[AsmPrinter][DebugNames] Implement DW_IDX_parent entries (#77457)
This implements the ideas discussed in [1].

To summarize, this commit changes AsmPrinter so that it outputs
DW_IDX_parent information for debug_name entries. It will enable
debuggers to speed up queries for fully qualified types (based on a
DWARFDeclContext) significantly, as debuggers will no longer need to
parse the entire CU in order to inspect the parent chain of a DIE.
Instead, a debugger can simply take the parent DIE offset from the
accelerator table and peek at its name in the debug_info/debug_str
sections.

The implementation uses two types of DW_FORM for the DW_IDX_parent
attribute:

1. DW_FORM_ref4, which points to the accelerator table entry for the
parent.
2. DW_FORM_flag_present, when the entry has a parent that is not in the
table (that is, the parent doesn't have a name, or isn't allowed to be
in the table as per the DWARF spec). This is space-efficient, since it
takes 0 bytes.

The implementation works by:

1. Changing how abbreviations are encoded (so that they encode which
form, if
any, was used to encode IDX_Parent)
2. Creating an MCLabel per accelerator table entry, so that they may be
referred by IDX_parent references.


When all patches related to this are merged, we are able to show that
evaluating an expression such as:

```
lldb --batch -o 'b CodeGenFunction::GenerateCode' -o run -o 'expr Fn' -- \
  clang++ -c -g test.cpp -o /dev/null
```

is far faster: from ~5000 ms to ~1500ms.

Building llvm-project + clang with and without this patch, and looking
at its impact on object file size:

```
ls -la $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,507,327,592

-la $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | awk '{s+=$5}  END {printf "%\047d\n", s}'
11,436,446,616
```

That is, an increase of 0.62% in total object file size.

Looking only at debug_names:

```
$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
440,772,348

$stage1_build/bin/llvm-objdump --section-headers $(find build_stage2_Debug_no_idx_parent_assert_dwarf5 -name \*.cpp.o) | grep __debug_names | awk '{s+="0x"$3}  END {printf "%\047d\n", s}'
369,867,920
```

That is an increase of 19%.

DWARF Linkers need to be changed in order to support this. This commit
already brings support to "base" linker, but it does not attempt to
modify the parallel linker. Accelerator entries refer to the
corresponding DIE offset, and this patch also requires the parent DIE
offset -- it's not clear how the parallel linker can access this. It may
be obvious to someone familiar with it, but it would be nice to get help
from its authors.

[1]:
https://discourse.llvm.org/t/rfc-improve-dwarf-5-debug-names-type-lookup-parsing-speed/74151/
2024-01-19 09:19:09 -08:00
Danila Malyutin
9ad7d8f0e4
[Statepoint] Optimize Location structure size (#78600)
Reduce its size from 24 to 12 bytes. Improves memory consumption when
dealing with statepoint-heavy code.
2024-01-19 17:15:36 +04:00
Philip Reames
0fc5f4b524
[DAG] Set nneg flag when forming zext in demanded bits (#72281)
We do the same for the analogous transform in DAGCombine, but this case
was missed in the recent patch which added support for zext nneg.

Sorry for the lack of test coverage. Not sure how to exercise this piece
of logic. It appears to have only minimal impact on LIT tests (only
test/CodeGen/X86/wide-scalar-shift-by-byte-multiple-legalization.ll),
and even then, the changes without it appear uninteresting. Maybe we
should remove this transform instead?
2024-01-18 07:34:08 -08:00
Haohai Wen
fb2c6bbf42
[BranchFolding] Use isSuccessor to confirm fall through (#77923)
When merging blocks, if the previous block has no any branch instruction
and has one successor, the successor may be SEH landing pad and the
block will always raise exception and nerver fall through to next block.
We can not merge them in such case. isSuccessor should be used to
confirm it can fall through to next block.
2024-01-18 23:26:22 +08:00
paperchalice
a48c1bda74
Revert "[CodeGen] Support start/stop in CodeGenPassBuilder" (#78567)
Reverts llvm/llvm-project#70912. This breaks some bazel tests.
2024-01-18 20:09:53 +08:00
Florian Hahn
40d952b874
[CGP] Avoid replacing a free ext with multiple other exts. (#77094)
Replacing a free extension with 2 or more extensions unnecessarily
increases the number of IR instructions without providing any benefits.
It also unnecessarily causes operations to be performed on wider types
than necessary.

In some cases, the extra extensions also pessimize codegen (see
bfis-in-loop.ll).

The changes in arm64-codegen-prepare-extload.ll also show that we avoid
promotions that should only be performed in stress mode.

PR: https://github.com/llvm/llvm-project/pull/77094
2024-01-18 10:48:10 +00:00
Mikael Holmen
c3cc09bdf8 [AsmPrinter] Fix gcc -Wparentheses warning [NFC]
Without this gcc warned
 ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:3585:70: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
  3584 |            ((&Current == &AccelDebugNames) &&
       |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  3585 |             (Unit.getUnitDie().getTag() != dwarf::DW_TAG_type_unit)) &&
       |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
  3586 |                "Kind is CU but TU is being processed.");
       |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
 ../lib/CodeGen/AsmPrinter/DwarfDebug.cpp:3589:70: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
  3588 |            ((&Current == &AccelTypeUnitsDebugNames) &&
       |            ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  3589 |             (Unit.getUnitDie().getTag() == dwarf::DW_TAG_type_unit)) &&
       |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
  3590 |                "Kind is TU but CU is being processed.");
       |                ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2024-01-18 08:37:30 +01:00
Matt Arsenault
c8007f9047
DAG: Fix chain mismanagement in SoftenFloatRes_FP_EXTEND (#74558) 2024-01-18 14:32:44 +07:00
paperchalice
baaf0c968e
[CodeGen] Support start/stop in CodeGenPassBuilder (#70912)
Add `-start/stop-before/after` support for CodeGenPassBuilder.
Part of #69879.
2024-01-18 14:54:56 +08:00
paperchalice
bd9e14574a
[CodeGen] Port GlobalMerge to new pass manager (#77474) 2024-01-18 12:07:46 +07:00
Matt Arsenault
11bf02e019
DAG: Fix ABI lowering with FP promote in strictfp functions (#74405)
This was emitting non-strict casts in ABI contexts for illegal
types.
2024-01-18 10:57:53 +07:00
Thorsten Schütt
67dc6e9075
[GlobalIsel][AArch64] more legal icmps (#78239)
In https://github.com/llvm/llvm-project/pull/78181 the godbolt
(https://llvm.godbolt.org/z/vMsnxMf1v) crashed with GlobalIsel.

LLVM ERROR: unable to legalize instruction: %90:_(<3 x s32>) = G_ICMP
intpred(uge), %15:_(<3 x s32>), %0:_ (in function: vec3_i32)
2024-01-17 22:23:51 +01:00
Michael Maitland
b1ae461a53 [CodeGen][MISched][NFC] Rename some instances of Cycle -> ReleaseAtCycle
This is to match the naming of arguments in MachineScheduler.h
2024-01-17 12:07:42 -08:00
Petar Avramovic
90bdf76fdb
Revert "AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis" (#78468)
Reverts llvm/llvm-project#76145
2024-01-17 17:41:19 +01:00
Simon Pilgrim
d92ce344bf Revert faecc736e2ac3cd8c77 #74443 [DAG] isSplatValue - node is a splat if all demanded elts have the same whole constant value (#74443)
Relying on ComputeKnownBits to find a splat is causing miscompilations where a shift of zero is being assumed to give zero, but further simplification leads to a shift of zero by undef, resulting in an unexpected undef value.

Fixes #78109
2024-01-17 15:59:33 +00:00
Petar Avramovic
1fbf533286
AMDGPU/GlobalISelDivergenceLowering: select divergent i1 phis (#76145)
Implement PhiLoweringHelper for GlobalISel in DivergenceLoweringHelper.
Use machine uniformity analysis to find divergent i1 phis and select
them as lane mask phis in same way SILowerI1Copies select VReg_1 phis.
Note that divergent i1 phis include phis created by LCSSA and all cases
of uses outside of cycle are actually covered by "lowering LCSSA phis".
GlobalISel lane masks are registers with sgpr register class and S1 LLT.

TODO: General goal is that instructions created in this pass are fully
instruction-selected so that selection of lane mask phis is not split
across multiple passes.

patch 3 from: https://github.com/llvm/llvm-project/pull/73337
2024-01-17 12:10:24 +01:00
Alexandros Lamprineas
92289db82f
[VFABI] Move the Vector ABI demangling utility to LLVMCore. (#77513)
This fixes #71892 allowing us to check magled names in the IR verifier.
2024-01-17 09:55:30 +00:00
Nikita Popov
435bcea83b
[GISel] Add debug counter to force sdag fallback (#78257)
Add a debug counter that allows forcing an sdag fallback after a certain
number of functions.

The intended use-case is to bisect which function gets miscompiled by
global isel using `-debug-counter=globalisel-count=N` (in cases where
sdag doesn't also miscompile it, of course).

The "falling back" debug line is printed unconditionally, because using
`-debug-only` is usually too spammy for the intended purpose.
2024-01-17 09:33:31 +01:00
Nikita Popov
cde780c18f
[DAGCombine] Add debug counter (#78259)
Add a debug counter for DAGCombine. This can help with bisecting which
DAG combine introduced a miscompile.
2024-01-17 09:31:56 +01:00
Dávid Ferenc Szabó
55172b7005
[GlobalISel] Improve combines for extend operation by taking hint ins… (#74125)
…tructions into account

Hint instructions like G_ASSERT_ZEXT cann be viewed as a copy. Including
this fact into the combiner allows the match more patterns involving
such instructions.
2024-01-17 15:21:02 +07:00
Danila Malyutin
46a929f0a0
[SelectionDAG] Fix isKnownNeverZeroFloat for vectors (#78308)
Return true iff all of vector elements are constant AND not zero

Fixes #77805

Previously, it'd return `true` (as in - the value is known to be never
zero) for any build_vector/splat_vector with non-constant elements.
2024-01-17 12:55:57 +07:00
Davide Italiano
b6f922fbf5 Revert "[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385)"
This reverts commit fc6faa1113e9069f41b5500db051210af0eea843.
2024-01-16 17:01:01 -08:00
Rahman Lavaee
e1616ef9d7
[BasicBlockSections] Always keep the entry block in the beginning of the function. (#74696)
BasicBlockSections must enforce placing the entry block at the beginning
of the function regardless of the basic block sections profile.
2024-01-16 14:15:33 -08:00
David Green
7850c94b86 [NFC] sentinal -> sentinel 2024-01-16 17:22:06 +00:00
Paschalis Mpeis
8e514c572e
Reapply [TLI] Fix replace-with-veclib crash with invalid arguments (#77945)
Fix a crash of `replace-with-veclib` pass, when the arguments of the TLI
mapping do not match the original call.
Now, it simply ignores such cases.

Test require assertions as it accesses programmatically the debug log.

Reapplies reverted PR #77112
2024-01-16 10:00:15 +00:00
Pierre van Houtryve
4b0a76a3d7
[GlobalISel] Fix buildCopyFromRegs for split vectors (#77448)
Fixes #77055
2024-01-16 10:04:20 +01:00
Alex Bradbury
84f7fb6217
[MachineScheduler] Add option to control reordering for store/load clustering (#75338)
Reordering based on the sort order of the MemOpInfo array was disabled
in <https://reviews.llvm.org/D72706>. However, it's not clear this is
desirable for al targets. It also makes it more difficult to compare the
incremental benefit of enabling load clustering in the selectiondag
scheduler as well was the machinescheduler, as the sdag scheduler does
seem to allow this reordering.

This patch adds a parameter that can control the behaviour on a
per-target basis.

Split out from #73789.
2024-01-16 07:17:41 +00:00
Amara Emerson
eb009ed249 [GlobalISel] Fix the select->minmax combine from trying to operate on pointer types. 2024-01-15 18:20:18 -08:00
chuongg3
fcfe1b6482
[GlobalISel] Refactor extractParts() (#75223)
Moved extractParts() and extractVectorParts() from LegalizerHelper
to Utils to be able to use it in different passes.

extractParts() will also try to use unmerge when doing irregular
splits where possible, falling back to extract elements when not.
2024-01-15 16:40:39 +00:00
Dávid Ferenc Szabó
0ff3d729f9
[GlobalISel] Make IRTranslator able to handle PHIs with empty types. (#73235)
SelectionDAG already handle this since
e53b7d1a11d180ed7b33190a837d8898ab2a0b71.
2024-01-15 23:26:30 +07:00
Kazu Hirata
10b1c29e39 [CodeGen] Use a range-based for loop (NFC) 2024-01-14 12:17:54 -08:00
Kazu Hirata
e4a6be0fc0 [CodeGen] Use getConstantOperandVal (NFC) 2024-01-13 18:18:51 -08:00
Kazu Hirata
15179aa433 [llvm] Use llvm::is_contained (NFC) 2024-01-12 18:39:48 -08:00
Paschalis Mpeis
a300b24037 Revert "[TLI] Fix replace-with-veclib crash with invalid arguments (#77112)"
This reverts commit 9fdc568824b0992d48704dfa530a12073cc02f5e,
as it linker crashes on some platforms.
2024-01-12 15:49:15 +00:00
Paschalis Mpeis
9fdc568824
[TLI] Fix replace-with-veclib crash with invalid arguments (#77112)
Fix a crash of `replace-with-veclib` pass, when the arguments of the TLI
mapping do not match the original call.
Now, it simply ignores such cases.

Test require assertions as it accesses programmatically the debug log.
2024-01-12 15:19:52 +00:00
Alexander Yermolovich
d199ab4699
[LLVM][DWARF] Fix accelerator table switching between CU and TU (#77511)
Bug 1 is triggered when a TU is already created, and we process the same
DICompositeType at a top level. We would switch to TU accelerator table,
but
would not switch back on early exit. As the result we would add CU
entries to the TU
accelerator table. When we try to write out TUs and normalize entries,
the
offsets for DIEs that are part of a CU would not have been computed, and
it
would assert on getOffset().

Bug 2 is triggered when processing nested TUs. When we exit from
addDwarfTypeUnitType we switched back to CU accelerator table. If we
were processing nested TUs, the rest of the entries from TUs would be
added to CU accelerator table. When we write out TUs, all the DIE
pointers will become invalid. Eventually it will assert during
normalization step after CU is processed.
2024-01-12 07:01:17 -08:00
Nikita Popov
6c2fbc3a68
[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582)
This abstracts over the common pattern of creating a gep with i8 element
type.
2024-01-12 14:21:21 +01:00
Amara Emerson
a946934a12 [GlobalISel][NFC] Use GPhi wrapper in more places instead of iterating over operands. 2024-01-11 22:25:53 -08:00
Carl Ritson
6752f1517d
[TwoAddressInstruction] Recompute live intervals for partial defs (#74431)
Force live interval recomputation for a register if its definition is
narrowed to become partial. The live interval repair process cannot
otherwise detect these changes.
2024-01-12 13:26:01 +09:00
Emil J
3baedb4111
[GISel] Fix #77762: extend correct source registers in combiner helper rule extend_through_phis (#77765)
Since we already know which register we want to extend, we don't have to
ask its defining MI about it

---------

Co-authored-by: Emil Tywoniak <Emil.Tywoniak@hightec-rt.com>
2024-01-12 12:09:58 +08:00
darkbuck
54c19546ba
[GlobalISel] Revise 'assignCustomValue' interface (#77824)
- Previously, 'assignCustomValue' requests the number of assigned VAs
minus 1 is returned and treats 0 as the assignment failure. However,
under that arrangment, we cannot tell a successful *single* VA custom
assignment from the failure case.
- This change requests that 'assignCustomValue' just return the number
of all VAs assigned, including the first WA so that it won't be ambigous
to tell the failure case from the single VA custom assignment.
2024-01-12 10:41:55 +07:00
Wang Pengcheng
a2af374284
[SelectionDAG] Add space-optimized forms of OPC_CheckPredicate (#77763)
We record the usage of each `Predicate` and sort them by usage.

For the top 8 `Predicate`s, we will emit a `PC_CheckPredicateN` to
save one byte.

Overall this reduces the llc binary size with all in-tree targets by
about 61K.

This is a recommit of 1a57927, which was reverted in bc98c31.

The CI failures occurred when doing expensive checks (with option
`LLVM_ENABLE_EXPENSIVE_CHECKS` being ON).

The key point here is that we need stable sorting result in the
test, but doing expensive checks uncovered the non-determinism of
`llvm::sort`. So `llvm::sort` is changed to `llvm::stable_sort`
in this revised patch.

And we use `llvm::MapVector` to keep insertion order.
2024-01-12 11:38:05 +08:00
James Y Knight
b58f91a31b Set the default value for MaxAtomicSizeInBitsSupported to 0.
This was planned since its introduction, but wasn't rolled out for a
little bit longer than intended (ahem...8 years).

All in-tree targets have now been adjusted to call
setMaxAtomicSizeInBitsSupported explicitly where required, so this
should be a no-op. The docs in docs/Atomics.rst already claimed the
default was 0, so that doesn't need updating.
2024-01-11 18:01:46 -05:00
Vladislav Dzhidzhoev
fc6faa1113
[CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined functions (#75385)
- [DebugMetadata][DwarfDebug] Support function-local types in lexical
block scopes (4/7)
- [CloneFunction][DebugInfo] Avoid cloning DILocalVariables of inlined
functions

This is a follow-up for https://reviews.llvm.org/D144006, fixing a crash
reported
in Chromium (https://reviews.llvm.org/D144006#4651955).

The first commit is added for convenience, as it has already been
accepted.

If DISubpogram was not cloned (e.g. we are cloning a function that has
other
functions inlined into it, and subprograms of the inlined functions are
not supposed to be cloned), it doesn't make sense to clone its
DILocalVariables as well.
Otherwise get duplicated DILocalVariables not tracked in their
subprogram's retainedNodes, that crash LTO with Chromium.

This is meant to be committed along with
https://reviews.llvm.org/D144006.
2024-01-11 17:08:12 +01:00
HaohaiWen
52613396a6
[InstrRef] Add debug hint for not reachable blocks from entry (#77725)
Those not reachable blocks was not analyzed by LiveDebugValues and may
raise out of bound access to VarLocs as case in #77441.
2024-01-11 22:10:56 +08:00
HaohaiWen
f892cc36fd
[BranchFolding] Fix missing predecessors of landing-pad (#77608)
When removing an empty machine basic block, all of its successors should
be inherited by its fall through MBB. This keeps CFG as only have one
entry which is required by LiveDebugValues.

Reland #77441 as LiveDebugValues test.
2024-01-11 22:09:41 +08:00
Mikhail Goncharov
bc98c3103a Revert "[SelectionDAG] Add space-optimized forms of OPC_CheckPredicate (#73488)"
This reverts commit 1a5792735aa0bb10e5624a438bcf7fd5091ee265.

Test address-space-patfrags.td.test is failing

https://lab.llvm.org/buildbot/#/builders/104/builds/15012
2024-01-11 12:25:00 +01:00