33786 Commits

Author SHA1 Message Date
OCHyams
d5b2c8e56d [Assignment Tracking][NFC] Use BitVectors as masks for SmallVectors
...rather than using DenseMaps to track per-variable information.

Rather than tracking 3 maps of {VariableID: SomeInfo} per block, use a
BitVector indexed by VariableID to mask 3 vectors of SomeInfo.

BlockInfos now need to be initialised with a call to init which sets the
BitVector width to the number of partially promoted variables in the function
and fills the vectors with Top values.

Prior to this patch, in joinBlockInfo, it was necessary to insert Top values
into the Join result for variables in A XOR B after joining the variables in A
AND B. Now, because the vectors are pre-filled with Top values we need only
join the variables A AND B and set the BitVector of tracked variables to A OR
B.

The patch achieves an average of 0.25% reduction in instructions retired and a
1.1% max-rss for the CTMark suite in LTO-O3-g builds.

Reviewed By: scott.linder

Differential Revision: https://reviews.llvm.org/D145558
2023-03-21 09:11:54 +00:00
Michal Paszkowski
b333f33939 [SPIR-V] Add Machine Value Type for SPIR-V builtins
Differential Revision: https://reviews.llvm.org/D145703
2023-03-20 23:15:34 +01:00
Amara Emerson
41e9c4b88c [NFC][Outliner] Delete default ctors for Candidate & OutlinedFunction.
I think it's good practice to avoid having default ctors unless they're really
valid/useful. For OutlinedFunction the default ctor was used to represent a
bail-out value for getOutliningCandidateInfo(), so I changed the API to return
an optional<getOutliningCandidateInfo> instead which seems a tad cleaner.

Differential Revision: https://reviews.llvm.org/D146375
2023-03-20 11:17:10 -07:00
Simon Pilgrim
2d4042f4b7 [DAG] visitTRUNCATE - use FoldConstantArithmetic to perform constant folding.
Avoid needing to perform extra isConstantIntBuildVectorOrConstantInt checks
2023-03-20 11:14:14 +00:00
Simon Pilgrim
e9a86b7813 [DAG] foldBinOpIntoSelect - remove !CanFoldNonConst check. NFC.
These checks are in an if-else chain where CanFoldNonConst is already guaranteed to be false.
2023-03-20 11:14:14 +00:00
Heejin Ahn
e1f830bde8 [WebAssembly] Support debug info for TLS + global in PIC mode
This adds debug info support for
- `thread_local` global variables, both in non-PIC and PIC modes
- (non-thread_local) Global variables in PIC mode

The former needs to read the value from an offset relative to
`__tls_base` and the latter an offset from `__memory_base`. The code for
doing this overlaps with some of the existing code to add
`__stack_pointer` global, so this adds a new member function to add a
a global in `TI_GLOBAL_RELOC` mode and use it in all three places.

Split DWARF support is currently patchy at best, because the index for
`__tls_base` is not fixed after dynamic linking. The preexisting split
DWARF support for `__stack_pointer` relies on that in practice it is
always index 0. This does similar hardcoding for `__tls_base` and
`__memory_base`, but `__tls_base`'s index in dynamic linking is not
fixed now (See
19afbfe331/lld/wasm/Driver.cpp (L786-L823)
for details), TLS + dynamic linking will not work at the moment.

Fixes https://bugs.chromium.org/p/chromium/issues/detail?id=1416702.

Reviewed By: dschuff

Differential Revision: https://reviews.llvm.org/D145626
2023-03-17 20:16:48 -07:00
Matt Arsenault
c98b2e20c9 LiveRangeEdit: Use Register 2023-03-17 17:34:52 -04:00
Matt Arsenault
ce6c36bab5 RegAllocGreedy: Don't use Register reference 2023-03-17 15:22:13 -04:00
Stefan Gränitz
ef006eb0bc [CodeView] Add source languages ObjC and ObjC++
This patch adds llvm::codeview::SourceLanguage entries, DWARF translations, and PDB source file extensions in LLVM and allow LLDB's PDB parsers to recognize them correctly.

The CV_CFL_LANG enum in the Visual Studio 2022 documentation https://learn.microsoft.com/en-us/visualstudio/debugger/debug-interface-access/cv-cfl-lang defines:
```
    CV_CFL_OBJC     = 0x11,
    CV_CFL_OBJCXX   = 0x12,
```

Since the initial commit in D24317, ObjC was emitted as C language and ObjC++ as Masm.

Reviewed By: DavidSpickett

Differential Revision: https://reviews.llvm.org/D146221
2023-03-17 17:09:31 +01:00
Matt Arsenault
9356ec1516 CodeGen: Reorder case handling for is.fpclass legalization
Subnormal and zero checks can be combined into one, so move
the code closer to reduce the diff in a future change.
2023-03-17 11:29:50 -04:00
Matt Arsenault
61f2f2c64a GlobalISel: Use FPClassTest in is.fpclass lowering 2023-03-17 10:23:01 -04:00
Tim Northover
2d690684f6 Recommit DwarfEHPrepare: insert extra unwind paths for stack protector to instrument
This is a mitigation patch for
https://bugs.chromium.org/p/llvm/issues/detail?id=30, where existing stack
protection is skipped if a function is returned through by an unwinder rather
than the normal call/return path. The recent patch D139254 added the ability to
instrument a visible unwind path, at least in the IR case (I'm working on the
SelectionDAG instrumentation too) but there are still invisible unwinds it
can't reach.

So this patch adds logic to DwarfEHPrepare that goes through a function,
converting any call that might throw into an invoke to a simple resume cleanup,
and adding cleanup clauses to existing landingpads that lack them. Obviously we
don't really want to do this if it's wasted effort, so I also exposed
requiresStackProtector from the actual StackProtector code to skip the extra
paths if they won't be used.

Changes:
  * Move test to AArch64 directory as it relies on target presence.
  * Re-add Dominator-tree maintenance. Accidentally cherry-picked wrong patch.
  * Skip adding paths on Windows EH functions.

https://reviews.llvm.org/D143637
2023-03-16 13:43:17 +00:00
Simon Pilgrim
6bc0e362d7 [DAG] TargetLowering::ShrinkDemandedOp - move SmallVTBits iterator inside for loop. NFC 2023-03-16 12:12:33 +00:00
Simon Pilgrim
7aa7393aab [DAG] TargetLowering::ShrinkDemandedOp - pull out repeated getValueType calls. NFC 2023-03-16 12:12:33 +00:00
Tim Northover
e4b352a0b9 Revert "DwarfEHPrepare: insert extra unwind paths for stack protector to instrument"
It's caused more failures than are trivially fixable.

This reverts commit 203b6f31bb71ce63488eb96b303e000e91aee376.
2023-03-16 11:55:53 +00:00
Tim Northover
203b6f31bb DwarfEHPrepare: insert extra unwind paths for stack protector to instrument
This is a mitigation patch for
https://bugs.chromium.org/p/llvm/issues/detail?id=30, where existing stack
protection is skipped if a function is returned through by an unwinder rather
than the normal call/return path. The recent patch D139254 added the ability to
instrument a visible unwind path, at least in the IR case (I'm working on the
SelectionDAG instrumentation too) but there are still invisible unwinds it
can't reach.

So this patch adds logic to DwarfEHPrepare that goes through a function,
converting any call that might throw into an invoke to a simple resume cleanup,
and adding cleanup clauses to existing landingpads that lack them. Obviously we
don't really want to do this if it's wasted effort, so I also exposed
requiresStackProtector from the actual StackProtector code to skip the extra
paths if they won't be used.

https://reviews.llvm.org/D143637
2023-03-16 11:32:45 +00:00
Tim Northover
79b3f0823e StackProtector: expose RequiresStackProtector publicly. NFC. 2023-03-16 11:32:45 +00:00
OCHyams
47b99b7fc0 [Assignment Tracking] Do not convert variadic locations to kill locations [3/x]
Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D145914
2023-03-16 09:55:15 +00:00
OCHyams
7d89437455 [Assignment Tracking][NFC] Use RawLocationWrapper in VarLocInfo [2/x]
Use RawLocationWrapper rather than a Value to represent the location operand(s)
so that it's possible to represent multiple location
operands. AssignmentTrackingAnalysis still converts variadic debug intrinsics
to kill locations so this patch is NFC.

Reviewed By: StephenTozer

Differential Revision: https://reviews.llvm.org/D145911
2023-03-16 09:55:15 +00:00
Kazu Hirata
398af9b43b [llvm] Use *{Map,Set}::contains (NFC) 2023-03-15 18:06:32 -07:00
Simon Pilgrim
28a0d0e85a [DAG] Don't fold zext(logicalshift(zext(x),c)) -> logicalshift(zext(x),c) if the outer zext is free
Avoid widening the shift to a bigger type if the zext would be free anyway

Pulled out of D146121
2023-03-15 17:45:12 +00:00
Kazu Hirata
8bdf387858 Use *{Map,Set}::contains (NFC)
Differential Revision: https://reviews.llvm.org/D146104
2023-03-15 08:46:32 -07:00
Arthur Eubanks
bfc6590e66 [PassManager] Run PassInstrumentation after analysis invalidation
This allows instrumentation to inspect cached analyses to verify them.

The CGSCC PassInstrumentation previously ran `runAfterPass()` on the original SCC, but really it should be running on UpdatedC when relevant since that's the relevant SCC after the pass.

Reviewed By: nikic

Differential Revision: https://reviews.llvm.org/D146096
2023-03-15 08:36:14 -07:00
Simon Pilgrim
dc20ce7e54 [DAG] TargetLowering::ShrinkDemandedOp - rename Demanded arg to DemandedBits. NFC
Make it clear this is referring to DemandedBits not DemandedElts.
2023-03-15 13:22:21 +00:00
Simon Pilgrim
c1f81e7604 [DAG] mergeStore - peek through truncates when finding dead store(trunc(load())) patterns
Extend the existing store(load()) removal code to account for intermediate truncates that some targets won't remove with canCombineTruncStore - we only care about the load/store MemoryVT.

Fixes regression from D146121
2023-03-15 11:54:13 +00:00
Simon Pilgrim
70562607ab [DAG] Fold multiple insert_vector_elt of zero values into an AND mask
This also allows us to make use of the existing isVectorClearMaskLegal shuffle canonicalization

Differential Revision: https://reviews.llvm.org/D145939
2023-03-15 09:56:26 +00:00
pvanhout
f90849dfa3 [AMDGPU] Use UniformityAnalysis in AtomicOptimizer
Adds & uses a new `isDivergentUse` API in UA.
UniformityAnalysis now requires CycleInfo as well as the new temporal divergence API can query it.

-----

Original patch that adds `isDivergentUse` by @sameerds

The user of a temporally divergent value is marked as divergent in the
uniformity analysis. But the same user may also have been marked divergent for
other reasons, thus losing this information about temporal divergence. But some
clients need to specificly check for temporal divergence. This change restores
such an API, that already existed in DivergenceAnalysis.

Reviewed By: sameerds, foad

Differential Revision: https://reviews.llvm.org/D146018
2023-03-15 09:39:55 +01:00
Julian Lettner
e6a789ef9b Remove -lower-global-dtors-via-cxa-atexit flag
Remove the `-lower-global-dtors-via-cxa-atexit` escape hatch introduced
in D121736 [1], which switched the default lowering of global
destructors on MachO to use `__cxa_atexit()` to avoid emitting
deprecated `__mod_term_func` sections.

I added this flag as an escape hatch in case the switch causes any
problems.  We didn't discover any problems so now we can remove it.

[1] https://reviews.llvm.org/D121736

rdar://90277838

Differential Revision: https://reviews.llvm.org/D145715
2023-03-14 14:18:11 -07:00
Simon Pilgrim
da570ef1b4 [DAG] Match select(icmp(x,y),sub(x,y),sub(y,x)) -> abd(x,y) patterns
Pulled out of PowerPC, and added ABDS support as well (hence the additional v4i32 PPC matches)

Differential Revision: https://reviews.llvm.org/D144789
2023-03-14 15:10:30 +00:00
Kazu Hirata
a585fa2637 [CodeGen] Use *{Set,Map}::contains (NFC) 2023-03-14 08:07:42 -07:00
Simon Pilgrim
4bf004e07e [DAG] Fold (bitcast (logicop (bitcast x), (c))) -> (logicop x, (bitcast c)) iff the current logicop type is illegal
Try to remove extra bitcasts around logicops if we're dealing with illegal types

Fixes the regressions in D145939

Differential Revision: https://reviews.llvm.org/D146032
2023-03-14 14:41:11 +00:00
pvanhout
1f1fea6c38 Reland: [DAG/AMDGPU] Use UniformityAnalysis in DAGISel
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D145918
2023-03-14 14:38:45 +01:00
Nicholas Guy
96615c856d [Codegen][ARM][AArch64] Support symmetric operations on complex numbers
Differential Revision: https://reviews.llvm.org/D142482
2023-03-14 12:11:10 +00:00
Nicholas Guy
49384f1411 Cleanup of Complex Deinterleaving pass (NFCI)
Differential Revision: https://reviews.llvm.org/D143177
2023-03-14 12:11:09 +00:00
Luke Lau
a9d9616c0d [RISCV][NFC] Share interleave mask checking logic
This adds two new methods to ShuffleVectorInst, isInterleave and
isInterleaveMask, so that the logic to check if a shuffle mask is an
interleave can be shared across the TTI, codegen and the interleaved
access pass.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145971
2023-03-14 11:02:52 +00:00
pvanhout
0e79106fc9 Revert "[DAG/AMDGPU] Use UniformityAnalysis in DAGISel"
This reverts commit 0022b5803fd4f5a4e9fcf233267c0ffa1b88f763.
2023-03-14 11:48:58 +01:00
pvanhout
0022b5803f [DAG/AMDGPU] Use UniformityAnalysis in DAGISel
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.

Reviewed By: sameerds

Differential Revision: https://reviews.llvm.org/D145918
2023-03-14 11:18:28 +01:00
Simon Pilgrim
c7d844ea0f [DAG] Use ISD::isBitwiseLogicOp in AND/OR/XOR checks. NFCI.
There's additional cases we can cleanup (mainly in target code), but this tries to cleanup generic code and PPC which had an equivalent helper.
2023-03-13 13:39:02 +00:00
Felipe de Azevedo Piovezan
6e861d818a [MachineCombiner] Preserve debug instruction number
Each target's `TargetInstrInfo` is responsible for announcing which code
patterns it is able to transform during the MachineCombiner pass.
Currently, these patterns are applied without preserving the debug
instruction number required by the InstrRef implementation of
LiveDebugValues. As such, we've seen a number of examples where debug
information is dropped for variables in InstrRef mode that were
otherwise available in VarLoc mode. This has been observed both in X86
and AArch examples.

This commit is an initial attempt at preserving said numbers by changing
the general (target agnostic) implementation of TargetInstrInfo: the
reassociation pattern must keep the debug number of the "top level"
instruction, i.e., the instruction whose value represents the final
value of the arithmetic expression. Intermediate values must have their
debug number dropped, as they have no equivalent value in the
unoptimized code.

Future work is required to update each target's
`TargetInstrInfo::genAlternativeCodeSequence` method.

Differential Revision: https://reviews.llvm.org/D145759
2023-03-13 09:29:30 -04:00
Chen Zheng
4f0ed16a46 Reland rGf35a09daebd0a90daa536432e62a2476f708150d and rG63854f91d3ee1056796a5ef27753648396cac6ec
[DAGCombiner] handle more store value forwarding

When lowering calls on target like PPC, some stack loads
will be generated for by value parameters. Node CALLSEQ_START
prevents such loads from being combined.

Suggested by @RolandF, this patch removes the unnecessary
loads for the byval parameter by extending ForwardStoreValueToDirectLoad

Reviewed By: nemanjai, RolandF

Differential Revision: https://reviews.llvm.org/D138899
2023-03-12 21:59:18 -04:00
Jun Ma
00eef4f7c3 [SelectionDAG] Fix mismatched truncate when combine BUILD_VECTOR with EXTRACT_SUBVECTOR
Just use correct type for truncation. Fixes PR59625

Differential Revision: https://reviews.llvm.org/D145757
2023-03-13 08:59:52 +08:00
Simon Pilgrim
82dc04befd [DAG] visitZERO_EXTEND - pull out the repeated SDLoc(N) variables 2023-03-12 15:18:46 +00:00
Simon Pilgrim
4d7da0e711 [DAG] Cleanup the (zext (shl (zext x), cst)) -> (shl (zext x), cst) fold. NFC.
Preliminary cleanup before adding some additional legality and value tracking handling.
2023-03-12 15:01:33 +00:00
Simon Pilgrim
b53ea2b9c5 [DAG] visitAND - fold (and (any_ext V), c) -> (zero_ext (and (trunc V), c)) if profitable.
Try to more aggressively narrow masks of extended values.

This is mainly for cases where the mask is trying to zero out any_extended upper bits, assuming we can zext/trunc the values for free.

This catches a few actual missed folds, as well as helps canonicalize a number of other cases which were being caught in isel etc.

Differential Revision: https://reviews.llvm.org/D145866
2023-03-12 13:25:23 +00:00
Simon Pilgrim
fad852efe4 [DAG] combineShiftAnd1ToBitTest - improve support for peeking through truncations
Allows us to handle shift amounts that exceed the original bitwidth
2023-03-11 16:37:47 +00:00
Yuanfang Chen
9aae408d55 [NFC] fix typo funciton -> function
credits to @jmagee
2023-03-10 18:05:25 -08:00
Tim Northover
5c18444289 MachO: support custom section names on global variables
These attributes have been accepted in ELF for a while, and are generated by
Clang in some places, so it makes sense to support them on MachO too.

https://reviews.llvm.org/D143173
2023-03-10 18:23:25 +00:00
Sameer Sahasrabuddhe
fd98416d37 [llvm][Uniformity] consistently handle always-uniform instructions
An instruction that is "always uniform" is so even if it occurs in an
irreducible cycle. The output produced by such an instruction may depend on the
implementation defined cycle hierarchy, but that does not affect the uniformity
of the output. In other words, an "always uniform" instruction is uniform even
if it is not m-converged.

Reviewed By: ruiling, ronlieb

Differential Revision: https://reviews.llvm.org/D145572
2023-03-10 14:23:40 +05:30
Rong Xu
ebe09e2a95 [FSAFDO] Improve FS discriminator encoding
This change improves FS discriminators in the following ways:
(1) use call-stack debug information in the the to generate
discriminators: the same (src/line) DILs can now have same
discriminator value if they come from different call-stacks.
This effectively increases the usable discriminator values
for each round of FS discriminator pass.
(2) don't generate the FS discriminator for meta instructions
(i.e. instructions not emitted). This reduces the number
discriminators conflicts (for the case we run out of discriminator
bits for that pass).
(3) use less expensive hashing of xxHash64.

These improvements should bring better performance for FSAFDO
and they should be used by default. But this change creates
incompatible FS discriminators. For the iterative profile users,
they might see a performance drop in the first release with
this change (due to the fact that the profiles have the old
discriminators and the compiler uses the new discriminator).
We have measured that this is not more than 1.5% on several
benchmarks. Note the degradation should be gone in the second
release and one should expect a performance gain over the binary
without this change.

One possible solution to the iterative profile issue would be
separating discriminators for profile-use and the ones emitted to
the binary. This would require a mechanism to allow two sets of
discriminators to be maintained and then phasing out the first
approach. This is too much churn in the compiler and the
performance implications do not seem to be worth the effort.

Instead, we put the changes under an option so iterative profile
users can do a gradual rollout of this change. We will make the
option default value to true in a later patch and eventually
purge this option from the code base.

Differential Revision: https://reviews.llvm.org/D145171
2023-03-09 23:18:48 -08:00
Yeting Kuo
b2c48559c8 [IR][DAG][RISCV] Allow scalable vector ISD::STRICT_FP_EXTEND and RISC-V supports for vector ISD::STRICT_FP_EXTEND.
The patch mainly does two things. The first is allowing scalable vector
ISD::STRICT_FP_EXTEND. The second is making RISC-V customized lower
strict_fpextend to riscv_strict_fpextend_vl, the strict version of
riscv_fpextend_vl.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D145548
2023-03-09 17:37:59 +08:00