33293 Commits

Author SHA1 Message Date
Yeting Kuo
0f8c761c48 [VP][RISCV] Recommit "Add vp.fshl/fshr and RISC-V support."
This reverts commit 7883e5b061bdbbe8bee5f479ebe911db5045b7e9.

The original commit was reverted that it didn't update test files after D136263
landed. The recommit fixed those.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D139509
2022-12-07 15:58:12 +08:00
Rahman Lavaee
6015a045d7 [Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number.
Let Propeller use specialized IDs for basic blocks, instead of MBB number.

This allows optimizations not just prior to asm-printer, but throughout the entire codegen.
This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.

####Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR.  This is done as follows.
    - Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
    - Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
    - While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization.  Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
    - The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
    - In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR.  Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
    - Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline.  Hence, MBB numbers are not suitable and we need something else.
####Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block.  It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs.

 To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies.

The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.

####Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary.

Reviewed By: tmsriram

Differential Revision: https://reviews.llvm.org/D100808
2022-12-06 22:50:09 -08:00
Kazu Hirata
934942c033 [llvm] Don't include Optional.h (NFC)
These source files no longer use Optional<T>, so they do not need to
include Optional.h.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-06 22:34:50 -08:00
Kazu Hirata
7883e5b061 Revert "[VP][RISCV] Add vp.fshl/fshr and RISC-V support."
This reverts commit 70de0e014013b4d97febe6704881a9a8c893d078.

I'm seeing:

Failed Tests (2):
  LLVM :: CodeGen/RISCV/rvv/fixed-vectors-fshr-fshl-vp.ll
  LLVM :: CodeGen/RISCV/rvv/fshr-fshl-vp.ll

Also reported at:

https://lab.llvm.org/buildbot/#/builders/123/builds/14531
2022-12-06 22:27:43 -08:00
Yeting Kuo
70de0e0140 [VP][RISCV] Add vp.fshl/fshr and RISC-V support.
The patch made VectorLegalizer expand ISD::VP_FSHL and ISD::VP_FSHR to
achieve the codegen.

Reviewed By: craig.topper

Differential Revision: https://reviews.llvm.org/D138379
2022-12-07 12:16:36 +08:00
Kazu Hirata
405fc404bf [ADT] Don't including None.h (NFC)
These source files no longer use None, so they do not need to include
None.h.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-06 20:14:51 -08:00
Kazu Hirata
d8c00c4f63 [llvm] Don't include STLForwardCompat.h (NFC)
STLForwardCompat.h defines remove_cvref and remove_cvref_t.  These
source files use neither one of those.
2022-12-06 20:09:56 -08:00
Gregory Alfonso
cb38be9ed3 [NFC] Use Register instead of unsigned for variables that receive a Register object
Reviewed By: MaskRay

Differential Revision: https://reviews.llvm.org/D139451
2022-12-07 00:23:34 +00:00
Krzysztof Parzyszek
c589730ad5 [YAML] Convert Optional to std::optional 2022-12-06 12:49:32 -08:00
Sanjay Patel
adc7c589c3 [SDAG] try to convert bit set/clear to signbit test when trunc is free
(X & Pow2MaskC) == 0 --> (trunc X) >= 0
(X & Pow2MaskC) != 0 --> (trunc X) <  0

This was noted as a regression in the post-commit feedback for D112634
(where we canonicalized IR differently).

For x86, this saves a few instruction bytes. AArch64 seems neutral.

Differential Revision: https://reviews.llvm.org/D139363
2022-12-06 11:34:48 -05:00
Tobias Hieta
2298a44ccd
[CodeView] Add support for local S_CONSTANT records
CodeView doesn't have the ability to represent variables
in other ways than as in registers or memory values, but
LLVM very often transforms simple values into constants,
consider this program:

int f () { int i = 123; return i; }

LLVM will transform `i` into a constant value and just
leave behind a llvm.dbg.value, this can't be represented
as a S_LOCAL record in CodeView. But we can represent it
as a S_CONSTANT record.

This patch checks if the location of a debug value is null,
then we will insert a S_CONSTANT record instead of a S_LOCAL
value with the flag "OptimizedAway".

In lld we then output the S_CONSTANT in the right scope, before
they where always inserted in the global stream, now we check
the scope before inserting it.

This has shown to improve debugging for our developers
internally.

Fixes to llvm/llvm-project#55958

Reviewed By: aganea

Differential Revision: https://reviews.llvm.org/D138995
2022-12-06 10:34:01 +01:00
Philip Reames
186c192261 [SDAG] Allow scalable vectors in SimplifyDemanded routines
This is a continuation of the series of patches adding lane wise support for scalable vectors in various knownbit-esq routines.

The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations.

Differential Revision: https://reviews.llvm.org/D137190
2022-12-05 12:42:16 -08:00
Jonas Paulsson
5ecd363295 Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."
This reverts commit 122efef8ee9be57055d204d52c38700fe933c033.

- Patch fixed to not reuse definitions from predecessors in EH landing pads.
- Late review suggestions (by MaskRay) have been addressed.
- M68k/pipeline.ll test updated.
- Init captures added in processBlock() to avoid capturing structured bindings.
- RISCV has this disabled for now.

Original commit message:

A new pass MachineLateInstrsCleanup is added to be run after PEI.

This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().

This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.

This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.

Differential Revision: https://reviews.llvm.org/D123394

Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
2022-12-05 12:53:50 -06:00
Matt Arsenault
47c68904a5 DAG: ComputeNumSignBits from load range metadata
The cases where the result type doesn't match the range type
are inadequately tested, but I'm not sure how to write such a
test. During the pre-legalize combine, any obviously optimizable
code gets handled so it's harder to test legalized extloads.
2022-12-05 11:57:13 -05:00
Philip Reames
7969ab85e0 [SDAG] Allow scalable vectors in ComputeKnownBits (try 2)
This was previously reverted due to a hang on a Hexagon bot.  This turned out to be a bug in the Hexagon backend around how splat_vectors are legalized (which they're using for fixed length vectors!).  I adjusted this patch to remove the implicit truncate support.  This hides the hexagon bug for now, and unblocks the rest of the change.

Original commit message:

This is the SelectionDAG equivalent of D136470, and is thus an alternate patch to D128159.

The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations.

This patch also includes an implementation for SPLAT_VECTOR as without it, the lane wise reasoning has no base case. The original patch which inspired this (D128159), also included STEP_VECTOR. I plan to do that as a separate patch.

Differential Revision: https://reviews.llvm.org/D137140
2022-12-05 08:52:37 -08:00
Dmitry Vyukov
dbe8c2c316 Use-after-return sanitizer binary metadata
Currently per-function metadata consists of:
(start-pc, size, features)

This adds a new UAR feature and if it's set an additional element:
(start-pc, size, features, stack-args-size)

Reviewed By: melver

Differential Revision: https://reviews.llvm.org/D136078
2022-12-05 14:40:31 +01:00
Vladislav Dzhidzhoev
f32cafedf0 [GlobalISel][DebugInfo] Propagate debug location for localized constants
After IRTranslator pass, constants are deduplicated and translated into instructions at entry block, having debug locations lost.
Localization of constants may cause emission of extra zero lines in debug_line section, like here https://godbolt.org/z/ecvsxxfKn. In this example, constant gets placed as
a first instruction in entry block, and despite it has no debug location, AsmPrinter emits zero line for it.

If a localized constant has the only user, we can assume that it has the same debug location as its user, since they are placed consequently.

Differential Revision: https://reviews.llvm.org/D128192
2022-12-05 16:38:24 +03:00
Fangrui Song
a996cc217c Remove unused #include "llvm/ADT/Optional.h" 2022-12-05 06:31:11 +00:00
Fangrui Song
89fae41ef1 [IR] llvm::Optional => std::optional
Many llvm/IR/* files have been migrated by other contributors.
This migrates most remaining files.
2022-12-05 04:13:11 +00:00
Kazu Hirata
595f1a6aaf [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 19:47:13 -08:00
Kazu Hirata
9f252e5567 [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:31:17 -08:00
Kazu Hirata
3c09ed006a [llvm] Use std::nullopt instead of None in comments (NFC)
This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 17:12:44 -08:00
Fangrui Song
89fab98e88 [DebugInfo] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-05 00:09:22 +00:00
Jonas Paulsson
122efef8ee Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions.""
This reverts commit 17db0de330f943833296ae72e26fa988bba39cb3.

Some more bots got broken - need to investigate.
2022-12-05 00:52:00 +01:00
Fangrui Song
b0df70403d [Target] llvm::Optional => std::optional
The updated functions are mostly internal with a few exceptions (virtual functions in
TargetInstrInfo.h, TargetRegisterInfo.h).
To minimize changes to LLVMCodeGen, GlobalISel files are skipped.

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 22:43:14 +00:00
Fangrui Song
f4c16c4473 [MC] llvm::Optional => std::optional
https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-04 21:36:08 +00:00
Fangrui Song
4e62072ca1 [Passes] llvm::Optional => std::optional 2022-12-04 20:44:52 +00:00
Krzysztof Parzyszek
f3b6dbfda8 Instructions: convert Optional to std::optional 2022-12-04 14:25:11 -06:00
Krzysztof Parzyszek
0ca43d4488 DebugInfoMetadata: convert Optional to std::optional 2022-12-04 11:52:02 -06:00
Benjamin Kramer
fcf4e360ba Iterate over StringMaps using structured bindings. NFCI. 2022-12-04 18:36:41 +01:00
Jonas Paulsson
17db0de330 Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."
Init captures added in processBlock() to avoid capturing structured bindings,
which caused the build problems (with clang).

RISCV has this disabled for now until problems relating to post RA pseudo
expansions are resolved.
2022-12-03 14:15:15 -06:00
Krzysztof Parzyszek
ab672e9173 FPEnv: convert Optional to std::optional 2022-12-03 13:55:56 -06:00
Fangrui Song
bac974278c CodeGen/CommandFlags: Convert Optional to std::optional 2022-12-03 18:38:12 +00:00
Krzysztof Parzyszek
8c7c20f033 Convert Optional<CodeModel> to std::optional<CodeModel> 2022-12-03 12:08:47 -06:00
David Green
16a72a0f87 [AArch64] Enable the select optimize pass for AArch64
This enabled the select optimize patch for ARM Out of order AArch64
cores. It is trying to solve a problem that is difficult for the
compiler to fix. The criteria for when a csel is better or worse than a
branch depends heavily on whether the branch is well predicted and the
amount of ILP in the loop (as well as other criteria like the core in
question and the relative performance of the branch predictor).  The
pass seems to do a decent job though, with the inner loop heuristics
being well implemented and doing a better job than I had expected in
general, even without PGO information.

I've been doing quite a bit of benchmarking. The headline numbers are
these for SPEC2017 on a Neoverse N1:
  500.perlbench_r   -0.12%
  502.gcc_r         0.02%
  505.mcf_r         6.02%
  520.omnetpp_r     0.32%
  523.xalancbmk_r   0.20%
  525.x264_r        0.02%
  531.deepsjeng_r   0.00%
  541.leela_r       -0.09%
  548.exchange2_r   0.00%
  557.xz_r          -0.20%

Running benchmarks with a combination of the llvm-test-suite plus
several versions of SPEC gave between a 0.2% and 0.4% geomean
improvement depending on the core/run. The instruction count went down
by 0.1% too, which is a good sign, but the results can be a little
noisy.  Some issues from other benchmarks I had ran were improved in
rGca78b5601466f8515f5f958ef8e63d787d9d812e. In summary well predicted
branches will see in improvement, badly predicted branches may get
worse, and on average performance seems to be a little better overall.

This patch enables the pass for AArch64 under -O3 for cores that will
benefit for it. i.e. not in-order cores that do not fit into the "Assume
infinite resources that allow to fully exploit the available
instruction-level parallelism" cost model. It uses a subtarget feature
for specifying when the pass will be enabled, which I have enabled under
cpu=generic as the performance increases for out of order cores seems
larger than any decreases for inorder, which were minor.

Differential Revision: https://reviews.llvm.org/D138990
2022-12-03 16:08:58 +00:00
Kazu Hirata
998960ee1f [CodeGen] Use std::nullopt instead of None (NFC)
This patch mechanically replaces None with std::nullopt where the
compiler would warn if None were deprecated.  The intent is to reduce
the amount of manual work required in migrating from Optional to
std::optional.

This is part of an effort to migrate from llvm::Optional to
std::optional:

https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716
2022-12-02 20:36:08 -08:00
Jan Svoboda
abf0c6c0c0 Use CTAD on llvm::SaveAndRestore
Reviewed By: dblaikie

Differential Revision: https://reviews.llvm.org/D139229
2022-12-02 15:36:12 -08:00
Fangrui Song
ca23b7ca47 [AsmPrinter] .addrsig_sym: remove isTransitiveUsedByMetadataOnly
With D135642 ignoring unregistered symbols, isTransitiveUsedByMetadataOnly added
by D101512 is no longer needed (the operation is potentially slow). There is a
`.addrsig_sym` directive for an only-used-by-metadata symbol but it does not
emit an entry.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D138362
2022-12-02 19:05:43 +00:00
Sanjay Patel
0037e21f28 [SDAG] bail out of mergeTruncStores() if there's any other use in the chain
This fixes the miscompile in issue #58883.

The test demonstrates that we gave up on store merging in that example.

This change should be strictly safe (just adds another clause
to avoid the transform), and it does not prohibit any existing
valid optimizations based on regression tests. I want to believe
that it's also a sufficient fix (possibly overkill), but I'm not
sure how to prove that.

Differential Revision: https://reviews.llvm.org/D137791
2022-12-02 10:08:19 -05:00
Florian Hahn
63150f4639
Revert "Enhance stack protector for calling no return function"
This reverts commit 416e8c6ad529c57f21f46c6f52ded96d3ed239fb.

This commit causes a test failure with expensive checks due to a DT
verification failure. Revert to bring bot back to green:

https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/24249/testReport/junit/LLVM/CodeGen_X86/stack_protector_no_return_ll/

+ /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-expensive/clang-build/bin/llc /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-expensive/llvm-project/llvm/test/CodeGen/X86/stack-protector-no-return.ll -mtriple=x86_64-unknown-linux-gnu -o -
+ /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-expensive/clang-build/bin/FileCheck /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-expensive/llvm-project/llvm/test/CodeGen/X86/stack-protector-no-return.ll
DominatorTree is different than a freshly computed one!
	Current:
=============================--------------------------------
Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries.
  [1] %entry {4294967295,4294967295} [0]
    [2] %unreachable {4294967295,4294967295} [1]
    [2] %lpad {4294967295,4294967295} [1]
      [3] %invoke.cont {4294967295,4294967295} [2]
        [4] %invoke.cont2 {4294967295,4294967295} [3]
        [4] %SP_return3 {4294967295,4294967295} [3]
        [4] %CallStackCheckFailBlk2 {4294967295,4294967295} [3]
      [3] %lpad1 {4294967295,4294967295} [2]
        [4] %eh.resume {4294967295,4294967295} [3]
          [5] %SP_return6 {4294967295,4294967295} [4]
          [5] %CallStackCheckFailBlk5 {4294967295,4294967295} [4]
        [4] %terminate.lpad {4294967295,4294967295} [3]
          [5] %SP_return9 {4294967295,4294967295} [4]
          [5] %CallStackCheckFailBlk8 {4294967295,4294967295} [4]
    [2] %SP_return {4294967295,4294967295} [1]
    [2] %CallStackCheckFailBlk {4294967295,4294967295} [1]
Roots: %entry
2022-12-02 12:58:46 +00:00
tentzen
db6a979ae8 Revert "[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2"
This reverts commit 1a949c871ab4a6b6d792849d3e8c0fa6958d27f5.
2022-12-02 02:44:18 -08:00
tentzen
1a949c871a [Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2
This patch is the Part-2 (BE LLVM) implementation of HW Exception handling.
Part-1 (FE Clang) was committed in 797ad701522988e212495285dade8efac41a24d4.

This new feature adds the support of Hardware Exception for Microsoft Windows
SEH (Structured Exception Handling).

Compiler options:
  For clang-cl.exe, the option is -EHa, the same as MSVC.
  For clang.exe, the extra option is -fasync-exceptions,
  plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual.

NOTE:: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.

The rules for C code:
For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules:
  First, no exception can move in or out of _try region., i.e., no "potential faulty
    instruction can be moved across _try boundary.
  Second, the order of exceptions for instructions 'directly' under a _try must be preserved
    (not applied to those in callees).
  Finally, global states (local/global/heap variables) that can be read outside of _try region
    must be updated in memory (not just in register) before the subsequent exception occurs.

The impact to C++ code:
  Although SEH is a feature for C code, -EHa does have a profound effect on C++
  side. When a C++ function (in the same compilation unit with option -EHa ) is
  called by a SEH C function, a hardware exception occurs in C++ code can also
  be handled properly by an upstream SEH _try-handler or a C++ catch(...).
  As such, when that happens in the middle of an object's life scope, the dtor
  must be invoked the same way as C++ Synchronous Exception during unwinding process.

Design:
A natural way to achieve the rules above in LLVM today is to allow an EH edge
added on memory/computation instruction (previous iload/istore idea) so that
exception path is modeled in Flow graph preciously. However, tracking every
single memory instruction and potential faulty instruction can create many
Invokes, complicate flow graph and possibly result in negative performance
impact for downstream optimization and code generation. Making all
optimizations be aware of the new semantic is also substantial.

This design does not intend to model exception path at instruction level.
Instead, the proposed design tracks and reports EH state at BLOCK-level to
reduce the complexity of flow graph and minimize the performance-impact on CPP
code under -EHa option.
One key element of this design is the ability to compute State number at
block-level. Our algorithm is based on the following rationales:

A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping
into a _try is not allowed. The single entry must start with a seh_try_begin()
invoke with a correct State number that is the initial state of the SEME.
Through control-flow, state number is propagated into all blocks. Side exits
marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[].
Note side exits can ONLY jump into parent scopes (lower state number).
Thus, when a block succeeds various states from its predecessors, the lowest
State triumphs others.  If some exits flow to unreachable, propagation on those
paths terminate, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try.
However there is one rare exception: jumping into a lifetime that has Dtor but
has no Ctor is warned, but allowed:

Warning: jump bypasses variable with a non-trivial destructor

In that case, the region is actually a MEME (multiple entry multiple exits).
Our solution is to inject a eha_scope_begin() invoke in the side entry block to
ensure a correct State.
Implementation:
Part-1: Clang implementation (already in):
Please see commit 797ad701522988e212495285dade8efac41a24d4).

Part-2 : LLVM implementation described below.

For both C++ & C-code, the state of each block is computed at the same place in
BE (WinEHPreparing pass) where all other EH tables/maps are calculated.
In addition to _scope_begin & _scope_end, the computation of block state also
rely on the existing State tracking code (UnwindMap and InvokeStateMap).

For both C++ & C-code, the state of each block with potential trap instruction
is marked and reported in DAG Instruction Selection pass, the same place where
the state for -EHsc (synchronous exceptions) is done.
If the first instruction in a reported block scope can trap, a Nop is injected
before this instruction. This nop is needed to accommodate LLVM Windows EH
implementation, in which the address in IPToState table is offset by +1.
(note the purpose of that is to ensure the return address of a call is in the
same scope as the call address.

The handler for catch(...) for -EHa must handle HW exception. So it is
'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches
C++ exceptions).
Suppress push/popTerminate() scope (from noexcept/noTHrow) so that HW
exceptions can be passed through.

Original llvm-dev [RFC] discussions can be found in these two threads below:
https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html
https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html

Differential Revision: https://reviews.llvm.org/D102817/new/
2022-12-01 23:44:25 -08:00
Vasileios Porpodas
4e30c3ddf0 [NFC] Cleanup: Replaces BB->getInstList().erase() with BB->erase().
This is part of a series of cleanup patches towards making BasicBlock::getInstList() private.

Differential Revision: https://reviews.llvm.org/D139143
2022-12-01 18:19:23 -08:00
Krzysztof Parzyszek
864aaa21b4 TargetLowering: convert Optional to std::optional 2022-12-01 16:19:10 -08:00
Krzysztof Parzyszek
467432899b MemoryLocation: convert Optional to std::optional 2022-12-01 15:36:20 -08:00
Mitch Phillips
850defb861 Add assembler plumbing for sanitize_memtag
Extends the Asm reader/writer to support reading and writing the
'.memtag' directive (including allowing it on internal global
variables). Also add some extra tooling support, including objdump and
yaml2obj/obj2yaml.

Test that the sanitize_memtag IR attribute produces the expected asm
directive.

Uses the new Aarch64 MemtagABI specification
(https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst)
to identify symbols as tagged in object files. This is done using a
R_AARCH64_NONE relocation that identifies each tagged symbol, and these
relocations are tagged in a special SHT_AARCH64_MEMTAG_GLOBALS_STATIC
section. This signals to the linker that the global variable should be
tagged.

Reviewed By: fmayer, MaskRay, peter.smith

Differential Revision: https://reviews.llvm.org/D128958
2022-12-01 10:50:34 -08:00
ZHU Zijia
010a8f7a90 [CodeGen] Fix restore blocks' BasicBlock information in branch relaxation
In branch relaxation pass, restore blocks are created and placed before
the jump destination if indirect branches are required. For example:

        foo
        sd      s11, 0(sp)
        jump    .restore, s11
        bar
        bar
        bar
        j       .dest
.restore:
        ld      s11, 0(sp)
.dest:
        baz

The BasicBlock information of the restore MachineBasicBlock should be
identical to the dest MachineBasicBlock.

Reviewed By: arsenm

Differential Revision: https://reviews.llvm.org/D131863
2022-12-02 02:42:22 +08:00
Jonas Paulsson
8ef4632681 Revert "[CodeGen] Add new pass for late cleanup of redundant definitions."
Temporarily revert and fix buildbot failure.

This reverts commit 6d12599fd4134c1da63198c74a25490d28c733f6.
2022-12-01 13:29:24 -05:00
Jonas Paulsson
6d12599fd4 [CodeGen] Add new pass for late cleanup of redundant definitions.
A new pass MachineLateInstrsCleanup is added to be run after PEI.

This is a simple pass that removes redundant and identical instructions
whenever found by scanning the MF once while keeping track of register
definitions in a map. These instructions are typically immediate loads
resulting from rematerialization, and address loads emitted by target in
eliminateFrameInde().

This is enabled by default, but a target could easily disable it by means of
'disablePass(&MachineLateInstrsCleanupID);'.

This late cleanup is naturally not "optimal" in removing instructions as it
is done by looking at phys-regs, but still quite effective. It would be
desirable to improve other parts of CodeGen and avoid these redundant
instructions in the first place, but there are no ideas for this yet.

Differential Revision: https://reviews.llvm.org/D123394

Reviewed By: RKSimon, foad, craig.topper, arsenm, asb
2022-12-01 13:21:35 -05:00
Freddy Ye
89f36dd8f3 [X86] Add ExpandLargeFpConvert Pass and enable for X86
As stated in
https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528,
this implementation is very similar to ExpandLargeDivRem, which expands
‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions
with a bitwidth above a threshold into auto-generated functions. This is
useful for targets like x86_64 that cannot lower fp convertions with more
than 128 bits. The expanded nodes are referring from the IR generated by
`compiler-rt/lib/builtins/floattidf.c`, `compiler-rt/lib/builtins/fixdfti.c`,
and etc.

Corner cases:
1. For fp16: as there is no related builtins added in compliler-rt. So I
mainly utilized the fp32 <-> fp16 lib calls to implement.
2. For fp80: as this pass is soft fp emulation and no fp80 instructions can
help in this problem. I recommend users to deprecate this usage. For now, the
implementation uses fp128 as the temporary conversion type and inserts
fptrunc/ext at top/end of the function.
3. For bf16: as clang FE currently doesn't support bf16 algorithm operations
(convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for
now.
4. For unsigned FPToI: since both default hardware behaviors and libgcc are
ignoring "returns 0 for negative input" spec. This pass follows this old way
to ignore unsigned FPToI. See this example:
https://gcc.godbolt.org/z/bnv3jqW1M

The end-to-end tests are uploaded at https://reviews.llvm.org/D138261

Reviewed By: LuoYuanke, mgehre-amd

Differential Revision: https://reviews.llvm.org/D137241
2022-12-01 13:47:43 +08:00