The types of SEH aren't x86(-32) vs x64 but rather stack-based exception chaining
vs table-based exception handling. x86-32 is the only arch for which Windows
uses the former. 32-bit ARM would use what is called Win64SEH today, which
is a bit confusing so instead let's just rename it to be a bit more clear.
Reviewed By: compnerd, rnk
Differential Revision: https://reviews.llvm.org/D90117
Since Wasm comdat sections work similarly to ELF, we can use that mechanism
to eliminate duplicate dwarf type information in the same way.
Differential Revision: https://reviews.llvm.org/D88603
These logically belong together since it's a base commit plus
followup fixes to less common build configurations.
The patches are:
Revert "CfgInterface: rename interface() to getInterface()"
This reverts commit a74fc481588fcea9317cbf1f6c5888a30c9edd2d.
Revert "Wrap CfgTraitsFor in namespace llvm to please GCC 5"
This reverts commit f2a06875b604c00cbe96e54363f4f5d28935d610.
Revert "Try to make GCC5 happy about the CfgTraits thing"
This reverts commit 03a5f7ce12e2111c8b7bc5a95cff4c51b516250f.
Revert "Introduce CfgTraits abstraction"
This reverts commit c0cdd22c72fab47a3c37b5a8401763995cadaa77.
This patch changes MergeBlockIntoPredecessor to skip the call to
RemoveRedundantDbgInstrs, in effect partially reverting D71480 due to
some compile-time issues spotted in LoopUnroll and SimplifyCFG.
The call to RemoveRedundantDbgInstrs appears to have changed the
worst-case behavior of the merging utility. Loosely speaking, it seems
to have gone from O(#phis) to O(#insts).
It might not be possible to mitigate this by scanning a block to
determine whether there are any debug intrinsics to remove, since such a
scan costs O(#insts).
So: skip the call to RemoveRedundantDbgInstrs. There's surprisingly
little fallout from this, and most of it can be addressed by doing
RemoveRedundantDbgInstrs later. The exception is (the block-local
version of) SimplifyCFG, where it might just be too expensive to call
RemoveRedundantDbgInstrs.
Differential Revision: https://reviews.llvm.org/D88928
As reading the source code, I've found some minor nits:
-Use using instead of typedef
-Fix a comment
-Refactor
Differential Revision: https://reviews.llvm.org/D90155
For i1 types, boolean false is represented identically regardless of
the boolean content, so we can allow optimizations that otherwise
would not be correct for booleans with false represented as a negative
one.
Patch by Erik Hogeman.
Differential Revision: https://reviews.llvm.org/D90145
Sometimes in unoptimized code, we have dangling unreachable basic blocks with no predecessors. Basic block sections should be emitted for those as well. Without this patch, the included test fails with a fatal error in `AsmPrinter::emitBasicBlockEnd`.
Reviewed By: tmsriram
Differential Revision: https://reviews.llvm.org/D89423
We used to only emit static const data members in CodeView as
S_CONSTANTS when they were used; this patch makes it so they are always emitted.
I changed CodeViewDebug.cpp to find the static const members from the
class debug info instead of creating DIGlobalVariables in the IR
whenever a static const data member is used.
Bug: https://bugs.llvm.org/show_bug.cgi?id=47580
Differential Revision: https://reviews.llvm.org/D89072
The modified code in visitSTORE was missing a scalable vector check, and still
using the now deprecated implicit cast of TypeSize to uint64_t through the
overloaded operator. This patch fixes these issues.
This brings the logic in line with the comment on the context line immediately
above the added precondition.
Add a test in sve-redundant-store.ll that the warning is not triggered.
Differential Revision: https://reviews.llvm.org/D89701
This reverts commit 4604441386dc5fcd3165f4b39f5fa2e2c600f1bc.
Reverting because it was not the intended version of the patch, which
follows this patch.
The modified code in visitSTORE was missing a scalable vector check, and still
using the now deprecated implicit cast of TypeSize to uint64_t through the
overloaded operator. This patch fixes these issues.
This brings the logic in line with the comment on the context line immediately
above the added precondition.
Add a test in Redundantstores.ll that the warning is not triggered.
This patch adds a remarks that provides counts for each opcode per basic block.
An snippet of the generated information can be seen below.
The current implementation uses the target specific opcode for the counts. For example, on AArch64 this means we currently get 2 entries for `add` instructions if the block contains 32 and 64 bit adds. Similarly, immediate version are treated differently.
Unfortunately there seems to be no convenient way to get only the mnemonic part of the instruction as a string AFAIK. This could be improved in the future.
```
--- !Analysis
Pass: asm-printer
Name: InstructionMix
DebugLoc: { File: arm64-instruction-mix-remarks.ll, Line: 30, Column: 30 }
Function: foo
Args:
- String: 'BasicBlock: '
- BasicBlock: else
- String: "\n"
- String: INST_MADDWrrr
- String: ': '
- INST_MADDWrrr: '2'
- String: "\n"
- String: INST_MOVZWi
- String: ': '
- INST_MOVZWi: '1'
```
Reviewed By: anemet, thegameg, paquette
Differential Revision: https://reviews.llvm.org/D89892
This adds a MultiHazardRecognizer and starts to make use of it in the
ARM backend. The idea of the class is to allow multiple independent
hazard recognizers to be added to a single base MultiHazardRecognizer,
allowing them to all work in parallel without requiring them to be
chained into subclasses. They can then be added or not based on cpu or
subtarget features, which will become useful in the ARM backend once
more hazard recognizers are being used for various things.
This also renames ARMHazardRecognizer to ARMHazardRecognizerFPMLx in the
process, to more clearly explain what that recognizer is designed for.
Differential Revision: https://reviews.llvm.org/D72939
Replace the X86 specific isSplatZeroExtended helper with a generic BuildVectorSDNode method.
I've just used this to simplify the X86ISD::BROADCASTM lowering so far (and remove isSplatZeroExtended), but we should be able to use this in more places to lower to complex broadcast patterns.
Differential Revision: https://reviews.llvm.org/D87930
This patch enables emitting DWARF `DW_OP_implicit_value` opcode when
tuning debug information for LLDB (`-debugger-tune=lldb`).
This will also propagate to Darwin platforms, since they use LLDB tuning
as a default.
rdar://67406059
Differential Revision: https://reviews.llvm.org/D90001
Signed-off-by: Med Ismail Bennani <medismail.bennani@gmail.com>
It's currently ambiguous in IR whether the source language explicitly
did not want a stack a stack protector (in C, via function attribute
no_stack_protector) or doesn't care for any given function.
It's common for code that manipulates the stack via inline assembly or
that has to set up its own stack canary (such as the Linux kernel) would
like to avoid stack protectors in certain functions. In this case, we've
been bitten by numerous bugs where a callee with a stack protector is
inlined into an __attribute__((__no_stack_protector__)) caller, which
generally breaks the caller's assumptions about not having a stack
protector. LTO exacerbates the issue.
While developers can avoid this by putting all no_stack_protector
functions in one translation unit together and compiling those with
-fno-stack-protector, it's generally not very ergonomic or as
ergonomic as a function attribute, and still doesn't work for LTO. See also:
https://lore.kernel.org/linux-pm/20200915172658.1432732-1-rkir@google.com/https://lore.kernel.org/lkml/20200918201436.2932360-30-samitolvanen@google.com/T/#u
Typically, when inlining a callee into a caller, the caller will be
upgraded in its level of stack protection (see adjustCallerSSPLevel()).
By adding an explicit attribute in the IR when the function attribute is
used in the source language, we can now identify such cases and prevent
inlining. Block inlining when the callee and caller differ in the case that one
contains `nossp` when the other has `ssp`, `sspstrong`, or `sspreq`.
Fixes pr/47479.
Reviewed By: void
Differential Revision: https://reviews.llvm.org/D87956
This was initiated from the uses of MCRegUnitIterator, so while likely
not exhaustive, it's a step forward.
Differential Revision: https://reviews.llvm.org/D89975
Implementation of instructions table.get, table.set, table.grow,
table.size, table.fill, table.copy.
Missing instructions are table.init and elem.drop as they deal with
element sections which are not yet implemented.
Added more tests to tables.s
Differential Revision: https://reviews.llvm.org/D89797
Deciding where to place debugging instructions when normal instructions
sink between blocks is difficult -- see PR44117. Dealing with this with
instruction-referencing variable locations is simple: we just tolerate
DBG_INSTR_REFs referring to values that haven't been computed yet. This
patch adds support into InstrRefBasedLDV to record when a variable value
appears in the middle of a block, and should have a DBG_VALUE added when it
appears (a debug use before def).
While described simply, this relies heavily on the value-propagation
algorithm in InstrRefBasedLDV. The implementation doesn't attempt to verify
the location of a value unless something non-trivial occurs to merge
variable values in vlocJoin. This means that a variable with a value that
has no location can retain it across all control flow (including loops).
It's only when another debug instruction specifies a different variable
value that we have to check, and find there's no location.
This property means that if a machine value is defined in a block dominated
by a DBG_INSTR_REF that refers to it, all the successor blocks can
automatically find a location for that value (if it's not clobbered). Thus
in a sense, InstrRefBasedLDV is already supporting and implementing
use-before-defs. This patch allows us to specify a variable location in the
block where it's defined.
When loading live-in variable locations, TransferTracker currently discards
those where it can't find a location for the variable value. However, we
can tell from the machine value number whether the value is defined in this
block. If it is, add it to a set of use-before-def records. Then, once the
relevant instruction has been processed, emit a DBG_VALUE immediately after
it.
Differential Revision: https://reviews.llvm.org/D85775
Downstream testing revealed some problems with this patch.
Reverting while investigating.
This reverts commit 2b96dcebfae65485859d956954f10f409abaae79.
Handle DBG_INSTR_REF instructions in LiveDebugValues, to determine and
propagate variable locations. The logic is fairly straight forwards:
Collect a map of debug-instruction-number to the machine value numbers
generated in the first walk through the function. When building the
variable value transfer function and we see a DBG_INSTR_REF, look up the
instruction it refers to, and pick the machine value number it generates,
That's it; the rest of LiveDebugValues continues as normal.
Awkwardly, there are two kinds of instruction numbering happening here: the
offset into the block (which is how machine value numbers are determined),
and the numbers that we label instructions with when generating
DBG_INSTR_REFs.
I've also restructured the TransferTracker redefVar code a little, to
separate some DBG_VALUE specific operations into its own method. The
changes around redefVar should be largely NFC, while allowing
DBG_INSTR_REFs to specify a value number rather than just a location.
Differential Revision: https://reviews.llvm.org/D85771
This patch adjusts _when_ something happens in LiveDebugValues /
InstrRefBasedLDV, to make it more amenable to dealing with DBG_INSTR_REF
instructions. There's no functional change.
In the current InstrRefBasedLDV implementation, we collect the machine
value-number transfer function for blocks at the same time as the
variable-value transfer function. After solving machine value numbers, the
variable-value transfer function is updated so that DBG_VALUEs of live-in
registers have the correct value. The same would need to be done for
DBG_INSTR_REFs, to connect instruction-references with machine value
numbers.
Rather than writing more code for that, this patch separates the two: we
collect the (machine-value-number) transfer function and solve for
machine value numbers, then step through the MachineInstrs again collecting
the variable value transfer function. This simplifies things for the new
few patches.
Differential Revision: https://reviews.llvm.org/D85760
Testing reveals that lldb and gdb have some problems with supporting
DW_OP_convert - gdb with Split DWARF tries to resolve the CU-relative
DIE offset relative to the skeleton DIE. lldb tries to treat the offset
as absolute, which judging by the llvm-dsymutil support for
DW_OP_convert, I guess works OK in MachO? (though probably llvm-dsymutil
is producing invalid DWARF by resolving the relative reference to an
absolute one?).
Specifically this disables DW_OP_convert usage in DWARFv5 if:
* Tuning for GDB and using Split DWARF
* Tuning for LLDB and not targeting MachO
Both FastRegAlloc and LiveDebugVariables/greedy need to cope with
DBG_INSTR_REFs. None of them actually need to take any action, other than
passing DBG_INSTR_REFs through: variable location information doesn't refer
to any registers at this stage.
LiveDebugVariables stashes the instruction information in a tuple, then
re-creates it later. This is only necessary as the register allocator
doesn't expect to see any debug instructions while it's working. No
equivalence classes or interval splitting is required at all!
No changes are needed for the fast register allocator, as it just ignores
debug instructions. The test added checks that both of them preserve
DBG_INSTR_REFs.
This also expands ScheduleDAGInstrs.cpp to treat DBG_INSTR_REFs the same as
DBG_VALUEs when rescheduling instructions around. The current movement of
DBG_VALUEs around is less than ideal, but it's not a regression to make
DBG_INSTR_REFs subject to the same movement.
Differential Revision: https://reviews.llvm.org/D85757
If the end instruction of the scheduling region was a DBG_VALUE, the
uses of the debug instruction were tracked as if they were real
uses. This would then hit the deadDefHasNoUse assertion in
addVRegDefDeps if the only use was the debug instruction.
This patch touches two optimizations, TwoAddressInstruction and X86's
FixupLEAs pass, both of which optimize by re-creating instructions. For
LEAs, various bits of arithmetic are better represented as LEAs on X86,
while TwoAddressInstruction sometimes converts instrs into three address
instructions if it's profitable.
For debug instruction referencing, both of these require substitutions to
be created -- the old instruction number must be pointed to the new
instruction number, as illustrated in the added test. If this isn't done,
any variable locations based on the optimized instruction are
conservatively dropped.
Differential Revision: https://reviews.llvm.org/D85756
Some of our conversion algorithms produce -0.0 when converting unsigned i64 to double when the rounding mode is round toward negative. This switches them to other algorithms that don't have this problem. Since it is undefined behavior to change rounding mode with the non-strict nodes, this patch only changes the behavior for strict nodes.
There are still problems with unsigned i32 conversions too which I'll try to fix in another patch.
Fixes part of PR47393
Reviewed By: efriedma
Differential Revision: https://reviews.llvm.org/D87115
As mentioned post-commit in D85749, the 'substituteDebugValuesForInst'
method added in c521e44defb5 would be better off with a limit on the
number of operands to substitute. This handles the common case of
"substitute the first operand between these two differing instructions",
or possibly up to N first operands.
Some instructions may be removable through processes such as IfConversion,
however DefinesPredicate can not be made aware of when this should be considered.
This parameter allows DefinesPredicate to distinguish these removable instructions
on a per-call basis, allowing for more fine-grained control from processes like
ifConversion.
Renames DefinesPredicate to ClobbersPredicate, to better reflect it's purpose
Differential Revision: https://reviews.llvm.org/D88494
Updates an optimization that relies on boolean contents being either 0
or 1 to properly check for this before triggering.
The following:
(X & 8) != 0 --> (X & 8) >> 3
Produces unexpected results when a boolean 'true' value is represented
by negative one.
Patch by Erik Hogeman.
Differential Revision: https://reviews.llvm.org/D89390
We were previously relying upon the TypeSize comparison operators to
obtain the maximum size of two types, however use of such operators is
being deprecated in favour of making the caller aware that it could
be dealing with scalable vector types. I have changed the code to assert
that the two types have the same scalable property and thus we can
simply take the maximum of the known minimum sizes instead.
Differential Revision: https://reviews.llvm.org/D88563
Comparing 32-bit `ptrdiff_t` against 32-bit `unsigned` results in
`-Wsign-compare` warnings for both GCC and Clang.
The warning for the cases in question appear to identify an issue
where the `ptrdiff_t` value would be mutated via conversion to an
unsigned type.
The warning is resolved by using the usual arithmetic conversions to
safely preserve the value of the `unsigned` operand while trying to
convert to a signed type. Host platforms where `unsigned` has the same
width as `unsigned long long` will need to make a different change, but
using an explicit cast has disadvantages that can be avoided for now.
Reviewed By: dantrushin
Differential Revision: https://reviews.llvm.org/D89612
If a target can encode multiple wait-states into a noop allow emitting such
instructions directly.
Reviewed By: rampitec, dmgreen
Differential Revision: https://reviews.llvm.org/D89753
The CfgTraits abstraction simplfies writing algorithms that are
generic over the type of CFG, and enables writing such algorithms
as regular non-template code that operates on opaque references
to CFG blocks and values.
Implementations of CfgTraits provide operations on the concrete
CFG types, e.g. `IrCfgTraits::BlockRef` is `BasicBlock *`.
CfgInterface is an abstract base class which provides operations
on opaque types CfgBlockRef and CfgValueRef. Those opaque types
encapsulate a `void *`, but the meaning depends on the concrete
CFG type. For example, MachineCfgTraits -- for use with MachineIR
in SSA form -- encodes a Register inside CfgValueRef. Converting
between concrete references and opaque/generic ones is done by
CfgTraits::{fromGeneric,toGeneric}. Convenience methods
CfgTraits::{un}wrap{Iterator,Range} are available as well.
Writing algorithms in terms of CfgInterface adds some overhead
(virtual method calls, plus in same cases it removes the
opportunity to inline iterators), but can be much more convenient
since generic algorithms can be written as non-templates.
This patch adds implementations of CfgTraits for all CFGs on
which dominator trees are calculated, so that the dominator
tree can be ported to this machinery. Only IrCfgTraits (LLVM IR)
and MachineCfgTraits (Machine IR in SSA form) are complete, the
other implementations are limited to the absolute minimum
required to make the upcoming dominator tree changes work.
v5:
- fix MachineCfgTraits::blockdef_iterator and allow it to iterate over
the instructions in a bundle
- use MachineBasicBlock::printName
v6:
- implement predecessors/successors for all CfgTraits implementations
- fix error in unwrapRange
- rename toGeneric/fromGeneric into wrapRef/unwrapRef to have naming
that is consistent with {wrap,unwrap}{Iterator,Range}
- use getVRegDef instead of getUniqueVRegDef
v7:
- std::forward fix in wrapping_iterator
- fix typos
v8:
- cleanup operators on CfgOpaqueType
- address other review comments
Change-Id: Ia75f4f268fded33fca11218a7d578c9aec1f3f4d
Differential Revision: https://reviews.llvm.org/D83088