This patch updates several functions in LLVM's IR generation code to accept
an IRBuilder object as an argument, rather than an Instruction that indicates
the insertion point for new instructions.
This change is necessary to handle sophisticated -Ofast optimization cases
from D148558 where it's unclear which instructions should be used as the
insertion point for new operations.
Differential Revision: https://reviews.llvm.org/D148703
As discussed in D151436, it's safe to do this as a simple shift (as is
done in LegalizeDAG.cpp) rather than needing a libcall. The added test
cases for RISC-V previously just triggered an assertion.
Codegen for bfloat_to_double will be slightly improved by D151434.
Differential Revision: https://reviews.llvm.org/D151563
Copying a large object in a range-based for loop calls the copy constructor
on every iteration, which can be avoided by iterating by reference instead.
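As a rough illustration of the difference (not the code touched by this patch), the sketch below counts copy-constructor calls for both loop forms. `BigObj` and the two helper functions are hypothetical names invented for the example.

```cpp
#include <vector>

// Hypothetical "big" element type that counts how often it is copied.
struct BigObj {
  static inline int Copies = 0;
  int Payload[64] = {};
  BigObj() = default;
  BigObj(const BigObj &Other) {
    for (int I = 0; I != 64; ++I)
      Payload[I] = Other.Payload[I];
    ++Copies;
  }
};

// Iterating by value copy-constructs one BigObj per element.
int countCopiesByValue(const std::vector<BigObj> &Objs) {
  BigObj::Copies = 0;
  int Sum = 0;
  for (BigObj Obj : Objs)
    Sum += Obj.Payload[0];
  (void)Sum;
  return BigObj::Copies;
}

// Iterating by const reference binds to the existing element, so no copy.
int countCopiesByRef(const std::vector<BigObj> &Objs) {
  BigObj::Copies = 0;
  int Sum = 0;
  for (const BigObj &Obj : Objs)
    Sum += Obj.Payload[0];
  (void)Sum;
  return BigObj::Copies;
}
```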
Reviewed By: skan
Differential Revision: https://reviews.llvm.org/D150024
Class InstructionRemover manages resources such as dynamically allocated memory. For such classes, it is generally good practice to either implement a custom copy constructor or disable the default one.
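A minimal sketch of the "disable the default one" option follows. `OwningBuffer` is a hypothetical stand-in for a resource-owning class; it is not the InstructionRemover definition, and the move constructor is an extra added only for the example.

```cpp
#include <type_traits>
#include <utility>

// Hypothetical class that owns a heap allocation.
class OwningBuffer {
  int *Data;

public:
  explicit OwningBuffer(unsigned N) : Data(new int[N]()) {}
  ~OwningBuffer() { delete[] Data; }

  // A defaulted copy would duplicate the raw pointer and lead to a double
  // delete, so copying is disabled outright.
  OwningBuffer(const OwningBuffer &) = delete;
  OwningBuffer &operator=(const OwningBuffer &) = delete;

  // Moving transfers ownership and leaves the source empty.
  OwningBuffer(OwningBuffer &&Other) noexcept
      : Data(std::exchange(Other.Data, nullptr)) {}

  int *data() const { return Data; }
};

static_assert(!std::is_copy_constructible_v<OwningBuffer>);
static_assert(std::is_move_constructible_v<OwningBuffer>);
```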
Reviewed By: pengfei
Differential Revision: https://reviews.llvm.org/D151543
This reverts commit 9b92f70d4758f75903ce93feaba5098130820d40. The issue
with the re-applied change was an implicit truncation due to the
multiplication. Although the operations were converted to `APInt`, the
values were implicitly converted to `long` due to the typing rules.
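The same class of mistake can be reproduced with plain fixed-width integers. The sketch below is only an analogy: it uses `uint32_t` and `uint64_t` rather than `APInt` and `long`, assumes a platform where `int` is 32 bits wide, and the function names are made up for the example.

```cpp
#include <cstdint>

// The product is computed in the 32-bit operand type and only widened
// afterwards, so the high bits are already lost.
uint64_t mulNarrow(uint32_t A, uint32_t B) { return A * B; }

// Widening one operand first makes the multiplication happen in 64 bits.
uint64_t mulWide(uint32_t A, uint32_t B) {
  return static_cast<uint64_t>(A) * B;
}
```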
Fixes: #59594
Differential Revision: https://reviews.llvm.org/D140347
The generic implementation is umin(TC, VF * vscale).
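Written out as scalar code, the formula above looks roughly like this. The function name and the plain integer parameters are assumptions for the sketch; the actual lowering operates on DAG nodes.

```cpp
#include <algorithm>
#include <cstdint>

// TC:     remaining trip count
// VF:     vectorization factor (minimum element count)
// VScale: runtime vscale value (1 for fixed-length vectors)
uint64_t genericVectorLength(uint64_t TC, uint64_t VF, uint64_t VScale) {
  return std::min(TC, VF * VScale); // umin(TC, VF * vscale)
}
```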
Lowering to vsetvli for RISC-V will come in a future patch.
This patch is a pre-requisite to be able to CodeGen vectorized code from
D99750.
Reviewed By: reames, frasercrmck
Differential Revision: https://reviews.llvm.org/D149916
For dbg.value intrinsics targeting an llvm::Argument address whose expression
starts with an entry value, we lower this to a DEBUG_VALUE targeting the livein
physical register corresponding to that Argument.
Depends on D151332
Differential Revision: https://reviews.llvm.org/D151333
This will make it easier to add more cases in a subsequent commit and also
better conforms to the coding guidelines.
Depends on D151330
Differential Revision: https://reviews.llvm.org/D151331
Summary:
DbgValue intrinsics whose expression is an entry_value and whose address is
described by an llvm::Argument must be lowered to the corresponding livein
physical register for that Argument.
Depends on D151329
Reviewers: aprantl
For dbg.value intrinsics targeting an llvm::Argument address whose expression
starts with an entry value, we lower this to a DEBUG_VALUE targeting the livein
physical register corresponding to that Argument.
Depends on D151328
Differential Revision: https://reviews.llvm.org/D151329
This was originally added to preserve FMF on SETCC. Unfortunately,
it also incorrectly preserves nuw/nsw on ADD/SUB in some cases.
There's also no guarantee the new opcode is even the same opcode
as the original node.
This patch removes the code and adds code to explicitly preserve
FMF flags in the SETCC promotion function.
The other test changes are from nuw/nsw not being preserved. I
believe for all these tests it was correct to preserve the flags,
so we need new code to preserve the flags when possible. I'll post
another patch for that since it's a riskier change.
This should unblock D150769.
Differential Revision: https://reviews.llvm.org/D151472
When the worklist is initially being formed, there is no need to
consider all nodes for pruning. This is because the first time calling
getNextWorklistEntry will only clear those nodes which have no uses,
with their operands being added to the worklist. However, when the worklist is
created for the first time all nodes are added anyways, so this operation
actually ends up adding no nodes.
This patch adds a parameter IsCandidateForPruning to AddToWorklist with a
default value of true to avoid having to update every call site.
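The shape of that change can be sketched with a toy worklist. `ToyWorklist` and its members are hypothetical and much simpler than the real combiner; only the defaulted parameter mirrors the description above.

```cpp
#include <vector>

struct ToyWorklist {
  std::vector<int> Nodes;           // node ids queued for processing
  std::vector<int> PruneCandidates; // node ids to check for pruning later

  // Defaulting the flag to true keeps every existing call site unchanged;
  // only the initial bulk population passes false.
  void addToWorklist(int Node, bool IsCandidateForPruning = true) {
    if (IsCandidateForPruning)
      PruneCandidates.push_back(Node);
    Nodes.push_back(Node);
  }
};
```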
Differential Revision: https://reviews.llvm.org/D151416
Make sure we do not crash in rfindDebugLoc when starting at
instr_rend(). The solution is to treat this as starting one MI before
the first MI, so that we can search forward from instr_begin()
instead.
This behavior is similar to how findPrevDebugLoc(instr_end()) works.
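A toy model of the special case, using a `std::vector` in place of a MachineBasicBlock. `ToyInstr`, `findLoc` and `rfindLoc` are hypothetical names, a location of 0 stands for "no DebugLoc", and the forward search direction is assumed from the description above.

```cpp
#include <iterator>
#include <vector>

// Toy instruction: debug instructions are skipped when looking for a location.
struct ToyInstr {
  bool IsDebug;
  int Loc; // 0 means "no location"
};
using Block = std::vector<ToyInstr>;

// Forward search: location of the first non-debug instruction at or after It.
int findLoc(const Block &B, Block::const_iterator It) {
  for (; It != B.end(); ++It)
    if (!It->IsDebug)
      return It->Loc;
  return 0;
}

// Same search, but the starting point is given as a reverse iterator.
int rfindLoc(const Block &B, Block::const_reverse_iterator RIt) {
  // rend() is "one before the first instruction": there is nothing to
  // dereference, so start the forward search at begin() instead.
  if (RIt == B.rend())
    return findLoc(B, B.begin());
  // std::next(RIt).base() is the forward iterator to the element RIt names.
  return findLoc(B, std::next(RIt).base());
}
```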
Differential Revision: https://reviews.llvm.org/D150577
- Add some unittests for the findDebugLoc, rfindDebugLoc,
findPrevDebugLoc and rfindPrevDebugLoc helpers in MachineBasicBlock.
- Clean up code comments and code formatting related to the functions
mentioned above.
This was extracted as a pre-commit to D150577, and some of the tests
are commented out since they would crash/assert in a rather
uncontrolled way.
This will make it easier to add more cases in a subsequent commit and also
better conforms to the coding guidelines.
Differential Revision: https://reviews.llvm.org/D151328
This is an attempt to reland D42600 and enable this optimisation by default.
This also resolves the issue pointed out in the context of PGO build.
Differential Revision: https://reviews.llvm.org/D42600
D151036 adds an assertion that prohibits iterating over sub- and
super-registers of a null register. This is already the case when
iterating over register units of a null register, and worked by
accident for sub- and super-registers.
The only place where the assertion is currently triggering is in
CriticalAntiDepBreaker::ScanInstruction. Other places are changed
in case new assertions are added and should be harmless otherwise.
Live intervals for physical registers are calculated lazily on demand.
In a case like this:
16B %0:gpr32 = IMPLICIT_DEF
32B $wzr = COPY %0
if the live interval for $wzr did not already exist then the update code
in joinReservedPhysReg would create it with a definition at 32B, which
would remain even after the COPY was deleted.
Differential Revision: https://reviews.llvm.org/D151314
D151036 adds an assertion that prohibits iterating over sub- and
super-registers of a null register. This is already the case when
iterating over register units of a null register, and worked by
accident for sub- and super-registers.
The only place where the assertion is currently triggering is in
CriticalAntiDepBreaker::ScanInstruction. Other places are changed
in case new assertions are added and should be harmless otherwise.
Differential Revision: https://reviews.llvm.org/D151288
This information helps to avoid considering cloning for blocks with indirect branches.
Reviewed By: jhenderson
Differential Revision: https://reviews.llvm.org/D150611
salvageDebugInfo is a function that allows us to retain debug info for
instructions that have been optimized out. Currently, it doesn't support
salvaging the debug information from icmp instructions, but DWARF
expressions can emulate an icmp by using the DWARF conditional
expressions. This patch adds support for salvaging debug information
from icmp instructions.
Differential Revision: https://reviews.llvm.org/D150216
Without frame pointers, the locations of variables on the stack are emitted
relative to the stack pointer (via the stack pointer being the value of
DW_AT_frame_base on the subprogram). If a call modifies the stack pointer
this results in the locations being wrong and the debugger displaying the
wrong values for variables.
By using DW_OP_call_frame_cfa in these situations the emitted location for
the variable will automatically handle changes in the stack pointer
(provided LLVM is emitting the correct CFI directives elsewhere, of course).
The CFA needs to be adjusted for the size of the stack frame (including the
return address) to allow the variable locations themselves to remain
unchanged by this patch.
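The arithmetic behind this can be sketched with plain integers. Everything here is illustrative: the struct and function names are made up, and the model assumes that CFA minus the frame size equals the stack pointer value right after the prologue.

```cpp
#include <cstdint>

struct ToyFrame {
  uint64_t CFA;       // canonical frame address, fixed while the frame lives
  uint64_t FrameSize; // stack frame size, including the return address
};

// SP-relative location: changes whenever the current SP changes.
uint64_t addrFromSP(uint64_t CurrentSP, int64_t Offset) {
  return CurrentSP + Offset;
}

// CFA-relative location: the base is adjusted by the frame size so the
// variable offsets stay the same as in the SP-relative scheme.
uint64_t addrFromCFA(const ToyFrame &F, int64_t Offset) {
  return F.CFA - F.FrameSize + Offset;
}
```

With a CFA of 0x1000 and a 0x20-byte frame, both functions give 0xFE8 for offset 8 while SP is 0xFE0. Once a call sequence moves SP to 0xFD8, only the CFA-based form still yields 0xFE8.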
Certain LLDB features cannot cope with DW_OP_call_frame_cfa, so this change
is heuristically limited to the cases where it's necessary for correctness
to minimize the fallout there.
Reviewed By: #debug-info, scott.linder, jryans, jmorse
Differential Revision: https://reviews.llvm.org/D143463
There are two motivations.
The `__stack_chk_guard` symbol created by
`-fno-pic -fstack-protector -mstack-protector-guard=global` is referenced
directly on all ELF OSes except FreeBSD.
This patch allows referencing the symbol indirectly with
-fno-direct-access-external-data.
Some Linux kernel folks want the `__stack_chk_guard` symbol created by
`-fno-pic -fstack-protector -mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard`
to be referenced directly, avoiding
R_X86_64_REX_GOTPCRELX (even if the relocation may be optimized out by the linker).
https://github.com/llvm/llvm-project/issues/60116
Why they need this isn't so clear to me.
---
Add module flag "direct-access-external-data" and set the dso_local property of
the stack protector symbol. The module flag can benefit other LLVMCodeGen
synthesized symbols that are not represented in LLVM IR.
Nowadays, with `-fno-pic` being uncommon, ideally we should set
"direct-access-external-data" when it is true. However, doing so would require
~90 clang/test tests to be updated, which is too many.
As a compromise, we set "direct-access-external-data" only when it's different
from the implied default value.
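The compromise can be sketched as a small decision function. The function name is hypothetical, and the sketch assumes that direct access is the implied default exactly when the code is not PIC; it does not show the actual clang code.

```cpp
#include <optional>

// Returns the value to record in the module flag, or std::nullopt when the
// flag is omitted because the setting matches the implied default.
std::optional<bool> directAccessFlagToEmit(bool IsPIC,
                                           bool DirectAccessExternalData) {
  // Assumption for this sketch: non-PIC code implies direct access.
  bool ImpliedDefault = !IsPIC;
  if (DirectAccessExternalData == ImpliedDefault)
    return std::nullopt;
  return DirectAccessExternalData;
}
```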
Reviewed By: nickdesaulniers
Differential Revision: https://reviews.llvm.org/D150841
We can compute a simpler expression for Lo for these cases. This
is an alternative for the test cases in D151180 that works for
more targets.
This is similar to some of the special cases we have for expanding
setcc operands.
Differential Revision: https://reviews.llvm.org/D151182
Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from
`ComplexDeinterleavingPass.h` and move it to the corresponding source
file.
Add missing includes that were transitively included by this header to 3
other source files.
This reduces the total number of preprocessing tokens across the LLVM
source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a
reduction of ~1.52%. This should result in a small improvement in
compilation time.
Currently we use RTTI objects to check type compatibility. To support non-unique
RTTI objects, commit 5745eccef54ddd3caca278d1d292a88b2281528b added a
`checkTypeInfoEquality` string matching to the runtime.
The scheme is inefficient.
```
_Z1fv:
.long 846595819 # jmp
.long .L__llvm_rtti_proxy-_Z3funv
...
main:
...
# Load the second word (pointer to the RTTI object) and dereference it.
movslq 4(%rsi), %rax
movq (%rax,%rsi), %rdx
# Is it the desired typeinfo object?
leaq _ZTIFvvE(%rip), %rax
# If not, call __ubsan_handle_function_type_mismatch_v1, which may recover if checkTypeInfoEquality allows
cmpq %rax, %rdx
jne .LBB1_2
...
.section .data.rel.ro,"aw",@progbits
.p2align 3, 0x0
.L__llvm_rtti_proxy:
.quad _ZTIFvvE
```
Let's replace the indirect `_ZTI` pointer with a type hash similar to
`-fsanitize=kcfi`.
```
_Z1fv:
.long 3238382334
.long 2772461324 # type hash
main:
...
# Load the second word (callee type hash) and check whether it is expected
cmpl $-1522505972, -4(%rax)
# If not, fail: call __ubsan_handle_function_type_mismatch
jne .LBB2_2
```
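Note that the `.long 2772461324` word and the `cmpl $-1522505972` immediate in the listing above are the same 32-bit pattern, printed once as unsigned and once as signed. A small check, with a helper name chosen only for this example:

```cpp
#include <cstdint>

// Convert the signed immediate printed in the cmpl instruction to the
// unsigned 32-bit word that appears in the .long directive.
constexpr uint32_t asUnsignedWord(int32_t Imm) {
  return static_cast<uint32_t>(Imm);
}

static_assert(asUnsignedWord(-1522505972) == 2772461324u,
              "signed immediate and unsigned .long are the same bits");
```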
The RTTI object derives its name from `clang::MangleContext::mangleCXXRTTI`,
which uses `mangleType`. `mangleTypeName` uses `mangleType` as well. So the
type compatibility change is high-fidelity.
Since we no longer need RTTI pointers in
`__ubsan::__ubsan_handle_function_type_mismatch_v1`, let's switch it back to
version 0, the original signature before
e215996a2932ed7c472f4e94dc4345b30fd0c373 (2019).
`__ubsan::__ubsan_handle_function_type_mismatch_abort` is not
recoverable, so we can revert some changes from
e215996a2932ed7c472f4e94dc4345b30fd0c373.
Reviewed By: samitolvanen
Differential Revision: https://reviews.llvm.org/D148785
Remove the unnecessary `"llvm/IR/PatternMatch.h"` include directive from
`ComplexDeinterleavingPass.h` and move it to the corresponding source
file.
Add missing includes that were transitively included by this header to 2
other source files.
This reduces the total number of preprocessing tokens across the LLVM
source files in `lib` from (roughly) 1,964,876,961 to 1,935,091,611 - a
reduction of ~1.52%. This should result in a small improvement in
compilation time.
Differential Revision: https://reviews.llvm.org/D150514
I don't really understand what the point of wip_match_opcode is.
It doesn't seem to have any purpose other than to list opcodes
to have all the logic in pure C++. You can't seem to use it to
select multiple opcodes in the same way you use match.
Something is wrong with it, since the match emitter prints
"errors" if an opcode is covered by wip_match_opcode and
then appears in another pattern. For example with this patch,
you see this several times in the build:
error: Leaf constant_fold_fabs is unreachable
note: Leaf idempotent_prop will have already matched
The combines are actually produced and the tests for them
do pass, so this seems to just be a broken warning.
This matches what scavengeRegisterBackwards does.
This is in preparation for converting most uses of scavengeRegister to
scavengeRegisterBackwards, to reduce test case churn when that lands and
to help with bisection if anything goes wrong.
Differential Revision: https://reviews.llvm.org/D150792
This getZExtOrTrunc seems to have been added when getPtrExtOrTrunc
was introduced. getPtrExtOrTrunc is currently equivalent to getZExtOrTrunc,
but could be changed for some target in the future.
Reviewed By: t.p.northover
Differential Revision: https://reviews.llvm.org/D149680
This patch-set aims to simplify the existing RVV segment load/store
intrinsics to use a type that represents a tuple of vectors instead.
To achieve this, first we need to relax the current limitation for an
aggregate type to be a target of load/store/alloca when the aggregate
type contains homogeneous scalable vector types. Then we adjust the
prolog of an LLVM function during lowering to clang. Finally we
re-define the RVV segment load/store intrinsics to use the tuple types.
The pull request under the RVV intrinsic specification is
riscv-non-isa/rvv-intrinsic-doc#198
---
This is the 1st patch of the patch-set. This patch originated from
D98169.
This patch allows aggregate type (StructType) that contains homogeneous
scalable vector types to be a target of load/store/alloca. The RFC of
this patch was posted in LLVM Discourse.
https://discourse.llvm.org/t/rfc-ir-permit-load-store-alloca-for-struct-of-the-same-scalable-vector-type/69527
The main changes in this patch are:
- Extend `StructLayout::StructSize` from `uint64_t` to `TypeSize` to
accommodate an expression of scalable size.
- Allow `StructType::isSized` to also return true for homogeneous
scalable vector types.
- Let `Type::isScalableTy` return true when `Type` is `StructType`
and contains scalable vectors.
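To show why a plain `uint64_t` is not enough for the struct size, here is a simplified stand-in for a possibly scalable size. `ToyTypeSize` and both functions are hypothetical; the sketch ignores padding and is not the real `TypeSize` or `StructLayout` code.

```cpp
#include <cstdint>

// A size that is either fixed or a known minimum multiplied by vscale.
struct ToyTypeSize {
  uint64_t KnownMin;
  bool Scalable;
};

// Size of a struct whose fields all share one element type: the minimum
// sizes add up, and the result is scalable if the element is.
ToyTypeSize homogeneousStructSize(ToyTypeSize Elem, unsigned NumFields) {
  return {Elem.KnownMin * NumFields, Elem.Scalable};
}

// Concrete byte count once vscale is known at run time.
uint64_t sizeAtVScale(ToyTypeSize S, uint64_t VScale) {
  return S.Scalable ? S.KnownMin * VScale : S.KnownMin;
}
```

For a struct of two vectors with a 16-byte minimum size each, the result is a scalable size with a 32-byte minimum, which becomes 128 bytes when vscale is 4.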
Extra description is added in the LLVM Language Reference Manual on the
relaxation of this patch.
Authored-by: Hsiangkai Wang <kai.wang@sifive.com>
Co-Authored-by: eop Chen <eop.chen@sifive.com>
Reviewed By: craig.topper, nikic
Differential Revision: https://reviews.llvm.org/D146872
The current implementation of -fsanitize=function places two words (the prolog
signature and the RTTI proxy) at the function entry, which makes the feature
incompatible with Intel Indirect Branch Tracking (IBT) that needs an ENDBR instruction
at the function entry. To allow the combination, move the two words before the
function entry, similar to -fsanitize=kcfi.
Armv8.5 Branch Target Identification (BTI) has a similar requirement.
Note: for IBT and BTI, whether a function gets a marker instruction at the entry
generally cannot be assumed (it can be disabled by a function attribute or
stronger LTO optimizations).
It is extremely unlikely for two words preceding a function entry to be
inaccessible. One way to achieve this is by ensuring that a function is
aligned at a page boundary and making the preceding page unmapped or
unreadable. This is not reasonable for application or library code.
(Think: the first text section has crt* code not instrumented by
-fsanitize=function.)
We use 0xc105cafe for all targets. .long 0xc105cafe disassembles to invalid
instructions on all architectures I have tested, except Power where it is
`lfs 8, -13570(5)` (Load Floating-Point with a weird offset, unlikely to be used in real code).
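A toy model of the new layout, with memory represented as an array of 32-bit words. The function name is hypothetical, and returning nothing when the signature is absent is an assumption of this sketch rather than a description of the emitted check.

```cpp
#include <cstdint>
#include <optional>

constexpr uint32_t PrologSignature = 0xc105cafe;
static_assert(PrologSignature == 3238382334u, "decimal form seen in .long");

// Entry points at the first word of the function body. The two words in
// front of it hold the signature and the second word (the RTTI proxy offset
// in this patch).
std::optional<uint32_t> secondWordIfInstrumented(const uint32_t *Entry) {
  if (Entry[-2] != PrologSignature)
    return std::nullopt; // assumed: uninstrumented callee, nothing to check
  return Entry[-1];
}
```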
---
For the removed function in AsmPrinter.cpp, remove an assert: `mdconst::extract`
already asserts non-nullness.
For compiler-rt/test/ubsan/TestCases/TypeCheck/Function/function.cpp,
when the function doesn't have prolog/epilog (-O1 and above), after moving the two words,
the address of the function equals the address of ret instruction,
so symbolizing the function will additionally get a non-zero column number.
Adjust the test to allow an optional column number.
```
.long 3238382334
.long .L__llvm_rtti_proxy-_Z1fv
_Z1fv: // symbolizing here retrieves the line table entry from the second .loc
.file 0 ...
.loc 0 1 0
.cfi_startproc
.loc 0 2 1 prologue_end
retq
```
Reviewed By: peter.smith
Differential Revision: https://reviews.llvm.org/D148665
Fixes: https://github.com/llvm/llvm-project/issues/62725
This patch fixes an error in which a DBG_INSTR_REF referring to a DBG_PHI in a
block that is not directly reachable from the entry block results in a crash
during LiveDebugValues. Note that this fix prevents a crash from occurring, but
will give undef locations to users of these PHIs even if a valid location exists.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D150707
Not sure what this was originally intended for, but this seems to be
unused. It didn't seem to be used when it was first added in D64630
either.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D150606
Some metadata prettyprinting, including variable prettyprinting and
debug line info comments, is currently only supported for `DBG_VALUE`.
This allows `DBG_INSTR_REF` to be printed in the same way.
Reviewed By: jmorse
Differential Revision: https://reviews.llvm.org/D150620