Cover more cases in preparation for making greater use
of fcmp based lowerings. Also add more tests for the inverted
cases. Test iszero | isnan test masks. We should probably just
generate every combination of test masks.
The patch handles fixed type strict-fp by new RISCVISD::STRICT_ prefixed
isd nodes.
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D145900
Remove the `-lower-global-dtors-via-cxa-atexit` escape hatch introduced
in D121736 [1], which switched the default lowering of global
destructors on MachO to use `__cxa_atexit()` to avoid emitting
deprecated `__mod_term_func` sections.
I added this flag as an escape hatch in case the switch causes any
problems. We didn't discover any problems so now we can remove it.
[1] https://reviews.llvm.org/D121736
rdar://90277838
Differential Revision: https://reviews.llvm.org/D145715
The SystemZ backend will try to reuse an existing subtraction of two values
whenever they are to be compared for equality. This depends on the SystemZ
subtraction instruction setting the condition code, which can also signal
overflow.
A later pass will remove the compare and reuse the CC from the subtraction
directly. However, if that subtraction has the NSW flag set it will not
include the overflow bit in the updated CC user. That was a bug which can
lead to wrong results, as shown by a csmith program.
Fixes: https://github.com/llvm/llvm-project/issues/61268
Reviewed By: nikic, uweigand
Differential Revision: https://reviews.llvm.org/D145811
D144535 enabled machine copy propagation for RISC-V and added it to the
pass pipeline in addPreEmitPass2 (after the MachineOutliner).
Unfortunately, the MachineCopyPropagation pass is unable to correctly
analyse outlined functions, and will delete copy instructions where a
register is set that is intended to be live-out.
RISCVInstrInfo::buildOutlinedFrame will directly insert a JALR, while a
similar function going through the normal codegen path would have a
PseudoRet with operands indicating registers that are live-out.
This patch does the simplest fix, which is to run MachineCopyPropagation
before the MachineOutliner.
Differential Revision: https://reviews.llvm.org/D146037
Try to remove extra bitcasts around logicops if we're dealing with illegal types
Fixes the regressions in D145939
Differential Revision: https://reviews.llvm.org/D146032
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D145918
AMDGPU ISel can add extra passes when expensive checks are enabled. This means the pipeline can be reordered and the checks may fail.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D146038
i8 and i16 are not using overflow.
Reduce the number of zero extension instructions.
To reduce the uncertainty of the unknown,
most of the checks of the virtual function are kept
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D143646
Switch DAGISel over to UniformityAnalysis, which was one of the last remaining users of the DivergenceAnalysis.
No explosions seen during internal testing so this looks like a smooth transition.
Reviewed By: sameerds
Differential Revision: https://reviews.llvm.org/D145918
After this patch all arbitrary size integers (smaller than 64 bits) in
LLVM IR will be promoted to regular size type in SPIR-V (OpTypeInt
8/16/32/64).
Differential Revision: https://reviews.llvm.org/D145137
This is an initial attempt at preserving debug information in the pseudo
instruction expansion of the AArch backend. In particular, we preserve
the instruction number required by the InstrRef implementation of live
debug values.
There are many other expansions that need to be considered, but the ones
addressed in this commit should be extremely common, as they handle most
arithmetic and logical instructions.
Differential Revision: https://reviews.llvm.org/D145943
Commit 3671bdbcd214("[BPF] Fix a BTF type pruning bug") fixed a
pruning bug to allow generate more types. But the commit has a bug
which permits to generate more types than necessary. The following
is an example to illustrate the problem.
struct t1 {
int a;
};
struct t2 {
struct t1 *p1;
struct t1 *p2;
int b;
};
int foo(struct t2 *arg) {
return arg->b;
}
The following is the part of BTF generation sequence:
(1). 'struct t2 *arg' -> 'struct t1 *p1'
In this step, the type 'struct t1' will be generated as
a forward decl and the ptr type (to 'struct t1') will
be stored in the internal type table.
(2). now the second field 'struct t1 *p2' will be processed.
Since the ptr type (to 'struct t1') already in the type
table, the existing logic strips out ptr modifier and
is able to generate BTF type for 'struct t1'.
In the above step (2), if CheckPointer is true (the type traversal
chain including a struct member), 'ptr' modifier should be checked
and the subsequent type generation should be skipped since
the same case has been processed in visitDerivedType().
The issue is exposed when I am trying to use llvm15 to compile
some internal bpf programs. The bpf skeleton put the whole
ELF section (after striping some sections like dwarf) as a string.
The large BTF section triggered the following error:
bpf_object_with_struct_ops_test_prog_bpf/BpfObjectWithStructOpsTestProg.skel.h:222:23:
error: string literal of length 140144 exceeds maximum length 65536 that C++ compilers
are required to support [-Werror,-Woverlength-strings]
return (const void *)"\
^~
1 error generated.
Although adding -Wno-overlength-strings could workaround the issue,
improving llvm BTF generation sounds better esp. for users using vmlinux.h.
Differential Revision: https://reviews.llvm.org/D145816
This should be a typo in `emitConstantSizeRepmov`. Both its caller and
callee store the alignment in a 64-bit variables, no reason to truncate
it to 32-bit. It results in alignment turns into 0 when larger than
0x100000000.
Fixes#61348
Reviewed By: RKSimon
Differential Revision: https://reviews.llvm.org/D145863
Each target's `TargetInstrInfo` is responsible for announcing which code
patterns it is able to transform during the MachineCombiner pass.
Currently, these patterns are applied without preserving the debug
instruction number required by the InstrRef implementation of
LiveDebugValues. As such, we've seen a number of examples where debug
information is dropped for variables in InstrRef mode that were
otherwise available in VarLoc mode. This has been observed both in X86
and AArch examples.
This commit is an initial attempt at preserving said numbers by changing
the general (target agnostic) implementation of TargetInstrInfo: the
reassociation pattern must keep the debug number of the "top level"
instruction, i.e., the instruction whose value represents the final
value of the arithmetic expression. Intermediate values must have their
debug number dropped, as they have no equivalent value in the
unoptimized code.
Future work is required to update each target's
`TargetInstrInfo::genAlternativeCodeSequence` method.
Differential Revision: https://reviews.llvm.org/D145759
Without the single use restriction we may replace the and with a
more costly duplicated compare.
Differential Revision: https://reviews.llvm.org/D145755
Fix mistake in f16 table in previous patch.
Original commit message:
Use separate lookup tables instead of trying to reuse the fli.s
table.
We were missing the 2 denormal cases for fli.h. We also had an issue
where fli.d was only checking 8 bits of the 11 bit exponent.
[DAGCombiner] handle more store value forwarding
When lowering calls on target like PPC, some stack loads
will be generated for by value parameters. Node CALLSEQ_START
prevents such loads from being combined.
Suggested by @RolandF, this patch removes the unnecessary
loads for the byval parameter by extending ForwardStoreValueToDirectLoad
Reviewed By: nemanjai, RolandF
Differential Revision: https://reviews.llvm.org/D138899
Use separate lookup tables instead of trying to reuse the fli.s
table.
We were missing the 2 denormal cases for fli.h. We also had an issue
where fli.d was only checking 8 bits of the 11 bit exponent.
We fail to use fli.h for the 2 denormal values.
We use fli.d for some values where the value is larger than a float
can represent due to truncating the exponent to 8 bits without checking
if it fits in 8 bits.
Post ISel, LDS variables are absolute values. Representing them as
such is simpler than the frame recalculation currently used to build assembler
tables from their addresses.
This is a precursor to lowering dynamic/external LDS accesses from non-kernel
functions.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D144221
Try to more aggressively narrow masks of extended values.
This is mainly for cases where the mask is trying to zero out any_extended upper bits, assuming we can zext/trunc the values for free.
This catches a few actual missed folds, as well as helps canonicalize a number of other cases which were being caught in isel etc.
Differential Revision: https://reviews.llvm.org/D145866
This came about while investigating ways to handle D145468 in a more generic manner, which involves trying harder to fold and(zext(x),c) -> zext(and(x,c))
Alive2: https://alive2.llvm.org/ce/z/7fXtDt (generic fold)
Differential Revision: https://reviews.llvm.org/D145855
On 64-bit target, when doing i64 BR_CC where one of the comparison operands is a
constant zero, try to fold the compare and BPcc into a BPr instruction.
For all integers, EQ and NE comparison are available, additionally for signed
integers, GT, GE, LT, and LE is also available.
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D142461
Integrate the BranchRelaxation pass to help with relaxing out-of-range
conditional branches.
This is mostly of concern for SPARCv9, which uses conditional branches with
much smaller range than its v8 counterparts.
(Some large autogenerated code, such as the ones generated by TableGen, already
hits this limitation when building in Release)
Reviewed By: arsenm
Differential Revision: https://reviews.llvm.org/D142458