Shufflevector semantics have changed so that poison mask elements
return poison rather than undef. Reflect this in the
canCreateUndefOrPoison() implementation.
`createFunctionType` returns a FunctionType that may contain a mask,
which is currently placed as the last parameter to the Function.
The placement happens according to `VFParameters` of `VFInfo`, and it
should be able to handle VFABI specification changes.
Regarding the return type, it uses the scalar type of the input instruction,
as the specification does not encode such information in the mangled name.
If that ever happens, that information should be available from `VFInfo`.
This patch replaces uses of StringRef::{starts,ends}with with
StringRef::{starts,ends}_with for consistency with
std::{string,string_view}::{starts,ends}_with in C++20.
I'm planning to deprecate and eventually remove
StringRef::{starts,ends}with.
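The underscore spelling matches what `std::string_view` exposes in C++20. As a rough illustration of the semantics only, the following is a minimal sketch using hypothetical free functions over `std::string_view`; it is not LLVM's `StringRef` implementation.

```cpp
#include <cassert>
#include <string_view>

// Hypothetical stand-ins for StringRef::starts_with / ends_with.
// Written against std::string_view so the sketch builds as C++17.
constexpr bool starts_with(std::string_view S, std::string_view Prefix) {
  return S.size() >= Prefix.size() && S.substr(0, Prefix.size()) == Prefix;
}

constexpr bool ends_with(std::string_view S, std::string_view Suffix) {
  return S.size() >= Suffix.size() &&
         S.substr(S.size() - Suffix.size()) == Suffix;
}

// Compile-time checks of the two helpers.
static_assert(starts_with("_ZGVnN2v_foo", "_ZGV"), "prefix should match");
static_assert(ends_with("foo.ll", ".ll"), "suffix should match");
```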
The below changes were made:
- test the vector and scalar names, the ISA, and the presence or absence of a
mask in all tests where that is relevant
- test the number of parameters, their order and types, and the mask if present
- replaced function names like `sin` with `foo` to make it clearer that these
are not existing functions but test placeholders.
- using mostly `i32` for parameters where it is not relevant, except when the VF
of elements is explicitly checked, and `ptr` for references.
Also using `void` for return types.
- all `VFABIParserTest` tests are now listed contiguously in the source.
- Removed duplicate ISA tests
- Added an extra test to clearly show that the mangled name becomes the
VectorName, when no VectorName is specified.
- Use `VFInfo` for `isMasked`
- Minor code refactoring, cleanup, and improved comments
Minor simplification applied to VFShape::getScalarShape,
VFShape::get, and VFABI::tryDemangleForVFABI methods.
Also, remove unnecessary `static_cast` in `SLPVectorizer.cpp`
It's not safe for InstCombine to add disjoint metadata when converting
Add to Or otherwise.
I've added noundef attribute to preserve existing test behavior.
We can determine the VF from a combination of the mangled name (which
indicates the arguments that take vectors) and the element sizes of
the arguments for the scalar function the mapping has been established
for.
The assert when demangling fails has been removed in favour of just
not adding the mapping, which prevents the crash seen in
https://github.com/llvm/llvm-project/issues/71892
This patch also stops using _LLVM_ as an ISA for scalable vector tests,
since there aren't defined rules for the way vector arguments should be
handled (e.g. packed vs. unpacked representation).
With opaque pointers enabled, the existing ptr-to-ptr bitcast is a no-op
and no longer creates a constant that references the old function.
Replace the no-op bitcast with code that creates a constant that
references the old function. The test now fails if the 1 new line of
code added to `CallGraphUpdater::replaceFunctionWith()` in
cb0ecc5c33bd56a3eed0fa30ac787accec45d637 is removed (test passes if kept
intact).
---------
Co-authored-by: Nikita Popov <github@npopov.com>
`i64 @labs(i32)` is incorrectly recognized as `LibFunc_labs` because
type ID `Long` matches both `i32` and `i64`. This PR requires the type
of argument to match the return value.
Fixes #69059.
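The idea behind the check can be sketched as follows. This is a simplified model, not the TargetLibraryInfo code: the `IntTy` enum and both helper functions are hypothetical names introduced for illustration.

```cpp
#include <cassert>

// Hypothetical model of the integer types involved.
enum class IntTy { I32, I64 };

// "Long" is target-dependent, so the type ID accepts either width.
inline bool matchesLong(IntTy T) { return T == IntTy::I32 || T == IntTy::I64; }

// Old behaviour (sketch): return and argument are checked independently,
// so a mixed signature such as `i64 (i32)` is accepted.
inline bool looksLikeLabsOld(IntTy Ret, IntTy Arg) {
  return matchesLong(Ret) && matchesLong(Arg);
}

// New behaviour (sketch): additionally require the argument type to be
// the same as the return type.
inline bool looksLikeLabsNew(IntTy Ret, IntTy Arg) {
  return looksLikeLabsOld(Ret, Arg) && Ret == Arg;
}
```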
In the failure case we return null, which callers are checking. We were
also returning an fcNone which was unused. It's more consistent to
return fcAllFlags as any possible value, such that the value is always
directly usable without checking the returned value.
The `BlockFrequency` class abstracts `uint64_t` frequency values. Use it
more consistently in various APIs and disable implicit conversion to
make usage more consistent and explicit.
- Use `BlockFrequency Freq` parameter for `setBlockFreq`,
`getProfileCountFromFreq` and `setBlockFreqAndScale` functions.
- Return `BlockFrequency` in `getEntryFreq()` functions.
- While at it, change some `const BlockFrequency &Freq` parameters to
plain `BlockFrequency Freq`.
- Mark `BlockFrequency(uint64_t)` constructor as explicit.
- Add missing `BlockFrequency::operator!=`.
- Remove `uint64_t BlockFrequency::getMaxFrequency()`.
- Add `BlockFrequency BlockFrequency::max()` function.
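The shape of the resulting interface can be sketched roughly as below. This is a reduced stand-in for the real class that models only the members named in the list; `halve` is a hypothetical helper showing the by-value parameter style.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>
#include <type_traits>

// Reduced stand-in for llvm::BlockFrequency (illustration only).
class BlockFrequency {
  uint64_t Frequency = 0;

public:
  BlockFrequency() = default;
  // explicit: a raw integer no longer converts implicitly.
  explicit BlockFrequency(uint64_t Freq) : Frequency(Freq) {}

  // Replaces the old getMaxFrequency(), which returned a bare uint64_t.
  static BlockFrequency max() {
    return BlockFrequency(std::numeric_limits<uint64_t>::max());
  }

  uint64_t getFrequency() const { return Frequency; }

  bool operator==(BlockFrequency RHS) const {
    return Frequency == RHS.Frequency;
  }
  bool operator!=(BlockFrequency RHS) const {
    return Frequency != RHS.Frequency;
  }
};

// The wrapper is a single uint64_t, so passing it by value is cheap.
// `halve` is a hypothetical helper, not an LLVM API.
inline BlockFrequency halve(BlockFrequency Freq) {
  return BlockFrequency(Freq.getFrequency() / 2);
}

// With the explicit constructor, uint64_t is not implicitly convertible.
static_assert(!std::is_convertible<uint64_t, BlockFrequency>::value,
              "implicit conversion should be disabled");
```

With this shape, a call such as `halve(42)` would no longer compile; the caller has to spell out `halve(BlockFrequency(42))`.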
Currently the mappings from TLI are used to generate the list of
available "scalar to vector" mappings attached to scalar calls as
"vector-function-abi-variant" LLVM IR attribute. Function names from TLI
are wrapped in a mangled name following the pattern:
_ZGV<isa><mask><vlen><parameters>_<scalar_name>[(<vector_redirection>)]
The problem is that the mangled name uses _LLVM_ as the ISA name, which
prevents the compiler from computing the vectorization factor for scalable
vectors, as it cannot make any decision based on the _LLVM_ ISA. If we
use "s" as the ISA name, the compiler can make decisions based on the VFABI
specification, where SVE-specific rules are described.
This patch is only a refactoring stage where there is no change to the
compiler's behaviour.
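To make the pattern above concrete, here is a minimal sketch that splits a mangled name into its components. It is not `VFABI::tryDemangleForVFABI`: `MangledParts` and `splitMangledName` are hypothetical names, the parameter string is left unparsed, and the only assumptions are a single-character ISA (or the literal `_LLVM_`), `M`/`N` for the mask, and digits or `x` for `<vlen>`.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

// Hypothetical result type; the field names are illustrative.
struct MangledParts {
  std::string ISA;
  bool Masked = false;
  std::string VLen;       // decimal digits, or "x" for scalable
  std::string Parameters; // left unparsed in this sketch
  std::string ScalarName;
  std::string VectorName;
};

inline std::optional<MangledParts> splitMangledName(std::string_view Name) {
  const std::string_view Full = Name;
  constexpr std::string_view Prefix = "_ZGV";
  constexpr std::string_view LLVMISA = "_LLVM_";
  if (Name.substr(0, Prefix.size()) != Prefix)
    return std::nullopt;
  Name.remove_prefix(Prefix.size());

  MangledParts P;
  // <isa>: either the internal "_LLVM_" token or a single character.
  if (Name.substr(0, LLVMISA.size()) == LLVMISA) {
    P.ISA = std::string(LLVMISA);
    Name.remove_prefix(LLVMISA.size());
  } else if (!Name.empty()) {
    P.ISA = std::string(1, Name.front());
    Name.remove_prefix(1);
  } else {
    return std::nullopt;
  }

  // <mask>: 'M' for masked, 'N' for unmasked.
  if (Name.empty() || (Name.front() != 'M' && Name.front() != 'N'))
    return std::nullopt;
  P.Masked = Name.front() == 'M';
  Name.remove_prefix(1);

  // <vlen>: 'x' or a run of digits.
  if (!Name.empty() && Name.front() == 'x') {
    P.VLen = "x";
    Name.remove_prefix(1);
  } else {
    while (!Name.empty() &&
           std::isdigit(static_cast<unsigned char>(Name.front()))) {
      P.VLen.push_back(Name.front());
      Name.remove_prefix(1);
    }
    if (P.VLen.empty())
      return std::nullopt;
  }

  // <parameters>: everything up to the separating '_'.
  const std::size_t Sep = Name.find('_');
  if (Sep == std::string_view::npos || Sep == 0)
    return std::nullopt;
  P.Parameters = std::string(Name.substr(0, Sep));
  Name.remove_prefix(Sep + 1);

  // <scalar_name>[(<vector_redirection>)]
  const std::size_t Open = Name.find('(');
  if (Open == std::string_view::npos) {
    P.ScalarName = std::string(Name);
    P.VectorName = std::string(Full); // no redirection: use the mangled name
  } else {
    if (Name.back() != ')')
      return std::nullopt;
    P.ScalarName = std::string(Name.substr(0, Open));
    P.VectorName = std::string(Name.substr(Open + 1, Name.size() - Open - 2));
  }
  if (P.ScalarName.empty() || P.VectorName.empty())
    return std::nullopt;
  return P;
}
```

For example, `splitMangledName("_ZGVsMxv_foo(vector_foo)")` yields ISA `s`, a masked variant, `<vlen>` of `x`, parameters `v`, scalar name `foo`, and vector name `vector_foo`.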
Instead of ConstantExpr::getCast() with a fixed opcode, use the
corresponding getXYZ methods. For the one place creating
a pointer bitcast, drop it entirely, as this is redundant with
opaque pointers.
This patch adds in a couple more properties related to call instructions
and the CFG within the function that should expose a little bit more
about the characteristics of the function.
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D158681
This patch adds operand type counts to the detailed function properties
analysis. This is intended to enable more interesting and detailed
comparisons across different languages on specific metrics (like usage
of inline assembly or global values).
Reviewed By: jdoerfert
Differential Revision: https://reviews.llvm.org/D158018
This reverts commit d462f65b8242a82d2430605a741825bf10ebaca0.
It breaks the modules build again, but also may inhibit the use of `-DLLVM_TABLEGEN=`.
See the discussion here: https://reviews.llvm.org/D150144#4578311
rdar://113696899
This reverts commit 30b4351c7c75296dc60fc887212cdc98e85e9996.
This caused a dependency cycle that the Swift build picked up on:
```
CMake Error: The inter-target dependency graph contains the following strongly connected component (cycle):
"llvm-tblgen" of type EXECUTABLE
depends on "LLVMCodeGenTypes" (weak)
depends on "LLVMTableGenGlobalISel" (weak)
depends on "intrinsics_gen" (strong)
"LLVMTableGenGlobalISel" of type STATIC_LIBRARY
depends on "LLVMCodeGenTypes" (weak)
depends on "vt_gen" (strong)
"vt_gen" of type UTILITY
depends on "llvm-tblgen" (strong)
"autogen_intrinsics_RISCV" of type UTILITY
depends on "llvm-tblgen" (strong)
"intrinsics_gen" of type UTILITY
depends on "llvm-tblgen" (strong)
depends on "autogen_intrinsics_RISCV" (strong)
"LLVMCodeGenTypes" of type STATIC_LIBRARY
depends on "vt_gen" (strong)
```
rdar://113636528
This patch adds more detailed function properties gated behind a command
line flag for use primarily in experimentation and gathering statistics
on the functions in a module or project. The runtime cost should be
minimal as the computation is only done when the flag is set. There will
be a slight memory overhead when the ML inliner is enabled, but it
should be fairly small at a handful of bytes per function.
This is an adapted form of https://reviews.llvm.org/D109661.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D157358
Since we no longer support typed LLVM IR pointer types, the code can
be simplified by, for example, using PointerType::get directly instead
of using Type::getInt8PtrTy and Type::getInt32PtrTy etc.
Differential Revision: https://reviews.llvm.org/D156733
CYGWIN uses the same shared library format as WIN32.
As the comment says: "a shared library can't have undefined references", so CYGWIN must be handled like WIN32 in CMakeLists.txt; otherwise linking fails with several undefined symbol errors.
This patch fixes the issue and allows the build to complete for those targets.
Differential Revision: https://reviews.llvm.org/D154794
When processing assumes, we also handle assumes on ptrtoint of the
value. In canonical IR, these will have the same size as the value.
However, in non-canonical IR there may be an implicit zext or
trunc, which results in a bit width mismatch. We currently handle
this by adjusting bitwidth everywhere, but this is fragile and I'm
pretty sure that the way we do this is incorrect for some predicates,
because we effectively end up commuting an ext/trunc and an icmp.
Instead, add an m_PtrToIntSameSize() matcher that will only handle
bitwidth preserving cases. For the bitwidth-changing cases, wait
until they have been canonicalized.
The original handling for this was added purely to prevent crashes
in an earlier implementation which failed to account for this
entirely.
On AMDGPU, alloca instructions have a penalty that can
be avoided when SROA is applied after inlining.
This patch introduces the default implementation of
TargetTransformInfo::getCallerAllocaCost.
Reviewed By: mtrofin
Differential Revision: https://reviews.llvm.org/D149740
Use a unit test since I don't see any existing uses that try to make use of
the high bits of a pointer.
This will also assert if the metadata type doesn't match the pointer
width, but I consider that a defect in the verifier that shouldn't be
handled here.
AMDGPU allocates LDS globals by assigning !absolute_symbol with the
final fixed address. Tracking that the high bits are 0 may help with
addressing mode matching.
For always poison shifts, any KnownBits return value is valid.
Currently we return unknown, but returning zero is generally more
profitable. We had some code in ValueTracking that tried to do this,
but it was actually dead code.
Differential Revision: https://reviews.llvm.org/D150648
Add "Hot" AllocationType (in addition to existing cold, notcold).
Use lifetime access density as metric to identify hot allocations.
Treat hot as notcold for MemProfContextDisambiguation for now
before the disambiguation for "hot" is done.
Reviewed By: tejohnson
Differential Revision: https://reviews.llvm.org/D149932