Some VP intrinsic definitions were missing the
VP_PROPERTY_FUNCTIONAL_INTRINSIC property. This patch fills them in, and
adds a static_assert that all VP intrinsics have an equivalent opcode or
intrinsic defined so we don't forget them in future.
Some VP intrinsics don't have an equivalent, namely merge and strided
load/store. For those, a new property was added to mark that they don't
have a non-VP equivalent.
This adds a helper method to get the ID of the functionally equivalent
intrinsic, similar to the existing getFunctionalOpcodeForVP and
getConstrainedIntrinsicIDForVP methods.
This patch ports PerfJITEventListener to a JITLink plugin. It adds unwind
record support and temporarily drops debuginfo support. Debuginfo can be
re-enabled in the future by providing a way to obtain a DWARFContext from a
LinkGraph.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D146169
This patch adjusts the legality check for RISC-V to use `cpop/cpopw`, since `isOperationLegal(ISD::CTPOP, MVT::i32)` returns false on rv64gc_zbb.
Clang vs gcc: https://godbolt.org/z/rc3s4hjPh
Reviewed By: craig.topper
Differential Revision: https://reviews.llvm.org/D156390
This is the patch at https://reviews.llvm.org/D153692, migrated to
GitHub.
After testing D147740 with multiple industrial projects with ~10 million
FunctionSamples, no MD5 collision has been found. With an ideal uniform
hash, the probability of at least one collision for N symbols over K
possible hash values is 1 - K!/((K-N)! * K^N). When N is 1 million and K
is 2^64, the probability is about 3*10^-8; when N is 10 million, it is
about 3*10^-6, so we are unlikely to find an actual case in a real-world
application. (However, if K is 2^32, the probability of collision is
almost 1. This is indeed a problem if anyone still uses a large profile
on a 32-bit machine, as hash_code is tied to size_t.) Furthermore, when a
collision happens we can't do anything to recover from it short of using
a multi-map, which is significantly slower and contradicts the purpose of
optimizing the profile reader. Finally, we have been using profiles with
MD5 names, and they have to come from non-MD5 sources. If a hash
collision were to happen, it would already have happened when converting
the non-MD5 profile to an MD5 one, so there is no point in checking for
it in the reader, and this feature can be removed.
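The quoted probabilities can be reproduced with a short calculation. The sketch below is not part of the patch; it uses the standard birthday-bound approximation 1 - exp(-N(N-1)/(2K)) in place of the exact factorial formula, and the function name is made up for illustration.

```python
import math

def collision_probability(n: int, k: int) -> float:
    """Approximate P(at least one collision) for n symbols over k hash values.

    Uses the birthday-bound approximation 1 - exp(-n*(n-1) / (2*k)),
    which is close to the exact value when n is much smaller than k.
    """
    exponent = -n * (n - 1) / (2 * k)
    # -expm1(x) equals 1 - exp(x) but avoids cancellation for tiny x.
    return -math.expm1(exponent)

p_1m_64 = collision_probability(10**6, 2**64)   # roughly 3e-8
p_10m_64 = collision_probability(10**7, 2**64)  # roughly 3e-6
p_1m_32 = collision_probability(10**6, 2**32)   # effectively 1
```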
Revision c383f4d6550e enabled using variadic-form debug values to represent
single-location, non-stack-value debug values, and a further patch made all
DBG_INSTR_REFs use variadic form. However, not all code paths were updated
correctly to handle the new syntax: entry-value handling still expects an
expression that begins exactly with DW_OP_LLVM_entry_value, 1.
A function already exists to select non-variadic-like expressions; this patch
adds an extra function to cheaply simplify such cases to non-variadic form, which
we use prior to any entry-value processing to put DBG_INSTR_REFs and DBG_VALUEs
down the same code path. We also use it for a few DIExpression functions that
check for whether the first element(s) of a DIExpression match a particular
pattern, so that they will return the same result for
DIExpression(DW_OP_LLVM_arg, 0, <ops>) as for DIExpression(<ops>).
Differential Revision: https://reviews.llvm.org/D158185
Debug info correlation is an option in the InstrProfiling pass, which is used by
both IR instrumentation and front-end instrumentation. So Clang coverage can
also benefit from the binary size savings it provides.
Reviewed By: ellis
Differential Revision: https://reviews.llvm.org/D157913
One of the main users of this kind of coroutine is Swift. There, yield-once (`retcon.once`) coroutines are used to temporarily "expose" pointers to internal fields of various objects, creating borrow scopes.
However, in some cases it might also be useful to allow these coroutines to produce a normal result, but there is no convenient way to represent this (as compared to switched-resume coroutines, where C++ `co_return` is transformed to a member / callback call on the promise object).
The extension is simple: we allow the continuation function to have a non-void result and accept optional extra arguments via a special `llvm.coro.end.result` intrinsic that essentially forwards them as normal results.
This patch fixes the shared clause for the task construct with multiple
shared variables. The shareds field in the kmp_task_t is not an inline
array in the struct; rather, it is a pointer to an array. With an inline
array, the pointer dereference to the outlined function body of the task
would cause a segmentation fault when accessed by the runtime.
Reviewed By: kiranchandramohan, jdoerfert
Differential Revision: https://reviews.llvm.org/D158462
The DXContainer pipeline state information encodes a bunch of mask
vectors that are used to track things about the inputs and outputs from
each shader.
This adds support for reading and writing them through the YAML test
interfaces. The writing logic in MC is extremely primitive and we'll
want to revisit the API for that, but since I'm not sure how we'll want
to generate the mask bits from DXIL during code generation I didn't want
to spend too much time on the API.
Fixes #59479
visit will skip visiting instructions it already has visited
to avoid issues with cycles in the data graph. However,
the result of this skipping behavior is that if we
encounter the same instruction twice, and that instruction
has a well defined result and isn't part of a cycle, we
will introduce unknowns into the analysis even though we
knew the size and offset of the instruction's result.
Instead of skipping such instructions, keep a cache of
the result of visiting them. This result is initialized
to unknown() before visiting, so if we happen to visit
it again recursively (perhaps as the result of a cycle
or a phi), we will get unknown as the cached result and
exit out.
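The caching scheme described above can be sketched as follows. This is a simplified model, not the visitor's actual code: the graph representation, the `sizes` table, and the rule that a node forwards its operands' result only when they all agree are assumptions made for the example.

```python
UNKNOWN = None  # stand-in for the analysis' "unknown" result

def visit(node, operands, sizes, cache):
    """Return the size known for `node`, or UNKNOWN.

    `operands` maps a node to the nodes it forwards (like a phi or select);
    `sizes` maps leaf nodes to a known size. Both are hypothetical inputs.
    """
    if node in cache:
        # Either a finished result, or UNKNOWN because we are inside a cycle.
        return cache[node]
    cache[node] = UNKNOWN  # seeded before recursing so cycles terminate
    if node in sizes:
        result = sizes[node]
    else:
        results = [visit(op, operands, sizes, cache)
                   for op in operands.get(node, [])]
        # Known, agreeing operand results propagate; anything else is unknown.
        if results and all(r is not UNKNOWN and r == results[0] for r in results):
            result = results[0]
        else:
            result = UNKNOWN
    cache[node] = result
    return result

sizes = {"alloca": 16}
operands = {"a": ["alloca"], "b": ["alloca"], "sel": ["a", "b"], "loop": ["loop"]}
cache = {}
shared = visit("sel", operands, sizes, cache)   # "alloca" is reached twice
cyclic = visit("loop", operands, sizes, cache)  # self-referencing node
```

With a plain "already visited" set, the second visit to `alloca` would have produced unknown; with the cache it returns the stored result, while the self-referencing node still terminates with unknown.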
Duplicate phi nodes were being directly removed, without
invalidating MDA. This could result in a new phi node being
allocated at the same address, incorrectly reusing a cache entry.
Fix this by optionally allowing EliminateDuplicatePHINodes() to
collect phi nodes to remove into a vector, which allows GVN to
handle removal itself.
Fixes https://github.com/llvm/llvm-project/issues/64598.
Differential Revision: https://reviews.llvm.org/D158849
Both Swift & LLD use TextAPI reader/writer apis to interface with TBD
files. Add doc strings to document what each API does. Also, add
shortcut APIs for validating that input is a TBD file.
This reduces the differences between downstream and how tapi calls into
these APIs.
Add support for the static Arm relocations R_ARM_MOVT_ABS and R_ARM_MOVW_ABS_NC,
which are emitted for movt and movw instructions. The implementation contains
the relocation fixup and its tests, as well as encode/decode functions for
reading and writing immediate values, together with their unit tests.
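As an illustration of what such encode/decode helpers do, the sketch below splits a 16-bit immediate over the imm4 (bits 19:16) and imm12 (bits 11:0) fields of an Arm-state movw/movt instruction word. The function names are hypothetical and this is not the JITLink implementation; `target` is assumed to already include any addend.

```python
IMM_MASK = 0x000F0FFF  # imm4 in bits 19:16, imm12 in bits 11:0

def encode_imm16(value: int) -> int:
    """Spread a 16-bit value over the imm4:imm12 fields."""
    value &= 0xFFFF
    return (((value >> 12) & 0xF) << 16) | (value & 0xFFF)

def decode_imm16(instr: int) -> int:
    """Reassemble the 16-bit immediate from an instruction word."""
    return (((instr >> 16) & 0xF) << 12) | (instr & 0xFFF)

def apply_movw_abs_nc(instr: int, target: int) -> int:
    """Patch the low 16 bits of `target` into a movw instruction word."""
    return (instr & ~IMM_MASK & 0xFFFFFFFF) | encode_imm16(target & 0xFFFF)

def apply_movt_abs(instr: int, target: int) -> int:
    """Patch the high 16 bits of `target` into a movt instruction word."""
    return (instr & ~IMM_MASK & 0xFFFFFFFF) | encode_imm16((target >> 16) & 0xFFFF)

MOVW_R0 = 0xE3000000  # movw r0, #0
MOVT_R0 = 0xE3400000  # movt r0, #0
patched_movw = apply_movw_abs_nc(MOVW_R0, 0x12345678)
patched_movt = apply_movt_abs(MOVT_R0, 0x12345678)
```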
Currently clang's medium code model treats all data as large, putting it in a large data section and using more expensive instruction sequences to access it.
This follows gcc's -mlarge-data-threshold, which allows putting data under a certain size in a normal data section as opposed to a large data section. This allows using cheaper code sequences to access some portion of the data in the binary (which will be implemented in LLVM in a future patch).
Under the medium code model, only put data above the large data threshold into large data sections, not all data.
Reviewed By: MaskRay, rnk
Differential Revision: https://reviews.llvm.org/D149288
This will make it easy for callers to see issues with and fix up calls
to createTargetMachine after a future change to the params of
TargetMachine.
This matches other nearby enums.
For downstream users, this should be a fairly straightforward
replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive
or s/CGFT_/CodeGenFileType::
Multiplying raw block frequency with an integer carries a high risk
of overflow.
- Add `BlockFrequency::mul`, which returns a `std::optional` with the
product or `nullopt` to indicate an overflow.
- Fix two instances where overflow was likely.
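The idea behind a checked multiply can be modelled in a few lines. This is only a sketch of the concept, with `None` playing the role of `nullopt` and a made-up function name; the real `BlockFrequency::mul` is C++ and may detect overflow differently.

```python
from typing import Optional

UINT64_MAX = 2**64 - 1

def checked_mul(freq: int, factor: int) -> Optional[int]:
    """Return freq * factor, or None if the product does not fit in 64 bits.

    Both inputs are assumed to be non-negative.
    """
    product = freq * factor  # Python ints never wrap, so this is exact
    return product if product <= UINT64_MAX else None

fits = checked_mul(2**62, 2)       # 2**63 still fits in 64 bits
overflows = checked_mul(2**62, 4)  # 2**64 does not

# A caller has to decide what an overflow means, e.g. saturating:
saturated = overflows if overflows is not None else UINT64_MAX
```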
This pass will upgrade DXIL-style llvm constructs (which are mostly
metadata) into the representations we use in LLVM for the same concepts.
For now we just strip the valver metadata, which we don't need. Later
changes will make this pass more useful, and then we should be able to
wire it into clang and possibly the DirectX backend's AsmParser.
Default atomic ordering information is processed in the OpenMP dialect
to LLVM IR lowering stage at every spot where an operation can be
affected by it. The rest of the clauses are stored globally in the
OpenMPIRBuilderConfig object before starting that lowering stage, so
that the OMPIRBuilder can conditionally modify code generation
depending on these. At the end of the process, the omp.requires
attribute is itself lowered into a global constructor that passes these
clauses as flags to the OpenMP runtime.
Depends on D147217, D147218 and D158278.
Differential Revision: https://reviews.llvm.org/D147219
This patch updates the `OpenMPIRBuilderConfig` structure to hold all
available 'requires' clauses, and it replicates part of the code
generation for the 'requires' registration function from clang in the
`OMPIRBuilder`, to be used with flang.
Porting the rest of the features of the clang implementation to the IRBuilder
and sharing it between clang and flang remains for a future patch, due to the
complexity of the logic selecting the attributes of the generated
registration function.
Differential Revision: https://reviews.llvm.org/D147217
There are cases where R_PPC64_REL24 has a non-zero addend. The assertion is incorrectly triggered in such situations.
Reviewed By: lhames
Differential Revision: https://reviews.llvm.org/D158708
Support VE in long double demangler. This patch corrects
libcxxabi/test/test_demangle.pass.cpp on VE.
Reviewed By: MaskRay, #libc_abi, ldionne
Differential Revision: https://reviews.llvm.org/D159004
reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003)
This reverts commit ee643b706be2b6bef9980b25cc9cc988dab94bb5.
Fix up build failures in targets I missed in #66003
Kept as 3 commits for reviewers to see better what's changed. Will
squash when merging.
- reland [InlineAsm] wrap ConstraintCode in enum class NFC (#66003)
- fix all the targets I missed in #66003
- fix off by one found by llvm/test/CodeGen/SystemZ/inline-asm-addr.ll
This adds more validation that a dxil triple is actually useable when
compiling HLSL.
The OS field of the triple needs to be a versioned shader model.
Later, we should set a default if this is empty and check that the
version is a shader model we can actually handle.
The Environment field of the triple needs to be specified and be a
valid shader stage. I'd like to allow this to be empty and treat it
like library, but allowing that currently crashes in DXIL metadata
handling.
Differential Revision: https://reviews.llvm.org/D159103
This reverts commit 2ca4d136124d151216aac77a0403dcb5c5835bcd.
Also revert the followup, "[InlineAsm] fix botched merge conflict resolution"
This reverts commit 8b9bf3a9f715ee5dce96eb1194441850c3663da1.
There were SystemZ and Mips build errors, too many to fix forward.
Similar to
commit 2fad6e69851e ("[InlineAsm] wrap Kind in enum class NFC")
Fix the TODOs added in
commit 93bd428742f9 ("[InlineAsm] refactor InlineAsm class NFC
(#65649)")
- Added WritableArmRelocation and ArmRelocation Structs
- Encode/Decode funcs for B/BL A1 and BLX A2 encodings
- Add ARM helper functions, consistent with the existing Thumb helper functions
- Add Test for ELF::R_ARM_CALL
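For reference, the B/BL A1 encodings keep a signed, word-aligned branch offset in the low 24 bits of the instruction. The sketch below shows one way such encode/decode helpers could look; the names are hypothetical and it is not the code added by this patch. In Arm state the offset is relative to the address of the instruction plus 8.

```python
def encode_branch_offset(offset: int) -> int:
    """Return the imm24 field for a byte offset.

    The offset must be 4-byte aligned and within the signed 26-bit range.
    """
    if offset % 4 != 0 or not -(1 << 25) <= offset < (1 << 25):
        raise ValueError("offset not encodable in imm24")
    return (offset >> 2) & 0x00FFFFFF

def decode_branch_offset(instr: int) -> int:
    """Recover the signed byte offset from a B/BL A1 instruction word."""
    imm26 = (instr & 0x00FFFFFF) << 2
    # Sign-extend from bit 25.
    return imm26 - (1 << 26) if imm26 & (1 << 25) else imm26

BL_OPCODE = 0xEB000000  # bl with condition "always" and imm24 = 0
bl_to_self = BL_OPCODE | encode_branch_offset(-8)  # target == instruction address
round_trip = decode_branch_offset(bl_to_self)
```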
Reviewed By: sgraenitz
Differential Revision: https://reviews.llvm.org/D157533
VPIntrinsics with VP_PROPERTY_BINARYOP property should have the ability
to be queried with VPBinOpIntrinsic::isVPBinOp, similar to how
intrinsics with the VP_PROPERTY_REDUCTION property can be queried with
VPReductionIntrinsic::isVPReduction.
This will be used in #65706. In that PR the usage of this class is
tested.
This adds a helper method to get the ID of the functionally equivalent
intrinsic, similar to the existing getFunctionalOpcodeForVP and
getConstrainedIntrinsicIDForVP methods.
Not sure if it's notable or not, but I can't find any existing uses of
VP_PROPERTY_FUNCTIONAL_INTRINSIC?
It could potentially be used in #65706 to scalarize VP intrinsics.
```
$ ./bin/clang --target=arm-linux-gnueabihf --print-supported-extensions
<...>
All available -march extensions for ARM
crc
crypto
sha2
aes
dotprod
<...>
```
This follows the format set by RISC-V and AArch64. As with AArch64, ARM
doesn't have versioned extensions like RISC-V does, so there is only 1
column, which contains the name.
Any extension without a "feature" is hidden as these cannot be used with
-march.
The FileIndex values returned from GetFileInformationByHandle are
considered stable and uniquely identifying a file, as long as the
handle is open. When handles are closed, there are no guarantees
for their stability or uniqueness. On some file systems (such as
NTFS), the indices are documented to be stable even across handles.
But with some file systems, in particular network mounts, file
indices can be reused very soon after handles are closed.
When such file indices are used for LLVM's UniqueID, files are
considered duplicates as soon as the filesystem driver happens to
have used the same file index for the handle used to inspect the
file. This caused widespread, non-obvious (seemingly random)
breakage. This can happen e.g. if running on a directory that is
shared via Remote Desktop or VirtualBox.
To avoid the issue, use a hash of the canonicalized path for the
file as unique identifier, instead of using FileIndex.
This fixes https://github.com/llvm/llvm-project/issues/61401 and
https://github.com/llvm/llvm-project/issues/22079.
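The approach can be illustrated outside of LLVM with a small portable sketch. It is not the Windows implementation: `os.path.realpath` stands in for the GetFinalPathNameByHandleW-based canonicalization, and the hash function and the function name are arbitrary choices for the example.

```python
import hashlib
import os

def unique_id_for(path: str) -> int:
    """Derive a 64-bit identifier from the canonical form of `path`."""
    canonical = os.path.normcase(os.path.realpath(path))
    digest = hashlib.blake2b(os.fsencode(canonical), digest_size=8).digest()
    return int.from_bytes(digest, "little")

# Different spellings of one path canonicalize to the same string and
# therefore get the same ID; the files do not need to exist for this sketch.
id_plain = unique_id_for("a.txt")
id_roundabout = unique_id_for(os.path.join(".", "sub", "..", "a.txt"))
id_other = unique_id_for("b.txt")
```

Because the ID depends only on the path string, two hard links to the same file get different IDs in this sketch as well.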
Performance-wise, this adds (usually) one extra call to
GetFinalPathNameByHandleW for each call to getStatus(). A test
case such as running clang-scan-deps becomes around 1% slower
because of this, which is considered tolerable.
Change the equivalent() function to use getUniqueID instead of
checking individual file_status fields. The
equivalent(Twine,Twine,bool& result) function calls status() on
each path successively, without keeping the file handles open,
which also is prone to such false positives. This also gets rid
of checks of other superfluous fields in the
equivalent(file_status, file_status) function - the unique ID of
a file should be enough (that is what is done for Unix anyway).
This comes with one known caveat: For hardlinks, each name for
the file now gets a different UniqueID, and equivalent() considers
them different. While that's not ideal, occasional false negatives
for equivalent() are usually less fatal (the cases where we strictly
do need to deduplicate files with different path names are quite
rare) than the issues caused by false positives for
equivalent() (where we'd deduplicate and omit totally distinct files).
The FileIndex is documented to be stable on NTFS though, so ideally
we could maybe have used it in the majority of cases. That would
require a heuristic for whether we can rely on FileIndex or not.
We considered using the existing function is_local_internal for that;
however that caused an unacceptable performance regression
(clang-scan-deps became 38% slower in one test, even more than that
in another test).
Differential Revision: https://reviews.llvm.org/D155579
`__builtin_expect()` and C++20's `[[likely]]` and `[[unlikely]]` attributes assign branch_weights to annotated branches.
This patch adds the ability to query branch !prof metadata and improves static analysis based on that.
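To make the role of branch_weights concrete, the sketch below turns a pair of weights into per-successor probabilities and applies a cutoff. It does not reflect the code in this patch: the 2000:1 weights are used purely as sample input for a "likely" branch, and the function names and the 0.8 threshold are hypothetical.

```python
def branch_probabilities(weights):
    """Normalize branch weights into per-successor probabilities."""
    total = sum(weights)
    if total == 0:
        # No usable information: fall back to a uniform split.
        return [1 / len(weights)] * len(weights)
    return [w / total for w in weights]

def is_likely_taken(weights, index=0, threshold=0.8):
    """Report whether successor `index` is taken with high probability."""
    return branch_probabilities(weights)[index] >= threshold

likely_weights = [2000, 1]  # sample weights for a branch annotated as likely
probs = branch_probabilities(likely_weights)
hot_first = is_likely_taken(likely_weights)
hot_second = is_likely_taken(likely_weights, index=1)
```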
Fixes: https://github.com/llvm/llvm-project/issues/64998
Reviewers: tejohnson, efriedma
Differential Revision: https://reviews.llvm.org/D159336