Renames asm-constraint-jR.ll and asm-constraint-jR.ll - on
case-insensitive file systems those are treated as one file. Originally
introduced in #92338.
Generalize LoopVectorizationPlanner::isMoreProfitable smoothly across
the fixed-vector and scalable-vector cases, taking the trip-count into
account, and fixing logical pitfalls that arise from a lack of
generality.
This is similar to 373c343a, but for targets with zero-or-negative-one
booleans.
The difference in tests is mostly due to G_SEXT_INREG being illegal for
some targets, in which case it gets expanded into a G_SHL/G_ASHR pair,
which is not currently optimized by the combiner.
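For reference, the shift-pair expansion mentioned above can be modeled on
plain integers. This is only an illustrative sketch: the helper name
`sext_inreg` and the 32-bit default width are assumptions made for the
example, not anything taken from the legalizer.

```python
def sext_inreg(value, from_bits, width=32):
    """Sign-extend the low `from_bits` bits of a `width`-bit value using shl + ashr."""
    mask = (1 << width) - 1
    shift = width - from_bits
    # shl: move the low `from_bits` bits to the top of the register.
    shifted = (value << shift) & mask
    # Reinterpret as signed so that Python's >> acts as an arithmetic shift.
    if shifted & (1 << (width - 1)):
        shifted -= 1 << width
    # ashr: shift back down, replicating the sign bit; return the bit pattern.
    return (shifted >> shift) & mask
```

With a 1-bit source this maps 1 to the all-ones pattern, which is the
zero-or-negative-one boolean encoding.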
This is generally just for consistency with the rest of the pipeline.
The assertion for the insertion point is because I am not sure if
omp::PrivateClauseOp is supported by FirOpBuilder::getAllocaBlock. I
didn't try to fix it because I don't see why we would generate IR like
that.
See RFC:
https://discourse.llvm.org/t/rfc-add-an-interface-for-top-level-container-operations
First commit is reviewed in
https://github.com/llvm/llvm-project/pull/93682.
Lower RANK using fir.box_rank. This patch updates fir.box_rank to
accept a box reference, which avoids the need to generate an assumed-rank
fir.load just for the sake of reading ALLOCATABLE/POINTER rank. The
fir.load would generate a "dynamic" memcpy that is hard to optimize
without further knowledge. A read effect is conditionally given to the
operation.
A follow-up to #92953. This should fix unexpected performance gains when Clang is built with GCC, and fix downstream LTO crashes reported in 4feae05c6a (r142466703)
This patch implements the lowering of vector.deinterleave
for 1D vectors.
For fixed vector types, the operation is lowered to two
llvm shufflevector operations. One for even indexed
elements and the other for odd indexed elements. A poison
operation is used to satisfy the operand requirements of
shufflevector.
For scalable vectors, the llvm vector.deinterleave2
intrinsic is used for lowering. The two results are
then extracted from the result struct returned by
the intrinsic.
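The fixed-vector case can be sketched on plain lists. The helpers below
(`deinterleave_masks`, `shufflevector`, `deinterleave`) are hypothetical
names for this illustration only, and `None` stands in for poison lanes.

```python
def deinterleave_masks(num_elements):
    """Return the (even, odd) shuffle masks for a 1D vector of even length."""
    if num_elements % 2 != 0:
        raise ValueError("deinterleave expects an even number of elements")
    even = list(range(0, num_elements, 2))
    odd = list(range(1, num_elements, 2))
    return even, odd


def shufflevector(v1, v2, mask):
    """Model of a two-input shuffle: the mask indexes into v1 followed by v2."""
    combined = list(v1) + list(v2)
    return [combined[i] for i in mask]


def deinterleave(vec):
    """Split vec into its even-indexed and odd-indexed elements via two shuffles."""
    even, odd = deinterleave_masks(len(vec))
    poison = [None] * len(vec)  # second shuffle input; never selected by the masks
    return shufflevector(vec, poison, even), shufflevector(vec, poison, odd)
```

Both masks only select lanes from the first input, so the poison operand
never reaches the results.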
This patch adds basic support for scalable vector types in load & store
instructions for AArch64 with GISel.
Only scalable vector types with a 128-bit base size are supported, e.g.
`<vscale x 4 x i32>`, `<vscale x 16 x i8>`.
This patch adapted some ideas from a similar abandoned patch
[https://github.com/llvm/llvm-project/pull/72976](https://github.com/llvm/llvm-project/pull/72976).
When an operation is erased in Python, its children may still be in the
"live" list inside Python bindings. After this, if some of the newly
allocated operations happen to reuse the same pointer address, this will
trigger an assertion in the bindings. This assertion would be incorrect
because the operations aren't actually live. Make sure we remove the
children operations from the "live" list when erasing the parent.
This also concentrates responsibility over the removal from the "live"
list and invalidation in a single place.
Note that this requires the IR to be sufficiently structurally valid so
a walk through it can succeed. If this invariant was broken by, e.g., a
C++ pass called from Python, there isn't much we can do.
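A toy model of the bookkeeping described above, to show why nested
operations have to leave the "live" list together with their parent. The
`Operation` and `Context` classes here are hypothetical stand-ins and not
the actual bindings; `id()` plays the role of the pointer address.

```python
class Operation:
    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.valid = True

    def walk(self):
        """Yield this operation and every operation nested inside it."""
        yield self
        for child in self.children:
            yield from child.walk()


class Context:
    def __init__(self):
        self.live = {}  # address-like key -> Operation

    def track(self, op):
        for nested in op.walk():
            self.live[id(nested)] = nested

    def erase(self, op):
        # Single place that both invalidates and untracks, for the parent
        # and for everything reachable by walking it.
        for nested in op.walk():
            nested.valid = False
            self.live.pop(id(nested), None)
```

After `erase(parent)`, no stale entry for a child remains, so a later
operation that happens to reuse the same key cannot collide with it.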
Module names can be matched either by a full path or just their
basename. The completion machinery tried to do both, but had several
bugs:
- it always inserted the basename as a completion candidate, even if the
string being completed was a full path
- due to FileSpec canonicalization, it lost information about trailing
slashes (it treated "lib/<TAB>" as "lib<TAB>", even though it's clear
the former was trying to complete a directory name)
- due to both of the previous issues, the completion candidates could
end up being shorter than the string being completed, which caused
crashes (string out of range errors) when attempting to substitute the
results.
This patch rewrites the logic to remove these kinds of issues:
- basename and full path completion are handled separately
- full path completion is attempted always, basename only if the input
string does not contain a slash
- the code remembers both the canonical and original spelling of the
completed argument. The canonical arg is used for matching, while the
original spelling is used for completion. This way "/foo///.//b<TAB>"
can still match "/foo/bar", but it will complete to "/foo///.//bar".
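The rules above can be sketched roughly as follows. This is a simplified
model and not the lldb implementation: `complete_module` is a hypothetical
helper, and `posixpath.normpath` stands in for FileSpec canonicalization.

```python
import posixpath


def complete_module(typed, modules):
    """Return completion candidates for `typed` among full module paths."""
    canonical = posixpath.normpath(typed) if typed else ""
    # normpath drops a trailing slash; restore it so "lib/" remains a
    # directory prefix instead of turning into "lib".
    if typed.endswith("/") and not canonical.endswith("/"):
        canonical += "/"
    results = []
    for path in modules:
        # Full-path completion is always attempted: match on the canonical
        # spelling, but complete using the spelling the user typed.
        if path.startswith(canonical):
            results.append(typed + path[len(canonical):])
        # Basename completion only applies when the input has no slash.
        if "/" not in typed:
            base = posixpath.basename(path)
            if base.startswith(typed):
                results.append(base)
    return results
```

In this sketch every candidate starts with the typed string, so a
candidate can never be shorter than the input being completed.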
A synthetic child provider might need to do a considerable amount of work
to compute the number of children. lldb-dap is currently calling that
for all synthetic variables, but it's only actually using the value for
values which it deems to be "indexed" (which is determined by looking at
the name of the first child). This patch reverses the logic so that
GetNumChildren is only called for variables with a suitable first child.
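The reordering can be illustrated with a small stand-in. `FakeValue` and
`indexed_variable_count` are hypothetical names, and treating a first
child named "[0]" as the "indexed" test is an assumption made for this
sketch.

```python
class FakeValue:
    """Stand-in for a value whose child count is expensive to compute."""

    def __init__(self, child_names):
        self._child_names = list(child_names)
        self.num_children_calls = 0

    def first_child_name(self):
        return self._child_names[0] if self._child_names else None

    def get_num_children(self):
        self.num_children_calls += 1  # count how often the costly path runs
        return len(self._child_names)


def indexed_variable_count(value):
    """Only pay for the child count when the first child looks indexed."""
    if value.first_child_name() != "[0]":
        return 0
    return value.get_num_children()
```

For a struct-like value the costly call is skipped entirely; for an
array-like value it runs once.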
Lower allocatable and pointer specification parts. Nothing special is
required to allocate the descriptor given they are required to be dummy
arguments. However, care must be taken with INTENT(OUT) to use the
runtime to deallocate them (inlined fir.embox + store is not possible).
- Update LLVM type conversion of assumed-rank fir.box/class to generate
the type of the maximum ranked descriptor. That way, alloca for assumed
rank descriptor copies are always big enough. This is needed in the
fir.load case that generates a new storage for the value.
- Add a "computeBoxSize" helper to compute the dynamic size of a
descriptor.
- Use that size to generate an llvm.memcpy intrinsic to copy the input
descriptor into the new storage.
Looking at https://reviews.llvm.org/D108221?id=404635, it seems valid to
add the TBAA node on the memcpy, which I did.
In a further patch, I think we should likely always use a memcpy since
LLVM seems to have a better time optimizing it than fir.load/fir.store
patterns.
[clang] fix merging of UsingShadowDecl
Previously, when deciding if two UsingShadowDecls were mergeable,
we would incorrectly only look for both pointing to the exact
redeclaration, whereas the correct thing is to look for declarations of
the same entity.
This problem has existed as far back as 2013, introduced in commit
fd8634a09de71.
This problem could manifest itself as ODR check false positives when
importing modules.
Fixes: #80252
The data-layout independent constant folding currently has some rather
gnarly code for canonicalizing GEP indices to reduce "notional
overindexing", and then infers inbounds based on that canonicalization.
Now that we canonicalize to i8 GEPs, this canonicalization is
essentially useless, as we'll discard it as soon as the GEP hits the
data-layout aware constant folder anyway. As such, I'd like to remove
this code entirely.
This shouldn't have any impact on optimization capabilities.
These act as constants and should be propagated whenever possible. It is
safe to do so for mlir.undef and mlir.poison because they remain "dirty"
throughout their lifetime and can be duplicated, merged, etc. per the
LangRef.
Signed-off-by: Guy David <guy.david@nextsilicon.com>
This commit changes the LLVM dialect's inliner interface to stop
assuming that the inlined function only contained unstructured control
flow. This is not necessarily true, and it led to not properly
propagating the noalias information.
DWARFDebugInfo doesn't know how to resolve the "file_index" component of
a DIERef. This patch removes GetUnit (in favor of existing
GetUnitContainingDIEOffset) and changes GetDIE to take only the
components it actually uses.
This reverts commit aeccfee348c717165541d8d895b9b0cdfe31415c, and dependents:
Revert "[NFC] Fix PPC buildbot failure https://lab.llvm.org/buildbot/#/builders/230/builds/29066"
This reverts commit 2b1d1c51f6e321267cc86e9db7808298c59caf0e.
Revert "Fix test - remove unnecessary/incorrect `-S`, in favor of `-emit-llvm`"
This reverts commit ea1ecb50fa831583241fc531153bd2c072955d29.
The test is failing on macOS and Windows.
Implements HLSL availability diagnostics' default and relaxed mode.
HLSL availability diagnostics emit errors or warnings when unavailable
shader APIs are used. Unavailable shader APIs are APIs that are exposed
in HLSL code but are not available in the target shader stage or shader
model version.
In the default mode the compiler emits an error when an unavailable API
is found in code that is reachable from the shader entry point
function. In the future this check will also be extended to exported
library functions (#92073). The relaxed diagnostic mode is the same
except the compiler emits a warning. This mode is enabled by
``-Wno-error=hlsl-availability``.
See HLSL Availability Diagnostics design doc
[here](https://github.com/llvm/llvm-project/blob/main/clang/docs/HLSL/AvailabilityDiagnostics.rst)
for more details.
Fixes #90095
When a module contains globals and/or function declarations only, the
'__llvm_profile_raw_version' variable should not be generated because
the module was not instrumented at all.
NFC
In https://github.com/llvm/llvm-project/pull/88323, I changed the logic
within `add_compiler_rt_runtime` to only explicitly code sign the
resulting library if an older version of Apple's ld64 was in use. This
was based on the assumption that newer versions of ld64 and the new
Apple linker always ad-hoc sign their output binaries. This is true in
most cases, but not when using Apple's new linker with the
`-darwin-target-variant` flag to build Mac binaries that are compatible
with Catalyst.
Rather than adding increasingly complicated logic to detect the exact
scenarios that require explicit code signing, I've opted to always
explicitly code sign when using any Apple linker. We instead detect and
use the 'linker-signed' codesigning option when possible to match the
signatures that the linker would otherwise create. This avoids having
non-'linker-signed' ad-hoc signatures which was the underlying problem
that https://github.com/llvm/llvm-project/pull/88323 was intended to
address.
Co-authored-by: Mark Rowe <markrowe@chromium.org>
Follow-up to a previous simplification
2473b1af085ad54e89666cedf684fdf10a84f058.
The xor difference between a SHT_NOTE and a read-only SHT_PROGBITS
(previously >=NOT_SPECIAL) should be smaller than RF_EXEC. Otherwise,
for the following section layout, `findOrphanPos` would place .text
before note.
```
// simplified from linkerscript/custom-section-type.s
non orphans:
progbits 0x8060c00 NOT_SPECIAL
note 0x8040003
orphan:
.text 0x8061000 NOT_SPECIAL
```
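The xor arithmetic for the ranks listed above can be checked directly.
The sketch assumes that proximity between two ranks is measured as the
number of leading zero bits of their xor in a 32-bit value;
`rank_proximity` is a hypothetical helper used only for this illustration.

```python
def rank_proximity(rank_a, rank_b, bits=32):
    """Leading zero bits of rank_a ^ rank_b; a larger result means closer ranks."""
    return bits - (rank_a ^ rank_b).bit_length()


PROGBITS = 0x8060C00
NOTE = 0x8040003
TEXT = 0x8061000

# xor with progbits: 0x1c00; xor with note: 0x21003
text_vs_progbits = rank_proximity(TEXT, PROGBITS)
text_vs_note = rank_proximity(TEXT, NOTE)
```

Under this measure, `.text` is closer to the progbits section than to the
note for the listed ranks, since the first xor is the smaller one.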
---
Identical to 2e0cfe69d0d705e9c5d5f217625bf7e3a0e90871.
The revert 30c10fda2ba539e70bff4f05625ec6358c0f7502 is wrong.
The patch introduces the gmock-based unittest infrastructure for PGO
Instrumentation and adds some test cases to check whether the
instrumentation has taken place. The testing infrastructure for analysis
modules was borrowed from the LoopPassManagerTest unittest and
simplified a bit to handle module analysis passes only. Actually, we are
testing whether the result of a trivial analysis pass was invalidated by
the PGOInstrumentGen one: we exploit the fact the pass invalidates all
the analysis results after a module was instrumented.
NFC.
This patch enhances the SCEVAAResult::alias() interface to handle two
pointers with different pointer bases.
Before calling getMinusSCEV(), we first try to explicitly convert
these two pointers into ptrtoint expressions.
Either both pointers are used with ptrtoint or neither, so we can't
end up with a ptr + int mix.