We used to vectorize these scalably but after #147026 they were split
out from RecurKind::Add into their own RecurKinds, and we didn't mark
them as supported in isLegalToVectorizeReduction.
This caused the loop vectorizer to drop the scalable VPlan because it
thinks the reductions will be scalarized.
This fixes it by just marking them as supported.
Fixes #154554
These functions turned out to have the same bug that was in wcstok()
(fixed by 4fc9801), so add the missing tests and fix the code in a way
that matches wcstok().
Also fix incorrect test expectations in existing tests.
Also update the BUILD.bazel files to actually build the strsep() test.
The C++ standard requires stddef.h to declare all of its contents in the
global namespace. We were only doing it when trying to be compatible
with Microsoft extensions. Now we expose them in C++11 or later, in addition
to exposing them in Microsoft extensions mode.
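As a rough illustration of the requirement, the sketch below includes only stddef.h and uses a few of its names without any std:: qualification. Which names were actually affected by this change is not stated here; size_t, ptrdiff_t and max_align_t are picked as assumptions, and distanceBetween is a hypothetical helper.

```cpp
#include <stddef.h>
#include <type_traits>

// Only <stddef.h> is included (not <cstddef>), so every name below must be
// found in the global namespace.
static_assert(std::is_same< ::size_t, decltype(sizeof(0))>::value,
              "size_t is the type produced by sizeof");
static_assert(std::is_signed< ::ptrdiff_t>::value,
              "ptrdiff_t is a signed integer type");
static_assert(alignof(::max_align_t) >= alignof(long long),
              "max_align_t is at least as aligned as any scalar type");

// Hypothetical helper that relies on the global ::ptrdiff_t.
inline ::ptrdiff_t distanceBetween(const int *first, const int *last) {
  return last - first;
}
```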
Fixes #154577
... builtins. We used to access the I'th index of the output vector, but
that doesn't work since the output vector is only half the size of the
input vector.
I tracked down the document which changed the way variables are handled
in for loops for C99. It was the same document that allowed mixing code
and declarations, but the editor's report made it seem like the features
came from different papers.
This is an extension we backported to C89, but it's sufficiently distinct
in the tracking page that I've added it explicitly to the backported
features documentation.
Use `CRASHREPORTER_ANNOTATIONS_INITIALIZER` when possible, which will
handle the field initialization for us. That's what we already do in
compiler-rt:
0c480dd4b6/compiler-rt/lib/sanitizer_common/sanitizer_mac.cpp (L799-L817)
This way we won't get these warnings when the layout of
crashreporter_annotations_t changes:
```
llvm/lib/Support/PrettyStackTrace.cpp:92:65: warning: missing field 'blah' initializer [-Wmissing-field-initializers]
= { CRASHREPORTER_ANNOTATIONS_VERSION, 0, 0, 0, 0, 0, 0, 0 };
^
1 warning generated
```
This PR modifies the static_asserts checking the expected sizes in
__barrier_type.h, so that we can guarantee that our internal
implementation fits the public header.
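A minimal sketch of this kind of check, assuming a public type that only reserves opaque storage and an internal type that is constructed inside it. All names here (public_barrier_t, BarrierImpl, constructImpl) and the sizes are hypothetical and are not the contents of __barrier_type.h.

```cpp
#include <new>

// Hypothetical public type: opaque, suitably aligned storage.
struct public_barrier_t {
  alignas(8) unsigned char storage[32];
};

// Hypothetical internal implementation that must fit in that storage.
struct BarrierImpl {
  unsigned long expected;
  unsigned long waiting;
  bool blocking;
};

// If the implementation grows past the public storage, the build breaks
// here instead of corrupting memory at run time.
static_assert(sizeof(BarrierImpl) <= sizeof(public_barrier_t),
              "internal barrier does not fit the public type");
static_assert(alignof(BarrierImpl) <= alignof(public_barrier_t),
              "internal barrier is over-aligned for the public type");

// Construct the implementation inside the public object's storage.
inline BarrierImpl *constructImpl(public_barrier_t &b, unsigned long expected) {
  return new (b.storage) BarrierImpl{expected, 0, false};
}
```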
The viewLikeOpInterface abstracts the behavior of an operation that views
one buffer as another. However, the current interface only includes a
"getViewSource" method and lacks a "getViewDest" method.
Previously, it was generally assumed that viewLikeOpInterface operations
would have only one return value, which was the view dest. This
assumption was broken by memref.extract_strided_metadata, and more
operations may break these silent conventions in the future. Calling
"viewLikeInterface->getResult(0)" may lead to a core dump at runtime.
Therefore, we need a 'getViewDest' method to standardize our behavior.
This patch adds the getViewDest function to viewLikeOpInterface and
modifies the usage points of viewLikeOpInterface to standardize its use.
Description
===========
OpenMP Tooling Interface Testing Library (ompTest)
ompTest is a unit
testing framework for testing OpenMP implementations. It offers a
simple-to-use framework that allows a tester to check for OMPT events in
addition to regular unit testing code, supported by linking against
GoogleTest by default. It also facilitates writing concise tests while
bridging the semantic gap between the unit under test and the OMPT-event
testing.
Background
==========
This library has been developed to provide the means of testing OMPT
implementations with reasonable effort. Especially, asynchronous or
unordered events are supported and can be verified with ease, which may
prove to be challenging with LIT-based tests. Additionally, since the
assertions are part of the code being tested, ompTest can reference all
corresponding variables during assertion.
Basic Usage
===========
OMPT event assertions are placed before the code that is to be tested.
These assertions can either be provided as one block or interleaved with
the test code. There are two types of asserters: (1) sequenced
"order-sensitive" and (2) set "unordered" asserters. Once the test is
being run, the corresponding events are triggered by the OpenMP runtime
and can be observed. Each of these observed events notifies asserters,
which then determine if the test should pass or fail.
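The difference between the two asserter types can be pictured with the toy model below. It is not ompTest's implementation: the class names are hypothetical, events are reduced to plain strings, and treating any unexpected event as a failure is an assumption of this sketch.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Order-sensitive: observed events must arrive exactly in the expected order.
class SequenceAsserter {
public:
  explicit SequenceAsserter(std::vector<std::string> expected)
      : expected_(std::move(expected)) {}
  void notify(const std::string &event) {
    if (next_ < expected_.size() && expected_[next_] == event)
      ++next_;
    else
      failed_ = true; // out of order or unexpected
  }
  bool passed() const { return !failed_ && next_ == expected_.size(); }

private:
  std::vector<std::string> expected_;
  std::size_t next_ = 0;
  bool failed_ = false;
};

// Unordered: every expected event must be observed once, in any order.
class SetAsserter {
public:
  explicit SetAsserter(std::vector<std::string> expected)
      : pending_(std::move(expected)) {}
  void notify(const std::string &event) {
    auto it = std::find(pending_.begin(), pending_.end(), event);
    if (it != pending_.end())
      pending_.erase(it);
    else
      failed_ = true; // unexpected in this sketch
  }
  bool passed() const { return !failed_ && pending_.empty(); }

private:
  std::vector<std::string> pending_;
  bool failed_ = false;
};
```

Feeding the same two events in swapped order fails the sequenced asserter but passes the set asserter, which is the property that makes the latter useful for asynchronous events.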
Example (partial, interleaved)
==============================
```c++
int N = 100000;
int a[N];
int b[N];
OMPT_ASSERT_SEQUENCE(Target, TARGET, BEGIN, 0);
OMPT_ASSERT_SEQUENCE(TargetDataOp, ALLOC, N * sizeof(int)); // a ?
OMPT_ASSERT_SEQUENCE(TargetDataOp, H2D, N * sizeof(int), &a);
OMPT_ASSERT_SEQUENCE(TargetDataOp, ALLOC, N * sizeof(int)); // b ?
OMPT_ASSERT_SEQUENCE(TargetDataOp, H2D, N * sizeof(int), &b);
OMPT_ASSERT_SEQUENCE(TargetSubmit, 1);
OMPT_ASSERT_SEQUENCE(TargetDataOp, D2H, N * sizeof(int), nullptr, &b);
OMPT_ASSERT_SEQUENCE(TargetDataOp, D2H, N * sizeof(int), nullptr, &a);
OMPT_ASSERT_SEQUENCE(TargetDataOp, DELETE);
OMPT_ASSERT_SEQUENCE(TargetDataOp, DELETE);
OMPT_ASSERT_SEQUENCE(Target, TARGET, END, 0);
#pragma omp target parallel for
for (int j = 0; j < N; j++)
  a[j] = b[j];
```
References
==========
This work has been presented at SC'24 workshops, see:
https://ieeexplore.ieee.org/document/10820689
Current State and Future Work
=============================
ompTest's development was mostly device-centric and aimed at OMPT device
callbacks and device-side tracing. Consequently, a substantial part
of host-related events or features may not be supported in its current
state. However, we are confident that the related functionality can be
added and ompTest provides a general foundation for future OpenMP and
especially OMPT testing. This PR will allow us to upstream the
corresponding features, like OMPT device-side tracing in the future with
significantly reduced risk of introducing regressions in the process.
Build
=====
ompTest is linked against LLVM's GoogleTest by default, but can also be
built 'standalone'. Additionally, it comes with a set of unit tests,
which in turn require GoogleTest (overriding a standalone build). The
unit tests are added to the `check-openmp` target.
Use the following parameters to perform the corresponding build:
`LIBOMPTEST_BUILD_STANDALONE` (Default: ${OPENMP_STANDALONE_BUILD})
`LIBOMPTEST_BUILD_UNITTESTS` (Default: OFF)
---------
Co-authored-by: Jan-Patrick Lehr <JanPatrick.Lehr@amd.com>
Co-authored-by: Joachim <protze@rz.rwth-aachen.de>
That code is from a time when typeid pointers didn't exist. We can get
there for non-block, non-integral pointers, but we can't meaningfully
handle that case. Just return false.
Fixes #153712
Reverts llvm/llvm-project#153756
It leads to a new build bot failure.
https://lab.llvm.org/buildbot/#/builders/145/builds/9200
```
BUILD FAILED: failed build (failure)
Step 5 (build-unified-tree) failure: build (failure) ...
254.983 [140/55/1504] Building CXX object tools/clang/tools/extra/clangd/tool/CMakeFiles/obj.clangdMain.dir/ClangdMain.cpp.o
FAILED: tools/clang/tools/extra/clangd/tool/CMakeFiles/obj.clangdMain.dir/ClangdMain.cpp.o
ccache /home/buildbots/llvm-external-buildbots/clang.19.1.7/bin/clang++ --gcc-toolchain=/gcc-toolchain/usr -DGTEST_HAS_RTTI=0 -D_DEBUG -D_GLIBCXX_ASSERTIONS -D_GNU_SOURCE -D__STDC_CONSTANT_MACROS -D__STDC_FORMAT_MACROS -D__STDC_LIMIT_MACROS -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/tools/clang/tools/extra/clangd/tool -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/tool -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/../include-cleaner/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/tools/clang/tools/extra/clangd/../clang-tidy -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/tools/clang/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd -I/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/build/tools/clang/tools/extra/clangd -fPIC -fno-semantic-interposition -fvisibility-inlines-hidden -Werror -Werror=date-time -Werror=unguarded-availability-new -Wall -Wextra -Wno-unused-parameter -Wwrite-strings -Wcast-qual -Wmissing-field-initializers -pedantic -Wno-long-long -Wc++98-compat-extra-semi -Wimplicit-fallthrough -Wcovered-switch-default -Wno-noexcept-type -Wnon-virtual-dtor -Wdelete-non-virtual-dtor -Wsuggest-override -Wstring-conversion -Wmisleading-indentation -Wctad-maybe-unsupported -fdiagnostics-color -ffunction-sections -fdata-sections -fno-common -Woverloaded-virtual -Wno-nested-anon-types -O3 -DNDEBUG -std=c++17 -fno-exceptions -funwind-tables -fno-rtti -UNDEBUG -MD -MT tools/clang/tools/extra/clangd/tool/CMakeFiles/obj.clangdMain.dir/ClangdMain.cpp.o -MF tools/clang/tools/extra/clangd/tool/CMakeFiles/obj.clangdMain.dir/ClangdMain.cpp.o.d -o tools/clang/tools/extra/clangd/tool/CMakeFiles/obj.clangdMain.dir/ClangdMain.cpp.o -c /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/tool/ClangdMain.cpp
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/tool/ClangdMain.cpp:10:
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/ClangdLSPServer.h:12:
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/ClangdServer.h:12:
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/CodeComplete.h:18:
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/ASTSignals.h:12:
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/ParsedAST.h:23:
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/CollectMacros.h:12:
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/Protocol.h:26:
In file included from /home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/URI.h:14:
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Registry.h:110:47: error: instantiation of variable 'llvm::Registry<clang::clangd::FeatureModule>::Head' required here, but no definition is available [-Werror,-Wundefined-var-template]
110 | static iterator begin() { return iterator(Head); }
| ^
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Registry.h:114:25: note: in instantiation of member function 'llvm::Registry<clang::clangd::FeatureModule>::begin' requested here
114 | return make_range(begin(), end());
| ^
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/clang-tools-extra/clangd/tool/ClangdMain.cpp:1021:64: note: in instantiation of member function 'llvm::Registry<clang::clangd::FeatureModule>::entries' requested here
1021 | for (FeatureModuleRegistry::entry E : FeatureModuleRegistry::entries()) {
| ^
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Registry.h:61:18: note: forward declaration of template entity is here
61 | static node *Head;
| ^
/home/buildbots/llvm-external-buildbots/workers/ppc64le-clang-rhel-test/clang-ppc64le-rhel/llvm-project/llvm/include/llvm/Support/Registry.h:110:47: note: add an explicit instantiation declaration to suppress this warning if 'llvm::Registry<clang::clangd::FeatureModule>::Head' is explicitly instantiated in another translation unit
110 | static iterator begin() { return iterator(Head); }
| ^
1 error generated.
```
I need some time to fix this in a correct way.
If we have entries in Def2LaneDefs, we always have to use them. Move the
check earlier.
Otherwise we may not pick the correct operand, e.g. if Op was a
replicate recipe that got single-scalar after replicating it.
Fixes https://github.com/llvm/llvm-project/issues/154330.
The cost model was previously overestimating throughput costs for wide
fixed-length saturating arithmetic intrinsics when using SVE with a
fixed vscale of 2. These costs ended up much higher than for the same
operations using NEON, despite being fully legal and efficient with SVE.
This patch adjusts the cost model to avoid penalising these intrinsics
under SVE.
This patch adds a -r|--relative-paths option to llvm-lit, which when
enabled will print test case names using paths relative to the current
working directory. The legacy default without that option is that test
cases are identified using a path relative to the test suite.
Only the summary report is impacted. That normally includes failed tests,
unless options such as --show-pass are used.
Background to this patch was the discussion here
https://discourse.llvm.org/t/test-case-paths-in-llvm-lit-output-are-lacking-the-location-of-the-test-dir-itself/87973
with the goal of making it easier to copy-n-paste the path to the failing
test cases.
Examples showing difference in "Passed Tests" and "Failed Tests":
> llvm-lit --show-pass test/Transforms/Foo
PASS: LLVM :: Transforms/Foo/test1.txt (1 of 2)
FAIL: LLVM :: Transforms/Foo/test2.txt (2 of 2)
Passed Tests (1):
LLVM :: Transforms/Foo/test1.txt
Failed Tests (1):
LLVM :: Transforms/Foo/test2.txt
> llvm-lit --show-pass --relative-paths test/Transforms/Foo
PASS: LLVM :: Transforms/Foo/test1.txt (1 of 2)
FAIL: LLVM :: Transforms/Foo/test2.txt (2 of 2)
Passed Tests (1):
test/Transforms/Foo/test1.txt
Failed Tests (1):
test/Transforms/Foo/test2.txt
`replaceAllUsesWith` is not safe to use in a dialect conversion and will
be deactivated soon (#154112). This commit fixes some API violations.
Also some general improvements.
When an instruction that the disassembler does not recognize is in an IT
block, we should still advance the IT state; otherwise the IT state
spills over into the next recognized instruction, which is incorrect.
We want to avoid disassembly like:
it eq
<unknown> // Often because disassembler has insufficient target info.
addeq r0,r0,r0 // eq spills over into add.
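The intended behavior can be modelled with a toy sketch like the one below. It is not the disassembler's code: ITState and annotate are hypothetical names, conditions are plain strings, and an empty mnemonic stands for an instruction that could not be decoded.

```cpp
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Toy model of IT-block state: the condition suffixes still to be applied,
// one per instruction slot covered by the IT instruction.
class ITState {
public:
  explicit ITState(std::vector<std::string> conds)
      : conds_(std::move(conds)) {}
  // Consume one slot and return its condition ("" once the block is over).
  std::string advance() {
    return next_ < conds_.size() ? conds_[next_++] : std::string();
  }

private:
  std::vector<std::string> conds_;
  std::size_t next_ = 0;
};

// An empty mnemonic stands for an instruction that cannot be decoded.
inline std::vector<std::string>
annotate(ITState state, const std::vector<std::string> &mnemonics) {
  std::vector<std::string> out;
  for (const std::string &m : mnemonics) {
    // Advance for every slot, decoded or not, so conditions never leak
    // into a later instruction.
    const std::string cond = state.advance();
    out.push_back(m.empty() ? std::string("<unknown>") : m + cond);
  }
  return out;
}
```

With a single pending "eq" condition, an unknown instruction followed by an add yields "<unknown>" and a plain "add": the unknown instruction uses up the slot.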
Fixes #150569
It was previously failing because of a warning marking a C++20 feature
as an extension.
This is a follow-up to 85043c1c146fd5658ad4c5b5138e58994333e645 that
introduced the test.
The vector granule (AArch64 DWARF register 46) is a pseudo-register that
contains the available size in bits of SVE vector registers in the
current call frame, divided by 64. The vector granule can be used in
DWARF expressions to describe SVE/SME stack frame layouts (e.g., the
location of SVE callee-saves).
The first time VG is evaluated (if not already set), it is initialized
to the result of evaluating a "CNTD" instruction (this assumes SVE is
available).
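As a worked example of the arithmetic: CNTD counts the 64-bit elements in a vector, so a 256-bit SVE implementation has VG = 256 / 64 = 4. The helper names below are hypothetical and only restate that relationship.

```cpp
#include <cstdint>

// VG is the SVE vector length in bits divided by 64, i.e. the number of
// 64-bit granules per vector (the value a CNTD instruction produces).
constexpr std::uint64_t vectorGranule(std::uint64_t vectorBits) {
  return vectorBits / 64;
}

// Size in bytes of one SVE vector register for a given VG: each granule is
// 8 bytes wide.
constexpr std::uint64_t vectorRegBytes(std::uint64_t vg) { return vg * 8; }

static_assert(vectorGranule(128) == 2, "128-bit vectors: VG = 2");
static_assert(vectorGranule(256) == 4, "256-bit vectors: VG = 4");
static_assert(vectorRegBytes(vectorGranule(256)) == 32,
              "a 256-bit vector register occupies 32 bytes");
```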
To support SME, the value of VG can change per call frame; this is
currently handled like any other callee-save and is intended to support
the unwind information implemented in #152283. This limits how VG is
used in the CFI information of functions with "streaming-mode changes"
(mode changes that change the SVE vector length), to make the unwinder's
job easier.
The orc-rt extensible RTTI mechanism is used to provide simple dynamic RTTI
checks for orc-rt types that do not depend on standard C++ RTTI (meaning that
they will work equally well for programs compiled with -fno-rtti).
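The general idea behind this style of RTTI can be sketched as follows. This is a simplified illustration and not the orc-rt API: every class and function name is hypothetical. Each class owns a static ID whose address serves as its type identity, and an isA check walks up the hierarchy through ordinary virtual calls.

```cpp
// Root of the hypothetical hierarchy.
class Base {
public:
  virtual ~Base() = default;
  static const void *classID() { return &ID; }
  virtual bool isA(const void *id) const { return id == classID(); }

private:
  static char ID; // only the address matters
};
char Base::ID = 0;

class Shape : public Base {
public:
  static const void *classID() { return &ID; }
  bool isA(const void *id) const override {
    return id == classID() || Base::isA(id);
  }

private:
  static char ID;
};
char Shape::ID = 0;

class Circle : public Shape {
public:
  static const void *classID() { return &ID; }
  bool isA(const void *id) const override {
    return id == classID() || Shape::isA(id);
  }

private:
  static char ID;
};
char Circle::ID = 0;

class Label : public Base {
public:
  static const void *classID() { return &ID; }
  bool isA(const void *id) const override {
    return id == classID() || Base::isA(id);
  }

private:
  static char ID;
};
char Label::ID = 0;

// Type query that needs no typeid or dynamic_cast, so it also works when
// compiling with -fno-rtti.
template <typename T> bool isa(const Base &obj) {
  return obj.isA(T::classID());
}
```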
ORC_RT_MARK_AS_BITMASK_ENUM and ORC_RT_DECLARE_ENUM_AS_BITMASK can be used to
easily add support for bitmask operators (&, |, ^, ~) to enum types.
This code was derived from LLVM's include/llvm/ADT/BitmaskEnum.h header.
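For readers unfamiliar with the pattern, the operators that such macros enable look roughly like the hand-written versions below. The sketch deliberately does not use the macros, since their arguments are not shown here; Perm, PermBits and AllPermBits are hypothetical names, and masking the result of ~ to the known bits is an assumption of this example.

```cpp
#include <type_traits>

// Hypothetical flag enum used only for illustration.
enum class Perm : unsigned { None = 0, Read = 1, Write = 2, Exec = 4 };

using PermBits = std::underlying_type<Perm>::type;
constexpr PermBits AllPermBits = 7; // Read | Write | Exec

constexpr Perm operator|(Perm a, Perm b) {
  return static_cast<Perm>(static_cast<PermBits>(a) | static_cast<PermBits>(b));
}
constexpr Perm operator&(Perm a, Perm b) {
  return static_cast<Perm>(static_cast<PermBits>(a) & static_cast<PermBits>(b));
}
constexpr Perm operator^(Perm a, Perm b) {
  return static_cast<Perm>(static_cast<PermBits>(a) ^ static_cast<PermBits>(b));
}
// Mask the complement so the result stays within the enum's known bits.
constexpr Perm operator~(Perm a) {
  return static_cast<Perm>(~static_cast<PermBits>(a) & AllPermBits);
}

static_assert((Perm::Read | Perm::Write) == static_cast<Perm>(3),
              "bitwise or combines flags");
```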
It is generally better to allow the target independent combines before
creating AArch64 specific nodes (providing they don't mess it up). This
moves the generation of BSL nodes to lowering, not a combine, so that
intermediate nodes are more likely to be optimized. There is a small
change in the constant handling to detect legalized buildvector
arguments correctly.
Fixes #149380, but not directly. #151856 contained a direct fix for
expanding the pseudos.
The simplest way is:
1. Save `vtype` to a scalar register.
2. Insert a `vsetvli`.
3. Use segment load/store.
4. Restore `vtype` via `vsetvl`.
But `vsetvl` is usually slow, so this PR does not take that approach.
Instead, we use wider whole load/store instructions if the register
encoding is aligned. We have done the same optimization for COPY in
https://github.com/llvm/llvm-project/pull/84455.
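One way to picture the alignment condition is a greedy split of a run of consecutive vector registers into whole-register accesses of 8, 4, 2 or 1 registers, each starting at a register number that is a multiple of its width. This is an illustrative assumption, not the code in this PR, and splitIntoWholeRegOps is a hypothetical name.

```cpp
#include <utility>
#include <vector>

// Split numRegs consecutive registers starting at firstReg into chunks of
// (first register, width), where width is 1, 2, 4 or 8 and the first
// register of each chunk is a multiple of its width.
inline std::vector<std::pair<unsigned, unsigned>>
splitIntoWholeRegOps(unsigned firstReg, unsigned numRegs) {
  std::vector<std::pair<unsigned, unsigned>> chunks;
  unsigned reg = firstReg;
  unsigned remaining = numRegs;
  while (remaining > 0) {
    // Pick the widest access that fits and is aligned; width 1 always is.
    unsigned width = 8;
    while (width > remaining || reg % width != 0)
      width /= 2;
    chunks.emplace_back(reg, width);
    reg += width;
    remaining -= width;
  }
  return chunks;
}
```

For six registers starting at v8 this yields one 4-register access at v8 and one 2-register access at v12, while three registers starting at v1 need a single-register access first because v1 is not 2-aligned.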
We found this suboptimal implementation when porting some video codec
kernels via RVV intrinsics.
The order in which entries in the tablegen API files are iterated is not
the order in which they appear in the file. To avoid any issues with the
order changing in the future, we now generate all definitions of a certain
class before any class that can use them.
This is NFC; the definitions don't actually change, just the order in
which they appear in the OffloadAPI.h header.