The fixed-point layout algorithm handles linker scripts, thunks, and
relaxOnce (to suppress out-of-range GOT-indirect-to-PC-relative
optimization). These passes are not needed for relocatable links because
they require address information that is not yet available.
Since we don't scan relocations for relocatable links, the
`createThunks` and `relaxOnce` functions are no-ops anyway, making these
passes redundant.
To prevent cluttering the line history, I place the `if (...) break;`
inside the for loop.
Pull Request: https://github.com/llvm/llvm-project/pull/152240
Splat is deprecated, and being prepared for removal in a future release.
https://discourse.llvm.org/t/rfc-mlir-vector-deprecate-then-remove-vector-splat/87143/5
The command I used, catches almost every splat op:
```
perl -i -pe
's/vector\.splat\s+(\S+)\s*:\s*vector<((?:\[?\d+\]?x)*)\s*([^>]+)>/vector.broadcast
$1 : $3 to vector<$2$3>/g' filename
```
This patchset uses `llvm::Align` when constructing `vector.{load,store}`
operations. The use of `llvm::Align` allows us to specify the unit of
alignment and strongly type alignment as opposed to having just unsigned
integers.
This should execute also the MLIR SPIRV Target tests which require the
SPIRV-Tools validator
---------
Signed-off-by: Davide Grohmann <davide.grohmann@arm.com>
This PR fixes the issue that caused an ub in PR #151472.
The issue was a shl call taking a negative shift amount (posDiff). The
result was never used, but tablegen would perform the calculation
anyway. The fix was to replace the shl call with just multiplications
with constants.
Original PR description:
This patch improves the helper classes in the SpacemiT-X60 vector
scheduling model and will be used in follow-up PRs:
There are now two functions to map LMUL to values:
* ConstValueUntilLMULThenDoubleBase: returns BaseValue for LMUL values
before startLMUL, Value for startLMUL, then doubles Value for each
subsequent LMUL. Useful for cases where fractional LMULs have constant
cycles, and integer LMULs double as they increase.
* GetLMULValue: takes an ordered list of LMUL cycles and LMUL and
returns the corresponding cycle. Useful for cases we can't easily cover
with ConstValueUntilLMULThenDoubleBase.
This PR also adds some useful simplified versions of
ConstValueUntilLMULThenDoubleBase, e.g.: ConstValueUntilLMULThenDouble
(when BaseValue == Value), or ConstOneUntilMF4ThenDouble (when cycles
start to double after MF2)
When `LLVM_INSTALL_TOOLCHAIN_ONLY=ON`, the MLIR shared library
(`libMLIR*`) is not installed even though it is built with the
`INSTALL_WITH_TOOLCHAIN` argument to the `add_mlir_library` cmake
function. This patch ensures that `libMLIR*` is installed when
`LLVM_INSTALL_TOOLCHAIN_ONLY=ON`.
Patch verified
[here](https://github.com/llvm/llvm-project/issues/151247#issuecomment-3156387055).
fixes#151247
`add_conformance_test` checks for libc and prints a warning if it is not
found. However, this warning ends up being printed once for each test,
spamming the cmake log. Moving it up to the folder cmake allows it to
be reported only once.
We currently log every single test that we run in premerge. This leads
to gigantic logs (200k+ lines on Linux) that can be difficult to parse
through. Having an indicator of progress is nice, especially for the
LLVM tests, but is not strictly necessary and not often used (I
imagine). Having a progress indicator from lit that works in CI cases is
on my TODO list.
For the rare cases where someone does need to see the list of tests that
run, the JUnit XML emitted by lit is available in the artifacts.
Adds missing C++ run lines to test files containing `constexpr` tests.
Also adds missing 32/64-bit test coverage to the following tests:
- `clang/test/CodeGen/X86/avx512-reduceIntrin.c`
- `clang/test/CodeGen/X86/avx512-reduceMinMaxIntrin.c`
- `clang/test/CodeGen/X86/avx512vpopcntdq-builtins.c`
- `clang/test/CodeGen/X86/avx512vpopcntdqvl-builtins.c`
Additionally, fixes a `_mm512_popcnt_epi64` `constexpr` test that
incorrectly assumed 32-bit integers, leading to incorrect bit counts.
This change updates the test result to assume 64-bit integers.
Split out from https://github.com/llvm/llvm-project/pull/150248:
Use the size of the alloca instead of the size passed to the lifetime
intrinsic.
As a bonus, this handles dynamic allocas correctly (see the added test)
instead of doing a memset with size -1...
Currrently flang-rt assumes that LLVM was always built with the dynamic
MSVC runtime. This may not be the case, if the user has specified a
different runtime with -DCMAKE_MSVC_RUNTIME_LIBRARY. Since this flag is
implied by -DLLVM_ENABLE_RPMALLOC=On, which is used by the Windows
release script, this is causing that script to fail.
Fixes#151920
Adds `linalg-morph-ops` pass to convert an op from one representation to another:
named-op <--> category_op (elementwise, contraction, ..) <--> generic
e.g.
```mlir
%exp = linalg.exp ins(%A : tensor<16x8xf32>) outs(%B : tensor<16x8xf32>) -> tensor<16x8xf32>
```
After `mlir-opt -linalg-morph-ops=named-to-category ..`
```mlir
%0 = linalg.elementwise kind=#linalg.elementwise_kind<exp> ins(%arg0 : tensor<16x8xf32> ..
Note: this is generalization of
`--linalg-generalize-named-ops` is the path `named-op --> generic-op`
`--linalg-specialize-generic-ops` is the path `named-op <-- generic-op`
email: quic_mabsar@quicinc.com
Changes: The original patch, landed as 1336675, was reverted due to a
bug in LoopVectorize resulting in a crash. The bug has now been fixed by
95c32bf ([VPlan] Return invalid cost if any skeleton block has invalid
costs), and this reland is identical to the original patch.
[NVPTX] Add Prefetch tensormap intrinsics
This PR adds prefetch intrinsics with the relevant tensormap_space.
* Lit tests are added as part of prefetch.ll
* The generated PTX is verified with a 12.3 ptxas executable.
* Added docs for these intrinsics in NVPTXUsage.rst.
For more information, refer to the PTX ISA for prefetch intrinsic :
[Prefetch
Tensormap](https://docs.nvidia.com/cuda/parallel-thread-execution/#data-movement-and-conversion-instructions-prefetch-prefetchu)
@durga4github @schwarzschild-radius
Previously, specializing the GraphWriter class required a full class
specialization.
This change introduces CRTP for GraphWriter, allowing for partial
specialization.
This change is in support of printing the module dependency graph as
part of the RFC for driver-managed module builds, for which we want to
print the graph nodes in a more human-readable format by:
- Printing descriptive IDs instead of pointer addresses as node labels.
- Printing the full node labels separately from the node relations to
avoid clutter.
With this approach, only GraphWriter::writeNodes() needs to be
specialized (, aside from DOTGraphTraits).
RFC for driver-managed module builds:
https://discourse.llvm.org/t/rfc-modules-support-simple-c-20-modules-use-from-the-clang-driver-without-a-build-system
This commit converts RetainCountChecker to the new checker family
framework that was introduced in the commit
6833076a5d9f5719539a24e900037da5a3979289
This commit also performs some minor cleanup around the parts that had
to be changed, but lots of technical debt still remains in this old
codebase.
This patch extends #149095 for EOR and ORR.
It uses a simple partition scheme to try to find two suitable disjoint
bitmasks that can be used with EOR/ORR to reconstruct the original mask.
Fixes: #148987.
This fixes https://github.com/llvm/llvm-project/issues/152097
This commit fixes two instances of a (somewhat) recently enabled
assertion. One with a test, the other I can't reproduce (might be dead
code) but certainly looks like an instance of the same problem.
The PR that introduced the regression:
https://github.com/llvm/llvm-project/pull/117558
With this patch, the AVR backend is usable again for TinyGo.
Judging from the reaction to
https://github.com/llvm/llvm-project/pull/152302, we are not ready to
make this a fatal error.
Remove the specific version number, and update the libc message to match
the others' wording.
Add a device function to check if a device queue is empty. If liboffload
tries to create an event for an empty queue, we create an "empty" event
that is already complete.
This allows `olCreateEvent`, `olSyncEvent` and `olWaitEvent` to run
quickly for empty queues.
When LLVM_LINK_LLVM_DYLIB is ON, `check-bolt` target reports unit test
failures:
BOLT-Unit :: Core/./CoreTests/failed_to_discover_tests_from_gtest
BOLT-Unit :: Profile/./ProfileTests/failed_to_discover_tests_from_gtest
The reason is that when llvm-lit runs a unit-test executable:
/path/to/CoreTests --gtest_list_tests '--gtest_filter=-*DISABLED_*'
an assertion is triggered with the following message:
LLVM ERROR: Option 'default' already exists!
This assertion triggers when the initializer of defaultListDAGScheduler
defined at SelectionDAGISel.cpp:219 is called as a statically-linked
function after already being called during the initialization of
libLLVM.
The issue can be traced down to LLVMTestingSupport library which depends
on libLLVM as neither COMPONENT_LIB nor DISABLE_LLVM_LINK_LLVM_DYLIB is
specified in a call to `add_llvm_library(LLVMTestingSupport ...)`.
Specifying DISABLE_LLVM_LINK_LLVM_DYLIB for LLVMTestingSupport makes
Clang unit test fail and COMPONENT_LIB is probably inappropriate for a
testing-specific library, thus as a workaround, added Error.cpp source
from LLVMTestingSupport directly to the list of source files of
CoreTests target (as it depends on
`llvm::detail::TakeError(llvm::Error)`) and removed LLVMTestingSupport
from the list of dependencies of ProfileTests.
This makes the test actually test the intended rewrite
pass. Also add some tests with inline immediates in src2.
Switch the target to gfx942 for future test functions.
Removing the unnecessary friend declarations from `<__tree>` also
removes the need for forward declaration headers for `map` and `set`,
which this patch also removes.