Add AMDGPUTargetLowering::canCreateUndefOrPoisonForTargetNode handler
and tag BFE_I32/U32 nodes as they can only propagate poison, not create
poison/undef.
Fighting some of the remaining regressions in #152107
Reland of https://github.com/llvm/llvm-project/pull/147381
Added changes to fix observed BuildBot failures:
* CMake version (reduced minimum to `3.20`, was: `3.22`)
* GoogleTest linking (missing `./build/lib/libllvm_gtest.a`)
* Related header issue (missing `#include "llvm/Support/raw_os_ostream.h"`)
Original message
Description
===========
OpenMP Tooling Interface Testing Library (ompTest)

ompTest is a unit testing framework for testing OpenMP implementations. It offers a simple-to-use framework that allows a tester to check for OMPT events in addition to regular unit testing code, supported by linking against GoogleTest by default. It also facilitates writing concise tests while bridging the semantic gap between the unit under test and the OMPT-event testing.
Background
==========
This library has been developed to provide the means of testing OMPT implementations with reasonable effort. In particular, asynchronous or unordered events are supported and can be verified with ease, something that can be challenging with LIT-based tests. Additionally, since the assertions are part of the code being tested, ompTest can reference all corresponding variables during assertion.
Basic Usage
===========
OMPT event assertions are placed before the code that is to be tested. These assertions can either be provided as one block or interleaved with the test code. There are two types of asserters: (1) sequenced "order-sensitive" and (2) set "unordered" asserters. Once the test is run, the corresponding events are triggered by the OpenMP runtime and can be observed. Each of these observed events notifies the asserters, which then determine whether the test should pass or fail.
Example (partial, interleaved)
==============================
```c++
int N = 100000;
int a[N];
int b[N];
OMPT_ASSERT_SEQUENCE(Target, TARGET, BEGIN, 0);
OMPT_ASSERT_SEQUENCE(TargetDataOp, ALLOC, N * sizeof(int)); // a ?
OMPT_ASSERT_SEQUENCE(TargetDataOp, H2D, N * sizeof(int), &a);
OMPT_ASSERT_SEQUENCE(TargetDataOp, ALLOC, N * sizeof(int)); // b ?
OMPT_ASSERT_SEQUENCE(TargetDataOp, H2D, N * sizeof(int), &b);
OMPT_ASSERT_SEQUENCE(TargetSubmit, 1);
OMPT_ASSERT_SEQUENCE(TargetDataOp, D2H, N * sizeof(int), nullptr, &b);
OMPT_ASSERT_SEQUENCE(TargetDataOp, D2H, N * sizeof(int), nullptr, &a);
OMPT_ASSERT_SEQUENCE(TargetDataOp, DELETE);
OMPT_ASSERT_SEQUENCE(TargetDataOp, DELETE);
OMPT_ASSERT_SEQUENCE(Target, TARGET, END, 0);
// The loop directive must be followed directly by the for loop.
#pragma omp target parallel for
for (int j = 0; j < N; j++)
  a[j] = b[j];
```
References
==========
This work has been presented at SC'24 workshops, see: https://ieeexplore.ieee.org/document/10820689
Current State and Future Work
=============================
ompTest's development was mostly device-centric and aimed at OMPT device callbacks and device-side tracing. Consequently, a substantial part of host-related events or features may not be supported in its current state. However, we are confident that the related functionality can be added and that ompTest provides a general foundation for future OpenMP and especially OMPT testing. This PR will allow us to upstream the corresponding features, like OMPT device-side tracing, in the future with significantly reduced risk of introducing regressions in the process.
Build
=====
ompTest is linked against LLVM's GoogleTest by default, but can also be built 'standalone'. Additionally, it comes with a set of unit tests, which in turn require GoogleTest (overriding a standalone build). The unit tests are added to the `check-openmp` target.
Use the following parameters to perform the corresponding build:
`LIBOMPTEST_BUILD_STANDALONE` (Default: ${OPENMP_STANDALONE_BUILD})
`LIBOMPTEST_BUILD_UNITTESTS` (Default: OFF)
---------
Co-authored-by: Jan-Patrick Lehr <JanPatrick.Lehr@amd.com>
Co-authored-by: Joachim <protze@rz.rwth-aachen.de>
Co-authored-by: Joachim Jenke <jenke@itc.rwth-aachen.de>
orc_rt_WrapperFunctionResult is a byte-buffer with inline storage and a
builtin error state. It is intended as a general purpose return type for
functions that return a serialized result (e.g. for communication across
ABIs or via IPC/RPC).
orc_rt_WrapperFunctionResult contains a small amount of inline storage,
allowing it to avoid heap-allocation for small return types (e.g. bools,
chars, pointers).
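The inline-storage idea can be illustrated with a minimal sketch. `SmallResultBuffer` below is a hypothetical type made up for this illustration; it is not the actual layout or API of orc_rt_WrapperFunctionResult, and it leaves out the built-in error state entirely.

```c++
#include <cstddef>
#include <cstdlib>
#include <cstring>

// Hypothetical illustration only: NOT the real orc_rt_WrapperFunctionResult.
// Payloads of up to sizeof(char *) bytes are stored inside the object itself;
// larger payloads are copied to the heap.
class SmallResultBuffer {
  union {
    char *Ptr;
    char Inline[sizeof(char *)];
  } Data;
  std::size_t Size;

public:
  SmallResultBuffer(const void *Src, std::size_t N) : Size(N) {
    if (isInline()) {
      std::memcpy(Data.Inline, Src, N);
    } else {
      Data.Ptr = static_cast<char *>(std::malloc(N));
      if (!Data.Ptr)
        std::abort(); // Keep the sketch simple: no recovery on OOM.
      std::memcpy(Data.Ptr, Src, N);
    }
  }
  SmallResultBuffer(const SmallResultBuffer &) = delete;
  SmallResultBuffer &operator=(const SmallResultBuffer &) = delete;
  ~SmallResultBuffer() {
    if (!isInline())
      std::free(Data.Ptr);
  }

  // True if the payload fits in the inline storage (no heap allocation).
  bool isInline() const { return Size <= sizeof(Data.Inline); }
  std::size_t size() const { return Size; }
  const char *data() const { return isInline() ? Data.Inline : Data.Ptr; }
};
```

With this sketch, a serialized bool or pointer stays inline, while a 64-byte payload falls back to a heap allocation.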
It broke the build:
compiler-rt/lib/hwasan/hwasan_thread.cpp:177:11: error: unknown type name 'ssize_t'; did you mean 'size_t'?
177 | (ssize_t)unique_id_, (void *)this, (void *)stack_bottom(),
| ^~~~~~~
| size_t
> This change addresses CodeQL format-string warnings across multiple
> sanitizer libraries by adding explicit casts to ensure that printf-style
> format specifiers match the actual argument types.
>
> Key updates:
> - Cast pointer arguments to (void*) when used with %p.
> - Use appropriate integer types and specifiers (e.g., size_t -> %zu,
> ssize_t -> %zd) to avoid mismatches.
> - Fix format specifier mismatches across xray, memprof, lsan, hwasan,
> dfsan.
>
> These changes are no-ops at runtime but improve type safety, silence
> static analysis warnings, and reduce the risk of UB in variadic calls.
This reverts commit d3d5751a39452327690b4e011a23de8327f02e86.
## Change:
* Added `--dump-dot-func` command-line option that allows users to dump
CFGs only for specific functions instead of dumping all functions (the
current only available option being `--dump-dot-all`)
## Usage:
* Users can now specify function names or regex patterns (e.g.,
`--dump-dot-func=main,helper` or `--dump-dot-func="init.*"`) to generate
.dot files only for functions of interest
* Aims to save time when analysing specific functions in large binaries
(e.g., only dumping graphs for performance-critical functions identified
through profiling) and to reduce the output clutter from generating
thousands of unnecessary .dot files
## Testing
The introduced test `dump-dot-func.test` confirms the new option does
the following:
- [x] 1. `dump-dot-func` can correctly filter a specified function
- [x] 2. Can achieve the above with regexes
- [x] 3. Can do 1. with a list of functions
- [x] No option specified creates no dot files
- [x] Passing in a non-existent function generates no dumping messages
- [x] `dump-dot-all` continues to work as expected
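The filtering behaviour described above could be sketched roughly as follows. Both helper names are hypothetical and this is not the code added to BOLT; the sketch assumes that the option value is split on commas and that every entry is treated as a regular expression that must match the whole function name.

```c++
#include <regex>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper, not BOLT's implementation: split the option value on
// commas and compile every non-empty entry as a regular expression.
// An invalid pattern makes std::regex throw std::regex_error.
std::vector<std::regex> parseDumpDotFuncPatterns(const std::string &Value) {
  std::vector<std::regex> Patterns;
  std::stringstream SS(Value);
  std::string Entry;
  while (std::getline(SS, Entry, ','))
    if (!Entry.empty())
      Patterns.emplace_back(Entry);
  return Patterns;
}

// A function is dumped only if its whole name matches at least one pattern.
// With no patterns at all, nothing is dumped.
bool shouldDumpDot(const std::string &FuncName,
                   const std::vector<std::regex> &Patterns) {
  for (const std::regex &P : Patterns)
    if (std::regex_match(FuncName, P))
      return true;
  return false;
}
```

Under these assumptions, `main,helper` selects exactly the two named functions and `init.*` selects every function whose name starts with `init`.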
This patch does two things:
- Remove exception specifications of `= default`ed special member
functions
- `= default` special member functions
The first part is NFC because the explicit specification does exactly
the same as the implicit specification. The second is NFC because it
does exactly what the `= default`ed special member does.
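As a small illustration of the first point, the sketch below uses two made-up types (they do not come from the patch) to check that spelling out `noexcept` on a defaulted special member yields the same exception specification as leaving it implicit. The second pair of types shows a hand-written copy constructor next to its `= default` counterpart.

```c++
#include <type_traits>

// Hypothetical types for illustration only.
struct ExplicitSpec {
  ExplicitSpec() noexcept = default;
  ExplicitSpec(const ExplicitSpec &) noexcept = default;
};

struct ImplicitSpec {
  ImplicitSpec() = default;
  ImplicitSpec(const ImplicitSpec &) = default;
};

// The implicit exception specification is already noexcept here, so the
// explicit one adds nothing.
static_assert(std::is_nothrow_default_constructible<ExplicitSpec>::value,
              "explicit spec is noexcept");
static_assert(std::is_nothrow_default_constructible<ImplicitSpec>::value,
              "implicit spec is noexcept as well");
static_assert(std::is_nothrow_copy_constructible<ExplicitSpec>::value ==
                  std::is_nothrow_copy_constructible<ImplicitSpec>::value,
              "copy constructors agree");

// A hand-written member-wise copy next to the defaulted equivalent.
struct HandWritten {
  int Value = 0;
  HandWritten() {}
  HandWritten(const HandWritten &Other) : Value(Other.Value) {}
};

struct Defaulted {
  int Value = 0;
  Defaulted() = default;
  Defaulted(const Defaulted &) = default;
};
```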
Win/ASan relies on the runtime's functions being 16-byte aligned so it
can intercept them with hotpatching. This used to be true (but not
guaranteed) until #149444.
Passing /hotpatch will give us enough alignment and generally ensure
that the functions are hotpatchable.
Drop poison generating flags on trunc when distributing trunc over
add/sub/or. We need to do this since for example
(add (trunc nuw A), (trunc nuw B)) is more poisonous than
(trunc nuw (add A, B)).
In some situations it is pessimistic to drop the flags, such as
when the add in the example above also has the nuw flag. For now we
keep it simple and always drop the flags.
Worth mentioning is that we drop the flags when cloning
instructions and rebuilding the chain. This is done after the
"allowsPreservingNUW" checks in ConstantOffsetExtractor::Extract.
So we still take the "trunc nuw" into consideration when determining
if nuw can be preserved in the gep (which should be ok since that
check also requires that all the involved binary operations have nuw).
Fixes #154116
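To make the "more poisonous" claim concrete, here is a toy model (not LLVM code) of `trunc nuw` from i16 to i8, with `std::nullopt` standing in for poison. All function names are invented for this sketch, and it assumes the plain `add` carries no flags.

```c++
#include <cstdint>
#include <optional>

// Toy model of `trunc nuw i16 -> i8`; std::nullopt stands in for poison.
std::optional<uint8_t> truncNuw(std::optional<uint16_t> X) {
  if (!X || (*X >> 8) != 0)
    return std::nullopt; // Poison input, or non-zero bits would be dropped.
  return static_cast<uint8_t>(*X);
}

// Plain `add` without flags: wraps around and only propagates poison.
template <typename T>
std::optional<T> addWrap(std::optional<T> A, std::optional<T> B) {
  if (!A || !B)
    return std::nullopt;
  return static_cast<T>(*A + *B);
}

// Models (trunc nuw (add A, B)).
std::optional<uint8_t> truncOfAdd(uint16_t A, uint16_t B) {
  return truncNuw(addWrap<uint16_t>(A, B));
}

// Models (add (trunc nuw A), (trunc nuw B)).
std::optional<uint8_t> addOfTruncs(uint16_t A, uint16_t B) {
  return addWrap<uint8_t>(truncNuw(A), truncNuw(B));
}
```

For A = 0x0100 and B = 0xFF00 the i16 add wraps to 0, so `truncOfAdd` yields a well-defined 0, while `addOfTruncs` is poison because each `trunc nuw` on its own drops non-zero bits.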
This handles constant folding for the AVX2 per-element shift intrinsics, which handle out-of-bounds shift amounts (logical result = 0, arithmetic result = signbit splat).
AVX512 intrinsics will follow in follow-up patches.
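The out-of-bounds behaviour can be modelled per 32-bit lane as in the sketch below. The function names are made up for illustration and this is not the constant-folding code from the patch. The arithmetic variant relies on `>>` of a negative signed value being an arithmetic shift, which is guaranteed since C++20 and is what mainstream compilers do in earlier modes.

```c++
#include <cstdint>

// Scalar model of one 32-bit lane of the AVX2 per-element shifts.
// Hypothetical helpers for illustration only.

// Logical left shift: an out-of-bounds amount gives 0.
uint32_t shiftLeftLogicalLane(uint32_t Value, uint32_t Amount) {
  return Amount < 32 ? Value << Amount : 0;
}

// Logical right shift: an out-of-bounds amount gives 0.
uint32_t shiftRightLogicalLane(uint32_t Value, uint32_t Amount) {
  return Amount < 32 ? Value >> Amount : 0;
}

// Arithmetic right shift: an out-of-bounds amount splats the sign bit, which
// is the same as shifting by 31.
int32_t shiftRightArithmeticLane(int32_t Value, uint32_t Amount) {
  return Value >> (Amount < 32 ? Amount : 31);
}
```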
First stage of #154287
This is a weird special case added in 2015, simplifying an even older
condition. It is a no-op for ELF (isExternal is always false) and seems
unneeded for non-ELF.
Similar to
94655dc8ae
The difference is that in LoongArch, the ALIGN is synthesized when the
alignment is >4, (instead of >=4), and the number of bytes inserted is
`sec->addralign - 4`.
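The rule stated above amounts to the following arithmetic. `synthesizedAlignBytes` is a hypothetical helper written for this note, not a function in lld.

```c++
#include <cstdint>

// Hypothetical helper: number of bytes inserted for a synthesized ALIGN,
// given the section alignment. Nothing is synthesized for alignments <= 4.
uint64_t synthesizedAlignBytes(uint64_t AddrAlign) {
  return AddrAlign > 4 ? AddrAlign - 4 : 0;
}
```

For example, a 16-byte aligned section gets 12 bytes inserted, and a 4-byte aligned section gets none.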
extractSubregFromImm previously would sign extend the 16-bit subregister
extracts, but not the 32-bit. We try to consistently store immediates
as sign extended, since not doing it can result in misreported
isInlineImmediate checks.
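A simplified sketch of the intended behaviour is shown below. The `ImmPart` enum and both functions are invented for this illustration and do not mirror the real signature of extractSubregFromImm; the point is only that 16-bit and 32-bit extracts are sign extended in the same way.

```c++
#include <cstdint>

// Hypothetical stand-in for a subregister index.
enum class ImmPart { Lo16, Hi16, Lo32, Hi32 };

// Sign extend the low Width bits of Bits (requires 0 < Width < 64).
int64_t signExtend(uint64_t Bits, unsigned Width) {
  uint64_t Mask = (uint64_t(1) << Width) - 1;
  uint64_t Sign = uint64_t(1) << (Width - 1);
  uint64_t Low = Bits & Mask;
  return static_cast<int64_t>(Low ^ Sign) - static_cast<int64_t>(Sign);
}

// Extract a part of a 64-bit immediate, always sign extending the result.
int64_t extractPartFromImm(int64_t Imm, ImmPart Part) {
  uint64_t Bits = static_cast<uint64_t>(Imm);
  switch (Part) {
  case ImmPart::Lo16:
    return signExtend(Bits, 16);
  case ImmPart::Hi16:
    return signExtend(Bits >> 16, 16);
  case ImmPart::Lo32:
    return signExtend(Bits, 32);
  case ImmPart::Hi32:
    return signExtend(Bits >> 32, 32);
  }
  return 0; // Unreachable for valid enum values.
}
```

With this model, the low 32 bits of `0x00000000FFFFFFFF` come out as -1 rather than 4294967295.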
Since the src_{private|shared}_{base|limit} registers were added and
are not artificial, the compiler happily uses them when it can. In HW
these registers do not exist and the encoding belongs to their
64-bit super-register or 32-bit low register. The same instructions
will produce a relocation if run through the assembler.
This patch updates the TMA Tensor prefetch Op
to add support for im2col_w/w128 and tile_gather4 modes.
This completes support for all modes available in Blackwell.
* lit tests are added for all possible combinations.
* The invalid tests are moved to a separate file with more coverage.
Signed-off-by: Durgadoss R <durgadossr@nvidia.com>
This patch adds basic assembler and MC layer infrastructure for
RISC-V big-endian targets (riscv32be/riscv64be):
- Register big-endian targets in RISCVTargetMachine
- Add big-endian data layout strings
- Implement endianness-aware fixup application in assembler
backend
- Add byte swapping for data fixups on BE cores
- Update MC layer components (AsmInfo, MCTargetDesc, Disassembler,
AsmParser)
This provides the foundation for BE support but does not yet include:
- Codegen patterns for BE
- Load/store instruction handling
- BE-specific subtarget features
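Endianness-aware application of a data fixup could look roughly like the sketch below. `writeDataFixup` and its parameters are hypothetical and do not reproduce the RISC-V assembler backend; the sketch only shows how the byte index is mirrored for big-endian output.

```c++
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: OR a fixup value of NumBytes bytes into Data at Offset.
// For little-endian output the least significant byte goes first; for
// big-endian output the byte order is mirrored.
void writeDataFixup(std::vector<uint8_t> &Data, std::size_t Offset,
                    uint64_t Value, unsigned NumBytes, bool IsLittleEndian) {
  for (unsigned I = 0; I != NumBytes; ++I) {
    unsigned Idx = IsLittleEndian ? I : NumBytes - 1 - I;
    Data[Offset + Idx] |= static_cast<uint8_t>((Value >> (I * 8)) & 0xff);
  }
}
```

Applying the value `0x11223344` as a 4-byte fixup then yields the bytes `44 33 22 11` for little-endian and `11 22 33 44` for big-endian output.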
These are almost all for internal-developer-users only so "look at
debugserver.cpp" wasn't unreasonable, but we rarely add any new options
so a simple list of all recognized options isn't a burden to throw in
the help method.
This change addresses CodeQL format-string warnings across multiple
sanitizer libraries by adding explicit casts to ensure that printf-style
format specifiers match the actual argument types.
Key updates:
- Cast pointer arguments to (void*) when used with %p.
- Use appropriate integer types and specifiers (e.g., size_t -> %zu,
ssize_t -> %zd) to avoid mismatches.
- Fix format specifier mismatches across xray, memprof, lsan, hwasan,
dfsan.
These changes are no-ops at runtime but improve type safety, silence
static analysis warnings, and reduce the risk of UB in variadic calls.
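The kind of change described could look like the sketch below. `describeThread` is a made-up function, not code from the sanitizer runtimes, and it uses `std::snprintf` rather than the sanitizers' internal Printf.

```c++
#include <cstddef>
#include <cstdio>
#include <string>

// Hypothetical example: every argument matches its format specifier.
// size_t is printed with %zu, and the pointer is cast to void * for %p.
std::string describeThread(std::size_t UniqueId, int *StackBottom) {
  char Buf[64];
  std::snprintf(Buf, sizeof(Buf), "T%zu stack: %p", UniqueId,
                (void *)StackBottom);
  return Buf;
}
```

The text produced for `%p` is implementation-defined, so only the `T<id>` prefix of the result is predictable across platforms.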
In 88f409194, we changed the way the crashlog scripted process was
launched since the previous approach required parsing the file twice,
by stopping at entry, setting the crashlog object in the middle of the
scripted process launch and resuming it.
Since then, we've introduced SBScriptObject, which allows passing any
arbitrary Python object across the SBAPI boundary to another scripted
affordance.
This patch makes use of that to include the parsed crashlog object in
the scripted process launch info dictionary, which alleviates the need to
stop at entry.
Signed-off-by: Med Ismail Bennani <ismail@bennani.ma>