Use start + (end - start) / 2 instead of (start + end) / 2 to compute
the midpoint address. The original expression overflows when start + end
exceeds UPTR_MAX, which happens on 32-bit targets whose memory layout
includes regions above 0x80000000.
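A minimal sketch of the difference, assuming a 32-bit pointer-sized type. The names `uptr32`, `NaiveMidpoint`, and `SafeMidpoint` are hypothetical and chosen for illustration; they are not the identifiers used in the patch.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a pointer-sized unsigned integer on a 32-bit target.
using uptr32 = std::uint32_t;

// Overflow-prone form: start + end wraps modulo 2^32 before the division.
constexpr uptr32 NaiveMidpoint(uptr32 start, uptr32 end) {
  return static_cast<uptr32>(start + end) / 2;
}

// Overflow-free form, valid whenever start <= end: the difference always fits.
constexpr uptr32 SafeMidpoint(uptr32 start, uptr32 end) {
  return start + (end - start) / 2;
}
```

For a region spanning 0x80000000 to 0xC0000000, the naive form wraps and yields an address below the region, while the safe form stays inside it.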
On musl, rlimit64 is an alias for rlimit rather than a distinct type as
it is on glibc. Add a SANITIZER_MUSL elif branch so that
struct_rlimit64_sz is defined for musl-based Linux targets.
#184545 default-enables the IO sandbox in assert-builds. This causes
Clang using Polly to crash (#188568).
The issue is that `PassBuilder` uses `vfs::getRealFileSystem()` by
default, which is considered an IO sandbox violation in the Clang process.
This PR stores the VFS from the `PassBuilder` passed to the original
`registerPollyPasses` call and reuses it when creating other
`PassBuilder` instances.
This PR also adds infrastructure for running Polly in `clang` (in
addition to `opt`). `opt` does not enable the sandbox, so we need
separate tests using Clang.
Closes: #188568
Need to check if the potential bitcast/bswap-like construct is a root of
the reduction, otherwise it cannot represent a bitcast/bswap construct.
Fixes #189184
When invoking `-test-bytecode-roundtrip=test-dialect-version=X.Y` on a
module that contains no test dialect operations, the reader type
callback in `runTest0` called
`reader.getDialectVersion<test::TestDialect>()` and then immediately
asserted that it succeeded. However, if the test dialect was never
referenced in the bytecode (because no test dialect types appear in the
module), the dialect's version information is not stored in the
bytecode, so `getDialectVersion` legitimately returns failure.
When the test dialect version is unavailable in the bytecode being read,
the module contains no test dialect types, so no "funky"-group overrides
are needed and the callback can safely skip by returning `success()`.
A regression test is added with a module that has no test dialect ops,
exercising the `test-dialect-version=2.0` path that previously crashed.
Fixes #128321
Fixes #128325
Assisted-by: Claude Code
When a dynamic index of -1 (the kPoisonIndex sentinel) was folded into
the static position of a vector.insert op,
foldDenseElementsAttrDestInsertOp would proceed to call
calculateInsertPosition, which returned -1. The subsequent iterator
arithmetic (allValues.begin() + (-1)) was undefined behaviour, causing
an assertion in DenseElementsAttr::get.
Fix by bailing out early in foldDenseElementsAttrDestInsertOp when any
static position equals kPoisonIndex, consistent with how
InsertChainFullyInitialized already guards this case.
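The early bail-out can be sketched outside of MLIR as follows. This is an illustration only: `kPoisonIndexSketch` and `FoldInsertSketch` are hypothetical names, the element type is a plain `int`, and the row-major linearization is an assumption rather than the actual `calculateInsertPosition` logic.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

// Assumption: mirrors the -1 sentinel described above; not the MLIR declaration.
constexpr std::int64_t kPoisonIndexSketch = -1;

// Hypothetical fold: write `value` into a flat, row-major copy of `dest`
// at the given static position, or return nullopt when folding must not happen.
std::optional<std::vector<int>> FoldInsertSketch(
    std::vector<int> dest, const std::vector<std::int64_t>& shape,
    const std::vector<std::int64_t>& staticPos, int value) {
  // Bail out before any index arithmetic if a position is the poison sentinel.
  if (std::any_of(staticPos.begin(), staticPos.end(),
                  [](std::int64_t p) { return p == kPoisonIndexSketch; }))
    return std::nullopt;
  if (staticPos.size() != shape.size()) return std::nullopt;

  // Row-major linearization with bounds checks on every dimension.
  std::int64_t flat = 0;
  for (std::size_t i = 0; i < shape.size(); ++i) {
    if (staticPos[i] < 0 || staticPos[i] >= shape[i]) return std::nullopt;
    flat = flat * shape[i] + staticPos[i];
  }
  if (flat >= static_cast<std::int64_t>(dest.size())) return std::nullopt;

  dest[static_cast<std::size_t>(flat)] = value;
  return dest;
}
```

The point of the sketch is the ordering: the sentinel check runs first, so no offset is ever derived from a poison position and added to an iterator.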
Fixes #188404
Assisted-by: Claude Code
To simplify the output of the reduction-tree pass, this PR introduces
eraseRedundantBlocksInRegion. For regions containing multiple
execution paths, this functionality selects the shortest 'interesting'
path. Additionally, this PR adds the getSuccessorForwardOperands API to
BranchOpInterface. This allows us to extract the ForwardOperands for a
specific path chosen from multiple alternatives, enabling the creation
of a cf.br operation for the redirected jump.
[Driver][HIP] Fix bundled -S emitting bitcode instead of assembly for
device
PR #188262 added support for bundling HIP -S output under the new
offload driver, but the device backend still entered the
bitcode-emitting path in ConstructPhaseAction. The condition at the
Backend phase checked for the new offload driver and directed device
code to emit TY_LLVM_BC, without excluding the -S case. This caused
the device section in the bundled .s to contain LLVM bitcode instead
of textual AMDGPU assembly.
This broke the HIP UT CheckCodeObjAttr test which greps
copyKernel.s for "uniform_work_group_size" — a string that only
appears in textual assembly, not in bitcode.
Fix by excluding -S (without -emit-llvm) from the new-driver
bitcode path, so the device backend falls through to emit TY_PP_Asm
(textual assembly). Also add a missing lit test check that the
device backend produces assembler output for the bundled -S case.
Fixes: LCOMPILER-553
Previously, it generated extra single quote marks around the outer
braces (i.e., `'{'` `6442:\220,1\22` `'}'`). The SPIR-V backend does not
expect that; it expects `{6442:\220,1\22}`.
The test appeared to pass, but it was not checking the intended
behavior. A WeakVH simply nulls itself when the value it points to is
replaced, so VarIndex became null and a zero value was used instead.
Only a WeakTrackingVH is updated to follow the new value.
Change the test slightly so that using a zero index produces a wrong
result.
This removes dyn_cast invocations where the argument is already of the
target type (including through subtyping). This was created by adding a
static assert in dyn_cast and letting an LLM iterate until the code base
compiled. I then went through each example and cleaned it up. This does
not commit the static assert in dyn_cast, because it would prevent a lot
of uses in templated code. To prevent backsliding we should instead add
an LLVM-aware version of
https://clang.llvm.org/extra/clang-tidy/checks/readability/redundant-casting.html
(or expand the existing one).
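The kind of static assert described above can be sketched with standard C++ only. `CheckedDynCastSketch`, `Base`, and `Derived` are hypothetical names, and the sketch uses `dynamic_cast` rather than LLVM's own RTTI machinery, so it approximates the idea and is not the change that was tried.

```cpp
#include <cassert>
#include <type_traits>

struct Base {
  virtual ~Base() = default;
};
struct Derived : Base {};

// Hypothetical checked cast: rejects, at compile time, casts whose argument
// already has the target type (directly or through subtyping).
template <typename To, typename From>
To* CheckedDynCastSketch(From* from) {
  static_assert(!std::is_base_of_v<To, From>,
                "redundant cast: argument is already of the target type");
  return dynamic_cast<To*>(from);
}
```

A call such as `CheckedDynCastSketch<Base>(derivedPtr)` fails to compile, which is how redundant casts surface. The same assert also fires in templates where `From` only sometimes equals `To`, which is consistent with the reason given above for not committing it.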
The code generated for calls with FPCC-eligible structs as arguments
doesn't take the bitfield's width into account, which results in a store
crossing the boundary of the memory allocated by the alloca.
For example, for the code:
```
struct __attribute__((packed, aligned(1))) S {
const float f0;
unsigned f1 : 1;
};
unsigned func(struct S arg)
{
return arg.f1;
}
```
The generated IR is:
```
define dso_local signext i32 @func(
float [[TMP0:%.*]], i32 [[TMP1:%.*]]) #[[ATTR0:[0-9]+]] {
[[ENTRY:.*:]]
[[ARG:%.*]] = alloca [[STRUCT_S:%.*]], align 1
[[TMP2:%.*]] = getelementptr inbounds nuw { float, i32 }, ptr [[ARG]], i32 0, i32 0
store float [[TMP0]], ptr [[TMP2]], align 1
[[TMP3:%.*]] = getelementptr inbounds nuw { float, i32 }, ptr [[ARG]], i32 0, i32 1
store i32 [[TMP1]], ptr [[TMP3]], align 1
[[F1:%.*]] = getelementptr inbounds nuw [[STRUCT_S]], ptr [[ARG]], i32 0, i32 1
[[BF_LOAD:%.*]] = load i8, ptr [[F1]], align 1
[[BF_CLEAR:%.*]] = and i8 [[BF_LOAD]], 1
[[BF_CAST:%.*]] = zext i8 [[BF_CLEAR]] to i32
ret i32 [[BF_CAST]]
```
Here, `store i32 [[TMP1]], ptr [[TMP3]], align 1` crosses the boundary
of the allocated memory. After optimizations (EarlyCSEPass), the
remaining IR is:
```
define dso_local noundef signext i32 @func(
float [[TMP0:%.*]], i32 [[TMP1:%.*]]) local_unnamed_addr #[[ATTR0:[0-9]+]] {
[[ENTRY:.*:]]
ret i32 0
```
The patch trims the second member of the struct after taking into
consideration the bitwidth to decide the appropriate integer type and
the test shows the results of this patch.
Note that the bug is seen only when `f` extension is enabled for FPCC
eligibility.
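The size mismatch behind the out-of-bounds store can be checked with a small sketch. It assumes a GCC- or Clang-compatible compiler (for the attribute syntax), a 4-byte `float`, and a non-MSVC bitfield layout; the constant and function names are hypothetical. The `const` qualifier on `f0` is dropped because it does not affect layout.

```cpp
#include <cassert>
#include <cstddef>

// Same layout as the struct in the example above.
struct __attribute__((packed, aligned(1))) S {
  float f0;
  unsigned f1 : 1;
};

// Bytes reserved by the alloca of S.
constexpr std::size_t kAllocaSize = sizeof(S);
// The coerced { float, i32 } type places its second member right after the float.
constexpr std::size_t kSecondStoreOffset = sizeof(float);
// An i32 store writes 4 bytes starting at that offset.
constexpr std::size_t kI32StoreEnd = kSecondStoreOffset + 4;

constexpr bool StoreCrossesAlloca() { return kI32StoreEnd > kAllocaSize; }
```

Under these assumptions the struct occupies 5 bytes while the `i32` store ends at byte 8, so three bytes land outside the allocation. A single-byte integer type for the second member would fit.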
Co-authored-by: muhammad.kamran4 <muhammad.kamran@esperantotech.com>
Get rid of several .h.def files which were used to ensure that the
macro definitions from llvm-libc-macro would be included in the public
header. Replace this logic with YAML instead - add entries to the
"macros" list that point to the correct "macro_header" to ensure it
would be included.
For C standard library headers, list several standard-defined macros
to document their availability. For POSIX/Linux headers, only reference
a handful of macros, since more planning is needed to decide how to
represent platform-specific macros in YAML.
When SPIRV-LLVM-Translator is built in-tree (i.e., placed in
llvm/projects folder), llvm-spirv target exists.
Drop legacy llvm-spirv_target dependency (was for non-runtime build) and
add llvm-spirv to runtimes dependencies.
Fixes #188131
This change addresses stylistic changes @bogners requested in PR review
(https://github.com/llvm/llvm-project/pull/186215/). It also adds
`storeMatrixArrayFromVector` to SPIRVLegalizePointerCast.cpp, used when
we detect the matrix array of vector memory layout.
Changes to storeArrayFromVector were cleanup.
Assisted-by: GitHub Copilot for test case check lines
The docgen script was previously hardcoded to assume all implemented
macros must be placed in a *-macros.h header. This updates docgen to
read inline macro_value properties directly from the source YAML files,
correctly recognizing them as implemented.
Define the POSIX cpio.h header and its standard macros in the libc build
system. Configure the macros directly in the YAML specification to allow
automated header generation without a custom definition template.