As per OpenMP 5.1, we need to assume that when the lookup for
`use_device_ptr/addr` fails, the incoming pointer was already device
accessible.
Prior to 5.1, a lookup-failure meant a user-error (for `use_device_ptr`),
so we could do anything in that scenario. For `use_device_addr`,
it was always incorrect to set the address to null.
OpenMP 6.1 adds a way to retain the previous behavior of nullifying a
pointer when the lookup fails. That will be tackled by the PR stack
starting with https://github.com/llvm/llvm-project/pull/169603.
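The fallback described above can be sketched as follows. This is a minimal illustration, not the actual runtime code: `MappingTable` and `translateForUseDevice` are hypothetical names, and a plain hash map stands in for whatever mapping structure the runtime really uses.

```cpp
#include <unordered_map>

// Hypothetical host-to-device mapping table (an assumption for this sketch,
// not the real offload runtime data structure).
using MappingTable = std::unordered_map<const void *, void *>;

// Translate a host pointer for a use_device_ptr/use_device_addr clause.
// If the lookup fails, keep the incoming pointer, on the assumption that it
// is already device accessible, instead of replacing it with nullptr.
inline void *translateForUseDevice(const MappingTable &Table, void *HostPtr) {
  auto It = Table.find(HostPtr);
  if (It != Table.end())
    return It->second; // mapped: hand back the device address
  return HostPtr;      // not mapped: pass the pointer through unchanged
}
```

With this shape, a pointer that was never mapped comes back unchanged, while a mapped pointer is translated to its device counterpart.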
a83c89495ba6fe0134dcaa02372c320cc7ff0dbf caused assertion failures here:
if we have a single-bit induction variable and two lanes (0 and 1),
then the second lane index (1) is out of bounds of what a signed
1-bit integer can hold. Lane indices are always >= 0 according to
VPlanHelpers.h:125, and the lane representation in this code is also
unsigned.
The test case comes from tensorflow/XLA.
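To make the range argument concrete, here is a small standalone sketch (the helper names are made up for illustration and are not LLVM APIs) that checks whether a value fits in an N-bit signed or unsigned integer:

```cpp
#include <cstdint>

// True if Value is representable as an unsigned integer that is Bits wide.
// Assumes 1 <= Bits <= 63.
constexpr bool fitsUnsigned(std::uint64_t Value, unsigned Bits) {
  return Value < (std::uint64_t{1} << Bits);
}

// True if Value is representable as a two's-complement signed integer that
// is Bits wide, i.e. lies in [-2^(Bits-1), 2^(Bits-1) - 1].
// Assumes 1 <= Bits <= 63.
constexpr bool fitsSigned(std::int64_t Value, unsigned Bits) {
  const std::int64_t Max = (std::int64_t{1} << (Bits - 1)) - 1;
  const std::int64_t Min = -Max - 1;
  return Value >= Min && Value <= Max;
}

// A signed 1-bit integer can only hold -1 and 0, so lane index 1 does not
// fit, while an unsigned 1-bit integer holds 0 and 1.
static_assert(!fitsSigned(1, 1), "lane 1 does not fit a signed i1");
static_assert(fitsUnsigned(1, 1), "lane 1 fits an unsigned i1");
```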
This reverts commit 906b48616c03948a4df62a5a144f7108f3c455e8.
The forward fix for this got reverted in
25976e83606f1a7615e3725e6038bb53ee96c3d5, so reverting the original
commit given it is still broken and the forward fix that mitigated most
of the issues is no longer in tree.
[Clang][DebugInfo] Add a flag to use expansion loc for macro params.
This patch adds a flag to allow users to preserve the old behaviour -
use the macro expansion location for parameters. This is useful for
wider testing of sample profile driven PGO which relies on debug
information based mapping. This flag is intended to be temporary
and should be safe to remove by EOY 2026. Filed #175249 to
track the cleanup.
---------
Assisted-by: Gemini
There are a few test cases in TestMultithreaded.py. Most of them set a
breakpoint by name on "next". There's no problem with doing that, but
one of the test cases in particular relies on being able to grab a
specific breakpoint location corresponding to the test inferior.
If you have libc++ symbols, this test will also have breakpoint
locations for symbols named `next` in libc++. I could have changed the
test to find the correct `next` breakpoint location, but it seems easier
to give it a more uncommon name instead.
This PR adds validation for register numbers.
Register numbers must never exceed UINT32_MAX (4294967295).
Additionally, resource arrays have each resource element bound
sequentially, and those resources' register numbers must not exceed
UINT32_MAX either. Even though the elements are not explicitly given a
register number, their effective register number is also validated.
This accounts for nested resource declarations and resource arrays too.
Fixes https://github.com/llvm/llvm-project/issues/136809
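As a rough sketch of the array case, assuming an array of `Count` elements bound from register `First` occupies registers `First` through `First + Count - 1` (the helper name is hypothetical and this is not the code from the PR):

```cpp
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true if every register number in the range
// First .. First + Count - 1 is <= UINT32_MAX.
constexpr bool registerRangeIsValid(std::uint64_t First, std::uint64_t Count) {
  constexpr std::uint64_t Max = std::numeric_limits<std::uint32_t>::max();
  if (First > Max)
    return false;
  if (Count == 0)
    return true; // assumption for this sketch: nothing is bound
  // Compare against the remaining headroom instead of computing
  // First + Count, which could wrap around for very large inputs.
  return Count - 1 <= Max - First;
}
```

For example, a two-element array starting at register 4294967295 is rejected because its second element would land on register 4294967296.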
This commit introduces new ext-shape operations:
- LOG2_CEIL_SHAPE
- LOG2_FLOOR_SHAPE
- EXP2_SHAPE
These additions include the operator definitions, same-rank
verification, and level checks during validation.
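For readers unfamiliar with the underlying arithmetic, the following sketch shows integer log2 (rounded down and up) and power-of-two helpers. It only illustrates the math the operation names suggest; whether the new operations apply exactly this element-wise to shape values is an assumption, and the function names are invented for the example.

```cpp
#include <cstdint>

// floor(log2(X)); the caller must pass X >= 1.
constexpr unsigned log2Floor(std::uint64_t X) {
  unsigned Result = 0;
  while (X > 1) {
    X >>= 1;
    ++Result;
  }
  return Result;
}

// ceil(log2(X)); the caller must pass X >= 1.
constexpr unsigned log2Ceil(std::uint64_t X) {
  return X <= 1 ? 0 : log2Floor(X - 1) + 1;
}

// 2^N; the caller must pass N < 64.
constexpr std::uint64_t exp2Int(unsigned N) { return std::uint64_t{1} << N; }
```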
---------
Co-authored-by: Luke Hutton <luke.hutton@arm.com>
The encoding scheme for 48-bit and larger instructions has not
been ratified yet. The RISC-V ISA manual previously included a
proposal that included 4 reserved major opcodes. LLVM's
disassembler implements this proposal as does binutils.
A vendor extension might have used the reserved opcodes,
as a non-conforming 32-bit extension. Try to decode as a
32-bit instruction first to catch these cases.
Should help with #174571.
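The decode order can be sketched as below. The length classification follows the low-bit scheme from the earlier ISA manual proposal; the `TryDecode` callbacks and `decodeWideCandidate` are hypothetical stand-ins and do not mirror the real disassembler interfaces.

```cpp
#include <cstdint>

// Instruction length in bytes implied by the low bits of the first 16-bit
// parcel under the previously proposed (unratified) scheme.
// Returns 0 for encodings this sketch does not classify.
constexpr unsigned lengthFromLowBits(std::uint16_t Parcel) {
  if ((Parcel & 0x03) != 0x03)
    return 2; // bits [1:0] != 11
  if ((Parcel & 0x1c) != 0x1c)
    return 4; // bits [4:2] != 111
  if ((Parcel & 0x3f) == 0x1f)
    return 6; // bits [5:0] == 011111
  if ((Parcel & 0x7f) == 0x3f)
    return 8; // bits [6:0] == 0111111
  return 0;
}

// Hypothetical decoder callback: returns true if it recognises the bits.
using TryDecode = bool (*)(std::uint64_t Bits);

// Returns the number of bytes decoded, or 0 if nothing matched.
inline unsigned decodeWideCandidate(std::uint64_t Bits, TryDecode Try32,
                                    TryDecode Try48) {
  // A vendor extension may have used a reserved opcode for a 32-bit
  // instruction, so the 32-bit decoder gets the first chance.
  if (Try32(Bits & 0xffffffffULL))
    return 4;
  if (lengthFromLowBits(static_cast<std::uint16_t>(Bits)) == 6 &&
      Try48(Bits & 0xffffffffffffULL))
    return 6;
  return 0;
}
```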
Consider the following program:
```
int main() {
int foo[2][3][4];
int (*bar)[3][4] = foo;
return 0;
}
```
If we:
- compile this program
- launch an LLDB debugging session
- launch the process and let it stop at the `return 0;` statement
then the following LLDB command:
```
(lldb) script lldb.frame.FindVariable("bar").GetChildAtIndex(0).get_expr_path()
```
will produce the following output:
```
bar->[0]
```
What we were expecting:
- a valid expression in the C programming language
- that would allow us (in the scope of the `main` function) to access the
appropriate object.
What we've got is a string that does not represent a valid expression in
the C programming language.
This pull-request proposes a fix to this problem.
---------
Co-authored-by: Matej Košík <matej.kosik@codasip.com>
In regions destined for GPU offload, computing an address_of means
getting the device address directly; there is no need (and it is actually
incorrect) to insert a runtime call to get the address. This was already
working for regions such as `gpu.launch`, and now it applies to acc
regions as well.
Some AST nodes had their "source" member visited by the parse tree
visitor, while others, in particular those that were handled by the
trait-based visitors, did not.
Make sure that we call the Walk function on the "source" member for all
classes that have it.
Summary:
We rely on this in most places we work with address spaces. This allows
target address spaces to implicitly convert to generic ones.
I actually have no clue if this is valid or correct with SPIR-V, hoping
someone with more target / backend knowledge can chime in.
---------
Co-authored-by: Matt Arsenault <arsenm2@gmail.com>
When connected to a GDB remote platform, we always use "gdb-remote" as
the process plugin when attaching. This means that the `--plugin`
argument to `process attach` is effectively ignored. This patch makes it
so that "gdb-remote" remains the default, while still honoring the
process plugin name specified in the attach info. The same thing applies
to launching a process.
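The selection rule amounts to "explicit request wins, otherwise use the default". A minimal sketch, where `selectProcessPlugin` is a hypothetical helper and not an LLDB function:

```cpp
#include <string>

// Hypothetical helper: choose the process plugin when connected to a GDB
// remote platform. A plugin name given in the attach/launch info takes
// precedence; an empty name falls back to "gdb-remote".
inline std::string selectProcessPlugin(const std::string &Requested) {
  return Requested.empty() ? std::string("gdb-remote") : Requested;
}
```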
rdar://167845923
Add a new metadata node `!implicit.ref` to represent an implicit
dependency between two symbols. The metadata is unique to AIX and gets
lowered to a relocation that adds an explicit link from the section in
which the global carrying the metadata is allocated to the
associated symbol. This relocation will cause the associated symbol to
remain live if the section is not garbage collected. This is used mainly
for compiler features where there is some hidden runtime dependency
between the symbols that isn't otherwise obvious to the linker.
Following the improvements introduced in #109833 and the most recent
development of the libamath library (used by `-fveclib=ArmPL`), this
patch adds the missing mappings for the functions that return literal
struct values.
Handle empty unions in CIR record lowering and LLVM conversion by
emitting padding when needed, guarding `getLargestMember` for
empty/padded unions, and lowering to empty or padded LLVM structs based
on language rules.
Added regression tests for C and C++ empty union lowering in
`clang/test/CIR/CodeGen/empty-union.c` and `empty-union.cpp`.
On 64-bit AIX, set allocator size to 256G and set beginning to
0x0a00000000000000.
Issue: #138916
---------
Co-authored-by: Hubert Tong <hubert.reinterpretcast@gmail.com>
CodeEmitterGen CaseMap values are always a vector of integer IDs (HwMode
or instruction opcode). So change the map values to be a vector of
integers instead of strings and instead print the string form when
emitting the case statements. This will help reduce the memory footprint
by not storing potentially long strings (for opcode names) in the map.
The P extension requires us to use base ISA load/store instructions for
small vectors. We need to make sure we don't generate misaligned
instructions.
We'll need to do more work here if we want P and V to be enabled at the
same time, but that's a future problem.
This is another instance where we weren't checking the result of
FileSystem::CreateDataBuffer and were unconditionally accessing it,
similar to the bug in SourceManager last week. In this particular case,
ObjectFile was assuming that the contents can always be read, which
isn't true for directory nodes.
Jim figured this one out yesterday. I'm just putting up the patch and
adding a test.
rdar://167796036
Instead of matching 6 different masks, use an ImmLeaf to detect any of
the 6 masks.
This isn't NFC because using an immediate directly will call
computeKnownBits to fill in bits that are expected to be 1, but have
been cleared because they are known 0 in the LHS of the and. We don't
have tests for this; if it's important, we can switch to a ComplexPattern
to restore that behavior.
Add support for G_PTRMASK. p8 (buffer resource) is still missing due to
a legalizer issue in GlobalISel that does not occur in SelectionDAG:
`LLVM ERROR: unable to legalize instruction: %17:_(p8) = G_PTRMASK %0:_,
%22:_(s128) (in function: v_ptrmask_buffer_resource_variable_i48)`
Added a FIXME to indicate this issue.
Closes #172176.
Previously, `FoldOpIntoSelect` wouldn't fold multi-use selects if
`MultiUse` wasn't explicitly true. This prevented useful folding when the
select is used multiple times in the same intrinsic call. Similar to
what is done in `foldOpIntoPhi`, we'll now check that all of the uses
come from a single user, rather than checking that there is only one
use.
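The difference between "one use" and "one user" can be illustrated with a toy model. The `Inst` struct and both helpers below are simplified stand-ins written for this example; they are not the LLVM `Value`/`User` classes or the code in InstCombine.

```cpp
#include <vector>

// Toy instruction: just a list of operand pointers (NOT an LLVM class).
struct Inst {
  std::vector<const Inst *> Operands;
};

// Number of operand slots across AllInsts that refer to V.
inline unsigned countUses(const Inst &V,
                          const std::vector<const Inst *> &AllInsts) {
  unsigned N = 0;
  for (const Inst *I : AllInsts)
    for (const Inst *Op : I->Operands)
      if (Op == &V)
        ++N;
  return N;
}

// True if V is used at least once and every use belongs to the same user.
inline bool hasSingleUser(const Inst &V,
                          const std::vector<const Inst *> &AllInsts) {
  const Inst *User = nullptr;
  for (const Inst *I : AllInsts)
    for (const Inst *Op : I->Operands)
      if (Op == &V) {
        if (User && User != I)
          return false; // a second, different user was found
        User = I;
      }
  return User != nullptr;
}
```

A select passed twice to the same call has two uses but a single user, so a "single user" check accepts it where a "single use" check would not.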