500381 Commits

Author SHA1 Message Date
Paul T Robinson
6416958067
Revert "[CUDA] Fix a couple of driver tests that really weren't being run" (#93988)
Reverts llvm/llvm-project#93960

The change to offloading-interoperability.c broke many bots.
2024-05-31 13:14:20 -04:00
Paul T Robinson
97c34eb8df
[CUDA] Fix a couple of driver tests that really weren't being run (#93960) 2024-05-31 12:47:37 -04:00
Konstantin Zhuravlyov
775f1cd34d
AMDGPU: Add gfx12-generic target (#93875) 2024-05-31 12:46:44 -04:00
Nikita Popov
57eb92ea6c
[llvm-objdump][test] Relax directory prefix check in source-interleave test (#93789)
This test currently has an explicit regex for characters that are
supposedly valid inside a directory name -- however, it does not
actually cover all necessary characters. For example, this test fails if
the path contains a tilde.

Instead, replace this with a wildcard.
2024-05-31 18:46:15 +02:00
Fangrui Song
7b6a89f346
[ELF] Detect convergence of output section addresses
Some linker scripts don't converge. https://reviews.llvm.org/D66279
("[ELF] Make LinkerScript::assignAddresses iterative") detected
convergence of symbol assignments.

This patch detects convergence of output section addresses. While input
sections might also have convergence issues, they are less common as
expressions that could cause convergence issues typically involve output
sections and symbol assignments.

GNU ld has an error `non constant or forward reference address expression for section` that
correctly rejects
```
SECTIONS {
  .text ADDR(.data)+0x1000 : { *(.text) }
  .data : { *(.data) }
}
```

but not the following variant:
```
SECTIONS {
  .text foo : { *(.text) }
  .data : { *(.data) }
  foo = ADDR(.data)+0x1000;
}
```

Our approach consistently rejects both cases.

Link: https://discourse.llvm.org/t/lld-and-layout-convergence/79232

Pull Request: https://github.com/llvm/llvm-project/pull/93888
2024-05-31 09:31:15 -07:00
Victor Perez
98d5d3448d
[MLIR][GPU-LLVM] Define -convert-gpu-to-llvm-spv pass (#90972)
Define pass for GPU to LLVM conversion for SPIR-V backend tool ingest.

Supported operations:

- `gpu.block_id`
- `gpu.global_id`
- `gpu.block_dim`
- `gpu.thread_id`
- `gpu.grid_dim`
- `gpu.barrier`
- `gpu.shuffle`

---------

Signed-off-by: Victor Perez <victor.perez@codeplay.com>
2024-05-31 17:47:53 +02:00
erichkeane
85ea1aaf15 [OpenACC] Fix device_type clause appertainment
Seemingly I forgot to implement the appertainment checks when doing the
original device_type implementation, so we fell through to the 'not
implemented' section of the diagnostics.

This patch corrects the appertainment, so that we disallow it correctly.
2024-05-31 08:43:48 -07:00
Ivan Kosarev
ca0dae0d6b
[AMDGPU][NFC] Eliminate GCNPredicateControl. (#93964)
Removes ~100K instances of SIAssemblerPredicate and VIAssemblerPredicate
fields from instruction records.
2024-05-31 16:28:43 +01:00
Vladislav Dzhidzhoev
5e423f1c51 [lldb][test] Add --sysroot argument to dotest.py
This argument sets a specific sysroot path which will be used for
building LLDB API test programs.
It might come in handy for setting up cross-platform remote runs of API
tests on a Windows host.

It can be useful for cross-compiling LLDB API tests. The argument can be
set using the `LLDB_TEST_USER_ARGS` argument:
```
cmake ...
-DLLDB_TEST_USER_ARGS="...;--sysroot;C:\path\to\sysroot;..."
...
```
2024-05-31 17:24:14 +02:00
Valentin Clement (バレンタイン クレメン)
e6bef08e22
[flang] Avoid double free in bufferize pass (#93922)
In some cases where we have an `hlfir.no_reassoc` operation, the
bufferization pass could not erase the hlfir.destroy op during the
`hlfir.associate` op conversion, as shown in the example below.

```
func.func @double_free(%arg0: !fir.boxchar<1>) {
  %c5 = arith.constant 5 : index
  %true = arith.constant true
  %0 = hlfir.as_expr %arg0 move %true : (!fir.boxchar<1>, i1) -> !hlfir.expr<!fir.char<1,?>>
  %1 = hlfir.no_reassoc %0 : !hlfir.expr<!fir.char<1,?>>
  %2:3 = hlfir.associate %1 typeparams %c5 {adapt.valuebyref} : (!hlfir.expr<!fir.char<1,?>>, index) -> (!fir.boxchar<1>, !fir.ref<!fir.char<1,?>>, i1)
  fir.call @noop(%2#0) : (!fir.boxchar<1>) -> ()
  hlfir.end_associate %2#1, %2#2 : !fir.ref<!fir.char<1,?>>, i1
  hlfir.destroy %0 : !hlfir.expr<!fir.char<1,?>>
  return
} 
func.func private @noop(!fir.boxchar<1>)
```

The bufferization pass is looking at uses of its source `%1` that is the
result of an `hlfir.no_reassoc` operation. In order to avoid double free
generation, also look at the indirection in presence of
`hlfir.no_reassoc`.
2024-05-31 08:23:27 -07:00
Vladislav Dzhidzhoev
c5e417a812 [lldb] Fix 'session save' command on Windows
1. Use dashes (-) instead of colons (:) as time separator in a session log
file name since Windows doesn't support saving files with names containing
colons.

2. Temporary file creation code is changed in the test:
On Windows, the temporary file should be closed before 'session save'
writes session log to it. NamedTemporaryFile() can preserve the file
after closing it with delete_on_close=False option.
However, this option is only available since Python 3.12. Thus
mkstemp() is used for temporary file creation as the more compatible
option.
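
The two points above can be sketched in Python as follows. This is a minimal illustration, not the code from the patch: the helper names `session_log_name` and `make_closed_temp_file` and the exact file name pattern are assumptions made for the example.

```python
import os
import tempfile
from datetime import datetime


def session_log_name(now: datetime) -> str:
    # Hypothetical file name pattern. Dashes replace colons in the time
    # part so that the name is also valid on Windows.
    return now.strftime("lldb_session_%Y-%m-%d_%H-%M-%S.log")


def make_closed_temp_file() -> str:
    # mkstemp() returns an open OS-level file descriptor and the path.
    fd, path = tempfile.mkstemp()
    # Close the descriptor immediately so that another writer can open
    # the file on Windows. The file itself stays on disk.
    os.close(fd)
    return path
```

The caller is responsible for removing the file returned by `make_closed_temp_file()` once it is no longer needed, for example with `os.remove(path)`.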
2024-05-31 17:18:21 +02:00
Luke Lau
fb87e11e72 [RISCV] Add test case for strided scatter with scalar offset. NFC 2024-05-31 15:49:16 +01:00
Kirill Podoprigora
6163775077
[clang] `README.txt`: Replace the link to the old bug tracker with the new one. (#93878) 2024-05-31 10:43:42 -04:00
Craig Topper
edf4e02906
[RISCV] Support multiple levels of truncates in combineTruncToVnclip. (#93752)
We can use multiple vnclips to saturate an i32 value into an i8 value.
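
As a rough scalar model of why chaining works (a sketch only, assuming each vnclip step behaves like a signed saturating narrow with a shift amount of zero), saturating from 32 bits to 16 bits and then from 16 bits to 8 bits gives the same result as clamping directly to the 8-bit range. The function names are hypothetical.

```python
def saturate_signed(value: int, bits: int) -> int:
    # Clamp value to the range of a signed integer with the given width.
    lo = -(1 << (bits - 1))
    hi = (1 << (bits - 1)) - 1
    return max(lo, min(hi, value))


def saturate_i32_to_i8_in_two_steps(value: int) -> int:
    # First narrow to 16 bits, then to 8 bits, one halving per step.
    return saturate_signed(saturate_signed(value, 16), 8)
```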
2024-05-31 09:09:12 -05:00
David Truby
5c7f7cc4de
[flang] Fix exec.f90 test on LIT integrated shell (#93961)
The exec.f90 test sets an environment variable for a specific command
directly rather than using env, which doesn't work on shells that don't
support this syntax, most notably the LIT integrated shell. This patch
simply adds env so that this works on the integrated shell.
2024-05-31 15:03:44 +01:00
Simon Pilgrim
b52962d1b8 [X86] LowerVSELECT - split v16i16/v32i8 pre-AVX2 VSELECT ops if enough of the operands are free to split.
Often on AVX1 we're better off consistently using 128-bit instructions, so recognise when the operands are loads that can be freely/cheaply split - ideally this functionality needs to be moved to isFreeToSplitVector but we're using it in a few places where we don't want to split loads yet.

Based off a regression reported after #92794
2024-05-31 14:43:10 +01:00
Florian Hahn
654cd94629
[VPlan] Unconditionally run optimizeForVFAndUF.
Now that the VPlan for the main vector loop gets cloned in the epilogue
vectorization code path, optimizeForVFAndUF can be applied
unconditionally.
2024-05-31 06:32:49 -07:00
Nikita Popov
6ee845d240 [IR] Remove handling for removed ConstantExprs (NFC) 2024-05-31 15:03:22 +02:00
Simon Pilgrim
f0e8d003e5 [X86] widen_load-3.ll - add missing nounwind attributes 2024-05-31 13:59:49 +01:00
Elvina Yakubova
765ce86991
[BOLT][DOC] Add script for automatic user guide generation (#93822) 2024-05-31 13:50:51 +01:00
Nikita Popov
37ecd43335 [ExecutionEngine] Remove handling for removed ConstantExprs (NFCI)
These constant expressions no longer exist, so don't handle them.
2024-05-31 14:49:40 +02:00
Elvina Yakubova
23427b808c
[BOLT][NFC] Fix typo in DWARFRewriter.cpp (#93955) 2024-05-31 13:43:20 +01:00
Daniil Kovalev
7acd2c0652
[lld][ELF][AArch64] Support R_AARCH64_GOT_LD_PREL19 relocation (#89592)
With the tiny code model, the GOT slot contents can be loaded via `ldr x0,
:got:sym` which corresponds to `R_AARCH64_GOT_LD_PREL19` static
GOT-relative relocation.

See
https://github.com/ARM-software/abi-aa/blob/main/aaelf64/aaelf64.rst#static-aarch64-relocations
2024-05-31 14:57:58 +03:00
jeanPerier
f917c396c9
[flang] improve and rename Entity::hasNonDefaultLowerBounds (#93848)
Improve hasNonDefaultLowerBounds to follow box fir.convert. This helps
HLFIR helpers to generate less code when it can be easily deduced that
the fir.box lower bounds were set to ones.

It will help me for SELECT RANK lowering to avoid generating
hlfir.declare with lower bounds inside the RANK CASE (the current situation
would not be incorrect, since the lower bounds would be SSA values that end
up being one; I just want simpler IR).

Renamed to mayHaveNonDefaultLowerBounds since it may still answer yes when
the lower bounds are ones.
2024-05-31 13:41:11 +02:00
Adam Siemieniuk
8f4d5a32ac
[mlir][tensor] Fold unpadding collapse_shape into extract_slice (#93554) 2024-05-31 13:29:40 +02:00
Simon Pilgrim
189efb0fbb [X86] vselect-pcmp.ll - add tests showing poor codegen on AVX1 targets where we have to split/concat 128-bit subvectors
We'd be better off consistently using 128-bit instructions

Based off a regression reported after #92794
2024-05-31 12:29:01 +01:00
Sergey Kachkov
f34dedbf44
[LoopPeel] Support min/max intrinsics in loop peeling (#93162)
This patch adds processing of min/max intrinsics in LoopPeel in a
way similar to what was done for conditional statements: for
min/max(IterVal, BoundVal) we peel iterations where IterVal < BoundVal
for monotonically increasing IterVal; for monotonically decreasing
IterVal we peel iterations where IterVal > BoundVal (strict comparison
predicates are used to minimize number of peeled iterations).
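
The idea for the increasing case can be illustrated with a small Python model. This is a conceptual sketch, not the LLVM transformation itself: it assumes an induction variable counting up from 0 and a loop-invariant bound, and the function names are hypothetical.

```python
def original_loop(n: int, bound: int) -> list:
    # Every iteration evaluates min(i, bound).
    return [min(i, bound) for i in range(n)]


def peeled_loop(n: int, bound: int) -> list:
    out = []
    # Iterations with i < bound are peeled: there min(i, bound) == i.
    peel_count = min(max(bound, 0), n)
    for i in range(peel_count):
        out.append(i)
    # In the remaining iterations i >= bound, so min(i, bound) == bound
    # and the min computation is no longer needed.
    for i in range(peel_count, n):
        out.append(bound)
    return out
```

Because the comparison is strict, the iteration where `i == bound` stays in the remaining loop, which keeps the number of peeled iterations as small as possible.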
2024-05-31 13:58:10 +03:00
Endre Fülöp
46b3145b7c
[clang][analyzer][NFC] Add test for a limitation of alpha.unix.BlockInCriticalSection checker (#93799)
Updated the documentation in `checkers.rst` to include an example of how
the `trylock` function is handled.
Added a new test for a scenario where `pthread_mutex_trylock` is used,
demonstrating the current limitation.
2024-05-31 12:51:14 +02:00
Endre Fülöp
196dca7561
[clang][analyzer][NFC] Improve docs of alpha.unix.BlockInCriticalSection (#93812)
- Enhanced descriptions for blocking and critical section functions
- Added an additional code sample highlighting interleaved C and C++
style mutexes
2024-05-31 12:50:04 +02:00
Henry Linjamäki
a65771fce4
[SPIR-V] Prefer llvm-spirv-<LLVM_VERSION_MAJOR> tool (#77897)
Prefer using the `llvm-spirv-<LLVM_VERSION_MAJOR>` tool (e.g.
`llvm-spirv-18`) over plain `llvm-spirv`. If the versioned tool is not
found in PATH, fall back to the plain `llvm-spirv`.

An issue with using `llvm-spirv` is that the one found in PATH might
be compiled against an older LLVM version, which could lead to crashes or
obscure bugs. For example, `llvm-spirv` distributed by Ubuntu links
against a different LLVM version depending on the Ubuntu release (LLVM-10
in 20.04LTS, LLVM-13 in 22.04LTS).
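
The lookup order described above can be sketched in Python as follows. This is only an illustration of the fallback logic, not the driver code from the patch. The function name `find_llvm_spirv` and the injectable `which` parameter are assumptions made for the example; `shutil.which` is the standard library PATH lookup.

```python
import shutil
from typing import Callable, Optional


def find_llvm_spirv(
    llvm_major: int,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    # Try the versioned tool first, then fall back to the plain name.
    for name in (f"llvm-spirv-{llvm_major}", "llvm-spirv"):
        path = which(name)
        if path is not None:
            return path
    # Neither tool was found in PATH.
    return None
```

Passing `which` as a parameter makes the lookup order easy to test without touching the real PATH.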
2024-05-31 12:21:21 +02:00
Tom Eccles
c8fad4fb88
[flang][OpenMP][NFC] Reduce OMPMarkDeclareTarget boilerplate (#93797)
The pass constructor can be generated automatically by tablegen.

This pass does not need adapting to work with non-function top level
operations because it operates specifically on call operations inside of
an OpenMP declare target function.
2024-05-31 11:13:54 +01:00
Sergey Kachkov
60a890d855 [LoopPeel] Add pre-commit test for min/max intrinsics 2024-05-31 13:06:08 +03:00
Nikita Popov
de32a3df35 [Clang] Regenerate test checks (NFC)
To minimize diffs in an upcoming change.
2024-05-31 11:18:17 +02:00
Stanislav Mekhanoshin
2766a66fa7
[AMDGPU] Remove FlatVariant argument from isLegalFlatAddressingMode. NFC. (#93938)
This argument is easily deduced from the AS argument.
2024-05-31 01:58:12 -07:00
Guillaume Chatelet
48ba7da9c8
[libc][NFC] Allow compilation of memcpy with -m32 (#93790)
Needed to support i386 (#93709).
2024-05-31 10:48:38 +02:00
Jay Foad
b1be480b03 [DAGCombiner] Move CanReassociate down to first use. NFC. 2024-05-31 09:44:47 +01:00
Nikita Popov
51e459a561 Revert "[ConstantFold] Remove non-trivial gep-of-gep fold (#93823)"
This reverts commit e1cc9e4eaddcc295b4e775512e33b947b1514c17.

This causes some non-trivial text size increases in unoptimized
builds for Bullet. Revert while I investigate.
2024-05-31 10:37:32 +02:00
Théo Degioanni
b86a9c5bf2
[mlir][irdl] Lookup symbols near dialects instead of locally (#92819)
Because symbols cannot refer to operations outside of their symbol
tables, it was impossible to refer to operations outside of the dialect
currently being defined. This PR modifies the lookup logic to happen
relative to the symbol table containing the dialect-defining operations.
This is a bit of a hack but should unblock the situation here.
2024-05-31 09:15:50 +01:00
Yvan Roux
ae86278090
[Nomination] Add ST representative to Security group (#93176)
I'd like to nominate myself to join the LLVM Security group as a
representative of ST. I work in ST's compiler team contributing to
upstream (LLVM and GNU) and several downstream toolchains. We believe
that it is important for us to be part of this group to address or
report any potential security issues the LLVM project or our toolchains
may encounter.
2024-05-31 10:13:26 +02:00
Sander de Smalen
f484c79e7a
[AArch64] Avoid NEON ctpop in Streaming-SVE mode (#93826)
The NEON ctpop instruction is also used for scalars.
2024-05-31 09:01:17 +01:00
Matheus Izvekov
be566d2eac
[clang] AST Visitor: skip empty qualifiers in QualifiedTemplateName (#93926) 2024-05-31 04:31:31 -03:00
Nikita Popov
e1cc9e4ead
[ConstantFold] Remove non-trivial gep-of-gep fold (#93823)
This fold is subtly incorrect, because DL-unaware constant folding does
not know the correct index type to use, and just performs the addition
in the type that happens to already be there. This is incorrect, since
sext(X)+sext(Y) is generally not the same as sext(X+Y). See the
`@constexpr_gep_of_gep_with_narrow_type()` for a miscompile with the
current implementation.

One could try to restrict the fold to cases where no overflow occurs,
but I'm not bothering with that here, because the DL-aware constant
folding will take care of this anyway. I've only kept the
straightforward zero-index case, where we just concatenate two GEPs.
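
The sext mismatch can be seen with a small Python model of fixed-width arithmetic. This is a sketch that uses 8-bit values for readability; the helper names are hypothetical and are not LLVM APIs.

```python
def sext(value: int, from_bits: int) -> int:
    # Interpret the low from_bits bits of value as a signed integer.
    value &= (1 << from_bits) - 1
    if value & (1 << (from_bits - 1)):
        return value - (1 << from_bits)
    return value


def add_wrapping(x: int, y: int, bits: int) -> int:
    # Addition performed in a type that is only `bits` wide.
    return (x + y) & ((1 << bits) - 1)


x, y = 100, 100
# Extend first, then add in the wide type: no overflow.
extend_then_add = sext(x, 8) + sext(y, 8)
# Add in the narrow type first (overflows), then extend.
add_then_extend = sext(add_wrapping(x, y, 8), 8)
print(extend_then_add, add_then_extend)  # prints "200 -56"
```

The two results differ whenever the narrow addition overflows, which is the situation the removed fold could not rule out.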
2024-05-31 09:25:38 +02:00
Nikita Popov
63dc31b68b Reapply [IR] Avoid creating icmp/fcmp constant expressions (#92885)
Reapply after https://github.com/llvm/llvm-project/pull/93548,
which should address the lldb failure on macos.

-----

Do not create icmp/fcmp constant expressions in IRBuilder etc anymore,
i.e. treat them as "undesirable". This is in preparation for removing
them entirely.

Part of:
https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179
2024-05-31 08:55:59 +02:00
Fabian Ritter
0821b7937c
[AMDGPU] Copy Defs and Uses from Pseudo to Real Instructions (#93004)
Currently, the tablegen files that generate the instruction definitions
in lib/Target/AMDGPU/AMDGPUGenInstrInfo.inc often only include implicit
operands for the architecture-independent pseudo instructions, but not
for the corresponding real instructions. The missing implicit operands
(most prominently: the EXEC mask) do not affect code generation, since
that operates on pseudo instructions, but they are problematic when
working with real instructions, e.g., as a decoding result from the MC
layer.

This patch copies the implicit Defs and Uses from pseudo instructions to
the corresponding real instructions, so that implicit operands are also
defined for real instructions.

Addresses issue #89830.
2024-05-31 08:40:54 +02:00
jeanPerier
5228c2cbd6
[flang][FIR] add fir.is_assumed_size operation (#93853)
Assumed-rank fir.box/class may describe an assumed-size array. This case
needs special handling in SELECT RANK. It is not possible to generate
FIR code to detect that a fir.box is assumed-size (the way to detect
that is to check that the upper dimension extent is -1 in the descriptor).

Instead of emitting a runtime call directly in lowering, add an
operation that can later be lowered to a runtime call or inline code
when the descriptor layout is known.
2024-05-31 08:38:40 +02:00
jeanPerier
f49d26bc77
[flang][runtime] add IsAssumedSize API (#93857)
Needed for SELECT RANK implementation. I want to stay away from
generating the `rank > 0 && ...` logic in FIR codegen for now.
2024-05-31 08:37:23 +02:00
Cyndy Ishida
4985f25ffc
[IR] Fix IWYU violation (#93918)
GEPNoWrapFlags.h calls `assert`, creating an undeclared identifier error
when running an Apple-stage2 build with LLVM_ENABLE_MODULES enabled.

resolves: rdar://129031201
2024-05-30 23:19:46 -07:00
Nikita Popov
71ccd0d8cc
[IRInterpreter] Return zero address for missing weak function (#93548)
If a weak function is missing, still return its address (zero) rather
than failing interpretation. Otherwise we have a mismatch between
Interpret() and CanInterpret() resulting in failures that would not
occur with JIT execution.

Alternatively, we could try to look for weak symbols in CanInterpret()
and generally reject them there.

This is the root cause for the issue exposed by
https://github.com/llvm/llvm-project/pull/92885. Previously, the case
affected by that always fell back to JIT because an icmp constant
expression was used, which is not supported by the interpreter. Now a
normal icmp instruction is used, which is supported. However, we fail to
interpret due to incorrect handling of weak function addresses.
2024-05-31 08:18:35 +02:00
Malay Sanghi
089dfeee8a
[X86] Add support for MS inp functions. (#93804)
Support _inp, _inpw, _inpd.
These functions were removed from the Windows runtime library, but are
still supported for kernel mode development.
2024-05-31 14:11:33 +08:00
Zixu Wang
1fa073ab89
[MachO] Stop parsing past end of rebase/bind table (#93897)
`MachORebaseEntry::moveNext()` and `MachOBindEntry::moveNext()` assume
that the rebase/bind table ends with `{REBASE|BIND}_OPCODE_DONE` or an
actual rebase/bind. However, a valid rebase/bind table might also end
with other effectively no-op opcodes, which caused the parser to move
past the end and go into the next table, resulting in corrupted entries
or infinite loops.
2024-05-30 23:08:01 -07:00