538174 Commits

Author SHA1 Message Date
Aiden Grossman
de3e8fff20 [NFC][CI] Reformat python files
Looks like some of these were not properly formatted at some point. This
patch reformats these files so that future diffs are cleaner when
running the formatter over the whole file.
2025-05-20 21:52:33 +00:00
Daniel Paoliello
a414877a7a
[x64][win] Add compiler support for x64 import call optimization (equivalent to MSVC /d2guardretpoline) (#126631)
This is the x64 equivalent of #121516

Since import call optimization was originally [added to x64 Windows to
implement a more efficient retpoline
mitigation](https://techcommunity.microsoft.com/blog/windowsosplatform/mitigating-spectre-variant-2-with-retpoline-on-windows/295618)
the section and constant names relating to this all mention "retpoline"
and we need to mark indirect calls, control-flow guard calls and jumps
for jump tables in the section alongside calls to imported functions.

As with the AArch64 feature, this emits a new section into the obj which
is used by the MSVC linker to generate the Dynamic Value Relocation
Table and the section itself does not appear in the final binary.

The Windows Loader requires a specific sequence of instructions be
emitted when this feature is enabled:
* Indirect calls/jumps must have the function pointer to jump to in
`rax`.
* Calls to imported functions must use the `rex` prefix and be followed
by a 5-byte nop.
* Indirect calls must be followed by a 3-byte nop.
2025-05-20 14:48:41 -07:00
Aiden Grossman
a690852b29
[llvm-exegesis] Error instead of aborting on verification failure (#137581)
This patch makes llvm-exegesis emit an error when the machine function
fails in MachineVerification rather than aborting. This allows
downstream users (particularly https://github.com/google/gematria) to
handle these errors rather than having the entire process crash. This
should essentially be NFC from the user perspective, minus the addition
of the new error message.
2025-05-20 14:48:17 -07:00
Andrew Rogers
98595cfd6f
[llvm] prepare explicit template instantiations in llvm/CodeGen for DLL export annotations (#140653)
## Purpose

This patch prepares the llvm/CodeGen library for public interface
annotations in support of an LLVM Windows DLL (shared library) build,
tracked in #109483. The purpose of this patch is to make the upcoming
codemod of this library more straight-forward. It is not expected to
impact any functionality.

The `LLVM_ABI` annotations will be added in a subsequent patch. These
changes are required to build with visibility annotations using Clang
and gcc on Linux/Darwin/etc; Windows DLL can build fine without them.

## Overview
This PR does four things in preparation for adding `LLVM_ABI`
annotations to llvm/CodeGen:
1. Explicitly include `Machine.h` and `Function.h` headers from
`MachinePassManager.cpp` so that `Function` and `Machine` types are
available for the instantiations of `InnerAnalysisManagerProxy`. Without
this change, Clang will only export one of the templates after
visibility annotations are added to them. Unclear if this is a Clang bug
or expected behavior, but this change avoids the issue and should be
harmless.
2. Refactor the definition of `MachineFunctionAnalysisManager` to its
own header file. Without this change, it is not possible to add
visibility annotations to the declaration without causing gcc to produce
`-Wattribute` warnings.
3. Remove the redundant specialization of the
`DominatorTreeBase<MachineBasicBlock, false>::addRoot` method. The
specialization is the same as implemented in `DominatorTreeBase` so
should be unnecessary. Without this change, it is not possible to
annotate the subsequent instantiations of `DominatorTreeBase` in the
header file without gcc producing `-Wattribute` warnings. Mark
unspecialized `addRoot` as `inline` to match the removed specialized
version.
4. Move the explicit instantiations of the `GenericDomTreeUpdater`
template earlier in the header file. These need to appear before being
used in the `MachineDomTreeUpdater` class definition or gcc will produce
warnings once visibility annotations are added.

## Background

The LLVM Windows DLL effort is tracked in #109483. Additional context is
provided in [this
discourse](https://discourse.llvm.org/t/psa-annotating-llvm-public-interface/85307).

Clang and gcc handle visibility attributes on explicit template
instantiations a bit differently; gcc is pickier and generates
`-Wattribute` warnings when an explicit instantiation with a visibility
annotation appears after the type has already appeared in the
translation unit. These warnings can be avoided by moving explicit
template instantiations so they always appear first.
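
As a minimal sketch of the ordering described above (not code from this
patch): `DEMO_ABI`, `Wrapper`, and `unwrap` are hypothetical names, and
the macro only assumes a GCC- or Clang-compatible compiler. The
annotated explicit instantiation declaration appears before the first
use of `Wrapper<int>` in the translation unit.

```cpp
#include <cassert>

// Hypothetical stand-in for a visibility macro such as LLVM_ABI.
// Assumption: GCC/Clang attribute syntax; expands to nothing elsewhere.
#if defined(__GNUC__)
#define DEMO_ABI __attribute__((visibility("default")))
#else
#define DEMO_ABI
#endif

// A small class template standing in for a library template.
template <typename T> class Wrapper {
public:
  explicit Wrapper(T V) : Value(V) {}
  T get() const { return Value; }

private:
  T Value;
};

// Annotated explicit instantiation declaration, placed before any other
// mention of Wrapper<int> in this translation unit.
extern template class DEMO_ABI Wrapper<int>;

// First use of Wrapper<int>; it comes after the annotated declaration.
inline int unwrap(const Wrapper<int> &W) { return W.get(); }

// Explicit instantiation definition (this would normally live in a .cpp).
template class Wrapper<int>;

// Tiny usage check for the sketch.
inline void checkUnwrap() { assert(unwrap(Wrapper<int>(7)) == 7); }
```

Moving the `extern template` line below `unwrap` is the kind of ordering
that, per the description above, gcc is pickier about.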

## Validation

Local builds and tests to validate cross-platform compatibility. This
included llvm, clang, and lldb on the following configurations:

- Windows with MSVC
- Windows with Clang
- Linux with GCC
- Linux with Clang
- Darwin with Clang
2025-05-20 14:40:20 -07:00
Kazu Hirata
e25abd0d54
[bugpoint] Use a range-based for loop (NFC) (#140743) 2025-05-20 14:34:44 -07:00
Kazu Hirata
cbac2a9241
[llvm] Use llvm::is_contained (NFC) (#140742) 2025-05-20 14:34:16 -07:00
Peiyong Lin
04ad8d4900
Emit inbounds and nuw attributes in memref. (#138984)
Now that MLIR accepts nuw and nusw in getelementptr, this patch emits
the inbounds and nuw attributes when lowering memref to LLVM in load and
store operators.

This patch also strengthens the memref.load and memref.store spec about
undefined behaviour during lowering.

This patch also lifts the |rewriter| parameter in getStridedElementPtr
ahead so that LLVM::GEPNoWrapFlags can be added at the end with a
default value and grouped together with other operators' parameters.

Signed-off-by: Lin, Peiyong <linpyong@gmail.com>
2025-05-20 14:16:22 -07:00
Arthur Eubanks
11db1285e4 [gn build] Manually port 8f03e1a 2025-05-20 21:09:37 +00:00
Aaron Puchert
317c932622
Suppress errors from well-formed-testing type traits in SFINAE contexts (#135390)
There are several type traits that produce a boolean value or type based
on the well-formedness of some expression (more precisely, the immediate
context, i.e. for example excluding nested template instantiation):
* `__is_constructible` and variants,
* `__is_convertible` and variants,
* `__is_assignable` and variants,
* `__reference_{binds_to,{constructs,converts}_from}_temporary`,
* `__is_trivially_equality_comparable`,
* `__builtin_common_type`.

(It should be noted that the standard doesn't always base this on the
immediate context being well-formed: for `std::common_type` it's based
on whether some expression "denotes a valid type." But I assume that's
an editorial issue and means the same thing.)

Errors in the immediate context are suppressed, instead the type traits
return another value or produce a different type if the expression is
not well-formed. This is achieved using an `SFINAETrap` with
`AccessCheckingSFINAE` set to true. If the type trait is used outside of
an SFINAE context, errors are discarded because in that case the
`SFINAETrap` sets `InNonInstantiationSFINAEContext`, which makes
`isSFINAEContext` return an `optional(nullptr)`, which causes the errors
to be discarded in `EmitDiagnostic`. However, in an SFINAE context this
doesn't happen, and errors are added to `SuppressedDiagnostics` in the
`TemplateDeductionInfo` returned by `isSFINAEContext`. Once we're done
with deducing template arguments and have decided which template is
going to be instantiated, the errors corresponding to the chosen
template are then emitted. At this point we get errors from those type
traits that we wouldn't have seen if used with the same arguments
outside of an SFINAE context. That doesn't seem right.
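
For readers less familiar with the setup, here is a minimal sketch of a
well-formedness-testing trait evaluated inside an SFINAE context. It
uses the standard `std::is_constructible` rather than the builtins
listed above, and `NeedsString` and `canBuild` are hypothetical names.
It illustrates the context only and does not reproduce the bug.

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical type that can only be built from a std::string.
struct NeedsString {
  explicit NeedsString(const std::string &) {}
};

// Preferred overload: viable only if T is constructible from Arg. The
// trait is evaluated during template argument deduction, i.e. in an
// SFINAE context, so an ill-formed construction removes this overload
// instead of producing an error.
template <typename T, typename Arg,
          typename std::enable_if<std::is_constructible<T, Arg>::value,
                                  int>::type = 0>
bool canBuild(int) {
  return true;
}

// Fallback overload: chosen when the one above is removed by SFINAE.
template <typename T, typename Arg> bool canBuild(long) { return false; }

// Tiny usage check; the extra parentheses keep the template commas from
// being parsed as separate macro arguments.
inline void checkCanBuild() {
  assert((canBuild<NeedsString, std::string>(0)));
  assert((!canBuild<NeedsString, int *>(0)));
}
```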

So what we want to do is always set `InNonInstantiationSFINAEContext`
when evaluating these well-formed-testing type traits, regardless of
whether we're in an SFINAE context or not. This should only affect the
immediate context, as nested contexts add a new `CodeSynthesisContext`
that resets `InNonInstantiationSFINAEContext` for the time it's active.

Going through uses of `SFINAETrap` with `AccessCheckingSFINAE` = `true`,
it occurred to me that all of them want this behavior and we can just
use this parameter to decide whether to use a non-instantiation context.
The uses are precisely the type traits mentioned above plus the
`TentativeAnalysisScope`, where I think it is also fine. (Though I think
we don't do tentative analysis in SFINAE contexts anyway.)

Because the parameter no longer just sets `AccessCheckingSFINAE` in Sema
but also `InNonInstantiationSFINAEContext`, I think it should be renamed
(along with uses, which also point the reviewer to the affected places).
Since we're testing for validity of some expression, `ForValidityCheck`
seems to be a good name.

The added tests should more or less correspond to the users of
`SFINAETrap` with `AccessCheckingSFINAE` = `true`. I added a test for
errors outside of the immediate context for only one type trait, because
it requires some setup and is relatively noisy.

We put the `ForValidityCheck` condition first because it's constant in
all uses and this would then allow the compiler to prune the call to
`isSFINAEContext` when true.

Fixes #132044.
2025-05-20 23:02:51 +02:00
Igor Kudrin
3f196e0293
[lldb][core] Fix getting summary of a variable pointing to r/o memory (#139196)
Motivation example:

```
> lldb -c altmain2.core
...
(lldb) var F
(const char *) F = 0x0804a000 ""
```

The variable `F` points to a read-only memory page not dumped to the
core file, so `Process::ReadMemory()` cannot read the data. The patch
switches to `Target::ReadMemory()`, which can read data both from the
process memory and the application binary.
2025-05-20 13:50:24 -07:00
Matt Arsenault
5aa3171f2c
AMDGPU: Add regression test for multiple frame index lowering (#140784)
Failures appeared after https://github.com/llvm/llvm-project/pull/140587 but this case wasn't covered
2025-05-20 22:37:43 +02:00
Jeremy Morse
26d9cb17a6
[MC][DebugInfo] Emit linetable entries with known offsets immediately (#134677)
DWARF linetable entries are usually emitted as a sequence of
MCDwarfLineAddrFragment fragments containing the line-number difference
and an MCExpr describing the instruction-range the linetable entry
covers. These then get relaxed during assembly emission.

However, a large number of these instruction-range expressions are
ranges within a fixed MCDataFragment, i.e. a range over fixed-size
instructions that are not subject to relaxation at a later stage. Thus,
we can compute the address-delta immediately, and not spend time and
memory describing that computation so it can be deferred.
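
A rough sketch of that decision, under stated assumptions: `LabelPos`
and `tryComputeAddrDelta` are hypothetical and are not the MC layer's
actual types or functions. The delta is returned right away only when
both labels sit in the same fixed-size fragment; otherwise the caller
would fall back to the deferred path.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a label position inside an assembler fragment.
struct LabelPos {
  int FragmentId;           // which fragment the label lives in
  bool InFixedSizeFragment; // true if the fragment is not subject to relaxation
  uint64_t Offset;          // byte offset of the label within the fragment
};

// Returns true and sets Delta when the address delta is already known.
// Returns false when the computation has to be deferred.
bool tryComputeAddrDelta(const LabelPos &From, const LabelPos &To,
                         uint64_t &Delta) {
  if (From.FragmentId != To.FragmentId)
    return false; // Different fragments: layout is not known yet.
  // Labels in the same fragment share that fragment's kind.
  assert(From.InFixedSizeFragment == To.InFixedSizeFragment);
  if (!From.InFixedSizeFragment)
    return false; // Sizes may still change during relaxation.
  if (To.Offset < From.Offset)
    return false; // Not a forward range; leave it to the general path.
  Delta = To.Offset - From.Offset;
  return true;
}
```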
2025-05-20 21:26:56 +01:00
Florian Hahn
705e27c234
[LoopPeel] Add tests for peeling from end with variable trip counts.
Add more test coverage for peeling the last iteration with variable trip
counts. Separate test cases for constant and variable trip counts in
different files.
2025-05-20 21:07:21 +01:00
Ebuka Ezike
1b6b036c02
[lldb][docs] add command to save core file in gdb to lldb command map. (#140771) 2025-05-20 20:57:51 +01:00
Anthony Cabrera-Lara
be5b4fad29
Update InterpreterProperties.td (#140746)
Fix typo in interpreter property description.

Fixes #140708
2025-05-20 12:55:59 -07:00
Philip Reames
8708c42e31 [RISCV] Add zvqdotq tests using partial.reduce.add [nfc] 2025-05-20 11:48:36 -07:00
Timm Baeder
9260d310f1
[clang][bytecode][NFC] Remove Frame.cpp (#140750)
The file was basically empty. The actual implementation for function
frames of the two interpreters live in their own respective files.
2025-05-20 20:41:32 +02:00
Aaron Ballman
c555c8d554
[C] Do not diagnose flexible array members with -Wdefault-const-init-field-unsafe (#140578)
This addresses post-commit review feedback from someone who discovered
that we diagnosed code like the following:
```
  struct S {
    int len;
    const char fam[];
  } s;
```
despite it being invalid to initialize the flexible array member.

Note, this applies to flexible array members and zero-sized arrays at
the end of a structure (an old-style flexible array member), but it does
not apply to one-sized arrays at the end of a structure because those do
occupy storage that can be initialized.
2025-05-20 14:40:12 -04:00
Philip Reames
0ccd57e289 [RISCV] Add basic coverage of vector.partial.reduce.add [nfc] 2025-05-20 11:31:46 -07:00
Dan Blackwell
4964d98057
[compiler-rt] Replace deprecated os_trace calls on mac (#138908)
Currently there are deprecation warnings suppressed for `os_trace`; this
patch replaces all uses with `os_log_error`.

rdar://140295247
2025-05-20 11:31:40 -07:00
Jan Patrick Lehr
b99e57583e
Revert "[mlir] [XeGPU] Add XeGPU workgroup to subgroup pass (#139477)" (#140779)
This reverts commit 747620db2a02b889ae3ba3921d6c0e526a3e7677.

Multiple bot failures
2025-05-20 20:31:00 +02:00
Kazu Hirata
611f47c46c [flang] Fix a warning
This patch fixes:

  flang/lib/Optimizer/HLFIR/Transforms/LowerHLFIROrderedAssignments.cpp:1377:10:
  error: variable 'isValid' set but not used
  [-Werror,-Wunused-but-set-variable]
2025-05-20 11:24:15 -07:00
Arthur Eubanks
38250ed3b2 [gn build] Manually port a9ee8e4a
Can make these into gn args later if needed.
2025-05-20 18:22:04 +00:00
Valentin Clement (バレンタイン クレメン)
c17ae161fd
[flang][cuda] Use nullptr for comparison (#140767)
Comparison without explicit nullptr seems to bring false positives. Use
explicit nullptr.
2025-05-20 11:04:06 -07:00
Abhina Sree
a9ee8e4a45
Create a EncodingConverter class with both iconv and icu support. (#138893)
This patch adds a wrapper class called EncodingConverter for
ConverterEBCDIC. This class is then extended to support the ICU library
or iconv library. The ICU library currently takes priority over the
iconv library.

Relevant RFCs:

https://discourse.llvm.org/t/rfc-adding-a-charset-converter-to-the-llvm-support-library/69795

https://discourse.llvm.org/t/rfc-enabling-fexec-charset-support-to-llvm-and-clang-reposting/71512

Stacked PR to enable fexec-charset that depends on this:
https://github.com/llvm/llvm-project/pull/138895

See old PR for review and commit history:
https://github.com/llvm/llvm-project/pull/74516
2025-05-20 14:02:22 -04:00
Andy Kaylor
cbcfe667bb
[CIR] Upstream support for iterator-based range for loops (#140636)
This change adds handling for C++ member operator calls, implicit no-op
casts, and l-value call expressions. Together, these changes enable
handling of range for loops based on iterators.
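
To illustrate why those pieces are needed, here is a small sketch of an
iterator-based range for loop next to its approximate desugaring.
`Countdown`, `sumRangeFor`, and `sumDesugared` are hypothetical and are
not taken from the CIR tests. The desugared form calls the member
`operator!=`, `operator++`, and `operator*`.

```cpp
#include <cassert>

// Hypothetical range that counts down from Start to 1.
class Countdown {
  int Start;

public:
  explicit Countdown(int S) : Start(S) { assert(S >= 0); }

  class Iterator {
    int Cur;

  public:
    explicit Iterator(int C) : Cur(C) {}
    int operator*() const { return Cur; }
    Iterator &operator++() {
      --Cur;
      return *this;
    }
    bool operator!=(const Iterator &Other) const { return Cur != Other.Cur; }
  };

  Iterator begin() const { return Iterator(Start); }
  Iterator end() const { return Iterator(0); }
};

// The range for loop as written in source.
int sumRangeFor(const Countdown &C) {
  int Sum = 0;
  for (int V : C)
    Sum += V;
  return Sum;
}

// Roughly what the loop above expands to: member operator calls on iterators.
int sumDesugared(const Countdown &C) {
  int Sum = 0;
  auto &&Range = C;
  auto Begin = Range.begin();
  auto End = Range.end();
  for (; Begin != End; ++Begin) {
    int V = *Begin;
    Sum += V;
  }
  return Sum;
}
```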
2025-05-20 10:52:15 -07:00
Justin Cady
0931874b21
[Coverage] Add testing to validate code coverage for exceptions (#133463)
While investigating an issue with code coverage reporting around
exceptions it was useful to have a baseline of what works today.

This change adds end-to-end testing to validate code coverage behavior
that is currently working with regards to exception handling.
2025-05-20 13:43:32 -04:00
Maksim Panchenko
51e222ef48
[BOLT][AArch64] Fix crash for conditional tail calls (#140669)
When a conditional tail call is located in old code while BOLT is
operating in lite mode, the call will require an optional pending
relocation with a type that is currently not supported, resulting in a
build-time crash.

Before a proper fix is implemented, ignore conditional tail calls for
relocation purposes and mark their target functions to be patched, i.e.
to be served as veneers/thunks.
2025-05-20 10:38:00 -07:00
Nishant Patel
747620db2a
[mlir] [XeGPU] Add XeGPU workgroup to subgroup pass (#139477)
This PR adds the XeGPU workgroup (wg) to subgroup (sg) pass. The wg to
sg pass transforms the xegpu wg level operations to subgroup operations
based on the sg_layout and sg_data attribute. The PR adds transformation
patterns for the following ops:

1. CreateNdDesc
2. LoadNd
3. StoreNd
4. PrefetchNd
5. UpdateNdOffset
6. Dpas
2025-05-20 12:35:50 -05:00
Sarah Spall
5999988af8
[HLSL] Move where ZExt happens in 'EmitStoreThroughExtVectorComponentLValue' to handle bug with hlsl boolean vector swizzles (#140627)
In 'EmitStoreThroughExtVectorComponentLValue', move the code which ZExts
in the case the Destination Scalar Type is larger than the Source Scalar
Type, to the top of the function, to ensure each condition is handled.

The previous code missed this case:
```
bool4 b = true.xxxx;
b.xyz = false.xxx;
```
Leading to a bad shuffle vector. 

Closes #140564
2025-05-20 10:27:34 -07:00
Sarah Spall
2a1af502d4
[DirectX] scalarize the dx.isinf intrinsic (#140638)
The DXIL IsInf op only takes scalars.
Closes #140577
2025-05-20 10:26:58 -07:00
Craig Topper
0cf6b4f5ee
[Docs][RISCV] Move Zilsd to 'Supported' status. NFC (#140757) 2025-05-20 10:23:13 -07:00
David Green
47b89fb412
[AArch64] Use i32 extract from UADDV in popcount lowering. (#140718)
We need the top bits to be zeroes, but a v8i8->i32 EXTRACT_VECTOR_ELT will
anyext into the top bits. The instruction we create (UADDV) is known to be
zeroes in the upper bits, so we can convert to a larger v2i32 vector and
extract from there, similar to the operation currently performed for i64
types.
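
A scalar model of the computation may help here. It is a sketch only,
not the lowering itself, and `popcount8` and `popcount64ViaByteSum` are
hypothetical helpers. Each of the eight bytes is counted separately and
the counts are then summed, which is the role UADDV plays. The sum is at
most 64, so the upper bits of the wider result are zero.

```cpp
#include <cassert>
#include <cstdint>

// Population count of a single byte (models one lane of the v8i8 count).
static unsigned popcount8(uint8_t Byte) {
  unsigned Count = 0;
  while (Byte) {
    Count += Byte & 1u;
    Byte >>= 1;
  }
  return Count;
}

// Models the horizontal add: sum the eight per-byte counts into a
// zero-initialized 32-bit accumulator.
uint32_t popcount64ViaByteSum(uint64_t Value) {
  uint32_t Sum = 0;
  for (int Lane = 0; Lane < 8; ++Lane)
    Sum += popcount8(static_cast<uint8_t>(Value >> (8 * Lane)));
  assert(Sum <= 64 && "upper bits of the result are always zero");
  return Sum;
}
```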

Fixes #140707
2025-05-20 18:09:18 +01:00
Aaron Ballman
6fb23afb8d
[C] Do not diagnose unions with -Wdefault-const-init (#140725)
A default-initialized union with a const member is generally reasonable
in C and isn't necessarily incompatible with C++, so we now silence the
diagnostic in that case. However, we do still diagnose a const-
qualified, default-initialized union as that is incompatible with C++.
2025-05-20 13:04:24 -04:00
Fangrui Song
a1e314d10d [test] Add lit.local.cfg after #140471 2025-05-20 09:51:13 -07:00
Dave Lee
ff127624be
[lldb] Reduce max-children-count default to readable size (#139826)
Change the default from 256 to 24. The argument is that 256 is too large to be scanned
by eye, and too large to print in a terminal which can be only 40-50 lines in height.

When all children must be shown, `frame variable` and `expression` both support the `-A`
(`--show-all-children`) flag.

rdar://145327522
2025-05-20 09:34:42 -07:00
Brox Chen
7e9d9dba9c
[AMDGPU][True16][CodeGen] update test fmax3/fmin3 test with true16 mode (#140752)
This is an NFC patch.

This patch duplicates the GFX11plus run lines and applies them with
"+mattr=+real-true16" and "+mattr=-real-true16" on the fmax3/fmin3 tests,
and puts '-real-true16' on the gisel run line. It then updates the tests
with the update script.
2025-05-20 12:33:41 -04:00
Min-Yih Hsu
b3c3297c1a [RISCV] Fix missing WriteRes for Q extensions in SiFiveP800 scheduling model 2025-05-20 09:24:48 -07:00
Slava Zakharin
54aa9282ed
[flang] Undo the effects of CSE for hlfir.exactly_once. (#140190)
CSE may delete operations from hlfir.exactly_once and reuse
the equivalent results from the parent region(s), e.g. from the parent
hlfir.region_assign. This makes it problematic to clone
hlfir.exactly_once before the top-level hlfir.where.
This patch adds a "canonicalizer" that pulls in such operations
back into hlfir.exactly_once.
2025-05-20 09:22:05 -07:00
Min-Yih Hsu
b92b548168
[RISCV] Add scheduling model for SiFive P800 processors (#139316)
The scheduling model for SiFive P800 series cores. They have 6 integer
pipes, 2 floating point pipes, and 2 vector pipes.

https://chipsandcheese.com/p/hot-chips-2023-sifives-p870-takes-risc-v-further

The tests are meant to have the same coverage as its P600 counterpart.
2025-05-20 09:13:08 -07:00
CarolineConcatto
17e293d5b8
[LLVM][AArch64]CFINV - Add UNPREDICTABLE behaviour if CRm is not zero (#140593)
Now CFINV follows AXFLAGS behaviour for CRm.

It looks like (0) in the instruction encoding means that the behaviour
is UNPREDICTABLE if that bit is not zero.
2025-05-20 17:11:11 +01:00
erichkeane
e8dff7bea4 [OpenACC] Fix location of array-section diagnostic.
In a sub-subscript of an array-section, it is actually an array section.
So make sure we get the location correct when there isn't a 'colon' to
look at.
2025-05-20 09:04:32 -07:00
Craig Topper
4a0ae4f504
[RISCV] Add LD_RV32/SD_RV32 to a few more functions in RISCVInstrInfo. (#140640)
isLoadFromStackSlot/isStoreToStackSlot/getMemOperandsWithOffsetWidth

The first two probably require spills/reloads which we don't use
LD_RV32/SD_RV32 for yet.

I think getMemOperandsWithOffsetWidth is mainly used for load/store
clustering. I think we can assume this just works.
2025-05-20 09:01:03 -07:00
Fangrui Song
95e4db8fa7
[llvm-objdump] --adjust-vma: Call getInstruction with adjusted address
llvm-objdump currently calls MCDisassembler::getInstruction with
unadjusted address and MCInstPrinter::printInst with adjusted address.
The decoded branch targets will be adjusted as expected for most targets
(as the getInstruction address is insignificant) but not for SystemZ
(where the getInstruction address is displayed).

Specify an adjusted address to fix SystemZInstPrinter output.

The added test utilizes llvm/utils/update_test_body.py to make updates
easier and additionally checks that we don't adjust SHN_ABS symbol
addresses.

Pull Request: https://github.com/llvm/llvm-project/pull/140471
2025-05-20 08:54:53 -07:00
Kazu Hirata
ad80f73631 [X86] Fix a warning
This patch fixes:

  llvm/lib/Target/X86/X86ISelLowering.cpp:39622:12: error: explicitly
  assigning value of variable of type 'SDValue' to itself
  [-Werror,-Wself-assign-overloaded]
2025-05-20 08:37:48 -07:00
erichkeane
138a899fe0 [OpenACC][CIR] Implement simple 'copy' lowering for combined constructs
These are identical in IR to the 'compute' constructs, but require a
little additional work since we have 2 operations to work around, not
just 1. Note that the test is nearly identical to the compute version,
except that the combined 'tag's are present, plus the 'loop' construct.
2025-05-20 08:37:30 -07:00
Alexey Bataev
2318491432 [SLP][NFC]Do the analysis first and then actual codegen, NFC 2025-05-20 08:12:53 -07:00
Simon Pilgrim
09fd8f0093
[X86] matchBinaryPermuteShuffle - match AVX512 "cross lane" SHLDQ/SRLDQ style patterns using VALIGN (#140538)
Very similar to what we do in lowerShuffleAsVALIGN

I've updated isTargetShuffleEquivalent to correctly handle SM_SentinelZero in the expected shuffle mask, but it only allows an exact match (or the test mask was undef) - it can't be used to match zero elements with MaskedVectorIsZero.

Noticed while working on #140516
2025-05-20 16:07:56 +01:00
Simon Pilgrim
621a5a976e
[X86] combineAdd - use SDPatternMatch to simplify "(add (zext (vXi1 X)), Y) -> (sub Y, (sext (vXi1 X)))" matching. (#140731) 2025-05-20 15:59:56 +01:00
Prabhu Rajasekaran
1a9377bef3
[clang][analysis] Thread Safety Analysis: Handle parenthesis (#140656) 2025-05-20 07:45:14 -07:00