544943 Commits

Author SHA1 Message Date
Craig Topper
9ba45c5c5e
[RISCV] Move RISCVDAGToDAGISel::SelectAddrRegRegScale definition later. NFC
This function was placed between some static functions and their
callers. Reorder to keep the related code together.
2025-07-14 21:12:10 -07:00
Sam Elliott
3faaa5cdb0
[RISCV] Fix QC.E.LI -> C.LI with Bare Symbol Compression (#146763)
There's a long comment explaining this approach in RISCVInstrInfoXqci.td

This change also fixes some problems when fixups are able to be resolved for `qc.e.li` and `qc.li`.
2025-07-14 21:00:38 -07:00
Craig Topper
4923313727
[RISCV] Fix typo in comment. NFC (#148754)
'unsigned' was misspelled, but it seemed easier to write uimm9 than to
spell it out.
2025-07-14 20:56:07 -07:00
Craig Topper
31944ac45b
[RISCV] Render P-ext simm10_unsigned as a simm10 after parsing. (#148749)
Instead of allowing a parsed MCInst to have either a uimm10 or simm10,
always render as simm10. This avoids a mismatch between parsed MCInst
and disassembled MCInst when a uimm10 value is used.
2025-07-14 20:55:10 -07:00
Craig Topper
3265a36c55
[RISCV] Refactor RISCVDAGToDAGISel::selectSimm5Shl2. NFC (#148731)
Return from the for loop body instead of using a break and checking the
shift amount after.
2025-07-14 20:54:06 -07:00
Craig Topper
eea5c291bb
[DAGCombiner] Pass SDNodeFlags to getNode instead of modifying nodes. (#148744)
getNode has logic to intersect flags correctly if the new node happens
to CSE with an existing node. Setting node flags after getNode bypasses
this logic and may change the node for other uses where the flags don't
hold.
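
The flag intersection described above can be sketched with a toy node pool. This is only an illustration: `Flag`, `Node`, and `NodePool` are hypothetical stand-ins, not the real SDNodeFlags, SDNode, or SelectionDAG types.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <tuple>

// Hypothetical stand-ins for SDNodeFlags and SDNode; not LLVM's real types.
enum Flag : std::uint8_t { NoUnsignedWrap = 1, NoSignedWrap = 2 };

struct Node {
  int Opcode;
  int LHS;
  int RHS;
  std::uint8_t FlagBits;
};

class NodePool {
  std::map<std::tuple<int, int, int>, Node> Nodes;

public:
  // Create a node, or reuse (CSE) an identical existing one. On reuse, keep
  // only the flags that hold for both the existing and the requested node.
  Node &getNode(int Opcode, int LHS, int RHS, std::uint8_t FlagBits) {
    assert((FlagBits & ~(NoUnsignedWrap | NoSignedWrap)) == 0 &&
           "unknown flag bit");
    const auto Key = std::make_tuple(Opcode, LHS, RHS);
    auto It = Nodes.find(Key);
    if (It == Nodes.end())
      return Nodes.emplace(Key, Node{Opcode, LHS, RHS, FlagBits}).first->second;
    It->second.FlagBits &= FlagBits;
    return It->second;
  }
};
```

Writing `FlagBits` on the returned node after the call would skip the intersection step and could leave a flag set that does not hold for the other users of the shared node.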
2025-07-14 20:53:14 -07:00
Craig Topper
9a9db2a39c
[RISCV] Prefix mcpu/mtune/march/mabi with '-' in comments. NFC (#148723) 2025-07-14 20:52:46 -07:00
Matt Arsenault
25b00c033c
AArch64: Fix asserting on unexpected triples (#147880) 2025-07-15 12:47:55 +09:00
Florian Mayer
be200e2b80
[SelectionDAG] improve error message for invalid op bundles (#148722) 2025-07-14 20:41:10 -07:00
Brian Cain
d2bcc51a5a
[LLD] Merge .hexagon.attributes sections (#148098)
Merge the attributes of object files being linked together. The
`.hexagon.attributes` section can be used by loaders and analysis tools.
This is similar to the .riscv.attributes, introduced in
8a900f2438b4a167b98404565ad4da2645cc9330 /
https://reviews.llvm.org/D138550.
2025-07-14 22:36:05 -05:00
Jim Lin
96148f9214
[RISCV] Use cond_code instead for PseudoCCNDS_BFOS and PseudoCCNDS_BFOZ. 2025-07-15 11:19:09 +08:00
Trevor Gross
10b5558b61
[X86] Update the fp128/i128 test to show stack alignment (NFC) (#148753)
Adding an extra argument before a `fp128` only changes the stack offset
by four bytes, while it should instead go in the next 16-aligned slot.
Add a test demonstrating the current behavior.

`no_x86_scrub_sp` is added because offset from the stack pointer is
needed to show the problem.

Relevant issue: https://github.com/llvm/llvm-project/issues/77401
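
For reference, the "next 16-aligned slot" rule mentioned above amounts to rounding the running stack offset up to a multiple of 16. A minimal sketch, assuming a hypothetical helper `alignUp` (this is not the X86 backend's code):

```cpp
#include <cassert>
#include <cstdint>

// Round Offset up to the next multiple of Align.
// Align must be a non-zero power of two.
inline std::uint64_t alignUp(std::uint64_t Offset, std::uint64_t Align) {
  assert(Align != 0 && (Align & (Align - 1)) == 0 && "Align must be 2^n");
  return (Offset + Align - 1) & ~(Align - 1);
}
```

With one 4-byte argument already on the stack, `alignUp(4, 16)` gives 16 as the expected offset for the `fp128`, rather than 4.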
2025-07-15 11:14:25 +08:00
tedwoodward
eb6da944af
[lldb] Improve disassembly of unknown instructions (#145793)
LLDB uses the LLVM disassembler to determine the size of instructions and
to do the actual disassembly. Currently, if the LLVM disassembler can't
disassemble an instruction, LLDB will ignore the instruction size, assume
the instruction size is the minimum size for that device, print no useful
opcode, and print nothing for the instruction.

This patch changes this behavior to separate the instruction size and
"can't disassemble". If the LLVM disassembler knows the size, but can't
disassemble the instruction, LLDB will use that size. It will print out
the opcode, and will print "<unknown>" for the instruction. This is much
more useful to both a user and a script.
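
As an illustration of how a size can be known without a full decode: the base RISC-V instruction-length encoding puts the length in the low bits of the first 16-bit parcel. The sketch below follows that encoding only; `riscvInstructionSize` is a hypothetical helper and is not the code LLDB or the LLVM disassembler uses.

```cpp
#include <cstdint>

// Return the instruction size in bytes implied by the low bits of the first
// 16-bit parcel, or 0 for longer/reserved encodings this sketch ignores.
inline unsigned riscvInstructionSize(std::uint16_t FirstParcel) {
  if ((FirstParcel & 0x03) != 0x03)
    return 2; // compressed: low two bits are not 11
  if ((FirstParcel & 0x1c) != 0x1c)
    return 4; // standard: bits [4:2] are not 111
  if ((FirstParcel & 0x3f) == 0x1f)
    return 6; // 48-bit: bits [5:0] are 011111
  if ((FirstParcel & 0x7f) == 0x3f)
    return 8; // 64-bit: bits [6:0] are 0111111
  return 0;
}
```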

The impetus behind this change is to clean up RISC-V disassembly when
the LLVM disassembler doesn't understand all of the instructions.
RISC-V supports proprietary extensions, where the TD files don't know
about certain instructions, and the disassembler can't disassemble them.
Internal users want to be able to disassemble these instructions.

With llvm-objdump, the solution is to pipe the output of the disassembly
through a filter program. This patch modifies LLDB's disassembly to look
more like llvm-objdump's, and includes an example python script that adds
a command "fdis" that will disassemble, then pipe the output through a
specified filter program. This has been tested with crustfilt, a sample
filter located at https://github.com/quic/crustfilt .

Changes in this PR:
- Decouple "can't disassemble" from "instruction size".
  DisassemblerLLVMC::MCDisasmInstance::GetMCInst now returns a bool for
    valid disassembly, and has the size as an out parameter.
  Use the size even if the disassembly is invalid.
  Disassemble if disassembly is valid.

- Always print out the opcode when -b is specified.
  Previously it wouldn't print out the opcode if it couldn't disassemble.

- Print out RISC-V opcodes the way llvm-objdump does.
  Code for the new Opcode Type eType16_32Tuples by Jason Molenda.

- Print <unknown> for instructions that can't be disassembled, matching
  llvm-objdump, instead of printing nothing.

- Update max riscv32 and riscv64 instruction size to 8.

- Add example "fdis" command script.

- Added disassembly byte test for x86 with known and unknown instructions.
- Added disassembly byte test for riscv32 with known and unknown instructions,
  with and without filtering.
- Added test from Jason Molenda to RISC-V disassembly unit tests.
2025-07-14 21:50:22 -05:00
Connector Switch
91b3dbe273
[libc] Update some implementation status for search.h (#148414)
- `VISIT` was implemented in
https://github.com/llvm/llvm-project/pull/132746.
- `lsearch` was implemented in
https://github.com/llvm/llvm-project/pull/131431.

At first, I thought this would be updated automatically, but it seems
that the header status needs to be added manually.
2025-07-15 10:34:30 +08:00
Oliver Hunt
451a9ce9ff
[clang][ObjC][PAC] Add ptrauth protections to objective-c (#147899)
This PR introduces the use of pointer authentication to objective-c[++].

This includes:

* __ptrauth qualifier support for ivars
* protection of isa and super fields
* protection of SEL typed ivars
* protection of class_ro_t data
* protection of methodlist pointers and content
2025-07-14 19:32:18 -07:00
Valentin Clement (バレンタイン クレメン)
90ef114a33
[flang][cuda] Add cuf.set_allocator_idx for device component (#148750) 2025-07-14 19:31:44 -07:00
Oliver Hunt
7cde974233
[clang] Update diagnostics and documentation for type aware allocators (#148576)
Alas reflection pushed p2719 out of C++26, so this PR changes the
diagnostics to reflect that for now type aware allocation is
functionally a clang extension.
2025-07-14 19:20:36 -07:00
Peter Collingbourne
1ddb909a42
remote-exec: Detect and propagate signal death in the remote process.
If the remote process died with a signal, this will be exposed by ssh
as an exit code in the range 128 < rc < 160. We may be running under
`not --crash` which will expect us to also die with a signal, so send
the signal to ourselves so that wait4() in `not` will detect the signal.
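
The idea can be sketched in C++ as follows. This is not the remote-exec script itself; the function names are hypothetical, and the mapping "signal number = exit code - 128" is the usual shell convention, assumed here rather than taken from the patch.

```cpp
#include <cassert>
#include <csignal>
#include <optional>

// Map an ssh exit code to the signal that killed the remote process, if any.
// Assumes the 128 + signal-number convention.
inline std::optional<int> signalFromExitCode(int ExitCode) {
  assert(ExitCode >= 0 && ExitCode <= 255 && "exit codes fit in one byte");
  if (ExitCode > 128 && ExitCode < 160)
    return ExitCode - 128;
  return std::nullopt;
}

// Die with the same signal as the remote process so that a waiting parent
// observes signal death; otherwise hand back the plain exit code.
inline int finishWithExitCode(int ExitCode) {
  if (auto Sig = signalFromExitCode(ExitCode)) {
    std::signal(*Sig, SIG_DFL); // make sure the default action applies
    std::raise(*Sig);
  }
  return ExitCode;
}
```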

Speculative fix for failing buildbot:
https://lab.llvm.org/buildbot/#/builders/193/builds/9070
2025-07-14 18:57:48 -07:00
Florian Mayer
14dc3e3d5f
[SelectionDAG] [KCFI] Allow "kcfi" on invoke (#148742)
This is handled in CallBase, so it is valid for both call and invoke
2025-07-14 18:55:09 -07:00
Deric C.
27b3b4a665
[DirectX] Move the scalarizer pass to before dxil-flatten-arrays (#146800)
Fixes #145924 and #140416
Depends on #146173 being merged first.

This PR moves the scalarizer pass to immediately before the
dxil-flatten-arrays pass to allow the dxil-flatten-arrays pass to turn
scalar GEPs (including i8 GEPs) into flattened array GEPs where
applicable.

A number of LLVM DirectX CodeGen tests have been edited to remove scalar
GEPs and also correct previously uncaught incorrectly-transformed GEPs.

Validation errors of the form `Access to out-of-bounds memory is
disallowed` or `TGSM pointers must originate from an unambiguous TGSM
global variable` no longer appear when compiling DML shaders after this
PR.
2025-07-14 18:13:42 -07:00
Jim Lin
7ba0c98265
[RISCV] Rename the vector crypto intrinsic test vcpopv.c to vcpop.c. NFC.
To be consistent with https://github.com/riscv-non-isa/rvv-intrinsic-doc/tree/post-1.0-dev/auto-generated/vector-crypto/llvm-api-tests
2025-07-15 09:00:35 +08:00
Jim Lin
22707fd4a5
[RISCV] Add Andes XAndesBFHCvt (Andes Scalar BFLOAT16) extension (#148563)
The spec can be found at:

https://github.com/andestech/andes-v5-isa/releases/tag/ast-v5_4_0-release.

The extension includes only two instructions: one for converting from
f32 to bf16, and another for converting from bf16 to f32.

This patch only implements MC support for XAndesBFHCvt.
2025-07-15 08:59:00 +08:00
Stanislav Mekhanoshin
cbba8f0acb
[AMDGPU] Codegen support for v_fmaak_f64/v_fmamk_f64 (#148734) 2025-07-14 17:57:06 -07:00
Valentin Clement (バレンタイン クレメン)
2c6771889a
[flang][cuda] Introduce cuf.set_allocator_idx operation (#148717) 2025-07-14 17:23:18 -07:00
Valentin Clement (バレンタイン クレメン)
5eecec8e81
[flang] Fix use of __has_builtin and formatting (#148746)
`__has_builtin` is not available on all compilers. Make sure it works
when not defined.
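
A common way to guard this, shown here as a sketch rather than the exact change in the patch, is to define a fallback so that every query reports "not available" on compilers that lack `__has_builtin`. The `popCount` helper is a hypothetical example of a guarded use.

```cpp
#include <cstdint>

// Compilers without __has_builtin treat every query as "not available".
#ifndef __has_builtin
#define __has_builtin(x) 0
#endif

// Count set bits, using the builtin only when the compiler reports it.
inline int popCount(std::uint64_t Value) {
#if __has_builtin(__builtin_popcountll)
  return __builtin_popcountll(Value);
#else
  int Count = 0;
  for (; Value != 0; Value &= Value - 1) // clear the lowest set bit
    ++Count;
  return Count;
#endif
}
```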

Also fix some formatting issues: 
- Use brace initialization where possible
- Fix wrong capitalization of variables.
- Add `std::` for `uint64_t` and `int64_t` as it is mostly done in this
part of the codebase.
2025-07-14 17:21:09 -07:00
Konstantin Varlamov
49d2b5f1cd
[libc++][hardening] Introduce a dylib function to log hardening errors. (#148266)
Unlike `verbose_abort`, this function merely logs the error but does not
terminate execution. It is intended to make it possible to implement the
`observe` semantic for Hardening.
2025-07-14 17:04:33 -07:00
Matt Arsenault
ee5b9369cd
Hexagon: Add frexp intrinsic test (#148671) 2025-07-15 09:00:03 +09:00
Matt Arsenault
43206d1b2e
Hexagon: Add test for llvm.exp10 intrinsic (#148664)
This is mostly to test the libcall behavior
2025-07-15 08:56:35 +09:00
Craig Topper
f07107337f
[DAGCombiner] Pass SDNodeFlags to getSelect instead of modifying the node returned. (#148733) 2025-07-14 16:50:10 -07:00
Haohai Wen
6b7c6fd8b4
[PseudoProbe] use print to emit function name (#147873)
This PR is part of #123870.

For COFF Asm, function name should be wrapped in quotes.
MCSymbol::print will automatically do that.
2025-07-15 07:49:27 +08:00
Deric C.
352215c6eb
[DirectX] Simplify and correct the flattening of GEPs in DXILFlattenArrays (#146173)
In tandem with #146800, this PR fixes #145370

This PR simplifies the logic for collapsing GEP chains and replacing
GEPs to multidimensional arrays with GEPs to flattened arrays. This
implementation avoids unnecessary recursion and more robustly computes
the index to the flattened array by using the GEPOperator's
collectOffset function, which has the side effect of allowing "i8 GEPs"
and other types of GEPs to be handled naturally in the flattening /
collapsing of GEP chains.
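
To illustrate the index computation only (this is not the pass's code, and both helpers are hypothetical): a multidimensional index maps to a row-major flat index, and a byte offset such as the one an "i8 GEP" carries maps to the same flat index once divided by the element size.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Row-major flat index for Indices into an array with dimensions Dims.
inline std::size_t flattenIndex(const std::vector<std::size_t> &Dims,
                                const std::vector<std::size_t> &Indices) {
  assert(Dims.size() == Indices.size() && "one index per dimension");
  std::size_t Flat = 0;
  for (std::size_t I = 0; I < Dims.size(); ++I) {
    assert(Indices[I] < Dims[I] && "index out of bounds");
    Flat = Flat * Dims[I] + Indices[I];
  }
  return Flat;
}

// Flat element index recovered from a constant byte offset.
inline std::size_t indexFromByteOffset(std::size_t ByteOffset,
                                       std::size_t ElemSize) {
  assert(ElemSize != 0 && ByteOffset % ElemSize == 0 &&
         "offset must land on an element boundary");
  return ByteOffset / ElemSize;
}
```

For a `[2 x [3 x [4 x float]]]` array, indices `{1, 2, 3}` and a byte offset of 92 with 4-byte elements both give flat index 23.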

Furthermore, a handful of LLVM DirectX CodeGen tests have been edited to
fix incorrect GEP offsets, mismatched types (e.g., loading i32s from an
array of floats), and typos.
2025-07-14 16:39:01 -07:00
S. VenkataKeerthy
ec90786ad1
[NFC][IR2Vec] Exposing helpers in IR2Vec Vocabulary (#147841)
Minor refactoring of the IR2Vec vocabulary. This would help in upcoming PRs related to the IR2Vec tool.

(Tracking issue - #141817)
2025-07-14 16:38:50 -07:00
S. VenkataKeerthy
8ae8b50d36
[NFC][IR2Vec] Minor refactoring of opcode access in vocabulary (#147585)
Refactored IR2Vec vocabulary handling to improve code organization and error handling. This would help in upcoming PRs related to the IR2Vec tool.

(Tracking issue - #141817)
2025-07-14 16:35:24 -07:00
Matt Arsenault
d1db176e82
ARM: Stop setting sincos_stret calling convention (#147457)
This was going out of its way to explicitly mark these as
ARM_AAPCS_VFP. This has been explicitly set since 8b40366b54bd4,
where the commit message states that "sincos" (not sincos_stret)
has a special calling convention. However, that commit also sets
the calling convention for all libcalls to ARM_AAPCS_VFP, and
getEffectiveCallingConv returns the same for CCC anyway in tests
using isWatchABI triples.

The net result of this appears to be a change in behavior when
using -float-abi=soft with isWatchABI, which has no tests, so
I assume this is a theoretical combination.

If I assert
```
  if (getTargetMachine().getTargetTriple().isWatchABI()) {
    assert(!useSoftFloat());
    assert(getEffectiveCallingConv(CallingConv::C, false) == CallingConv::ARM_AAPCS_VFP);
  }
```
Only 2 tests fail the second condition, which look like copy-paste
accidents using v7k triples with linux and only needed a filler triple.
This is a consequence of strangely using the target architecture in
place of the OS ABI check, as was done in
042a6c1fe19caf48af7e287dc8f6fd5fec158093
2025-07-15 08:30:49 +09:00
Sudharsan Veeravalli
085e8f1e52
[RISCV] Relax destination instruction dag operand matching in CompressInstEmitter (#148660)
We have some 48-bit instructions in the `Xqci` spec that currently
cannot be compressed to their 32-bit variants due to the constraint in
`CompressInstEmitter` on destination instruction operands not being
allowed to mismatch with the DAG operands.

For example, the `QC_E_ADDI` instruction can be compressed to the `ADDI`
instruction when the immediate is a signed 12-bit value, but this is currently
not possible since the `QC_E_ADDI` instruction has `GPRNoX0` register
operands while the `ADDI` instruction has `GPR` register operands
leading to an operand type validation error.

I think we can remove the check that only source instruction operands
can mismatch with the corresponding DAG operands and rely on the fact
that we check if the DAG register operand type is a subclass of the
instruction register operand type.
2025-07-15 04:52:51 +05:30
Igor Kudrin
ad9a9537e6
[clang] Fix -Wuninitialized for values passed by const pointers (#147221)
This enables producing a "variable is uninitialized" warning when a
value is passed to a pointer-to-const argument:

```
void foo(const int *);
void test() {
  int v;
  foo(&v);
}
```

Fixes #37460
2025-07-14 16:03:08 -07:00
Stanislav Mekhanoshin
a32040e483
[AMDGPU] Use 64-bit literals in codegen on gfx1250 (#148727) 2025-07-14 15:47:24 -07:00
Uzair Nawaz
56a4f8d8c1
[libc] Wchar Stringconverter (#146388)
Implemented a string converter class to encapsulate the logic of
converting between utf8 <-> utf32
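
For readers unfamiliar with the conversion, one direction (a single UTF-32 code point to UTF-8 bytes) looks roughly like this. It is a one-shot sketch with a hypothetical `encodeUtf8` helper; it does not reflect the interface of the class added in this patch.

```cpp
#include <cassert>
#include <string>

// Encode one Unicode code point as 1 to 4 UTF-8 bytes.
inline std::string encodeUtf8(char32_t CodePoint) {
  assert(CodePoint <= 0x10FFFF && "beyond the Unicode range");
  assert((CodePoint < 0xD800 || CodePoint > 0xDFFF) && "surrogate code point");
  std::string Out;
  if (CodePoint < 0x80) {
    Out += static_cast<char>(CodePoint);
  } else if (CodePoint < 0x800) {
    Out += static_cast<char>(0xC0 | (CodePoint >> 6));
    Out += static_cast<char>(0x80 | (CodePoint & 0x3F));
  } else if (CodePoint < 0x10000) {
    Out += static_cast<char>(0xE0 | (CodePoint >> 12));
    Out += static_cast<char>(0x80 | ((CodePoint >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CodePoint & 0x3F));
  } else {
    Out += static_cast<char>(0xF0 | (CodePoint >> 18));
    Out += static_cast<char>(0x80 | ((CodePoint >> 12) & 0x3F));
    Out += static_cast<char>(0x80 | ((CodePoint >> 6) & 0x3F));
    Out += static_cast<char>(0x80 | (CodePoint & 0x3F));
  }
  return Out;
}
```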
2025-07-14 15:45:46 -07:00
Igor Kudrin
00dacf8c22
[clang] Add -Wuninitialized-const-pointer (#148337)
This option is similar to -Wuninitialized-const-reference, but diagnoses
the passing of an uninitialized value via a const pointer, like in the
following code:
```
void foo(const int *);
void test() {
  int v;
  foo(&v);
}
```
This is an extract from #147221 as suggested in [this
comment](https://github.com/llvm/llvm-project/pull/147221#discussion_r2190998730).
2025-07-14 15:44:43 -07:00
Charitha Saumya
244ebef1dd
Reapply [mlir][vector] Refactor WarpOpScfForOp to support unused or swapped forOp results. (#148313)
Reapply attempt for: https://github.com/llvm/llvm-project/pull/148291
Fix for the build failure reported in:
https://lab.llvm.org/buildbot/#/builders/116/builds/15477

-----

This crash is caused by a mismatch between the distributed type returned
by `getDistributedType` and the intended distributed type for forOp results.

Solution diff:
20c2cf6766

Example:
```
func.func @warp_scf_for_broadcasted_result(%arg0: index) -> vector<1xf32> {
  %c128 = arith.constant 128 : index
  %c1 = arith.constant 1 : index
  %c0 = arith.constant 0 : index
  %2 = gpu.warp_execute_on_lane_0(%arg0)[32] -> (vector<1xf32>) {
    %ini = "some_def"() : () -> (vector<1xf32>)
    %0 = scf.for %arg3 = %c0 to %c128 step %c1 iter_args(%arg4 = %ini) -> (vector<1xf32>) {
      %1 = "some_op"(%arg4) : (vector<1xf32>) -> (vector<1xf32>)
      scf.yield %1 : vector<1xf32>
    }
    gpu.yield %0 : vector<1xf32>
  }
  return %2 : vector<1xf32>
}
``` 
In this case the distributed type for forOp result is `vector<1xf32>`
(result is not distributed and broadcasted to all lanes instead).
However, in this case `getDistributedType` will return NULL type.

Therefore, if the distributed type can be recovered from warpOp, we
should always do that first before using `getDistributedType`
2025-07-14 15:41:56 -07:00
Stanislav Mekhanoshin
5277021c3c
[AMDGPU] Add gfx1250 v_fmac_f64 implementation (#148725) 2025-07-14 15:39:04 -07:00
Rahul Joshi
633728f3b5
[NFC][TableGen][DecoderEmitter] Eliminate indent for a few functions (#148718)
Eliminate the `indent` argument for functions which are always called
with `indent(0)`.
2025-07-14 15:23:41 -07:00
James Newling
99875733fc
[mlir][vector] Use vector.broadcast in place of vector.splat (#148028)
Part of deprecation of vector.splat

RFC:
https://discourse.llvm.org/t/rfc-mlir-vector-deprecate-then-remove-vector-splat/87143/4
More complete deprecation:
https://github.com/llvm/llvm-project/pull/147818
2025-07-14 15:12:21 -07:00
Daniel Paoliello
027f5ba24e
[win][aarch64] Enable the llvm/test/CodeGen/WinEH tests for AArch64 (#147860)
Enabled AArch64 runs for these tests where it made sense.

Also removed the "temporary" suffixes filter that was added over 10
years ago, I believe the "misched-copy.s output file" has been cleaned
from the runners by now...
2025-07-14 15:07:00 -07:00
Daniel Paoliello
13b720d255
[win][x64] Re-use fixed object if multiple catchpads use the same alloca for their catch objects (#147849)
Addresses
<https://github.com/llvm/llvm-project/pull/147421#discussion_r2191234968>
for x86

If more than one `catchpad` uses the same `alloca` for their catch
objects, then we will allocate more than one object in the fixed area
resulting in wasted stack space.

As a follow up, Clang could be updated to re-use the same `alloca` for
all by-reference and by-pointer catch objects.
2025-07-14 15:06:31 -07:00
Nikita Popov
4b52d221a0
[Support][BLAKE3] Prefix blake3_xof_many_avx512 (#148607)
This symbol was introduced in #147948, but not prefixed, resulting in
conflicts if libblake3 and LLVM are both linked statically into the same
binary.
2025-07-14 14:32:47 -07:00
Amir Ayupov
0d5325bb20
[BOLT] Directly use call count in buildCallGraph (#134966)
In call graph construction, call block count is used for call graph edge
weight. Change that to use call count directly if it's available, 
falling back to block count if not.
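
In sketch form, the weight selection is a simple fallback. `CallSite` and `edgeWeight` below are hypothetical names for illustration; BOLT's actual data structures differ.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical call-site record; BOLT's real data structures differ.
struct CallSite {
  std::optional<std::uint64_t> CallCount; // from the profile, when available
  std::uint64_t BlockCount;               // count of the enclosing block
};

// Prefer the call count for the edge weight; fall back to the block count.
inline std::uint64_t edgeWeight(const CallSite &CS) {
  return CS.CallCount.value_or(CS.BlockCount);
}
```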

Test Plan:
This change together with disabling `fix-block-counts` improves profile
quality metrics, e.g. for large binaries and sampled LBR profiles:

`br_inst_retired.near_taken:uppp` trigger event
- Ads1: 
  - Profiled functions 58096
  - CFG imbalance 2.63% -> 2.45%
  - CG imbalance 8.23% -> 7.44%

- Ads2:
  - Profiled functions 54358
  - CFG imbalance 3.12% -> 2.77%
  - CG imbalance 8.22% -> 7.06%

- uwsgi:
  - Profiled functions 78103
  - CFG imbalance 4.42% -> 4.03%
  - CG imbalance 100.00% -> 100.00%

`cycles:u` trigger event:
- web: 
  - Profiled functions 31306
  - CG flow imbalance: 31.16% -> 20.29%
  - CFG flow imbalance: 7.04% -> 6.44%
2025-07-14 14:28:52 -07:00
Nishant Patel
834591e062
[MLIR] [Vector] Linearization patterns for vector.load and vector.store (#145115)
This PR adds linearization patterns for vector.load and vector.store. It is
a follow-up PR to
https://github.com/llvm/llvm-project/pull/143420#issuecomment-2967406606
2025-07-14 14:24:52 -07:00
Abid Qadeer
45fa0b29bc
Revert "[OMPIRBuilder] Don't use invalid debug loc in task proxy function." (#148728)
There is a sanitizer fail in CI after this which I need to investigate.
Reverting for now.
Reverts llvm/llvm-project#148284
2025-07-14 22:23:21 +01:00
Peter Klausler
40ceaf1d99
[flang][runtime] Fix bad instance of std::optional in runtime (#148724)
The runtime needs to use common::optional, not std::optional.
2025-07-14 14:12:49 -07:00