546657 Commits

Author SHA1 Message Date
Tom Stellard
cff9ae7a15
[CMake][Release] Build with -ffat-lto-objects (#140381)
Fixes #133580
2025-07-29 15:33:49 -07:00
Stanislav Mekhanoshin
d99238263c
[AMDGPU] Implement v_mad_u32/v_mad_nc_u|i64_u32 on gfx1250 (#151226) 2025-07-29 15:06:35 -07:00
kkent030315
32127045c8
[llvm-readobj][COFF] Add support for more CET and hotpatch flags (#150967)
- Added `IMAGE_DLL_CHARACTERISTICS_EX_CET_COMPAT_STRICT_MODE`
- Added
`IMAGE_DLL_CHARACTERISTICS_EX_CET_SET_CONTEXT_IP_VALIDATION_RELAXED_MODE`
- Added
`IMAGE_DLL_CHARACTERISTICS_EX_CET_DYNAMIC_APIS_ALLOW_IN_PROC_ONLY`
- Added `IMAGE_DLL_CHARACTERISTICS_EX_CET_RESERVED_1`
- Added `IMAGE_DLL_CHARACTERISTICS_EX_CET_RESERVED_2`
- Added `IMAGE_DLL_CHARACTERISTICS_EX_FORWARD_CFI_COMPAT`
- Added `IMAGE_DLL_CHARACTERISTICS_EX_HOTPATCH_COMPATIBLE`
2025-07-30 00:51:57 +03:00
Jordan Rupprecht
86f74c4d01
[bazel] Use rules_cc everywhere and reformat (#149584)
We already use cc rules from `@rules_cc//cc:defs.bzl` in a few files,
but this uses it everywhere. Done automatically by running `buildifier
--lint=fix
--warnings=native-cc-binary,native-cc-library,native-cc-test,load` over
all the files. I also ran `buildifier` once more to ensure there wasn't
any missing formatting, so that caused a few unrelated diffs.
2025-07-29 16:30:25 -05:00
Chelsea Cassanova
c162846f8b
[lldb][cmake] Create dependencies for LLDB header targets (#150995)
The LLDB standalone build using Xcode currently fails due to the headers
being attached to multiple targets, but none of these targets depending
on each other. This commit resolves this by creating those dependencies.
2025-07-29 14:24:35 -07:00
Dan Blackwell
ba2e49cac9
[libFuzzer] Mark libFuzzer SIGTRAP test unsupported on windows (#151109)
This change is based on the UNSUPPORTED mark from the existing sigusr
test
c59cc54284/compiler-rt/test/fuzzer/sigusr.test (L4)
2025-07-29 17:08:02 -04:00
Philip Reames
ce23830508
[RISCV] Combine a vsse from a vsseg with one active segment (#151198)
This is a rewrite of the current strided store optimization to be a DAG
combine. This allows it to kick in slightly more broadly, in particular
for the scalable lowering paths.
2025-07-29 14:05:48 -07:00
Krishna Pandey
616cef0883
[libc][math] Make BFloat16 comparison tests constexpr (#151211)
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
2025-07-29 16:51:57 -04:00
Florian Hahn
446b3de5b6
[IndVars] Add tests showing missed folding opportunity. 2025-07-29 21:25:52 +01:00
Philip Reames
fe93f75cc6 [RISCV] Address post commit style suggestion 2025-07-29 13:23:21 -07:00
Stanislav Mekhanoshin
bfb6829148
[AMDGPU] Organize VOP3 profiles for single HasExt64BitDPP. NFC. (#151212)
This shall simplify further delta as more profiles will be
added inside these braces.
2025-07-29 13:04:15 -07:00
Ramkumar Ramachandra
5d4e1e0c84
[RISCV] Fix build failure in getIntrinsicInstrCost (#151210)
bd66fd0 ([CostModel/RISCV] Fix costs of vector [l](lrint|lround))
introduced buildbot failures by using a temporary ArrayRef when a
SmallVector should have been used. Fix this.

Failure: https://lab.llvm.org/buildbot/#/builders/186/builds/11133
2025-07-29 20:42:54 +01:00
Florian Hahn
55f9eccee9
[LV] Revert back to use Loop::isLoopInvariant in isPredicatedInst. (#150828)
This partially reverts https://github.com/llvm/llvm-project/pull/140744,
restoring the original TheLoop->isLoopInvariant check instead of the more
powerful Legal->isInvariant, which uses SCEV.

The SCEV-based check causes a mis-compile, because SCEV can prove that the stored value
is loop-invariant, which in turn converts the store to a uniform store.
But in VPlan, we aren't yet able to determine that the stored value is
loop-invariant, so we extract the last lane, which is incorrect, because
it does not account for the mask of the store.

Restoring the original code is a safe fix and avoids this subtle
divergence.

Fixes https://github.com/llvm/llvm-project/issues/149347.

PR: https://github.com/llvm/llvm-project/pull/150828
2025-07-29 20:32:31 +01:00
Craig Topper
f3c531c676 [RISCV] Use SDValue::getOperand instead of SDNode::getOperand for consistency. NFC 2025-07-29 12:30:15 -07:00
Kazu Hirata
c00e8dd9ea [lldb] Fix a warning
This patch fixes:

  lldb/source/Plugins/Process/wasm/ProcessWasm.cpp:107:25: error:
  format specifies type 'unsigned long long' but the argument has type
  'lldb::tid_t' (aka 'unsigned long') [-Werror,-Wformat]
2025-07-29 12:25:46 -07:00
Maksim Panchenko
1e0edb072a
[BOLT][AArch64] Compensate for missing code markers (#151060)
Code written in assembly can have missing code markers. In BOLT, we can
compensate by recognizing that a function entry point should start a
code sequence.

Such code has been seen in the LuaJIT library.
2025-07-29 12:01:06 -07:00
S. VenkataKeerthy
130f24b28d
[IR2Vec][llvm-ir2vec] Revamp triplet generation and add entity mapping mode (#149214)
Add entity mapping mode to llvm-ir2vec and improve triplet generation format for knowledge graph embedding training.

This change streamlines the workflow for training the vocabulary embeddings with IR2Vec by:
1. Directly generating numeric IDs instead of requiring string-to-ID preprocessing
2. Providing entity mappings in standard knowledge graph embedding format
3. Structuring triplet output in train2id format compatible with knowledge graph embedding frameworks
4. Adding metadata headers to simplify post-processing and training setup

These improvements make IR2Vec more compatible with standard knowledge graph embedding training pipelines and reduce the preprocessing steps needed before training.
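
The numeric-ID workflow described above can be sketched roughly as follows. This is only an illustration, not the llvm-ir2vec implementation: the helper names `encode` and `format_train2id`, the sample triplets, and the layout (a count header followed by `head tail relation` rows, in the style commonly used by knowledge graph embedding frameworks) are all assumptions, since the commit message does not show the exact output.

```python
def encode(triplets):
    """Map string triplets to numeric IDs (hypothetical helper)."""
    entity_ids, relation_ids = {}, {}
    rows = []
    for head, relation, tail in triplets:
        # IDs are assigned densely in first-seen order.
        h = entity_ids.setdefault(head, len(entity_ids))
        t = entity_ids.setdefault(tail, len(entity_ids))
        r = relation_ids.setdefault(relation, len(relation_ids))
        rows.append((h, t, r))
    return entity_ids, relation_ids, rows


def format_train2id(rows):
    # Assumed layout: first line is the triplet count, then one
    # "head tail relation" row per triplet.
    lines = [str(len(rows))]
    lines += [f"{h} {t} {r}" for h, t, r in rows]
    return "\n".join(lines)


# Illustrative triplets only; real ones come from the IR being processed.
sample = [
    ("add", "Type", "IntegerTy"),
    ("add", "Next", "ret"),
    ("add", "Arg", "Variable"),
]
entity_ids, relation_ids, rows = encode(sample)
print(format_train2id(rows))
```

Emitting IDs directly means the training pipeline no longer needs a separate string-to-ID pass; the entity mapping is kept alongside so the IDs can be translated back afterwards.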

See #149215 for more details on how it is used.

(Tracking issues - #141817, #141834)
2025-07-29 11:56:52 -07:00
Jordan Rupprecht
052b836d23
[bazel] Port #150696: XeVM to LLVMIR (#151207) 2025-07-29 13:25:14 -05:00
Ramkumar Ramachandra
13366759c3
[VectorUtils] Trivially vectorize ldexp, [l]lround (#145545) 2025-07-29 19:23:09 +01:00
Ramkumar Ramachandra
bd66fd0d01
[CostModel/RISCV] Fix costs of vector [l](lrint|lround) (#146058)
Take the actual instruction cost into account, and don't fall through to
code that doesn't apply to [l]lrint. Also strip invalid costs for
[b]f16, as a companion to #146507, and unify it with [l]lround costs as
a companion to #147713.
2025-07-29 19:22:11 +01:00
Muhammad Bassiouni
551dcc3e82
[libc][math] Refactor atan implementation to header-only in src/__support/math folder. (#150852)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-07-29 21:16:57 +03:00
Aaron Ballman
30a5d569b2
[C23] AST equivalence of attributes (#151196)
Implicitly declared types (like __NSConstantString_tag, etc) will be
declared with visibility attributes. This causes problems when merging
ASTs because we currently reject declaration merging for declarations
with attributes.

This relaxes that restriction somewhat; implicit declarations can now
have attributes when merging; we assume that if the compiler generated
it, it's fine.
2025-07-29 14:09:02 -04:00
Sang Ik Lee
5ae79baab3
[MLIR][XeVM] Add XeVM to LLVMIR translation. (#150696)
Add XeVM dialect to LLVMIR translation.
Currently no ops are translated.
Only xevm.DecorationCacheControl are translated to metadata for spirv
decoration - !spirv.DecorationCacheControlINTEL.

Co-authored-by: Artem Kroviakov artem.kroviakov@intel.com
2025-07-29 11:00:25 -07:00
Aaron Ballman
6a22580305
Switch sanity check to assert; NFC (#151181)
This was written out of an abundance of caution because the changes were
being added to the release branch. Now we can be a little less cautious
and switch to using an assert. No behavioral changes are expected.
2025-07-29 13:54:45 -04:00
Krishna Pandey
20d992d366
[libc][math] Fix buildbot fails (#151186)
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
Co-authored-by: OverMighty <its.overmighty@gmail.com>
2025-07-29 13:51:40 -04:00
Aaron Ballman
d2361e43d1
[C23] More improved type compatibility for enumerations (#150946)
The structural equivalence checker was not paying attention to whether
enumerations had compatible fixed underlying types or not.

Fixes #150594
2025-07-29 13:30:52 -04:00
Timm Baeder
4a44a85c89
[clang][bytecode] Add Pointer::initializeAllElements() (#151151)
To initialize all elements of a primitive array at once. This saves us
from creating the InitMap just to destroy it again after all elements
have been initialized.
2025-07-29 19:30:01 +02:00
Tony Varghese
59c3fe6505
[PowerPC] Exploit xxeval instruction for ternary patterns - ternary(A, X, and(B,C)) (#141733)
## Description
Adds support for ternary equivalent operations of the form `ternary(A,
X, and(B,C))` where `X=[xor(B,C)| nor(B,C)| eqv(B,C)| not(B)| not(C)]`.

List of `xxeval` equivalent ternary operations added and the
corresponding `imm` value required:

Ternary Operator| Imm Value
--|--
ternary(A,  xor(B,C), and(B,C))	| 22
ternary(A,  nor(B,C), and(B,C))	| 24
ternary(A,  eqv(B,C), and(B,C))	| 25
ternary(A,  not(C), and(B,C))	| 26
ternary(A,  not(B), and(B,C))	| 28

e.g. `xxeval XT,XA,XB,XC,22`
- performs `XA ? xor(XB, XC) : and(XB,XC)` and places the result in `XT`.
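
The immediates in the table can be cross-checked by treating the `xxeval` immediate as an 8-entry truth table over (A, B, C). The sketch below assumes that the most significant bit of the immediate corresponds to A=B=C=0 and the least significant bit to A=B=C=1; that ordering is an assumption, chosen because it reproduces the values listed above. The helper names `xxeval_imm` and `ternary` are illustrative only.

```python
def xxeval_imm(fn):
    """Build the 8-bit immediate for a three-input bitwise function.

    Assumption: the bit for inputs (A, B, C) sits at position
    7 - (4*A + 2*B + C), so A=B=C=0 maps to the most significant bit.
    """
    imm = 0
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                index = 4 * a + 2 * b + c
                imm |= (fn(a, b, c) & 1) << (7 - index)
    return imm


def ternary(a, x, y):
    # Single-bit select: x where a is 1, y where a is 0.
    return x if a else y


patterns = {
    "xor": lambda a, b, c: ternary(a, b ^ c, b & c),
    "nor": lambda a, b, c: ternary(a, 1 - (b | c), b & c),
    "eqv": lambda a, b, c: ternary(a, 1 - (b ^ c), b & c),
    "not_c": lambda a, b, c: ternary(a, 1 - c, b & c),
    "not_b": lambda a, b, c: ternary(a, 1 - b, b & c),
}

immediates = {name: xxeval_imm(fn) for name, fn in patterns.items()}
print(immediates)
```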

Co-authored-by: Tony Varghese <tony.varghese@ibm.com>
2025-07-29 22:56:05 +05:30
Jordan Rupprecht
bc605f4ce8
[bazel] Port #151150: Move InitAll*** implementation into static library (#151183)
And prune deps when splitting
2025-07-29 12:25:32 -05:00
Muhammad Bassiouni
efbbc0b319
[libc][math] Refactor asinhf16 implementation to header-only in src/__support/math folder. (#150849)
Part of #147386

in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450
2025-07-29 20:19:01 +03:00
Jonas Devlieghere
a28e7f1aad
[lldb] Add WebAssembly Process Plugin (#150143)
Extend support in LLDB for WebAssembly. This PR adds a new Process
plugin (ProcessWasm) that extends ProcessGDBRemote for WebAssembly
targets. It adds support for WebAssembly's memory model with separate
address spaces, and the ability to fetch the call stack from the
WebAssembly runtime.

I have tested this change with the WebAssembly Micro Runtime (WAMR,
https://github.com/bytecodealliance/wasm-micro-runtime) which implements
a GDB debug stub and supports the qWasmCallStack packet.

```
(lldb) process connect --plugin wasm connect://localhost:4567
Process 1 stopped
* thread #1, name = 'nobody', stop reason = trace
    frame #0: 0x40000000000001ad
wasm32_args.wasm`main:
->  0x40000000000001ad <+3>:  global.get 0
    0x40000000000001b3 <+9>:  i32.const 16
    0x40000000000001b5 <+11>: i32.sub
    0x40000000000001b6 <+12>: local.set 0
(lldb) b add
Breakpoint 1: where = wasm32_args.wasm`add + 28 at test.c:4:12, address = 0x400000000000019c
(lldb) c
Process 1 resuming
Process 1 stopped
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
    frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
   1    int
   2    add(int a, int b)
   3    {
-> 4        return a + b;
   5    }
   6
   7    int
(lldb) bt
* thread #1, name = 'nobody', stop reason = breakpoint 1.1
  * frame #0: 0x400000000000019c wasm32_args.wasm`add(a=<unavailable>, b=<unavailable>) at test.c:4:12
    frame #1: 0x40000000000001e5 wasm32_args.wasm`main at test.c:12:12
    frame #2: 0x40000000000001fe wasm32_args.wasm
```

This PR is based on an unmerged patch from Paolo Severini:
https://reviews.llvm.org/D78801. I intentionally stuck to the
foundations to keep this PR small. I have more PRs in the pipeline to
support the other features/packets.

My motivation for supporting Wasm is to support debugging Swift compiled
to WebAssembly:
https://www.swift.org/documentation/articles/wasm-getting-started.html
2025-07-29 10:07:13 -07:00
Razvan Lupusoru
4128cf3b26
[flang][acc] Lower do and do concurrent loops specially in acc regions (#149614)
When OpenACC is enabled and Fortran loops are annotated with `acc loop`,
they are lowered to the `acc.loop` operation. The rest of the contained
loops use the normal FIR lowering path.

However, the OpenACC specification has special provisions related to
contained loops and their induction variables. To adhere to them, we
convert all valid contained loops to `acc.loop` so that this information
is stored appropriately.

The provisions in the spec that motivated this change (line numbers are
from OpenACC 3.4):
- 1353 Loop variables in Fortran do statements within a compute
construct are predetermined to be private to the thread that executes
the loop.
- 3783 When do concurrent appears without a loop construct in a kernels
construct it is treated as if it is annotated with loop auto. If it
appears in a parallel construct or an accelerator routine then it is
treated as if it is annotated with loop independent.

"Valid loops" here means do loops and do concurrent loops that have an
induction variable. Unstructured loops are not handled.
2025-07-29 10:03:22 -07:00
Brox Chen
2a3f72ee6e
[AMDGPU][CodeGen][True16] Correct size calculation for d16 insts (#151042)
D16 pseudo instructions are introduced in true16 mode to represent a D16
load/store. In MC lowering, the pseudo instructions are lowered to the
corresponding D16 Lo/Hi MC Inst respecting the register allocation.

However, the pseudo instruction has size 0, which causes an issue in the
Inst size estimation. Use D16 Lo when calculating the inst size.
2025-07-29 13:01:57 -04:00
jeremyd2019
a3228b6bf9
[Clang][Cygwin] Enable few conditions that are shared with MinGW (#149637)
The Cygwin target is generally very similar to the MinGW target. The two
share the default auto-import behavior, the default calling convention, the
`.dll.a` import library extension, the `__GXX_TYPEINFO_EQUALITY_INLINE`
pre-define by `g++`, and the long double configuration.

Co-authored-by: Mateusz Mikuła <oss@mateuszmikula.dev>
2025-07-29 10:01:43 -07:00
jeremyd2019
28b3190053
[LLVM][Cygwin] Enable conditions that are shared with MinGW (#149638)
Cygwin and MinGW share the auto import behavior that could result in
__stack_check_guard being non-dso-local. Allow windres to assume a
Cygwin target as well as a MinGW one, so defines like _WIN32 would not
be present on Cygwin.
2025-07-29 10:01:04 -07:00
Vivian Zhang
dc6d7f0637
[mlir][linalg] Fix padding shape computation in PadTilingInterface for convs (#149576)
This PR fixes the computation of padded shapes for convolution-style
affine maps (e.g., d0 + d1) in `PadTilingInterface`. Previously, the
code used the direct sum of loop upper bounds, leading to over-padding.
For example, when only the c dimensions of the following
`conv_2d_nhwc_fhwc` op are padded to multiples of 16, the old code also
incorrectly pads the convolved dimensions and generates the wrong input
shape:

```
%padded = tensor.pad %arg0 low[0, 0, 0, 0] high[0, 1, 1, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<1x16x16x4xf32> to tensor<1x17x17x16xf32>
%padded_0 = tensor.pad %arg1 low[0, 0, 0, 0] high[0, 0, 0, 12] {
^bb0(%arg3: index, %arg4: index, %arg5: index, %arg6: index):
  tensor.yield %cst : f32
} : tensor<16x3x3x4xf32> to tensor<16x3x3x16xf32>
%0 = linalg.conv_2d_nhwc_fhwc {dilations = dense<1> : tensor<2xi64>, strides = dense<1> : tensor<2xi64>} ins(%padded, %padded_0 : tensor<1x17x17x16xf32>, tensor<16x3x3x16xf32>) outs(%arg2 : tensor<1x14x14x16xf32>) -> tensor<1x14x14x16xf32>
return %0 : tensor<1x14x14x16xf32>
```

The new implementation uses the maximum accessed index as the input for
the affine map and then adds 1 after aggregating all the terms to get the
final padded size. This fixes
https://github.com/llvm/llvm-project/issues/148679.
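
The arithmetic behind the fix can be illustrated for a single convolved dimension accessed through `d0 + d1`. This is a sketch under the assumption of unit strides and dilations (as in the IR above); the function names are hypothetical and not taken from `PadTilingInterface`.

```python
def padded_size_direct_sum(upper_bounds):
    # Over-padding approach described above: plug the loop upper bounds
    # straight into the map d0 + d1 + ...
    return sum(upper_bounds)


def padded_size_max_index(upper_bounds):
    # Plug in the maximum accessed index (upper bound - 1) for each term,
    # aggregate, then add 1 to turn the largest index into a size.
    return sum(ub - 1 for ub in upper_bounds) + 1


# Output height 14 and filter height 3, as in the conv_2d_nhwc_fhwc example.
bounds = [14, 3]
print(padded_size_direct_sum(bounds))  # 17: one element too many
print(padded_size_max_index(bounds))   # 16: matches the unpadded input
```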
2025-07-29 09:58:30 -07:00
Andy Kaylor
8a1b252a99
[CIR] Upstream proper function alias lowering (#150520)
This change implements correct lowering of function aliases to the LLVM
dialect.
2025-07-29 09:45:37 -07:00
Changpeng Fang
6184ef1c2f
[AMDGPU] Support f64 atomics on gfx1250 (#151172)
- BUF/FLAT/GLOBAL_ADD/MIN/MAX_F64
   - DS_ADD_F64

Co-authored-by: Konstantin Zhuravlyov <Konstantin Zhuravlyov@amd.com>
2025-07-29 09:41:00 -07:00
Krzysztof Drewniak
330a7e1136
[mlir][Vector] Make elementwise-on-broadcast sinking handle splat consts (#150867)
There is a pattern that rewrites
elementwise_op(broadcast(x1 : T to U), broadcast(x2 : T to U), ...) to
broadcast(elementwise_op(x1, x2, ...) : T to U).

This pattern did not, however, account for the case where a broadcast
constant is represented as a SplatElementsAttr, which can safely be
reshaped or scalarized but is not a `vector.broadcast` or `vector.splat`
operation.

This patch fixes this oversight, preventing premature broadcasting.

This did result in the need to update some linalg dialect tests, which
now feature a less-broadcast computation and/or more constant folding.
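
The equivalence that the rewrite relies on can be shown with a toy model in plain Python, where a list stands in for a vector. The helpers `broadcast` and `elementwise` are illustrative stand-ins, not MLIR APIs; a splat constant is modeled as a list whose elements are all equal, which is why it can be treated like a broadcast of its single value.

```python
def broadcast(value, length):
    # Toy stand-in for broadcasting a scalar to a 1-D vector.
    return [value] * length


def elementwise(op, *vectors):
    # Apply op lane by lane across the input vectors.
    return [op(*lane) for lane in zip(*vectors)]


def add(x, y):
    return x + y


x1, n = 3, 8
splat_const = [4] * n  # all lanes equal, so it can be scalarized to 4

# Before the rewrite: n additions on already-broadcast operands.
before = elementwise(add, broadcast(x1, n), splat_const)

# After the rewrite: one scalar addition, then a single broadcast.
after = broadcast(add(x1, splat_const[0]), n)

print(before == after)
```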
2025-07-29 09:40:49 -07:00
Uzair Nawaz
a1aba84c2b
[libc] Reland #148948 "Implement barriers for pthreads" (#151021)
Fixed build dependencies for pthread_barrier_t (add __barrier_type to
cmake dependencies)
2025-07-29 16:39:40 +00:00
sribee8
a653934b58
[libc] Reland wchar string conversion mb to wc (#151048)
Added crash on nullptr to mbstowcs

---------

Co-authored-by: Sriya Pratipati <sriyap@google.com>
2025-07-29 16:34:10 +00:00
Abid Qadeer
335dbba741
[OMPIRBuilder] Don't drop debug loc from LocationDescription. (#148713)
`LocationDescription` contains both the insertion point and the debug
location. When `LocationDescription` is available, it is better to use
`updateToLocation`, which will update both. This PR replaces
`restoreIP(Loc.IP)` with `updateToLocation(Loc)`, as the former may not
update the debug location in all cases.

I am not checking the return value of `updateToLocation` because that is
checked just a few lines above in all cases and we would have returned
early if it failed.
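
The difference between the two calls can be pictured with a small toy model. It is written in Python purely for illustration and is not the OMPIRBuilder code: `ToyBuilder`, its methods, and the string-valued locations are all hypothetical stand-ins for the builder state described above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LocationDescription:
    ip: Optional[str]  # insertion point (toy: a block label)
    dl: Optional[str]  # debug location (toy: "file:line")


class ToyBuilder:
    def __init__(self, debug_loc=None):
        self.ip = None
        self.debug_loc = debug_loc

    def restore_ip(self, ip):
        # Only moves the insertion point; the debug location is untouched.
        self.ip = ip

    def update_to_location(self, loc):
        # Moves the insertion point and sets the debug location together.
        self.ip = loc.ip
        self.debug_loc = loc.dl
        return self.ip is not None


loc = LocationDescription(ip="omp.region.entry", dl="test.f90:12")

stale = ToyBuilder(debug_loc="test.f90:3")
stale.restore_ip(loc.ip)

fresh = ToyBuilder(debug_loc="test.f90:3")
fresh.update_to_location(loc)

print(stale.debug_loc, fresh.debug_loc)
```

In the toy model, the builder that only restores the insertion point keeps its old debug location, which is the kind of dropped location the PR avoids.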
2025-07-29 17:31:29 +01:00
davidtrevelyan
875491f59e
[rtsan][compiler-rt] Fix ioctl test causing segfault on exit (#151182)
I was observing segfaults at executable exit in the rtsan instrumented
unit tests. Bisecting the offending test led to observing that this test
is not using our safe test fixture for anything involving a file
descriptor. Changing to use the fixture eliminated the segfault on exit.
2025-07-29 17:31:19 +01:00
Stephen Tozer
e1e312e6af Revert "[Dexter] Add DAP support for Dexter, including lldb-dap (#149394)"
This reverts commit 83dfdd8f5485f6b50213c88f02878f86b3f53852.

Temporary revert, as the above patch contains some Python code requiring at
least version 3.10, when the minimum required by LLVM is 3.8.
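
The commit message does not say which construct needed Python 3.10. As a hedged illustration of the kind of incompatibility involved, the sketch below shows two features that first appeared in 3.10 (kept as comments) next to a spelling that also works on 3.8; the function `lookup` is purely hypothetical.

```python
from typing import Optional, Union

# Python 3.10+ spellings, kept as comments so this file also loads on 3.8:
#   def lookup(key: int | str) -> str | None: ...
#   match key: ...


def lookup(key: Union[int, str]) -> Optional[str]:
    # Same annotations written with typing.Union / typing.Optional,
    # which are available on Python 3.8.
    table = {1: "one", "two": "2"}
    return table.get(key)


print(lookup(1))
print(lookup("missing"))
```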
2025-07-29 17:26:30 +01:00
Sam Elliott
f925ecbf19
[RISCV] Use Hints for Xqcisim/Xqcisync Aliases (#151040)
My aim here is to make these a little easier to maintain by relying on
aliases where these instructions overlap with the Hint instructions they
are based on.

The following instructions have not been converted to aliases as they
have complex mappings from their immediate encodings to the immediate
encoding of the underlying instruction (setting high bits):
- qc.pputci
- qc.sync, qc.syncr, qc.syncwf, qc.syncwl
- qc.c.sync, qc.c.syncr, qc.c.syncwf, qc.c.syncwl

Co-authored-by: Sudharsan Veeravalli <quic_svs@quicinc.com>
2025-07-29 09:21:25 -07:00
Andy Kaylor
88620aee98
[CIR] Add support for array cleanups (#150499)
This adds support for array cleanups, including the ArrayDtor op.
2025-07-29 09:21:15 -07:00
Krishna Pandey
111edfcab8
[libc][math][c++23] Add fabsbf16 math function (#148398)
This PR implements fabsbf16 math function for BFloat16 type along with
the tests.

---------

Signed-off-by: krishna2803 <kpandey81930@gmail.com>
Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>
Co-authored-by: OverMighty <its.overmighty@gmail.com>
2025-07-29 12:19:21 -04:00
Andy Kaylor
32779cd698
[CIR] Add proper handling for no prototype function calls (#150553)
This adds standard-conforming handling for calls to functions that were
declared in C source in the no prototype form.
2025-07-29 09:16:17 -07:00
Akash Banerjee
0a4c6522a6
[MLIR] Add conversion support for more ops from ComplexToROCDLLibraryCalls (#151166) 2025-07-29 17:11:46 +01:00
Leandro Lacerda
2abd58cb7e
[Offload] Add framework for math conformance tests (#149242)
This PR introduces the initial version of a C++ framework for the
conformance testing of GPU math library functions, building upon the
skeleton provided in #146391.

The main goal of this framework is to systematically measure the
accuracy of math functions in the GPU libc, verifying correctness or at
least conformance to standards like OpenCL via exhaustive or random
accuracy tests.
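
The idea of a random accuracy test can be sketched as follows. The framework in this PR is C++; the Python below only illustrates measuring error in units in the last place (ULP) for float32 results. Everything here is an assumption made for the example: the function under test is a made-up Taylor polynomial, `math.sin` rounded to float32 serves as the reference, and none of the names come from the framework.

```python
import math
import random
import struct


def to_f32(x):
    # Round a Python float (double) to the nearest float32 value.
    return struct.unpack("<f", struct.pack("<f", x))[0]


def ordered_bits(x):
    # Map a float32 to an integer so that adjacent floats differ by 1
    # and +0.0 / -0.0 both map to 0.
    (bits,) = struct.unpack("<i", struct.pack("<f", x))
    return bits if bits >= 0 else -(bits & 0x7FFFFFFF)


def ulp_distance(a, b):
    return abs(ordered_bits(a) - ordered_bits(b))


def sin_under_test(x):
    # Hypothetical implementation being checked: a short Taylor polynomial.
    return to_f32(x - x**3 / 6 + x**5 / 120 - x**7 / 5040)


def max_ulp_error(inputs):
    worst = 0
    for x in inputs:
        expected = to_f32(math.sin(x))
        worst = max(worst, ulp_distance(sin_under_test(x), expected))
    return worst


rng = random.Random(0)  # fixed seed so runs are repeatable
samples = [to_f32(rng.uniform(-1.0, 1.0)) for _ in range(10_000)]
print("max ULP error on random samples:", max_ulp_error(samples))
```

An exhaustive test would replace the random samples with every representable input in the range of interest and compare the worst-case ULP error against the bound allowed by the chosen standard.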
2025-07-29 11:08:27 -05:00