544175 Commits

Author SHA1 Message Date
mingmingl
93fd3ebf9f resolve comments 2025-07-25 13:15:47 -07:00
Mingming Liu
bfc2b572d8
Apply suggestion from @paschalis-mpeis
Co-authored-by: Paschalis Mpeis <paschalis.mpeis@arm.com>
2025-07-25 11:27:51 -07:00
mingmingl
90046d700e incorporate code review comments 2025-07-24 12:06:28 -07:00
mingmingl
0b18c2dd48 run 'git clang format' 2025-07-10 10:22:09 -07:00
mingmingl
a55b23be68 run 'git merge main' and resolve conflicts 2025-07-10 09:26:31 -07:00
Pengcheng Wang
b57df56b48
[RISCV] Add UnsupportedSchedXXX for vendor extensions package (#147666)
There will be more schedule definitions for vendor extensions and
we need to add these `UnsupportedSchedXXX` to existing models every
time we add new schedule definitions.

The fact is that each vendor will barely implement other vendors'
extensions, so we can package these definitions into one.
2025-07-10 14:15:22 +08:00
David Green
10f782456e
[AArch64] Enable other cost kinds for getCmpSelInstrCost. (#144375)
This removes the CostKind == TCK_RecipThroughput limitation from
getCmpSelInstrCost, allowing it to return more accurate costs for CodeSize and
Lat / SizeLat. Especially for larger vectors under CodeSize, the returned costs
are currently 1, not the legalization cost.
2025-07-10 07:12:21 +01:00
Timm Baeder
36cbd43ae8
[clang][bytecode] Check new/delete mismatch earlier (#147732)
This fixes a mismatch in diagnostic output with the current interpreter.
2025-07-10 07:33:33 +02:00
Stanislav Mekhanoshin
00a85e5704
[AMDGPU] gfx1250: MC support for 64-bit literals (#147861) 2025-07-09 22:25:47 -07:00
Jim Lin
69ff853729 [RISCV] Move the intrinsic tests for vfwmaccbf16 to zvfbfwma directory. NFC.
A follow-up commit for #147644.
2025-07-10 13:04:27 +08:00
Stanislav Mekhanoshin
fd894f6e9e
[AMDGPU] gfx1250 MC support for v_mov_b64 (#147859)
It is incomplete in terms of the DPP diagnostics, which is a much
more involved change.
2025-07-09 21:31:27 -07:00
Matt Arsenault
617af3cc50
AArch64: Base MCAsmInfo type on binary format before OS (#147875)
Fixes asserting with windows-elf triples. Should fix regression
reported in https://github.com/llvm/llvm-project/pull/147225#issuecomment-3054190938

I'm not sure this is a valid triple, but I'm guessing the MCJIT usage
is an accident. This does change the behavior from trying to use WinEH
to DwarfCFI; however the backend crashes with WinEH so I'm assuming following
ELF is the more correct option.
2025-07-10 13:06:14 +09:00
darkbuck
378e9bb7e0
[cir-translate] Fix crash issue where the data layout string is missing (#147209)
- Targets like 'aarch64' or 'arm' only populate the data layout string
after the constructor. Need to call 'CreateTargetInfo' to set them up
properly.
2025-07-09 23:26:15 -04:00
Peter Collingbourne
c4f18d6874 remote-exec: Only copy command line arguments which name files that exist.
Speculative fix for failing buildbot:
https://lab.llvm.org/buildbot/#/builders/193/builds/8961
2025-07-09 20:24:40 -07:00
Craig Topper
831b198c65
[RISCV][Docs] Add bfloat types to RISCVVectorExtension.rst. NFC (#147867) 2025-07-09 20:15:48 -07:00
Boyao Wang
697beb3f17
[TargetLowering] Change getOptimalMemOpType and findOptimalMemOpLowering to take LLVM Context (#147664)
Add LLVM Context to getOptimalMemOpType and findOptimalMemOpLowering so
that we can use EVT::getVectorVT to generate the EVT type in
getOptimalMemOpType.

Related to [#146673](https://github.com/llvm/llvm-project/pull/146673).
2025-07-10 11:11:09 +08:00
Marco Vitale
c86c815fc5
[Sema] Fix lifetime extension for temporaries in range-based for loops in C++23 (#145164)
C++23 mandates that temporaries used in range-based for loops are
lifetime-extended to cover the full loop. This patch adds a check for
loop variables and compiler-generated `__range` bindings to apply the
correct extension.

Includes test cases based on examples from CWG900/P2644R1.

Fixes https://github.com/llvm/llvm-project/issues/109793
2025-07-10 09:57:07 +08:00
Peter Collingbourne
f1c4df5b7b builtins: Speculative MSVC fix.
Attempt to fix these build failures:
https://lab.llvm.org/buildbot/#/builders/107/builds/12601

The suspected cause is that #133530 caused us to start
passing -std:c11 to MSVC, which activated this code path
that uses _Complex, which MSVC does not support. See:
https://learn.microsoft.com/en-us/cpp/c-runtime-library/complex-math-support

Fix it by also checking _MSC_VER.
2025-07-09 18:32:41 -07:00
Fangrui Song
1ae99f5894 [msan] Fix -Wunused-but-set-variable after #147839 2025-07-09 18:14:19 -07:00
Jim Lin
2eab6f9bb2 [RISCV] Move the intrinsic tests for vfwcvtbf16 and vfncvtbf16 to zvfbfmin directory. NFC.
A follow-up commit for #147644.
2025-07-10 09:13:22 +08:00
Jim Lin
84eeb23484
[RISCV] Implement intrinsics for XAndesVSIntLoad (#147767)
This patch implements clang intrinsic support for XAndesVSIntLoad.

The document for the intrinsics can be found at:
https://github.com/andestech/andes-vector-intrinsic-doc/blob/ast-v5_4_0-release-v5/auto-generated/andes-v5/intrinsic_funcs/04_andes_vector_int4_load_extension.adoc

Co-authored-by: Lino Hsing-Yu Peng <linopeng@andestech.com>
2025-07-10 09:11:29 +08:00
A. Jiang
e8a50a2568
[libc++][docs] Update paper & LWG issue lists after 2025-06 meeting (#147668)
CWG papers requiring library support are also listed.
2025-07-10 08:56:36 +08:00
Thurston Dang
7c66099545
[msan] Simplify 'maskedCheckAVXIndexShadow' (#147839)
The current instrumentation has more or and element extraction than a
coal mine:

```
[[TMP10:%.*]] = extractelement <16 x i32> [[TMP9]], i64 0
[[TMP11:%.*]] = and i32 [[TMP10]], 15
[[TMP43:%.*]] = or i32 [[TMP10]], [[TMP11]]
[[TMP12:%.*]] = extractelement <16 x i32> [[TMP9]], i64 1
[[TMP13:%.*]] = and i32 [[TMP12]], 15
[[TMP44:%.*]] = or i32 [[TMP12]], [[TMP13]]
    ...
[[TMP40:%.*]] = extractelement <16 x i32> [[TMP9]], i64 15
[[TMP41:%.*]] = and i32 [[TMP40]], 15
[[TMP57:%.*]] = or i32 [[TMP40]], [[TMP41]]
[[_MSCMP:%.*]] = icmp ne i32 [[TMP57]], 0
br i1 [[_MSCMP]], label [[TMP102:%.*]], label [[TMP103:%.*]], !prof [[PROF1]]
```

Simplify it to:

```
[[TMP10:%.*]] = trunc <16 x i32> [[T]] to <16 x i4>
[[TMP12:%.*]] = bitcast <16 x i4> [[TMP10]] to i64
[[_MSCMP:%.*]] = icmp ne i64 [[TMP12]], 0
br i1 [[_MSCMP]], label %[[BB13:.*]], label %[[BB14:.*]], !prof [[PROF1]]
```
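
One way to read the simplification is as packing the low 4 bits of each of the 16 lanes into a single 64-bit integer and testing that integer against zero once. The following is a minimal scalar sketch of that idea, assuming the property being checked is "some lane has a nonzero bit among its low 4 bits". The names `Lanes`, `anyLowNibbleSetPerLane` and `anyLowNibbleSetPacked` are hypothetical and are not taken from the MSan sources.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Hypothetical scalar model of a <16 x i32> value.
using Lanes = std::array<std::uint32_t, 16>;

// Per-lane form: mask each lane to its low 4 bits and OR the results together.
inline bool anyLowNibbleSetPerLane(const Lanes &v) {
  std::uint32_t acc = 0;
  for (std::uint32_t lane : v)
    acc |= lane & 15u;
  return acc != 0;
}

// Packed form: truncate each lane to 4 bits, place the 16 nibbles side by
// side in one 64-bit integer, and compare once against zero. This models
// the trunc + bitcast + icmp ne sequence.
inline bool anyLowNibbleSetPacked(const Lanes &v) {
  std::uint64_t packed = 0;
  for (std::size_t i = 0; i < v.size(); ++i)
    packed |= static_cast<std::uint64_t>(v[i] & 15u) << (4 * i);
  return packed != 0;
}
```

The order in which the nibbles are packed does not matter for a comparison against zero, so the sketch does not try to model endianness.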
2025-07-09 17:56:16 -07:00
Chao Chen
75524dee18
[mlir][xegpu] Relax rank restriction of TensorDescType (#145916) 2025-07-09 19:40:24 -05:00
Jake Egan
d286540734
[sanitizer_common] Introduce SANITIZER_MMAP_BEGIN macro (#147645)
To prepare for other platforms, such as 64-bit AIX, that have a non-zero
mmap beginning address.

---------

Co-authored-by: David Justo <david.justo.1996@gmail.com>
2025-07-09 20:14:23 -04:00
Wenju He
28aa5a64ef
[libclc] Declare workitem built-ins in clc, move ptx-nvidiacl workitem built-ins into clc (#144333)
Changes in this PR:
* Declare most of workitem functions in clc and opencl folders.
* Call clc workitem function in corresponding OpenCL workitem function.
* Move ptx-nvidiacl workitem built-in implementations into clc.
* Move a few amdgcn workitem built-in implementations into clc.
* Include only needed headers in OpenCL workitem functions.
* Implement get_local_linear_id, get_max_sub_group_size,
get_num_sub_groups, get_sub_group_id, get_sub_group_local_id,
get_sub_group_size for ptx-nvidiacl.

llvm-diff shows this PR adds a few new symbols to nvptx64--nvidiacl.bc.
llvm-diff shows no change to amdgcn--amdhsa.bc, nvptx--.bc and
nvptx64--.bc.
2025-07-10 08:04:16 +08:00
Vincent Lee
03b0ae8da8
[mlgo-utils] Create symlinked entrypoints in root directory (#146981)
These scripts belong in the `mlgo-utils` directory when directly used
with python3. But since they are also used to package with pip, symlink
the entrypoint scripts to mlgo-utils directory. Adjust the bazel paths
to account for this as well. This loosely follows the same structure as lit.

Verified that I was also able to build the package successfully and use
the script.
2025-07-09 16:57:20 -07:00
sribee8
d5436b0b95
[libc] wcslcat implementation (#146588)
Implemented wcslcat and tests.

---------

Co-authored-by: Sriya Pratipati <sriyap@google.com>
2025-07-09 23:54:03 +00:00
Alexey Bataev
ac4a38e9bd
[SLP] Emit reduction instead of 2 extracts + scalar op, when vectorizing operands (#147583)
Added emission of the 2-element reduction instead of 2 extracts + scalar
op, when trying to vectorize operands of the instruction, if it is more
profitable.
2025-07-09 19:52:09 -04:00
Mingming Liu
20daa73a09
[NFC]Codestyle changes for SampleFDO library (#147840)
* Introduce an error code for illegal_line_offset in sampleprof_error
namespace, and use it for line offset parsing error.
* Add `const` for `LineLocation::serialize`.
* Use structured binding, make_first/second_range in loops.
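
As a rough illustration of the structured-binding style mentioned in the last bullet, here is a minimal sketch using only the standard library (C++17). `CountMap` and `totalCount` are hypothetical names and do not come from the SampleFDO library. The LLVM range helpers are not shown.

```cpp
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-in for a map from function name to sample count.
using CountMap = std::map<std::string, std::uint64_t>;

// Sums all counts. The structured binding names the key and the value
// directly instead of going through Entry.first / Entry.second.
inline std::uint64_t totalCount(const CountMap &counts) {
  std::uint64_t total = 0;
  for (const auto &[name, count] : counts) {
    (void)name; // the key is unused in this sketch
    total += count;
  }
  return total;
}
```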

I'm working on a [sample-profile format
change](https://github.com/llvm/llvm-project/compare/users/mingmingl-llvm/samplefdo-profile-format)
to extend SampleFDO profile with vtable profiles. And this change splits
the non-functional changes.
2025-07-09 16:48:17 -07:00
Kazu Hirata
cd65f8bf17 [mlir] Fix a warning
This patch fixes:

  mlir/lib/Dialect/Vector/Transforms/LowerVectorToFromElementsToShuffleTree.cpp:42:20:
  error: unused variable 'kIndScale' [-Werror,-Wunused-const-variable]
2025-07-09 16:45:18 -07:00
Peter Collingbourne
a37f0a00a2 gn build: Port db03408b2445 2025-07-09 16:32:13 -07:00
Craig Topper
574b66f241
[RISCV] Use Selection::haveNoCommonBitsSet in RISCVDAGToDAGISel::orDisjoint. (#147838) 2025-07-09 16:18:51 -07:00
Peter Collingbourne
5b1db59fb8
compiler-rt: Introduce runtime functions for emulated PAC.
The emulated PAC runtime functions emulate the ARMv8.3a pointer
authentication instructions and are intended for use in heterogeneous
testing environments. For more information, see the associated RFC:
https://discourse.llvm.org/t/rfc-emulated-pac/85557

Reviewers: llvm-beanz, petrhosek

Pull Request: https://github.com/llvm/llvm-project/pull/133530
2025-07-09 16:18:37 -07:00
Craig Topper
20a68c6179
[RISCV] Remove BREV8 and ORC_B from hasAllNBitUsers in RISCVOptWInstrs. (#147830)
These instructions operate on bytes so we need to round the demanded
bits up to the nearest byte which we aren't doing. I think we forgot to
update this when we changed from hasAllWUsers to hasNBitUsers.
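
To see why the demanded bits would need rounding up to a byte boundary, consider a software model of a per-byte bit reversal: the low 4 bits of the result come from bits 4 to 7 of the input, so a user that demands only 4 bits of the result still depends on the whole low byte of the input. The sketch below is an illustration under that reading. `brev8` and `roundUpToByte` are hypothetical helpers, not code from RISCVOptWInstrs.

```cpp
#include <cstdint>

// Reverses the bit order inside each byte of x (a software model of a
// per-byte bit reversal). Every step swaps bits within a byte only.
inline std::uint64_t brev8(std::uint64_t x) {
  x = ((x & 0x5555555555555555ULL) << 1) | ((x >> 1) & 0x5555555555555555ULL);
  x = ((x & 0x3333333333333333ULL) << 2) | ((x >> 2) & 0x3333333333333333ULL);
  x = ((x & 0x0F0F0F0F0F0F0F0FULL) << 4) | ((x >> 4) & 0x0F0F0F0F0F0F0F0FULL);
  return x;
}

// Rounds a demanded-bit count up to the next multiple of 8.
inline unsigned roundUpToByte(unsigned bits) { return (bits + 7u) & ~7u; }
```

For example, inputs `0x00` and `0x80` agree in their low 4 bits, yet the low 4 bits of their reversed results differ, which is the case the rounding is meant to cover.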

We don't have any test case for these instructions so remove them until
we can put together a test.
2025-07-09 16:18:04 -07:00
Alex Sepkowski
7c16a31aa5
Address a handful of C4146 compiler warnings where literals can be replaced with std::numeric_limits (#147623)
This PR addresses instances of compiler warning C4146 that can be
replaced with std::numeric_limits. Specifically, these are cases where a
literal such as '-1ULL' was used to assign a value to a uint64_t
variable. The intent is much cleaner if we use the appropriate
std::numeric_limits<Type>::max() value for these cases.
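
A minimal sketch of the two spellings, for illustration only. The function names are hypothetical and are not from the patch.

```cpp
#include <cstdint>
#include <limits>

// Old style: negating an unsigned literal wraps around to the maximum value.
// This is well-defined, but it is the pattern that MSVC reports as C4146.
constexpr std::uint64_t sentinelOldStyle() { return -1ULL; }

// Preferred style: states the intent directly.
constexpr std::uint64_t sentinelNewStyle() {
  return std::numeric_limits<std::uint64_t>::max();
}

// Both spellings produce the same value.
static_assert(sentinelOldStyle() == sentinelNewStyle(),
              "both forms yield the maximum uint64_t value");
```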


Addresses #147439
2025-07-09 16:13:28 -07:00
Diego Caballero
ddf9b91f9f
[mlir][Vector] Add vector.shuffle tree transformation (#145740)
This PR adds a new transformation that turns sequences of `vector.to_elements` and `vector.from_elements` into a binary tree of `vector.shuffle` operations.

(Related RFC:
https://discourse.llvm.org/t/rfc-adding-vector-to-elements-op-to-the-vector-dialect/86779).

Example:

```
  %0:4 = vector.to_elements %a : vector<4xf32>
  %1:4 = vector.to_elements %b : vector<4xf32>
  %2:4 = vector.to_elements %c : vector<4xf32>
  %3 = vector.from_elements %0#0, %0#1, %0#2, %0#3,
                            %1#0, %1#1, %1#2, %1#3,
                            %2#0, %2#1, %2#2, %2#3 : vector<12xf32>

==>

  %0 = vector.shuffle %a, %b [0, 1, 2, 3, 4, 5, 6, 7] : vector<4xf32>, vector<4xf32>
  %1 = vector.shuffle %c, %c [0, 1, 2, 3, -1, -1, -1, -1] : vector<4xf32>, vector<4xf32>
  %2 = vector.shuffle %0, %1 [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11] : vector<8xf32>, vector<8xf32>
```

The algorithm leverages the structured extraction/insertion information
of `vector.to_elements` and `vector.from_elements` operations and builds
a set of intervals to determine the vector length that should be used at
each level of the tree to combine the level inputs in pairs.
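
The example above can be modeled in plain C++ to check that the two-level tree reproduces the 12-element concatenation. This is only a sketch of the shuffle semantics as used in the example, where a mask entry of -1 marks a poison lane. `Lane`, `Vec`, `shuffle` and `concatThree` are hypothetical names, and the interval-based length selection of the actual transformation is not modeled.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>
#include <vector>

// A lane is either a value or poison (std::nullopt).
using Lane = std::optional<float>;
using Vec = std::vector<Lane>;

// Models a two-input shuffle: each mask entry indexes the concatenation of
// lhs and rhs, and -1 yields a poison lane.
inline Vec shuffle(const Vec &lhs, const Vec &rhs,
                   const std::vector<int> &mask) {
  Vec out;
  out.reserve(mask.size());
  for (int idx : mask) {
    if (idx < 0) {
      out.push_back(std::nullopt);
      continue;
    }
    const std::size_t i = static_cast<std::size_t>(idx);
    assert(i < lhs.size() + rhs.size() && "mask index out of range");
    out.push_back(i < lhs.size() ? lhs[i] : rhs[i - lhs.size()]);
  }
  return out;
}

// Rebuilds the 12-element result from the example with a two-level tree.
inline Vec concatThree(const Vec &a, const Vec &b, const Vec &c) {
  Vec ab = shuffle(a, b, {0, 1, 2, 3, 4, 5, 6, 7});
  // c is padded to 8 lanes with poison so both inputs of the root match.
  Vec cPadded = shuffle(c, c, {0, 1, 2, 3, -1, -1, -1, -1});
  return shuffle(ab, cPadded, {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11});
}
```

The root mask only selects lanes 0 to 11, so the poison lanes introduced by the padding never reach the final result.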

There are a few improvements that can be implemented in the future, such
as shuffle mask compression to avoid unnecessarily large vector lengths
with poison values, but I decided to keep things "simpler" and spend
more time documenting the different steps of the algorithm so that
people can follow along.
2025-07-09 16:09:53 -07:00
Peter Collingbourne
7f3afab918
Extract SipHash implementation into a header.
This is so that we'll be able to use it in compiler-rt as well.
Dependencies on LLVM Support were removed from the header by restoring
code from the original SipHash implementation.

Reviewers: kuhar, dwblaikie, ahmedbougacha

Reviewed By: dwblaikie

Pull Request: https://github.com/llvm/llvm-project/pull/134197
2025-07-09 16:07:16 -07:00
Bogdan Vetrenko
071e30220d
[libc][NFC] fix comment typo ("documentation") (#147836) 2025-07-09 15:44:10 -07:00
mingmingl
c623675fca [SampleFDO][TypeProf]Support vtable type profiling for ext-binary and text format 2025-07-09 14:59:22 -07:00
sribee8
f1acd69bfe
[libc] Added internal wctype functions (#147798)
Copy-pasted the ctype equivalents.

---------

Co-authored-by: Sriya Pratipati <sriyap@google.com>
2025-07-09 21:58:55 +00:00
Aiden Grossman
b12fcff4ff
[libcxx] Bump Container Runner Version (#147831)
This patch bumps the runner version from v3.222.0 to v3.226.0 as
v3.222.0 is too old at this point to connect to GitHub. This is needed
for the new premerge system given we are directly using this container.
This did not impact the existing libc++ CI as the runner was contained
in a separate container image.
2025-07-09 14:56:11 -07:00
Nilanjana Basu
2fc4a4a9d3
[Driver][SamplePGO] Enable -fsample-profile-use-profi (#146795)
Since profile inference improves sample coverage, it should be turned on by default.
2025-07-09 14:53:28 -07:00
Alex Langford
9337594e33
[Support] Don't re-raise signals sent from kernel (#145759)
When an llvm tool crashes (e.g. from a segmentation fault),
SignalHandler will re-raise the signal. The effect is that crash reports
now contain SignalHandler in the stack trace. The crash reports are
still useful, but the presence of SignalHandler can confuse tooling and
automation that deduplicate or analyze crash reports.

rdar://150464802
2025-07-09 14:53:15 -07:00
Corentin Jabot
6d00c4297f
[Clang] Do not skip over RequiresExprBodyDecl when creating lambdas (#147764)
When we create a lambda, we would skip over declaration contexts
representing a requires-expression body, which would lead to wrong
lookup.

Note that I wasn't able to establish why the code
in `Sema::createLambdaClosureType` was there to begin with (it's not
exactly recent).

The changes to mangling only ensure the status quo is preserved and do
not attempt to address the known issues of
mangling lambdas in requires clauses.

In particular the itanium mangling is consistent with Clang before this
patch but differs from GCC's.

Fixes #147650
2025-07-10 00:21:09 +03:00
sribee8
16f046281b
[libc] wcslcpy implementation (#146571)
Implemented wcslcpy and tests.

---------

Co-authored-by: Sriya Pratipati <sriyap@google.com>
2025-07-09 21:17:16 +00:00
Michael Buch
9d8058e3b8
[lldb][test] Move std::function from libcxx to generic directory (#147701)
This just moves the test from `libcxx` to `generic`. There are currently
no `std::function` formatters for libstdc++ so I didn't add a test-case
for it.

Split out from https://github.com/llvm/llvm-project/pull/146740
2025-07-09 22:16:59 +01:00
Craig Topper
2fc6c73b39
[LegalizeTypes] Preserve disjoint flag when expanding OR. (#147640) 2025-07-09 14:15:42 -07:00
Andres-Salamanca
7563531fc9
[CIR] Add test for parsing bitfield_info attribute (#147628)
This PR adds a test for parsing the bitfield_info attribute.
Additionally, it updates the `storage_type` and `is_signed` fields to
match the style used in the incubator ASM format guide.
2025-07-09 16:03:47 -05:00
Krzysztof Parzyszek
2546c6d3f7
[flang][OpenMP] Recognize remaining OpenMP 6.0 spellings in parser (#147723)
Parse OpenMP 6.0 spellings for directives that don't use
OmpDirectiveNameParser.
2025-07-09 16:02:24 -05:00