545233 Commits

Author SHA1 Message Date
Wenju He
cf36f49c04
[libclc] Enable clang fp reciprocal in clc_native_divide/recip/rsqrt/tan (#149269)
The pragma adds `arcp` flag to `fdiv` instruction in these functions.
The flag can provide better performance.
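
As a rough illustration of the kind of pragma involved, the sketch below places `#pragma clang fp reciprocal(on)` at the start of a function body so the compiler is allowed to treat the division as a reciprocal multiply. It is plain C++ rather than the OpenCL C used in libclc; the function names and the Clang version guard are assumptions made for illustration and are not taken from the patch.

```cpp
#include <cassert>

// Hypothetical stand-in for a "native" reciprocal helper; not libclc code.
float native_recip_sketch(float x) {
// The version guard is an assumption, used so that other compilers (or older
// Clang releases) simply skip the pragma instead of rejecting it.
#if defined(__clang__) && __clang_major__ >= 18
#pragma clang fp reciprocal(on)
#endif
  assert(x != 0.0f && "sketch does not handle division by zero");
  // With the pragma active, Clang may mark this division `arcp` in LLVM IR.
  return 1.0f / x;
}

// Hypothetical stand-in for a "native" divide helper; not libclc code.
float native_divide_sketch(float a, float b) {
#if defined(__clang__) && __clang_major__ >= 18
#pragma clang fp reciprocal(on)
#endif
  assert(b != 0.0f && "sketch does not handle division by zero");
  // `arcp` permits rewriting a / b as a * (1 / b), trading accuracy for speed.
  return a / b;
}
```

The pragma only grants permission; whether the backend actually emits a reciprocal-based sequence depends on the target.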
2025-07-18 07:50:35 +08:00
Stanislav Mekhanoshin
0b6df5485e
[AMDGPU] Reenable tanh real-true16 run line. NFC. (#149411) 2025-07-17 16:11:25 -07:00
Stanislav Mekhanoshin
c15a50ad22
[AMDGPU] More flatGVS gfx1250 patterns (#149410) 2025-07-17 16:10:59 -07:00
Han-Chung Wang
6ff471883f
[mlir][linalg] Improve linalg.pack consumer fusion. (#148993)
If a dimension is not tiled, it is always valid to fuse the pack op,
even if it has padding semantics, because it always generates a full
slice along the dimension.

If a dimension is tiled and it does not need extra padding, the fusion
is valid.
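
The two rules above can be sketched as a small per-dimension predicate. This is a toy model in plain C++, not the MLIR implementation: the struct, its fields, and the choice to treat every remaining case as not fusable are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical per-dimension summary; the field names are illustrative only.
struct DimFusionInfoSketch {
  bool is_tiled;             // was this dimension tiled by the producer loop?
  bool needs_extra_padding;  // would the pack op have to pad this tile further?
};

// Untiled dimensions always produce a full slice, so fusion is valid there.
// Tiled dimensions are accepted only when no extra padding is required.
bool is_pack_fusion_valid_for_dim(const DimFusionInfoSketch& dim) {
  if (!dim.is_tiled) return true;
  return !dim.needs_extra_padding;
}

// Assumption: fusion is attempted only if every dimension passes the check.
bool is_pack_fusion_valid(const std::vector<DimFusionInfoSketch>& dims) {
  assert(!dims.empty() && "expected at least one dimension");
  return std::all_of(dims.begin(), dims.end(), is_pack_fusion_valid_for_dim);
}
```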

The revision also formats corresponding tests for consistency.

---------

Signed-off-by: hanhanW <hanhan0912@gmail.com>
2025-07-17 16:06:06 -07:00
Michael Buch
8f4deff5d5
[libcxx][fstream][NFC] Make __failed helper lambda a member function (#149390)
This patch makes the `__failed` lambda a member function on `fstream`.
This fixes two LLDB expression evaluation test failures that got
introduced with https://github.com/llvm/llvm-project/pull/147389:
```
16:22:51  ********************
16:22:51  Unresolved Tests (2):
16:22:51    lldb-api :: commands/expression/import-std-module/list-dbg-info-content/TestDbgInfoContentListFromStdModule.py
16:22:51    lldb-api :: commands/expression/import-std-module/list/TestListFromStdModule.py
```

The expression evaluator is asserting in the Clang parser:
```
Assertion failed: (capture_size() == Class->capture_size() && "Wrong number of captures"), function LambdaExpr, file ExprCXX.cpp, line 1277.
PLEASE submit a bug report to https://github.com/llvm/llvm-project/issues/ and include the crash backtrace.
```

Ideally we'd figure out why LLDB is falling over on this lambda. But to
unblock CI for now, make this a member function.

In the long run we should figure out the LLDB bug here so libc++ doesn't
need to care about whether it uses lambdas like this or not.
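
For readers unfamiliar with the refactoring, here is a minimal sketch of turning a local helper lambda into a private member function. The class and its members are hypothetical stand-ins and do not reproduce libc++'s `fstream` internals.

```cpp
#include <cassert>
#include <string>

// Hypothetical stream-like class; not libc++ code.
class FileLikeSketch {
public:
  explicit FileLikeSketch(bool open_ok) : open_ok_(open_ok) {}

  // Before: the failure helper is a lambda capturing `this`.
  bool open_with_lambda() {
    auto failed_lambda = [this]() {
      state_ = "fail";
      return false;
    };
    if (!open_ok_) return failed_lambda();
    state_ = "good";
    return true;
  }

  // After: the same helper is an ordinary private member function,
  // so no closure type appears in the class's debug info.
  bool open_with_member() {
    if (!open_ok_) return failed_member();
    state_ = "good";
    return true;
  }

  const std::string& state() const { return state_; }

private:
  bool failed_member() {
    state_ = "fail";
    return false;
  }

  bool open_ok_;
  std::string state_ = "unset";
};

// Sanity check that both variants behave identically.
inline void check_variants_agree(bool open_ok) {
  FileLikeSketch with_lambda(open_ok), with_member(open_ok);
  bool a = with_lambda.open_with_lambda();
  bool b = with_member.open_with_member();
  assert(a == b);
  assert(with_lambda.state() == with_member.state());
}
```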
2025-07-17 23:50:39 +01:00
Vitaly Buka
100d8f7cc7
[clang][docs] Fix example in SanitizerSpecialCaseList.rst (#149244)
As-is, the example suppresses a buffer overflow in
malloc and leaves the memory leak in place. This can be
confusing.

Fixes #62421.
2025-07-17 15:27:43 -07:00
Kazu Hirata
2da59287aa
[Target] Remove unnecessary casts (NFC) (#149342)
getFunction().getParent() already returns Module *.
2025-07-17 15:24:25 -07:00
Kazu Hirata
2d7ff097f2
[Sema] Remove unnecessary casts (NFC) (#149340)
getArrayIndex(), getArrayRangeStart(), and getArrayRangeEnd() already
return Expr *.
2025-07-17 15:24:18 -07:00
Kazu Hirata
be6893af87
[CodeGen] Remove an unnecessary cast (NFC) (#149339)
getExceptionMode() already returns LangOptions::FPExceptionModeKind.
2025-07-17 15:24:10 -07:00
Kazu Hirata
f48e2bbe98
[AST] Remove an unnecessary cast (NFC) (#149338)
getFinallyStmt() already returns ObjCAtFinallyStmt *.
2025-07-17 15:24:02 -07:00
Kazu Hirata
2a7328daca
[flang] Migrate away from ArrayRef(std::nullopt_t) (#149337)
ArrayRef(std::nullopt_t) has been deprecated.  This patch replaces
std::nullopt with {}.

A subsequent patch will address those places where we need to replace
std::nullopt with mlir::TypeRange{} or mlir::ValueRange{}.
2025-07-17 15:23:55 -07:00
Charitha Saumya
fc3781853b
[mlir][xegpu] Minor fixes in XeGPU subgroup distribution. (#147846)
This PR addresses the following issues.

1. Add the missing attributes when creating a new GPU funcOp in
`MoveFuncBodyToWarpExecuteOnLane0` pattern.
2. Bug fix in LoadNd distribution to make sure LoadOp is the last op in
warpOp region before it is distributed (needed for preserving the memory
op ordering during distribution).
3. Add utility for removing OpOperand or OpResult layout attributes.
2025-07-17 15:13:20 -07:00
Roland McGrath
72a2d8220a
[libc] Convert dlfcn.h to pure YAML (#149362)
Remove the unnecessary .h.def file and move all the macro
definitions directly into dlfcn.yaml.
2025-07-17 15:05:20 -07:00
Deric C.
fae8df2b82
[DirectX] Fix GEP flattening with 0-indexed GEPs on global variables (#149211)
Fixes #149179 

The issue is that `Builder.CreateGEP` does not return a GEP Instruction
or GEP ConstantExpr when the pointer operand is a global variable and all
indices are constant zeroes.

This PR ensures that a GEP instruction is created if `Builder.CreateGEP`
did not return a GEP.
2025-07-17 14:51:53 -07:00
Deric C.
689e95817e
[DirectX] Add a GEP to scalar load/store on globals and remove incorrect assertion (#149191)
Fixes #149180

This PR removes an assertion that triggered on valid IR. It has been
replaced with an if statement that returns early if the conditions are
not correct.

This PR also adds GEPs to scalar loads and stores from/to global
variables.
2025-07-17 14:46:45 -07:00
S. VenkataKeerthy
5d78332e8a
Add llvm-ir2vec.rst to pr-subscribes-mlgo (#149412) 2025-07-17 14:46:24 -07:00
Stanislav Mekhanoshin
25619c406e
[AMDGPU] Remove unused VGLOBAL_Real_AllAddr_gfx12. NFC. (#149398) 2025-07-17 14:45:26 -07:00
Ivan Butygin
6b29ee9d9a
[mlir][amdgpu] Properly handle mismatching memref ranks in amdgpu.gather_to_lds (#149407)
This op doesn't have any rank or indices restrictions on src/dst
memrefs, but was using `SameVariadicOperandSize` which was causing
issues. Also fix some other issues while we are at it.
2025-07-18 00:42:25 +03:00
Michael Buch
b8264293a7
[lldb][test] TestChildDepthTruncation: don't force DWARF
Fixes test on Windows. Same reason as https://github.com/llvm/llvm-project/pull/149322
2025-07-17 22:36:25 +01:00
Changpeng Fang
70046cd2b5
AMDGPU: Remove the dot4 test in insert-delay-alu-wmma-xdl.mir, NFC (#149375)
This is irrelevant, and caused a failure in downstream.

Fixes: SWDEV-544025
2025-07-17 14:26:09 -07:00
Nikolas Klauser
be3d614cc1
[libc++] Fix hash_multi{map,set}::insert (#149290) 2025-07-17 23:23:04 +02:00
Stanislav Mekhanoshin
422a250b0b
[AMDGPU] add tests for Change FLAT SADDR to VADDR form in moveToVALU. NFC. (#149392) 2025-07-17 14:18:52 -07:00
Tomohiro Kashiwada
8de61eb01c
[Support/BLAKE3] quick fix for Cygwin build (#148635)
BLAKE3 1.8.2 (imported in d2ad63a193216d008c8161879a59c5f42e0125cc)
fails to build for the Cygwin target.

see: https://github.com/BLAKE3-team/BLAKE3/issues/494

As a temporary workaround, add `&& !defined(__CYGWIN__)` to BLAKE3
locally.

resolves https://github.com/llvm/llvm-project/issues/148365
2025-07-18 00:16:08 +03:00
Jian Cai
7e220630d2
[mlir][docs] Rename OpTrait to Trait in ODS doc (#148276)
This makes the doc consistent with the code base.
2025-07-17 14:13:28 -07:00
S. VenkataKeerthy
64c7e7efeb
Add tools/llvm-ir2vec to pr-subscribes-mlgo (#149405) 2025-07-17 14:03:21 -07:00
S. VenkataKeerthy
202f30ede1
[IR2Vec][llvm-ir2vec] Add support for reading from stdin (#149213)
Add support for reading LLVM IR from stdin in the llvm-ir2vec tool.

This allows usage of the tool in pipelines where LLVM IR is generated or transformed on-the-fly just like the other llvm tools. Useful in upcoming PRs.
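
A common convention for this kind of support is to treat the path `-` as "read from standard input". The sketch below shows that convention in plain C++; it is not the tool's code, and the function names are hypothetical. The stdin stream is passed in as a parameter so the behavior can be exercised without a real pipe.

```cpp
#include <cassert>
#include <fstream>
#include <istream>
#include <iterator>
#include <sstream>
#include <stdexcept>
#include <string>

// Reads the whole input. If `path` is "-", the data comes from `stdin_stream`
// (normally std::cin); otherwise the named file is opened and read.
std::string read_input_sketch(const std::string& path,
                              std::istream& stdin_stream) {
  assert(!path.empty() && "expected a file name or \"-\"");
  if (path == "-") {
    return std::string(std::istreambuf_iterator<char>(stdin_stream),
                       std::istreambuf_iterator<char>());
  }
  std::ifstream file(path, std::ios::binary);
  if (!file) throw std::runtime_error("cannot open " + path);
  return std::string(std::istreambuf_iterator<char>(file),
                     std::istreambuf_iterator<char>());
}

// Usage sketch: an in-memory stream plays the role of piped stdin.
inline std::string demo_read_from_fake_stdin() {
  std::istringstream fake_stdin("define void @f() { ret void }\n");
  return read_input_sketch("-", fake_stdin);
}
```

In a real program the second argument would simply be `std::cin`.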

(Tracking issue - #141817)
2025-07-17 13:43:53 -07:00
S. VenkataKeerthy
61a45d20cf
[IR2Vec][NFC] Add helper methods for numeric ID mapping in Vocabulary (#149212)
Add helper methods to IR2Vec's Vocabulary class for numeric ID mapping and vocabulary size calculation. These APIs will be useful in triplet generation for `llvm-ir2vec` tool (See #149214). 

(Tracking issue - #141817)
2025-07-17 13:40:51 -07:00
Jianhui Li
aea2d53961
[MLIR][XeGPU] make offsets optional for create_nd_tdesc (#148335) 2025-07-17 15:33:39 -05:00
Tobias Hieta
867ff3001e
Use Parallel xz for test-suite sources. (#149389) 2025-07-17 22:33:27 +02:00
Michael Buch
1e7ec351c4
[lldb] Adjust default target.max-children-depth (#149282)
Deeply nested structs can be noisy, so Apple's LLDB fork sets the
default to `4`:
9c93adbb28/lldb/source/Target/TargetProperties.td (L134-L136)

Thought it would be useful to upstream this. Though happy to pick a
different default or keep it as-is.
2025-07-17 21:24:27 +01:00
Peter Rong
b0c6148584
[DWARFLinker] Use different addresses to distinguish invalid DW_AT_LLVM_stmt_sequence offset (#149376)
It'd be helpful (especially when running `llvm-dwarfdump ... | grep
<invalid_address>`) to distinguish the two different reasons an offset
can be invalid when debugging.
2025-07-17 13:19:26 -07:00
Fraser Cormack
284dd5ba84
[SelectionDAG] Fix misplaced commas in operand bundle errors (#149331) 2025-07-17 21:18:05 +01:00
Eli Friedman
6a60f18997
[clang] Fix potential constant expression checking with constexpr-unknown. (#149227)
071765749a70b22fb62f2efc07a3f242ff5b4c52 improved constexpr-unknown
diagnostics, but potential constant expression checking broke in the
process: we produce diagnostics in more cases. Suppress the diagnostics
as appropriate.

This fix affects -Winvalid-constexpr and the enable_if attribute. (The
-Winvalid-constexpr diagnostic isn't really important right now, but it
will become important if we allow constexpr-unknown with pre-C++23
standards.)

Fixes #149041.  Fixes #149188.
2025-07-17 13:14:34 -07:00
Prabhu Rajasekaran
e8182fb501
[libc] add wctype.h header (#149202)
Add basic configurations to generate wctype.h header file. To begin with
this header file just exposes one function iswalpha.
2025-07-17 13:06:04 -07:00
Florian Mayer
48cd22c566
[NFC] simplify LowerAllowCheckPass::printPipeline (#149374) 2025-07-17 12:54:56 -07:00
Jeremy Kun
a8880265e1
[mlir] Fix CI breakage from https://github.com/llvm/llvm-project/pull/146228 (#149378)
Some platforms print `{anonymous}` instead of the other two forms
accepted by the test regex. This PR just removes the attempt to guess
how the anonymous namespace will be printed.

@Kewen12 is there a way to trigger the particular CIs that failed in
https://github.com/llvm/llvm-project/pull/146228 on this PR?

Co-authored-by: Jeremy Kun <j2kun@users.noreply.github.com>
2025-07-17 21:52:37 +02:00
Jake Egan
4e6b843cf5
[asan] Revert global check for non-AIX (#149245)
287b24e1899eb6ce62eb9daef5a24faae5e66c1e moved the
`GetGlobalAddressInformation` call earlier, but this broke a chromium
test, so make this workaround for AIX only.
2025-07-17 15:50:44 -04:00
Shilei Tian
7e105fbdbe
[AMDGPU] Add support for v_tanh_f32 on gfx1250 (#149360)
Co-authored-by: Mekhanoshin, Stanislav <Stanislav.Mekhanoshin@amd.com>
2025-07-17 15:42:35 -04:00
Peter Collingbourne
e68efed71b
Fix more compiler-rt tests after #149015. 2025-07-17 12:35:18 -07:00
S. VenkataKeerthy
f2956173ae
[IR2Vec] Adding documentation for llvm-ir2vec tool (#148719)
Tracking issues - #141817, #141834
2025-07-17 12:09:50 -07:00
S. VenkataKeerthy
70e2319e9a
[IR2Vec] Add embeddings mode to llvm-ir2vec tool (#147844)
Add embedding generation functionality to the llvm-ir2vec tool, complementing the existing triplet generation mode.

This change completes the IR2Vec tool by adding the embedding generation functionality, which was previously mentioned as a TODO item. The tool now supports both triplet generation for vocabulary training and embedding generation using a trained vocabulary.
2025-07-17 12:06:52 -07:00
S. VenkataKeerthy
d994487db7
[IR2Vec] Add llvm-ir2vec tool for generating triplet embeddings (#147842)
Add a new LLVM tool `llvm-ir2vec`. This tool is primarily intended to generate triplets for training the vocabulary (#141834) and to potentially generate the embeddings in a standalone manner.

This PR introduces the tool with triplet generation functionality. In the upcoming PRs I'll add scripts under `utils/mlgo` to complete the vocabulary tooling. #147844 adds embedding generation logic to the tool.

(Tracking issue - #141817)
2025-07-17 12:03:56 -07:00
Shilei Tian
fd5fc76c91
[AMDGPU] Add support for v_cos_bf16 on gfx1250 (#149355)
Co-authored-by: Mekhanoshin, Stanislav <Stanislav.Mekhanoshin@amd.com>
2025-07-17 14:43:34 -04:00
Krzysztof Parzyszek
73d4cea68c
[flang][OpenMP] Generalize isOpenMPPrivatizingConstruct (#148654)
Instead of treating all block and all loop constructs as privatizing,
actually check if the construct allows a privatizing clause.
2025-07-17 13:41:04 -05:00
Peter Collingbourne
2c0c87be12
Speculative buildbot fix. 2025-07-17 11:28:36 -07:00
Eugene Epshteyn
413e71b700
[flang] Main program symbol no longer conflicts with the other symbols (#149169)
The following code is now accepted:
```
module m
end
program m
use m
end
```
The PROGRAM name doesn't really have an effect on the compilation
result, so it shouldn't result in symbol name conflicts.

This change makes the main program symbol name all uppercase in the
cooked character stream. This makes it distinct from all other symbol
names, which are all lowercase in the cooked character stream.
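
The idea can be sketched with a toy symbol table: ordinary names are keyed in lowercase, the main program name is keyed in uppercase, so the two can never collide. This is an illustrative model in plain C++ with hypothetical function names, not flang's actual name-resolution code.

```cpp
#include <cassert>
#include <cctype>
#include <set>
#include <string>

// Ordinary symbols are stored in lowercase (Fortran is case-insensitive).
std::string ordinary_symbol_key(std::string name) {
  assert(!name.empty() && "expected a non-empty symbol name");
  for (char& c : name)
    c = static_cast<char>(std::tolower(static_cast<unsigned char>(c)));
  return name;
}

// The main program name is stored in uppercase, which keeps it distinct
// from every lowercase key.
std::string main_program_key(std::string name) {
  assert(!name.empty() && "expected a non-empty program name");
  for (char& c : name)
    c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
  return name;
}

// Returns true if the key was new, false if it conflicts with an existing one.
bool declare_symbol(std::set<std::string>& scope, const std::string& key) {
  return scope.insert(key).second;
}
```

With this scheme, declaring module `m` and then program `m` in the same scope inserts the keys `m` and `M`, so no conflict is reported.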

Also modify the tests that were checking for a lowercase main program name.
2025-07-17 14:18:21 -04:00
Andy Kaylor
afff28e4cb
[CI][Github] Enable CIR CI build and test (#147430)
This change modifies CI scripts to add a pseudo-project for CIR and
detect when CIR-specific files are modified. It also enables building
clang with CIR enabled whenever both the clang and mlir projects are
being built.

Building and testing CIR is only enabled on Linux at this time, as CIR
doesn't properly support Windows or macOS yet.
2025-07-17 11:17:52 -07:00
Peter Collingbourne
3fa07ed5b3
Rename config.host_os to config.target_os.
config.host_os is derived from CMAKE_SYSTEM_NAME
which specifies the target. See:
https://cmake.org/cmake/help/latest/variable/CMAKE_SYSTEM_NAME.html

To reduce confusion, rename it to config.target_os.

The variable name config.target_os was already being used by the Orc
tests. Rename it to config.orc_test_target_os with a FIXME to remove.

Reviewers: JDevlieghere, MaskRay

Reviewed By: MaskRay

Pull Request: https://github.com/llvm/llvm-project/pull/149015
2025-07-17 11:12:29 -07:00
Alex MacLean
f480e1b825
[NVPTX] Add PRMT constant folding and cleanup usage of PRMT node (#148906) 2025-07-17 11:10:23 -07:00
Andrzej Warzyński
3b11aaaf94
[mlir][linalg] Add support for scalable vectorization of linalg.mmt4d (#146531)
This patch adds support for scalable vectorization of linalg.mmt4d. The
key design change is the introduction of a new vectorizer state variable:

* `assumeDynamicDimsMatchVecSizes`

...along with the corresponding Transform dialect attribute:

* `assume_dynamic_dims_match_vec_sizes`.

This flag instructs the vectorizer to assume that dynamic memref/tensor
dimensions match the corresponding vector sizes (fixed or scalable). With this
assumption, masking becomes unnecessary, which simplifies the lowering pipeline
significantly.

While this assumption is not universally valid, it typically holds for
`linalg.mmt4d`. Inputs and outputs are explicitly packed using `linalg.pack`,
and this packing includes padding, ensuring that dimension sizes align with
vector sizes (*).

* Related discussion: https://github.com/llvm/llvm-project/issues/143920

An upcoming patch will include an end-to-end test that leverages scalable
vectorization of linalg.mmt4d to demonstrate the newly enabled functionality.
This would not be feasible without the changes introduced here, as it would
otherwise require additional logic to handle complex - but ultimately redundant
- masks.

(*) This holds provided that the tile sizes used for packing match the vector
sizes used during vectorization. It is the user’s responsibility to enforce
this.
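
To make the effect of the flag concrete, the following toy model decides whether a mask is needed for a given shape. It is a simplified sketch in plain C++, not the vectorizer's logic: the `kDynamicSketch` marker and the function name are hypothetical, and scalable sizes are not modeled separately from fixed ones.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical marker for a dimension whose size is only known at runtime.
constexpr std::int64_t kDynamicSketch = -1;

// A mask is needed when some dimension may not fill its vector exactly.
// Static dimensions are compared directly; dynamic dimensions need a mask
// unless the caller asserts that they match the vector sizes.
bool needs_mask_sketch(const std::vector<std::int64_t>& dims,
                       const std::vector<std::int64_t>& vec_sizes,
                       bool assume_dynamic_dims_match_vec_sizes) {
  assert(dims.size() == vec_sizes.size() && "rank mismatch");
  for (std::size_t i = 0; i < dims.size(); ++i) {
    if (dims[i] == kDynamicSketch) {
      if (!assume_dynamic_dims_match_vec_sizes) return true;
    } else if (dims[i] != vec_sizes[i]) {
      return true;
    }
  }
  return false;
}
```

Note that the assumption only covers dynamic dimensions in this sketch; a static dimension that differs from its vector size still requires a mask.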
2025-07-17 19:02:08 +01:00