548870 Commits

Author SHA1 Message Date
Wenju He
76bb98746b
[NFC][libclc] add missing __CLC_ prefix all internal macros (#153523)
This unifies naming scheme of macros to address review comment
https://github.com/intel/llvm/pull/19779#discussion_r2272194357

math constant value macros are not changed, e.g.
`#define AU0 -9.86494292470009928597e-03`
2025-08-18 07:21:04 +08:00
Fangrui Song
34c7b7ccae MCSymbol: Remove setUndefined
The name is misleading, as setting Fragment to nullptr does not
necessarily make it undefined - common and equated symbols have
a nullptr fragment as well.
2025-08-17 15:57:27 -07:00
Wenju He
bce14c69db
[libclc] Fix out-of-bound value for workitem functions according to OpenCL spec (#153784) 2025-08-18 06:51:01 +08:00
Abhinav Gaba
12769aa728
[Offload] Introduce ATTACH map-type support for pointer attachment. (#149036)
This patch introduces libomptarget support for the ATTACH map-type,
which can be used to implement OpenMP-compliant conditional pointer
attachment, based on whether the pointer/pointee is newly mapped on a
given construct.

For example, for the following:

```c
  int *p;
  #pragma omp target enter data map(p[1:10])
```

The following maps can be emitted by clang:
```
  (A)
  &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM
  &p, &p[1], sizeof(p), ATTACH
```

Without this map-type, these two possible maps could be emitted by
clang:
```
  (B)
  &p[0], &p[1], 10 * sizeof(p[1]), TO | FROM

  (C)
  &p, &p[1], 10 * sizeof(p[1]), TO | FROM | PTR_AND_OBJ
```

(B) does not perform any pointer attachment, while (C) also maps the
pointer p. Both behaviors are incorrect.

In terms of implementation, maps with the ATTACH map-type are handled
after all other maps have been processed, as it requires knowledge of
which new allocations happened as part of the construct. As per OpenMP
5.0, an attachment should happen only when either the pointer or the
pointee was newly mapped while handling the construct.

Maps with ATTACH map-type-bit do not increase/decrease the ref-count.

With OpenMP 6.1, `attach(always/never)` can be used to force/prevent
attachment. For `attach(always)`, the compiler will insert the ALWAYS
map-type, which would let libomptarget bypass the check about one of the
pointer/pointee being new. With `attach(never)`, the ATTACH map will not
be emitted at all.
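
The attachment rule described above can be sketched as a small predicate. This is only an illustration: the flag values, the `AttachEntry` struct, and `shouldAttach` are hypothetical names and do not reflect libomptarget's actual data structures.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical flag bits, for illustration only (not libomptarget's constants).
enum MapFlags : std::uint64_t {
  MAP_ALWAYS = 1u << 0,
  MAP_ATTACH = 1u << 1,
};

// Hypothetical record for one ATTACH map, examined after all other maps
// of the construct have been processed.
struct AttachEntry {
  std::uint64_t Flags;
  bool PointerNewlyMapped; // pointer was newly mapped on this construct
  bool PointeeNewlyMapped; // pointee was newly mapped on this construct
};

// Attach when ALWAYS is set, or when the pointer or the pointee is new.
bool shouldAttach(const AttachEntry &E) {
  if (!(E.Flags & MAP_ATTACH))
    return false;
  if (E.Flags & MAP_ALWAYS)
    return true;
  return E.PointerNewlyMapped || E.PointeeNewlyMapped;
}

void attachExamples() {
  assert(shouldAttach({MAP_ATTACH, false, true}));
  assert(!shouldAttach({MAP_ATTACH, false, false}));
  assert(shouldAttach({MAP_ATTACH | MAP_ALWAYS, false, false}));
}
```

`attach(never)` needs no case here because, as stated above, no ATTACH map is emitted for it.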

The size argument of the ATTACH map-type can specify values greater than
`sizeof(void*)` which can be used to support pointer attachment on
Fortran descriptors. Note that shadow-pointer tracking must support
such sizes as well. That has not been implemented in this
patch.

This was worked upon in coordination with Ravi Narayanaswamy, who has
since retired. Happy retirement, Ravi!

---------

Co-authored-by: Alex Duran <alejandro.duran@intel.com>
2025-08-17 15:17:04 -07:00
Baranov Victor
dff8dac9dc
[clang-tidy][docs] Add description of "clang-diagnostic-error" (#153870)
This helps better distinguish warnings that could be disabled via
`.clang-tidy` config (like `clang-diagnostic-literal-conversion`) from
errors that could not be suppressed at all (like
`clang-diagnostic-error`) because it's a hard compiler error.
2025-08-18 00:18:32 +03:00
Shenghang Tsai
7610b13729
[MLIR] Split ExecutionEngine Initialization out of ctor into an explicit method call (#153524)
Retry landing https://github.com/llvm/llvm-project/pull/153373
## Major changes from previous attempt
- remove the test in CAPI because no existing tests in CAPI deal with
sanitizer exemptions
- update `mlir/docs/Dialects/GPU.md` to reflect the new behavior: load
GPU binary in global ctors, instead of loading them at call site.
- skip the test on AArch64 since we have an issue with initialization there

---------

Co-authored-by: Mehdi Amini <joker.eph@gmail.com>
2025-08-17 23:07:24 +02:00
Mohamed Emad
40833eea21
Reland "[libc][math][c23] Implement C23 math function asinpif16" (#152690)
Relands #146226, fixing the asinpi MPFR number function and making it
work when mpfr < `4.2.0`.
2025-08-18 00:04:47 +03:00
Florian Hahn
5892a2beec
[VPlan] Remove dead code from GetBroadCastInstr (NFCI).
All relevant places should already explicitly materialize broadcasts.
Remove dead code from VPTransformState::get.
2025-08-17 21:51:14 +01:00
Sergei Barannikov
6947fb4556 [TableGen] Use structured binding in one place (NFC) 2025-08-17 23:50:23 +03:00
Sergei Barannikov
a10773c864
[TableGen][DecoderEmitter] Remove EncodingIDAndOpcode struct (NFC) (#154028)
Most of the time we don't need instruction opcode. There is no need to
carry it around all the time, we can easily get it by other means.
Rename affected variables accordingly.

Part of an effort to simplify DecoderEmitter code.
2025-08-17 20:13:48 +00:00
owenca
6cfedea492
[clang-format] Add SpaceInEmptyBraces option (#153765)
Also set it to SIEB_Always for WebKit style.

Closes #85525.
Closes #93635.
2025-08-17 12:56:22 -07:00
owenca
a21d17f1d7
[clang-format] Fix a bug in breaking before FunctionDeclarationName (#153924)
Fixes #153891
2025-08-17 12:54:48 -07:00
owenca
5e57a10f50
[clang-format] Allow breaking before bit-field colons (#153529)
Fixes #153448
2025-08-17 12:54:23 -07:00
owenca
9a692e0f94
[clang-format] Don't annotate class property specifiers as StartOfName (#153525)
Fixes #153443
2025-08-17 12:53:57 -07:00
Adam Nemet
350cb989b8
[X86] Explicitly widen larger than v4f16 to the legal v8f16 (NFC) (#153839)
This patch makes the current behavior explicit to prepare for adding VTs
for v[567]f16.

Right now these types are EVTs and hence don't fall under
getPreferredVectorAction and are simply widened to the next legal
power-of-two vector type. For SSE2 this is v8f16.

Without the preparatory patch however, the behavior would change after
adding these types. getPreferredVectorAction would try to split them
because this is the current behavior for any f16 vector type that is not
legal.

There is a lot more detail at
https://github.com/llvm/llvm-project/issues/152150 in particular how
splitting these new types leads to an inconsistency between
NumRegistersForVT and getTypeAction.

The patch ensures that after the new types are added they would continue
to be widened rather than split. Once the patch to enable v[567]f16
lands, it will be an NFC for x86.
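
As a rough illustration of the widening described above, the element count is rounded up to the next power of two. `widenedNumElts` is a hypothetical helper, not an LLVM function, and it does not model the "legal type" check that the real legalizer performs.

```cpp
#include <cassert>

// Round an element count up to the next power of two.
// Illustration only: legality of the resulting vector type is not checked.
constexpr unsigned widenedNumElts(unsigned NumElts) {
  unsigned Widened = 1;
  while (Widened < NumElts)
    Widened <<= 1;
  return Widened;
}

void widenExamples() {
  // v5f16, v6f16 and v7f16 all widen to 8 elements (v8f16).
  assert(widenedNumElts(5) == 8);
  assert(widenedNumElts(6) == 8);
  assert(widenedNumElts(7) == 8);
}
```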
2025-08-17 19:15:10 +00:00
Andreas Jonson
0561ff6a12
[LVI] Add support for trunc nuw range. (#154021)
Proof: https://alive2.llvm.org/ce/z/a5Yjb8
2025-08-17 20:24:09 +02:00
Veera
e1aa415220
[mlir][InferIntRangeCommon] Fix Division by Zero Crash (#151637)
Fixes #131273

Adds a check to avoid division when max value of denominator is zero.
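
A minimal sketch of this kind of guard, assuming a simple closed unsigned range. `URange` and `inferDivU` are hypothetical names and are not the MLIR `ConstantIntRanges` API; the actual fix may be structured differently.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical closed unsigned range [Min, Max].
struct URange {
  std::uint64_t Min, Max;
};

// Infer the range of LHS / RHS, or std::nullopt when nothing can be inferred.
std::optional<URange> inferDivU(URange LHS, URange RHS) {
  // If the largest possible denominator is zero, the denominator is always
  // zero, so bail out before performing any division.
  if (RHS.Max == 0)
    return std::nullopt;
  // Zero is excluded as a divisor, so the smallest usable divisor is >= 1.
  std::uint64_t MinDivisor = RHS.Min == 0 ? 1 : RHS.Min;
  return URange{LHS.Min / RHS.Max, LHS.Max / MinDivisor};
}

void divExamples() {
  assert(!inferDivU({10, 20}, {0, 0}).has_value());
  std::optional<URange> R = inferDivU({10, 20}, {2, 5});
  assert(R && R->Min == 2 && R->Max == 10);
}
```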
2025-08-17 10:56:34 -07:00
Aiden Grossman
71925a90c8
[libc] Setup hdrgen for ioctl (#153976)
This patch adds some hdrgen yaml for ioctl(). Otherwise the function
never actually ends up being available in a full build. This is the last
thing that is needed to enable turning on LIBCXX_ENABLE_RANDOM_DEVICE.
2025-08-17 08:52:29 -07:00
mdenson
65ffa53cb7
[Clang] unrecognized html tag causing undesirable comment lexing (#152944)
Simple fix for this particular html tag. A more complete solution should
be implemented.

1. Add all html tags to table so they are recognized. Some input on what
is desirable/safe would be appreciated
2. Change the lex strategy to deal with this in a different manner

Fixes #32680

---------

Co-authored-by: Brock Denson <brock.denson@virscient.com>
2025-08-17 15:59:47 +02:00
Erik Davis
a66d8f62e6
[mlir][doc] fixup code block (#153977)
This fixes a small typo in the toy tutorial. A code block was not
correctly terminated, causing it to run into the subsequent block.
2025-08-17 13:01:05 +02:00
Baranov Victor
66a2d1b758
[clang-tidy][NFC] Remove py2 conditions from clang-tidy scripts (#154005) 2025-08-17 13:25:22 +03:00
v1nh1shungry
326d749a36
[clang-tidy] Fix cppcoreguidelines-prefer-member-initializer false positive for inherited members (#153941)
```cpp
struct Base {
  int m;
};

template <class T>
struct Derived : Base {
  Derived() { m = 0; }
};
```

would previously generate the following output:

```
<source>:7:15: warning: 'm' should be initialized in a member initializer of the constructor [cppcoreguidelines-prefer-member-initializer]
    7 |   Derived() { m = 0; }
      |               ^~~~~~
      |             : m(0)
```

This patch fixes this false positive.

Note that before this patch the check did not give a false positive for

```cpp
struct Derived : Base {
  Derived() { m = 0; }
};
```

and the constructor's AST is

```
`-CXXConstructorDecl 0x557df03d1fb0 <line:7:3, col:22> col:3 Derived 'void ()' implicit-inline
    |-CXXCtorInitializer 'Base'
    | `-CXXConstructExpr 0x557df03d2748 <col:3> 'Base' 'void () noexcept'
    `-CompoundStmt 0x557df03d2898 <col:13, col:22>
      `-BinaryOperator 0x557df03d2878 <col:15, col:19> 'int' lvalue '='
        |-MemberExpr 0x557df03d2828 <col:15> 'int' lvalue ->m 0x557df03d1c40
        | `-ImplicitCastExpr 0x557df03d2808 <col:15> 'Base *' <UncheckedDerivedToBase (Base)>
        |   `-CXXThisExpr 0x557df03d27f8 <col:15> 'Derived *' implicit this
        `-IntegerLiteral 0x557df03d2858 <col:19> 'int' 0
```

so `isAssignmentToMemberOf` would return empty due to


f0967fca04/clang-tools-extra/clang-tidy/cppcoreguidelines/PreferMemberInitializerCheck.cpp (L118-L119)

Fixes #104400
2025-08-17 11:42:38 +02:00
Carlos Galvez
bd77e9acf0
[clang-tidy] Avoid matching nodes in system headers (#151035)
This commit is a re-do of e4a8969e56572371201863594b3a549de2e23f32,
which got reverted, with the same goal: dramatically speed-up clang-tidy
by avoiding doing work in system headers (which is wasteful as warnings
are later discarded). This proposal was already discussed here with
favorable feedback: https://github.com/llvm/llvm-project/pull/132725

The novelty of this patch is:

- It's less aggressive: it does not fiddle with AST traversal. This
solves the issue with the previous patch, which impacted the ability to
inspect parents of a given node.

- Instead, what we optimize for is exiting early in each `Traverse*`
function of `MatchASTVisitor` if the node is in a system header, thus
avoiding calling the `match()` function with its corresponding callback
(when there is a match).

- It does not cause any failing tests.

- It does not move `MatchFinderOptions` - instead we add a user-defined
default constructor which solves the same problem.

- It introduces a function `shouldSkipNode` which can be extended for
adding more conditions. For example there's a PR open about skipping
modules in clang-tidy where this could come in handy:
https://github.com/llvm/llvm-project/pull/145630
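
The early-exit idea from the list above can be sketched with a toy visitor. `Node` and `Visitor` are hypothetical stand-ins for Clang AST nodes and `MatchASTVisitor`; only the name `shouldSkipNode` comes from the description. Whether the real patch also skips the children of a skipped node is an assumption of this sketch (requires C++17).

```cpp
#include <cassert>
#include <vector>

// Hypothetical stand-in for an AST node.
struct Node {
  bool InSystemHeader;
  std::vector<Node> Children;
};

struct Visitor {
  int MatchCalls = 0;

  // Single place to add further skip conditions later.
  static bool shouldSkipNode(const Node &N) { return N.InSystemHeader; }

  // Stand-in for running the matchers and their callbacks on a node.
  void match(const Node &) { ++MatchCalls; }

  void traverse(const Node &N) {
    if (shouldSkipNode(N))
      return; // exit early: no match() call for system-header nodes
    match(N);
    for (const Node &C : N.Children)
      traverse(C);
  }
};

void visitorExample() {
  Node Root{false, {Node{true, {}}, Node{false, {}}}};
  Visitor V;
  V.traverse(Root);
  assert(V.MatchCalls == 2); // root and the one user-code child
}
```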

As a benchmark, I ran clang-tidy with all checks activated, on a single
.cpp file which #includes all the standard C++ headers, then measured
the time as well as the number of warnings found.

On trunk:

```
Suppressed 75413 warnings (75413 in non-user code).

real	0m12.418s
user	0m12.270s
sys	0m0.129s
```

With this patch:

```
Suppressed 11448 warnings (11448 in non-user code).
Use -header-filter=.* to display errors from all non-system headers. Use -system-headers to display errors from system headers as well.

real	0m1.666s
user	0m1.538s
sys	0m0.129s
```

With the original patch that got reverted:

```
Suppressed 11428 warnings (11428 in non-user code).

real	0m1.193s
user	0m1.096s
sys	0m0.096s
```

We therefore get a dramatic reduction in number of warnings and runtime,
with no change in functionality.

The remaining warnings are due to `PPCallbacks` - implementing a similar
system-header exclusion mechanism there can lead to almost no warnings
left in system headers. This does not bring the runtime down as much,
though, so it's probably not worth the effort.

Fixes #52959

Co-authored-by: Carlos Gálvez <carlos.galvez@zenseact.com>
2025-08-17 11:40:48 +02:00
Andreas Jonson
5ae8a9b8ce
[SimplifyCfg] Handle trunc nuw i1 condition in Equality comparison. (#153051)
proof: https://alive2.llvm.org/ce/z/WVt4-F
2025-08-17 09:53:40 +02:00
Timm Baeder
e44784fb44
[clang][bytecode] Fix pseudo dtor calls on non-pointers (#153970)
The isGLValue() check made us ignore expressions we shouldn't ignore.
2025-08-17 08:47:57 +02:00
Sergei Barannikov
ea4325f174
[TableGen][DecoderEmitter] Improve conflicts dump (#154001)
* Print filter stack in non-reversed order.
* Print encoding name to the right of encoding bits to deal with
alignment issues.
* Use the correct bit width when printing encoding bits.

Example of old output:
```
		01000100........
		01000...........
		0100............
		................
	tADDhirr 000000000000000001000100________
	tADDrSP 000000000000000001000100_1101___
	tADDspr 0000000000000000010001001____101
```

New output:
```
    ................
    0100............
    01000...........
    01000100........
    01000100________  tADDhirr
    01000100_1101___  tADDrSP
    010001001____101  tADDspr
```
2025-08-17 06:42:25 +00:00
Sergei Barannikov
05f1673e75 [TableGen] Make a function static (NFC)
Also, modernize the return value to std::optional.
2025-08-17 09:31:28 +03:00
Sergei Barannikov
05827e7ccb [TableGen][DecoderEmitter] Dump conflicts earlier
Dump a conflict as soon as we discover it, no need to wait until
we start building the decoder table.
This improves debugging experience.
2025-08-17 08:20:31 +03:00
Errant
3d83dbb736
[clang] Fix typos in OMPClauseProfiler method names for consistency (#153852) 2025-08-17 07:01:59 +02:00
Sergei Barannikov
fc6024d895
[TableGen][DecoderEmitter] Shrink lifetime of Filters vector (NFC) (#153998)
Only one element of the `Filters` vector (see `BestIndex`) is used
outside the method that fills it. Localize the vector to the method,
replacing the member variable with the only used element.

Part of an effort to simplify DecoderEmitter code.
2025-08-17 04:02:16 +00:00
Owen Pan
ee51f35993 [clang-format][doc] Add OneLineFormatOffRegex to format-off section 2025-08-16 20:59:07 -07:00
Hristo Hristov
f3008c1140
[libc++][flat_set] LWG3751, LWG3774 (#153934)
- LWG3751: Missing feature macro for `flat_set`

Implemented in LLVM21:
7013b51548

Closes  #105021

- LWG3774: `<flat_set>` should include `<compare>`

Implemented in LLVM21:
2f1416bbcd

684797b644/libcxx/include/flat_set (L77)

Closes #105036
2025-08-17 09:52:07 +08:00
knickish
bc3754de0a
[M68k] Add anyext patterns for PCD addressing mode (#150356)
Does what it says on the tin: anyext loads with the PCD addressing mode
were failing addr mode selection, adding the patterns resolved it.
2025-08-16 23:33:47 +00:00
Aiden Grossman
29d49c8a37
[libc] Correct standard for getcpu (#153982) 2025-08-16 16:05:45 -07:00
Aiden Grossman
1f5047e430
[Github] Remove call to llvm-project-tests.yml from spirv-tests.yml
This will eventually allow for removing llvm-project-tests.yml. This
should significantly reduce the complexity of these workflows at the
cost of a little bit of duplication, which is standard for GitHub Actions.

Reviewers: michalpaszkowski, sudonatalie

Reviewed By: sudonatalie

Pull Request: https://github.com/llvm/llvm-project/pull/153869
2025-08-16 15:52:39 -07:00
Fangrui Song
2cedb286b8 MCSymbol: Remove unused IsTarget parameter from declareCommon 2025-08-16 15:47:39 -07:00
Fangrui Song
aa96e20dce MCSymbol: Remove AMDGPU-specific Kind::TargetCommon
The SymContentsTargetCommon kind introduced by
https://reviews.llvm.org/D61493 lacks significance and should be treated
as a regular common symbol with a different section index.

Update ELFObjectWriter to respect the specified section index.
The new representation also works with Hexagon's SHN_HEXAGON_SCOMMON.
2025-08-16 15:39:33 -07:00
Fangrui Song
190778a8ba MCSymbol: Rename SymContents to kind
The names "SymbolContents" and "SymContents*" members are confusing.
Rename to kind and Kind::XXX similar to lld/ELF/Symbols.h

Rename SymContentsVariable to Kind::Equated as the formal term is
"equated symbol", not "variable".
2025-08-16 15:10:35 -07:00
Sergei Barannikov
7bb73455f7
[TableGen][DecoderEmitter] Add helpers for working with scopes (NFC) (#153979)
Part of an effort to simplify DecoderEmitter code.
2025-08-16 21:49:17 +00:00
Shafik Yaghmour
f8740920ee
[Clang][Sema] Check the return value of DiagnoseClassNameShadow in ActOnEnumConstant (#143754)
Static analysis flagged that we were not checking the return value of
DiagnoseClassNameShadow when we did so everywhere else. Modifying this
case to match how other places use it makes sense and does not change
behavior. Likely, if this check fails, later actions will fail as well,
but it is more correct to exit early.
2025-08-16 14:08:39 -07:00
Aiden Grossman
ddae3b74a3 [CI] Show Stats in CI Log
This patch makes utils.sh also print the stats out. This is particularly
useful in postcommit CI where we are currently not saving artifacts.
2025-08-16 20:55:45 +00:00
Florian Hahn
73775a0f27
[LV] Add test for #153946.
Add test for miscompile from
https://github.com/llvm/llvm-project/issues/153946, caused by poison
propagation.
2025-08-16 21:19:20 +01:00
Kazu Hirata
1c8da29f48
[ADT] Use small_buckets() in SmallPtrSetImpl::remove_if (NFC) (#153962) 2025-08-16 13:15:36 -07:00
Leandro Lacerda
75bf739208
[libc][gpu] Disable loop unrolling in the throughput benchmark loop (#153971)
This patch makes GPU throughput benchmark results more comparable across
targets by disabling loop unrolling in the benchmark loop.

Motivation:
* PTX (post-LTO) evidence on NVPTX: for libc `sin`, the generated PTX
shows the `throughput` loop unrolled 8x at `N=128` (one iteration
advances the input pointer by 64 bytes = 8 doubles), interleaving eight
independent chains before the back-edge. This hides latency and
significantly reduces cycles/call as the batch size `N` grows.
* Observed scaling (NVPTX measurements): with unrolling enabled, `sin`
dropped from ~3,100 cycles/call at `N=1` to ~360 at `N=128`. After
enforcing `#pragma clang loop unroll(disable)`, results stabilized
(e.g., from ~3,100 cycles/call at `N=1` to ~2,700 at `N=128`).
* libdevice contrast: the libdevice `sin` path did not exhibit a similar
drop in our measurements, and the PTX appears as compact internal calls
rather than a long FMA chain, leaving less ILP for the outer loop to
extract.

What this change does:
* Applies `#pragma clang loop unroll(disable)` to the GPU `throughput()`
loop in both NVPTX and AMDGPU backends.

Leaving unrolling entirely to the optimizer makes apples-to-apples
comparisons uneven (e.g., libc vs. vendor). Disabling unrolling yields
fairer, more consistent numbers.
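
For reference, a minimal host-side sketch of how the pragma is applied to a loop. `applyAll` and `square` are hypothetical names; the real `throughput()` loop measures cycles on the GPU, which is not modeled here. Compilers other than clang may warn about the unknown pragma and ignore it.

```cpp
#include <cassert>
#include <cstddef>

// Calls Fn on each input and accumulates the results. The pragma asks clang
// not to unroll the loop, so each iteration performs exactly one call.
double applyAll(double (*Fn)(double), const double *In, std::size_t N) {
  double Acc = 0.0;
#pragma clang loop unroll(disable)
  for (std::size_t I = 0; I < N; ++I)
    Acc += Fn(In[I]);
  return Acc;
}

double square(double X) { return X * X; }

void applyAllExample() {
  const double In[] = {1.0, 2.0, 3.0};
  assert(applyAll(square, In, 3) == 14.0); // 1 + 4 + 9
}
```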
2025-08-16 20:14:26 +00:00
Sergei Barannikov
3acb679bda [TableGen] Remove redundant variable (NFC) 2025-08-16 23:11:53 +03:00
Florian Hahn
351d398a37
[VPlan] Run final VPlan simplifications before codegen.
Dissolving the hierarchical VPlan CFG and converting abstract to
concrete recipes can expose additional simplification opportunities.

Do a final run of simplifyRecipes before executing the VPlan.
2025-08-16 18:54:27 +01:00
Fangrui Song
1893caa9bc MCSymbol: Decrease the bitfield size of SymbolContents
Follow-up to 57b0843f68f5f349c73d1bf54e321a1a6d1800bf

The size of MCSymbol has been reduced to 24 bytes on 64-bit systems.
2025-08-16 10:43:05 -07:00
Sergei Barannikov
aa2fe4eb3d
[PowerPC] Remove some unused SDNodes and FastISel workaround (NFC) (#153964)
These nodes have never been used since introduction in 2013/2015.
2025-08-16 17:01:03 +00:00
Matthias Springer
0d8aa9d9ec
[mlir][SparseTensor] Simplify pipeline (#152908)
This refactoring improves compilation time.
2025-08-16 18:45:26 +02:00
Timm Baeder
373206d5e0
[clang][bytecode] Prefer ParmVarDecls as function parameters (#153952)
We might create a local temporary variable for a ParmVarDecl, in which
case a DeclRefExpr for that ParmVarDecl should _still_ result in us
choosing the parameter, not that local.
2025-08-16 17:22:14 +02:00