1230 Commits

Author SHA1 Message Date
schittir
fdfcebb38d
[clang][SYCL] Add sycl_external attribute and restrict emitting device code (#140282)
This patch is part of the upstreaming effort for supporting SYCL
language front end.
It makes the following changes:
1. Adds sycl_external attribute for functions with external linkage,
which is intended for use to implement the SYCL_EXTERNAL macro as
specified by the SYCL 2020 specification
2. Adds checks to avoid emitting device code when sycl_external and
sycl_kernel_entry_point attributes are not enabled
3. Fixes test failures caused by the above changes

This patch is missing diagnostics for the following diagnostics listed
in the SYCL 2020 specification's section 5.10.1, which will be addressed
in a subsequent PR:
Functions that are declared using SYCL_EXTERNAL have the following
additional restrictions beyond those imposed on other device functions:
1. If the SYCL backend does not support the generic address space then
the function cannot use raw pointers as parameter or return types.
Explicit pointer classes must be used instead;
2. The function cannot call group::parallel_for_work_item;
3. The function cannot be called from a parallel_for_work_group scope.

In addition to that, the subsequent PR will also implement diagnostics
for inline functions including virtual functions defined as inline.

---------

Co-authored-by: Mariya Podchishchaeva <mariya.podchishchaeva@intel.com>
2025-08-20 12:37:37 -04:00
Bill Wendling
aa4805a090
[Clang][attr] Add 'cfi_salt' attribute (#141846)
The 'cfi_salt' attribute specifies a string literal that is used as a
"salt" for Control-Flow Integrity (CFI) checks to distinguish between
functions with the same type signature. This attribute can be applied
to function declarations, function definitions, and function pointer
typedefs.

This attribute prevents function pointers from being replaced with
pointers to functions that have a compatible type, which can be a CFI
bypass vector.

The attribute affects type compatibility during compilation and CFI
hash generation during code generation.

  Attribute syntax: [[clang::cfi_salt("<salt_string>")]]
  GNU-style syntax: __attribute__((cfi_salt("<salt_string>")))

- The attribute takes a single string of non-NULL ASCII characters.
- It only applies to function types; using it on a non-function type
  will generate an error.
- All function declarations and the function definition must include
  the attribute and use identical salt values.

Example usage:

  // Header file:
  #define __cfi_salt(S) __attribute__((cfi_salt(S)))

  // Convenient typedefs to avoid nested declarator syntax.
  typedef int (*fp_unsalted_t)(void);
  typedef int (*fp_salted_t)(void) __cfi_salt("pepper");

  struct widget_ops {
    fp_unsalted_t init;     // Regular CFI.
    fp_salted_t exec;       // Salted CFI.
    fp_unsalted_t teardown; // Regular CFI.
  };

  // bar.c file:
  static int bar_init(void) { ... }
  static int bar_salted_exec(void) __cfi_salt("pepper") { ... }
  static int bar_teardown(void) { ... }

  static struct widget_generator _generator = {
    .init = bar_init,
    .exec = bar_salted_exec,
    .teardown = bar_teardown,
  };

  struct widget_generator *widget_gen = _generator;

  // 2nd .c file:
  int generate_a_widget(void) {
    int ret;

    // Called with non-salted CFI.
    ret = widget_gen.init();
    if (ret)
      return ret;

    // Called with salted CFI.
    ret = widget_gen.exec();
    if (ret)
      return ret;

    // Called with non-salted CFI.
    return widget_gen.teardown();
  }

Link: https://github.com/ClangBuiltLinux/linux/issues/1736
Link: https://github.com/KSPP/linux/issues/365

---------

Signed-off-by: Bill Wendling <morbo@google.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2025-08-14 13:07:38 -07:00
Matheus Izvekov
91cdd35008
[clang] Improve nested name specifier AST representation (#147835)
This is a major change on how we represent nested name qualifications in
the AST.

* The nested name specifier itself and how it's stored is changed. The
prefixes for types are handled within the type hierarchy, which makes
canonicalization for them super cheap, no memory allocation required.
Also translating a type into nested name specifier form becomes a no-op.
An identifier is stored as a DependentNameType. The nested name
specifier gains a lightweight handle class, to be used instead of
passing around pointers, which is similar to what is implemented for
TemplateName. There is still one free bit available, and this handle can
be used within a PointerUnion and PointerIntPair, which should keep
bit-packing aficionados happy.
* The ElaboratedType node is removed, all type nodes in which it could
previously apply to can now store the elaborated keyword and name
qualifier, tail allocating when present.
* TagTypes can now point to the exact declaration found when producing
these, as opposed to the previous situation of there only existing one
TagType per entity. This increases the amount of type sugar retained,
and can have several applications, for example in tracking module
ownership, and other tools which care about source file origins, such as
IWYU. These TagTypes are lazily allocated, in order to limit the
increase in AST size.

This patch offers a great performance benefit.

It greatly improves compilation time for
[stdexec](https://github.com/NVIDIA/stdexec). For one datapoint, for
`test_on2.cpp` in that project, which is the slowest compiling test,
this patch improves `-c` compilation time by about 7.2%, with the
`-fsyntax-only` improvement being at ~12%.

This has great results on compile-time-tracker as well:

![image](https://github.com/user-attachments/assets/700dce98-2cab-4aa8-97d1-b038c0bee831)

This patch also further enables other optimziations in the future, and
will reduce the performance impact of template specialization resugaring
when that lands.

It has some other miscelaneous drive-by fixes.

About the review: Yes the patch is huge, sorry about that. Part of the
reason is that I started by the nested name specifier part, before the
ElaboratedType part, but that had a huge performance downside, as
ElaboratedType is a big performance hog. I didn't have the steam to go
back and change the patch after the fact.

There is also a lot of internal API changes, and it made sense to remove
ElaboratedType in one go, versus removing it from one type at a time, as
that would present much more churn to the users. Also, the nested name
specifier having a different API avoids missing changes related to how
prefixes work now, which could make existing code compile but not work.

How to review: The important changes are all in
`clang/include/clang/AST` and `clang/lib/AST`, with also important
changes in `clang/lib/Sema/TreeTransform.h`.

The rest and bulk of the changes are mostly consequences of the changes
in API.

PS: TagType::getDecl is renamed to `getOriginalDecl` in this patch, just
for easier to rebasing. I plan to rename it back after this lands.

Fixes #136624
Fixes https://github.com/llvm/llvm-project/issues/43179
Fixes https://github.com/llvm/llvm-project/issues/68670
Fixes https://github.com/llvm/llvm-project/issues/92757
2025-08-09 05:06:53 -03:00
Artem Belevich
507b879b6e
[CUDA] add support for targeting sm_103/sm_121 with CUDA-12.9 (#151587) 2025-07-31 13:38:54 -07:00
Alan Zhao
92858528c2
[clang][timers][stats] Add a flag to enable timers in the stats file (#149946)
As reported in #138173, enabling `-ftime-report` adds pass timing info
to the stats file if `-stats-file` is specified. This was determined to
be WAI. However, if one intentionally wants to put timer information in
the stats file, using `-ftime-report` may lead to a lot of logspam (that
can't be removed by directing stderr to `/dev/null` as that would also
redirect compiler errors). To address this, this PR adds a flag
`-stats-file-timers` that adds timer data to the stats file without
outputting to stderr.
2025-07-22 18:50:45 -07:00
Igor Kudrin
00dacf8c22
[clang] Add -Wuninitialized-const-pointer (#148337)
This option is similar to -Wuninitialized-const-reference, but diagnoses
the passing of an uninitialized value via a const pointer, like in the
following code:
```
void foo(const int *);
void test() {
  int v;
  foo(&v);
}
```
This is an extract from #147221 as suggested in [this
comment](https://github.com/llvm/llvm-project/pull/147221#discussion_r2190998730).
2025-07-14 15:44:43 -07:00
Ricardo Jesus
84e54515bc
[AArch64] Add support for -mcpu=gb10. (#146515)
This patch adds support for -mcpu=gb10 (NVIDIA GB10). This is a
big.LITTLE cluster of Cortex-X925 and Cortex-A725 cores. The appropriate
MIDR numbers are added to detect them in -mcpu=native.

We did not add an -mcpu=cortex-x925.cortex-a725 option because GB10 does
include the crypto instructions which we want on by default, and the
current convention is to not enable such extensions for Arm Cortex cores
in -mcpu where they are optional in the IP.

Relevant GCC patch:
https://gcc.gnu.org/pipermail/gcc-patches/2025-June/687005.html
2025-07-07 11:14:26 +01:00
Nikolas Klauser
afc6c2bb9b
[Clang] Allow the use of [[gnu::visibility]] with #pragma clang attribute (#145653)
I don't see any reason this shouldn't be allowed. AFAICT this is only
disabled due to the heuristics used to determine whether it makes sense
to allow the use of an attribute with `#pragma clang attribute`.

This allows libc++ to drop `_LIBCPP_HIDE_FROM_ABI` in a lot of places,
making the library significantly easier to read.
2025-06-26 14:54:15 +02:00
Jim Lin
2f9c97c030
[RISCV] Add Andes AX45MPV processor definition (#145267)
Andes AX45MPV is 64-bit in-order dual-issue 8-stage pipeline
linux-capable CPU implementing the RV64IMAFDCV ISA extension. That is
developed by Andes Technology https://www.andestech.com, a RISC-V IP
provider.

The overviews for AX45MPV:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-ax45mpv/

Scheduling model for RVV extension will be implemented a follow-up PR.
2025-06-24 08:57:55 +08:00
Jameson Nash
96ab74bf17
[InstCombine] remove undef loads, such as memcpy from undef (#143958)
Extend `isAllocSiteRemovable` to be able to check if the ModRef info
indicates the alloca is only Ref or only Mod, and be able to remove it
accordingly. It seemed that there were a surprising number of
benchmarks with this pattern which weren't getting optimized previously
(due to MemorySSA walk limits). There were somewhat more existing tests
than I'd like to have modified which were simply doing exactly this
pattern (and thus relying on undef memory). Claude code contributed the
new tests (and found an important typo that I'd made).

This implements the discussion in
https://github.com/llvm/llvm-project/pull/143782#discussion_r2142720376.
2025-06-20 10:32:31 -04:00
Stanislav Mekhanoshin
69974658f0
[AMDGPU] Initial support for gfx1250 target. (#144965)
This is just a stub for now.
2025-06-19 22:52:51 -07:00
Tom Vijlbrief
ad94f77a6a
[AVR] Add many new AVR MCU model definitions (#144229)
1. Added the missing XMEGA2 definition. The avr64 devices use xmega2 which has SPM(X) defined.

2. The avr16/avr32 devices do have SPM and SPMX features, but the current xmega3 definition has not.
   Xmega3 is also used for modern attiny series which do not have SPM(X), so that is correct.
   Leave the avr16/avr32 devices unchanged (using xmega3 to be in sync with gcc definitions).

Fixes https://github.com/llvm/llvm-project/issues/116116
2025-06-16 09:25:40 +08:00
Jim Lin
2a8c7d3c69
[RISCV] Add support for -mtune=andes-45-series (#142900)
Enables the use of `-mtune=andes-45-series` to generate code optimized
with the Andes 45 series scheduling model and tuning features.
2025-06-06 11:34:19 +08:00
Nick Sarnie
3b9ebe9201
[clang] Simplify device kernel attributes (#137882)
We have multiple different attributes in clang representing device
kernels for specific targets/languages. Refactor them into one attribute
with different spellings to make it more easily scalable for new
languages/targets.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-05 14:15:38 +00:00
Min-Yih Hsu
feb21e26fa
[RISCV] Add SiFive X390 processor definition (#142517)
X390 is an in-order core designed for AI/ML workload, with VLEN=1024.
https://www.sifive.com/cores/intelligence-x300-series

Scheduling model will be added in a follow-up patch.
2025-06-04 09:25:59 -07:00
Marco Elver
c7ccfc6dfc
Thread Safety Analysis: Support reentrant capabilities (#137133)
Introduce the `reentrant_capability` attribute, which may be specified
alongside the `capability(..)` attribute to denote that the defined
capability type is reentrant. Marking a capability as reentrant means
that acquiring the same capability multiple times is safe, and does not
produce warnings on attempted re-acquisition.

The most significant changes required are plumbing to propagate if the
attribute is present to a CapabilityExpr, and introducing
ReentrancyDepth to the LockableFactEntry class.
2025-05-26 17:03:55 +02:00
Jim Lin
569b6f6dad
[RISCV] Add Andes A25/AX25 processor definition (#140681)
Andes A25/AX25 are 32/64bit, 5-stage pipeline, linux-capable CPUs that
implement the RV[32|64]IMAFDC_Zba_Zbb_Zbc_Zbs ISA extensions. They are
developed by Andes Technology https://www.andestech.com, a RISC-V IP
provider.

The overviews for A25/AX25:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-a25/
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-ax25/

Scheduling model will be implemented in a later PR.
2025-05-22 09:22:32 +08:00
Fraser Cormack
6553dc30b8
[NVPTX] Support the OpenCL generic addrspace feature by default (#137940)
As best as I can see, all NVPTX architectures support the generic
address space.

I note there's a FIXME in the target's address space map about 'generic'
still having to be added to the target but we haven't observed any
issues with it downstream. The generic address space is mapped to the
same target address space as default/private (0), but this isn't
necessarily a problem for users.
2025-05-21 09:55:11 +01:00
Ties Stuij
269f5fe91e
[AARCH64] Add support for Cortex-A320 (#139055)
This patch adds initial support for the recently announced Armv9
Cortex-A320 processor.

For more information, including the Technical Reference Manual, see:
https://developer.arm.com/Processors/Cortex-A320

---------

Co-authored-by: Oliver Stannard <oliver.stannard@arm.com>
2025-05-09 16:24:48 +01:00
Min-Yih Hsu
ca1ebff9de
[RISCV] Add processor definition for SiFive P870 (#137725)
SiFive P870 is a RVA23 compatible high-performance CPU:
https://www.sifive.com/cores/performance-p800

Scheduling model will be added in a follow-up PR.
2025-05-05 18:48:21 -07:00
Alan Zhao
69327c16d1
[clang] Make -ftime-report and -ftime-report-json honor -info-output-file (#138035)
This way, the output of `-ftime-report` and `-ftime-report-json` can be
redirected to a specific file rather than just stderr.
2025-04-30 14:48:17 -07:00
Alan Zhao
4a6c81dc0e
[clang] Implement JSON formatted -ftime-report (#137737)
This patch adds a new flag, -ftime-report-json, which outputs the same
information as -ftime-report but as JSON instead of -ftime-report's
pretty printed format.
2025-04-30 13:43:05 -07:00
Aaron Ballman
e8ae779471
[C] Add diagnostic + attr for unterminated strings (#137829)
This introduces three things related to intialization like:

char buf[3] = "foo";

where the array does not declare enough space for the null terminator
but otherwise can represent the array contents exactly.

1) An attribute named 'nonstring' which can be used to mark that a
   field or variable is not intended to hold string data.

2) -Wunterminated-string-initialization, which is grouped under
   -Wextra, and diagnoses the above construct unless the declaration
   uses the 'nonstring' attribute.

3) -Wc++-unterminated-string-initialization, which is grouped under
   -Wc++-compat, and diagnoses the above construct even if the
   declaration uses the 'nonstring' attribute.

Fixes #137705
2025-04-30 10:14:18 -04:00
Fraser Cormack
1e31f4b5eb
[AMDGPU] Support the OpenCL generic addrspace feature by default (#137636)
This feature should be supported on AMDGCN architectures with flat
addressing.
2025-04-29 14:14:00 +01:00
Aaron Ballman
df267d77f6
[C] Add new -Wimplicit-int-enum-cast to -Wc++-compat (#137658)
This introduces a new diagnostic group to diagnose implicit casts from
int to an enumeration type. In C, this is valid, but it is not
compatible with C++.

Additionally, this moves the "implicit conversion from enum type to
different enum type" diagnostic from `-Wenum-conversion` to a new group
`-Wimplicit-enum-enum-cast`, which is a more accurate home for it.
`-Wimplicit-enum-enum-cast` is also under `-Wimplicit-int-enum-cast`, as
it is the same incompatibility (the enumeration on the right-hand is
promoted to `int`, so it's an int -> enum conversion).

Fixes #37027
2025-04-29 07:06:08 -04:00
Jim Lin
5981be7692
[RISCV] Add Andes A45/AX45 processor definition (#136832)
Andes A45/AX45 are 32/64bit in-order dual-issue 8-stage pipeline
linux-capable CPU implementing the RV[32|64]IMAFDC_Zba_Zbb_Zbs ISA
extensions. They are developed by Andes Technology
https://www.andestech.com, a RISC-V IP provider.

The overviews for A45/AX45:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-a45/
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-ax45/

Scheduling model will be implemented in a later PR.
2025-04-24 09:16:12 +08:00
Jim Lin
832ca744f2
[RISCV] Add Andes N45/NX45 processor definition (#136670)
Andes N45/NX45 are 32/64bit in-order dual-issue 8-stage pipeline CPU
architecture implementing the RV[32|64]IMAFDC_Zba_Zbb_Zbs ISA
extensions. They are developed by Andes Technology
https://www.andestech.com, a RISC-V IP provider.

The overviews for N45/NX45:
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-n45/
https://www.andestech.com/en/products-solutions/andescore-processors/riscv-nx45/

Scheduling model will be implemented in a later PR.
2025-04-23 14:16:23 +08:00
Chyaka
0e3e0bf42c
[RISCV] Add processor definition for XiangShan-KunMingHu-V2R2 (#123193)
XiangShan-KunMingHu is the third generation of Open-source
high-performance RISC-V processor developed by Beijing Institute of Open
Source Chip (BOSC) , and its latest version is V2R2.

The KunMingHu manual is now available at
https://github.com/OpenXiangShan/XiangShan-User-Guide/releases.
It will be updated on the official XiangShan documentation site:
https://docs.xiangshan.cc/zh-cn/latest

You can find the corresponding ISA extension from the XiangShan Github
repository:
https://github.com/OpenXiangShan/XiangShan/blob/master/src/main/scala/xiangshan/Parameters.scala

If you want to track the latest performance data of KunMingHu, please
check XiangShan Biweekly: https://docs.xiangshan.cc/zh-cn/latest/blog

This PR adds the processor definition for KunMingHu V2R2, developed by
the XSCC team https://github.com/orgs/OpenXiangShan/teams/xscc.

The scheduling model for XiangShan-KunMingHu V2R2 will be submitted in a
subsequent PR.

---------

Co-authored-by: Shenglin Tang <tangshenglin@ict.ac.cn>
Co-authored-by: Xu, Zefan <ceba_robot@outlook.com>
Co-authored-by: Tang Haojin <tanghaojin@outlook.com>
2025-04-21 10:06:43 +08:00
Aaron Ballman
c609cd2df9
Give this diagnostic a diagnostic group (#136182)
I put this under -Wunitialized because that's the same group it's under
in GCC.

Fixes #41104
2025-04-18 07:09:27 -04:00
Ulrich Weigand
80267f8148
Support z17 processor name and scheduler description (#135254)
The recently announced IBM z17 processor implements the architecture
already supported as "arch15" in LLVM. This patch adds support for "z17"
as an alternate architecture name for arch15.

This patch also add the scheduler description for the z17 processor,
provided by Jonas Paulsson.
2025-04-11 00:20:58 +02:00
Matheus Izvekov
f302f35526
[clang] Track final substitution for Subst* AST nodes (#132748) 2025-04-02 19:27:29 -03:00
Sirraide
10c6ebc427
Reapply "[Clang] [NFC] Introduce a helper for emitting compatibility diagnostics (#132348)" (#134043)
This reapplies #132348 with a fix to the python bindings tests, reverting
076397ff32.
2025-04-02 10:40:05 +02:00
Sirraide
076397ff32
Revert "[Clang] [NFC] Introduce a helper for emitting compatibility diagnostics" (#134036)
Reverts llvm/llvm-project#132348

Some tests are failing and I still need to figure out what is going on
here.
2025-04-02 08:29:05 +02:00
Sirraide
9d06e0879b
[Clang] [NFC] Introduce a helper for emitting compatibility diagnostics (#132348)
This is a follow-up to #132129.

Currently, only `Parser` and `SemaBase` get a `DiagCompat()` helper; I’m
planning to keep refactoring compatibility warnings and add more helpers
to other classes as needed. I also refactored a single parser compat
warning just to make sure everything works properly when diagnostics
across multiple components (i.e. Sema and Parser in this case) are
involved.
2025-04-02 08:06:29 +02:00
Ricardo Jesus
847e46ca01
[AArch64] Add initial support for -mcpu=olympus. (#132368)
This patch adds support for the NVIDIA Olympus core.

This does not add any special tuning decisions, and those may come
later.
2025-03-25 08:09:04 +00:00
Shilei Tian
ff8aa300d6
[AMDGPU] Remove outdated COV6 warning (#132814) 2025-03-24 19:57:07 -04:00
Matheus Izvekov
d447c6e9b7
[clang] NFC: remove stray newlines from clang/test/Misc/diag-template-diffing-cxx11.cpp 2025-03-24 13:18:07 -03:00
Sirraide
f01b56ffb3
[Clang] [NFC] Introduce helpers for defining compatibilty warnings (#132129)
This introduces some tablegen helpers for defining compatibility
warnings. The main aim of this is to both simplify adding new
compatibility warnings as well as to unify the naming of compatibility
warnings.

I’ve refactored ~half of the compatiblity warnings (that follow the
usual scheme) in `DiagnosticSemaKinds.td` for illustration purposes and
also to simplify/unify the wording of some of them (I also corrected a
typo in one of them as a drive-by fix).

I haven’t (yet) migrated *all* warnings even in that one file, and there
are some more specialised ones for which the scheme I’ve established
here doesn’t work (e.g. because they’re warning+error instead of
warning+extwarn; however, warning+extension *is* supported), but the
point of this isn’t to implement *all* compatibility-related warnings
this way, only to make the common case a bit easier to handle.

This currently also only handles C++ compatibility warnings, but it
should be fairly straight-forward to extend the tablegen code so it can
also be used for C compatibility warnings (if this gets merged, I’m
planning to do that in a follow-up pr).

The vast majority of compatibility warnings are emitted by writing
```c++
Diag(Loc, getLangOpts().CPlusPlusYZ ? diag::ext_... : diag::warn_...)
```
in accordance with which I’ve chosen the following naming scheme:
```c++
Diag(Loc, getLangOpts().CPlusPlusYZ ? diag::compat_cxxyz_foo : diag::compat_pre_cxxyz_foo)
```
That is, for a warning about a C++20 feature—i.e. C++≤17
compatibility—we get:
```c++
Diag(Loc, getLangOpts().CPlusPlus20 ? diag::compat_cxx20_foo : diag::compat_pre_cxx20_foo)
```
While there is an argument to be made against writing ‘`compat_cxx20`’
here since is technically a case of ‘C++17 compatibility’ and not ‘C++20
compatibility’, I at least find this easier to reason about, because I
can just write the same number 3 times instead of having to use
`ext_cxx20_foo` but `warn_cxx17_foo`. Instead, I like to read this as a
warning about the ‘compatibility *of* a C++20 feature’ rather than
‘*with* C++17’.

I also experimented with moving all compatibility warnings to a separate
file, but 1. I don’t think it’s worth the effort, and 2. I think it
hurts compile times a bit because at least in my testing I felt that I
had to recompile more code than if we just keep e.g. Sema-specific
compat warnings in the Sema diagnostics file.

Instead, I’ve opted to put them all in the same place within any one
file; currently this is a the very top but I don’t really have strong
opinions about this.
2025-03-21 03:55:42 +01:00
Alan Zhao
864a53b4a4
Reapply "Use global TimerGroups for both new pass manager and old pass manager timers" (#131173) (#131217)
This reverts commit 31ebe6647b7f1fc7f6778a5438175b12f82357ae.

The reason for the test failure is likely due to
`Name2PairMap::getTimerGroup(...)` not holding a lock.
2025-03-13 16:20:39 -07:00
Arthur Eubanks
31ebe6647b
Revert "Use global TimerGroups for both new pass manager and old pass manager timers" (#131173)
Reverts llvm/llvm-project#130375

Causes breakages, e.g.
https://lab.llvm.org/buildbot/#/builders/160/builds/14607
2025-03-13 10:29:15 -07:00
Alan Zhao
09d8e442ac
[llvm][Timer] Use global TimerGroups for both new pass manager and old pass manager timers (#130375)
Additionally, remove the behavior for both pass manager's timer manager
classes (`PassTimingInfo` for the old pass manager and
`TimePassesHandler` for the new pass manager) where these classes would
print the values of their timers upon destruction.

Currently, each pass manager manages their own `TimerGroup`s. This is
problematic because of duplicate `TimerGroup`s (both pass managers have
a `TimerGroup` for pass times with identical names and descriptions).
The result is that in Clang, `-ftime-report` has two "Pass execution
timing report" sections (one for the new pass manager which manages
optimization passes, and one for the old pass manager which manages the
backend). The result of this change is that Clang's `-ftime-report` now
prints both optimization and backend pass timing info in a unified "Pass
execution timing report" section.

Moving the ownership of the `TimerGroups` to globals also makes it
easier to implement JSON-formatted `-ftime-report`. This was not
possible with the old structure because the two pass managers were
created and destroyed in far parts of the codebase and outputting JSON
requires the printing logic to be at the same place because of
formatting.

Previous discourse discussion:
https://discourse.llvm.org/t/difficulties-with-implementing-json-formatted-ftime-report/84353
2025-03-13 10:13:28 -07:00
Nikita Popov
07f3388fff Revert "[clang] Implement instantiation context note for checking template parameters (#126088)"
This reverts commit a24523ac8dc07f3478311a5969184b922b520395.

This is causing significant compile-time regressions for C++ code, see:
https://github.com/llvm/llvm-project/pull/126088#issuecomment-2704874202
2025-03-10 10:32:08 +01:00
Matheus Izvekov
a24523ac8d
[clang] Implement instantiation context note for checking template parameters (#126088)
Instead of manually adding a note pointing to the relevant template
parameter to every relevant error, which is very easy to miss, this
patch adds a new instantiation context note, so that this can work using
RAII magic.

This fixes a bunch of places where these notes were missing, and is more
future-proof.

Some diagnostics are reworked to make better use of this note:
- Errors about missing template arguments now refer to the parameter
which is missing an argument.
- Template Template parameter mismatches now refer to template
parameters as parameters instead of arguments.

It's likely this will add the note to some diagnostics where the
parameter is not super relevant, but this can be reworked with time and
the decrease in maintenance burden makes up for it.

This bypasses the templight dumper for the new context entry, as the
tests are very hard to update.

This depends on #125453, which is needed to avoid losing the context
note for errors occuring during template argument deduction.
2025-03-06 14:58:42 -03:00
Sebastian Jodłowski
0127f169dc
[CUDA] Add support for sm101 and sm120 target architectures (#127187)
Add support for sm101 and sm120 target architectures. It requires CUDA
12.8.

---------

Co-authored-by: Sebastian Jodlowski <sjodlowski@nuro.ai>
2025-02-19 14:41:07 -08:00
Fabian Ritter
8615f9aaff
[AMDGPU] Replace gfx940 and gfx941 with gfx942 in llvm (#126763)
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all non-documentation occurrences of gfx940/gfx941 from
the llvm directory, and the remaining occurrences in clang.

Documentation changes will follow.

For SWDEV-512631
2025-02-19 10:20:48 +01:00
Fabian Ritter
029c8e783d
[AMDGPU][clang] Replace gfx940 and gfx941 with gfx942 in clang (#126762)
gfx940 and gfx941 are no longer supported. This is one of a series of
PRs to remove them from the code base.

This PR removes all occurrences of gfx940/gfx941 from clang that can be
removed without changes in the llvm directory. The
target-invalid-cpu-note/amdgcn.c test is not included here since it
tests a list of targets that is defined in
llvm/lib/TargetParser/TargetParser.cpp.

For SWDEV-512631
2025-02-19 10:11:48 +01:00
Ahmed Bougacha
f0e39c45df
[AArch64] Add aliases for processors apple-a18/s6..10. (#127152)
apple-a18 is an alias of apple-m4.
apple-s6/s7/s8 are aliases of apple-a13.
apple-s9/s10 are aliases of apple-a16.

As with some other aliases today, this reflects identical ISA feature
support, but not necessarily identical microarchitectures and
performance characteristics.
2025-02-17 11:18:45 -08:00
Pengcheng Wang
7eadc1960d
[RISCV] Add a generic OOO CPU (#120712)
We add a generic out-of-order CPU model here just like what GCC
has done.
    
People may use this model to evaluate some optimizations, and more
importantly, people can use this model as a template to customize
their own CPU models.
    
The design (units, cycles, ...) of this model is random so don't
take it seriously.
2025-02-14 17:35:02 +08:00
Sirraide
c4a019747c
[Clang] Remove ARCMigrate (#119269)
In the discussion around #116792, @rjmccall mentioned that ARCMigrate
has been obsoleted and that we could go ahead and remove it from Clang,
so this patch does just that.
2025-01-30 05:32:25 +01:00
Sergey Kozub
616979ebd7
[NVPTX] Add support for PTX 8.6 and CUDA 12.6 (12.8) (#123398)
Add CUDA versions 12.7, 12.8, 12.9 which support PTX8.6+ (enables using Blackwell-specific instructions).
2025-01-21 11:00:24 +01:00