18194 Commits

Author SHA1 Message Date
Steven Perron
acde20b560
[HLSL][SPIRV] Add vk::constant_id attribute. (#143544)
The vk::constant_id attribute is used to indicate that a global const
variable represents a specialization constant in SPIR-V. This PR adds this
attribute to clang.

The documentation for the attribute is
[here](https://github.com/microsoft/DirectXShaderCompiler/blob/main/docs/SPIR-V.rst#specialization-constants).

The strategy is to modify the initializer to get the value of a
specialization constant from a builtin defined in the SPIR-V backend.

Implements https://github.com/llvm/wg-hlsl/pull/287

Fixes https://github.com/llvm/llvm-project/issues/142448

---------

Co-authored-by: Nathan Gauër <github@keenuts.net>
2025-06-18 06:39:52 -04:00
Daniel Paoliello
2488f26d15
[win][x64] Unwind v2 3/n: Add support for requiring unwind v2 to be used (equivalent to MSVC's /d2epilogunwindrequirev2) (#143577)
#129142 added support for emitting Windows x64 unwind v2 information,
but it was "best effort". If any function didn't follow the requirements
for v2 it was silently downgraded to v1.

There are some parts of Windows (specifically kernel-mode code running
on Xbox) that require v2, hence we need the ability to fail the
compilation if v2 can't be used.

This change also adds a heuristic to check if there might be too many
unwind codes, it's currently conservative (i.e., assumes that certain
prolog instructions will use the maximum number of unwind codes).

Future work: attempting to chain unwind info across multiple tables if
there are too many unwind codes due to epilogs and adding a heuristic to
detect if an epilog will be too far from the end of the function.
2025-06-16 15:06:41 -07:00
Steven Perron
a027eb4472
[HLSL] Use hidden visibility for external linkage. (#140292)
Implements

https://github.com/llvm/wg-hlsl/blob/main/proposals/0026-symbol-visibility.md.

The change is to stop using the `hlsl.export` attribute. Instead,
symbols with "program linkage" in HLSL will have export linkage with
default visibility, and symbols with "external linkage" in HLSL will
have export linkage with hidden visibility.
2025-06-16 16:44:55 -04:00
Kazu Hirata
c01532177f
[clang] Remove unused includes (NFC) (#144285)
These are identified by misc-include-cleaner.  I've filtered out those
that break builds.  Also, I'm staying away from llvm-config.h,
config.h, and Compiler.h, which likely cause platform- or
compiler-specific build failures.
2025-06-15 21:00:36 -07:00
Jacek Caban
be5c96bfac
[CodeGen][COFF] Always emit CodeView compiler info on Windows targets (#142970)
MSVC always emits minimal CodeView metadata with compiler information,
even when debug info is otherwise disabled. Other tools may rely on this
metadata being present. For example, linkers use it to determine whether
hotpatching is enabled for the object file.
2025-06-13 22:48:29 +02:00
FYK
52d34865b9
Fix and reapply IR PGO support for Flang (#142892)
This PR resubmits the changes from #136098, which was previously
reverted due to a build failure during the linking stage:

```
undefined reference to `llvm::DebugInfoCorrelate'  
undefined reference to `llvm::ProfileCorrelate'
```

The root cause was that `llvm/lib/Frontend/Driver/CodeGenOptions.cpp`
references symbols from the `Instrumentation` component, but the
`LINK_COMPONENTS` in the `llvm/lib/Frontend/CMakeLists.txt` for
`LLVMFrontendDriver` did not include it. As a result, linking failed in
configurations where these components were not transitively linked.

### Fix:

This updated patch explicitly adds `Instrumentation` to
`LINK_COMPONENTS` in the relevant `llvm/lib/Frontend/CMakeLists.txt`
file to ensure the required symbols are properly resolved.

---------

Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Chyaka <52224511+liliumshade@users.noreply.github.com>
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
2025-06-13 12:05:16 -06:00
Steven Perron
bd33eef7f1
[HLSL][SPIRV] Use resource names (#143412)
The SPIR-V backend does not have access to the original name of a
resource in the source, so it tries to create a name. This leads to some
problems with reflection.
    
That is why we start passing the name of the resource from Clang to the
SPIR-V backend.
    
Fixes #138533
2025-06-13 12:21:38 -04:00
Yaxun (Sam) Liu
7232c07eb9 Reland [HIP] use offload wrapper for non-device-only non-rdc (#143964)
Fixed a typo:

-  auto Section = (Prefix + "llvm_offload_entries").str();
+  auto Section = (Prefix + "_offload_entries").str();

which broke buildbot e.g.

https://lab.llvm.org/buildbot/#/builders/208/builds/1948
2025-06-12 21:41:41 -04:00
Yaxun (Sam) Liu
8890706db6 Revert "Reland [HIP] use offload wrapper for non-device-only non-rdc (#132869) (#143964)"
This reverts commit 22f9b4aa1dad597d908be77be1e10ba4c77330ce.
2025-06-12 21:33:05 -04:00
Yaxun (Sam) Liu
22f9b4aa1d
Reland [HIP] use offload wrapper for non-device-only non-rdc (#132869) (#143964)
Fixed two issues:

1. Assertion with -flto. The linker wrapper action was missing for
wrapping the device binary. Added it for -flto.

2. When there are two HIP files, the kernels in the second file were not
found. This is because the -r option of the linker wrapper assumes the
offload entries section of HIP to be hip_offloading_entries, but it is
actually llvm_offload_entries, so the offload entries sections were not
made unique for different object files. Fixed and tested working for both
the -fgpu-rdc and -fno-gpu-rdc cases, with and without -r.
2025-06-12 20:08:55 -04:00
David Green
030a471753 [AArch64][Clang] Exclude address spaces from pointer-only coercion types.
As reported on #135064, the generic pointer coercion code in
CoerceIntOrPtrToIntOrPtr cannot handle address space casts (it tries to bitcast
the pointers). This bails out if an address space qualifier is found on the
pointer.
2025-06-12 20:51:58 +01:00
Nathan Gauër
50f534e21c
[HLSL][SPIR-V] Handle SV_Position builtin in PS (#141759)
This commit is using the same mechanism as vk::ext_builtin_input to
implement the SV_Position semantic input.
The HLSL signature is not yet ready for DXIL, hence this commit only
implements the SPIR-V side.

This is incomplete as it doesn't allow the semantic on hull/domain and
other shaders, but it's a first step to validate the overall
input/output semantic logic.

Fixes https://github.com/llvm/llvm-project/issues/136969
2025-06-11 14:22:54 +02:00
Adrian Vogelsgesang
756e7cfd86
[debuginfo][coro] Fix linkage name for clones of coro functions (#141889)
So far, the `DW_AT_linkage_name` of the coroutine `resume`, `destroy`,
`cleanup` and `noalloc` function clones were incorrectly set to the
original function name instead of the updated function names.

With this commit, we now update the `DW_AT_linkage_name` to the correct
name. This has multiple benefits:

1. It's easier for me (and other toolchain developers) to understand the
   output of `llvm-dwarf-dump` when coroutines are involved.
2. When hitting a breakpoint, both LLDB and GDB now tell you which clone
   of the function you are in. E.g., GDB now prints "Breakpoint 1.2,
   coro_func(int) [clone .resume] (v=43) at ..." instead of "Breakpoint
   1.2, coro_func(int) (v=43) at ...".
3. GDB's `info line coro_func` command now allows you to distinguish the
   multiple different clones of the function.

In Swift, the linkage names of the clones were already updated. The
comment right above the relevant code in `CoroSplit.cpp` already hinted
that the linkage name should probably also be updated in C++. This
comment was added in commit 6ce76ff7eb7640, and back then the
corresponding `DW_AT_specification` (i.e., `SP->getDeclaration()`) was
not updated, yet, which led to problems for C++. In the meantime, commit
ca1a5b37c7236d added code to also update `SP->getDeclaration`, as such
there is no reason anymore to not update the linkage name for C++.

Note that most test cases used inconsistent function names for the LLVM
function vs. the DISubprogram linkage name. clang would never emit such
LLVM IR. This confused me initially, and hence I fixed it while updating
the test case.

Drive-by fix: The change in `CGVTables.cpp` is purely stylistic, NFC.
When looking for other usages of `replaceWithDistinct`, I got initially
confused because `CGVTables.cpp` was calling a static function via an
object instance.
2025-06-11 13:50:32 +02:00
CHANDRA GHALE
afbcf9529a
[OpenMP 6.0 ]Codegen for Reduction over private variables with reduction clause (#134709)
Codegen support for reduction over private variables with the reduction
clause. See Section 7.6.10 in the OpenMP 6.0 spec.
- An internal shared copy is initialized with an initializer value.
- The shared copy is updated by combining its value with the values from
the private copies created by the clause.
- Once an encountering thread verifies that all updates are complete,
its original list item is updated by merging its value with that of the
shared copy and then broadcast to all threads.

Sample Test Case from OpenMP 6.0 Example 
```
#include <assert.h>
#include <omp.h>
#define N 10

void do_red(int n, int *v, int &sum_v)
{
    sum_v = 0; // sum_v is private
    #pragma omp for reduction(original(private),+: sum_v)
    for (int i = 0; i < n; i++) 
    {
        sum_v += v[i];
    }
}

int main(void)
{
    int v[N];
    for (int i = 0; i < N; i++)
        v[i] = i;
    #pragma omp parallel num_threads(4)
    {
        int s_v; // s_v is private
        do_red(N, v, s_v);
        assert(s_v == 45);
    }
    return 0;
}
```
Expected Codegen:
```
 // A shared global/static variable is introduced for the reduction result.
 // This variable is initialized (e.g., using memset or a UDR initializer)
 // e.g., .omp.reduction.internal_private_var

 // Barrier before any thread performs combination
  call void @__kmpc_barrier(...)

 // Initialization block (executed by thread 0)
 // e.g., call void @llvm.memset.p0.i64(...) or call @udr_initializer(...)

  call void @__kmpc_critical(...)
    // Inside critical section:
    // Load the current value from the shared variable
    // Load the thread-local private variable's value
    // Perform the reduction operation 
    // Store the result back to the shared variable

  call void @__kmpc_end_critical(...)
  // Barrier after all threads complete their combinations

  call void @__kmpc_barrier(...)
 // Broadcast phase:
 // Load the final result from the shared variable
 // Store the final result to the original private variable in each thread
 // Final barrier after broadcast

  call void @__kmpc_barrier(...)
```
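
The data flow of the three steps listed above can be modelled without OpenMP. This is only a single-threaded sketch under stated assumptions: each "thread" is a loop iteration, the barriers and the critical section appear only as comments, and the function name `reducePrivate` and the cyclic iteration split are hypothetical rather than taken from the patch.

```cpp
#include <cassert>
#include <vector>

// Single-threaded model of the reduction over private variables.
// Returns each thread's original list item after the broadcast.
std::vector<int> reducePrivate(const std::vector<int> &V, int NumThreads) {
  assert(NumThreads > 0);
  // Every thread owns a private original item (sum_v = 0 in the sample).
  std::vector<int> Original(NumThreads, 0);

  // Step 1: the internal shared copy starts at the initializer value for '+'.
  int Shared = 0;

  // (barrier before any thread combines)
  // Step 2: each thread reduces its share of the iterations into a private
  // copy, then combines that copy into the shared one (critical section).
  const int N = static_cast<int>(V.size());
  for (int T = 0; T < NumThreads; ++T) {
    int Private = 0;
    for (int I = T; I < N; I += NumThreads) // hypothetical cyclic split
      Private += V[I];
    Shared += Private;
  }

  // (barrier after all combinations)
  // Step 3: the original item is merged with the shared copy and the
  // result reaches every thread.
  for (int T = 0; T < NumThreads; ++T)
    Original[T] += Shared;
  return Original;
}
```

Calling `reducePrivate` with the values 0 through 9 and four threads mirrors the sample test case: every thread ends up with 45.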

---------

Co-authored-by: Chandra Ghale <ghale@pe31.hpc.amslabs.hpecorp.net>
2025-06-11 14:01:31 +05:30
Ethan Luis McDonough
67ff66e677
[PGO][Offload] Fix offload coverage mapping (#143490)
This pull request fixes coverage mapping on GPU targets. 

- It adds an address space cast to the coverage mapping generation pass.
- It reads the profiled function names from the ELF directly. Reading it
from public globals was causing issues in cases where multiple
device-code object files are linked together.
2025-06-10 20:19:38 -05:00
Kazu Hirata
30dd652c29
[clang] Use *Map::try_emplace (NFC) (#143563)
- try_emplace(Key) is shorter than insert({Key, nullptr}).
- try_emplace performs value initialization without value parameters.
- We overwrite values on successful insertion anyway.
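
The three points can be seen with a standard container. This is a minimal sketch that assumes `std::unordered_map` as a stand-in for the map types the commit touches (its C++17 `try_emplace` has the same shape); `Node`, `NodeMap`, and `getOrCreate` are hypothetical names used only for illustration.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <unordered_map>

// Hypothetical payload type.
struct Node {
  std::string Name;
};

using NodeMap = std::unordered_map<std::string, std::unique_ptr<Node>>;

// Returns the node for Key, creating it on first use.
Node *getOrCreate(NodeMap &Map, const std::string &Key) {
  // try_emplace(Key) value-initializes the mapped value (a null unique_ptr)
  // only when Key is absent, so no {Key, nullptr} pair has to be spelled out.
  auto [It, Inserted] = Map.try_emplace(Key);
  if (Inserted)
    It->second = std::make_unique<Node>(Node{Key}); // overwrite the placeholder
  assert(It->second != nullptr);
  return It->second.get();
}
```

A second call with the same key returns the existing node and leaves the map size unchanged.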
2025-06-10 11:32:02 -07:00
Nathan Gauër
6582d7d348
[HLSL] Add WaveGetLaneCount() intrinsic to FE (#143127)
This commit adds code to lower WaveGetLaneCount() into the SPV or DXIL
intrinsic. The backends will then need to lower the intrinsic into
proper SPIR-V/DXIL.

Related to #99159
2025-06-10 14:25:09 +02:00
Paul Walker
f43aaf90df
[NFC][LLVM] Refactor IRBuilder::Create{VScale,ElementCount,TypeSize}. (#142803)
CreateVScale took a scaling parameter that had a single use outside of
IRBuilder with all other callers having to create a redundant
ConstantInt. To work round this some code preferred to use
CreateIntrinsic directly.

This patch simplifies CreateVScale to return a call to the llvm.vscale()
intrinsic and nothing more. As well as simplifying the existing call
sites I've also migrated the uses of CreateIntrinsic.

Whilst IRBuilder used CreateVScale's scaling parameter as part of the
implementations of CreateElementCount and CreateTypeSize, I have
follow-on work to switch them to the NUW variety and thus they would
stop using CreateVScale's scaling as well. To prepare for this I have
moved the multiplication and constant folding into the implementations
of CreateElementCount and CreateTypeSize.

As a final step I have replaced some callers of CreateVScale with
CreateElementCount where it's clear from the code they wanted the
latter.
2025-06-10 12:35:59 +01:00
Oliver Hunt
487e757f3e
[clang][NFC] Remove dead PassTypeToPlacementDelete field (#143448)
The CallDeleteDuringNew::PassTypeToPlacementDelete field became unneeded
during the many refactorings of P2719 but I didn't actually remove it.
2025-06-09 23:28:33 -07:00
David Green
5f648c370e
[AArch64] Change the coercion type of structs with pointer members. (#135064)
The aim here is to avoid a ptrtoint->inttoptr round-trip through the function
argument whilst keeping the calling convention the same. Given a struct which
is <= 128bits in size, which can only contain either 1 or 2 pointers, we
convert to a ptr or [2 x ptr] as opposed to the old coercion that uses i64 or
[2 x i64]. This helps alias analysis produce more accurate results.
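
For illustration, structs of the kind described above look like the following. This is only a sketch: the type and function names are hypothetical, the `static_assert`s check just the size precondition from the message, and the IR types named in the comments are the ones stated in the message and apply only when compiling for AArch64.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical structs that are <= 128 bits and contain only pointers.
struct OnePtr {
  int *P;
};
struct TwoPtrs {
  int *Begin;
  int *End;
};
static_assert(sizeof(OnePtr) * 8 <= 128, "fits the size limit");
static_assert(sizeof(TwoPtrs) * 8 <= 128, "fits the size limit");

// Passed by value: per the message, coerced to `ptr` instead of `i64`.
int load(OnePtr S) {
  assert(S.P != nullptr);
  return *S.P;
}

// Passed by value: per the message, coerced to `[2 x ptr]` instead of
// `[2 x i64]`, so no ptrtoint/inttoptr round-trip is needed.
std::ptrdiff_t length(TwoPtrs S) { return S.End - S.Begin; }
```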
2025-06-10 07:04:54 +01:00
Joseph Huber
f5e499a338
Revert "[HIP] use offload wrapper for non-device-only non-rdc (#132869)" (#143432)
This breaks a lot of new driver HIP compilation. We should probably
revert this for now until we can make a fixed version.

```c++

static __global__ void print() { printf("%s\n", "foo"); }

void b();

int main() {
  hipLaunchKernelGGL(print, dim3(1), dim3(1), 0, 0);
  auto y = hipDeviceSynchronize();
  b();
}
```
```c++

static __global__ void print() { printf("%s\n", "bar"); }

void b() {
  hipLaunchKernelGGL(print, dim3(1), dim3(1), 0, 0);
  auto y = hipDeviceSynchronize();
}
```
```console
$ clang++ a.hip b.hip --offload-arch=gfx1030 --offload-new-driver
$ ./a.out
foo
foo
```
```console
$ clang++ a.hip b.hip --offload-arch=gfx1030 --offload-new-driver -flto
<crash>
```

This reverts commit d54c28b9c1396fa92d9347ac1135da7907121cb8.
2025-06-09 17:18:49 -05:00
Matheus Izvekov
366f48890d
[clang] AST: fix dependency calculation for TypedefTypes (#143291)
The dependency from the type sugar of the underlying type of a Typedef
was not being considered for the dependency of the TypedefType itself.

A TypedefType should be instantiation dependent if it involves
non-instantiated template parameters, even if they don't contribute to
the canonical type.

Besides, a TypedefType should be instantiation dependent if it is
declared in a dependent context, but fixing that would have performance
consequences, as otherwise non-dependent typedef declarations would need
to be transformed during instantiation as well.

This removes the workaround added in
https://github.com/llvm/llvm-project/pull/90032

Fixes https://github.com/llvm/llvm-project/issues/89774
2025-06-08 17:07:36 -03:00
Corentin Jabot
5c76ae2894
[Clang] Support constexpr asm at global scope. (#143268)
I previously failed to realize this feature existed...

Fixes #137459
Fixes #143242
2025-06-08 09:16:57 +02:00
Kazu Hirata
0ef1e69f22
[clang] Strip away lambdas (NFC) (#143226)
We don't need lambdas here.
2025-06-06 22:55:26 -07:00
Thurston Dang
428afa62b0
[ubsan] Add more -fsanitize-annotate-debug-info checks (#141997)
This extends https://github.com/llvm/llvm-project/pull/138577 to more UBSan checks, by changing SanitizerDebugLocation (formerly SanitizerScope) to add annotations if enabled for the specified ordinals.

Annotations will use the ordinal name if there is exactly one ordinal specified in the SanitizerDebugLocation; otherwise, it will use the handler name.
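
The naming rule above amounts to a small selection function. The sketch below is not the code from the patch: `annotationName` and its parameters are hypothetical, and it leaves out the check for whether annotation is enabled.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical helper: picks the name used for the annotation.
std::string annotationName(const std::vector<std::string> &OrdinalNames,
                           const std::string &HandlerName) {
  // Exactly one ordinal: its name is specific enough to use directly.
  if (OrdinalNames.size() == 1)
    return OrdinalNames.front();
  // Zero or several ordinals: fall back to the handler name.
  return HandlerName;
}
```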

Updates the tests from https://github.com/llvm/llvm-project/pull/141814.

---------

Co-authored-by: Vitaly Buka <vitalybuka@google.com>
2025-06-06 14:59:32 -07:00
Nuri Amari
347186b259
Avoid Assertion Failure Using -fcs-profile-generate with distributed thin-lto (#129736)
When using `-fcs-profile-generate` with distributed thin-lto in the same
fashion we do for local thin-lto, we hit the following assertion:


6041c745f3/llvm/lib/Support/PGOOptions.cpp (L36)

Using local thin-lto with LLD for MachO, we set the missing path
automatically to a default value: https://reviews.llvm.org/D151589. In
this fix we add the same behavior.

---------

Co-authored-by: Nuri Amari <nuriamari@fb.com>
2025-06-06 17:58:19 -04:00
Peter Collingbourne
d1b0b4bb44
Add -funique-source-file-identifier option.
This option complements -funique-source-file-names and allows the user
to use a different unique identifier than the source file path.

Reviewers: teresajohnson

Reviewed By: teresajohnson

Pull Request: https://github.com/llvm/llvm-project/pull/142901
2025-06-05 10:52:01 -07:00
Snehasish Kumar
16c7b3c9f5
[MemProf] Split MemProfiler into Instrumentation and Use. (#142811)
Most of the recent development on the MemProfiler has been on the Use part. The instrumentation has been quite stable for a while. As the complexity of the use grows (with undrifting, diagnostics etc) I figured it would be good to separate these two implementations.
2025-06-05 07:36:50 -07:00
Nick Sarnie
3b9ebe9201
[clang] Simplify device kernel attributes (#137882)
We have multiple different attributes in clang representing device
kernels for specific targets/languages. Refactor them into one attribute
with different spellings to make it more easily scalable for new
languages/targets.

---------

Signed-off-by: Sarnie, Nick <nick.sarnie@intel.com>
2025-06-05 14:15:38 +00:00
Dan McGregor
b5e84ca740
[Clang] Remap paths in OpenMP runtime calls (#82541) (#141250)
Apply the debug prefix mapping to the OpenMP location strings.

Fixes https://github.com/llvm/llvm-project/issues/82541
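
As a rough picture of what a prefix mapping does to a path, consider the sketch below. It is not the code from the patch: `PrefixMap` and `remapPath` are hypothetical, and picking the first matching entry is an assumption of this sketch, not something the message states.

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical list of (old prefix, new prefix) pairs.
using PrefixMap = std::vector<std::pair<std::string, std::string>>;

// Rewrites Path using the first entry whose old prefix matches;
// returns Path unchanged when nothing matches.
std::string remapPath(const std::string &Path, const PrefixMap &Map) {
  for (const auto &[From, To] : Map)
    if (Path.compare(0, From.size(), From) == 0)
      return To + Path.substr(From.size());
  return Path;
}
```

With the pair `("/home/u/src", ".")`, the path `/home/u/src/a.c` becomes `./a.c`, so a location string built from the remapped path no longer exposes the build directory.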
2025-06-05 08:24:57 -04:00
Bruno De Fraine
c3b8a15eab
[CodeGen] Add TBAA struct path info for array members (#137719)
This enables the LLVM optimizer to view accesses to distinct struct
members as independent, also for array members. For example, the
following two stores no longer alias:

    struct S { int a[10]; int b; };
    void test(S *p, int i) {
      p->a[i] = ...;
      p->b = ...;
    }

Array members were already added to TBAA struct type nodes in commit
57493e29. Here, we extend a path tag for an array subscript expression.
2025-06-05 13:37:18 +02:00
PiJoules
b194cf1e40
[clang] Function type attribute to prevent CFI instrumentation (#135836)
This introduces the attribute discussed in

https://discourse.llvm.org/t/rfc-function-type-attribute-to-prevent-cfi-instrumentation/85458.

The proposed name has been changed from `no_cfi` to
`cfi_unchecked_callee` to help differentiate from `no_sanitize("cfi")`
more easily. The proposed attribute has the following semantics:

1. Indirect calls to a function type with this attribute will not be
instrumented with CFI. That is, the indirect call will not be checked.
Note that this only changes the behavior for indirect calls on pointers
to function types having this attribute. It does not prevent all
indirect function calls for a given type from being checked.
2. All direct references to a function whose type has this attribute
will always reference the true function definition rather than an entry
in the CFI jump table.
3. When a pointer to a function with this attribute is implicitly cast
to a pointer to a function without this attribute, the compiler will
give a warning saying this attribute is discarded. This warning can be
silenced with an explicit C-style cast or C++ static_cast.
2025-06-04 11:19:26 -07:00
Orlando Cazalet-Hyams
54d544b831
[KeyInstr][Clang] Ret atom (#134652)
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

When returning a value, stores to the `retval` allocas and branches to the
`return` block are put in the same atom group. They are both rank 1, which
could in theory introduce an extra step in some optimized code. This low risk
currently feels acceptable for keeping the code a bit simpler (as opposed to
adding scaffolding to make the store rank 2).

In the case of a single return (no control flow) the return instruction inherits
the atom group of the branch to the return block when the blocks get folded
together.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-04 15:43:49 +01:00
Nathan Gauër
20d70196c9
[HLSL][SPIR-V] Implement vk::ext_builtin_input attribute (#138530)
This variable attribute is used in HLSL to add Vulkan specific builtins
in a shader.
The attribute is documented here:

17727e88fd/proposals/0011-inline-spirv.md

Those variables, even if marked as `static`, are externally initialized by
the pipeline/driver/GPU. This is handled by moving them to a specific
address space `hlsl_input`, also added by this commit.

The design for input variables in Clang can be found here:
355771361e/proposals/0019-spirv-input-builtin.md


Co-authored-by: Justin Bogner <mail@justinbogner.com>
2025-06-04 13:22:37 +02:00
Orlando Cazalet-Hyams
ac42923c2d Reapply "[KeyInstr][Clang] For range stmt atoms" (#142630)
This reverts commit e6529dcedb3955706a8af5710591f1ac1bac26a3 with crash fixed.

Original PR https://github.com/llvm/llvm-project/pull/134647

This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-04 10:53:29 +01:00
Oliver Hunt
93314bd946
[clang][PAC] Add __builtin_get_vtable_pointer (#139790)
With pointer authentication it becomes non-trivial to correctly load the
vtable pointer of a polymorphic object.

__builtin_get_vtable_pointer is a function that performs the load and
performs the appropriate authentication operations if necessary.
2025-06-04 00:21:20 -07:00
Finn Plummer
9ec5afea77
[NFC][RootSignature] Move RootSignature util functions (#142491)
`HLSLRootSignature.h` was originally created to hold the struct
definitions of an `llvm::hlsl::rootsig::RootElement` and some helper
functions for it.

However, there are many users of the structs that don't require any of the
helper methods. This requires us to link the `FrontendHLSL` library,
where we otherwise wouldn't need to.

For instance:
- This [revert](https://github.com/llvm/llvm-project/pull/142005) was
required as it requires linking to the unrequired `FrontendHLSL` library
- As part of the change required here:
https://github.com/llvm/llvm-project/issues/126557. We will want to add
an `HLSLRootSignatureVersion` enum. Ideally this could live with the
root signature struct defs, but we don't want to link the helper objects
into `clang/Basic/TargetOptions.h`

This change allows the struct definitions to be kept in a single header
file and to then have the `FrontendHLSL` library only be linked when
required.
2025-06-03 09:59:50 -07:00
Orlando Cazalet-Hyams
e6529dcedb
Revert "[KeyInstr][Clang] For range stmt atoms" (#142630)
Reverts llvm/llvm-project#134647

Bot failure:

https://lab.llvm.org/buildbot/#/builders/144/builds/26730/steps/6/logs/FAIL__Clang__terminate-statements_cpp
2025-06-03 16:15:46 +01:00
Orlando Cazalet-Hyams
10024363dd
[KeyInstr][Clang] For range stmt atoms (#134647)
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-03 15:44:15 +01:00
Orlando Cazalet-Hyams
8e50e882a8 [KeyInstr][Clang] Break and Continue stmt atoms
[KeyInstr][Clang] For stmt atom (#134646)
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-03 14:25:48 +01:00
Orlando Cazalet-Hyams
0555594195
[KeyInstr][Clang] For stmt atom (#134646)
This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-03 13:47:32 +01:00
Orlando Cazalet-Hyams
347273db2f
[KeyInstr][Clang] Coerced store atoms (#134653)
[KeyInstr][Clang] Coerced store atoms

This patch is part of a stack that teaches Clang to generate Key Instructions
metadata for C and C++.

RFC:
https://discourse.llvm.org/t/rfc-improving-is-stmt-placement-for-better-interactive-debugging/82668

The feature is only functional in LLVM if LLVM is built with CMake flag
LLVM_EXPERIMENTAL_KEY_INSTRUCTIONs. Eventually that flag will be removed.
2025-06-03 09:22:37 +01:00
Ami-zhang
8c65f68330
[clang][LoongArch] Add support for the _Float16 type (#141703)
Enable _Float16 for LoongArch target. Additionally, this change fixes
incorrect ABI lowering of _Float16 in the case of structs containing
fp16 that are eligible for passing via GPR+FPR or FPR+FPR. Finally, it
also fixes int16 -> __fp16 conversion code gen, which uses generic LLVM
IR rather than llvm.convert.to.fp16 intrinsics.
2025-06-03 14:26:11 +08:00
Vitaly Buka
2622e6bfa0
[NFC][CodeGen] Extract SanitizerHandler into own header (#142527)
2025-06-02 22:47:08 -07:00
Vladislav Dzhidzhoev
dec8f1314f
[llvm][DebugInfo][clang] Finalize all declaration subprograms in DIBuilder::finalize() (#139914)
DIBuilder began tracking definition subprograms and finalizing them in
`DIBuilder::finalize()` in eb1bb4e419.
Currently, `finalizeSubprogram()` attaches local variables, imported
entities, and labels to the `retainedNodes:` field of a corresponding
subprogram.

After 75819aedf, the definition and some declaration subprograms are
finalized in `DIBuilder::finalize()`:
`AllSubprograms` holds references to definition subprograms.
`AllRetainTypes` holds references to declaration subprograms.
For DISubprogram elements of both variables, `finalizeSubprogram()` was
called there.

However, `retainTypes()` is not necessarily called for every declaration
subprogram (as in 40a3fcb0).

DIBuilder clients may also want to attach DILocalVariables to
declaration subprograms, for example, in 58bdf8f9a8.

Thus, the `finalizeSubprogram()` function is called for all definition
subprograms in `DIBuilder::finalize()` because they are stored in the
`AllSubprograms` by the `CreateFunction(isDefinition: true)` call. But
for the declaration subprograms, it should be called manually.

With this commit, `AllSubprograms` is used for holding and finalizing all DISubprograms.
2025-06-02 15:22:53 +02:00
Alexandros Lamprineas
b3fd2ea888
[Clang][FMV] Stop emitting implicit default version using target_clones. (#141808)
With the current behavior the following example yields a linker error:
"multiple definition of `foo.default'"

    // Translation Unit 1
    __attribute__((target_clones("dotprod, sve"))) int foo(void) { return 1; }

    // Translation Unit 2
    int foo(void) { return 0; }
    __attribute__((target_version("dotprod"))) int foo(void);
    __attribute__((target_version("sve"))) int foo(void);
    int bar(void) { return foo(); }

That is because foo.default is generated twice. As a user I don't find
this particularly intuitive. If I wanted the default to be generated in
TU1 I'd rather write target_clones("dotprod, sve", "default")
explicitly.

When changing the code I noticed that the RISC-V target defers the
resolver emission when encountering a target_version definition. This
seems accidental since it only makes sense for AArch64, where we only
emit a resolver once we've processed the entire TU, and only if the
default version is present. I've changed this so that RISC-V immediately
emits the resolver. I adjusted the codegen tests since the functions
now appear in a different order.

Implements https://github.com/ARM-software/acle/pull/377
2025-06-02 11:04:00 +01:00
Nikita Popov
e2b536431d
[CodeGen] Move CodeGenPGO behind unique_ptr (NFC) (#142155)
The InstrProf headers are very expensive. Avoid including them in all of
CodeGen/ by moving the CodeGenPGO member behind a unique_ptr.

This reduces clang build time by 0.8%.
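
The general technique can be sketched in a single file. This is an illustration under stated assumptions, not the code from the patch: `ExpensivePGO` and `Module` are hypothetical stand-ins, and the "header" and "source" parts are separated only by comments.

```cpp
#include <cassert>
#include <memory>

// --- "Header" part: a forward declaration is enough to hold the member, ---
// --- so users of this header never see the expensive definition.        ---
class ExpensivePGO; // hypothetical class that lives in a costly header

class Module {
public:
  Module();
  ~Module(); // declared here, defined where ExpensivePGO is complete
  int bumpCounter();

private:
  std::unique_ptr<ExpensivePGO> PGO;
};

// --- "Source file" part: the only place that needs the full definition. ---
class ExpensivePGO {
public:
  int next() { return ++Count; }

private:
  int Count = 0;
};

Module::Module() : PGO(std::make_unique<ExpensivePGO>()) {}
Module::~Module() = default; // unique_ptr deleter is instantiated here
int Module::bumpCounter() {
  assert(PGO != nullptr);
  return PGO->next();
}
```

The destructor has to be defined out of line because `std::unique_ptr` needs the complete type at the point where its deleter is instantiated.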
2025-06-02 09:51:54 +02:00
Tarun Prabhu
597340b5b6
Revert "Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler" (#142159)
Reverts llvm/llvm-project#136098
2025-05-30 08:27:08 -06:00
FYK
d27a210a77
Add IR Profile-Guided Optimization (IR PGO) support to the Flang compiler (#136098)
This patch implements IR-based Profile-Guided Optimization support in
Flang through the following flags:

- `-fprofile-generate` for instrumentation-based profile generation

- `-fprofile-use=<dir>/file` for profile-guided optimization

Resolves #74216 (implements IR PGO support phase)

**Key changes:**

- Frontend flag handling aligned with Clang/GCC semantics

- Instrumentation hooks into LLVM PGO infrastructure

- LIT tests verifying:

    - Instrumentation metadata generation

    - Profile loading from specified path

    - Branch weight attribution (IR checks)

**Tests:**

- Added gcc-flag-compatibility.f90 test module verifying:

    -  Flag parsing boundary conditions

    -  IR-level profile annotation consistency

    -  Profile input path normalization rules

- SPEC2006 benchmark results will be shared in comments

For details on LLVM's PGO framework, refer to [Clang PGO
Documentation](https://clang.llvm.org/docs/UsersManual.html#profile-guided-optimization).

This implementation was developed by [XSCC Compiler
Team](https://github.com/orgs/OpenXiangShan/teams/xscc).

---------

Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Tom Eccles <t@freedommail.info>
2025-05-30 08:13:53 -06:00
Thurston Dang
cb065a578a [NFCI][ubsan] Add/update deprecation TODOs
Mention that -fsanitize-skip-hot-cutoffs and -fsanitize-annotate-debug-info
are available as replacements.
2025-05-29 18:13:47 +00:00