538 Commits

Author SHA1 Message Date
Damyan Pepper
cc49f3b3e1
[NFC][HLSL] Remove confusing enum aliases / duplicates (#153909)
Remove:

* DescriptorType enum - this almost exactly shadowed the ResourceClass
enum
* ClauseType aliased ResourceClass

Although these were introduced to make the HLSL root signature handling
code a bit cleaner, they were ultimately causing confusion as they
appeared to be unique enums that needed to be converted between each
other.

Closes #153890
2025-08-18 08:58:33 -07:00
Krzysztof Parzyszek
ae75884130
[Frontend][OpenMP] Add 6.1 as a valid OpenMP version (#153628)
Co-authored-by: Michael Klemm <michael.klemm@amd.com>
2025-08-18 09:13:27 -05:00
Kazu Hirata
07eb7b7692
[llvm] Replace SmallSet with SmallPtrSet (NFC) (#154068)
This patch replaces SmallSet<T *, N> with SmallPtrSet<T *, N>.  Note
that SmallSet.h "redirects" SmallSet to SmallPtrSet for pointer
element types:

  template <typename PointeeType, unsigned N>
class SmallSet<PointeeType*, N> : public SmallPtrSet<PointeeType*, N>
{};

We only have 140 instances that rely on this "redirection", with the
vast majority of them under llvm/. Since relying on the redirection
doesn't improve readability, this patch replaces SmallSet with
SmallPtrSet for pointer element types.
2025-08-18 07:01:29 -07:00
joaosaffran
37729d8ceb
[HLSL] Refactoring DXILABI.h to not depend on scope printer (#153840)
This patch refactors DXILABI to remove the dependency on scope printer. 
Closes: #153827

---------

Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
2025-08-15 21:33:44 -04:00
joaosaffran
d56fa96524
[DirectX] Add Range Overlap validation (#152229)
As part of the Root Signature Spec, we need to validate if Root
Signatures are not defining overlapping ranges.
Closes: https://github.com/llvm/llvm-project/issues/126645

---------

Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
Co-authored-by: Joao Saffran <jderezende@microsoft.com>
2025-08-14 18:40:11 -04:00
Abid Qadeer
62d0b712b7
[OMPIRBuilder] Avoid invalid debug location. (#153190)
Fixes #153043.

This is another case of debug location not getting updated when the
insert point is changed by the `restoreIP`. Fixed by using the wrapper
function that updates the debug location.
2025-08-12 16:20:52 +01:00
Abid Qadeer
3746bd21cb
[OMPIRBuilder][NFC] Fix build failure after 151306. (#153061)
A function added in pr#151306 was under NDEBUG macro which caused the
build to fail in certain cases. It has been moved out of the #ifdef
check to ensure it is always compiled.
2025-08-11 19:46:42 +01:00
Helena Kotas
fb1035cfb4
[DirectX] Fix resource binding analysis incorrectly removing duplicates (#152253)
The resource binding analysis was incorrectly reducing the size of the
`Bindings` vector by one element after sorting and de-duplication. This
led to an inaccurate setting of the `HasOverlappingBinding` flag in the
`DXILResourceBindingInfo` analysis, as the truncated vector no longer
reflected the true binding state.

This update corrects the shrink logic and introduces an `assert` in the
`DXILPostOptimizationValidation` pass. The assertion will trigger if
`HasOverlappingBinding` is set but no corresponding error is detected,
helping catch future inconsistencies.

The bug surfaced when the `srv_metadata.hlsl` and `uav_metadata.hlsl`
tests were updated to include unbounded resource arrays as part of
https://github.com/llvm/llvm-project/issues/145422. These updated test
files are included in this PR, as they would cause the new assertion to
fire if the original issue remained unresolved.

Depends on #152250
2025-08-11 10:53:00 -07:00
Abid Qadeer
049953fe8d
[OMPIRBuilder] Avoid invalid debug location. (#151306)
This fixes #147063.

I tried to fix this issue in more general way in
https://github.com/llvm/llvm-project/pull/147091 but the reviewer
suggested to fix the locations which are causing this issue. So this is
a more targeted approach.

The `restoreIP` is frequently used in the `OMPIRBuilder` to change the
insert position. This function eventually calls
`SetInsertPoint(BasicBlock *TheBB, BasicBlock::iterator IP)`. This
function updates the insert point and the debug location. But if the
`IP` is pointing to the end of the `TheBB`, then the debug location is
not updated and we could have a mismatch between insert point and the
debug location.

The problem can occur in 2 different code patterns. 

This code below shows the first scenario.
```
  1. auto curPos = builder.saveIP();
  2. builder.restoreIP(/* some new pos */);
  3.  // generate some code
  4. builder.restoreIP(curPos);
```

If `curPos` points to the end of basic block, we could have a problem.
But it is easy one to handle as we have the location before hand and can
save the correct debug location before 2 and then restore it after 3.
This can be done either manually or using the `llvm::InsertPointGuard`
as shown below.

```
  // manual approach
  auto curPos = builder.saveIP();
  llvm::DebugLoc DbgLoc = builder.getCurrentDebugLocation();
  builder.restoreIP(/* some new pos */);
  // generate some code
  builder.SetCurrentDebugLocation(DbgLoc);
  builder.restoreIP(curPos);

  {
     // using InsertPointGuard
     llvm::InsertPointGuard IPG(builder);
     builder.restoreIP(/* some new pos */);
     // generate some code
  }
```
This PR fixes one problematic case using the manual approach.

For the 2nd scenario, look at the code below.

```
1. void fn(InsertPointTy allocIP, InsertPointTy codegenIP) {
2.   builder.setInsertPoint(allocIP);
3.   // generate some alloca
4.   builder.setInsertPoint(codegenIP);
5. }
```

The `fn` can be called from anywhere and we can't assume the debug
location of the builder is valid at the start of the function. So if 4
does not update the debug location because the `codegenIP` points at the
end of the block, the rest of the code can end up using the debug
location of the `allocaIP`. Unlike the first case, we don't have a debug
location that we can save before hand and restore afterwards.

The solution here is to use the location of the last instruction in that
block. I have added a wrapper function over `restoreIP` that could be
called for such cases. This PR uses it to fix one problematic case.
2025-08-11 18:47:30 +01:00
Shafik Yaghmour
51bc0c1d6b
[HLSL][NFC] Fix range check in verifyRegisterSpace (#152615)
Static analysis flagged the second part of this range check as always
true. RegisterSpace is uint32_t therefore the max value is 0xFFFFFFFF
and so the first check is sufficient.
2025-08-08 10:04:34 -07:00
Finn Plummer
cb2d56ce96
[NFC][HLSL][DirectX] Consolidate ResourceClassNames (#152213)
During the split of the various `Frontend/HLSL` libraries, there was an
oversight to duplicate the `ResourceClassNames` definitions. This commit
simply consolidates the definitions into `DXContainer.h` as
`getResourceClasses`
2025-08-07 16:10:50 -07:00
Anchu Rajendran S
49ccf46adc
[OpenMP] [IR Builder] Changes to Support Scan Operation (#136035)
Scan reductions are supported in OpenMP with the help of scan directive.
Reduction clause of the for loop/simd directive can take an `inscan`
modifier along with the body of the directive specifying a `scan`
directive. This PR implements the lowering logic for scan reductions in
workshare loops of OpenMP.
The body of the for loop is split into two loops (Input phase loop and
Scan Phase loop) and a scan reduction loop is added in the middle. The
Input phase loop populates a temporary buffer with initial values that
are to be reduced. The buffer is used by the reduction loop to perform
scan reduction. Scan phase loop copies the values of the buffer to the
reduction variable before executing the scan phase. Below is a high
level view of the code generated.
```
<declare pointer to buffer> ptr
omp parallel {
  size num_iters = <num_iters>
  // temp buffer allocation
  omp masked {
    buff = malloc(num_iters*scanvarstype)
    *ptr = buff
  } 
 barrier;
  // input phase loop
  for (i: 0..<num_iters>) {
    <input phase>;
    buffer = *ptr;
    buffer[i] = red;
  }
  // scan reduction
  omp masked
  {
    for (int k = 0; k != ceil(log2(num_iters)); ++k) {
      i=pow(2,k)
      for (size cnt = last_iter; cnt >= i; --cnt) {
        buffer = *ptr;
        buffer[cnt] op= buffer[cnt-i];
      }
    }
  }
 barrier;
 // scan phase loop
  for (0..<num_iters>) {
    buffer = *ptr;
    red = buffer[i] ;
    <scan phase>;
  }
  // temp buffer deletion
  omp masked {
    free(*ptr)
  }
  barrier;
}
```
The temporary buffer needs to be shared between all threads performing
reduction since it is read/written in Input and Scan workshare Loops.
This is achieved by declaring a pointer to the buffer in the shared
region and dynamically allocating the buffer by the master thread.
This is the reason why allocation, deallocation and scan reduction are
performed within `masked`. The code is verified to produce correct
results for Fortran programs with the code changes in the PR
https://github.com/llvm/llvm-project/pull/133149
2025-08-07 14:58:11 -07:00
Anchu Rajendran S
38542efcba
[flang][OMPIRBuilder][MLIR][llvm] Backend support for atomic control options (#151579)
Adding mlir to llvm support for atomic control options.

Atomic Control Options are used to specify architectural characteristics
to help lowering of atomic operations. The options used are:
`-f[no-]atomic-remote-memory`, `-f[no-]atomic-fine-grained-memory`,
 `-f[no-]atomic-ignore-denormal-mode`.
Legacy option `-m[no-]unsafe-fp-atomics` is aliased to
`-f[no-]ignore-denormal-mode`.
More details can be found in
https://github.com/llvm/llvm-project/pull/102569. This PR implements the
MLIR to LLVM lowering support of atomic control attributes specified
with OpenMP `atomicUpdateOp`.

Initial support can be found in PR:
https://github.com/llvm/llvm-project/pull/150860
2025-08-07 12:23:38 -07:00
Finn Plummer
acb5d0c211
[NFC][HLSL] Replace uses of getResourceName/printEnum (#152211)
Introduce the `enumToStringRef` enum into `ScopedPrinter.h` that
replicates `enumToString` behaviour, expect that instead of returning a
hex value string, it just returns an empty string. This allows us to
return a StringRef and easily check if an invalid enum was provided
based on the StringRef size

This then uses `enumToStringRef` to remove the redundant
`getResourceName` and `printEnum` functions.

Resolves: https://github.com/llvm/llvm-project/issues/151200.
2025-08-06 16:35:16 -07:00
Justin Bogner
8a2d3f5653
[HLSL][Sema] Use hlsl::BindingInfoBuilder instead of RangeInfo. NFC (#150634)
Clean up some duplicated logic. We had two ways to do the same thing
here, and BindingInfoBuilder is more flexible.
2025-08-05 10:47:06 -07:00
Simon Pilgrim
2579ffccc7 [HLSL] MetadataParser::parseRootDescriptors - fix gcc Wparentheses warning. NFC. 2025-08-03 11:51:59 +01:00
Kazu Hirata
5e3cc00060 [Offloading] Fix a warning
This patch fixes:

  llvm/lib/Frontend/Offloading/PropertySet.cpp:95:12: error: unused
  variable '[It, Inserted]' [-Werror,-Wunused-variable]
2025-08-01 23:22:15 -07:00
Arvind Sudarsanam
ee67f78776
Fix error caused by reference to local binding (#151789)
This change fixes one of the failures in
https://github.com/llvm/llvm-project/pull/147321

Following code snippet:
`
for (const auto &[CategoryName, PropSet] : PSRegistry) {
        J.attributeObject(CategoryName, [&] {
                   for (const auto &[PropName, PropVal] : PropSet) {
`
causes a build warning that is emitted as an error.
error: reference to local binding 'PropSet' declared in enclosing lambda
expression

This is resolved by capturing PropSet in a local variable.

Thanks

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
2025-08-02 00:57:26 -04:00
Justin Cai
185a23e865
[SYCL] Add property set types and JSON representation (#147321)
This PR adds the `PropertySet` type, along with a pair of functions used
to serialize and deserialize into a JSON representation. A property set
is a key-value map, with values being one of 2 types - uint32 or byte
array. A property set registry is a collection of property sets, indexed
by a "category" name.

In SYCL offloading, property sets will be used to communicate metadata
about device images needed by the SYCL runtime. For example, there is a
property set which has a byte array containing the numeric ID, offset,
and size of each SYCL2020 spec constant. Another example is a property
set describing the optional kernel features used in the module: does it
use fp64? fp16? atomic64?

This metadata will be computed by `clang-sycl-linker` and the JSON
representation will be inserted in the string table of each
output `OffloadBinary`. This JSON will be consumed the SYCL offload
wrapper and will be lowered to the binary form SYCL runtime expects.

For example, consider this SYCL program that calls a kernel that uses
fp64:

```c++
#include <sycl/sycl.hpp>

using namespace sycl;
class MyKernel;

int main() {
  queue q;
  auto *p = malloc_shared<double>(1, q);
  *p = .1;
  q.single_task<MyKernel>([=]{ *p *= 2; }).wait();
  std::cout << *p << "\n";
  free(p, q);
}
```

The device code for this program would have the kernel marked with
`!sycl_used_aspects`:

```
define spir_kernel void @_ZTS8MyKernel([...]) !sycl_used_aspects !n { [...] }
!n = {i32 6}
```

`clang-sycl-linker` would recognize this metadata and then would output
the following JSON in the `OffloadBinary`'s key-value map:

```
{
  "SYCL/device requirements": {
    // aspects contains a list of sycl::aspect values used
    // by the module; in this case just the value 6 encoded 
    // as a 4-byte little-endian integer 
    "aspects": "BjAwMA=="
  }
}
```

The SYCL offload wrapper would lower those property sets to something
like this:

```c++
struct _sycl_device_binary_property_set_struct {
  char *CategoryName;
  _sycl_device_binary_property *PropertiesBegin;
  _sycl_device_binary_property *PropertiesEnd;
};

struct _sycl_device_binary_property_struct {
  char *PropertyName;  
  void *ValAddr;     
  uint64_t ValSize; 
};

//  
_sycl_device_binary_property_struct device_requirements[] = {
  /* PropertyName */ "aspects",
  /* ValAddr */ [pointer to the bytes 0x06 0x00 0x00 0x00],
  /* ValSize */ 4,
};

_sycl_device_binary_property_set_struct properties[] = {
  /* CategoryName */ "SYCL/device requirements",
  /* PropertiesBegin */ device_requirements,
  /* PropertiesEnd */ std::end(device_requirments),
}
```

---------

Co-authored-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
2025-08-01 20:05:45 -07:00
joaosaffran
b2e5303292
[DirectX] Error handling improve in root signature metadata Parser (#149232)
This PR addresses
https://github.com/llvm/llvm-project/pull/144465#issuecomment-3063422828.
Using `joinErrors` and `llvm:Error` instead of boolean values.

---------

Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
Co-authored-by: Joao Saffran <{ID}+{username}@users.noreply.github.com>
2025-07-31 22:59:02 -04:00
Justin Bogner
3f066f5fcf
[HLSL][DirectX] Extract HLSLBinding out of DXILResource. NFC (#150633)
We extract the binding logic out of the DXILResource analysis passes into the
FrontendHLSL library. This will allow us to use this logic for resource and
root signature bindings in both the DirectX backend and the HLSL frontend.
2025-07-31 08:35:47 -07:00
Abid Qadeer
335dbba741
[OMPIRBuilder] Don't drop debug loc from LocationDescription. (#148713)
`LocationDescription` contains both the insertion point and the debug
location. When `LocationDescription` is available, it is better to use
`updateToLocation` which will update both. This PR replaces
`restoreIP(Loc.IP)` with `updateToLocation(Loc)` as former may not
update debug location in all cases.

I am not checking the return value of `updateToLocation` because that is
checked just a few lines above in all cases and we would have returned
early if it failed.
2025-07-29 17:31:29 +01:00
Abid Qadeer
7d7f3819e0
Revert "[OMPIRBuilder] Don't use invalid debug loc in reduction functions." (#150832)
Reverts llvm/llvm-project#147950

I noticed some fails in the reduction tests with clang after this
change. I need to understand the failures better. Reverting this for
now.
2025-07-27 16:43:28 +01:00
joaosaffran
c21e2a5e24
[DirectX] Moving Root Signature Metadata Parsing in to Shared Root Signature Metadata lib (#149221)
This PR, moves the existing Root Signature Metadata Parsing logic used
in `DXILRootSignature` to the common library used by both frontend and
backend. Closes:
[#145942](https://github.com/llvm/llvm-project/issues/145942)

---------

Co-authored-by: joaosaffran <joao.saffran@microsoft.com>
2025-07-23 17:03:13 -07:00
Tom Eccles
a1c61ac756
[mlir][OpenMP] Allow composite SIMD REDUCTION and IF (#147568)
Reduction support: https://github.com/llvm/llvm-project/pull/146671
If Support is fixed in this PR

The problem for the IF clause in composite constructs was that wsloop
and simd both operate on the same CanonicalLoopInfo structure: with the
SIMD processed first, followed by the wsloop. Previously the IF clause
generated code like
```
if (cond) {
  while (...) {
    simd_loop_body;
  }
} else {
  while (...) {
    nonsimd_loop_body;
  }
}
```
The problem with this is that this invalidates the CanonicalLoopInfo
structure to be processed by the wsloop later. To avoid this, in this
patch I preserve the original loop, moving the IF clause inside of the
loop:
```
while (...) {
  if (cond) {
    simd_loop_body;
  } else {
    non_simd_loop_body;
  }
}
```
On simple examples I tried LLVM was able to hoist the if condition
outside of the loop at -O3.

The disadvantage of this is that we cannot add the
llvm.loop.vectorize.enable attribute on either the SIMD or non-SIMD
loops because they both share a loop back edge. There's no way of
solving this without keeping the old design of having two different
loops: which cannot be represented using only one CanonicalLoopInfo
structure. I don't think the presence or absence of this attribute makes
much difference. In my testing it is the llvm.loop.parallel_access
metadata which makes the difference to vectorization. LLVM will
vectorize if legal whether or not this attribute is there in the TRUE
branch. In the FALSE branch this means the loop might be vectorized even
when the condition is false: but I think this is still standards
compliant: OpenMP 6.0 says that when the if clause is false that should
be treated like the SIMDLEN clause is one. The SIMDLEN clause is defined
as a "hint". For the same reason, SIMDLEN and SAFELEN clauses are
silently ignored when SIMD IF is used.

I think it is better to implement SIMD IF and ignore SIMDLEN and SAFELEN
and some vectorization encouragement metadata when combined with IF than
to ignore IF because IF could have correctness consequences whereas the
rest are optimiztion hints. For example, the user might use the IF
clause to disable SIMD programatically when it is known not safe to
vectorize the loop. In this case it is not at all safe to add the
parallel access or SAFELEN metadata.
2025-07-15 10:30:02 +01:00
Abid Qadeer
45fa0b29bc
Revert "[OMPIRBuilder] Don't use invalid debug loc in task proxy function." (#148728)
There is a sanitizer fail in CI after this which I need to investigate.
Reverting for now.
Reverts llvm/llvm-project#148284
2025-07-14 22:23:21 +01:00
Abid Qadeer
9d778089db
[OMPIRBuilder] Don't use invalid debug loc in task proxy function. (#148284)
This is similar to https://github.com/llvm/llvm-project/pull/147950 but
for task proxy function.
2025-07-14 21:04:34 +01:00
Finn Plummer
7ecb37b703
[HLSL][RootSignature] Retain SourceLocation of RootElement for SemaHLSL diagnostics (#147115)
At the moment, when we report diagnostics from `SemaHLSL` we only
provide the source location of the root signature attr. This allows for
significantly less helpful diagnostics (for eg. reporting resource range
overlaps).

This pr implements a way to retain the source location of a root element
when it is parsed, so that we can output the `SourceLocation` of each
root element that causes the overlap in the diagnostics during semantic
analysis.

This pr defines a wrapper struct `clang::hlsl::RootSignatureElement` in
`SemaHLSL` that will contain the underlying `RootElement` and can hold
any additional diagnostic information. This struct will be what is used
in `HLSLRootSignatureParser` and in `SemaHLSL`. Then the diagnostic
information will be stripped and the underlying element will be stored
in the `RootSignatureDecl`.

For the reporting of diagnostics, we can now use the retained
`SourceLocation` of each `RootElement` when reporting the range overlap,
and we can add a `note` diagnostic to highlight the other root element
as well.

- Defines `RootSignatureElement` in the `hlsl` namespace in `SemaHLSL`
(defined in `SemaHLSL` because `Parse` has a dependency on `Sema`)
- Updates parsing logic to construct `RootSignatureElement`s and retain
the source loction in `ParseHLSLRootSignature`
- Updates `SemaHLSL` when it constructs the `RootSignatureDecl` to take
the new `RootSignatureElement` and store the underlying `RootElement`
- Updates the current tests to ensure the new `note` diagnostic is
produced and that the `SourceLocation` is seen
- Slight update to the `RootSignatureValidations` api to ensure the
caller sorts and owns the memory of the passed in `RangeInfo`
- Adds a test to demonstrate the `SourceLocation` of both elements being
correctly pointed out

Resolves: https://github.com/llvm/llvm-project/issues/145819
2025-07-11 18:33:16 -07:00
Abid Qadeer
7b91df3868
[OMPIRBuilder] Don't use invalid debug loc in reduction functions. (#147950)
We have this pattern of code in OMPIRBuilder for many functions that are
used in reduction operations.

 ```
 Function *LtGRFunc = Function::Create
  BasicBlock *EntryBlock = BasicBlock::Create(Ctx, "entry", LtGRFunc);
  Builder.SetInsertPoint(EntryBlock);
```

The insertion point is moved to the new function but the debug location is not updated. This means that reduction function will use the debug location that points to another function. This problem gets hidden because these functions gets inlined but the potential for failure exists.

This patch resets the debug location when insertion point is moved to new function. Some `InsertPointGuard` have been added to make sure why restore the debug location correctly when we are done with the reduction function.
2025-07-11 09:50:05 +01:00
Finn Plummer
420e2f584d
[DirectX] Add missing verifications during validate of DXILRootSignature (#147111)
This pr resolves some discrepancies in verification during `validate` in
`DXILRootSignature.cpp`.

Note: we don't add a backend test for version 1.0 flag values because it
treats the struct as though there is no flags value. However, this will
be used when we use the verifications in the frontend.

- Updates `verifyDescriptorFlag` to check for valid flags based on
version, as reflected [here](https://github.com/llvm/wg-hlsl/pull/297)
- Add test to demonstrate updated flag verifications
- Adds `verifyNumDescriptors` to the validation of `DescriptorRange`s
- Add a test to demonstrate `numDescriptors` verification
- Updates a number of tests that mistakenly had an invalid
`numDescriptors` specified

Resolves: https://github.com/llvm/llvm-project/issues/147107
2025-07-09 12:02:02 -07:00
Finn Plummer
aa1829df02
[NFC][HLSL] Move resource range logic from SemaHLSL to RootSignatureValidations (#147117)
This pr abstracts out the logic of detecting resource range overlap from
`SemaHLSL` into the `RootSignatureValidations` library.

For more context see linked issue.

- Moves the validation logic from `SemaHLSL` to
`RootSignatureValidations`
- Updates `SemaHLSL` to use the new interface for the validations

Resolves: https://github.com/llvm/llvm-project/issues/146393
2025-07-08 18:13:36 -07:00
Finn Plummer
deba201f70
[NFC][HLSL][DirectX] Move DXILRootSignature validations to RootSignatureValidations library (#147110)
Simple code movement of the verification logic in `validate` of the
`DXILRootSignature` pass.

Moving this code to the `RootSignatureValidations` library allows for
the common verifications to be used in the frontend.

- Moves all the `static` verification functions used in
`DXILRootSignature` to the `RootSignatureValidations` library
- Updates `DXILRootSignature` to use the moved functions

Resolves: https://github.com/llvm/llvm-project/issues/145940
2025-07-07 15:25:08 -07:00
Finn Plummer
56e3fc4c42
[NFC][HLSL][RootSignature] Split up HLSLRootSignatureUtils (#146124)
This pr breaks-up `HLSLRootSignatureUtils` into separate orthogonal and
meaningful libraries. This prevents it ending up as a dumping grounds of
many different parts.

- Creates a library `RootSignatureMetadata` to contain helper functions
for interacting the root signatures in their metadata representation
- Create a library `RootSignatureValidations` to contain helper
functions that will validate various values of root signatures
- Move the serialization of root signature elements to
`HLSLRootSignature`

Resolves: https://github.com/llvm/llvm-project/issues/145946
2025-07-04 07:58:58 -07:00
Finn Plummer
0ceb0c377a
[NFC][HLSL][DirectX] Let HLSLRootSignature reuse the dxbc defined enums (#145986)
This pr removes the redundancy of having the same enums defined in both
the front-end and back-end of handling root signatures. Since there are
many more uses of the enum in the front-end of the code, we will adhere
to the naming conventions used in the front-end, to minimize the diff.

The macros in `DXContainerConstants.def` are also touched-up to be
consistent and to have each macro name follow its respective definition
in d3d12.h and searchable by name
[here](https://learn.microsoft.com/en-us/windows/win32/api/d3d12/).

Additionally, the many `getEnumNames` are moved to `DXContainer` from
`HLSLRootSignatureUtils` as they we will want them to be exposed
publicly anyways.

Changes for each enum follow the pattern of a commit that will make the
enum definition in `DXContainer` adhere to above listed naming
conventions, followed by a commit to actually use that enum in the
front-end.

Resolves https://github.com/llvm/llvm-project/issues/145815
2025-07-03 14:44:11 -07:00
Krzysztof Parzyszek
795b17d0b8
[Frontend][OpenMP] Implement directive name parser (#146776)
Implement a state machine that consumes tokens (words delimited by white
space), and returns the corresponding directive id, or fails if the tokens
did not form a valid name.
2025-07-03 12:18:01 -05:00
Abid Qadeer
d56c06e6c9
[flang][debug] Generate DISubprogramAttr for omp::TargetOp. (#146532)
This is combination of https://github.com/llvm/llvm-project/pull/138149
and https://github.com/llvm/llvm-project/pull/138039 which were opened
separately for ease of reviewing. Only other change is adjustments in 2
tests which have gone in since.

There are `DeclareOp` present for the variables mapped into target
region. That allow us to generate debug information for them. But the
`TargetOp` is still part of parent function and those variables get the
parent function's `DISubprogram` as a scope.
    
In `OMPIRBuilder`, a new function is created for the `TargetOp`. We also
create a new `DISubprogram` for it. All the variables that were in the
target region now have to be updated to have the correct scope. This
after the fact updating of
debug information becomes very difficult in certain cases. Take the
example of variable arrays. The type of those arrays depend on the
artificial `DILocalVariable`(s) which hold the size(s) of the array.
This new function will now require that we generate the new variable and
and new types. Similar issue exist for character type variables too.
    
To avoid this after the fact updating, this PR generates a
`DISubprogramAttr` for the `TargetOp` while generating the debug info in
`flang`. Then we don't need to generate a `DISubprogram` in
`OMPIRBuilder`. This change is made a bit more complicated by the the
fact that in new scheme, the debug location already points to the new
`DISubprogram` by the time it reaches `convertOmpTarget`. But we need
some code generation in the parent function so we have to carefully
manage the debug locations.
    
This fixes issue `#134991`.
2025-07-03 10:38:28 +01:00
Abid Qadeer
4233ca1e4e
[OMPIRBuilder] Fix use of uninitialized variable. (#145883)
The code in `OpenMPIRBuilder::getTargetEntryUniqueInfo` calls
`ID.getDevice()` even when `getUniqueID` has failed and ID is
un-initialized. This caused a sanitizer fail for me in
https://github.com/llvm/llvm-project/pull/145026. Fix it by giving a
default value to `ID`. The value chosen is the same as used in
`OpenMPToLLVMIRTranslation.cpp`.
2025-06-26 16:10:58 +01:00
Kajetan Puchalski
d3ed84ed67
[Utils][mlir] Fix interaction between CodeExtractor and OpenMPIRBuilder (#145051)
CodeExtractor can currently erroneously insert an alloca into a
different function than it inserts its users into, in cases where code
is being extracted out of a function that has already been outlined. Add
an assertion that the two blocks being inserted into are actually in the
same function.

Add a check to findAllocaInsertPoint in OpenMP to LLVMIR translation to
prevent the aforementioned scenario from happening.

OpenMPIRBuilder relies on a callback mechanism to fix-up a module later
on during the finaliser step. In some cases this results in the module
being invalid prior to the finalise step running. Remove calls to
verifyModule wrapped in LLVM_DEBUG from CodeExtractor, as the presence
of those results in the compiler crashing with -mllvm -debug due to
premature module verification where it would not crash without -debug.

Call ompBuilder->finalize() the end of mlir::translateModuleToLLVMIR, in
order to make sure the module has actually been finalized prior to
trying to verify it.

Resolves https://github.com/llvm/llvm-project/issues/138102.

---------

Signed-off-by: Kajetan Puchalski <kajetan.puchalski@arm.com>
2025-06-25 13:34:35 +01:00
Finn Plummer
310a62f88a
[HLSL][RootSignature] Plug-in serialization and add full sample testcase (#144769)
This pr extends `dumpRootElements` to invoke the print methods of all
`RootElement`s now that they are all implemented.

Extends the `RootSignatures-AST.hlsl` testcase to have a root element of
each type being parsed, constructed to the in-memory representation mode
and then being dumped as part of the AST dump.

- Update `HLSLRootSignatureUtils.cpp` to extend `dumpRootElements`
- Extend `AST/HLSL/RootSigantures-AST.hlsl` testcase
- Defines the helper `operator<<` for `RootElement`
- Small correction to the output of `numDescriptors` to be `unbounded`
in special case

Resolves https://github.com/llvm/llvm-project/issues/124595.
2025-06-23 17:19:12 -07:00
Finn Plummer
e6ee2c7c7b
[HLSL][RootSignature] Implement validation of resource ranges for RootDescriptors (#140962)
As was established
[previously](https://github.com/llvm/llvm-project/pull/140957), we
created a structure to model a resource range and to detect an overlap
in a given set of these.

However, a resource range only overlaps with another resource range if
they have:
- equivalent ResourceClass (SRV, UAV, CBuffer, Sampler)
- equivalent resource name-space
- overlapping shader visibility

For instance, the following don't overlap even though they have the same
register range:
- `CBV(b0)` and `SRV(t0)` (different resource class)
- `CBV(b0, space = 0)` and `CBV(b0, space = 1)` (different space)
- `CBV(b0, visibility = Pixel)` and `CBV(b0, visibility = Domain)`
(non-overlapping visibility)

The first two clauses are naturally modelled by grouping all the
`RangeInfo`s that have the equivalent `ResourceClass` and `Space` values
together and check if there is any overlap on a `ResourceRange` for all
these `RangeInfo`s. However, `Visibility` is not quite as easily mapped
(`Visibility = All` would overlap with any other visibility). So we will
instead need to track a `ResourceRange` for each of the `Visibility`
types in a group. Then we can determine when inserting a range of the
same group if it would overlap with any overlapping visibilities.

The collection of `RangeInfo` for `RootDescriptor`s, sorting of the
`RangeInfo`s into the groups and finally the insertion of each point
into their respective `ResourceRange`s are implemented. Furthermore, we
integrate this into `SemaHLSL` to provide a diagnostic for each entry
function that uses the invalid root signature.

- Implements collection of `RangeInfo` for `RootDescriptors`
- Implements resource range validation in `SemaHLSL`
- Add diagnostic testing of error production in
`RootSignature-resource-ranges-err.hlsl`
- Add testing to ensure no errors are raised in valid root signatures
`RootSignature-resource-ranges.hlsl`

Part 2 of https://github.com/llvm/llvm-project/issues/129942

A final pr will be produced to integrate the analysis of
`DescriptorTable`, `StaticSampler` and `RootConstants` by defining how
to construct the `RangeInfo` from their elements respectively.
2025-06-20 14:54:58 -07:00
Mikael Holmen
c16dc63b44 [OMPIRBuilder] Fix gcc -Wparentheses warning [NFC]
Without this gcc warned like
 /repo/llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:7559:68: warning: suggest parentheses around '&&' within '||' [-Wparentheses]
  7559 |         NumStaleCIArgs == (OffloadingArraysToPrivatize.size() + 2) &&
       |         ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~^~
  7560 |             "Wrong number of arguments for StaleCI when shareds are present");
       |             ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
2025-06-18 10:59:18 +02:00
Finn Plummer
87b13ada10
[HLSL][RootSignature] Implement serialization of remaining Root Elements (#143198)
Implements serialization of the remaining `RootElement`s, namely
`RootDescriptor`s and `StaticSampler`s.

- Adds unit testing for the serialization methods

Resolves https://github.com/llvm/llvm-project/issues/138191
Resolves https://github.com/llvm/llvm-project/issues/138193
2025-06-17 15:59:38 -07:00
Finn Plummer
9e0186d925
[HLSL][RootSignature] Implement ResourceRange as an IntervalMap (#140957)
A resource range consists of a closed interval, `[a;b]`, denoting which
shader registers it is bound to.

For instance:
 - `CBV(b1)`  corresponds to the resource range of `[1;1]`
 - `CBV(b0, numDescriptors = 3)` likewise to `[0;2]`

We want to provide an error diagnostic when there is an overlap in the
required registers (an overlap in the resource ranges).

The goal of this pr is to implement a structure to model a set of
resource ranges and provide an api to detect any overlap over a set of
resource ranges.

`ResourceRange` models this by implementing an `IntervalMap` to denote a
mapping from an interval of registers back to a resource range. It
allows for a new `ResourceRange` to be added to the mapping and it will
report if and what the first overlap is.

For the context of how this will be used in validation of a
`RootSignatureDecl` please see the proceeding pull request here:
https://github.com/llvm/llvm-project/pull/140962.

- Implements `ResourceRange` as an `IntervalMap`
- Adds unit testing of the various `insert` scenarios

Note: it was also considered to implement this as an `IntervalTree`,
this would allow reporting of a diagnostic for each overlap that is
encountered, as opposed to just the first. However, error generation of
just reporting the first error is already rather verbose, and adding the
additional diagnostics only made this worse.

Part 1 of https://github.com/llvm/llvm-project/issues/129942
2025-06-17 10:24:57 -07:00
Abid Qadeer
2c90ebf3a7
[OMPIRBuilder][debug] Don't drop debug info for loop constructs. (#144393)
In OMPIRBuilder, we have many cases where we don't handle the debug
location correctly while changing the location or insertion point. This
is one of those cases.

Please see the following test program.
```
program main
  implicit none
  integer i, j
  integer array(16384)

!$omp target teams distribute
  DO i=1,16384
    !$omp parallel do
      DO j=1,16384
        array(j) = i
      ENDDO
    !$omp end parallel do
  ENDDO
!$omp end target teams distribute

print *, array
end program main
```

When tried to compile with the follownig command
`flang -g -O2 -fopenmp  test.f90 -o test  --offload-arch=gfx90a`

will fail in the verification with the following errors: `!dbg
attachment points at wrong subprogram for function`

This happens because we were dropping the debug location in the
createCanonicalLoop and the call to the functions like
`__kmpc_distribute_static_4u` get generated without a debug location.
When it gets inlined, the locations inside it are not adjusted as the
call instruction does not have the debug locations
(`llvm/lib/Transforms/Utils/InlineFunction.cpp:fixupLineNumbers`). Later
Verifier finds that the caller have instructions with debug locations
that point to another function and fails.

The fix is simple to not drop the debug location.
2025-06-17 09:34:47 +01:00
Finn Plummer
a383b1a95b
Reland "[HLSL][RootSignature] Implement serialization of RootConstants and RootFlags" (#143019)
This relands #141130.

The initial commit uncovered that we are missing the correct linking of
FrontendHLSL into clang/lib/Parse and clang/lib/unittests/Parse.

This change addreses this by linking them accordingly.

It was also checked and ensured that the LexHLSLRootSignature libraries
do not depend on FrontendHLSL and so we are not required to link there.

Resolves: #138190 and #138192
2025-06-16 14:52:59 -07:00
Finn Plummer
63b80dd01d
[NFC][RootSignature] Use llvm::EnumEntry for serialization of Root Signature Elements (#144106)
It has pointed out
[here](https://github.com/llvm/llvm-project/pull/143198#discussion_r2132877388)
that we may be able to use `llvm::EnumEntry` so that we can re-use the
printing logic across enumerations.

- Enables re-use of `printEnum` and `printFlags` methods via templates
- Allows easy definition of `getEnumName` function for enum-to-string
conversion, eliminating the need to use a string stream for constructing
the Name SmallString

- Also, does a small fix-up of the operands for descriptor table clause
to be consistent with other `Build*` methods

For reference, the
[test-cases](https://github.com/llvm/llvm-project/blob/main/llvm/unittests/Frontend/HLSLRootSignatureDumpTest.cpp)
that must not change expected output.
2025-06-16 11:45:19 -07:00
Karlo Basioli
ec32d88585
Annotate potentially unused variables introduced in #133499 (#144379) 2025-06-16 17:02:24 +01:00
Kazu Hirata
2dd50bf79e [OpenMP] Fix warnings
This patch fixes:

  llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:7233:9: error: unused
  variable 'TaskTy' [-Werror,-Wunused-variable]

  llvm/lib/Frontend/OpenMP/OMPIRBuilder.cpp:7666:15: error: unused
  variable 'ArrayType' [-Werror,-Wunused-variable]
2025-06-16 08:54:04 -07:00
Pranav Bhandarkar
404597061f
[OMPIRBuilder] - Make offloading input data persist for deferred target tasks (#133499)
When we offload to the target, the pointers to data used by the kernel
are passed in arrays created by `OMPIRBuilder`. These arrays of pointers
are allocated on the stack on the host. This is fine for the most part
because absent the `nowait` clause, the default behavior is that target
tasks are included tasks. That is, the host is blocked until the
offloaded target kernel is done. In turn, this means that the host's
stack frame is intact and accessing the array of pointers when
offloading is safe. However, when `nowait` is used on the `!$ omp
target` instance, then the target task is a deferred task meaning, the
generating task on the host does not have to wait for the target task
to finish. In such cases, it is very likely that the stack frame of the
function invoking the target call is wound up thereby leading to memory
access errors as shown below.
```
AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_INVALID_ALLOCATION: The requested allocation is not valid.
AMDGPU error: Error in hsa_amd_memory_pool_allocate: HSA_STATUS_ERROR_INVALID_ALLOCATION: The requested allocation is not valid. "PluginInterface" error: Failure to allocate device memory: Failed to allocate from memory manager
fort.cod.out: /llvm/llvm-project/offload/plugins-nextgen/common/src/PluginInterface.cpp:1434: Error llvm::omp::target::plugin::PinnedAllocationMapTy::lockMappedHostBuffer(void *, size_t): Assertion `HstPtr && "Invalid pointer"' failed.
Aborted (core dumped)
```
This PR implements support in `OMPIRBuilder` to store these arrays of
pointers in the task structure that is passed to the target task thereby
ensuring it is available to the target task when the target task is
eventually scheduled.

---------

Co-authored-by: Sergio Afonso <safonsof@amd.com>
2025-06-16 10:27:48 -05:00
FYK
52d34865b9
Fix and reapply IR PGO support for Flang (#142892)
This PR resubmits the changes from #136098, which was previously
reverted due to a build failure during the linking stage:

```
undefined reference to `llvm::DebugInfoCorrelate'  
undefined reference to `llvm::ProfileCorrelate'
```

The root cause was that `llvm/lib/Frontend/Driver/CodeGenOptions.cpp`
references symbols from the `Instrumentation` component, but the
`LINK_COMPONENTS` in the `llvm/lib/Frontend/CMakeLists.txt` for
`LLVMFrontendDriver` did not include it. As a result, linking failed in
configurations where these components were not transitively linked.

### Fix:

This updated patch explicitly adds `Instrumentation` to
`LINK_COMPONENTS` in the relevant `llvm/lib/Frontend/CMakeLists.txt`
file to ensure the required symbols are properly resolved.

---------

Co-authored-by: ict-ql <168183727+ict-ql@users.noreply.github.com>
Co-authored-by: Chyaka <52224511+liliumshade@users.noreply.github.com>
Co-authored-by: Tarun Prabhu <tarunprabhu@gmail.com>
2025-06-13 12:05:16 -06:00