245 Commits

Author SHA1 Message Date
Damyan Pepper
cc49f3b3e1
[NFC][HLSL] Remove confusing enum aliases / duplicates (#153909)
Remove:

* DescriptorType enum - this almost exactly shadowed the ResourceClass
enum
* ClauseType aliased ResourceClass

Although these were introduced to make the HLSL root signature handling
code a bit cleaner, they were ultimately causing confusion as they
appeared to be unique enums that needed to be converted between each
other.

Closes #153890
2025-08-18 08:58:33 -07:00
Helena Kotas
fb1035cfb4
[DirectX] Fix resource binding analysis incorrectly removing duplicates (#152253)
The resource binding analysis was incorrectly reducing the size of the
`Bindings` vector by one element after sorting and de-duplication. This
led to an inaccurate setting of the `HasOverlappingBinding` flag in the
`DXILResourceBindingInfo` analysis, as the truncated vector no longer
reflected the true binding state.

This update corrects the shrink logic and introduces an `assert` in the
`DXILPostOptimizationValidation` pass. The assertion will trigger if
`HasOverlappingBinding` is set but no corresponding error is detected,
helping catch future inconsistencies.

The bug surfaced when the `srv_metadata.hlsl` and `uav_metadata.hlsl`
tests were updated to include unbounded resource arrays as part of
https://github.com/llvm/llvm-project/issues/145422. These updated test
files are included in this PR, as they would cause the new assertion to
fire if the original issue remained unresolved.

Depends on #152250
2025-08-11 10:53:00 -07:00
Anchu Rajendran S
49ccf46adc
[OpenMP] [IR Builder] Changes to Support Scan Operation (#136035)
Scan reductions are supported in OpenMP with the help of scan directive.
Reduction clause of the for loop/simd directive can take an `inscan`
modifier along with the body of the directive specifying a `scan`
directive. This PR implements the lowering logic for scan reductions in
workshare loops of OpenMP.
The body of the for loop is split into two loops (Input phase loop and
Scan Phase loop) and a scan reduction loop is added in the middle. The
Input phase loop populates a temporary buffer with initial values that
are to be reduced. The buffer is used by the reduction loop to perform
scan reduction. Scan phase loop copies the values of the buffer to the
reduction variable before executing the scan phase. Below is a high
level view of the code generated.
```
<declare pointer to buffer> ptr
omp parallel {
  size num_iters = <num_iters>
  // temp buffer allocation
  omp masked {
    buff = malloc(num_iters*scanvarstype)
    *ptr = buff
  } 
 barrier;
  // input phase loop
  for (i: 0..<num_iters>) {
    <input phase>;
    buffer = *ptr;
    buffer[i] = red;
  }
  // scan reduction
  omp masked
  {
    for (int k = 0; k != ceil(log2(num_iters)); ++k) {
      i=pow(2,k)
      for (size cnt = last_iter; cnt >= i; --cnt) {
        buffer = *ptr;
        buffer[cnt] op= buffer[cnt-i];
      }
    }
  }
 barrier;
 // scan phase loop
  for (0..<num_iters>) {
    buffer = *ptr;
    red = buffer[i] ;
    <scan phase>;
  }
  // temp buffer deletion
  omp masked {
    free(*ptr)
  }
  barrier;
}
```
The temporary buffer needs to be shared between all threads performing
reduction since it is read/written in Input and Scan workshare Loops.
This is achieved by declaring a pointer to the buffer in the shared
region and dynamically allocating the buffer by the master thread.
This is the reason why allocation, deallocation and scan reduction are
performed within `masked`. The code is verified to produce correct
results for Fortran programs with the code changes in the PR
https://github.com/llvm/llvm-project/pull/133149
2025-08-07 14:58:11 -07:00
Justin Bogner
8a2d3f5653
[HLSL][Sema] Use hlsl::BindingInfoBuilder instead of RangeInfo. NFC (#150634)
Clean up some duplicated logic. We had two ways to do the same thing
here, and BindingInfoBuilder is more flexible.
2025-08-05 10:47:06 -07:00
Arvind Sudarsanam
b075dadbd3
Add missing link component for Frontend offloading test (#151796)
This change fixes one of the failures in
https://github.com/llvm/llvm-project/pull/147321

/usr/bin/ld:
unittests/Frontend/CMakeFiles/LLVMFrontendTests.dir/PropertySetRegistryTest.cpp.o:
undefined reference to symbol
'_ZN4llvm10offloading21writePropertiesToJSONERKSt3mapINSt7__cxx1112basic_stringIcSt11char_traitsIcESaIcEEES1_IS7_St7variantIJjNS_11SmallVectorIhLj0EEEEESt4lessIS7_ESaISt4pairIKS7_SB_EEESD_SaISE_ISF_SI_EEERNS_11raw_ostreamE'

Need to add a missing LLVM link component in CMakeLists.txt.

Thanks

Signed-off-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
2025-08-02 01:09:53 -04:00
Justin Cai
185a23e865
[SYCL] Add property set types and JSON representation (#147321)
This PR adds the `PropertySet` type, along with a pair of functions used
to serialize and deserialize into a JSON representation. A property set
is a key-value map, with values being one of 2 types - uint32 or byte
array. A property set registry is a collection of property sets, indexed
by a "category" name.

In SYCL offloading, property sets will be used to communicate metadata
about device images needed by the SYCL runtime. For example, there is a
property set which has a byte array containing the numeric ID, offset,
and size of each SYCL2020 spec constant. Another example is a property
set describing the optional kernel features used in the module: does it
use fp64? fp16? atomic64?

This metadata will be computed by `clang-sycl-linker` and the JSON
representation will be inserted in the string table of each
output `OffloadBinary`. This JSON will be consumed the SYCL offload
wrapper and will be lowered to the binary form SYCL runtime expects.

For example, consider this SYCL program that calls a kernel that uses
fp64:

```c++
#include <sycl/sycl.hpp>

using namespace sycl;
class MyKernel;

int main() {
  queue q;
  auto *p = malloc_shared<double>(1, q);
  *p = .1;
  q.single_task<MyKernel>([=]{ *p *= 2; }).wait();
  std::cout << *p << "\n";
  free(p, q);
}
```

The device code for this program would have the kernel marked with
`!sycl_used_aspects`:

```
define spir_kernel void @_ZTS8MyKernel([...]) !sycl_used_aspects !n { [...] }
!n = {i32 6}
```

`clang-sycl-linker` would recognize this metadata and then would output
the following JSON in the `OffloadBinary`'s key-value map:

```
{
  "SYCL/device requirements": {
    // aspects contains a list of sycl::aspect values used
    // by the module; in this case just the value 6 encoded 
    // as a 4-byte little-endian integer 
    "aspects": "BjAwMA=="
  }
}
```

The SYCL offload wrapper would lower those property sets to something
like this:

```c++
struct _sycl_device_binary_property_set_struct {
  char *CategoryName;
  _sycl_device_binary_property *PropertiesBegin;
  _sycl_device_binary_property *PropertiesEnd;
};

struct _sycl_device_binary_property_struct {
  char *PropertyName;  
  void *ValAddr;     
  uint64_t ValSize; 
};

//  
_sycl_device_binary_property_struct device_requirements[] = {
  /* PropertyName */ "aspects",
  /* ValAddr */ [pointer to the bytes 0x06 0x00 0x00 0x00],
  /* ValSize */ 4,
};

_sycl_device_binary_property_set_struct properties[] = {
  /* CategoryName */ "SYCL/device requirements",
  /* PropertiesBegin */ device_requirements,
  /* PropertiesEnd */ std::end(device_requirments),
}
```

---------

Co-authored-by: Arvind Sudarsanam <arvind.sudarsanam@intel.com>
2025-08-01 20:05:45 -07:00
Justin Bogner
506834deac
Suppress -Wuninitialized-const-pointer warning (#151583)
Recent clang (as of #148337) introduced a warning on passing unitialized
pointers to functions that take const pointers. This is entirely
spurious on this code, but this works around it to keep the bots happy.

Build failure: https://lab.llvm.org/buildbot/#/builders/168/builds/14779
2025-07-31 12:30:13 -07:00
Justin Bogner
3f066f5fcf
[HLSL][DirectX] Extract HLSLBinding out of DXILResource. NFC (#150633)
We extract the binding logic out of the DXILResource analysis passes into the
FrontendHLSL library. This will allow us to use this logic for resource and
root signature bindings in both the DirectX backend and the HLSL frontend.
2025-07-31 08:35:47 -07:00
Krzysztof Parzyszek
2914a488c7
[flang][OpenMP] Sema checks, lowering with new format of MAP modifiers (#149137)
OpenMP 6.0 has changed the modifiers on the MAP clause. Previous patch
has introduced parsing support for them. This patch introduces
processing of the new forms in semantic checks and in lowering. This
only applies to existing modifiers, which were updated in the 6.0 spec.
Any of the newly introduced modifiers (SELF and REF) are ignored.
2025-07-22 07:37:47 -05:00
Krzysztof Parzyszek
661cbd5a52
[utils][TableGen] Make some non-bitmask enums iterable (#148647)
Additionally, add sentinel values <Enum>::First_ and <Enum>::Last_ to
each one of those enums.

This will allow using `enum_seq_inclusive` to generate the list of
enum-typed values of any generated scoped (non-bitmask) enum.
2025-07-17 11:27:47 -05:00
Jeremy Morse
5b8c15c6e7
[DebugInfo] Remove getPrevNonDebugInstruction (#148859)
With the advent of intrinsic-less debug-info, we no longer need to
scatter calls to getPrevNonDebugInstruction around the codebase. Remove
most of them -- there are one or two that have the "SkipPseudoOp" flag
turned on, however they don't seem to be in positions where skipping
anything would be reasonable.
2025-07-16 11:41:32 +01:00
Jeremy Morse
57a5f9c47e
[DebugInfo][RemoveDIs] Suppress getNextNonDebugInfoInstruction (#144383)
There are no longer debug-info instructions, thus we don't need this
skipping. Horray!
2025-07-15 15:34:10 +01:00
Tom Eccles
a1c61ac756
[mlir][OpenMP] Allow composite SIMD REDUCTION and IF (#147568)
Reduction support: https://github.com/llvm/llvm-project/pull/146671
If Support is fixed in this PR

The problem for the IF clause in composite constructs was that wsloop
and simd both operate on the same CanonicalLoopInfo structure: with the
SIMD processed first, followed by the wsloop. Previously the IF clause
generated code like
```
if (cond) {
  while (...) {
    simd_loop_body;
  }
} else {
  while (...) {
    nonsimd_loop_body;
  }
}
```
The problem with this is that this invalidates the CanonicalLoopInfo
structure to be processed by the wsloop later. To avoid this, in this
patch I preserve the original loop, moving the IF clause inside of the
loop:
```
while (...) {
  if (cond) {
    simd_loop_body;
  } else {
    non_simd_loop_body;
  }
}
```
On simple examples I tried LLVM was able to hoist the if condition
outside of the loop at -O3.

The disadvantage of this is that we cannot add the
llvm.loop.vectorize.enable attribute on either the SIMD or non-SIMD
loops because they both share a loop back edge. There's no way of
solving this without keeping the old design of having two different
loops: which cannot be represented using only one CanonicalLoopInfo
structure. I don't think the presence or absence of this attribute makes
much difference. In my testing it is the llvm.loop.parallel_access
metadata which makes the difference to vectorization. LLVM will
vectorize if legal whether or not this attribute is there in the TRUE
branch. In the FALSE branch this means the loop might be vectorized even
when the condition is false: but I think this is still standards
compliant: OpenMP 6.0 says that when the if clause is false that should
be treated like the SIMDLEN clause is one. The SIMDLEN clause is defined
as a "hint". For the same reason, SIMDLEN and SAFELEN clauses are
silently ignored when SIMD IF is used.

I think it is better to implement SIMD IF and ignore SIMDLEN and SAFELEN
and some vectorization encouragement metadata when combined with IF than
to ignore IF because IF could have correctness consequences whereas the
rest are optimiztion hints. For example, the user might use the IF
clause to disable SIMD programatically when it is known not safe to
vectorize the loop. In this case it is not at all safe to add the
parallel access or SAFELEN metadata.
2025-07-15 10:30:02 +01:00
Finn Plummer
6a948145aa
[HLSL][RootSignature] Update setDefaultFlags to account for Root Signature Version (#145828)
This pr updates `setDefaultFlags` in `HLSLRootSignature.h` to account
for which version it should initialize the default flag values for.

- Updates `setDefaultFlags` with a `Version` argument and initializes
them to be compliant as described
[here](https://github.com/llvm/wg-hlsl/pull/297).
- Updates `RootSignatureParser` to retain the `Version` and pass this
into `setDefaultFlags`
- Updates all uses of `setDefaultFlags` in test-cases
- Adds some new unit testing to ensure behaviour is as expected and that
the Parser correctly passes down the version

Resolves https://github.com/llvm/llvm-project/issues/145820.
2025-07-04 09:48:24 -07:00
Finn Plummer
56e3fc4c42
[NFC][HLSL][RootSignature] Split up HLSLRootSignatureUtils (#146124)
This pr breaks-up `HLSLRootSignatureUtils` into separate orthogonal and
meaningful libraries. This prevents it ending up as a dumping grounds of
many different parts.

- Creates a library `RootSignatureMetadata` to contain helper functions
for interacting the root signatures in their metadata representation
- Create a library `RootSignatureValidations` to contain helper
functions that will validate various values of root signatures
- Move the serialization of root signature elements to
`HLSLRootSignature`

Resolves: https://github.com/llvm/llvm-project/issues/145946
2025-07-04 07:58:58 -07:00
Finn Plummer
0ceb0c377a
[NFC][HLSL][DirectX] Let HLSLRootSignature reuse the dxbc defined enums (#145986)
This pr removes the redundancy of having the same enums defined in both
the front-end and back-end of handling root signatures. Since there are
many more uses of the enum in the front-end of the code, we will adhere
to the naming conventions used in the front-end, to minimize the diff.

The macros in `DXContainerConstants.def` are also touched-up to be
consistent and to have each macro name follow its respective definition
in d3d12.h and searchable by name
[here](https://learn.microsoft.com/en-us/windows/win32/api/d3d12/).

Additionally, the many `getEnumNames` are moved to `DXContainer` from
`HLSLRootSignatureUtils` as they we will want them to be exposed
publicly anyways.

Changes for each enum follow the pattern of a commit that will make the
enum definition in `DXContainer` adhere to above listed naming
conventions, followed by a commit to actually use that enum in the
front-end.

Resolves https://github.com/llvm/llvm-project/issues/145815
2025-07-03 14:44:11 -07:00
Krzysztof Parzyszek
795b17d0b8
[Frontend][OpenMP] Implement directive name parser (#146776)
Implement a state machine that consumes tokens (words delimited by white
space), and returns the corresponding directive id, or fails if the tokens
did not form a valid name.
2025-07-03 12:18:01 -05:00
Krzysztof Parzyszek
d7b936b633
[OpenMP] Add directive spellings introduced in OpenMP 6.0 (#141772)
For background information see

https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507
2025-06-25 07:55:06 -05:00
Finn Plummer
310a62f88a
[HLSL][RootSignature] Plug-in serialization and add full sample testcase (#144769)
This pr extends `dumpRootElements` to invoke the print methods of all
`RootElement`s now that they are all implemented.

Extends the `RootSignatures-AST.hlsl` testcase to have a root element of
each type being parsed, constructed to the in-memory representation mode
and then being dumped as part of the AST dump.

- Update `HLSLRootSignatureUtils.cpp` to extend `dumpRootElements`
- Extend `AST/HLSL/RootSigantures-AST.hlsl` testcase
- Defines the helper `operator<<` for `RootElement`
- Small correction to the output of `numDescriptors` to be `unbounded`
in special case

Resolves https://github.com/llvm/llvm-project/issues/124595.
2025-06-23 17:19:12 -07:00
Finn Plummer
87b13ada10
[HLSL][RootSignature] Implement serialization of remaining Root Elements (#143198)
Implements serialization of the remaining `RootElement`s, namely
`RootDescriptor`s and `StaticSampler`s.

- Adds unit testing for the serialization methods

Resolves https://github.com/llvm/llvm-project/issues/138191
Resolves https://github.com/llvm/llvm-project/issues/138193
2025-06-17 15:59:38 -07:00
Finn Plummer
9e0186d925
[HLSL][RootSignature] Implement ResourceRange as an IntervalMap (#140957)
A resource range consists of a closed interval, `[a;b]`, denoting which
shader registers it is bound to.

For instance:
 - `CBV(b1)`  corresponds to the resource range of `[1;1]`
 - `CBV(b0, numDescriptors = 3)` likewise to `[0;2]`

We want to provide an error diagnostic when there is an overlap in the
required registers (an overlap in the resource ranges).

The goal of this pr is to implement a structure to model a set of
resource ranges and provide an api to detect any overlap over a set of
resource ranges.

`ResourceRange` models this by implementing an `IntervalMap` to denote a
mapping from an interval of registers back to a resource range. It
allows for a new `ResourceRange` to be added to the mapping and it will
report if and what the first overlap is.

For the context of how this will be used in validation of a
`RootSignatureDecl` please see the proceeding pull request here:
https://github.com/llvm/llvm-project/pull/140962.

- Implements `ResourceRange` as an `IntervalMap`
- Adds unit testing of the various `insert` scenarios

Note: it was also considered to implement this as an `IntervalTree`,
this would allow reporting of a diagnostic for each overlap that is
encountered, as opposed to just the first. However, error generation of
just reporting the first error is already rather verbose, and adding the
additional diagnostics only made this worse.

Part 1 of https://github.com/llvm/llvm-project/issues/129942
2025-06-17 10:24:57 -07:00
Finn Plummer
a383b1a95b
Reland "[HLSL][RootSignature] Implement serialization of RootConstants and RootFlags" (#143019)
This relands #141130.

The initial commit uncovered that we are missing the correct linking of
FrontendHLSL into clang/lib/Parse and clang/lib/unittests/Parse.

This change addreses this by linking them accordingly.

It was also checked and ensured that the LexHLSLRootSignature libraries
do not depend on FrontendHLSL and so we are not required to link there.

Resolves: #138190 and #138192
2025-06-16 14:52:59 -07:00
Finn Plummer
9ec5afea77
[NFC][RootSignature] Move RootSignature util functions (#142491)
`HLSLRootSignature.h` was originally created to hold the struct
definitions of an `llvm::hlsl::rootsig::RootElement` and some helper
functions for it.

However, there many users of the structs that don't require any of the
helper methods. This requires us to link the `FrontendHLSL` library,
where we otherwise wouldn't need to.

For instance:
- This [revert](https://github.com/llvm/llvm-project/pull/142005) was
required as it requires linking to the unrequired `FrontendHLSL` library
- As part of the change required here:
https://github.com/llvm/llvm-project/issues/126557. We will want to add
an `HLSLRootSignatureVersion` enum. Ideally this could live with the
root signature struct defs, but we don't want to link the helper objects
into `clang/Basic/TargetOptions.h`

This change allows the struct definitions to be kept in a single header
file and to then have the `FrontendHLSL` library only be linked when
required.
2025-06-03 09:59:50 -07:00
Finn Plummer
dd56693451
Revert "[HLSL][RootSignature] Implement serialization of RootConstants and RootFlags" (#142005)
The commit caused build failures,
[here](https://lab.llvm.org/buildbot/#/builders/10/builds/6308), due to
a missing linked llvm library (HLSLFrontend) into
`clang/unittests/Parse/CMakeLists.txt`.

While it seems like the fix is straightforwardly to just add this
library, I will revert now to build and verify locally it correctly
fixes it.

Reverts llvm/llvm-project#141130
2025-05-29 11:21:20 -07:00
Finn Plummer
66889bf300
[HLSL][RootSignature] Implement serialization of RootConstants and RootFlags (#141130)
- Implements serialization of the currently completely defined
`RootElement`s, namely `RootConstants` and `RootFlags`

- Adds unit testing for the serialization methods

Resolves: https://github.com/llvm/llvm-project/issues/138190 and
https://github.com/llvm/llvm-project/issues/138192
2025-05-29 10:31:08 -07:00
Finn Plummer
5d76555f93
[NFC][HLSL][RootSignature] Use operator<< overload instead of dump method (#141127)
- we will need to provide a way to dump `RootFlags` for serialization
and by using operator overloads we can maintain a consistent interface

This is an NFC to allow for
https://github.com/llvm/llvm-project/issues/138192 to be more
straightforwardly implemented.
2025-05-22 19:21:40 -07:00
Alexander Richardson
07e2ba445d
[AMDGPU] Set AS8 address width to 48 bits
Of the 128-bits of buffer descriptor only 48 bits are address bits, so
following the discussion on https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54,
the logic conclusion is to set the index width to 48 bits instead of
the current value of 128.

Most of the test changes are mechanical datalayout updates, but there
is one actual change: the ptrmask test now uses .i48 instead of .i128
and I had to update SelectionDAGBuilder to correctly extend the mask.

Reviewed By: krzysz00

Pull Request: https://github.com/llvm/llvm-project/pull/139419
2025-05-19 17:26:05 -07:00
Shafik Yaghmour
c248903053
[OpenMP][NFC] Use pass by const ref for Dependencies (#139592)
Static analysis flagged the passing of Dependencies to emitTargetCall as
a
place we could use std::move to avoid copying. A closer look indicated
we could
instead turn the parameter into a const & and not have a default value
since it
was only used in two lines in a test and changing those two locations
was easy.
2025-05-13 09:09:37 -07:00
Finn Plummer
74ed33484e
[HLSL][RootSignature] Implement serialized dump of Descriptor Tables (#138326)
- defines the `dump` method for in-memory descriptor table data structs
in `Frontend/HLSLRootSignature`
- creates unit test infrastructure to support unit tests of the dump
methods

Resolves https://github.com/llvm/llvm-project/issues/138189
2025-05-09 12:44:38 -07:00
Krzysztof Parzyszek
4b29ee407e
[LLVM][OpenMP] Add "version" parameter to getOpenMPDirectiveName (#139114)
Some OpenMP directives have different spellings in different versions of
the OpenMP spec. To use the proper spelling for a given spec version
pass "version" as a parameter to getOpenMPDirectiveName.

This parameter won't be used at the moment, and will have a default
value to allow callers not to pass it, for gradual adoption in various
components.

RFC:
https://discourse.llvm.org/t/rfc-alternative-spellings-of-openmp-directives/85507
2025-05-09 07:41:27 -05:00
Kazu Hirata
2f3067ed69
[llvm] Remove unused local variables (NFC) (#138454) 2025-05-04 09:38:16 -07:00
NimishMishra
b62afbccc8
[mlir][OpenMP] Add __atomic_store to AtomicInfo (#121055)
This PR adds functionality for `__atomic_store` libcall in AtomicInfo.
This allows for supporting complex types in `atomic write`.

Fixes https://github.com/llvm/llvm-project/issues/113479
Fixes https://github.com/llvm/llvm-project/issues/115652
2025-04-29 07:53:36 -07:00
Matt Arsenault
34e7809397
unittests: Avoid using getNumUses (#136352) 2025-04-18 23:17:26 +02:00
NimishMishra
53fa92dcad
[mlir][llvm][OpenMP] Hoist __atomic_load alloca (#132888)
Current implementation of `__atomic_compare_exchange` uses an alloca for
`__atomic_load`, leading to issues like
https://github.com/llvm/llvm-project/issues/120724. This PR hoists this
alloca to `AllocaIP`.


Fixes: https://github.com/llvm/llvm-project/issues/120724
2025-04-09 03:01:44 -07:00
Jan Leyonberg
fbc8335311
[MLIR][OpenMP] Add codegen for teams reductions (#133310)
This patch adds the lowering of teams reductions from the omp dialect to
LLVM-IR. Some minor cleanup was done in clang to remove an unused
parameter.
2025-04-07 12:47:16 -04:00
Sergio Afonso
56975b4ecd
[OpenMPIRBuilder] Split calculation of canonical loop trip count, NFC (#127820)
This patch splits off the calculation of canonical loop trip counts from
the creation of canonical loops. This makes it possible to reuse this
logic to, for instance, populate the `__tgt_target_kernel` runtime call
for SPMD kernels.

This feature is used to simplify one of the existing OpenMPIRBuilder
tests.
2025-02-25 10:32:54 +00:00
Akash Banerjee
785a5b4676
[MLIR][OpenMP] Add LLVM translation support for OpenMP UserDefinedMappers (#124746)
This patch adds OpenMPToLLVMIRTranslation support for the OpenMP Declare
Mapper directive.

Since both MLIR and Clang now support custom mappers, I've changed the
respective function params to no longer be optional as well.

Depends on #121005
2025-02-18 17:55:48 +00:00
Abid Qadeer
5f7acf7259
[flang][OMPIRbuilder] Set debug loc on terminator created by splitBB. (#125897)
Fixes #125088.

When splitBB is called with createBranch=true, it creates a branch
instruction in the old block. But no debug loc is set on that branch
instruction. If that is used as InsertPoint in the restoreIP, it has the
potential to set the current debug location to null and subsequent
instruction will come out without a debug location. This caused the
verification check to fail as shown in the bug report.

This PR changes splitBB and spliceBB function to also take a debugLoc
parameter which can be used to set the debug location of the branch
instruction.
2025-02-05 22:35:43 +00:00
Abid Qadeer
e151b1d1f6
[MLIR][OpenMP] Use correct DebugLoc in target construct callbacks. (#125856)
This is same as PR #125106 which somehow is stuck in a "Processing
Update" loop for many hours now. I am going to close that one and push
this one instead.

While working on https://github.com/llvm/llvm-project/issues/125088, I
noticed a problem with the TargetBodyGenCallbackTy and
TargetGenArgAccessorsCallbackTy. The OMPIRBuilder and MLIR side Both
maintain their own IRBuilder and when control goes from one to other, we
have to take care to not use a stale debug location. The code currently
rely on restoreIP to set the insertion point and the debug location. But
if the passes InsertPointTy has an empty block, then the debug location
will not be updated (see SetInsertPoint). This can cause invalid debug
location to be attached to instruction and the verifier will complain.

Similarly when we exit the callback, the debug location of the Builder
is not set to what it was before the callback. This again can cause
verification failures.

This PR resets the debug location at the start and also uses an
InsertPointGuard to restore the debug location at exit.

Both of these problems would have been caught by the unit tests but they
were not setting the debug location of the builder before calling the
createTarget so the problem was hidden. I have updated the tests
accordingly.
2025-02-05 14:59:37 +00:00
Ritanya-B-Bharadwaj
8c36665267
[OpenMP]Initial parsing/sema support for target_device selector set (#118471)
This patch adds initial support for target_device selector set - Section
9.2 (Spec 6.0)
2025-02-05 19:24:24 +05:30
Jeremy Morse
81d18ad864
[NFC][DebugInfo] Make some block-start-position methods return iterators (#124287)
As part of the "RemoveDIs" work to eliminate debug intrinsics, we're
replacing methods that use Instruction*'s as positions with iterators. A
number of these (such as getFirstNonPHIOrDbg) are sufficiently
infrequently used that we can just replace the pointer-returning version
with an iterator-returning version, hopefully without much/any
disruption.

Thus this patch has getFirstNonPHIOrDbg and
getFirstNonPHIOrDbgOrLifetime return an iterator, and updates all
call-sites. There are no concerns about the iterators returned being
converted to Instruction*'s and losing the debug-info bit: because the
methods skip debug intrinsics, the iterator head bit is always false
anyway.
2025-01-27 16:27:54 +00:00
Alex MacLean
07ed8187ac
[OpenMP] Replace nvvm.annotation usage with kernel calling conventions (#122320)
Specifying a kernel with the `ptx_kernel` or `amdgpu_kernel` calling
convention is a more idiomatic and compile-time performant than using
the `nvvm.annoation !"kernel"` metadata.

Transition OMPIRBuilder to use calling conventions for PTX kernels and
no longer emit `nvvm.annoation`. Update OpenMPOpt to work with kernels
specified via calling convention as well as metadata. Update OpenMP
tests to use the calling conventions.
2025-01-24 16:56:10 -08:00
Jeremy Morse
6292a808b3
[NFC][DebugInfo] Use iterator-flavour getFirstNonPHI at many call-sites (#123737)
As part of the "RemoveDIs" project, BasicBlock::iterator now carries a
debug-info bit that's needed when getFirstNonPHI and similar feed into
instruction insertion positions. Call-sites where that's necessary were
updated a year ago; but to ensure some type safety however, we'd like to
have all calls to getFirstNonPHI use the iterator-returning version.

This patch changes a bunch of call-sites calling getFirstNonPHI to use
getFirstNonPHIIt, which returns an iterator. All these call sites are
where it's obviously safe to fetch the iterator then dereference it. A
follow-up patch will contain less-obviously-safe changes.

We'll eventually deprecate and remove the instruction-pointer
getFirstNonPHI, but not before adding concise documentation of what
considerations are needed (very few).

---------

Co-authored-by: Stephen Tozer <Melamoto@gmail.com>
2025-01-24 13:27:56 +00:00
Mats Jun Larsen
7bb949ec61
[IR][unittests] Replace of PointerType::getUnqual(Type) with opaque version (NFC) (#123901)
Follow up to https://github.com/llvm/llvm-project/issues/123569
2025-01-22 18:02:51 +09:00
Sergio Afonso
9bc8828093
[OMPIRBuilder][MLIR] Add support for target 'if' clause (#122478)
This patch implements support for handling the 'if' clause of OpenMP
'target' constructs in the OMPIRBuilder and updates MLIR to LLVM IR
translation of the `omp.target` MLIR operation to make use of this new
feature.
2025-01-15 10:16:19 +00:00
Sergio Afonso
d0b641b7e2
[OMPIRBuilder] Propagate attributes to outlined target regions (#117875)
This patch copies the target-cpu and target-features attributes of
functions containing target regions into the corresponding outlined
function holding the target region.

This mirrors what is currently being done for all other outlined
functions through the `CodeExtractor` in `OpenMPIRBuilder::finalize()`.
2025-01-14 12:35:50 +00:00
Sergio Afonso
fabc443e93
[OMPIRBuilder] Support runtime number of teams and threads, and SPMD mode (#116051)
This patch introduces a `TargetKernelRuntimeAttrs` structure to hold
host-evaluated `num_teams`, `thread_limit`, `num_threads` and trip count
values passed to the runtime kernel offloading call.

Additionally, kernel type information is used to influence target device
code generation and the `IsSPMD` flag is replaced by `ExecFlags`, which
provides more granularity.
2025-01-14 12:34:37 +00:00
Sergio Afonso
27bc6bdaba
[OMPIRBuilder] Introduce struct to hold default kernel teams/threads (#116050)
This patch introduces the `OpenMPIRBuilder::TargetKernelDefaultAttrs`
structure used to simplify passing default and constant values for
number of teams and threads, and possibly other target kernel-related
information in the future.

This is used to forward values passed to `createTarget` to
`createTargetInit`, which previously used a default unrelated set of
values.
2025-01-14 11:08:55 +00:00
Sergio Afonso
b79ed8729b
[OpenMP][OMPIRBuilder] Handle non-failing calls properly (#115863)
The preprocessor definition used to enable asserts and the one that
`llvm::Error` and `llvm::Expected` use to ensure all created instances are
checked are not the same. By making these checks inside of an `assert` in cases
where errors are not expected, certain build configurations would trigger
runtime failures (e.g. `-DLLVM_ENABLE_ASSERTIONS=OFF
-DLLVM_UNREACHABLE_OPTIMIZE=ON`).

The `llvm::cantFail()` function, which was intended for this use case, is used
by this patch in place of `assert` to prevent these runtime failures. In tests,
new preprocessor definitions based on `ASSERT_THAT_EXPECTED` and
`EXPECT_THAT_EXPECTED` are used instead, to avoid silent failures in release
builds.
2025-01-09 10:28:16 +00:00
Kaviya Rajendiran
d3eb65f15d
[MLIR][OpenMP] Lowering aligned clause to LLVM IR for SIMD directive (#119536)
This patch,
- Added a translation support for aligned clause in SIMD directive by passing the alignment details to "llvm.assume" intrinsic.
- Updated the insertion point for llvm.assume intrinsic call in "OMPIRBuilder.cpp".
- Added a check in aligned clause MLIR lowering, to ensure that the alignment value must be a power of 2.
2025-01-03 16:22:38 +05:30