945 Commits

Author SHA1 Message Date
Alexander Richardson
3a4b351ba1
[IR] Introduce the ptrtoaddr instruction
This introduces a new `ptrtoaddr` instruction which is similar to
`ptrtoint` but has two differences:

1) Unlike `ptrtoint`, `ptrtoaddr` does not capture provenance
2) `ptrtoaddr` only extracts (and then extends/truncates) the low
   index-width bits of the pointer

For most architectures, difference 2) does not matter since index (address)
width and pointer representation width are the same, but this does make a
difference for architectures that have pointers that aren't just plain
integer addresses such as AMDGPU fat pointers or CHERI capabilities.

This commit introduces textual and bitcode IR support as well as basic code
generation, but optimization passes do not handle the new instruction yet
so it may result in worse code than using ptrtoint. Follow-up changes will
update capture tracking, etc. for the new instruction.

RFC: https://discourse.llvm.org/t/clarifiying-the-semantics-of-ptrtoint/83987/54

Reviewed By: nikic

Pull Request: https://github.com/llvm/llvm-project/pull/139357
2025-08-08 10:12:39 -07:00
Diana Picus
20d8398825
[AMDGPU] ISel & PEI for whole wave functions (#145858)
Whole wave functions are functions that will run with a full EXEC mask.
They will not be invoked directly, but instead will be launched by way
of a new intrinsic, `llvm.amdgcn.call.whole.wave` (to be added in
a future patch). These functions are meant as an alternative to the
`llvm.amdgcn.init.whole.wave` or `llvm.amdgcn.strict.wwm` intrinsics.

Whole wave functions will set EXEC to -1 in the prologue and restore the
original value of EXEC in the epilogue. They must have a special first
argument, `i1 %active`, that is going to be mapped to EXEC. They may
have either the default calling convention or amdgpu_gfx. The inactive
lanes need to be preserved for all registers used, active lanes only for
the CSRs.

At the IR level, arguments to a whole wave function (other than
`%active`) contain poison in their inactive lanes. Likewise, the return
value for the inactive lanes is poison.

This patch contains the following work:
* 2 new pseudos, SI_SETUP_WHOLE_WAVE_FUNC and SI_WHOLE_WAVE_FUNC_RETURN
  used for managing the EXEC mask. SI_SETUP_WHOLE_WAVE_FUNC will return
  a SReg_1 representing `%active`, which needs to be passed into
  SI_WHOLE_WAVE_FUNC_RETURN.
* SelectionDAG support for generating these 2 new pseudos and the
  special handling of %active. Since the return may be in a different
  basic block, it's difficult to add the virtual reg for %active to
  SI_WHOLE_WAVE_FUNC_RETURN, so we initially generate an IMPLICIT_DEF
  which is later replaced via a custom inserter.
* Expansion of the 2 pseudos during prolog/epilog insertion. PEI also
  marks any used VGPRs as WWM registers, which are then spilled and
  restored with the usual logic.

Future patches will include the `llvm.amdgcn.call.whole.wave` intrinsic
and a lot of optimization work (especially in order to reduce spills
around function calls).

---------

Co-authored-by: Matt Arsenault <Matthew.Arsenault@amd.com>
Co-authored-by: Shilei Tian <i@tianshilei.me>
2025-07-21 10:39:09 +02:00
Jeremy Morse
51f4e2cda2
[Bitcode][NFC] Add abbrev for FUNC_CODE_DEBUG_LOC (#147211)
DILocations that are not attached to instructions are encoded using
METADATA_LOCATION records which have an abbrev. DILocations attached to
instructions are interleaved with instruction records as FUNC_CODE_DEBUG_LOC
records, which do not have an abbrev (and FUNC_CODE_DEBUG_LOC_AGAIN
which have no operands).

Add a new FUNCTION_BLOCK abbrev FUNCTION_DEBUG_LOC_ABBREV for
FUNC_CODE_DEBUG_LOC records.

This reduces the bc file size by up to 7% in CTMark, with many between 2-4%
smaller.

[per-file file size
compile-time-tracker](https://llvm-compile-time-tracker.com/compare.php?from=75cf826849713c00829cdf657e330e24c1a2fd03&to=1e268ebd0a581016660d9d7e942495c1be041f7d&stat=size-file&details=on)
(go to stage1-ReleaseLTO-g).

This optimisation is motivated by #144102, which adds the new Key Instructions
fields to bitcode records. The combined patches still overall look to be a
slight improvement over the base.

(Originally reviewed in PR #146497)

Co-authored-by: Orlando Cazalet-Hyams <orlando.hyams@sony.com>
2025-07-06 22:30:31 +01:00
Nikita Popov
0a656d8e57
[Bitcode] Add abbreviations for additional instructions (#146825)
Add abbreviations for icmp/fcmp, store and br, which are the most common
instructions that don't have abbreviations yet. This requires increasing
the abbreviation size to 5 bits.

This gives about 3-5% bitcode size reductions for the clang build.
2025-07-03 14:28:32 +02:00
Antonio Frighetto
f1cc0b607b [IR] Introduce dead_on_return attribute
Add `dead_on_return` attribute, which is meant to be taken advantage
by the frontend, and states that the memory pointed to by the argument
is dead upon function return. As with `byval`, it is supposed to be
used for passing aggregates by value. The difference lies in the ABI:
`byval` implies that the pointer is explicitly passed as argument to
the callee (during codegen the copy is emitted as per byval contract),
whereas a `dead_on_return`-marked argument implies that the copy
already exists in the IR, is located at a specific stack offset within
the caller, and this memory will not be read further by the caller upon
callee return – or otherwise poison, if read before being written.

RFC: https://discourse.llvm.org/t/rfc-add-dead-on-return-attribute/86871.
2025-07-02 09:29:36 +02:00
Timothy Werquin
a8e486bfc4
[Bitcode] Fix constexpr expansion creating invalid PHIs (#141560)
Fixes errors about duplicate PHI edges when the input had duplicates
with constexprs in them. The constexpr translation makes new basic
blocks, causing the verifier to complain about duplicate entries in PHI
nodes.
2025-05-27 15:51:48 +02:00
Jonathan Thackray
6e49f73825
Reland [llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#137701)
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.

These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but
are atomic and use IEEE754 2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
     https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.

Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
2025-04-30 22:06:37 +01:00
Jonathan Thackray
7ee0097b48
Revert "[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions" (#137657)
Reverts llvm/llvm-project#136759 due to bad interaction with c792b25e4
2025-04-28 16:53:36 +01:00
Jonathan Thackray
ba420d8122
[llvm] Add support for llvm IR atomicrmw fminimum/fmaximum instructions (#136759)
This patch adds support for LLVM IR atomicrmw `fmaximum` and `fminimum`
instructions.

These mirror the `llvm.maximum.*` and `llvm.minimum.*` instructions, but
are atomic and use IEEE754 2019 handling for NaNs, which is different to
`fmax` and `fmin`. See:
     https://llvm.org/docs/LangRef.html#llvm-minimum-intrinsic
for more details.

Future changes will allow this LLVM IR to be lowered to specialised
assembler instructions on suitable targets, such as AArch64.
2025-04-28 15:31:44 +01:00
Matt Arsenault
c5ac63e4fc
ThinLTO: Add flag to print uselistorder in bitcode writer pass (#133230)
This is needed in llvm-reduce to avoid perturbing the uselistorder in
intermediate steps. Really llvm-reduce wants pure serialization with
no dependency on the pass manager. There are other optimizations mixed
in to the serialization here depending on metadata in the module, which
is also bad.
    
Part of #63621
2025-04-14 20:34:02 +02:00
Shilei Tian
a45b133d40
[AMDGPU][Verifier] Mark calls to entry functions as invalid in the IR verifier (#134910) 2025-04-11 15:32:37 -04:00
Shilei Tian
61a7289077 [NFC] Remove trailing whitespaces in llvm/test/Bitcode/calling-conventions.3.2.ll 2025-04-10 13:01:39 -04:00
Jeremy Morse
40f9bb9e25
[DebugInfo][RemoveDIs] Eliminate another debug-info variation flag (#133917)
The "preserve input debug-info format" flag allowed some tooling to opt
into not seeing the new debug records yet, and to not autoupgrade. This
was good at the time, but un-necessary now that we'll be ditching
intrinsics shortly.

It also hides errors now: verify-uselistorder was hardcoding this flag
to on, and as a result it hasn't seen debug records before. Thus, we
missed a uselistorder variation: constant-expressions such as GEPs can
be contained within debug records and completely isolated from the value
hierachy, see the metadata-use-uselistorder.ll test. These Values didn't
get ordered, but were legitimate uses of constants like "i64 0", and we
now run into difficulty handling that. The patch to AsmWriter seeks
Values to order even through debug-info now.

Finally there are a few intrinsics-tests relying on this flag that we
can just delete, such as one in llvm-reduce and another few in the
LocalTest unit tests. For the fast-isel test, it was added in
https://reviews.llvm.org/D67703 explicitly for checking the size of
blocks without debug-info and in 1525abb9c94 the codepath it tests moved
towards being sunsetted. It'll be totally redundant once RemoveDIs is on
permanently.

Note that there's now no explicit test for the textual-IR autoupgrade
path. I submit that we can rely on the thousands of .ll files where
we've only been bothered to update the outputs, not the inputs, to debug
records.
2025-04-09 18:00:28 +01:00
Matt Arsenault
28a391848c Bitcode: Convert test to opaque pointers 2025-04-07 21:59:11 +07:00
Jeremy Morse
1ebc308bba
[DebugInfo][RemoveDIs] Remove debug-intrinsic printing cmdline options (#131855)
During the transition from debug intrinsics to debug records, we used
several different command line options to customise handling: the
printing of debug records to bitcode and textual could be independent of
how the debug-info was represented inside a module, whether the
autoupgrader ran could be customised. This was all valuable during
development, but now that totally removing debug intrinsics is coming
up, this patch removes those options in favour of a single flag
(experimental-debuginfo-iterators), which enables autoupgrade, in-memory
debug records, and debug record printing to bitcode and textual IR.

We need to do this ahead of removing the
experimental-debuginfo-iterators flag, to reduce the amount of
test-juggling that happens at that time.

There are quite a number of weird test behaviours related to this --
some of which I simply delete in this commit. Things like
print-non-instruction-debug-info.ll , the test suite now checks for
debug records in all tests, and we don't want to check we can print as
intrinsics. Or the update_test_checks tests -- these are duplicated with
write-experimental-debuginfo=false to ensure file writing for intrinsics
is correct, but that's something we're imminently going to delete.

A short survey of curious test changes:
* free-intrinsics.ll: we don't need to test that debug-info is a zero
cost intrinsic, because we won't be using intrinsics in the future.
* undef-dbg-val.ll: apparently we pinned this to non-RemoveDIs in-memory
mode while we sorted something out; it works now either way.
* salvage-cast-debug-info.ll: was testing intrinsics-in-memory get
salvaged, isn't necessary now
* localize-constexpr-debuginfo.ll: was producing "dead metadata"
intrinsics for optimised-out variable values, dbg-records takes the
(correct) representation of poison/undef as an operand. Looks like we
didn't update this in the past to avoid spurious test differences.
* Transforms/Scalarizer/dbginfo.ll: this test was explicitly testing
that debug-info affected codegen, and we deferred updating the tests
until now. This is just one of those silent gnochange issues that get
fixed by RemoveDIs.

Finally: I've added a bitcode test, dbg-intrinsics-autoupgrade.ll.bc,
that checks we can autoupgrade debug intrinsics that are in bitcode into
the new debug records.
2025-04-01 14:27:11 +01:00
Tom Tromey
68947342b7
Add support for fixed-point types (#129596)
This adds DWARF generation for fixed-point types. This feature is needed
by Ada.

Note that a pre-existing GNU extension is used in one case. This has
been emitted by GCC for years, and is needed because standard DWARF is
otherwise incapable of representing these types.
2025-03-31 07:42:21 -07:00
Tom Tromey
f89129af8a
Add bit stride to DICompositeType (#131680)
In Ada, an array can be packed and the elements can take less space than
their natural object size. For example, for this type:

   type Packed_Array is array (4 .. 8) of Boolean;
   pragma pack (Packed_Array);

... each element of the array occupies a single bit, even though the
"natural" size for a Boolean in memory is a byte.

In DWARF, this is represented by putting a DW_AT_bit_stride onto the
array type itself.

This patch adds a bit stride to DICompositeType so that gnat-llvm can
emit DWARF for these sorts of arrays.
2025-03-25 17:14:07 -07:00
Michael Buch
a27da0a20c
[llvm][MetadataLoader] Make sure we correctly load DW_APPLE_ENUM_KIND from bitcode (#132374)
This was pointed out in
https://github.com/llvm/llvm-project/pull/124752#issuecomment-2730052773

There was no test that roundtrips this attribute through LLVM bitcode,
so this was never caught.
2025-03-22 08:09:49 +00:00
Brandon Wu
c804e86f55
[RISCV][VLS] Support RISCV VLS calling convention (#100346)
This patch adds a function attribute `riscv_vls_cc` for RISCV VLS
calling
convention which takes 0 or 1 argument, the argument is the `ABI_VLEN`
which is the `VLEN` for passing the fixed-vector arguments, it wraps the
argument as a scalable vector(VLA) using the `ABI_VLEN` and uses the
corresponding mechanism to handle it. The range of `ABI_VLEN` is [32,
65536],
if not specified, the default value is 128.

Here is an example of VLS argument passing:
Non-VLS call:
```
  void original_call(__attribute__((vector_size(16))) int arg) {}
=>
  define void @original_call(i128 noundef %arg) {
  entry:
    ...
    ret void
  }
```
VLS call:
```
  void __attribute__((riscv_vls_cc(256))) vls_call(__attribute__((vector_size(16))) int arg) {}
=>
  define riscv_vls_cc void @vls_call(<vscale x 1 x i32> %arg) {
  entry:
    ...
    ret void
  }
}
```

The first Non-VLS call passes generic vector argument of 16 bytes by
flattened integer.
On the contrary, the VLS call uses `ABI_VLEN=256` which wraps the
vector to <vscale x 1 x i32> where the number of scalable vector
elements
is calaulated by: `ORIG_ELTS * RVV_BITS_PER_BLOCK / ABI_VLEN`.
Note: ORIG_ELTS = Vector Size / Type Size = 128 / 32 = 4.

PsABI PR: https://github.com/riscv-non-isa/riscv-elf-psabi-doc/pull/418
C-API PR: https://github.com/riscv-non-isa/riscv-c-api-doc/pull/68
2025-03-03 12:39:35 +08:00
Tom Tromey
e298fc2da9
Add DISubrangeType (#126772)
An Ada program can have types that are subranges of other types. This
patch adds a new DIType node, DISubrangeType, to represent this concept.
    
I considered extending the existing DISubrange to do this, but as
DISubrange does not derive from DIType, that approach seemed more
disruptive.
    
A DISubrangeType can be used both as an ordinary type, but also as the
type of an array index. This is also important for Ada.

Ada subrange types can also be stored using a bias. Representing this in
the DWARF required the use of an extension. GCC has been emitting this
extension for years, so I've reused it here.
2025-02-24 10:11:53 -08:00
Pedro Lobo
4af8c5382e
[BitcodeReader] Use poison instead of undef to represent unsupported constexprs in metadata (#127665)
Metadata that references unsupported constant expressions can be
represented with `poison` metadata instead of `undef` metadata.
2025-02-19 10:13:37 +01:00
Antonio Frighetto
ff585feacf [IR][ModRef] Introduce errno memory location
Model C/C++ `errno` macro by adding a corresponding `errno`
memory location kind to the IR. Preliminary work to separate
`errno` writes from other memory accesses, to the benefit of
alias analyses and optimization correctness.

Previous discussion: https://discourse.llvm.org/t/rfc-modelling-errno-memory-effects/82972.
2025-02-13 12:13:39 +01:00
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
2025-01-29 16:56:47 +01:00
Nikita Popov
22e9024c9f
[IR] Introduce captures attribute (#116990)
This introduces the `captures` attribute as described in:
https://discourse.llvm.org/t/rfc-improvements-to-capture-tracking/81420

This initial patch only introduces the IR/bitcode support for the
attribute and its in-memory representation as `CaptureInfo`. This will
be followed by a patch to upgrade and remove the `nocapture` attribute,
and then by actual inference/analysis support.

Based on the RFC feedback, I've used a syntax similar to the `memory`
attribute, though the only "location" that can be specified is `ret`.

I've added some pretty extensive documentation to LangRef on the
semantics. One non-obvious bit here is that using ptrtoint will not
result in a "return-only" capture, even if the ptrtoint result is only
used in the return value. Without this requirement we wouldn't be able
to continue ordinary capture analysis on the return value.
2025-01-13 14:40:25 +01:00
John Brawn
ecbe4d1e36
[IR] Allow fast math flags on fptrunc and fpext (#115894)
This consists of:
 * Make these instructions part of FPMathOperator.
* Adjust bitcode/ir readers/writers to expect fast math flags on these
instructions.
 * Make IRBuilder set the fast math flags on these instructions.
 * Update langref and release notes.
* Update a bunch of tests. Some of these are due to InstCombineCasts
incorrectly adding fast math flags to fptrunc, which will be fixed in a
later patch.
2024-12-04 10:53:04 +00:00
Nikita Popov
98204a2e5b [Bitcode] Verify types for aggregate initializers
Unfortunately all the nice error messages get lost because we
don't forward errors from lazy value materialization.

Fixes https://github.com/llvm/llvm-project/issues/117707.
2024-11-28 11:21:38 +01:00
Paul Walker
56c091ea71
[LLVM][IR] Use splat syntax when printing ConstantExpr based splats. (#116856)
This brings the printing of scalable vector constant splats inline with
their fixed length counterparts.
2024-11-21 11:21:12 +00:00
Teresa Johnson
b35f40688e
[MemProf] Change the STACK_ID record to fixed width values (#116448)
The stack ids are hashes that are close to 64 bits in size, so emitting
as a pair of 32-bit fixed-width values is more efficient than a VBR.
This reduced the summary bitcode size for a large target by about 1%.

Bump the index version and ensure we can read the old format.
2024-11-18 15:16:48 -08:00
Lee Wei
1469d82e1c
Remove br i1 undef from some regression tests [NFC] (#115130)
As defined in LangRef, branching on `undef` is undefined behavior.
This PR aims to remove undefined behavior from tests. As UB tests break
Alive2 and may be the root cause of breaking future optimizations.

Here's an Alive2 proof for one of the examples:
https://alive2.llvm.org/ce/z/TncxhP
2024-11-07 08:11:15 +00:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
davidtrevelyan
4102625380
[rtsan][llvm][NFC] Rename sanitize_realtime_unsafe attr to sanitize_realtime_blocking (#113155)
# What

This PR renames the newly-introduced llvm attribute
`sanitize_realtime_unsafe` to `sanitize_realtime_blocking`. Likewise,
sibling variables such as `SanitizeRealtimeUnsafe` are renamed to
`SanitizeRealtimeBlocking` respectively. There are no other functional
changes.


# Why?

- There are a number of problems that can cause a function to be
real-time "unsafe",
- we wish to communicate what problems rtsan detects and *why* they're
unsafe, and
- a generic "unsafe" attribute is, in our opinion, too broad a net -
which may lead to future implementations that need extra contextual
information passed through them in order to communicate meaningful
reasons to users.
- We want to avoid this situation and make the runtime library boundary
API/ABI as simple as possible, and
- we believe that restricting the scope of attributes to names like
`sanitize_realtime_blocking` is an effective means of doing so.

We also feel that the symmetry between `[[clang::blocking]]` and
`sanitize_realtime_blocking` is easier to follow as a developer.

# Concerns

- I'm aware that the LLVM attribute `sanitize_realtime_unsafe` has been
part of the tree for a few weeks now (introduced here:
https://github.com/llvm/llvm-project/pull/106754). Given that it hasn't
been released in version 20 yet, am I correct in considering this to not
be a breaking change?
2024-10-26 13:06:11 +01:00
Jay Foad
922992a22f
Fix typo "instrinsic" (#112899) 2024-10-18 15:58:33 +01:00
elhewaty
9efb07f261
[IR] Add samesign flag to icmp instruction (#111419)
Inspired by
https://discourse.llvm.org/t/rfc-signedness-independent-icmps/81423
2024-10-15 17:11:25 +08:00
Tim Renouf
76007138f4
[LLVM] New NoDivergenceSource function attribute (#111832)
A call to a function that has this attribute is not a source of
divergence, as used by UniformityAnalysis. That allows a front-end to
use known-name calls as an instruction extension mechanism (e.g.
https://github.com/GPUOpen-Drivers/llvm-dialects ) without such a call
being a source of divergence.
2024-10-12 09:34:45 +01:00
Serge Pavlov
15de239406
[IR] Allow MDString in operand bundles (#110805)
This change implements support of metadata strings in operand bundle
values. It makes possible calls like:

    call void @some_func(i32 %x) [ "foo"(i32 42, metadata !"abc") ]

It requires some extension of the bitcode serialization. As SSA values
and metadata are stored in different tables, there must be a way to
distinguish them during deserialization. It is implemented by putting a
special marker before the metadata index. The marker cannot be treated
as a reference to any SSA value, so it unambiguously identifies
metadata. It allows extending the bitcode serialization without breaking
compatibility.

Metadata as operand bundle values are intended to be used in
floating-point function calls. They would represent the same information
as now is passed by the constrained intrinsic arguments.
2024-10-11 12:09:10 +07:00
Matt Arsenault
c198f775cd
AMDGPU: Remove flat/global fmin/fmax intrinsics (#105642)
These have been replaced with atomicrmw
2024-10-09 09:27:28 +04:00
Matt Arsenault
9dca83f2e1
AMDGPU: Add noalias.addrspace metadata when autoupgrading atomic intrinsics (#102599)
This will be needed to continue generating the raw instruction in the flat case.
2024-10-08 00:13:28 +04:00
Benjamin Maxwell
95f00a63ce
[IR] Allow fast math flags on calls with homogeneous FP struct types (#110506)
This extends FPMathOperator to allow calls that return literal structs
of homogeneous floating-point or vector-of-floating-point types.

The intended use case for this is to support FP intrinsics that return
multiple values (such as `llvm.sincos`).
2024-10-02 10:05:09 +01:00
davidtrevelyan
0f488a0b7d
[LLVM][rtsan] Add sanitize_realtime_unsafe attribute (#106754) 2024-09-19 16:45:25 -06:00
Mingming Liu
d4ddf06b0c
[NFCI]Remove EntryCount from FunctionSummary and clean up surrounding synthetic count passes. (#107471)
The primary motivation is to remove `EntryCount` from `FunctionSummary`.
This frees 8 bytes out of `sizeof(FunctionSummary)` (136 bytes as of
64498c5483).

While I'm at it, this PR clean up {SummaryBasedOptimizations,
SyntheticCountsPropagation} since they were not used and there are no
plans to further invest on them.

With this patch, bitcode writer writes a placeholder 0 at the byte
offset of `EntryCount` and bitcode reader can parse the function entry
count at the correct byte offset. Added a TODO to stop writing
`EntryCount` and bump bitcode version
2024-09-06 16:38:17 -07:00
anjenner
4af249fe6e
Add usub_cond and usub_sat operations to atomicrmw (#105568)
These both perform conditional subtraction, returning the minuend and
zero respectively, if the difference is negative.
2024-09-06 16:19:20 +01:00
Nikita Popov
5dcea4628d [AutoUpgrade] Preserve attributes when upgrading named struct return
For example, if the argument has an alignment attribute, preserve it.
2024-09-02 12:42:52 +02:00
Chris Apple
fef3426ad3
Revert "[LLVM][rtsan] Add LLVM nosanitize_realtime attribute (#105447)" (#106743)
This reverts commit 178fc4779ece31392a2cd01472b0279e50b3a199.

This attribute was not needed now that we are using the lsan style
ScopedDisabler for disabling this sanitizer

See #106736 
#106125 

For more discussion
2024-08-30 07:48:31 -07:00
Maciej Gabka
95d2d1cba0
Move stepvector intrinsic out of experimental namespace (#98043)
This patch is moving out stepvector intrinsic from the experimental
namespace.

This intrinsic exists in LLVM for several years now, and is widely used.
2024-08-28 12:48:20 +01:00
Jan Voung
fa4fbaefde
Reapply: Use an abbrev to reduce size of VALUE_GUID records in ThinLTO summaries (#106165)
This retries #90692 which was reverted previously due to issues with
lld-available being set, even if the copy of lld is not built from
source.

This does not change any code compared to #90692 to address the
lld-available issue.
The main change w.r.t, lld-available is xfailing tests in PR #99056
(until a longer term fix is available).
2024-08-27 13:53:25 -04:00
Chris Apple
178fc4779e
[LLVM][rtsan] Add LLVM nosanitize_realtime attribute (#105447) 2024-08-26 12:49:27 -07:00
Matt Arsenault
ee08d9cba5
AMDGPU: Remove global/flat atomic fadd intrinics (#97051)
These have been replaced with atomicrmw.
2024-08-22 23:27:33 +04:00
Matt Arsenault
9d364286f3
AMDGPU: Remove flat/global atomic fadd v2bf16 intrinsics (#97050)
These are now fully covered by atomicrmw.
2024-08-21 14:26:42 +04:00
Sergei Barannikov
c91cc459d3
[DataLayout] Refactor the rest of parseSpecification (#104545)
The aim is to improve test coverage of data layout string parsing.

Pull Request: https://github.com/llvm/llvm-project/pull/104545
2024-08-20 11:25:49 +03:00
Matt Arsenault
70feafdb27
IR/AMDGPU: Autoupgrade amdgpu-unsafe-fp-atomics attribute (#101698)
Delete the attribute and annotate any atomicrmw instructions in the
function with new metadata.
2024-08-12 14:56:53 +04:00