619 Commits

Author SHA1 Message Date
Philip Reames
e6b4a21849
[IR] Add utilities for manipulating length of MemIntrinsic [nfc] (#153856)
Goal is simply to reduce direct usage of getLength and setLength so that
if we end up moving memset.pattern (whose length is in elements) there
are fewer places to audit.
2025-08-20 13:50:11 -07:00
Orlando Cazalet-Hyams
d13341db26
[RemoveDIs][NFC] Remove getAssignmentMarkers (#153214)
getAssignmentMarkers was for debug intrinsics. getDVRAssignmentMarkers
is used for DbgRecords.
2025-08-13 10:56:19 +01:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
2025-08-08 11:09:34 +02:00
Tommy MᶜMichen
155359c1f2
[llvm][sroa] Disable support for invariant.group (#151743)
Resolves #151574.

> SROA pass does not perform aggregate load/store rewriting on a pointer
whose source is a `launder.invariant.group`.
> 
> This causes failed assertion in `AllocaSlices`.
> 
> ```
> void (anonymous
namespace)::AllocaSlices::SliceBuilder::visitStoreInst(StoreInst &):
> Assertion `(!SI.isSimple() || ValOp->getType()->isSingleValueType())
&&
>  "All simple FCA stores should have been pre-split"' failed.
> ```

Disables support for `{launder,strip}.invariant.group` intrinsics in
SROA.

Updates SROA test for `invariant.group` support.
2025-08-05 09:59:07 +02:00
Jeremy Morse
5328c732a4
[DebugInfo] Strip more debug-intrinsic code from local utils (#149037)
SROA and a few other facilities use generic-lambdas and some overloaded
functions to deal with both intrinsics and debug-records at the same time.
As part of stripping out intrinsic support, delete a swathe of this code
from things in the Utils directory.

This is a large diff, but is mostly about removing functions that were
duplicated during the migration to debug records. I've taken a few
opportunities to replace comments about "intrinsics" with "records",
and replace generic lambdas with plain lambdas (I believe this makes
it more readable).

All of this is chipping away at intrinsic-specific code until we get to
removing parts of findDbgUsers, which is the final boss -- we can't
remove that until almost everything else is gone.
2025-07-16 14:13:53 +01:00
Alex MacLean
0008af882d
[SROA] Allow as zext<i1> index when unfolding GEP select (#146929)
A zero-extension from an i1 is equivalent to a select with constant 0
and 1 values. Add this case when rewriting gep(select) -> select(gep) to
expose more opportunities for SROA.
2025-07-04 08:16:19 -07:00
Paul Walker
ea9046699e
[LLVM][SROA] Teach SROA how to "bitcast" between fixed and scalable vectors. (#130973)
For function whose vscale_range is limited to a single value we can size
scalable vectors. This aids SROA by allowing scalable vector load and
store operations to be considered for replacement whereby bitcasts
through memory can be replaced by vector insert or extract operations.
2025-06-11 11:02:32 +01:00
Nikita Popov
5c97397c2c
[SROA] Support load-only promotion with dynamic offset loads (#135609)
If we do load-only promotion, it is okay if we leave some loads alone.
We only need to know all stores that affect a specific location.

As such, we can handle loads with unknown offset via the "escaped
read-only" code path.

This is something we already support in LICM load-only promotion, but
doing this in SROA is much better from a phase ordering perspective.

Fixes https://github.com/llvm/llvm-project/issues/134513.
2025-04-17 10:42:07 +02:00
Nikita Popov
a9474191e0
[SROA] Improve handling of lifetimes in load-only promotion (#135382)
The propagateStoredValuesToLoads() transform currently bails out if
there is a lifetime intrinsic spanning the whole alloca, but the
individual loads/stores operate on some smaller part, because the slice
/ partition size does not match.
    
Fix this by ignoring assume-like slices early, regardless of which range
they cover.
    
I've changed the overall code structure here a bit because I was getting
confused by the different iterators.
2025-04-14 11:52:42 +02:00
Rahul Joshi
74b7abf154
[IRBuilder] Add new overload for CreateIntrinsic (#131942)
Add a new `CreateIntrinsic` overload with no `Types`, useful for
creating calls to non-overloaded intrinsics that don't need additional
mangling.
2025-03-31 08:10:34 -07:00
Kazu Hirata
73dc2afd2c
[Transforms] Use *Set::insert_range (NFC) (#132652)
We can use *Set::insert_range to collapse:

  for (auto Elem : Range)
    Set.insert(E);

down to:

  Set.insert_range(Range);

In some cases, we can further fold that into the set declaration.
2025-03-23 19:42:53 -07:00
Nikita Popov
55b480ec3c
[SROA] Allow load-only promotion with read-only captures (#130735)
It's okay if the address or read-provenance of the pointer is captured.
We only have to make sure that there are no unanalyzable writes to the
pointer.
2025-03-13 09:53:02 +01:00
Veera
9d1fbbd2b9
[SROA][NFC] Remove Unused Parameter in promoteAllocas() (#128382)
Removing it because `Function &F` is not used by `promoteAllocas()`.
2025-02-23 11:17:43 -05:00
Harald van Dijk
1083ec647f
[reland][DebugInfo] Update DIBuilder insertion to take InsertPosition (#126967)
After #124287 updated several functions to return iterators rather than
Instruction *, it was no longer straightforward to pass their result to
DIBuilder. This commit updates DIBuilder methods to accept an
InsertPosition instead, so that they can be called with an iterator
(preferred), or with a deprecation warning an Instruction *, or a
BasicBlock *. This commit also updates the existing calls to the
DIBuilder methods to pass in iterators.

As a special exception, DIBuilder::insertDeclare() keeps a separate
overload accepting a BasicBlock *InsertAtEnd. This is because despite
the name, this method does not insert at the end of the block, therefore
this cannot be handled implicitly by using InsertPosition.
2025-02-13 10:46:42 +00:00
Harald van Dijk
23209eb1d9 Revert "[DebugInfo] Update DIBuilder insertion to take InsertPosition (#126059)"
This reverts commit 3ec9f7494b31f2fe51d5ed0e07adcf4b7199def6.
2025-02-12 17:50:39 +00:00
Harald van Dijk
3ec9f7494b
[DebugInfo] Update DIBuilder insertion to take InsertPosition (#126059)
After #124287 updated several functions to return iterators rather than
Instruction *, it was no longer straightforward to pass their result to
DIBuilder. This commit updates DIBuilder methods to accept an
InsertPosition instead, so that they can be called with an iterator
(preferred), or with a deprecation warning an Instruction *, or a
BasicBlock *. This commit also updates the existing calls to the
DIBuilder methods to pass in iterators.
2025-02-12 17:38:59 +00:00
Jeremy Morse
34b139594a
[NFC][DebugInfo] Switch more call-sites to using iterator-insertion (#124283)
To finalise the "RemoveDIs" work removing debug intrinsics, we're
updating call sites that insert instructions to use iterators instead.
This set of changes are those where it's not immediately obvious that
just calling getIterator to fetch an iterator is correct, and one or two
places where more than one line needs to change.

Overall the same rule holds though: iterators generated for the start of
a block such as getFirstNonPHIIt need to be passed into insert/move
methods without being unwrapped/rewrapped, everything else can use
getIterator.
2025-01-27 16:44:14 +00:00
David Green
2a7ed2c1aa [SROA] Protect against calling the alloca ptr
In case we are calling the alloca ptr directly, check that the Use is a normal
operand to the call. Fortran is a funny language.
2024-12-17 09:21:15 +00:00
David Green
0032c151dc [SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645)
Given an alloca that potentially has many uses in big complex code and
escapes into a call that is readonly+nocapture, we cannot easily split
up the alloca. There are several optimizations that will attempt to take
a value that is stored and a reload, and replace the load with the
original stored value. Instcombine has some simple heuristics, GVN can
sometimes do it, as can CSE in limited situations. They all suffer from
the same issue with complex code - they start from a load/store and need
to prove no-alias for all code between, which in complex cases might be
a lot to look through. Especially if the ptr is an alloca with many uses
that is over the normal escape capture limits.

The pass that does do well with allocas is SROA, as it has a complete
view of all of the uses. This patch adds a case to SROA where it can
detect allocas that are passed into calls that are no-capture readonly.
It can then optimize the reloaded values inside the alloca slice with
the stored value knowing that it is valid no matter the location of the
loads/stores from the no-escaping nature of the alloca.
2024-12-14 18:07:21 +00:00
Kirill Stoimenov
e3676aa21f Revert "[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645)"
Causing buffer overflow:

SUMMARY: AddressSanitizer: heap-buffer-overflow llvm/lib/Transforms/Scalar/SROA.cpp:5552:35

This reverts commit 5e247d726d7a54cf0acc997bc17b50e7494e6fa3.
2024-12-12 21:32:35 +00:00
Kirill Stoimenov
7071cd3885 Revert "[Transforms] Silence a warning in SROA.cpp (NFC)"
This reverts commit 5b077506de26b1dfce1926895548b86f2106bed9.
2024-12-12 21:31:15 +00:00
Jie Fu
5b077506de [Transforms] Silence a warning in SROA.cpp (NFC)
/llvm-project/llvm/lib/Transforms/Scalar/SROA.cpp:5526:48:
 error: '&&' within '||' [-Werror,-Wlogical-op-parentheses]
          if (!SI->isSimple() || PartitionType && UserTy != PartitionType)
                              ~~ ~~~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~
/llvm-project/llvm/lib/Transforms/Scalar/SROA.cpp:5526:48:
 note: place parentheses around the '&&' expression to silence this warning
          if (!SI->isSimple() || PartitionType && UserTy != PartitionType)
                                               ^
                                 (                                       )
1 error generated.
2024-12-12 18:44:19 +08:00
David Green
5e247d726d
[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645)
Given an alloca that potentially has many uses in big complex code and
escapes into a call that is readonly+nocapture, we cannot easily split
up the alloca. There are several optimizations that will attempt to take
a value that is stored and a reload, and replace the load with the
original stored value. Instcombine has some simple heuristics, GVN can
sometimes do it, as can CSE in limited situations. They all suffer from
the same issue with complex code - they start from a load/store and need
to prove no-alias for all code between, which in complex cases might be
a lot to look through. Especially if the ptr is an alloca with many uses
that is over the normal escape capture limits.

The pass that does do well with allocas is SROA, as it has a complete
view of all of the uses. This patch adds a case to SROA where it can
detect allocas that are passed into calls that are no-capture readonly.
It can then optimize the reloaded values inside the alloca slice with
the stored value knowing that it is valid no matter the location of the
loads/stores from the no-escaping nature of the alloca.
2024-12-12 10:27:27 +00:00
Kazu Hirata
5b19ed8bb4
[llvm] Migrate away from PointerUnion::{is,get,dyn_cast} (NFC) (#115626)
Note that PointerUnion::{is,get,dyn_cast} have been soft deprecated in
PointerUnion.h:

  // FIXME: Replace the uses of is(), get() and dyn_cast() with
  //        isa<T>, cast<T> and the llvm::dyn_cast<T>
2024-11-10 07:24:06 -08:00
Afanasyev Ivan
5a41800ea1
[SROA] Fix NumPromoted statistic for SROA pass (#115586)
`NumPromoted` stat should not be increased if `SROASkipMem2Reg` is set
and nothing is changed.
2024-11-09 10:08:58 +01:00
Kazu Hirata
94f9cbbe49
[Scalar] Remove unused includes (NFC) (#114645)
Identified with misc-include-cleaner.
2024-11-02 08:32:26 -07:00
Jay Foad
e03f427196
[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133)
It is almost always simpler to use {} instead of std::nullopt to
initialize an empty ArrayRef. This patch changes all occurrences I could
find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor
could be deprecated or removed.
2024-09-19 16:16:38 +01:00
Bartłomiej Chmiel
c2b92a4250
[SROA] Use SetVector for PromotableAllocas (#105809)
Use SetVector to make operations more efficient
if there is a very large number of allocas.
2024-09-04 14:10:51 +02:00
Stephen Tozer
3d08ade7bd
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for this pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be set on by `-Og`, as discussed in the RFC
here:

https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850

This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or this) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-29 17:53:32 +01:00
Shubham Sandeep Rastogi
359c704004
Handle #dbg_values in SROA. (#94070)
This patch properly handles #dbg_values in SROA by making sure that any
#dbg_values get moved to before a store just like #dbg_declares do, or
the #dbg_value is correctly updated with the right alloca after an
aggregate alloca is broken up.

The issue stems from swift where #dbg_values are emitted and not
dbg.declares, the SROA pass doesn't handle the #dbg_values correctly and
it causes them to all have undefs

If we look at this simple-ish testcase (This is all I could reduce it
down to, and I am still relatively bad at writing llvm IR by hand so I
apologize in advance):

```
%T4main1TV13TangentVectorV = type <{ %T4main1UV13TangentVectorV, [7 x i8], %T4main1UV13TangentVectorV }>
%T4main1UV13TangentVectorV = type <{ %T1M1SVySfG, [7 x i8], %T4main1VV13TangentVectorV }>
%T1M1SVySfG = type <{ ptr, %Ts4Int8V }>
%Ts4Int8V = type <{ i8 }>
%T4main1VV13TangentVectorV = type <{ %T1M1SVySfG }>
define hidden swiftcc void @"$s4main1TV13TangentVectorV1poiyA2E_AEtFZ"(ptr noalias nocapture sret(%T4main1TV13TangentVectorV) %0, ptr noalias nocapture dereferenceable(57) %1, ptr noalias nocapture dereferenceable(57) %2) #0 !dbg !44 {
entry:
  %3 = alloca %T4main1VV13TangentVectorV
  %4 = alloca %T4main1UV13TangentVectorV
  %5 = alloca %T4main1VV13TangentVectorV
  %6 = alloca %T4main1UV13TangentVectorV
  %7 = alloca %T4main1VV13TangentVectorV
  %8 = alloca %T4main1UV13TangentVectorV
  %9 = alloca %T4main1VV13TangentVectorV
  %10 = alloca %T4main1UV13TangentVectorV
  call void @llvm.lifetime.start.p0(i64 9, ptr %3)
  call void @llvm.lifetime.start.p0(i64 25, ptr %4)
  call void @llvm.lifetime.start.p0(i64 9, ptr %5)
  call void @llvm.lifetime.start.p0(i64 25, ptr %6)
  call void @llvm.lifetime.start.p0(i64 9, ptr %7)
  call void @llvm.lifetime.start.p0(i64 25, ptr %8)
  call void @llvm.lifetime.start.p0(i64 9, ptr %9)
  call void @llvm.lifetime.start.p0(i64 25, ptr %10)
  %.u1 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 0
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %4, ptr align 8 %.u1, i64 25, i1 false)
  %.u11 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 0
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %6, ptr align 8 %.u11, i64 25, i1 false)
  call void @llvm.dbg.value(metadata ptr %4, metadata !62, metadata !DIExpression(DW_OP_deref)), !dbg !75
  %.s = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 0
  %.s.c = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 0
  %11 = load ptr, ptr %.s.c
  %.s.b = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 1
  %.s.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s.b, i32 0, i32 0
  %12 = load i8, ptr %.s.b._value
  %.s2 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 0
  %.s2.c = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 0
  %13 = load ptr, ptr %.s2.c
  %.s2.b = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 1
  %.s2.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s2.b, i32 0, i32 0
  %14 = load i8, ptr %.s2.b._value
  %.v = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %3, ptr align 8 %.v, i64 9, i1 false)
  %.v3 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %5, ptr align 8 %.v3, i64 9, i1 false)
  %.s4 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %3, i32 0, i32 0
  %.s4.c = getelementptr inbounds %T1M1SVySfG, ptr %.s4, i32 0, i32 0
  %18 = load ptr, ptr %.s4.c
  %.s5 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %5, i32 0, i32 0
  %.s5.c = getelementptr inbounds %T1M1SVySfG, ptr %.s5, i32 0, i32 0
  %20 = load ptr, ptr %.s5.c
  %.u2 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %8, ptr align 8 %.u2, i64 25, i1 false)
  %.u26 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %10, ptr align 8 %.u26, i64 25, i1 false)
  %.s7 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 0
  %.s7.c = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 0
  %25 = load ptr, ptr %.s7.c
  %.s7.b = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 1
  %.s7.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s7.b, i32 0, i32 0
  %26 = load i8, ptr %.s7.b._value
  %.s8 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 0
  %.s8.c = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 0
  %27 = load ptr, ptr %.s8.c
  %.s8.b = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 1
  %.s8.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s8.b, i32 0, i32 0
  %28 = load i8, ptr %.s8.b._value
  %.v9 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %7, ptr align 8 %.v9, i64 9, i1 false)
  %.v10 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %9, ptr align 8 %.v10, i64 9, i1 false)
  %.s11 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %7, i32 0, i32 0
  %.s11.c = getelementptr inbounds %T1M1SVySfG, ptr %.s11, i32 0, i32 0
  %32 = load ptr, ptr %.s11.c
  %.s12 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %9, i32 0, i32 0
  %.s12.c = getelementptr inbounds %T1M1SVySfG, ptr %.s12, i32 0, i32 0
  %34 = load ptr, ptr %.s12.c
  call void @llvm.lifetime.end.p0(i64 25, ptr %10)
  call void @llvm.lifetime.end.p0(i64 9, ptr %9)
  call void @llvm.lifetime.end.p0(i64 25, ptr %8)
  call void @llvm.lifetime.end.p0(i64 9, ptr %7)
  call void @llvm.lifetime.end.p0(i64 25, ptr %6)
  call void @llvm.lifetime.end.p0(i64 9, ptr %5)
  call void @llvm.lifetime.end.p0(i64 25, ptr %4)
  call void @llvm.lifetime.end.p0(i64 9, ptr %3)
  ret void
}
!llvm.module.flags = !{!0, !1, !2, !3, !4, !6, !7, !8, !9, !10, !11, !12, !13, !14, !15}
!swift.module.flags = !{!33}
!llvm.linker.options = !{!34, !35, !36, !37, !38, !39, !40, !41, !42, !43}
!0 = !{i32 2, !"SDK Version", [2 x i32] [i32 14, i32 4]}
!1 = !{i32 1, !"Objective-C Version", i32 2}
!2 = !{i32 1, !"Objective-C Image Info Version", i32 0}
!3 = !{i32 1, !"Objective-C Image Info Section", !"__DATA, no_dead_strip"}
!4 = !{i32 1, !"Objective-C Garbage Collection", i8 0}
!6 = !{i32 7, !"Dwarf Version", i32 4}
!7 = !{i32 2, !"Debug Info Version", i32 3}
!8 = !{i32 1, !"wchar_size", i32 4}
!9 = !{i32 8, !"PIC Level", i32 2}
!10 = !{i32 7, !"uwtable", i32 1}
!11 = !{i32 7, !"frame-pointer", i32 1}
!12 = !{i32 1, !"Swift Version", i32 7}
!13 = !{i32 1, !"Swift ABI Version", i32 7}
!14 = !{i32 1, !"Swift Major Version", i8 6}
!15 = !{i32 1, !"Swift Minor Version", i8 0}
!16 = distinct !DICompileUnit(language: DW_LANG_Swift, file: !17, imports: !18, sdk: "MacOSX14.4.sdk")
!17 = !DIFile(filename: "/Users/emilpedersen/swift2/swift/test/IRGen/debug_scope_distinct.swift", directory: "/Users/emilpedersen/swift2")
!18 = !{!19, !21, !23, !25, !27, !29, !31}
!19 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !20, file: !17)
!20 = !DIModule(scope: null, name: "main", includePath: "/Users/emilpedersen/swift2/swift/test/IRGen")
!21 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !22, file: !17)
!22 = !DIModule(scope: null, name: "Swift", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/Swift.swiftmodule/arm64-apple-macos.swiftmodule")
!23 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !24, line: 60)
!24 = !DIModule(scope: null, name: "_Differentiation", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Differentiation.swiftmodule/arm64-apple-macos.swiftmodule")
!25 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !26, line: 61)
!26 = !DIModule(scope: null, name: "M", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/test-macosx-arm64/IRGen/Output/debug_scope_distinct.swift.tmp/M.swiftmodule")
!27 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !28, file: !17)
!28 = !DIModule(scope: null, name: "_StringProcessing", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_StringProcessing.swiftmodule/arm64-apple-macos.swiftmodule")
!29 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !30, file: !17)
!30 = !DIModule(scope: null, name: "_SwiftConcurrencyShims", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/shims")
!31 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !32, file: !17)
!32 = !DIModule(scope: null, name: "_Concurrency", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Concurrency.swiftmodule/arm64-apple-macos.swiftmodule")
!33 = !{i1 false}
!34 = !{!"-lswiftCore"}
!35 = !{!"-lswift_StringProcessing"}
!36 = !{!"-lswift_Differentiation"}
!37 = !{!"-lswiftDarwin"}
!38 = !{!"-lswift_Concurrency"}
!39 = !{!"-lswiftSwiftOnoneSupport"}
!40 = !{!"-lobjc"}
!41 = !{!"-lswiftCompatibilityConcurrency"}
!42 = !{!"-lswiftCompatibility56"}
!43 = !{!"-lswiftCompatibilityPacks"}
!44 = distinct !DISubprogram( unit: !16, declaration: !52, retainedNodes: !53)
!45 = !DIFile(filename: "<compiler-generated>", directory: "/")
!46 = !DICompositeType(tag: DW_TAG_structure_type, scope: !47, elements: !48, identifier: "$s4main1TV13TangentVectorVD")
!47 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TVD")
!48 = !{}
!49 = !DISubroutineType(types: !50)
!50 = !{!51}
!51 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TV13TangentVectorVXMtD")
!52 = !DISubprogram( file: !45, type: !49, spFlags: DISPFlagOptimized)
!53 = !{!54, !56, !57}
!54 = !DILocalVariable( scope: !44, type: !55, flags: DIFlagArtificial)
!55 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !46)
!56 = !DILocalVariable( scope: !44, flags: DIFlagArtificial)
!57 = !DILocalVariable( scope: !44, type: !58, flags: DIFlagArtificial)
!58 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !51)
!62 = !DILocalVariable( scope: !63, type: !72, flags: DIFlagArtificial)
!63 = distinct !DISubprogram( type: !66, unit: !16, declaration: !69, retainedNodes: !70)
!64 = !DICompositeType(tag: DW_TAG_structure_type, scope: !65, identifier: "$s4main1UV13TangentVectorVD")
!65 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UVD")
!66 = !DISubroutineType(types: !67)
!67 = !{!68}
!68 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UV13TangentVectorVXMtD")
!69 = !DISubprogram( spFlags: DISPFlagOptimized)
!70 = !{!71, !73}
!71 = !DILocalVariable( scope: !63, flags: DIFlagArtificial)
!72 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !64)
!73 = !DILocalVariable( scope: !63, type: !74, flags: DIFlagArtificial)
!74 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !68)
!75 = !DILocation( scope: !63, inlinedAt: !76)
!76 = distinct !DILocation( scope: !44)

```

if we run
` opt -S -passes=sroa file.ll  -o -`

With this patch we will see
```
%.sroa.5.sroa.021 = alloca [7 x i8], align 8
tail call void @llvm.dbg.value(metadata ptr %.sroa.5.sroa.021, metadata !59, metadata !DIExpression(DW_OP_deref, DW_OP_LLVM_fragment, 72, 56)), !dbg !72
%.sroa.5.sroa.014 = alloca [7 x i8], align 8
 ```
 
 Without this patch we will see:
 
```
%.sroa.5.sroa.021 = alloca [7 x i8], align 8
%.sroa.5.sroa.014 = alloca [7 x i8], align 8
```

Thus this patch ensures that llvm.dbg.values that use allocas that are broken up still have the correct metadata and debug information is preserved

This is part of a stack of patches and is preceded by: https://github.com/llvm/llvm-project/pull/94068
2024-08-21 17:52:37 -07:00
Orlando Cazalet-Hyams
57539418ba
[SROA] Fix debug locations for variables with non-zero offsets (#97750)
Fixes issue #61981 by adjusting variable location offsets (in the DIExpression)
when splitting allocas.

Patch [4/4] to fix structured bindings in SROA.

NOTE: There's still a bug in mem2reg which generates incorrect locations in some
situations: if the variable fragment has an offset into the new (split) alloca,
mem2reg will fail to convert that into a bit shift (the location contains a
garbage offset). That's not addressed here.

insertNewDbgInst - Now takes the address-expression and FragmentInfo as
  separate parameters because unlike dbg_declares dbg_assigns want those to go
  to different places. dbg_assign records put the variable fragment info in the
  value expression only (whereas dbg_declare has only one expression so puts it
  there - ideally this information wouldn't live in DIExpression, but that's
  another issue).

MigrateOne - Modified to correctly compute the necessary offsets and fragment
  adjustments. The previous implementation produced bogus locations for variables
  with non-zero offsets. The changes replace most of the body of this lambda, so
  it might be easier to review in a split-diff view and focus on the change as a
  whole than to compare it to the old implementation.

  This uses calculateFragmentIntersect and extractLeadingOffset added in previous
  patches in this series, and createOrReplaceFragment described below.

createOrReplaceFragment - Similar to DIExpression::createFragmentExpression
  except for 3 important distinctions:

    1. The new fragment isn't relative to an existing fragment.
    2. There are no checks on the the operation types because it is assumed
       the location this expression is computing is not implicit (i.e., it's
       always safe to create a fragment because arithmetic operations apply
       to the address computation, not to an implicit value computation).
    3. Existing extract_bits are modified independetly of fragment changes
       using \p BitExtractOffset. A change to the fragment offset or size
       may affect a bit extract. But a bit extract offset can change
       independently of the fragment dimensions.

  Returns the new expression, or nullptr if one couldn't be created.  Ideally
  this is only used to signal that a bit-extract has become zero-sized (and thus
  the new debug record has no size and can be dropped), however, it fails for
  other reasons too - see the FIXME below.

  FIXME: To keep the scope of this change focused on non-bitfield structured
  bindings the function bails in situations that
  DIExpression::createFragmentExpression fails. E.g. when fragment and bit
  extract sizes differ. These limitations can be removed in the future.
2024-07-18 09:08:25 +01:00
Nikita Popov
9df71d7673
[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919)
Similar to https://github.com/llvm/llvm-project/pull/96902, this adds
`getDataLayout()` helpers to Function and GlobalValue, replacing the
current `getParent()->getDataLayout()` pattern.
2024-06-28 08:36:49 +02:00
Nikita Popov
2d209d964a
[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902)
This is a helper to avoid writing `getModule()->getDataLayout()`. I
regularly try to use this method only to remember it doesn't exist...

`getModule()->getDataLayout()` is also a common (the most common?)
reason why code has to include the Module.h header.
2024-06-27 16:38:15 +02:00
Stephen Tozer
d75f9dd1d2 Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497)"
Reverts the above commit, as it updates a common header function and
did not update all callsites:

  https://lab.llvm.org/buildbot/#/builders/29/builds/382

This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.
2024-06-24 18:00:22 +01:00
Stephen Tozer
6481dc5761
[IR][NFC] Update IRBuilder to use InsertPosition (#96497)
Uses the new InsertPosition class (added in #94226) to simplify some of
the IRBuilder interface, and removes the need to pass a BasicBlock
alongside a BasicBlock::iterator, using the fact that we can now get the
parent basic block from the iterator even if it points to the sentinel.
This patch removes the BasicBlock argument from each constructor or call
to setInsertPoint.

This has no functional effect, but later on as we look to remove the
`Instruction *InsertBefore` argument from instruction-creation
(discussed
[here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)),
this will simplify the process by allowing us to deprecate the
InsertPosition constructor directly and catch all the cases where we use
instructions rather than iterators.
2024-06-24 17:27:43 +01:00
Stephen Tozer
80f881485a
[LLVM] Add InsertPosition union-type to remove overloads of Instruction-creation (#94226)
This patch simplifies instruction creation by replacing all overloads of
instruction constructors/Create methods that are identical other than
the Instruction *InsertBefore/BasicBlock *InsertAtEnd/BasicBlock::iterator
InsertBefore argument with a single version that takes an InsertPosition
argument. The InsertPosition class can be implicitly constructed from
any of the above, internally converting them to the appropriate
BasicBlock::iterator value which can then be used to insert the
instruction (or to not insert it if an invalid iterator is passed).

The upshot of this is that code will be deduplicated, and all callsites
will switch to calling the new unified version without any changes
needed to make the compiler happy. There is at least one exception to
this; the construction of InsertPosition is a user-defined conversion,
so any caller that was already relying on a different user-defined
conversion won't work. In all of LLVM and Clang this happens exactly
once: at clang/lib/CodeGen/CGExpr.cpp:123 we try to construct an alloca
with an AssertingVH<Instruction> argument, which must now be cast to an
Instruction* by using `&*`. If this is more common elsewhere, it could
be fixed by adding an appropriate constructor to InsertPosition.
2024-06-20 10:27:55 +01:00
Kazu Hirata
7c6d0d26b1
[llvm] Use llvm::unique (NFC) (#95628) 2024-06-14 22:49:36 -07:00
Nikita Popov
738fcbee68 [SROA] Preserve all GEP flags during speculation
Unlikely to matter in practice, as these GEPs are typically
promoted away.
2024-06-14 11:48:35 +02:00
Nikita Popov
a87ff8db26
[SROA] Remove sroa-strict-inbounds option (NFC) (#94342)
This option was added in 3b79b2ab4e35353e63ba323a3de4b0a70c61a5f1 with
the comment:

> I had SROA implemented this way a long time ago and due to the
> overwhelming bugs that surfaced, moved to a much more relaxed
> variant. Richard Smith would like to understand the magnitude
> of this problem and it seems fairly harmless to keep some
> flag-controlled logic to get the extremely strict behavior here.
> I'll remove it if it doesn't prove useful.

As far as I know, it did not prove useful, so I'm removing it now.

With constant GEPs canonicalized to i8, GEPs that only temporarily go
out of bounds during the offset calculation do not naturally occur
anymore anyway.
2024-06-05 09:18:13 +02:00
henke9600
50dbbe5a0a
[SROA] Use stable sort for slices to avoid non-determinism (#91609)
Found this while trying to build a LLVM toolchain reproducibly from both
Debian 12 and FreeBSD 14. With these changes they come out bit-by-bit
identical.

Previously there was a mix of stable and unstable sorts for slices, now
only stable sorts are used.
2024-05-21 08:17:44 +02:00
Stephen Tozer
ffd08c7759
[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216)
This is the major rename patch that prior patches have built towards.
The DPValue class is being renamed to DbgVariableRecord, which reflects
the updated terminology for the "final" implementation of the RemoveDI
feature. This is a pure string substitution + clang-format patch. The
only manual component of this patch was determining where to perform
these string substitutions: `DPValue` and `DPV` are almost exclusively
used for DbgRecords, *except* for:

- llvm/lib/target, where 'DP' is used to mean double-precision, and so
appears as part of .td files and in variable names. NB: There is a
single existing use of `DPValue` here that refers to debug info, which
I've manually updated.
- llvm/tools/gold, where 'LDPV' is used as a prefix for symbol
visibility enums.

Outside of these places, I've applied several basic string
substitutions, with the intent that they only affect DbgRecord-related
identifiers; I've checked them as I went through to verify this, with
reasonable confidence that there are no unintended changes that slipped
through the cracks. The substitutions applied are all case-sensitive,
and are applied in the order shown:

```
  DPValue -> DbgVariableRecord
  DPVal -> DbgVarRec
  DPV -> DVR
```

Following the previous rename patches, it should be the case that there
are no instances of any of these strings that are meant to refer to the
general case of DbgRecords, or anything other than the DPValue class.
The idea behind this patch is therefore that pure string substitution is
correct in all cases as long as these assumptions hold.
2024-03-19 20:07:07 +00:00
Stephen Tozer
15f3f446c5
[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793)
As part of the effort to rename the DbgRecord classes, this patch
renames the widely-used functions that operate on DbgRecords but refer
to DbgValues or DPValues in their names to refer to DbgRecords instead;
all such functions are defined in one of `BasicBlock.h`,
`Instruction.h`, and `DebugProgramInstruction.h`.

This patch explicitly does not change the names of any comments or
variables, except for where they use the exact name of one of the
renamed functions. The reason for this is reviewability; this patch can
be trivially examined to determine that the only changes are direct
string substitutions and any results from clang-format responding to the
changed line lengths. Future patches will cover renaming variables and
comments, and then renaming the classes themselves.
2024-03-12 14:53:13 +00:00
Orlando Cazalet-Hyams
9997e03971
[RemoveDIs] Update DIBuilder to conditionally insert DbgRecords (#84739)
Have DIBuilder conditionally insert either debug intrinsics or DbgRecord
depending on the module's IsNewDbgInfoFormat flag. The insertion methods
now return a `DbgInstPtr` (a `PointerUnion<Instruction *, DbgRecord
*>`).

Add a unittest for both modes (I couldn't find an existing test testing
insertion behaviours specifically).

This patch changes the existing assumption that DbgRecords are only ever
inserted if there's an instruction to insert-before because clang
currently inserts debug intrinsics while CodeGening (like any other
instruction) meaning it'll try inserting to the end of a block without a
terminator. We already have machinery in place to maintain the
DbgRecords when a terminator is removed - these become "trailing
DbgRecords" which are re-attached when a new instruction is inserted.
All I've done is allow this state to occur while inserting DbgRecords
too, i.e., it's not only removing terminators that causes this valid
transient state, but inserting DbgRecords into incomplete blocks too.

The C API will be updated in follow up patches.

---

Note: this doesn't mean clang is emitting DbgRecords yet, because the
modules it creates are still always in the old debug mode. That will
come in a future patch.
2024-03-12 10:25:58 +00:00
Arthur Eubanks
eae4f56cb4 [SROA] Fix phi gep unfolding with an alloca not in entry block
Fixes a crash reported in #83494.
2024-03-07 07:23:48 +00:00
Jeffrey Byrnes
1e828f838c [SROA]: Only defer trying partial sized ptr or ptr vector types
Change-Id: Ic77f87290905addadd5819dff2d0c62f031022ab
2024-03-05 08:52:07 -08:00
Jeremy Morse
2fe81edef6 [NFC][RemoveDIs] Insert instruction using iterators in Transforms/
As part of the RemoveDIs project we need LLVM to insert instructions using
iterators wherever possible, so that the iterators can carry a bit of
debug-info. This commit implements some of that by updating the contents of
llvm/lib/Transforms/Utils to always use iterator-versions of instruction
constructors.

There are two general flavours of update:
 * Almost all call-sites just call getIterator on an instruction
 * Several make use of an existing iterator (scenarios where the code is
   actually significant for debug-info)
The underlying logic is that any call to getFirstInsertionPt or similar
APIs that identify the start of a block need to have that iterator passed
directly to the insertion function, without being converted to a bare
Instruction pointer along the way.

Noteworthy changes:
 * FindInsertedValue now takes an optional iterator rather than an
   instruction pointer, as we need to always insert with iterators,
 * I've added a few iterator-taking versions of some value-tracking and
   DomTree methods -- they just unwrap the iterator. These are purely
   convenience methods to avoid extra syntax in some passes.
 * A few calls to getNextNode become std::next instead (to keep in the
   theme of using iterators for positions),
 * SeparateConstOffsetFromGEP has it's insertion-position field changed.
   Noteworthy because it's not a purely localised spelling change.

All this should be NFC.
2024-03-05 15:12:22 +00:00
Arthur Eubanks
8848258f7b
[SROA] Unfold gep of index phi (round 2) (#83494)
If a gep has only one phi as one of its operands and the remaining
indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi
((gep ptr, idx1), (gep ptr, idx2))`.

Take care not to unfold recursive phis.

Followup to #80983.

This was initially was #83087. Initial PR did not handle allocas in
entry block that weren't at the beginning of the function, causing GEPs
to be inserted after the first chunk of allocas but potentially before
an alloca not at the beginning. Insert GEPs at the end of the entry
block instead since constants/arguments/static allocas can all be used
there.
2024-03-04 14:21:26 -08:00
Fangrui Song
43b7dfcc1d Revert "[SROA] Unfold gep of index phi (#83087)"
This reverts commit 2eb63982e88b9ed8336158d35884b1a1d04a0f78.

This caused verifier error
```
Instruction does not dominate all uses!
```
for some projects using Halide.
The verifier error happens inside `Halide::Internal::CodeGen_LLVM::optimize_module`
and looks like a genuine SROA issue.
2024-02-28 15:56:43 -08:00
Arthur Eubanks
2eb63982e8
[SROA] Unfold gep of index phi (#83087)
If a gep has only one phi as one of its operands and the remaining
indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi
((gep ptr, idx1), (gep ptr, idx2))`.

Take care not to unfold recursive phis.

Followup to #80983.
2024-02-28 10:53:47 -08:00
Arthur Eubanks
001e18c816 [NFC] clang-format SROA.cpp 2024-02-27 22:46:16 +00:00