376 Commits

Author SHA1 Message Date
Nuno Lopes
d0029b87d8
remove UB from test [NFC] 2025-08-19 11:18:27 +01:00
Nikita Popov
c23b4fbdbb
[IR] Remove size argument from lifetime intrinsics (#150248)
Now that #149310 has restricted lifetime intrinsics to only work on
allocas, we can also drop the explicit size argument. Instead, the size
is implied by the alloca.

This removes the ability to only mark a prefix of an alloca alive/dead.
We never used that capability, so we should remove the need to handle
that possibility everywhere (though many key places, including stack
coloring, did not actually respect this).
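
As a rough illustration (a sketch that is not part of the commit; the function and value names are made up), the change to the intrinsic calls looks like this:

```llvm
; Before #150248: explicit size operand.
;   call void @llvm.lifetime.start.p0(i64 4, ptr %a)
;   call void @llvm.lifetime.end.p0(i64 4, ptr %a)

; After #150248: the size is implied by the alloca.
define i32 @example() {
  %a = alloca i32
  call void @llvm.lifetime.start.p0(ptr %a)
  store i32 42, ptr %a
  %v = load i32, ptr %a
  call void @llvm.lifetime.end.p0(ptr %a)
  ret i32 %v
}

declare void @llvm.lifetime.start.p0(ptr)
declare void @llvm.lifetime.end.p0(ptr)
```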
2025-08-08 11:09:34 +02:00
Tommy MᶜMichen
155359c1f2
[llvm][sroa] Disable support for invariant.group (#151743)
Resolves #151574.

> SROA pass does not perform aggregate load/store rewriting on a pointer
> whose source is a `launder.invariant.group`.
>
> This causes failed assertion in `AllocaSlices`.
>
> ```
> void (anonymous namespace)::AllocaSlices::SliceBuilder::visitStoreInst(StoreInst &):
> Assertion `(!SI.isSimple() || ValOp->getType()->isSingleValueType()) &&
>  "All simple FCA stores should have been pre-split"' failed.
> ```

Disables support for `{launder,strip}.invariant.group` intrinsics in
SROA.

Updates SROA test for `invariant.group` support.
2025-08-05 09:59:07 +02:00
Nikita Popov
2c6eec219d [Tests] Avoid lifetime intrinsics on non-allocas (NFC)
Don't rely on auto-upgrade, instead either remove unnecessary
casts or remove no longer applicable tests.
2025-07-23 15:05:43 +02:00
Alex MacLean
0008af882d
[SROA] Allow as zext<i1> index when unfolding GEP select (#146929)
A zero-extension from an i1 is equivalent to a select with constant 0
and 1 values. Add this case when rewriting gep(select) -> select(gep) to
expose more opportunities for SROA.
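
A minimal sketch of the equivalence described above, assuming a two-element array; the function names `@before` and `@after` are hypothetical and the exact output of SROA may differ:

```llvm
; Before: the GEP index is a zero-extended i1.
define i32 @before(i1 %c) {
  %a = alloca [2 x i32]
  store i32 10, ptr %a
  %hi = getelementptr inbounds [2 x i32], ptr %a, i64 0, i64 1
  store i32 20, ptr %hi
  %idx = zext i1 %c to i64
  %p = getelementptr inbounds [2 x i32], ptr %a, i64 0, i64 %idx
  %v = load i32, ptr %p
  ret i32 %v
}

; After (conceptually): zext i1 %c behaves like select %c, 1, 0,
; so the GEP can be split into two constant-index GEPs.
define i32 @after(i1 %c) {
  %a = alloca [2 x i32]
  store i32 10, ptr %a
  %p1 = getelementptr inbounds [2 x i32], ptr %a, i64 0, i64 1
  store i32 20, ptr %p1
  %p0 = getelementptr inbounds [2 x i32], ptr %a, i64 0, i64 0
  %p = select i1 %c, ptr %p1, ptr %p0
  %v = load i32, ptr %p
  ret i32 %v
}
```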
2025-07-04 08:16:19 -07:00
Paul Walker
ea9046699e
[LLVM][SROA] Teach SROA how to "bitcast" between fixed and scalable vectors. (#130973)
For functions whose vscale_range is limited to a single value, we can size
scalable vectors. This aids SROA by allowing scalable vector load and
store operations to be considered for replacement whereby bitcasts
through memory can be replaced by vector insert or extract operations.
2025-06-11 11:02:32 +01:00
Nikita Popov
5c97397c2c
[SROA] Support load-only promotion with dynamic offset loads (#135609)
If we do load-only promotion, it is okay if we leave some loads alone.
We only need to know all stores that affect a specific location.

As such, we can handle loads with unknown offset via the "escaped
read-only" code path.

This is something we already support in LICM load-only promotion, but
doing this in SROA is much better from a phase ordering perspective.
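
The kind of input this targets might look like the following sketch (names are hypothetical, and which loads actually get rewritten is an assumption based on the description above rather than verified output):

```llvm
define i32 @example(i64 %i) {
  %a = alloca [2 x i32]
  store i32 1, ptr %a
  %hi = getelementptr inbounds i8, ptr %a, i64 4
  store i32 2, ptr %hi
  ; Load at a known offset: all stores to this location are known,
  ; so it is a candidate for replacement by the stored value 1.
  %v0 = load i32, ptr %a
  ; Load at an unknown offset: may simply be left in place.
  %dyn = getelementptr inbounds [2 x i32], ptr %a, i64 0, i64 %i
  %v.dyn = load i32, ptr %dyn
  %r = add i32 %v0, %v.dyn
  ret i32 %r
}
```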

Fixes https://github.com/llvm/llvm-project/issues/134513.
2025-04-17 10:42:07 +02:00
Nikita Popov
1e2dc5b087 [SROA] Add load-only promotion tests with dynamic offset load 2025-04-14 12:16:06 +02:00
Nikita Popov
a9474191e0
[SROA] Improve handling of lifetimes in load-only promotion (#135382)
The propagateStoredValuesToLoads() transform currently bails out if
there is a lifetime intrinsic spanning the whole alloca, but the
individual loads/stores operate on some smaller part, because the slice
/ partition size does not match.
    
Fix this by ignoring assume-like slices early, regardless of which range
they cover.
    
I've changed the overall code structure here a bit because I was getting
confused by the different iterators.
2025-04-14 11:52:42 +02:00
Matt Arsenault
7b3b4a5b1b
IR: Use poison in dropDroppableUse (#134576) 2025-04-07 14:59:34 +07:00
Jeremy Morse
792a6f8119
[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298)
These date back to when the non-intrinsic format of variable locations
was still being tested and was behind a compile-time flag, so not all
builds / bots would correctly run them. The solution at the time, to get
at least some test coverage, was to have tests opt-in to non-intrinsic
debug-info if it was built into LLVM.

Nowadays, non-intrinsic format is the default and has been on for more
than a year, so there's no need for this flag to exist.

(I've downgraded the flag from "try" to explicitly requesting
non-intrinsic format in some places, so that we can deal with tests that
are explicitly about non-intrinsic format in their own commit).
2025-03-14 15:50:49 +00:00
Nikita Popov
55b480ec3c
[SROA] Allow load-only promotion with read-only captures (#130735)
It's okay if the address or read-provenance of the pointer is captured.
We only have to make sure that there are no unanalyzable writes to the
pointer.
2025-03-13 09:53:02 +01:00
Pedro Lobo
3c80d9b8dd
[Instruction] Set metadata to poison on deletion (#129449)
Represent extant metadata uses of a deleted instruction with `poison`
instead of `undef`.
2025-03-03 07:17:01 +07:00
Nikita Popov
29441e4f5f
[IR] Convert from nocapture to captures(none) (#123181)
This PR removes the old `nocapture` attribute, replacing it with the new
`captures` attribute introduced in #116990. This change is
intended to be essentially NFC, replacing existing uses of `nocapture`
with `captures(none)` without adding any new analysis capabilities.
Making use of non-`none` values is left for a followup.

Some notes:
* `nocapture` will be upgraded to `captures(none)` by the bitcode
   reader.
* `nocapture` will also be upgraded by the textual IR reader. This is to
   make it easier to use old IR files and somewhat reduce the test churn in
   this PR.
* Helper APIs like `doesNotCapture()` will check for `captures(none)`.
* MLIR import will convert `captures(none)` into an `llvm.nocapture`
   attribute. The representation in the LLVM IR dialect should be updated
   separately.
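
For illustration, the two spellings side by side (a sketch; the function names are made up):

```llvm
; Old spelling, upgraded to captures(none) when read:
declare void @old_style(ptr nocapture)

; New spelling:
declare void @new_style(ptr captures(none))
```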
2025-01-29 16:56:47 +01:00
Alex MacLean
1a56360cc6
[IR] Treat calls with byval ptrs as read-only (#122961) 2025-01-15 10:25:55 -08:00
David Green
2a7ed2c1aa [SROA] Protect against calling the alloca ptr
In case we are calling the alloca ptr directly, check that the Use is a normal
operand to the call. Fortran is a funny language.
2024-12-17 09:21:15 +00:00
David Green
0032c151dc [SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645)
Given an alloca that potentially has many uses in big complex code and
escapes into a call that is readonly+nocapture, we cannot easily split
up the alloca. There are several optimizations that will attempt to take
a value that is stored and a reload, and replace the load with the
original stored value. Instcombine has some simple heuristics, GVN can
sometimes do it, as can CSE in limited situations. They all suffer from
the same issue with complex code - they start from a load/store and need
to prove no-alias for all code between, which in complex cases might be
a lot to look through. Especially if the ptr is an alloca with many uses
that is over the normal escape capture limits.

The pass that does do well with allocas is SROA, as it has a complete
view of all of the uses. This patch adds a case to SROA where it can
detect allocas that are passed into calls that are no-capture readonly.
It can then optimize the reloaded values inside the alloca slice with
the stored value knowing that it is valid no matter the location of the
loads/stores from the no-escaping nature of the alloca.
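
A minimal sketch of the pattern in question, assuming a hypothetical callee `@print`; the attribute is spelled `nocapture` as it was when this commit landed:

```llvm
declare void @print(ptr nocapture readonly)

define i32 @example() {
  %a = alloca i32
  store i32 7, ptr %a
  ; The alloca escapes, but the callee can neither write to it nor capture it.
  call void @print(ptr %a)
  ; The reload can therefore be replaced by the stored value 7.
  %v = load i32, ptr %a
  ret i32 %v
}
```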
2024-12-14 18:07:21 +00:00
Kirill Stoimenov
e3676aa21f Revert "[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645)"
Causing buffer overflow:

SUMMARY: AddressSanitizer: heap-buffer-overflow llvm/lib/Transforms/Scalar/SROA.cpp:5552:35

This reverts commit 5e247d726d7a54cf0acc997bc17b50e7494e6fa3.
2024-12-12 21:32:35 +00:00
David Green
5e247d726d
[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645)
Given an alloca that potentially has many uses in big complex code and
escapes into a call that is readonly+nocapture, we cannot easily split
up the alloca. There are several optimizations that will attempt to take
a value that is stored and a reload, and replace the load with the
original stored value. Instcombine has some simple heuristics, GVN can
sometimes do it, as can CSE in limited situations. They all suffer from
the same issue with complex code - they start from a load/store and need
to prove no-alias for all code between, which in complex cases might be
a lot to look through. Especially if the ptr is an alloca with many uses
that is over the normal escape capture limits.

The pass that does do well with allocas is SROA, as it has a complete
view of all of the uses. This patch adds a case to SROA where it can
detect allocas that are passed into calls that are no-capture readonly.
It can then optimize the reloaded values inside the alloca slice with
the stored value knowing that it is valid no matter the location of the
loads/stores from the no-escaping nature of the alloca.
2024-12-12 10:27:27 +00:00
David Green
6106422ddb [SROA] Escaping readonly nocapture tests. NFC 2024-12-10 18:07:54 +00:00
Paul Walker
38fffa630e
[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548) 2024-11-06 11:53:33 +00:00
Jay Foad
922992a22f
Fix typo "instrinsic" (#112899) 2024-10-18 15:58:33 +01:00
Stephen Tozer
3d08ade7bd
[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149)
This patch is part of a set of patches that add an `-fextend-lifetimes`
flag to clang, which extends the lifetimes of local variables and
parameters for improved debuggability. In addition to that flag, the
patch series adds a pragma to selectively disable `-fextend-lifetimes`,
and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes`
for `this` pointers only. All changes and tests in these patches were
written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer)
has handled review and merging. The extend lifetimes flag is intended to
eventually be enabled by `-Og`, as discussed in the RFC
here:

https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850

This patch implements a new intrinsic instruction in LLVM,
`llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand
and has no effect other than "using" its operand, to ensure that its
operand remains live until after the fake use. This patch does not emit
fake uses anywhere; the next patch in this sequence causes them to be
emitted from the clang frontend, such that for each variable (or `this`) a
fake.use operand is inserted at the end of that variable's scope, using
that variable's value. This patch covers everything post-frontend, which
is largely just the basic plumbing for a new intrinsic/instruction,
along with a few steps to preserve the fake uses through optimizations
(such as moving them ahead of a tail call or translating them through
SROA).
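
A small sketch of what a fake use looks like in IR (hand-written for illustration, not produced by clang; the function name is hypothetical):

```llvm
declare void @llvm.fake.use(...)

define i32 @example(i32 %x) {
  %y = add i32 %x, 1
  ; %x is otherwise dead here; the fake use keeps it live up to this point.
  notail call void (...) @llvm.fake.use(i32 %x)
  ret i32 %y
}
```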

Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>
2024-08-29 17:53:32 +01:00
Shubham Sandeep Rastogi
359c704004
Handle #dbg_values in SROA. (#94070)
This patch properly handles #dbg_values in SROA by making sure that any
#dbg_values get moved to before a store just like #dbg_declares do, or
the #dbg_value is correctly updated with the right alloca after an
aggregate alloca is broken up.

The issue stems from Swift, where #dbg_values are emitted rather than
dbg.declares. The SROA pass doesn't handle the #dbg_values correctly,
which causes them all to have undefs.

If we look at this simple-ish testcase (This is all I could reduce it
down to, and I am still relatively bad at writing llvm IR by hand so I
apologize in advance):

```
%T4main1TV13TangentVectorV = type <{ %T4main1UV13TangentVectorV, [7 x i8], %T4main1UV13TangentVectorV }>
%T4main1UV13TangentVectorV = type <{ %T1M1SVySfG, [7 x i8], %T4main1VV13TangentVectorV }>
%T1M1SVySfG = type <{ ptr, %Ts4Int8V }>
%Ts4Int8V = type <{ i8 }>
%T4main1VV13TangentVectorV = type <{ %T1M1SVySfG }>
define hidden swiftcc void @"$s4main1TV13TangentVectorV1poiyA2E_AEtFZ"(ptr noalias nocapture sret(%T4main1TV13TangentVectorV) %0, ptr noalias nocapture dereferenceable(57) %1, ptr noalias nocapture dereferenceable(57) %2) #0 !dbg !44 {
entry:
  %3 = alloca %T4main1VV13TangentVectorV
  %4 = alloca %T4main1UV13TangentVectorV
  %5 = alloca %T4main1VV13TangentVectorV
  %6 = alloca %T4main1UV13TangentVectorV
  %7 = alloca %T4main1VV13TangentVectorV
  %8 = alloca %T4main1UV13TangentVectorV
  %9 = alloca %T4main1VV13TangentVectorV
  %10 = alloca %T4main1UV13TangentVectorV
  call void @llvm.lifetime.start.p0(i64 9, ptr %3)
  call void @llvm.lifetime.start.p0(i64 25, ptr %4)
  call void @llvm.lifetime.start.p0(i64 9, ptr %5)
  call void @llvm.lifetime.start.p0(i64 25, ptr %6)
  call void @llvm.lifetime.start.p0(i64 9, ptr %7)
  call void @llvm.lifetime.start.p0(i64 25, ptr %8)
  call void @llvm.lifetime.start.p0(i64 9, ptr %9)
  call void @llvm.lifetime.start.p0(i64 25, ptr %10)
  %.u1 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 0
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %4, ptr align 8 %.u1, i64 25, i1 false)
  %.u11 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 0
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %6, ptr align 8 %.u11, i64 25, i1 false)
  call void @llvm.dbg.value(metadata ptr %4, metadata !62, metadata !DIExpression(DW_OP_deref)), !dbg !75
  %.s = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 0
  %.s.c = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 0
  %11 = load ptr, ptr %.s.c
  %.s.b = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 1
  %.s.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s.b, i32 0, i32 0
  %12 = load i8, ptr %.s.b._value
  %.s2 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 0
  %.s2.c = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 0
  %13 = load ptr, ptr %.s2.c
  %.s2.b = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 1
  %.s2.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s2.b, i32 0, i32 0
  %14 = load i8, ptr %.s2.b._value
  %.v = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %3, ptr align 8 %.v, i64 9, i1 false)
  %.v3 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %5, ptr align 8 %.v3, i64 9, i1 false)
  %.s4 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %3, i32 0, i32 0
  %.s4.c = getelementptr inbounds %T1M1SVySfG, ptr %.s4, i32 0, i32 0
  %18 = load ptr, ptr %.s4.c
  %.s5 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %5, i32 0, i32 0
  %.s5.c = getelementptr inbounds %T1M1SVySfG, ptr %.s5, i32 0, i32 0
  %20 = load ptr, ptr %.s5.c
  %.u2 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %8, ptr align 8 %.u2, i64 25, i1 false)
  %.u26 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %10, ptr align 8 %.u26, i64 25, i1 false)
  %.s7 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 0
  %.s7.c = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 0
  %25 = load ptr, ptr %.s7.c
  %.s7.b = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 1
  %.s7.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s7.b, i32 0, i32 0
  %26 = load i8, ptr %.s7.b._value
  %.s8 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 0
  %.s8.c = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 0
  %27 = load ptr, ptr %.s8.c
  %.s8.b = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 1
  %.s8.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s8.b, i32 0, i32 0
  %28 = load i8, ptr %.s8.b._value
  %.v9 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %7, ptr align 8 %.v9, i64 9, i1 false)
  %.v10 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 2
  call void @llvm.memcpy.p0.p0.i64(ptr align 8 %9, ptr align 8 %.v10, i64 9, i1 false)
  %.s11 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %7, i32 0, i32 0
  %.s11.c = getelementptr inbounds %T1M1SVySfG, ptr %.s11, i32 0, i32 0
  %32 = load ptr, ptr %.s11.c
  %.s12 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %9, i32 0, i32 0
  %.s12.c = getelementptr inbounds %T1M1SVySfG, ptr %.s12, i32 0, i32 0
  %34 = load ptr, ptr %.s12.c
  call void @llvm.lifetime.end.p0(i64 25, ptr %10)
  call void @llvm.lifetime.end.p0(i64 9, ptr %9)
  call void @llvm.lifetime.end.p0(i64 25, ptr %8)
  call void @llvm.lifetime.end.p0(i64 9, ptr %7)
  call void @llvm.lifetime.end.p0(i64 25, ptr %6)
  call void @llvm.lifetime.end.p0(i64 9, ptr %5)
  call void @llvm.lifetime.end.p0(i64 25, ptr %4)
  call void @llvm.lifetime.end.p0(i64 9, ptr %3)
  ret void
}
!llvm.module.flags = !{!0, !1, !2, !3, !4, !6, !7, !8, !9, !10, !11, !12, !13, !14, !15}
!swift.module.flags = !{!33}
!llvm.linker.options = !{!34, !35, !36, !37, !38, !39, !40, !41, !42, !43}
!0 = !{i32 2, !"SDK Version", [2 x i32] [i32 14, i32 4]}
!1 = !{i32 1, !"Objective-C Version", i32 2}
!2 = !{i32 1, !"Objective-C Image Info Version", i32 0}
!3 = !{i32 1, !"Objective-C Image Info Section", !"__DATA, no_dead_strip"}
!4 = !{i32 1, !"Objective-C Garbage Collection", i8 0}
!6 = !{i32 7, !"Dwarf Version", i32 4}
!7 = !{i32 2, !"Debug Info Version", i32 3}
!8 = !{i32 1, !"wchar_size", i32 4}
!9 = !{i32 8, !"PIC Level", i32 2}
!10 = !{i32 7, !"uwtable", i32 1}
!11 = !{i32 7, !"frame-pointer", i32 1}
!12 = !{i32 1, !"Swift Version", i32 7}
!13 = !{i32 1, !"Swift ABI Version", i32 7}
!14 = !{i32 1, !"Swift Major Version", i8 6}
!15 = !{i32 1, !"Swift Minor Version", i8 0}
!16 = distinct !DICompileUnit(language: DW_LANG_Swift, file: !17, imports: !18, sdk: "MacOSX14.4.sdk")
!17 = !DIFile(filename: "/Users/emilpedersen/swift2/swift/test/IRGen/debug_scope_distinct.swift", directory: "/Users/emilpedersen/swift2")
!18 = !{!19, !21, !23, !25, !27, !29, !31}
!19 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !20, file: !17)
!20 = !DIModule(scope: null, name: "main", includePath: "/Users/emilpedersen/swift2/swift/test/IRGen")
!21 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !22, file: !17)
!22 = !DIModule(scope: null, name: "Swift", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/Swift.swiftmodule/arm64-apple-macos.swiftmodule")
!23 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !24, line: 60)
!24 = !DIModule(scope: null, name: "_Differentiation", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Differentiation.swiftmodule/arm64-apple-macos.swiftmodule")
!25 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !26, line: 61)
!26 = !DIModule(scope: null, name: "M", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/test-macosx-arm64/IRGen/Output/debug_scope_distinct.swift.tmp/M.swiftmodule")
!27 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !28, file: !17)
!28 = !DIModule(scope: null, name: "_StringProcessing", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_StringProcessing.swiftmodule/arm64-apple-macos.swiftmodule")
!29 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !30, file: !17)
!30 = !DIModule(scope: null, name: "_SwiftConcurrencyShims", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/shims")
!31 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !32, file: !17)
!32 = !DIModule(scope: null, name: "_Concurrency", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Concurrency.swiftmodule/arm64-apple-macos.swiftmodule")
!33 = !{i1 false}
!34 = !{!"-lswiftCore"}
!35 = !{!"-lswift_StringProcessing"}
!36 = !{!"-lswift_Differentiation"}
!37 = !{!"-lswiftDarwin"}
!38 = !{!"-lswift_Concurrency"}
!39 = !{!"-lswiftSwiftOnoneSupport"}
!40 = !{!"-lobjc"}
!41 = !{!"-lswiftCompatibilityConcurrency"}
!42 = !{!"-lswiftCompatibility56"}
!43 = !{!"-lswiftCompatibilityPacks"}
!44 = distinct !DISubprogram( unit: !16, declaration: !52, retainedNodes: !53)
!45 = !DIFile(filename: "<compiler-generated>", directory: "/")
!46 = !DICompositeType(tag: DW_TAG_structure_type, scope: !47, elements: !48, identifier: "$s4main1TV13TangentVectorVD")
!47 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TVD")
!48 = !{}
!49 = !DISubroutineType(types: !50)
!50 = !{!51}
!51 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TV13TangentVectorVXMtD")
!52 = !DISubprogram( file: !45, type: !49, spFlags: DISPFlagOptimized)
!53 = !{!54, !56, !57}
!54 = !DILocalVariable( scope: !44, type: !55, flags: DIFlagArtificial)
!55 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !46)
!56 = !DILocalVariable( scope: !44, flags: DIFlagArtificial)
!57 = !DILocalVariable( scope: !44, type: !58, flags: DIFlagArtificial)
!58 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !51)
!62 = !DILocalVariable( scope: !63, type: !72, flags: DIFlagArtificial)
!63 = distinct !DISubprogram( type: !66, unit: !16, declaration: !69, retainedNodes: !70)
!64 = !DICompositeType(tag: DW_TAG_structure_type, scope: !65, identifier: "$s4main1UV13TangentVectorVD")
!65 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UVD")
!66 = !DISubroutineType(types: !67)
!67 = !{!68}
!68 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UV13TangentVectorVXMtD")
!69 = !DISubprogram( spFlags: DISPFlagOptimized)
!70 = !{!71, !73}
!71 = !DILocalVariable( scope: !63, flags: DIFlagArtificial)
!72 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !64)
!73 = !DILocalVariable( scope: !63, type: !74, flags: DIFlagArtificial)
!74 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !68)
!75 = !DILocation( scope: !63, inlinedAt: !76)
!76 = distinct !DILocation( scope: !44)

```

If we run
`opt -S -passes=sroa file.ll -o -`

With this patch we will see:
```
%.sroa.5.sroa.021 = alloca [7 x i8], align 8
tail call void @llvm.dbg.value(metadata ptr %.sroa.5.sroa.021, metadata !59, metadata !DIExpression(DW_OP_deref, DW_OP_LLVM_fragment, 72, 56)), !dbg !72
%.sroa.5.sroa.014 = alloca [7 x i8], align 8
```

Without this patch we will see:

```
%.sroa.5.sroa.021 = alloca [7 x i8], align 8
%.sroa.5.sroa.014 = alloca [7 x i8], align 8
```

Thus this patch ensures that llvm.dbg.values that use allocas that are broken up still have the correct metadata, so debug information is preserved.

This is part of a stack of patches and is preceded by: https://github.com/llvm/llvm-project/pull/94068
2024-08-21 17:52:37 -07:00
Vitaly Buka
6dba99e14f
[InstCombine][asan] Don't speculate loads before select ptr (#100773)
Even if memory is valid from `llvm` point of view,
e.g. local alloca, sanitizers have an API for
user-specific memory annotations.

These annotations can be used to track size of the
local object, e.g. inline vectors may prevent
accesses beyond the current vector size.

So valid programs should not access those parts of
alloca before checking preconditions.

Fixes #100639.
2024-07-29 11:28:03 -07:00
Vitaly Buka
2f3ae2f625
[NFC][InstCombine][SROA][Asan] Precommit tests affected by #100773 (#100844)
Some optimizations need to be undone with
sanitizers by #100773.

For #100639.
2024-07-29 10:32:51 -07:00
James Y Knight
b7e4fba6e5
Cleanup x86_mmx after removing IR type (#100646)
After #98505, the textual IR keyword `x86_mmx` was temporarily made to
parse as `<1 x i64>`, so as not to require a lot of test update noise.

This completes the removal of the type, by removing the `x86_mmx` keyword
from the IR parser, and making the (now no-op) test updates via `sed -i
's/\bx86_mmx\b/<1 x i64>/g' $(git grep -l x86_mmx llvm/test/)`.
Resulting bitcasts from <1 x i64> to itself were then manually deleted.

Changes to llvm/test/Bitcode/compatibility-$VERSION.ll were reverted, as
they're intended to be equivalent to the .bc file, if parsed by old
LLVM, so shouldn't be updated.

A few tests were removed, as they're no longer testing anything, in the
following files:
- llvm/test/Transforms/GlobalOpt/x86_mmx_load.ll
- llvm/test/Transforms/InstCombine/cast.ll
- llvm/test/Transforms/InstSimplify/ConstProp/gep-zeroinit-vector.ll

Works towards issue #98272.
2024-07-28 18:12:47 -04:00
Vitaly Buka
cd354e37ab [NFC][SROA] Regenerate a test
The new update_test_checks.py uses a different spacing.
2024-07-26 17:22:48 -07:00
James Y Knight
dfeb3991fb
Remove the x86_mmx IR type. (#98505)
It is now translated to `<1 x i64>`, which allows the removal of a bunch
of special casing.

This _incompatibly_ changes the ABI of any LLVM IR function with
`x86_mmx` arguments or returns: instead of passing in mmx registers,
they will now be passed via integer registers. However, the real-world
incompatibility caused by this is expected to be minimal, because Clang
never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>`
or `double`, depending on ABI.

This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type.
That type simply no longer corresponds to an IR type, and is used only
by MMX intrinsics and inline-asm operands.

Because SelectionDAGBuilder only knows how to generate the
operands/results of intrinsics based on the IR type, it thus now
generates the intrinsics with the type MVT::v1i64, instead of
MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus
have the X86 backend fix them up in DAGCombine. (This may be a
short-lived hack, if all the MMX intrinsics can be removed in upcoming
changes.)

Works towards issue #98272.
2024-07-25 09:19:22 -04:00
Antonio Frighetto
6ce7b1f861 [TBAA] Do not rewrite TBAA if exists, always null out !tbaa.struct
Retrieve `!tbaa` metadata via `!tbaa.struct` in `adjustForAccess`
unless it already exists, as struct-path aware `MDNodes` emitted
via `new-struct-path-tbaa` may be leveraged. As `!tbaa.struct`
carries memcpy padding semantics among struct fields and `!tbaa`
is already meant to aid alias semantics, it should be possible
to zero out `!tbaa.struct` once the memcpy has been simplified.
`SROA/tbaa-struct.ll` test has gone out of scope, as `!tbaa` has
already replaced `!tbaa.struct` in SROA.

Fixes: https://github.com/llvm/llvm-project/issues/95661.
2024-07-25 09:24:56 +02:00
Yashwant Singh
cd1e6a587b
[SROA] Propagate no-signed-zeros(nsz) fast-math flag on the phi node using function attribute (#83381)
It's expected that the sequence `return X > 0.0 ? X : -X`, compiled with
-Ofast, produces fabs intrinsic. However, at this point, LLVM is unable
to do so.

The above sequence goes through the following transformation during the
pass pipeline:
1) SROA pass generates the phi node. Here, it does not infer the
fast-math flags on the phi node, unlike what the clang frontend typically does.
2) Phi node eventually gets translated into select instruction.
Because of missing no-signed-zeros(nsz) fast-math flag on the select
instruction, InstCombine pass fails to fold the sequence into fabs
intrinsic.

This patch, as a part of SROA, tries to propagate nsz fast-math flag on
the phi node using function attribute enabling this folding.
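
The select form that the steps above refer to can be sketched as follows. This is hand-written for illustration; per the message it is the `nsz` flag on the select that matters, though InstCombine may require further fast-math flags that are not shown here:

```llvm
; Corresponds to: return X > 0.0 ? X : -X;
define double @abs_like(double %x) {
  %cmp = fcmp ogt double %x, 0.0
  %neg = fneg double %x
  ; Without nsz on this select, the fold to fabs is blocked.
  %r = select nsz i1 %cmp, double %x, double %neg
  ret double %r
}
```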

Closes #51601

Co-authored-by: Sushant Gokhale <sgokhale@nvidia.com>
2024-07-02 11:59:39 +05:30
Stephen Tozer
094572701d
[RemoveDIs] Print IR with debug records by default (#91724)
This patch makes the final major change of the RemoveDIs project, changing the
default IR output from debug intrinsics to debug records. This is expected to
break a large number of tests: every single one that tests for uses or
declarations of debug intrinsics and does not explicitly disable writing
records. 

If this patch has broken your downstream tests (or upstream tests on a
configuration I wasn't able to run):
1. If you need to immediately unblock a build, pass
`--write-experimental-debuginfo=false` to LLVM's option processing for all
failing tests (remember to use `-mllvm` for clang/flang to forward arguments to
LLVM).
2. For most test failures, the changes are trivial and mechanical, enough that
they can be done by script; see the migration guide for how to do
this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates
3. If any tests fail for reasons other than FileCheck check lines that need
updating, such as assertion failures, that is most likely a real bug with this
patch and should be reported as such.
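
For orientation, the difference in printed output looks roughly like this (a fragment with hypothetical metadata numbers, shown as comments because it is not a complete module):

```llvm
; Debug intrinsic form (previous default output):
;   call void @llvm.dbg.value(metadata i32 %x, metadata !10, metadata !DIExpression()), !dbg !12

; Debug record form (new default output):
;     #dbg_value(i32 %x, !10, !DIExpression(), !12)
```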

For more information, see the recent PSA:
https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578
2024-06-14 15:07:27 +01:00
Nikita Popov
738fcbee68 [SROA] Preserve all GEP flags during speculation
Unlikely to matter in practice, as these GEPs are typically
promoted away.
2024-06-14 11:48:35 +02:00
Florian Hahn
c8e5ad4e12
Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709)"
This reverts commit 7dbba39e583a3fd64e7e6b947251c035e483f054.

Revert as there are reports this triggers during ThinLTO in some
configurations.
2024-04-22 10:50:49 +01:00
Julian Nagele
7dbba39e58
Reapply "[TBAA] Add verifier for tbaa.struct metadata (#86709)"
This reverts commit b9cd48f96acdd07c627ccafbf4386a1f3dcd6c51.

-------------------------------------------------------------
Original commit message:

Adds logic to the IR verifier that checks whether !tbaa.struct nodes are
well-formed. That is, it checks that the operands of !tbaa.struct nodes
are in groups of three, that each group of three operands consists of
two integers and a valid tbaa node, and that the regions described by
the offset and size operands are non-overlapping.

PR: https://github.com/llvm/llvm-project/pull/86709
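
A sketch of a `!tbaa.struct` node that would satisfy the checks described above, assuming a struct of two 4-byte `int` fields (the metadata numbering is arbitrary):

```llvm
; Operands come in groups of three: offset, size, TBAA tag.
; The regions [0, 4) and [4, 8) do not overlap.
!0 = !{i64 0, i64 4, !1, i64 4, i64 4, !1}
!1 = !{!2, !2, i64 0}
!2 = !{!"int", !3, i64 0}
!3 = !{!"omnipotent char", !4, i64 0}
!4 = !{!"Simple C/C++ TBAA"}
```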
2024-04-15 11:25:06 +01:00
Florian Hahn
b9cd48f96a
Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709)"
This reverts commit df75183d70e029352a49c93f275db703c81a65c1.

Revert for now as this appears to cause failures on some buildbots,
e.g.:
https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio
2024-03-27 21:22:15 +00:00
Julian Nagele
df75183d70
[TBAA] Add verifier for tbaa.struct metadata (#86709)
Adds logic to the IR verifier that checks whether !tbaa.struct nodes are
well-formed. That is, it checks that the operands of !tbaa.struct nodes
are in groups of three, that each group of three operands consists of
two integers and a valid tbaa node, and that the regions described by
the offset and size operands are non-overlapping.

PR: https://github.com/llvm/llvm-project/pull/86709
2024-03-27 10:30:27 +01:00
Arthur Eubanks
eae4f56cb4 [SROA] Fix phi gep unfolding with an alloca not in entry block
Fixes a crash reported in #83494.
2024-03-07 07:23:48 +00:00
Jeffrey Byrnes
1e828f838c [SROA]: Only defer trying partial sized ptr or ptr vector types
Change-Id: Ic77f87290905addadd5819dff2d0c62f031022ab
2024-03-05 08:52:07 -08:00
Arthur Eubanks
8848258f7b
[SROA] Unfold gep of index phi (round 2) (#83494)
If a gep has only one phi as one of its operands and the remaining
indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi
((gep ptr, idx1), (gep ptr, idx2))`.

Take care not to unfold recursive phis.

Followup to #80983.

This was initially #83087. The initial PR did not handle allocas in
entry block that weren't at the beginning of the function, causing GEPs
to be inserted after the first chunk of allocas but potentially before
an alloca not at the beginning. Insert GEPs at the end of the entry
block instead since constants/arguments/static allocas can all be used
there.
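
A sketch of the unfolding, with the new GEPs placed at the end of the entry block as described above (function names are hypothetical and actual SROA output may differ):

```llvm
; Before: one GEP whose index is a phi of constants.
define i32 @before(i1 %c) {
entry:
  %a = alloca [4 x i32]
  br i1 %c, label %then, label %else
then:
  br label %join
else:
  br label %join
join:
  %idx = phi i64 [ 1, %then ], [ 2, %else ]
  %p = getelementptr inbounds [4 x i32], ptr %a, i64 0, i64 %idx
  %v = load i32, ptr %p
  ret i32 %v
}

; After: one GEP per incoming index, and a phi over the pointers.
define i32 @after(i1 %c) {
entry:
  %a = alloca [4 x i32]
  %p.1 = getelementptr inbounds [4 x i32], ptr %a, i64 0, i64 1
  %p.2 = getelementptr inbounds [4 x i32], ptr %a, i64 0, i64 2
  br i1 %c, label %then, label %else
then:
  br label %join
else:
  br label %join
join:
  %p = phi ptr [ %p.1, %then ], [ %p.2, %else ]
  %v = load i32, ptr %p
  ret i32 %v
}
```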
2024-03-04 14:21:26 -08:00
Arthur Eubanks
de8e2b7b86 [test][SROA] Regenerate vector-promotion.ll 2024-02-29 18:53:25 +00:00
Fangrui Song
43b7dfcc1d Revert "[SROA] Unfold gep of index phi (#83087)"
This reverts commit 2eb63982e88b9ed8336158d35884b1a1d04a0f78.

This caused a verifier error
```
Instruction does not dominate all uses!
```
for some projects using Halide.
The verifier error happens inside `Halide::Internal::CodeGen_LLVM::optimize_module`
and looks like a genuine SROA issue.
2024-02-28 15:56:43 -08:00
Arthur Eubanks
2eb63982e8
[SROA] Unfold gep of index phi (#83087)
If a gep has only one phi as one of its operands and the remaining
indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi
((gep ptr, idx1), (gep ptr, idx2))`.

Take care not to unfold recursive phis.

Followup to #80983.
2024-02-28 10:53:47 -08:00
Stephen Tozer
d128448efd Revert "Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)""
Reverted due to some test failures on some buildbots.

https://lab.llvm.org/buildbot/#/builders/67/builds/14669

This reverts commit aa436493ab7ad4cf323b0189c15c59ac9dc293c7.
2024-02-27 10:17:24 +00:00
Stephen Tozer
aa436493ab Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)"
Fixes the prior issue in which the symbol for a cl-arg was unavailable to
some binaries.

This reverts commit dc06d75ab27b4dcae2940fc386fadd06f70faffe.
2024-02-27 09:59:08 +00:00
Stephen Tozer
dc06d75ab2 Revert "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)"
Reverted due to failures on buildbots, where a new cl flag was placed
in the wrong file, resulting in link errors.

https://lab.llvm.org/buildbot/#/builders/198/builds/8548

This reverts commit 0b398256b3f72204ad1f7c625efe4990204e898a.
2024-02-26 18:49:18 +00:00
Stephen Tozer
0b398256b3
[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281)
This patch adds support for printing the proposed non-instruction debug
info ("RemoveDIs") out to textual IR. This patch does not add any
bitcode support, parsing support, or documentation.

Printing of the new format is controlled by a flag added in this patch,
`--write-experimental-debuginfo`, which defaults to false. The new
format will be printed *iff* this flag is true, so whether we use the IR
format is completely independent of whether we use non-instruction debug
info during LLVM passes (which is controlled by the
`--try-experimental-debuginfo-iterators` flag).

Even with the flag disabled, some existing tests need to be updated, as this
patch causes debug intrinsic declarations to be changed in a round trip,
such that they always appear at the end of a module and have no attributes
(this has no functional change on the module).

The design of this new IR format was proposed previously on
Discourse, and any further discussion about the design can still be
contributed there:

https://discourse.llvm.org/t/rfc-debuginfo-proposed-changes-to-the-textual-ir-representation-for-debug-values/73491
2024-02-26 18:22:05 +00:00
Florian Hahn
dc85719d5b
[TBAA] Use !tbaa for first accessed field if it is an exact match in offset and size. (#81313)
Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iterating over std::vector<std::complex<float>>:
https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81289.

PR: https://github.com/llvm/llvm-project/pull/81313
2024-02-16 19:23:14 +00:00
Florian Hahn
53c0e809fa
[SROA] Use !tbaa instead of !tbaa.struct if op matches field. (#81289)
If a split memory access introduced by SROA accesses precisely a single
field of the original operation's !tbaa.struct, use the !tbaa tag for
the accessed field directly instead of the full !tbaa.struct.
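
The selection rule can be illustrated with a small sketch, assuming a `!tbaa.struct` is modelled as a list of `(offset, size, tag)` tuples. `metadata_for_slice` is a hypothetical helper, and the fallback simply returns the original field list, which simplifies away any adjustment LLVM makes to the struct metadata for the new access.

```python
from typing import List, Tuple

Field = Tuple[int, int, str]  # (offset, size, tbaa tag name)


def metadata_for_slice(fields: List[Field], offset: int, size: int):
    """Pick metadata for a split access covering [offset, offset + size)."""
    for f_offset, f_size, tag in fields:
        # Exact match on both offset and size: use the field's own tag.
        if f_offset == offset and f_size == size:
            return ("tbaa", tag)
    # Otherwise keep describing the access with the struct metadata.
    return ("tbaa.struct", fields)
```

For a struct with two 4-byte `float` fields, a split access at offset 4 of size 4 gets `("tbaa", "float")`, while an 8-byte access at offset 0 keeps the `!tbaa.struct` form.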

InstCombine already had similar logic.

Motivation for this and follow-on patches is to improve codegen for
libc++, where using memcpy limits optimizations, like vectorization for
code iterating over std::vector<std::complex<float>>:
https://godbolt.org/z/f3vqYos3c

Depends on https://github.com/llvm/llvm-project/pull/81285.
2024-02-16 13:45:01 +00:00
Florian Hahn
2a9b86cc10
[SROA] Extend !tbaa.struct test coverage with multiple missing cases.
Add tests to cover missing cases for
https://github.com/llvm/llvm-project/pull/81289 and
https://github.com/llvm/llvm-project/pull/81313.
2024-02-15 21:21:18 +00:00