llvm-project

Author	SHA1	Message	Date
Nuno Lopes	d0029b87d8	remove UB from test [NFC]	2025-08-19 11:18:27 +01:00
Nikita Popov	c23b4fbdbb	[IR] Remove size argument from lifetime intrinsics (#150248 ) Now that #149310 has restricted lifetime intrinsics to only work on allocas, we can also drop the explicit size argument. Instead, the size is implied by the alloca. This removes the ability to only mark a prefix of an alloca alive/dead. We never used that capability, so we should remove the need to handle that possibility everywhere (though many key places, including stack coloring, did not actually respect this).	2025-08-08 11:09:34 +02:00
Tommy MᶜMichen	155359c1f2	[llvm][sroa] Disable support for `invariant.group` (#151743 ) Resolves #151574. > SROA pass does not perform aggregate load/store rewriting on a pointer whose source is a `launder.invariant.group`. > > This causes failed assertion in `AllocaSlices`. > > ``` > void (anonymous namespace)::AllocaSlices::SliceBuilder::visitStoreInst(StoreInst &): > Assertion `(!SI.isSimple() \|\| ValOp->getType()->isSingleValueType()) && > "All simple FCA stores should have been pre-split"' failed. > ``` Disables support for `{launder,strip}.invariant.group` intrinsics in SROA. Updates SROA test for `invariant.group` support.	2025-08-05 09:59:07 +02:00
Nikita Popov	2c6eec219d	[Tests] Avoid lifetime intrinsics on non-allocas (NFC) Don't rely on auto-upgrade, instead either remove unnecessary casts or remove no longer applicable tests.	2025-07-23 15:05:43 +02:00
Alex MacLean	0008af882d	[SROA] Allow as zext<i1> index when unfolding GEP select (#146929 ) A zero-extension from an i1 is equivalent to a select with constant 0 and 1 values. Add this case when rewriting gep(select) -> select(gep) to expose more opportunities for SROA.	2025-07-04 08:16:19 -07:00
Paul Walker	ea9046699e	[LLVM][SROA] Teach SROA how to "bitcast" between fixed and scalable vectors. (#130973 ) For function whose vscale_range is limited to a single value we can size scalable vectors. This aids SROA by allowing scalable vector load and store operations to be considered for replacement whereby bitcasts through memory can be replaced by vector insert or extract operations.	2025-06-11 11:02:32 +01:00
Nikita Popov	5c97397c2c	[SROA] Support load-only promotion with dynamic offset loads (#135609 ) If we do load-only promotion, it is okay if we leave some loads alone. We only need to know all stores that affect a specific location. As such, we can handle loads with unknown offset via the "escaped read-only" code path. This is something we already support in LICM load-only promotion, but doing this in SROA is much better from a phase ordering perspective. Fixes https://github.com/llvm/llvm-project/issues/134513.	2025-04-17 10:42:07 +02:00
Nikita Popov	1e2dc5b087	[SROA] Add load-only promotion tests with dynamic offset load	2025-04-14 12:16:06 +02:00
Nikita Popov	a9474191e0	[SROA] Improve handling of lifetimes in load-only promotion (#135382 ) The propagateStoredValuesToLoads() transform currently bails out if there is a lifetime intrinsic spanning the whole alloca, but the individual loads/stores operate on some smaller part, because the slice / partition size does not match. Fix this by ignoring assume-like slices early, regardless of which range they cover. I've changed the overall code structure here a bit because I was getting confused by the different iterators.	2025-04-14 11:52:42 +02:00
Matt Arsenault	7b3b4a5b1b	IR: Use poison in dropDroppableUse (#134576 )	2025-04-07 14:59:34 +07:00
Jeremy Morse	792a6f8119	[RemoveDIs] Remove "try-debuginfo-iterators..." test flags (#130298 ) These date back to when the non-intrinsic format of variable locations was still being tested and was behind a compile-time flag, so not all builds / bots would correctly run them. The solution at the time, to get at least some test coverage, was to have tests opt-in to non-intrinsic debug-info if it was built into LLVM. Nowadays, non-intrinsic format is the default and has been on for more than a year, there's no need for this flag to exist. (I've downgraded the flag from "try" to explicitly requesting non-intrinsic format in some places, so that we can deal with tests that are explicitly about non-intrinsic format in their own commit).	2025-03-14 15:50:49 +00:00
Nikita Popov	55b480ec3c	[SROA] Allow load-only promotion with read-only captures (#130735 ) It's okay if the address or read-provenance of the pointer is captured. We only have to make sure that there are no unanalyzable writes to the pointer.	2025-03-13 09:53:02 +01:00
Pedro Lobo	3c80d9b8dd	[Instruction] Set metadata to `poison` on deletion (#129449 ) Represent extant metadata uses of a deleted instruction with `poison` instead of `undef`.	2025-03-03 07:17:01 +07:00
Nikita Popov	29441e4f5f	[IR] Convert from nocapture to captures(none) (#123181 ) This PR removes the old `nocapture` attribute, replacing it with the new `captures` attribute introduced in #116990. This change is intended to be essentially NFC, replacing existing uses of `nocapture` with `captures(none)` without adding any new analysis capabilities. Making use of non-`none` values is left for a followup. Some notes: * `nocapture` will be upgraded to `captures(none)` by the bitcode reader. * `nocapture` will also be upgraded by the textual IR reader. This is to make it easier to use old IR files and somewhat reduce the test churn in this PR. * Helper APIs like `doesNotCapture()` will check for `captures(none)`. * MLIR import will convert `captures(none)` into an `llvm.nocapture` attribute. The representation in the LLVM IR dialect should be updated separately.	2025-01-29 16:56:47 +01:00
Alex MacLean	1a56360cc6	[IR] Treat calls with byval ptrs as read-only (#122961 )	2025-01-15 10:25:55 -08:00
David Green	2a7ed2c1aa	[SROA] Protect against calling the alloca ptr In case we are calling the alloca ptr directly, check that the Use is a normal operand to the call. Fortran is a funny language.	2024-12-17 09:21:15 +00:00
David Green	0032c151dc	[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 ) Given an alloca that potentially has many uses in big complex code and escapes into a call that is readonly+nocapture, we cannot easily split up the alloca. There are several optimizations that will attempt to take a value that is stored and a reload, and replace the load with the original stored value. Instcombine has some simple heuristics, GVN can sometimes do it, as can CSE in limited situations. They all suffer from the same issue with complex code - they start from a load/store and need to prove no-alias for all code between, which in complex cases might be a lot to look through. Especially if the ptr is an alloca with many uses that is over the normal escape capture limits. The pass that does do well with allocas is SROA, as it has a complete view of all of the uses. This patch adds a case to SROA where it can detect allocas that are passed into calls that are no-capture readonly. It can then optimize the reloaded values inside the alloca slice with the stored value knowing that it is valid no matter the location of the loads/stores from the no-escaping nature of the alloca.	2024-12-14 18:07:21 +00:00
Kirill Stoimenov	e3676aa21f	Revert "[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 )" Causing buffer overflow: SUMMARY: AddressSanitizer: heap-buffer-overflow llvm/lib/Transforms/Scalar/SROA.cpp:5552:35 This reverts commit 5e247d726d7a54cf0acc997bc17b50e7494e6fa3.	2024-12-12 21:32:35 +00:00
David Green	5e247d726d	[SROA] Optimize reloaded values in allocas that escape into readonly nocapture calls. (#116645 ) Given an alloca that potentially has many uses in big complex code and escapes into a call that is readonly+nocapture, we cannot easily split up the alloca. There are several optimizations that will attempt to take a value that is stored and a reload, and replace the load with the original stored value. Instcombine has some simple heuristics, GVN can sometimes do it, as can CSE in limited situations. They all suffer from the same issue with complex code - they start from a load/store and need to prove no-alias for all code between, which in complex cases might be a lot to look through. Especially if the ptr is an alloca with many uses that is over the normal escape capture limits. The pass that does do well with allocas is SROA, as it has a complete view of all of the uses. This patch adds a case to SROA where it can detect allocas that are passed into calls that are no-capture readonly. It can then optimize the reloaded values inside the alloca slice with the stored value knowing that it is valid no matter the location of the loads/stores from the no-escaping nature of the alloca.	2024-12-12 10:27:27 +00:00
David Green	6106422ddb	[SROA] Escaping readonly nocapture tests. NFC	2024-12-10 18:07:54 +00:00
Paul Walker	38fffa630e	[LLVM][IR] Use splat syntax when printing Constant[Data]Vector. (#112548 )	2024-11-06 11:53:33 +00:00
Jay Foad	922992a22f	Fix typo "instrinsic" (#112899 )	2024-10-18 15:58:33 +01:00
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149 ) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Shubham Sandeep Rastogi	359c704004	Handle #dbg_values in SROA. (#94070 ) This patch properly handles #dbg_values in SROA by making sure that any #dbg_values get moved to before a store just like #dbg_declares do, or the #dbg_value is correctly updated with the right alloca after an aggregate alloca is broken up. The issue stems from swift where #dbg_values are emitted and not dbg.declares, the SROA pass doesn't handle the #dbg_values correctly and it causes them to all have undefs If we look at this simple-ish testcase (This is all I could reduce it down to, and I am still relatively bad at writing llvm IR by hand so I apologize in advance): ``` %T4main1TV13TangentVectorV = type <{ %T4main1UV13TangentVectorV, [7 x i8], %T4main1UV13TangentVectorV }> %T4main1UV13TangentVectorV = type <{ %T1M1SVySfG, [7 x i8], %T4main1VV13TangentVectorV }> %T1M1SVySfG = type <{ ptr, %Ts4Int8V }> %Ts4Int8V = type <{ i8 }> %T4main1VV13TangentVectorV = type <{ %T1M1SVySfG }> define hidden swiftcc void @"$s4main1TV13TangentVectorV1poiyA2E_AEtFZ"(ptr noalias nocapture sret(%T4main1TV13TangentVectorV) %0, ptr noalias nocapture dereferenceable(57) %1, ptr noalias nocapture dereferenceable(57) %2) #0 !dbg !44 { entry: %3 = alloca %T4main1VV13TangentVectorV %4 = alloca %T4main1UV13TangentVectorV %5 = alloca %T4main1VV13TangentVectorV %6 = alloca %T4main1UV13TangentVectorV %7 = alloca %T4main1VV13TangentVectorV %8 = alloca %T4main1UV13TangentVectorV %9 = alloca %T4main1VV13TangentVectorV %10 = alloca %T4main1UV13TangentVectorV call void @llvm.lifetime.start.p0(i64 9, ptr %3) call void @llvm.lifetime.start.p0(i64 25, ptr %4) call void @llvm.lifetime.start.p0(i64 9, ptr %5) call void @llvm.lifetime.start.p0(i64 25, ptr %6) call void @llvm.lifetime.start.p0(i64 9, ptr %7) call void @llvm.lifetime.start.p0(i64 25, ptr %8) call void @llvm.lifetime.start.p0(i64 9, ptr %9) call void @llvm.lifetime.start.p0(i64 25, ptr %10) %.u1 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 0 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %4, ptr align 8 %.u1, i64 25, i1 false) %.u11 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 0 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %6, ptr align 8 %.u11, i64 25, i1 false) call void @llvm.dbg.value(metadata ptr %4, metadata !62, metadata !DIExpression(DW_OP_deref)), !dbg !75 %.s = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 0 %.s.c = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 0 %11 = load ptr, ptr %.s.c %.s.b = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 1 %.s.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s.b, i32 0, i32 0 %12 = load i8, ptr %.s.b._value %.s2 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 0 %.s2.c = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 0 %13 = load ptr, ptr %.s2.c %.s2.b = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 1 %.s2.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s2.b, i32 0, i32 0 %14 = load i8, ptr %.s2.b._value %.v = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %3, ptr align 8 %.v, i64 9, i1 false) %.v3 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %5, ptr align 8 %.v3, i64 9, i1 false) %.s4 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %3, i32 0, i32 0 %.s4.c = getelementptr inbounds %T1M1SVySfG, ptr %.s4, i32 0, i32 0 %18 = load ptr, ptr %.s4.c %.s5 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %5, i32 0, i32 0 %.s5.c = getelementptr inbounds %T1M1SVySfG, ptr %.s5, i32 0, i32 0 %20 = load ptr, ptr %.s5.c %.u2 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %8, ptr align 8 %.u2, i64 25, i1 false) %.u26 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %10, ptr align 8 %.u26, i64 25, i1 false) %.s7 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 0 %.s7.c = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 0 %25 = load ptr, ptr %.s7.c %.s7.b = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 1 %.s7.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s7.b, i32 0, i32 0 %26 = load i8, ptr %.s7.b._value %.s8 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 0 %.s8.c = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 0 %27 = load ptr, ptr %.s8.c %.s8.b = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 1 %.s8.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s8.b, i32 0, i32 0 %28 = load i8, ptr %.s8.b._value %.v9 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %7, ptr align 8 %.v9, i64 9, i1 false) %.v10 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %9, ptr align 8 %.v10, i64 9, i1 false) %.s11 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %7, i32 0, i32 0 %.s11.c = getelementptr inbounds %T1M1SVySfG, ptr %.s11, i32 0, i32 0 %32 = load ptr, ptr %.s11.c %.s12 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %9, i32 0, i32 0 %.s12.c = getelementptr inbounds %T1M1SVySfG, ptr %.s12, i32 0, i32 0 %34 = load ptr, ptr %.s12.c call void @llvm.lifetime.end.p0(i64 25, ptr %10) call void @llvm.lifetime.end.p0(i64 9, ptr %9) call void @llvm.lifetime.end.p0(i64 25, ptr %8) call void @llvm.lifetime.end.p0(i64 9, ptr %7) call void @llvm.lifetime.end.p0(i64 25, ptr %6) call void @llvm.lifetime.end.p0(i64 9, ptr %5) call void @llvm.lifetime.end.p0(i64 25, ptr %4) call void @llvm.lifetime.end.p0(i64 9, ptr %3) ret void } !llvm.module.flags = !{!0, !1, !2, !3, !4, !6, !7, !8, !9, !10, !11, !12, !13, !14, !15} !swift.module.flags = !{!33} !llvm.linker.options = !{!34, !35, !36, !37, !38, !39, !40, !41, !42, !43} !0 = !{i32 2, !"SDK Version", [2 x i32] [i32 14, i32 4]} !1 = !{i32 1, !"Objective-C Version", i32 2} !2 = !{i32 1, !"Objective-C Image Info Version", i32 0} !3 = !{i32 1, !"Objective-C Image Info Section", !"__DATA, no_dead_strip"} !4 = !{i32 1, !"Objective-C Garbage Collection", i8 0} !6 = !{i32 7, !"Dwarf Version", i32 4} !7 = !{i32 2, !"Debug Info Version", i32 3} !8 = !{i32 1, !"wchar_size", i32 4} !9 = !{i32 8, !"PIC Level", i32 2} !10 = !{i32 7, !"uwtable", i32 1} !11 = !{i32 7, !"frame-pointer", i32 1} !12 = !{i32 1, !"Swift Version", i32 7} !13 = !{i32 1, !"Swift ABI Version", i32 7} !14 = !{i32 1, !"Swift Major Version", i8 6} !15 = !{i32 1, !"Swift Minor Version", i8 0} !16 = distinct !DICompileUnit(language: DW_LANG_Swift, file: !17, imports: !18, sdk: "MacOSX14.4.sdk") !17 = !DIFile(filename: "/Users/emilpedersen/swift2/swift/test/IRGen/debug_scope_distinct.swift", directory: "/Users/emilpedersen/swift2") !18 = !{!19, !21, !23, !25, !27, !29, !31} !19 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !20, file: !17) !20 = !DIModule(scope: null, name: "main", includePath: "/Users/emilpedersen/swift2/swift/test/IRGen") !21 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !22, file: !17) !22 = !DIModule(scope: null, name: "Swift", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/Swift.swiftmodule/arm64-apple-macos.swiftmodule") !23 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !24, line: 60) !24 = !DIModule(scope: null, name: "_Differentiation", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Differentiation.swiftmodule/arm64-apple-macos.swiftmodule") !25 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !26, line: 61) !26 = !DIModule(scope: null, name: "M", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/test-macosx-arm64/IRGen/Output/debug_scope_distinct.swift.tmp/M.swiftmodule") !27 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !28, file: !17) !28 = !DIModule(scope: null, name: "_StringProcessing", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_StringProcessing.swiftmodule/arm64-apple-macos.swiftmodule") !29 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !30, file: !17) !30 = !DIModule(scope: null, name: "_SwiftConcurrencyShims", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/shims") !31 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !32, file: !17) !32 = !DIModule(scope: null, name: "_Concurrency", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Concurrency.swiftmodule/arm64-apple-macos.swiftmodule") !33 = !{i1 false} !34 = !{!"-lswiftCore"} !35 = !{!"-lswift_StringProcessing"} !36 = !{!"-lswift_Differentiation"} !37 = !{!"-lswiftDarwin"} !38 = !{!"-lswift_Concurrency"} !39 = !{!"-lswiftSwiftOnoneSupport"} !40 = !{!"-lobjc"} !41 = !{!"-lswiftCompatibilityConcurrency"} !42 = !{!"-lswiftCompatibility56"} !43 = !{!"-lswiftCompatibilityPacks"} !44 = distinct !DISubprogram( unit: !16, declaration: !52, retainedNodes: !53) !45 = !DIFile(filename: "<compiler-generated>", directory: "/") !46 = !DICompositeType(tag: DW_TAG_structure_type, scope: !47, elements: !48, identifier: "$s4main1TV13TangentVectorVD") !47 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TVD") !48 = !{} !49 = !DISubroutineType(types: !50) !50 = !{!51} !51 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TV13TangentVectorVXMtD") !52 = !DISubprogram( file: !45, type: !49, spFlags: DISPFlagOptimized) !53 = !{!54, !56, !57} !54 = !DILocalVariable( scope: !44, type: !55, flags: DIFlagArtificial) !55 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !46) !56 = !DILocalVariable( scope: !44, flags: DIFlagArtificial) !57 = !DILocalVariable( scope: !44, type: !58, flags: DIFlagArtificial) !58 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !51) !62 = !DILocalVariable( scope: !63, type: !72, flags: DIFlagArtificial) !63 = distinct !DISubprogram( type: !66, unit: !16, declaration: !69, retainedNodes: !70) !64 = !DICompositeType(tag: DW_TAG_structure_type, scope: !65, identifier: "$s4main1UV13TangentVectorVD") !65 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UVD") !66 = !DISubroutineType(types: !67) !67 = !{!68} !68 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UV13TangentVectorVXMtD") !69 = !DISubprogram( spFlags: DISPFlagOptimized) !70 = !{!71, !73} !71 = !DILocalVariable( scope: !63, flags: DIFlagArtificial) !72 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !64) !73 = !DILocalVariable( scope: !63, type: !74, flags: DIFlagArtificial) !74 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !68) !75 = !DILocation( scope: !63, inlinedAt: !76) !76 = distinct !DILocation( scope: !44) ``` if we run ` opt -S -passes=sroa file.ll -o -` With this patch we will see ``` %.sroa.5.sroa.021 = alloca [7 x i8], align 8 tail call void @llvm.dbg.value(metadata ptr %.sroa.5.sroa.021, metadata !59, metadata !DIExpression(DW_OP_deref, DW_OP_LLVM_fragment, 72, 56)), !dbg !72 %.sroa.5.sroa.014 = alloca [7 x i8], align 8 ``` Without this patch we will see: ``` %.sroa.5.sroa.021 = alloca [7 x i8], align 8 %.sroa.5.sroa.014 = alloca [7 x i8], align 8 ``` Thus this patch ensures that llvm.dbg.values that use allocas that are broken up still have the correct metadata and debug information is preserved This is part of a stack of patches and is preceded by: https://github.com/llvm/llvm-project/pull/94068	2024-08-21 17:52:37 -07:00
Vitaly Buka	6dba99e14f	[InstCombine][asan] Don't speculate loads before `select ptr` (#100773 ) Even if memory is valid from `llvm` point of view, e.g. local alloca, sanitizers have API for user specific memory annotations. These annotations can be used to track size of the local object, e.g. inline vectors may prevent accesses beyond the current vector size. So valid programs should not access those parts of alloca before checking preconditions. Fixes #100639.	2024-07-29 11:28:03 -07:00
Vitaly Buka	2f3ae2f625	[NFC][InstCombine][SROA][Asan] Precommit tests affected by #100773 (#100844 ) Some optimization need to be undone with sanitizers by #100773. For #100639.	2024-07-29 10:32:51 -07:00
James Y Knight	b7e4fba6e5	Cleanup x86_mmx after removing IR type (#100646 ) After #98505, the textual IR keyword `x86_mmx` was temporarily made to parse as `<1 x i64>`, so as not to require a lot of test update noise. This completes the removal of the type, by removing the`x86_mmx` keyword from the IR parser, and making the (now no-op) test updates via `sed -i 's/\bx86_mmx\b/<1 x i64>/g' $(git grep -l x86_mmx llvm/test/)`. Resulting bitcasts from <1 x i64> to itself were then manually deleted. Changes to llvm/test/Bitcode/compatibility-$VERSION.ll were reverted, as they're intended to be equivalent to the .bc file, if parsed by old LLVM, so shouldn't be updated. A few tests were removed, as they're no longer testing anything, in the following files: - llvm/test/Transforms/GlobalOpt/x86_mmx_load.ll - llvm/test/Transforms/InstCombine/cast.ll - llvm/test/Transforms/InstSimplify/ConstProp/gep-zeroinit-vector.ll Works towards issue #98272.	2024-07-28 18:12:47 -04:00
Vitaly Buka	cd354e37ab	[NFC][SROA] Regenerate a test New update_test_checks.py use a difference spacing.	2024-07-26 17:22:48 -07:00
James Y Knight	dfeb3991fb	Remove the `x86_mmx` IR type. (#98505 ) It is now translated to `<1 x i64>`, which allows the removal of a bunch of special casing. This _incompatibly_ changes the ABI of any LLVM IR function with `x86_mmx` arguments or returns: instead of passing in mmx registers, they will now be passed via integer registers. However, the real-world incompatibility caused by this is expected to be minimal, because Clang never uses the x86_mmx type -- it lowers `__m64` to either `<1 x i64>` or `double`, depending on ABI. This change does _not_ eliminate the SelectionDAG `MVT::x86mmx` type. That type simply no longer corresponds to an IR type, and is used only by MMX intrinsics and inline-asm operands. Because SelectionDAGBuilder only knows how to generate the operands/results of intrinsics based on the IR type, it thus now generates the intrinsics with the type MVT::v1i64, instead of MVT::x86mmx. We need to fix this before the DAG LegalizeTypes, and thus have the X86 backend fix them up in DAGCombine. (This may be a short-lived hack, if all the MMX intrinsics can be removed in upcoming changes.) Works towards issue #98272.	2024-07-25 09:19:22 -04:00
Antonio Frighetto	6ce7b1f861	[TBAA] Do not rewrite TBAA if exists, always null out `!tbaa.struct` Retrieve `!tbaa` metadata via `!tbaa.struct` in `adjustForAccess` unless it already exists, as struct-path aware `MDNodes` emitted via `new-struct-path-tbaa` may be leveraged. As `!tbaa.struct` carries memcpy padding semantics among struct fields and `!tbaa` is already meant to aid to alias semantics, it should be possible to zero out `!tbaa.struct` once the memcpy has been simplified. `SROA/tbaa-struct.ll` test has gone out of scope, as `!tbaa` has already replaced `!tbaa.struct` in SROA. Fixes: https://github.com/llvm/llvm-project/issues/95661.	2024-07-25 09:24:56 +02:00
Yashwant Singh	cd1e6a587b	[SROA] Propagate no-signed-zeros(nsz) fast-math flag on the phi node using function attribute (#83381 ) Its expected that the sequence `return X > 0.0 ? X : -X`, compiled with -Ofast, produces fabs intrinsic. However, at this point, LLVM is unable to do so. The above sequence goes through the following transformation during the pass pipeline: 1) SROA pass generates the phi node. Here, it does not infer the fast-math flags on the phi node unlike clang frontend typically does. 2) Phi node eventually gets translated into select instruction. Because of missing no-signed-zeros(nsz) fast-math flag on the select instruction, InstCombine pass fails to fold the sequence into fabs intrinsic. This patch, as a part of SROA, tries to propagate nsz fast-math flag on the phi node using function attribute enabling this folding. Closes #51601 Co-authored-by: Sushant Gokhale <sgokhale@nvidia.com>	2024-07-02 11:59:39 +05:30
Stephen Tozer	094572701d	[RemoveDIs] Print IR with debug records by default (#91724 ) This patch makes the final major change of the RemoveDIs project, changing the default IR output from debug intrinsics to debug records. This is expected to break a large number of tests: every single one that tests for uses or declarations of debug intrinsics and does not explicitly disable writing records. If this patch has broken your downstream tests (or upstream tests on a configuration I wasn't able to run): 1. If you need to immediately unblock a build, pass `--write-experimental-debuginfo=false` to LLVM's option processing for all failing tests (remember to use `-mllvm` for clang/flang to forward arguments to LLVM). 2. For most test failures, the changes are trivial and mechanical, enough that they can be done by script; see the migration guide for a guide on how to do this: https://llvm.org/docs/RemoveDIsDebugInfo.html#test-updates 3. If any tests fail for reasons other than FileCheck check lines that need updating, such as assertion failures, that is most likely a real bug with this patch and should be reported as such. For more information, see the recent PSA: https://discourse.llvm.org/t/psa-ir-output-changing-from-debug-intrinsics-to-debug-records/79578	2024-06-14 15:07:27 +01:00
Nikita Popov	738fcbee68	[SROA] Preserve all GEP flags during speculation Unlikely to matter in practice, as these GEPs are typically promoted away.	2024-06-14 11:48:35 +02:00
Florian Hahn	c8e5ad4e12	Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709 )" This reverts commit 7dbba39e583a3fd64e7e6b947251c035e483f054. Revert as there are reports this triggers during ThinLTO in some configurations.	2024-04-22 10:50:49 +01:00
Julian Nagele	7dbba39e58	Reapply "[TBAA] Add verifier for tbaa.struct metadata (#86709 )" This reverts commit b9cd48f96acdd07c627ccafbf4386a1f3dcd6c51. ------------------------------------------------------------- Original commit message: Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: https://github.com/llvm/llvm-project/pull/86709	2024-04-15 11:25:06 +01:00
Florian Hahn	b9cd48f96a	Revert "[TBAA] Add verifier for tbaa.struct metadata (#86709 )" This reverts commit df75183d70e029352a49c93f275db703c81a65c1. Revert for now as this appears to cause failures on some buildbots, e.g.: https://lab.llvm.org/buildbot/#/builders/93/builds/19428/steps/10/logs/stdio	2024-03-27 21:22:15 +00:00
Julian Nagele	df75183d70	[TBAA] Add verifier for tbaa.struct metadata (#86709 ) Adds logic to the IR verifier that checks whether !tbaa.struct nodes are well-formed. That is, it checks that the operands of !tbaa.struct nodes are in groups of three, that each group of three operands consists of two integers and a valid tbaa node, and that the regions described by the offset and size operands are non-overlapping. PR: https://github.com/llvm/llvm-project/pull/86709	2024-03-27 10:30:27 +01:00
Arthur Eubanks	eae4f56cb4	[SROA] Fix phi gep unfolding with an alloca not in entry block Fixes a crash reported in #83494.	2024-03-07 07:23:48 +00:00
Jeffrey Byrnes	1e828f838c	[SROA]: Only defer trying partial sized ptr or ptr vector types Change-Id: Ic77f87290905addadd5819dff2d0c62f031022ab	2024-03-05 08:52:07 -08:00
Arthur Eubanks	8848258f7b	[SROA] Unfold gep of index phi (round 2) (#83494 ) If a gep has only one phi as one of its operands and the remaining indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi ((gep ptr, idx1), (gep ptr, idx2))`. Take care not to unfold recursive phis. Followup to #80983. This was initially was #83087. Initial PR did not handle allocas in entry block that weren't at the beginning of the function, causing GEPs to be inserted after the first chunk of allocas but potentially before an alloca not at the beginning. Insert GEPs at the end of the entry block instead since constants/arguments/static allocas can all be used there.	2024-03-04 14:21:26 -08:00
Arthur Eubanks	de8e2b7b86	[test][SROA] Regenerate vector-promotion.ll	2024-02-29 18:53:25 +00:00
Fangrui Song	43b7dfcc1d	Revert "[SROA] Unfold gep of index phi (#83087 )" This reverts commit 2eb63982e88b9ed8336158d35884b1a1d04a0f78. This caused verifier error ``` Instruction does not dominate all uses! ``` for some projects using Halide. The verifier error happens inside `Halide::Internal::CodeGen_LLVM::optimize_module` and looks like a genuine SROA issue.	2024-02-28 15:56:43 -08:00
Arthur Eubanks	2eb63982e8	[SROA] Unfold gep of index phi (#83087 ) If a gep has only one phi as one of its operands and the remaining indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi ((gep ptr, idx1), (gep ptr, idx2))`. Take care not to unfold recursive phis. Followup to #80983.	2024-02-28 10:53:47 -08:00
Stephen Tozer	d128448efd	Revert "Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 )"" Reverted due to some test failures on some buildbots. https://lab.llvm.org/buildbot/#/builders/67/builds/14669 This reverts commit aa436493ab7ad4cf323b0189c15c59ac9dc293c7.	2024-02-27 10:17:24 +00:00
Stephen Tozer	aa436493ab	Reapply "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 )" Fixes the prior issue in which the symbol for a cl-arg was unavailable to some binaries. This reverts commit dc06d75ab27b4dcae2940fc386fadd06f70faffe.	2024-02-27 09:59:08 +00:00
Stephen Tozer	dc06d75ab2	Revert "[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 )" Reverted due to failures on buildbots, where a new cl flag was placed in the wrong file, resulting in link errors. https://lab.llvm.org/buildbot/#/builders/198/builds/8548 This reverts commit 0b398256b3f72204ad1f7c625efe4990204e898a.	2024-02-26 18:49:18 +00:00
Stephen Tozer	0b398256b3	[RemoveDIs] Print non-intrinsic debug info in textual IR output (#79281 ) This patch adds support for printing the proposed non-instruction debug info ("RemoveDIs") out to textual IR. This patch does not add any bitcode support, parsing support, or documentation. Printing of the new format is controlled by a flag added in this patch, `--write-experimental-debuginfo`, which defaults to false. The new format will be printed iff this flag is true, so whether we use the IR format is completely independent of whether we use non-instruction debug info during LLVM passes (which is controlled by the `--try-experimental-debuginfo-iterators` flag). Even with the flag disabled, some existing tests need to be updated, as this patch causes debug intrinsic declarations to be changed in a round trip, such that they always appear at the end of a module and have no attributes (this has no functional change on the module). The design of this new IR format was proposed previously on Discourse, and any further discussion about the design can still be contributed there: https://discourse.llvm.org/t/rfc-debuginfo-proposed-changes-to-the-textual-ir-representation-for-debug-values/73491	2024-02-26 18:22:05 +00:00
Florian Hahn	dc85719d5b	[TBAA] Use !tbaa for first accessed field if it is an exact match in offset and size. (#81313 ) Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on https://github.com/llvm/llvm-project/pull/81289. PR: https://github.com/llvm/llvm-project/pull/81313	2024-02-16 19:23:14 +00:00
Florian Hahn	53c0e809fa	[SROA] Use !tbaa instead of !tbaa.struct if op matches field. (#81289 ) If a split memory access introduced by SROA accesses precisely a single field of the original operation's !tbaa.struct, use the !tbaa tag for the accessed field directly instead of the full !tbaa.struct. InstCombine already had a similar logic. Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on https://github.com/llvm/llvm-project/pull/81285.	2024-02-16 13:45:01 +00:00
Florian Hahn	2a9b86cc10	[SROA] Extend !tbaa.struct test coverage with multiple missing cases. Add tests to cover missing cases for https://github.com/llvm/llvm-project/pull/81289 and https://github.com/llvm/llvm-project/pull/81313.	2024-02-15 21:21:18 +00:00

1 2 3 4 5 ...

376 Commits