llvm-project

Author	SHA1	Message	Date
Jay Foad	e03f427196	[LLVM] Use {} instead of std::nullopt to initialize empty ArrayRef (#109133 ) It is almost always simpler to use {} instead of std::nullopt to initialize an empty ArrayRef. This patch changes all occurrences I could find in LLVM itself. In future the ArrayRef(std::nullopt_t) constructor could be deprecated or removed.	2024-09-19 16:16:38 +01:00
Bartłomiej Chmiel	c2b92a4250	[SROA] Use SetVector for PromotableAllocas (#105809 ) Use SetVector to make operations more efficient if there is a very large number of allocas.	2024-09-04 14:10:51 +02:00
Stephen Tozer	3d08ade7bd	[ExtendLifetimes] Implement llvm.fake.use to extend variable lifetimes (#86149 ) This patch is part of a set of patches that add an `-fextend-lifetimes` flag to clang, which extends the lifetimes of local variables and parameters for improved debuggability. In addition to that flag, the patch series adds a pragma to selectively disable `-fextend-lifetimes`, and an `-fextend-this-ptr` flag which functions as `-fextend-lifetimes` for this pointers only. All changes and tests in these patches were written by Wolfgang Pieb (@wolfy1961), while Stephen Tozer (@SLTozer) has handled review and merging. The extend lifetimes flag is intended to eventually be set on by `-Og`, as discussed in the RFC here: https://discourse.llvm.org/t/rfc-redefine-og-o1-and-add-a-new-level-of-og/72850 This patch implements a new intrinsic instruction in LLVM, `llvm.fake.use` in IR and `FAKE_USE` in MIR, that takes a single operand and has no effect other than "using" its operand, to ensure that its operand remains live until after the fake use. This patch does not emit fake uses anywhere; the next patch in this sequence causes them to be emitted from the clang frontend, such that for each variable (or this) a fake.use operand is inserted at the end of that variable's scope, using that variable's value. This patch covers everything post-frontend, which is largely just the basic plumbing for a new intrinsic/instruction, along with a few steps to preserve the fake uses through optimizations (such as moving them ahead of a tail call or translating them through SROA). Co-authored-by: Stephen Tozer <stephen.tozer@sony.com>	2024-08-29 17:53:32 +01:00
Shubham Sandeep Rastogi	359c704004	Handle #dbg_values in SROA. (#94070 ) This patch properly handles #dbg_values in SROA by making sure that any #dbg_values get moved to before a store just like #dbg_declares do, or the #dbg_value is correctly updated with the right alloca after an aggregate alloca is broken up. The issue stems from swift where #dbg_values are emitted and not dbg.declares, the SROA pass doesn't handle the #dbg_values correctly and it causes them to all have undefs If we look at this simple-ish testcase (This is all I could reduce it down to, and I am still relatively bad at writing llvm IR by hand so I apologize in advance): ``` %T4main1TV13TangentVectorV = type <{ %T4main1UV13TangentVectorV, [7 x i8], %T4main1UV13TangentVectorV }> %T4main1UV13TangentVectorV = type <{ %T1M1SVySfG, [7 x i8], %T4main1VV13TangentVectorV }> %T1M1SVySfG = type <{ ptr, %Ts4Int8V }> %Ts4Int8V = type <{ i8 }> %T4main1VV13TangentVectorV = type <{ %T1M1SVySfG }> define hidden swiftcc void @"$s4main1TV13TangentVectorV1poiyA2E_AEtFZ"(ptr noalias nocapture sret(%T4main1TV13TangentVectorV) %0, ptr noalias nocapture dereferenceable(57) %1, ptr noalias nocapture dereferenceable(57) %2) #0 !dbg !44 { entry: %3 = alloca %T4main1VV13TangentVectorV %4 = alloca %T4main1UV13TangentVectorV %5 = alloca %T4main1VV13TangentVectorV %6 = alloca %T4main1UV13TangentVectorV %7 = alloca %T4main1VV13TangentVectorV %8 = alloca %T4main1UV13TangentVectorV %9 = alloca %T4main1VV13TangentVectorV %10 = alloca %T4main1UV13TangentVectorV call void @llvm.lifetime.start.p0(i64 9, ptr %3) call void @llvm.lifetime.start.p0(i64 25, ptr %4) call void @llvm.lifetime.start.p0(i64 9, ptr %5) call void @llvm.lifetime.start.p0(i64 25, ptr %6) call void @llvm.lifetime.start.p0(i64 9, ptr %7) call void @llvm.lifetime.start.p0(i64 25, ptr %8) call void @llvm.lifetime.start.p0(i64 9, ptr %9) call void @llvm.lifetime.start.p0(i64 25, ptr %10) %.u1 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 0 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %4, ptr align 8 %.u1, i64 25, i1 false) %.u11 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 0 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %6, ptr align 8 %.u11, i64 25, i1 false) call void @llvm.dbg.value(metadata ptr %4, metadata !62, metadata !DIExpression(DW_OP_deref)), !dbg !75 %.s = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 0 %.s.c = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 0 %11 = load ptr, ptr %.s.c %.s.b = getelementptr inbounds %T1M1SVySfG, ptr %.s, i32 0, i32 1 %.s.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s.b, i32 0, i32 0 %12 = load i8, ptr %.s.b._value %.s2 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 0 %.s2.c = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 0 %13 = load ptr, ptr %.s2.c %.s2.b = getelementptr inbounds %T1M1SVySfG, ptr %.s2, i32 0, i32 1 %.s2.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s2.b, i32 0, i32 0 %14 = load i8, ptr %.s2.b._value %.v = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %4, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %3, ptr align 8 %.v, i64 9, i1 false) %.v3 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %6, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %5, ptr align 8 %.v3, i64 9, i1 false) %.s4 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %3, i32 0, i32 0 %.s4.c = getelementptr inbounds %T1M1SVySfG, ptr %.s4, i32 0, i32 0 %18 = load ptr, ptr %.s4.c %.s5 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %5, i32 0, i32 0 %.s5.c = getelementptr inbounds %T1M1SVySfG, ptr %.s5, i32 0, i32 0 %20 = load ptr, ptr %.s5.c %.u2 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %1, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %8, ptr align 8 %.u2, i64 25, i1 false) %.u26 = getelementptr inbounds %T4main1TV13TangentVectorV, ptr %2, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %10, ptr align 8 %.u26, i64 25, i1 false) %.s7 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 0 %.s7.c = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 0 %25 = load ptr, ptr %.s7.c %.s7.b = getelementptr inbounds %T1M1SVySfG, ptr %.s7, i32 0, i32 1 %.s7.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s7.b, i32 0, i32 0 %26 = load i8, ptr %.s7.b._value %.s8 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 0 %.s8.c = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 0 %27 = load ptr, ptr %.s8.c %.s8.b = getelementptr inbounds %T1M1SVySfG, ptr %.s8, i32 0, i32 1 %.s8.b._value = getelementptr inbounds %Ts4Int8V, ptr %.s8.b, i32 0, i32 0 %28 = load i8, ptr %.s8.b._value %.v9 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %8, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %7, ptr align 8 %.v9, i64 9, i1 false) %.v10 = getelementptr inbounds %T4main1UV13TangentVectorV, ptr %10, i32 0, i32 2 call void @llvm.memcpy.p0.p0.i64(ptr align 8 %9, ptr align 8 %.v10, i64 9, i1 false) %.s11 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %7, i32 0, i32 0 %.s11.c = getelementptr inbounds %T1M1SVySfG, ptr %.s11, i32 0, i32 0 %32 = load ptr, ptr %.s11.c %.s12 = getelementptr inbounds %T4main1VV13TangentVectorV, ptr %9, i32 0, i32 0 %.s12.c = getelementptr inbounds %T1M1SVySfG, ptr %.s12, i32 0, i32 0 %34 = load ptr, ptr %.s12.c call void @llvm.lifetime.end.p0(i64 25, ptr %10) call void @llvm.lifetime.end.p0(i64 9, ptr %9) call void @llvm.lifetime.end.p0(i64 25, ptr %8) call void @llvm.lifetime.end.p0(i64 9, ptr %7) call void @llvm.lifetime.end.p0(i64 25, ptr %6) call void @llvm.lifetime.end.p0(i64 9, ptr %5) call void @llvm.lifetime.end.p0(i64 25, ptr %4) call void @llvm.lifetime.end.p0(i64 9, ptr %3) ret void } !llvm.module.flags = !{!0, !1, !2, !3, !4, !6, !7, !8, !9, !10, !11, !12, !13, !14, !15} !swift.module.flags = !{!33} !llvm.linker.options = !{!34, !35, !36, !37, !38, !39, !40, !41, !42, !43} !0 = !{i32 2, !"SDK Version", [2 x i32] [i32 14, i32 4]} !1 = !{i32 1, !"Objective-C Version", i32 2} !2 = !{i32 1, !"Objective-C Image Info Version", i32 0} !3 = !{i32 1, !"Objective-C Image Info Section", !"__DATA, no_dead_strip"} !4 = !{i32 1, !"Objective-C Garbage Collection", i8 0} !6 = !{i32 7, !"Dwarf Version", i32 4} !7 = !{i32 2, !"Debug Info Version", i32 3} !8 = !{i32 1, !"wchar_size", i32 4} !9 = !{i32 8, !"PIC Level", i32 2} !10 = !{i32 7, !"uwtable", i32 1} !11 = !{i32 7, !"frame-pointer", i32 1} !12 = !{i32 1, !"Swift Version", i32 7} !13 = !{i32 1, !"Swift ABI Version", i32 7} !14 = !{i32 1, !"Swift Major Version", i8 6} !15 = !{i32 1, !"Swift Minor Version", i8 0} !16 = distinct !DICompileUnit(language: DW_LANG_Swift, file: !17, imports: !18, sdk: "MacOSX14.4.sdk") !17 = !DIFile(filename: "/Users/emilpedersen/swift2/swift/test/IRGen/debug_scope_distinct.swift", directory: "/Users/emilpedersen/swift2") !18 = !{!19, !21, !23, !25, !27, !29, !31} !19 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !20, file: !17) !20 = !DIModule(scope: null, name: "main", includePath: "/Users/emilpedersen/swift2/swift/test/IRGen") !21 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !22, file: !17) !22 = !DIModule(scope: null, name: "Swift", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/Swift.swiftmodule/arm64-apple-macos.swiftmodule") !23 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !24, line: 60) !24 = !DIModule(scope: null, name: "_Differentiation", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Differentiation.swiftmodule/arm64-apple-macos.swiftmodule") !25 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !26, line: 61) !26 = !DIModule(scope: null, name: "M", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/test-macosx-arm64/IRGen/Output/debug_scope_distinct.swift.tmp/M.swiftmodule") !27 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !28, file: !17) !28 = !DIModule(scope: null, name: "_StringProcessing", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_StringProcessing.swiftmodule/arm64-apple-macos.swiftmodule") !29 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !30, file: !17) !30 = !DIModule(scope: null, name: "_SwiftConcurrencyShims", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/shims") !31 = !DIImportedEntity(tag: DW_TAG_imported_module, scope: !17, entity: !32, file: !17) !32 = !DIModule(scope: null, name: "_Concurrency", includePath: "/Users/emilpedersen/swift2/_build/Ninja-RelWithDebInfoAssert+stdlib-RelWithDebInfo/swift-macosx-arm64/lib/swift/macosx/_Concurrency.swiftmodule/arm64-apple-macos.swiftmodule") !33 = !{i1 false} !34 = !{!"-lswiftCore"} !35 = !{!"-lswift_StringProcessing"} !36 = !{!"-lswift_Differentiation"} !37 = !{!"-lswiftDarwin"} !38 = !{!"-lswift_Concurrency"} !39 = !{!"-lswiftSwiftOnoneSupport"} !40 = !{!"-lobjc"} !41 = !{!"-lswiftCompatibilityConcurrency"} !42 = !{!"-lswiftCompatibility56"} !43 = !{!"-lswiftCompatibilityPacks"} !44 = distinct !DISubprogram( unit: !16, declaration: !52, retainedNodes: !53) !45 = !DIFile(filename: "<compiler-generated>", directory: "/") !46 = !DICompositeType(tag: DW_TAG_structure_type, scope: !47, elements: !48, identifier: "$s4main1TV13TangentVectorVD") !47 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TVD") !48 = !{} !49 = !DISubroutineType(types: !50) !50 = !{!51} !51 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1TV13TangentVectorVXMtD") !52 = !DISubprogram( file: !45, type: !49, spFlags: DISPFlagOptimized) !53 = !{!54, !56, !57} !54 = !DILocalVariable( scope: !44, type: !55, flags: DIFlagArtificial) !55 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !46) !56 = !DILocalVariable( scope: !44, flags: DIFlagArtificial) !57 = !DILocalVariable( scope: !44, type: !58, flags: DIFlagArtificial) !58 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !51) !62 = !DILocalVariable( scope: !63, type: !72, flags: DIFlagArtificial) !63 = distinct !DISubprogram( type: !66, unit: !16, declaration: !69, retainedNodes: !70) !64 = !DICompositeType(tag: DW_TAG_structure_type, scope: !65, identifier: "$s4main1UV13TangentVectorVD") !65 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UVD") !66 = !DISubroutineType(types: !67) !67 = !{!68} !68 = !DICompositeType(tag: DW_TAG_structure_type, identifier: "$s4main1UV13TangentVectorVXMtD") !69 = !DISubprogram( spFlags: DISPFlagOptimized) !70 = !{!71, !73} !71 = !DILocalVariable( scope: !63, flags: DIFlagArtificial) !72 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !64) !73 = !DILocalVariable( scope: !63, type: !74, flags: DIFlagArtificial) !74 = !DIDerivedType(tag: DW_TAG_const_type, baseType: !68) !75 = !DILocation( scope: !63, inlinedAt: !76) !76 = distinct !DILocation( scope: !44) ``` if we run ` opt -S -passes=sroa file.ll -o -` With this patch we will see ``` %.sroa.5.sroa.021 = alloca [7 x i8], align 8 tail call void @llvm.dbg.value(metadata ptr %.sroa.5.sroa.021, metadata !59, metadata !DIExpression(DW_OP_deref, DW_OP_LLVM_fragment, 72, 56)), !dbg !72 %.sroa.5.sroa.014 = alloca [7 x i8], align 8 ``` Without this patch we will see: ``` %.sroa.5.sroa.021 = alloca [7 x i8], align 8 %.sroa.5.sroa.014 = alloca [7 x i8], align 8 ``` Thus this patch ensures that llvm.dbg.values that use allocas that are broken up still have the correct metadata and debug information is preserved This is part of a stack of patches and is preceded by: https://github.com/llvm/llvm-project/pull/94068	2024-08-21 17:52:37 -07:00
Orlando Cazalet-Hyams	57539418ba	[SROA] Fix debug locations for variables with non-zero offsets (#97750 ) Fixes issue #61981 by adjusting variable location offsets (in the DIExpression) when splitting allocas. Patch [4/4] to fix structured bindings in SROA. NOTE: There's still a bug in mem2reg which generates incorrect locations in some situations: if the variable fragment has an offset into the new (split) alloca, mem2reg will fail to convert that into a bit shift (the location contains a garbage offset). That's not addressed here. insertNewDbgInst - Now takes the address-expression and FragmentInfo as separate parameters because unlike dbg_declares dbg_assigns want those to go to different places. dbg_assign records put the variable fragment info in the value expression only (whereas dbg_declare has only one expression so puts it there - ideally this information wouldn't live in DIExpression, but that's another issue). MigrateOne - Modified to correctly compute the necessary offsets and fragment adjustments. The previous implementation produced bogus locations for variables with non-zero offsets. The changes replace most of the body of this lambda, so it might be easier to review in a split-diff view and focus on the change as a whole than to compare it to the old implementation. This uses calculateFragmentIntersect and extractLeadingOffset added in previous patches in this series, and createOrReplaceFragment described below. createOrReplaceFragment - Similar to DIExpression::createFragmentExpression except for 3 important distinctions: 1. The new fragment isn't relative to an existing fragment. 2. There are no checks on the the operation types because it is assumed the location this expression is computing is not implicit (i.e., it's always safe to create a fragment because arithmetic operations apply to the address computation, not to an implicit value computation). 3. Existing extract_bits are modified independetly of fragment changes using \p BitExtractOffset. A change to the fragment offset or size may affect a bit extract. But a bit extract offset can change independently of the fragment dimensions. Returns the new expression, or nullptr if one couldn't be created. Ideally this is only used to signal that a bit-extract has become zero-sized (and thus the new debug record has no size and can be dropped), however, it fails for other reasons too - see the FIXME below. FIXME: To keep the scope of this change focused on non-bitfield structured bindings the function bails in situations that DIExpression::createFragmentExpression fails. E.g. when fragment and bit extract sizes differ. These limitations can be removed in the future.	2024-07-18 09:08:25 +01:00
Nikita Popov	9df71d7673	[IR] Add getDataLayout() helpers to Function and GlobalValue (#96919 ) Similar to https://github.com/llvm/llvm-project/pull/96902, this adds `getDataLayout()` helpers to Function and GlobalValue, replacing the current `getParent()->getDataLayout()` pattern.	2024-06-28 08:36:49 +02:00
Nikita Popov	2d209d964a	[IR] Add getDataLayout() helpers to BasicBlock and Instruction (#96902 ) This is a helper to avoid writing `getModule()->getDataLayout()`. I regularly try to use this method only to remember it doesn't exist... `getModule()->getDataLayout()` is also a common (the most common?) reason why code has to include the Module.h header.	2024-06-27 16:38:15 +02:00
Stephen Tozer	d75f9dd1d2	Revert "[IR][NFC] Update IRBuilder to use InsertPosition (#96497 )" Reverts the above commit, as it updates a common header function and did not update all callsites: https://lab.llvm.org/buildbot/#/builders/29/builds/382 This reverts commit 6481dc57612671ebe77fe9c34214fba94e1b3b27.	2024-06-24 18:00:22 +01:00
Stephen Tozer	6481dc5761	[IR][NFC] Update IRBuilder to use InsertPosition (#96497 ) Uses the new InsertPosition class (added in #94226) to simplify some of the IRBuilder interface, and removes the need to pass a BasicBlock alongside a BasicBlock::iterator, using the fact that we can now get the parent basic block from the iterator even if it points to the sentinel. This patch removes the BasicBlock argument from each constructor or call to setInsertPoint. This has no functional effect, but later on as we look to remove the `Instruction *InsertBefore` argument from instruction-creation (discussed [here](https://discourse.llvm.org/t/psa-instruction-constructors-changing-to-iterator-only-insertion/77845)), this will simplify the process by allowing us to deprecate the InsertPosition constructor directly and catch all the cases where we use instructions rather than iterators.	2024-06-24 17:27:43 +01:00
Stephen Tozer	80f881485a	[LLVM] Add InsertPosition union-type to remove overloads of Instruction-creation (#94226 ) This patch simplifies instruction creation by replacing all overloads of instruction constructors/Create methods that are identical other than the Instruction InsertBefore/BasicBlock InsertAtEnd/BasicBlock::iterator InsertBefore argument with a single version that takes an InsertPosition argument. The InsertPosition class can be implicitly constructed from any of the above, internally converting them to the appropriate BasicBlock::iterator value which can then be used to insert the instruction (or to not insert it if an invalid iterator is passed). The upshot of this is that code will be deduplicated, and all callsites will switch to calling the new unified version without any changes needed to make the compiler happy. There is at least one exception to this; the construction of InsertPosition is a user-defined conversion, so any caller that was already relying on a different user-defined conversion won't work. In all of LLVM and Clang this happens exactly once: at clang/lib/CodeGen/CGExpr.cpp:123 we try to construct an alloca with an AssertingVH<Instruction> argument, which must now be cast to an Instruction* by using `&*`. If this is more common elsewhere, it could be fixed by adding an appropriate constructor to InsertPosition.	2024-06-20 10:27:55 +01:00
Kazu Hirata	7c6d0d26b1	[llvm] Use llvm::unique (NFC) (#95628 )	2024-06-14 22:49:36 -07:00
Nikita Popov	738fcbee68	[SROA] Preserve all GEP flags during speculation Unlikely to matter in practice, as these GEPs are typically promoted away.	2024-06-14 11:48:35 +02:00
Nikita Popov	a87ff8db26	[SROA] Remove sroa-strict-inbounds option (NFC) (#94342 ) This option was added in 3b79b2ab4e35353e63ba323a3de4b0a70c61a5f1 with the comment: > I had SROA implemented this way a long time ago and due to the > overwhelming bugs that surfaced, moved to a much more relaxed > variant. Richard Smith would like to understand the magnitude > of this problem and it seems fairly harmless to keep some > flag-controlled logic to get the extremely strict behavior here. > I'll remove it if it doesn't prove useful. As far as I know, it did not prove useful, so I'm removing it now. With constant GEPs canonicalized to i8, GEPs that only temporarily go out of bounds during the offset calculation do not naturally occur anymore anyway.	2024-06-05 09:18:13 +02:00
henke9600	50dbbe5a0a	[SROA] Use stable sort for slices to avoid non-determinism (#91609 ) Found this while trying to build a LLVM toolchain reproducibly from both Debian 12 and FreeBSD 14. With these changes they come out bit-by-bit identical. Previously there was a mix of stable and unstable sorts for slices, now only stable sorts are used.	2024-05-21 08:17:44 +02:00
Stephen Tozer	ffd08c7759	[RemoveDIs][NFC] Rename DPValue -> DbgVariableRecord (#85216 ) This is the major rename patch that prior patches have built towards. The DPValue class is being renamed to DbgVariableRecord, which reflects the updated terminology for the "final" implementation of the RemoveDI feature. This is a pure string substitution + clang-format patch. The only manual component of this patch was determining where to perform these string substitutions: `DPValue` and `DPV` are almost exclusively used for DbgRecords, except for: - llvm/lib/target, where 'DP' is used to mean double-precision, and so appears as part of .td files and in variable names. NB: There is a single existing use of `DPValue` here that refers to debug info, which I've manually updated. - llvm/tools/gold, where 'LDPV' is used as a prefix for symbol visibility enums. Outside of these places, I've applied several basic string substitutions, with the intent that they only affect DbgRecord-related identifiers; I've checked them as I went through to verify this, with reasonable confidence that there are no unintended changes that slipped through the cracks. The substitutions applied are all case-sensitive, and are applied in the order shown: ``` DPValue -> DbgVariableRecord DPVal -> DbgVarRec DPV -> DVR ``` Following the previous rename patches, it should be the case that there are no instances of any of these strings that are meant to refer to the general case of DbgRecords, or anything other than the DPValue class. The idea behind this patch is therefore that pure string substitution is correct in all cases as long as these assumptions hold.	2024-03-19 20:07:07 +00:00
Stephen Tozer	15f3f446c5	[RemoveDIs][NFC] Rename common interface functions for DPValues->DbgRecords (#84793 ) As part of the effort to rename the DbgRecord classes, this patch renames the widely-used functions that operate on DbgRecords but refer to DbgValues or DPValues in their names to refer to DbgRecords instead; all such functions are defined in one of `BasicBlock.h`, `Instruction.h`, and `DebugProgramInstruction.h`. This patch explicitly does not change the names of any comments or variables, except for where they use the exact name of one of the renamed functions. The reason for this is reviewability; this patch can be trivially examined to determine that the only changes are direct string substitutions and any results from clang-format responding to the changed line lengths. Future patches will cover renaming variables and comments, and then renaming the classes themselves.	2024-03-12 14:53:13 +00:00
Orlando Cazalet-Hyams	9997e03971	[RemoveDIs] Update DIBuilder to conditionally insert DbgRecords (#84739 ) Have DIBuilder conditionally insert either debug intrinsics or DbgRecord depending on the module's IsNewDbgInfoFormat flag. The insertion methods now return a `DbgInstPtr` (a `PointerUnion<Instruction , DbgRecord >`). Add a unittest for both modes (I couldn't find an existing test testing insertion behaviours specifically). This patch changes the existing assumption that DbgRecords are only ever inserted if there's an instruction to insert-before because clang currently inserts debug intrinsics while CodeGening (like any other instruction) meaning it'll try inserting to the end of a block without a terminator. We already have machinery in place to maintain the DbgRecords when a terminator is removed - these become "trailing DbgRecords" which are re-attached when a new instruction is inserted. All I've done is allow this state to occur while inserting DbgRecords too, i.e., it's not only removing terminators that causes this valid transient state, but inserting DbgRecords into incomplete blocks too. The C API will be updated in follow up patches. --- Note: this doesn't mean clang is emitting DbgRecords yet, because the modules it creates are still always in the old debug mode. That will come in a future patch.	2024-03-12 10:25:58 +00:00
Arthur Eubanks	eae4f56cb4	[SROA] Fix phi gep unfolding with an alloca not in entry block Fixes a crash reported in #83494.	2024-03-07 07:23:48 +00:00
Jeffrey Byrnes	1e828f838c	[SROA]: Only defer trying partial sized ptr or ptr vector types Change-Id: Ic77f87290905addadd5819dff2d0c62f031022ab	2024-03-05 08:52:07 -08:00
Jeremy Morse	2fe81edef6	[NFC][RemoveDIs] Insert instruction using iterators in Transforms/ As part of the RemoveDIs project we need LLVM to insert instructions using iterators wherever possible, so that the iterators can carry a bit of debug-info. This commit implements some of that by updating the contents of llvm/lib/Transforms/Utils to always use iterator-versions of instruction constructors. There are two general flavours of update: * Almost all call-sites just call getIterator on an instruction * Several make use of an existing iterator (scenarios where the code is actually significant for debug-info) The underlying logic is that any call to getFirstInsertionPt or similar APIs that identify the start of a block need to have that iterator passed directly to the insertion function, without being converted to a bare Instruction pointer along the way. Noteworthy changes: * FindInsertedValue now takes an optional iterator rather than an instruction pointer, as we need to always insert with iterators, * I've added a few iterator-taking versions of some value-tracking and DomTree methods -- they just unwrap the iterator. These are purely convenience methods to avoid extra syntax in some passes. * A few calls to getNextNode become std::next instead (to keep in the theme of using iterators for positions), * SeparateConstOffsetFromGEP has it's insertion-position field changed. Noteworthy because it's not a purely localised spelling change. All this should be NFC.	2024-03-05 15:12:22 +00:00
Arthur Eubanks	8848258f7b	[SROA] Unfold gep of index phi (round 2) (#83494 ) If a gep has only one phi as one of its operands and the remaining indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi ((gep ptr, idx1), (gep ptr, idx2))`. Take care not to unfold recursive phis. Followup to #80983. This was initially was #83087. Initial PR did not handle allocas in entry block that weren't at the beginning of the function, causing GEPs to be inserted after the first chunk of allocas but potentially before an alloca not at the beginning. Insert GEPs at the end of the entry block instead since constants/arguments/static allocas can all be used there.	2024-03-04 14:21:26 -08:00
Fangrui Song	43b7dfcc1d	Revert "[SROA] Unfold gep of index phi (#83087 )" This reverts commit 2eb63982e88b9ed8336158d35884b1a1d04a0f78. This caused verifier error ``` Instruction does not dominate all uses! ``` for some projects using Halide. The verifier error happens inside `Halide::Internal::CodeGen_LLVM::optimize_module` and looks like a genuine SROA issue.	2024-02-28 15:56:43 -08:00
Arthur Eubanks	2eb63982e8	[SROA] Unfold gep of index phi (#83087 ) If a gep has only one phi as one of its operands and the remaining indexes are constant, we can unfold `gep ptr, (phi idx1, idx2)` to `phi ((gep ptr, idx1), (gep ptr, idx2))`. Take care not to unfold recursive phis. Followup to #80983.	2024-02-28 10:53:47 -08:00
Arthur Eubanks	001e18c816	[NFC] clang-format SROA.cpp	2024-02-27 22:46:16 +00:00
Florian Hahn	53c0e809fa	[SROA] Use !tbaa instead of !tbaa.struct if op matches field. (#81289 ) If a split memory access introduced by SROA accesses precisely a single field of the original operation's !tbaa.struct, use the !tbaa tag for the accessed field directly instead of the full !tbaa.struct. InstCombine already had a similar logic. Motivation for this and follow-on patches is to improve codegen for libc++, where using memcpy limits optimizations, like vectorization for code iteration over std::vector<std::complex<float>>: https://godbolt.org/z/f3vqYos3c Depends on https://github.com/llvm/llvm-project/pull/81285.	2024-02-16 13:45:01 +00:00
Nikita Popov	2f8e37d201	[SROA] Unfold gep of index select (#80983 ) SROA currently supports converting a gep of select into select of gep if the select is in the pointer operand. This patch expands support to selects in an index operand. This is intended to address the regression reported in https://github.com/llvm/llvm-project/pull/68882#issuecomment-1924909922.	2024-02-09 09:36:05 +01:00
Jeremy Morse	a643ab852a	[DebugInfo][RemoveDIs] Final omnibus test fixing for RemoveDIs (#81125 ) With this, I get a clean test suite running under RemoveDIs, the non-intrinsic representation of debug-info, including under asan. We've previously established that we generate identical binaries for some large projects, so this i just edge-case cleanup. The changes: * CodeGenPrepare fixups need to apply to dbg.assigns as well as dbg.values (a dbg.assign is a dbg.value). * Pin a test for constant-deletion to intrinsic debug-info: this very rare scenario uses a different kill-location sigil in dbg.value mode to RemoveDIs mode, which generates spurious test differences. * Suppress a memory leak in a unit test: the code for dealing with trailing debug-info in a block is necessarily fiddly, leading to this leak when testing it. Developer-facing interfaces for moving instructions around always deal with this behind the scenes. * SROA, when replacing some vector-loads, needs to insert the replacement loads ahead of any debug-info records so that their values remain dominated by a definition. Set the head-bit indicating our insertion should come before debug-info.	2024-02-08 11:49:04 +00:00
Jeffrey Byrnes	f709fbb1bb	[SROA] Only try additional vector type candidates when needed (#77678 ) `f9c2a341b9` causes regressions when we have a slice with integer vector type that is the same size as the partition, and a ptr load/store slice that is not the size of the element type. Ref `vector-promotion.ll:ptrLoadStoreTys`. Before the patch, we would only consider `<4 x i32>` as a candidate type for vector promotion, and would find that it is a viable type for all the slices. After the patch, we now add `<2 x ptr>` as a candidate type due to slice with user `store ptr %val0, ptr %obj, align 8` -- and flag that we `HaveVecPtrTy`. The pre-existing behavior of this flag results in removing the viable `<4 x i32>` and keeping only the unviable `<2 x ptr>`, which results in a failure to promote. The end result is failing to promote an alloca that was previously promoted -- this does not appear to be the intent of that patch, which has the goal of increasing promotions by providing more promotion opportunities. This PR preserves this behavior via a simple reorganization of the implemention: try first the slice types with same size as the partition, then, if there is no promotable type, try the `LoadStoreTys.`	2024-01-23 17:22:49 -08:00
Jeffrey Byrnes	2a61be4e4c	[SROA] NFC: Extract code to checkVectorTypesForPromotion Change-Id: Ib6f237cc791a097f8f2411bc1d6502f11d4a748e	2024-01-23 15:40:20 -08:00
Stephen Tozer	60e1c835d3	[RemoveDIs][DebugInfo] Update SROA to handle DPVAssigns (#78475 ) SROA needs to update llvm.dbg.assign intrinsics when it migrates debug info in response to alloca splitting; this patch updates the debug info migration code to handle DPVAssigns as well, making use of generic code to avoid duplication as much as possible.	2024-01-23 09:37:27 +00:00
Stephen Tozer	304119860a	[DebugInfo][RemoveDIs][NFC] Split findDbgDeclares into two functions (#77478 ) This patch follows on from comments on https://github.com/llvm/llvm-project/pull/73498, implementing the proposed split of findDbgDeclares into two separate functions for DbgDeclareInsts and DPVDeclares, which return containers rather than taking containers by reference.	2024-01-15 17:46:56 +00:00
Nikita Popov	6c2fbc3a68	[IRBuilder] Add CreatePtrAdd() method (NFC) (#77582 ) This abstracts over the common pattern of creating a gep with i8 element type.	2024-01-12 14:21:21 +01:00
Jannik Silvanus	7954c57124	[IR] Fix GEP offset computations for vector GEPs (#75448 ) Vectors are always bit-packed and don't respect the elements' alignment requirements. This is different from arrays. This means offsets of vector GEPs need to be computed differently than offsets of array GEPs. This PR fixes many places that rely on an incorrect pattern that always relies on `DL.getTypeAllocSize(GTI.getIndexedType())`. We replace these by usages of `GTI.getSequentialElementStride(DL)`, which is a new helper function added in this PR. This changes behavior for GEPs into vectors with element types for which the (bit) size and alloc size is different. This includes two cases: * Types with a bit size that is not a multiple of a byte, e.g. i1. GEPs into such vectors are questionable to begin with, as some elements are not even addressable. * Overaligned types, e.g. i16 with 32-bit alignment. Existing tests are unaffected, but a miscompilation of a new test is fixed. --------- Co-authored-by: Nikita Popov <github@npopov.com>	2024-01-04 10:08:21 +01:00
Nikita Popov	54067c5fbe	[SROA] Use memcpy if type size does not match store size The original memcpy also copies the padding, so make sure that this is still the case after splitting. Fixes https://github.com/llvm/llvm-project/issues/64081.	2023-12-22 10:19:22 +01:00
Kazu Hirata	4b4dcb4988	[Transforms] Fix a warning This patch fixes: llvm/lib/Transforms/Scalar/SROA.cpp:4855:9: error: unused variable 'NewAssign' [-Werror,-Wunused-variable]	2023-12-12 08:50:49 -08:00
Orlando Cazalet-Hyams	3d42557872	[RemoveDI] Handle DPValues in SROA (#74089 ) Handle dbg.declares in SROA using DPValues. In order to reduce duplication, the migrate-debug-info loop has been changed to a generic lambda with some helper function overloads, which is called for dbg.declares, dbg.assigns, and DPValues alike. The tests will become "live" once #74090 lands (see for more info).	2023-12-12 15:49:24 +00:00
Orlando Cazalet-Hyams	2d9d9a1a55	[NFC] Change FindDbgDeclareUsers interface to match findDbgUsers/values (#73498 ) This simplifies an upcoming patch to support the RemoveDIs project (tracking variable locations without using intrinsics). Next in this series is #73500.	2023-12-12 09:43:58 +00:00
Youngsuk Kim	c419f3e10e	[llvm][SROA] Replace calls to Type::getPointerTo (NFC) NFC cleanup towards removing method Type::getPointerTo. * Remove unnecessary call to Type::getPointerTo * Replace call to Type::getPointerTo with IRB.getPtrTy	2023-11-26 20:17:06 -06:00
Craig Topper	aba040182a	[IR] Replace uses of IRBuilder::getInt8PtrTy with getPtrTy. NFC (#73154 )	2023-11-22 12:24:18 -08:00
Bruno De Fraine	cd88047765	[NFC][SROA] Remove implementation details from SROA header (#72846 ) This moves the SROA implementation from SROAPass into a separate SROA class that is defined in the cpp file, and reduces the SROAPass class to a thin NewPM wrapper. This allows to remove all implementation details from the SROA header, and the SROALegacyPass can wrap the SROA class instead of the NewPM SROAPass. The trigger for this change is a GCC warning about visibility of implementation details in the SROA header after D138238. Credits to Nikita Popov for suggesting this reorganization.	2023-11-20 14:00:22 +01:00
Antonio Frighetto	ebbb9cdbb3	[SROA] Allow `llvm.launder.invariant.group` intrinsic to be splittable Let `llvm.launder.invariant.group` intrinsic as well as instructions operating on memory addresses, whose invariance may be broken by the intrinsic, to be rewritten. Fixes: https://github.com/llvm/llvm-project/issues/72035.	2023-11-16 17:56:34 +01:00
Nikita Popov	ed86e740ef	Revert "[SROA] Limit the number of allowed slices when trying to split allocas" This reverts commit e13e808283f7fd9e873ae922dd1ef61aeaa0eb4a. This causes performance regressions on GPU targets, see https://github.com/llvm/llvm-project/issues/69785. Revert the change for now.	2023-11-09 16:38:52 +01:00
Kazu Hirata	8a7f4eeb60	[llvm] Use llvm::is_contained (NFC)	2023-09-22 17:09:27 -07:00
Nikita Popov	57a554800b	[SROA] Don't shrink volatile load past end For volatile atomic, this may result in a verifier errors, if the new alloca type is not legal for atomic accesses. I've opted to disable this special case for volatile accesses in general, as changing the size of the volatile access seems dubious in any case. Fixes https://github.com/llvm/llvm-project/issues/64721.	2023-09-20 14:12:31 +02:00
Nikita Popov	ddf7cc27f7	[SROA] Remove unnecessary IsStorePastEnd handling (NFCI) Unlike the load case, stores past the end of the alloca are removed by SROA as undefined behavior. As such, there is no need to handle this case when rewriting stores.	2023-09-19 16:53:24 +02:00
Jeremy Morse	e54277fa10	[NFC][RemoveDIs] Use iterators over inst-pointers when using IRBuilder This patch adds a two-argument SetInsertPoint method to IRBuilder that takes a block/iterator instead of an instruction, and updates many call sites to use it. The motivating reason for doing this is given here [0], we'd like to pass around more information about the position of debug-info in the iterator object. That necessitates passing iterators around most of the time. [0] https://discourse.llvm.org/t/rfc-instruction-api-changes-needed-to-eliminate-debug-intrinsics-from-ir/68939 Differential Revision: https://reviews.llvm.org/D152468	2023-09-11 20:01:19 +01:00
Dhruv Chawla	e13e808283	[SROA] Limit the number of allowed slices when trying to split allocas This patch adds a hidden CLI option "--sroa-max-alloca-slices", which is an integer that controls the maximum number of alloca slices SROA can consider before bailing out. This is useful because it may not be profitable to split memcpys into (possibly tens of) thousands of loads/stores. This also prevents an issue with exponential compile time explosion in passes like DSE and MemCpyOpt caused by excessive alloca splitting. Fixes https://github.com/rust-lang/rust/issues/88580. Differential Revision: https://reviews.llvm.org/D159354	2023-09-09 11:00:47 +05:30
Zhongyunde	c570531c3d	[SROA] Skip uses of allocas where the type is scalable When visiting load and store instructions in SROA skip scalable vectors. This is relevant in the implementation of the 'arm_sve_vector_bits' attribute that is used to define VLS types, similar to D85725. Fix https://gcc.godbolt.org/z/o561P9zj4 Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D158631	2023-08-24 18:44:09 +08:00
Bjorn Pettersson	e53b28c833	[llvm] Drop some bitcasts and references related to typed pointers Differential Revision: https://reviews.llvm.org/D157551	2023-08-10 15:07:07 +02:00
Bjorn Pettersson	4ce7c4a92a	[llvm] Drop some typed pointer handling/bitcasts Differential Revision: https://reviews.llvm.org/D157016	2023-08-03 22:54:33 +02:00

1 2 3 4 5 ...

593 Commits