llvm-project

Author	SHA1	Message	Date
paperchalice	7ac7d418ac	[NewPM][NVPTX] Add NVPTXPassRegistry.def NFCI (#86246) Prepare for dag-isel migration.	2024-03-23 11:20:18 +08:00
Rishabh Bali	fe42e72db2	[CodeGen] Port AtomicExpand to new Pass Manager (#71220) Port the `atomicexpand` pass to the new Pass Manager. Fixes #64559	2024-02-25 18:42:22 +05:30
paperchalice	ffb1f20e0d	[CodeGen] Add flag to populate target pass names (#76328) `print-pipeline-passes` can show target pass names.	2024-01-03 09:07:02 +08:00
Christian Sigg	5b7a7ec5a2	[NVPTX] Fix code generation for `trap-unreachable`. (#67478) https://reviews.llvm.org/D152789 added an `exit` op before each `unreachable`. This means we never get to the `trap` instruction. This change limits the insertion of `exit` instructions to the cases where `unreachable` is not lowered to `trap`. Trap itself is changed to be emitted as `trap; exit;` to convey to `ptxas` that it exits the CFG.	2023-10-01 07:59:24 +02:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Artem Belevich	a8821c8b1c	[NVPTX] added a hidden option to control NVPTXLowerUnreachable pass. We've run into an issue where the pass breaks a handful of our internal tests and need a way to temporarily disable the pass while we're investigating. Differential Revision: https://reviews.llvm.org/D154106	2023-06-30 08:52:23 -07:00
Tim Besard	1ee4d880e8	NVPTX: Lower unreachable to exit to allow ptxas to accurately reconstruct the CFG. PTX does not have a notion of `unreachable`, which results in emitted basic blocks having an edge to the next block: ``` block1: call @does_not_return(); // unreachable block2: // ptxas will create a CFG edge from block1 to block2 ``` This may result in significant changes to the control flow graph, e.g., when LLVM moves unreachable blocks to the end of the function. That's a problem in the context of divergent control flow, as `ptxas` uses the CFG to determine divergent regions, while some instructions may not be executed divergently. For example, `bar.sync` is not allowed to be executed divergently on Pascal or earlier. If we start with the following: ``` entry: // start of divergent region @%p0 bra cont; @%p1 bra unlikely; ... bra.uni cont; unlikely: ... // unreachable cont: // end of divergent region bar.sync 0; bra.uni exit; exit: ret; ``` it is transformed by the branch-folder and block-placement passes to: ``` entry: // start of divergent region @%p0 bra cont; @%p1 bra unlikely; ... bra.uni cont; cont: bar.sync 0; bra.uni exit; unlikely: ... // unreachable exit: // end of divergent region ret; ``` After moving the `unlikely` block to the end of the function, it has an edge to the `exit` block, which widens the divergent region and makes the `bar.sync` instruction happen divergently. That causes wrong computations, as we've been running into for years with Julia code (which emits a lot of `trap` + `unreachable` code all over the place). To work around this, add an `exit` instruction before every `unreachable`, as `ptxas` understands that exit terminates the CFG. Note that `trap` is not equivalent, and only future versions of `ptxas` will model it like `exit`. Another alternative would be to emit a branch to the block itself, but emitting `exit` seems like a cleaner solution to represent `unreachable` to me.
Also note that this may not be sufficient, as it's possible that the block with unreachable control flow is branched to from different divergent regions, e.g. after block merging, in which case it may still be the case that `ptxas` could reconstruct a CFG where divergent regions are merged (I haven't confirmed this, but also haven't encountered this pattern in the wild yet): ``` entry: // start of divergent region 1 @%p0 bra cont1; @%p1 bra unlikely; bra.uni cont1; cont1: // intended end of divergent region 1 bar.sync 0; // start of divergent region 2 @%p2 bra cont2; @%p3 bra unlikely; bra.uni cont2; cont2: // intended end of divergent region 2 bra.uni exit; unlikely: ... exit; exit: // possible end of merged divergent region? ``` I originally tried to avoid the above by cloning paths towards `unreachable` and splitting the outgoing edges, but that quickly became too complicated. I propose we go with the simple solution first, also because modern GPUs with more flexible hardware thread schedulers don't even suffer from this issue. Finally, although I expect this to fix most of https://bugs.llvm.org/show_bug.cgi?id=27738, I do still encounter miscompilations with Julia's unreachable-heavy code when targeting these older GPUs using an older `ptxas` version (specifically, from CUDA 11.4 or below). This is likely due to related bugs in `ptxas` which have been fixed since, as I have filed several reproducers with NVIDIA over the past couple of years. I'm not inclined to look into fixing those issues over here, and will instead be recommending our users to upgrade CUDA to 11.5+ when using these GPUs. Also see: - https://github.com/JuliaGPU/CUDAnative.jl/issues/4 - https://github.com/JuliaGPU/CUDA.jl/issues/1746 - https://discourse.llvm.org/t/llvm-reordering-blocks-breaks-ptxas-divergence-analysis/71126 Reviewed By: jdoerfert, tra Differential Revision: https://reviews.llvm.org/D152789	2023-06-21 11:40:31 -07:00
Artem Belevich	3d4964f494	[NVPTX] add new sm90-specific intrinsics. Differential Revision: https://reviews.llvm.org/D151009	2023-05-25 11:57:55 -07:00
Joseph Huber	f05ce9045a	[NVPTX] Add NVPTXCtorDtorLoweringPass to handle global ctors / dtors This patch mostly adapts the existing AMDGPUCtorDtorLoweringPass for use by the Nvidia backend. This pass transforms the ctor / dtor list into a kernel call that can be used to invoke those functions. Furthermore, we emit globals such that the names and addresses of these constructor functions can be found by the driver. Unfortunately, since NVPTX has no way to emit variables at a named section, nor a functioning linker to provide the begin / end symbols, we need to mangle these names and have an external application find them. This work is related to the work in D149398 and D149340. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D149451	2023-05-04 07:13:00 -05:00
Bjorn Pettersson	21a6890856	[Vectorize] Clean up Transforms/Vectorize.h Removed definitions of vectorizeBasicBlock and VectorizeConfig (possibly a remnant from the BBVectorize pass that was removed way back in 2017). Also reduced amount of include dependencies to Transforms/Vectorize.h.	2023-04-17 13:54:19 +02:00
Pavel Kopyl	0c0387c7a5	[NVPTX] Port GenericToNVVM to the new PM. Differential Revision: https://reviews.llvm.org/D146345	2023-03-23 00:56:14 +01:00
Archibald Elliott	62c7f035b4	[NFC][TargetParser] Remove llvm/ADT/Triple.h I also ran `git clang-format` to get the headers in the right order for the new location, which has changed the order of other headers in two files.	2023-02-07 12:39:46 +00:00
Bjorn Pettersson	2dd221fe48	Remove no longer needed includes of LegacyPassManager.h Most of the removed includes should probably have been removed already when we removed TargetMachine::adjustPassManager.	2023-02-06 13:38:57 +01:00
Andrew Savonichev	ca50be8c89	[NVPTX] Implement NVPTX AliasAnalysis NVPTXAliasAnalysis extends the default AA to take pointer address spaces into account. The analysis assumes that pointers in different address spaces do not alias, unless one of them is generic (flat) address space. The patch also implements pointsToConstantMemory (via getModRefInfoMask) to expose semantic of the constant address space to the optimizer as discussed in D112466. Differential Revision: https://reviews.llvm.org/D124787	2023-02-01 16:16:43 +03:00
Artem Belevich	8db31e932d	[NVPTX] Do not addrspacecast AS-specific kernel arguments. Fixes https://github.com/llvm/llvm-project/issues/46954 The assumption that generic pointers passed to a CUDA kernel point to global memory is CUDA-specific and should not be applied to non-CUDA compilations. Addrspacecasts to global AS and back should never be applied to AS-specific pointers. In order to make tests actually do the testing for non-CUDA compilation, we need to get TargetMachine from the TargetPassConfig, instead of passing it explicitly as a pass constructor argument. Differential Revision: https://reviews.llvm.org/D142581	2023-01-26 11:29:20 -08:00
Luke Drummond	d9c50cc984	[NFC][NVPTX] Move a comment back to its proper place The comment introduced in b94bd05b952a5 was misplaced during f14af1621942 and no longer comments on the relevant bit of code; move it back so it makes sense.	2023-01-05 13:01:28 +00:00
Luke Drummond	6aa9cfb13f	[NVPTX] Replace PTX's ManagedStringPool with StringSaver In use ManagedStringPool caused a lot of heap allocations. At least one for every register name lookup in NVPTXTargetRegisterInfo and one for every symbol lookup in the target machine and isel lowering. There already exists an llvm/Support string interning-class that has better memory performance. Use LLVM's and delete ManagedStringPool which was unique to PTX llc Binary Size (.text only; bss and data were unchanged): MinsizeRel: Before: 31219884 After: 31219796 Release: Before: 42961872 After: 42960656 Total heap allocations by the NVPTX string saving code running check-llvm-codegen-nvptx Total bytes allocated: Before: 2431825 After: 2288151 (All numbers on x86-64-linux-gnu / gcc-12 / lld14) I didn't see obvious time differences when running the tests. Reviewers: tra, avasonic Differential Revision: https://reviews.llvm.org/D140704	2023-01-04 11:28:39 +00:00
Nick Desaulniers	19a004b468	[llvm][SelectionDAGISel] support -{start|stop}-{before|after}= for remaining targets Follow up to the series: 1. https://reviews.llvm.org/D140161 2. https://reviews.llvm.org/D140349 3. https://reviews.llvm.org/D140331 4. https://reviews.llvm.org/D140323 Completes the work from the previous two for remaining targets. This creates the following named passes that can be run via `llc -{start|stop}-{before|after}`: - arc-isel - arm-isel - avr-isel - bpf-isel - csky-isel - hexagon-isel - lanai-isel - loongarch-isel - m68k-isel - msp430-isel - mips-isel - nvptx-isel - ppc-codegen - riscv-isel - sparc-isel - systemz-isel - ve-isel - wasm-isel - xcore-isel A nice way to write tests for SelectionDAGISel might be to use a RUN: line like: llc -mtriple=<triple> -start-before=<arch>-isel -stop-after=finalize-isel -o - Fixes: https://github.com/llvm/llvm-project/issues/59538 Reviewed By: asb, zixuan-wu Differential Revision: https://reviews.llvm.org/D140364	2022-12-21 13:25:15 -08:00
Matt Arsenault	69e75ae695	CodeGen: Don't lazily construct MachineFunctionInfo This fixes what I consider to be an API flaw I've tripped over multiple times. The point this is constructed isn't well defined, so depending on where this is first called, you can conclude different information based on the MachineFunction. For example, the AMDGPU implementation inspected the MachineFrameInfo on construction for the stack objects and if the frame has calls. This kind of worked in SelectionDAG which visited all allocas up front, but broke in GlobalISel which hasn't visited any of the IR when arguments are lowered. I've run into similar problems before with the MIR parser and trying to make use of other MachineFunction fields, so I think it's best to just categorically disallow dependency on the MachineFunction state in the constructor and to always construct this at the same time as the MachineFunction itself. A missing feature I still could use is a way to access a custom analysis pass on the IR here.	2022-12-21 10:49:32 -05:00
Jonas Paulsson	5ecd363295	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." This reverts commit 122efef8ee9be57055d204d52c38700fe933c033. - Patch fixed to not reuse definitions from predecessors in EH landing pads. - Late review suggestions (by MaskRay) have been addressed. - M68k/pipeline.ll test updated. - Init captures added in processBlock() to avoid capturing structured bindings. - RISCV has this disabled for now. Original commit message: A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-05 12:53:50 -06:00
Jonas Paulsson	122efef8ee	Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" This reverts commit 17db0de330f943833296ae72e26fa988bba39cb3. Some more bots got broken - need to investigate.	2022-12-05 00:52:00 +01:00
Jonas Paulsson	17db0de330	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.	2022-12-03 14:15:15 -06:00
Fangrui Song	bac974278c	CodeGen/CommandFlags: Convert Optional to std::optional	2022-12-03 18:38:12 +00:00
Krzysztof Parzyszek	8c7c20f033	Convert Optional<CodeModel> to std::optional<CodeModel>	2022-12-03 12:08:47 -06:00
Jonas Paulsson	8ef4632681	Revert "[CodeGen] Add new pass for late cleanup of redundant definitions." Temporarily revert and fix buildbot failure. This reverts commit 6d12599fd4134c1da63198c74a25490d28c733f6.	2022-12-01 13:29:24 -05:00
Jonas Paulsson	6d12599fd4	[CodeGen] Add new pass for late cleanup of redundant definitions. A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-01 13:21:35 -05:00
Bjorn Pettersson	99c47d9e31	Remove TargetMachine::adjustPassManager Since opt no longer supports running the default (O0/O1/O2/O3/Os/Oz) pipelines using the legacy PM, there are no in-tree uses of TargetMachine::adjustPassManager remaining. This patch removes the no longer used adjustPassManager functions. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D137796	2022-11-28 10:24:16 +01:00
Shilei Tian	ecf5b78053	[NVPTX] Enable AtomicExpandPass for NVPTX This patch enables `AtomicExpandPass` for NVPTX. Depend on D125652. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D125639	2022-05-20 17:25:28 -04:00
Jameson Nash	c4b1a63a1b	mark getTargetTransformInfo and getTargetIRAnalysis as const Seems like this can be const, since Passes shouldn't modify it. Reviewed By: wsmoses Differential Revision: https://reviews.llvm.org/D120518	2022-02-25 14:30:44 -05:00
Michael Liao	bf225939bc	[InferAddressSpaces] Support assumed addrspaces from addrspace predicates. - CUDA cannot associate memory space with pointer types. Even though Clang could add extra attributes to specify the address space explicitly on a pointer type, it breaks the portability between Clang and NVCC. - This change proposes to assume the address space from a pointer from the assumption built upon target-specific address space predicates, such as `__isGlobal` from CUDA. E.g., ``` foo(float *p) { __builtin_assume(__isGlobal(p)); // From there, we could assume p is a global pointer instead of a // generic one. } ``` This makes the code portable without introducing the implementation-specific features. Note that NVCC starts to support __builtin_assume from version 11. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D112041	2021-11-08 16:51:57 -05:00
Artem Belevich	b6b7fe60a4	[NVPTX] Add a late SROA pass which allows optimizing away more allocas. Fixes performance regression https://bugs.llvm.org/show_bug.cgi?id=52037 Differential Revision: https://reviews.llvm.org/D111471	2021-10-19 16:18:28 -07:00
Jay Foad	012248b0bc	Remove the verifyAfter mechanism that was replaced by D111397 Differential Revision: https://reviews.llvm.org/D111872	2021-10-18 10:26:46 +01:00
Reid Kleckner	89b57061f7	Move TargetRegistry.(h|cpp) from Support to MC This moves the registry higher in the LLVM library dependency stack. Every client of the target registry needs to link against MC anyway to actually use the target, so we might as well move this out of Support. This allows us to ensure that Support doesn't have includes from MC/*. Differential Revision: https://reviews.llvm.org/D111454	2021-10-08 14:51:48 -07:00
Tarindu Jayatilaka	7a797b2902	Take OptimizationLevel class out of Pass Builder Pulled out the OptimizationLevel class from PassBuilder in order to be able to access it from within the PassManager and avoid include conflicts. Reviewed By: mtrofin Differential Revision: https://reviews.llvm.org/D107025	2021-07-29 21:57:23 -07:00
Arthur Eubanks	34a8a437bf	[NewPM] Hide pass manager debug logging behind -debug-pass-manager-verbose Printing pass manager invocations is fairly verbose and not super useful. This allows us to remove DebugLogging from pass managers and PassBuilder since all logging (aside from analysis managers) goes through instrumentation now. This has the downside of never being able to print the top level pass manager via instrumentation, but that seems like a minor downside. Reviewed By: ychen Differential Revision: https://reviews.llvm.org/D101797	2021-05-07 21:51:47 -07:00
William S. Moses	7aa3cad46a	[NVPTX] Enable lowering of atomics on local memory LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store. Differential Revision: https://reviews.llvm.org/D98650	2021-04-26 20:12:12 -04:00
William S. Moses	8ede96493c	Revert "[NVPTX] Enable lowering of atomics on local memory" This reverts commit fede99d386ec9e7bab2762aa16cb10c0513ae464.	2021-04-26 19:33:01 -04:00
William S. Moses	fede99d386	[NVPTX] Enable lowering of atomics on local memory LLVM does not have valid assembly backends for atomicrmw on local memory. However, as this memory is thread local, we should be able to lower this to the relevant load/store. Differential Revision: https://reviews.llvm.org/D98650	2021-04-26 19:27:27 -04:00
Arthur Eubanks	e84a4650eb	[NVPTX][NewPM] Re-enable NVVMReflectPass Disabled alongside NVVMIntrRangePass in https://reviews.llvm.org/D96166, but turns out NVVMIntrRangePass was the issue. Reviewed By: tra Differential Revision: https://reviews.llvm.org/D96291	2021-02-08 13:58:17 -08:00
Arthur Eubanks	526c0955c0	[NVPTX][NewPM] Temporarily disable NVPTX passes in new PM pipeline These passes are causing numerical discrepancies after being added to the pipeline. Disable while investigating. Reviewed By: rupprecht Differential Revision: https://reviews.llvm.org/D96166	2021-02-05 11:31:07 -08:00
Arthur Eubanks	9ccf13c36d	[NewPM][NVPTX] Port NVPTX opt passes There are only two used in the IR optimization pipeline. Port these and add them to the default pipeline. Similar to https://reviews.llvm.org/D93863. I added -mtriple to some tests since under the new PM, the passes are only available when the TargetMachine is specified. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D93930	2021-01-07 15:12:35 -08:00
Matt Arsenault	c9122ddef5	CodeGen: Refactor regallocator command line and target selection Make the sequence of passes to select and rewrite instructions to physical registers be a target callback. This is to prepare to allow targets to split register allocation into multiple phases.	2021-01-07 13:13:25 -05:00
Frederic Bastien	019ab61e25	[NVPTX, LSV] Move the LSV optimization pass to later when the graph is cleaner This allows it to recognize more loads as being consecutive when the loads' addresses are complex at the start. Differential Revision: https://reviews.llvm.org/D74444	2020-02-13 12:15:38 -08:00
Benjamin Kramer	adcd026838	Make llvm::StringRef to std::string conversions explicit. This is how it should've been and brings it more in line with std::string_view. There should be no functional change here. This is mostly mechanical from a custom clang-tidy check, with a lot of manual fixups. It uncovers a lot of minor inefficiencies. This doesn't actually modify StringRef yet, I'll do that in a follow-up.	2020-01-28 23:25:25 +01:00
Tom Stellard	0dbcb36394	CMake: Make most target symbols hidden by default Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l 36221 nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: merge_guards_bot, luismarques, smeenai, ldionne, lenary, s.egerton, pzheng, sameer.abuasal, MaskRay, wuzish, echristo, Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439	2020-01-14 19:46:52 -08:00
Jonas Devlieghere	0eaee545ee	[llvm] Migrate llvm::make_unique to std::make_unique Now that we've moved to C++14, we no longer need the llvm::make_unique implementation from STLExtras.h. This patch is a mechanical replacement of (hopefully) all the llvm::make_unique instances across the monorepo. llvm-svn: 369013	2019-08-15 15:54:37 +00:00
Tom Stellard	4b0b26199b	Revert CMake: Make most target symbols hidden by default This reverts r362990 (git commit 374571301dc8e9bc9fdd1d70f86015de198673bd) This was causing linker warnings on Darwin: ld: warning: direct access in function 'llvm::initializeEvexToVexInstPassPass(llvm::PassRegistry&)' from file '../../lib/libLLVMX86CodeGen.a(X86EvexToVex.cpp.o)' to global weak symbol 'void std::__1::__call_once_proxy<std::__1::tuple<void* (&)(llvm::PassRegistry&), std::__1::reference_wrapper<llvm::PassRegistry>&&> >(void*)' from file '../../lib/libLLVMCore.a(Verifier.cpp.o)' means the weak symbol cannot be overridden at runtime. This was likely caused by different translation units being compiled with different visibility settings. llvm-svn: 363028	2019-06-11 03:21:13 +00:00
Tom Stellard	374571301d	CMake: Make most target symbols hidden by default Summary: For builds with LLVM_BUILD_LLVM_DYLIB=ON and BUILD_SHARED_LIBS=OFF this change makes all symbols in the target specific libraries hidden by default. A new macro called LLVM_EXTERNAL_VISIBILITY has been added to mark symbols in these libraries public, which is mainly needed for the definitions of the LLVMInitialize* functions. This patch reduces the number of public symbols in libLLVM.so by about 25%. This should improve load times for the dynamic library and also make abi checker tools, like abidiff require less memory when analyzing libLLVM.so One side-effect of this change is that for builds with LLVM_BUILD_LLVM_DYLIB=ON and LLVM_LINK_LLVM_DYLIB=ON some unittests that access symbols that are no longer public will need to be statically linked. Before and after public symbol counts (using gcc 8.2.1, ld.bfd 2.31.1): nm before/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l 36221 nm after/libLLVM-9svn.so | grep ' [A-Zuvw] ' | wc -l 26278 Reviewers: chandlerc, beanz, mgorny, rnk, hans Reviewed By: rnk, hans Subscribers: Jim, hiraditya, michaelplatings, chapuni, jholewinski, arsenm, dschuff, jyknight, dylanmckay, sdardis, nemanjai, jvesely, nhaehnle, javed.absar, sbc100, jgravelle-google, aheejin, kbarton, fedor.sergeev, asb, rbar, johnrusso, simoncook, apazos, sabuasal, niosHD, jrtc27, zzheng, edward-jones, mgrang, atanasyan, rogfer01, MartinMosbeck, brucehoult, the_o, PkmX, jocewei, kristina, jsji, llvm-commits Tags: #llvm Differential Revision: https://reviews.llvm.org/D54439 llvm-svn: 362990	2019-06-10 22:12:56 +00:00
Richard Trieu	e8f83befd5	[NVPTX] Create a TargetInfo header. NFC Move the declarations of getThe<Name>Target() functions into a new header in TargetInfo and make users of these functions include this new header. This fixes a layering problem. llvm-svn: 360729	2019-05-14 23:56:18 +00:00
Matt Arsenault	cf55a657f0	CodeGen: Refactor regallocator command line and target selection This will allow targets more flexibility to replace the register allocator core passes. In a future commit, AMDGPU will run the core register assignment passes twice, and will also want to disallow using the standard -regalloc option. llvm-svn: 356506	2019-03-19 19:33:12 +00:00
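
The aliasing rule stated in ca50be8c89 ([NVPTX] Implement NVPTX AliasAnalysis) is that pointers in different address spaces do not alias unless one of them is in the generic (flat) address space. A minimal standalone sketch of that rule follows. It is an illustration only, not the code from the patch: the `AddrSpace` and `AliasResult` enums and the `aliasByAddrSpace` function are hypothetical names, and the sketch does not use LLVM's own types.

```cpp
#include <cassert>

// Hypothetical stand-ins for the NVPTX address spaces (not LLVM's enum).
enum class AddrSpace { Generic, Global, Shared, Const, Local };

// Hypothetical two-valued result; a real alias analysis has more states.
enum class AliasResult { NoAlias, MayAlias };

// Rule from the commit message: two pointers in different address spaces
// do not alias, unless one of them is generic. A generic pointer may refer
// to memory in any space, so the answer must stay conservative.
AliasResult aliasByAddrSpace(AddrSpace A, AddrSpace B) {
  if (A == AddrSpace::Generic || B == AddrSpace::Generic)
    return AliasResult::MayAlias;
  // Same specific space: address spaces alone cannot rule out aliasing.
  if (A == B)
    return AliasResult::MayAlias;
  // Two different specific spaces are disjoint.
  return AliasResult::NoAlias;
}
```

For example, `aliasByAddrSpace(AddrSpace::Global, AddrSpace::Shared)` yields `NoAlias`, while any query involving `AddrSpace::Generic` yields `MayAlias`. An analysis built on this rule would normally fall back to the default alias analysis whenever the answer is `MayAlias`.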


179 Commits