llvm-project

Author	SHA1	Message	Date
Yeting Kuo	0f8c761c48	[VP][RISCV] Recommit "Add vp.fshl/fshr and RISC-V support." This reverts commit 7883e5b061bdbbe8bee5f479ebe911db5045b7e9. The original commit was reverted because it didn't update test files after D136263 landed. The recommit fixes those. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D139509	2022-12-07 15:58:12 +08:00
Rahman Lavaee	6015a045d7	[Propeller] Use Fixed MBB ID instead of volatile MachineBasicBlock::Number. Let Propeller use specialized IDs for basic blocks, instead of MBB number. This allows optimizations not just prior to asm-printer, but throughout the entire codegen. This patch only implements the functionality under the new `LLVM_BB_ADDR_MAP` version, but the old version is still being used. A later patch will change the used version.
####Background
Today Propeller uses machine basic block (MBB) numbers, which already exist, to map native assembly to machine IR. This is done as follows.
- Basic block addresses are captured and dumped into the `LLVM_BB_ADDR_MAP` section just before the AsmPrinter pass which writes out object files. This ensures that we have a mapping that is close to assembly.
- Profiling mapping works by taking a virtual address of an instruction and looking up the `LLVM_BB_ADDR_MAP` section to find the MBB number it corresponds to.
- While this works well today, we need to do better when we scale Propeller to target other Machine IR optimizations like spill code optimization. Register allocation happens earlier in the Machine IR pipeline and we need an annotation mechanism that is valid at that point.
- The current scheme will not work in this scenario because the MBB number of a particular basic block is not fixed and changes over the course of codegen (via renumbering, adding, and removing the basic blocks).
- In other words, the volatile MBB numbers do not provide a one-to-one correspondence throughout the lifetime of Machine IR. Profile annotation using MBB numbers is restricted to a fixed point; only valid at the exact point where it was dumped.
- Further, the object file can only be dumped before AsmPrinter and cannot be dumped at an arbitrary point in the Machine IR pass pipeline.
Hence, MBB numbers are not suitable and we need something else.
####Solution
We propose using fixed unique incremental MBB IDs for basic blocks instead of volatile MBB numbers. These IDs are assigned upon the creation of machine basic blocks. We modify `MachineFunction::CreateMachineBasicBlock` to assign the fixed ID to every newly created basic block. It assigns `MachineFunction::NextMBBID` to the MBB ID and then increments it, which ensures having unique IDs. To ensure correct profile attribution, multiple equivalent compilations must generate the same Propeller IDs. This is guaranteed as long as the MachineFunction passes run in the same order. Since the `NextBBID` variable is scoped to `MachineFunction`, interleaving of codegen for different functions won't cause any inconsistencies. The new encoding is generated under the new version number 2 and we keep backward-compatibility with older versions.
####Impact on Size of the `LLVM_BB_ADDR_MAP` Section
Emitting the Propeller ID results in a 23% increase in the size of the `LLVM_BB_ADDR_MAP` section for the clang binary. Reviewed By: tmsriram Differential Revision: https://reviews.llvm.org/D100808	2022-12-06 22:50:09 -08:00
Kazu Hirata	934942c033	[llvm] Don't include Optional.h (NFC) These source files no longer use Optional<T>, so they do not need to include Optional.h. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-06 22:34:50 -08:00
Kazu Hirata	7883e5b061	Revert "[VP][RISCV] Add vp.fshl/fshr and RISC-V support." This reverts commit 70de0e014013b4d97febe6704881a9a8c893d078. I'm seeing: Failed Tests (2): LLVM :: CodeGen/RISCV/rvv/fixed-vectors-fshr-fshl-vp.ll LLVM :: CodeGen/RISCV/rvv/fshr-fshl-vp.ll Also reported at: https://lab.llvm.org/buildbot/#/builders/123/builds/14531	2022-12-06 22:27:43 -08:00
Yeting Kuo	70de0e0140	[VP][RISCV] Add vp.fshl/fshr and RISC-V support. The patch made VectorLegalizer expand ISD::VP_FSHL and ISD::VP_FSHR to achieve the codegen. Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D138379	2022-12-07 12:16:36 +08:00
Kazu Hirata	405fc404bf	[ADT] Don't including None.h (NFC) These source files no longer use None, so they do not need to include None.h. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-06 20:14:51 -08:00
Kazu Hirata	d8c00c4f63	[llvm] Don't include STLForwardCompat.h (NFC) STLForwardCompat.h defines remove_cvref and remove_cvref_t. These source files use neither one of those.	2022-12-06 20:09:56 -08:00
Gregory Alfonso	cb38be9ed3	[NFC] Use Register instead of unsigned for variables that receive a Register object Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D139451	2022-12-07 00:23:34 +00:00
Krzysztof Parzyszek	c589730ad5	[YAML] Convert Optional to std::optional	2022-12-06 12:49:32 -08:00
Sanjay Patel	adc7c589c3	[SDAG] try to convert bit set/clear to signbit test when trunc is free (X & Pow2MaskC) == 0 --> (trunc X) >= 0 (X & Pow2MaskC) != 0 --> (trunc X) < 0 This was noted as a regression in the post-commit feedback for D112634 (where we canonicalized IR differently). For x86, this saves a few instruction bytes. AArch64 seems neutral. Differential Revision: https://reviews.llvm.org/D139363	2022-12-06 11:34:48 -05:00
Tobias Hieta	2298a44ccd	[CodeView] Add support for local S_CONSTANT records CodeView doesn't have the ability to represent variables in other ways than as in registers or memory values, but LLVM very often transforms simple values into constants. Consider this program: int f () { int i = 123; return i; } LLVM will transform `i` into a constant value and just leave behind a llvm.dbg.value; this can't be represented as an S_LOCAL record in CodeView. But we can represent it as an S_CONSTANT record. This patch checks if the location of a debug value is null, then we will insert an S_CONSTANT record instead of an S_LOCAL value with the flag "OptimizedAway". In lld we then output the S_CONSTANT in the right scope. Previously they were always inserted in the global stream; now we check the scope before inserting them. This has shown to improve debugging for our developers internally. Fixes llvm/llvm-project#55958 Reviewed By: aganea Differential Revision: https://reviews.llvm.org/D138995	2022-12-06 10:34:01 +01:00
Philip Reames	186c192261	[SDAG] Allow scalable vectors in SimplifyDemanded routines This is a continuation of the series of patches adding lane wise support for scalable vectors in various knownbit-esq routines. The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations. Differential Revision: https://reviews.llvm.org/D137190	2022-12-05 12:42:16 -08:00
Jonas Paulsson	5ecd363295	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." This reverts commit 122efef8ee9be57055d204d52c38700fe933c033. - Patch fixed to not reuse definitions from predecessors in EH landing pads. - Late review suggestions (by MaskRay) have been addressed. - M68k/pipeline.ll test updated. - Init captures added in processBlock() to avoid capturing structured bindings. - RISCV has this disabled for now. Original commit message: A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-05 12:53:50 -06:00
Matt Arsenault	47c68904a5	DAG: ComputeNumSignBits from load range metadata The cases where the result type doesn't match the range type are inadequately tested, but I'm not sure how to write such a test. During the pre-legalize combine, any obviously optimizable code gets handled so it's harder to test legalized extloads.	2022-12-05 11:57:13 -05:00
Philip Reames	7969ab85e0	[SDAG] Allow scalable vectors in ComputeKnownBits (try 2) This was previously reverted due to a hang on a Hexagon bot. This turned out to be a bug in the Hexagon backend around how splat_vectors are legalized (which they're using for fixed length vectors!). I adjusted this patch to remove the implicit truncate support. This hides the hexagon bug for now, and unblocks the rest of the change. Original commit message: This is the SelectionDAG equivalent of D136470, and is thus an alternate patch to D128159. The basic idea here is that we track a single lane for scalable vectors which corresponds to an unknown number of lanes at runtime. This is enough for us to perform lane wise reasoning on many arithmetic operations. This patch also includes an implementation for SPLAT_VECTOR as without it, the lane wise reasoning has no base case. The original patch which inspired this (D128159), also included STEP_VECTOR. I plan to do that as a separate patch. Differential Revision: https://reviews.llvm.org/D137140	2022-12-05 08:52:37 -08:00
Dmitry Vyukov	dbe8c2c316	Use-after-return sanitizer binary metadata Currently per-function metadata consists of: (start-pc, size, features) This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size) Reviewed By: melver Differential Revision: https://reviews.llvm.org/D136078	2022-12-05 14:40:31 +01:00
Vladislav Dzhidzhoev	f32cafedf0	[GlobalISel][DebugInfo] Propagate debug location for localized constants After the IRTranslator pass, constants are deduplicated and translated into instructions in the entry block, losing their debug locations. Localization of constants may cause emission of extra zero lines in the debug_line section, as in https://godbolt.org/z/ecvsxxfKn. In this example, the constant is placed as the first instruction in the entry block, and although it has no debug location, AsmPrinter emits a zero line for it. If a localized constant has only one user, we can assume that it has the same debug location as its user, since they are placed consecutively. Differential Revision: https://reviews.llvm.org/D128192	2022-12-05 16:38:24 +03:00
Fangrui Song	a996cc217c	Remove unused #include "llvm/ADT/Optional.h"	2022-12-05 06:31:11 +00:00
Fangrui Song	89fae41ef1	[IR] llvm::Optional => std::optional Many llvm/IR/* files have been migrated by other contributors. This migrates most remaining files.	2022-12-05 04:13:11 +00:00
Kazu Hirata	595f1a6aaf	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 19:47:13 -08:00
Kazu Hirata	9f252e5567	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 17:31:17 -08:00
Kazu Hirata	3c09ed006a	[llvm] Use std::nullopt instead of None in comments (NFC) This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 17:12:44 -08:00
Fangrui Song	89fab98e88	[DebugInfo] llvm::Optional => std::optional https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-05 00:09:22 +00:00
Jonas Paulsson	122efef8ee	Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" This reverts commit 17db0de330f943833296ae72e26fa988bba39cb3. Some more bots got broken - need to investigate.	2022-12-05 00:52:00 +01:00
Fangrui Song	b0df70403d	[Target] llvm::Optional => std::optional The updated functions are mostly internal with a few exceptions (virtual functions in TargetInstrInfo.h, TargetRegisterInfo.h). To minimize changes to LLVMCodeGen, GlobalISel files are skipped. https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 22:43:14 +00:00
Fangrui Song	f4c16c4473	[MC] llvm::Optional => std::optional https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 21:36:08 +00:00
Fangrui Song	4e62072ca1	[Passes] llvm::Optional => std::optional	2022-12-04 20:44:52 +00:00
Krzysztof Parzyszek	f3b6dbfda8	Instructions: convert Optional to std::optional	2022-12-04 14:25:11 -06:00
Krzysztof Parzyszek	0ca43d4488	DebugInfoMetadata: convert Optional to std::optional	2022-12-04 11:52:02 -06:00
Benjamin Kramer	fcf4e360ba	Iterate over StringMaps using structured bindings. NFCI.	2022-12-04 18:36:41 +01:00
Jonas Paulsson	17db0de330	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.	2022-12-03 14:15:15 -06:00
Krzysztof Parzyszek	ab672e9173	FPEnv: convert Optional to std::optional	2022-12-03 13:55:56 -06:00
Fangrui Song	bac974278c	CodeGen/CommandFlags: Convert Optional to std::optional	2022-12-03 18:38:12 +00:00
Krzysztof Parzyszek	8c7c20f033	Convert Optional<CodeModel> to std::optional<CodeModel>	2022-12-03 12:08:47 -06:00
David Green	16a72a0f87	[AArch64] Enable the select optimize pass for AArch64 This enables the select optimize pass for Arm out-of-order AArch64 cores. It is trying to solve a problem that is difficult for the compiler to fix. The criteria for when a csel is better or worse than a branch depend heavily on whether the branch is well predicted and the amount of ILP in the loop (as well as other criteria like the core in question and the relative performance of the branch predictor). The pass seems to do a decent job though, with the inner loop heuristics being well implemented and doing a better job than I had expected in general, even without PGO information. I've been doing quite a bit of benchmarking. The headline numbers are these for SPEC2017 on a Neoverse N1:
500.perlbench_r -0.12%
502.gcc_r 0.02%
505.mcf_r 6.02%
520.omnetpp_r 0.32%
523.xalancbmk_r 0.20%
525.x264_r 0.02%
531.deepsjeng_r 0.00%
541.leela_r -0.09%
548.exchange2_r 0.00%
557.xz_r -0.20%
Running benchmarks with a combination of the llvm-test-suite plus several versions of SPEC gave between a 0.2% and 0.4% geomean improvement depending on the core/run. The instruction count went down by 0.1% too, which is a good sign, but the results can be a little noisy. Some issues from other benchmarks I had run were improved in rGca78b5601466f8515f5f958ef8e63d787d9d812e. In summary, well predicted branches will see an improvement, badly predicted branches may get worse, and on average performance seems to be a little better overall. This patch enables the pass for AArch64 under -O3 for cores that will benefit from it, i.e. not in-order cores that do not fit into the "Assume infinite resources that allow to fully exploit the available instruction-level parallelism" cost model. It uses a subtarget feature for specifying when the pass will be enabled, which I have enabled under cpu=generic as the performance increases for out-of-order cores seem larger than any decreases for in-order cores, which were minor.
Differential Revision: https://reviews.llvm.org/D138990	2022-12-03 16:08:58 +00:00
Kazu Hirata	998960ee1f	[CodeGen] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:08 -08:00
Jan Svoboda	abf0c6c0c0	Use CTAD on llvm::SaveAndRestore Reviewed By: dblaikie Differential Revision: https://reviews.llvm.org/D139229	2022-12-02 15:36:12 -08:00
Fangrui Song	ca23b7ca47	[AsmPrinter] .addrsig_sym: remove isTransitiveUsedByMetadataOnly With D135642 ignoring unregistered symbols, isTransitiveUsedByMetadataOnly added by D101512 is no longer needed (the operation is potentially slow). There is a `.addrsig_sym` directive for an only-used-by-metadata symbol but it does not emit an entry. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D138362	2022-12-02 19:05:43 +00:00
Sanjay Patel	0037e21f28	[SDAG] bail out of mergeTruncStores() if there's any other use in the chain This fixes the miscompile in issue #58883. The test demonstrates that we gave up on store merging in that example. This change should be strictly safe (just adds another clause to avoid the transform), and it does not prohibit any existing valid optimizations based on regression tests. I want to believe that it's also a sufficient fix (possibly overkill), but I'm not sure how to prove that. Differential Revision: https://reviews.llvm.org/D137791	2022-12-02 10:08:19 -05:00
Florian Hahn	63150f4639	Revert "Enhance stack protector for calling no return function" This reverts commit 416e8c6ad529c57f21f46c6f52ded96d3ed239fb. This commit causes a test failure with expensive checks due to a DT verification failure. Revert to bring bot back to green: https://green.lab.llvm.org/green/job/clang-stage1-cmake-RA-expensive/24249/testReport/junit/LLVM/CodeGen_X86/stack_protector_no_return_ll/ + /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-expensive/clang-build/bin/llc /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-expensive/llvm-project/llvm/test/CodeGen/X86/stack-protector-no-return.ll -mtriple=x86_64-unknown-linux-gnu -o - + /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-expensive/clang-build/bin/FileCheck /Users/buildslave/jenkins/workspace/clang-stage1-cmake-RA-expensive/llvm-project/llvm/test/CodeGen/X86/stack-protector-no-return.ll DominatorTree is different than a freshly computed one! Current: =============================-------------------------------- Inorder Dominator Tree: DFSNumbers invalid: 0 slow queries. [1] %entry {4294967295,4294967295} [0] [2] %unreachable {4294967295,4294967295} [1] [2] %lpad {4294967295,4294967295} [1] [3] %invoke.cont {4294967295,4294967295} [2] [4] %invoke.cont2 {4294967295,4294967295} [3] [4] %SP_return3 {4294967295,4294967295} [3] [4] %CallStackCheckFailBlk2 {4294967295,4294967295} [3] [3] %lpad1 {4294967295,4294967295} [2] [4] %eh.resume {4294967295,4294967295} [3] [5] %SP_return6 {4294967295,4294967295} [4] [5] %CallStackCheckFailBlk5 {4294967295,4294967295} [4] [4] %terminate.lpad {4294967295,4294967295} [3] [5] %SP_return9 {4294967295,4294967295} [4] [5] %CallStackCheckFailBlk8 {4294967295,4294967295} [4] [2] %SP_return {4294967295,4294967295} [1] [2] %CallStackCheckFailBlk {4294967295,4294967295} [1] Roots: %entry	2022-12-02 12:58:46 +00:00
tentzen	db6a979ae8	Revert "[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2" This reverts commit 1a949c871ab4a6b6d792849d3e8c0fa6958d27f5.	2022-12-02 02:44:18 -08:00
tentzen	1a949c871a	[Windows SEH]: HARDWARE EXCEPTION HANDLING (MSVC -EHa) - Part 2 This patch is the Part-2 (BE LLVM) implementation of HW Exception handling. Part-1 (FE Clang) was committed in 797ad701522988e212495285dade8efac41a24d4. This new feature adds the support of Hardware Exception for Microsoft Windows SEH (Structured Exception Handling).
Compiler options: For clang-cl.exe, the option is -EHa, the same as MSVC. For clang.exe, the extra option is -fasync-exceptions, plus -triple x86_64-windows -fexceptions and -fcxx-exceptions as usual. NOTE: Without the -EHa or -fasync-exceptions, this patch is a NO-DIFF change.
The rules for C code: For C-code, one way (MSVC approach) to achieve SEH -EHa semantic is to follow three rules: First, no exception can move in or out of _try region, i.e., no "potential faulty instruction" can be moved across _try boundary. Second, the order of exceptions for instructions 'directly' under a _try must be preserved (not applied to those in callees). Finally, global states (local/global/heap variables) that can be read outside of _try region must be updated in memory (not just in register) before the subsequent exception occurs.
The impact to C++ code: Although SEH is a feature for C code, -EHa does have a profound effect on C++ side. When a C++ function (in the same compilation unit with option -EHa) is called by a SEH C function, a hardware exception that occurs in C++ code can also be handled properly by an upstream SEH _try-handler or a C++ catch(...). As such, when that happens in the middle of an object's life scope, the dtor must be invoked the same way as C++ Synchronous Exception during unwinding process.
Design: A natural way to achieve the rules above in LLVM today is to allow an EH edge added on memory/computation instruction (previous iload/istore idea) so that exception path is modeled in Flow graph precisely.
However, tracking every single memory instruction and potential faulty instruction can create many Invokes, complicate flow graph and possibly result in negative performance impact for downstream optimization and code generation. Making all optimizations be aware of the new semantic is also substantial. This design does not intend to model exception path at instruction level. Instead, the proposed design tracks and reports EH state at BLOCK-level to reduce the complexity of flow graph and minimize the performance-impact on CPP code under -EHa option.
One key element of this design is the ability to compute State number at block-level. Our algorithm is based on the following rationales: A _try scope is always a SEME (Single Entry Multiple Exits) region as jumping into a _try is not allowed. The single entry must start with a seh_try_begin() invoke with a correct State number that is the initial state of the SEME. Through control-flow, state number is propagated into all blocks. Side exits marked by seh_try_end() will unwind to parent state based on existing SEHUnwindMap[]. Note side exits can ONLY jump into parent scopes (lower state number). Thus, when a block succeeds various states from its predecessors, the lowest State trumps the others. If some exits flow to unreachable, propagation on those paths terminates, not affecting remaining blocks.
For CPP code, object lifetime region is usually a SEME as SEH _try. However there is one rare exception: jumping into a lifetime that has Dtor but has no Ctor is warned, but allowed:
Warning: jump bypasses variable with a non-trivial destructor
In that case, the region is actually a MEME (multiple entry multiple exits). Our solution is to inject an eha_scope_begin() invoke in the side entry block to ensure a correct State.
Implementation: Part-1: Clang implementation (already in): Please see commit 797ad701522988e212495285dade8efac41a24d4. Part-2: LLVM implementation described below.
For both C++ & C-code, the state of each block is computed at the same place in BE (WinEHPreparing pass) where all other EH tables/maps are calculated. In addition to _scope_begin & _scope_end, the computation of block state also relies on the existing State tracking code (UnwindMap and InvokeStateMap). For both C++ & C-code, the state of each block with potential trap instruction is marked and reported in DAG Instruction Selection pass, the same place where the state for -EHsc (synchronous exceptions) is done. If the first instruction in a reported block scope can trap, a Nop is injected before this instruction. This nop is needed to accommodate LLVM Windows EH implementation, in which the address in IPToState table is offset by +1. (Note the purpose of that is to ensure the return address of a call is in the same scope as the call address.) The handler for catch(...) for -EHa must handle HW exception. So its 'adjective' flag is reset (it cannot be IsStdDotDot (0x40) that only catches C++ exceptions). Suppress push/popTerminate() scope (from noexcept/nothrow) so that HW exceptions can be passed through. Original llvm-dev [RFC] discussions can be found in these two threads below: https://lists.llvm.org/pipermail/llvm-dev/2020-March/140541.html https://lists.llvm.org/pipermail/llvm-dev/2020-April/141338.html Differential Revision: https://reviews.llvm.org/D102817/new/	2022-12-01 23:44:25 -08:00
Vasileios Porpodas	4e30c3ddf0	[NFC] Cleanup: Replaces BB->getInstList().erase() with BB->erase(). This is part of a series of cleanup patches towards making BasicBlock::getInstList() private. Differential Revision: https://reviews.llvm.org/D139143	2022-12-01 18:19:23 -08:00
Krzysztof Parzyszek	864aaa21b4	TargetLowering: convert Optional to std::optional	2022-12-01 16:19:10 -08:00
Krzysztof Parzyszek	467432899b	MemoryLocation: convert Optional to std::optional	2022-12-01 15:36:20 -08:00
Mitch Phillips	850defb861	Add assembler plumbing for sanitize_memtag Extends the Asm reader/writer to support reading and writing the '.memtag' directive (including allowing it on internal global variables). Also add some extra tooling support, including objdump and yaml2obj/obj2yaml. Test that the sanitize_memtag IR attribute produces the expected asm directive. Uses the new Aarch64 MemtagABI specification (https://github.com/ARM-software/abi-aa/blob/main/memtagabielf64/memtagabielf64.rst) to identify symbols as tagged in object files. This is done using a R_AARCH64_NONE relocation that identifies each tagged symbol, and these relocations are tagged in a special SHT_AARCH64_MEMTAG_GLOBALS_STATIC section. This signals to the linker that the global variable should be tagged. Reviewed By: fmayer, MaskRay, peter.smith Differential Revision: https://reviews.llvm.org/D128958	2022-12-01 10:50:34 -08:00
ZHU Zijia	010a8f7a90	[CodeGen] Fix restore blocks' BasicBlock information in branch relaxation In branch relaxation pass, restore blocks are created and placed before the jump destination if indirect branches are required. For example: foo sd s11, 0(sp) jump .restore, s11 bar bar bar j .dest .restore: ld s11, 0(sp) .dest: baz The BasicBlock information of the restore MachineBasicBlock should be identical to the dest MachineBasicBlock. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D131863	2022-12-02 02:42:22 +08:00
Jonas Paulsson	8ef4632681	Revert "[CodeGen] Add new pass for late cleanup of redundant definitions." Temporarily revert and fix buildbot failure. This reverts commit 6d12599fd4134c1da63198c74a25490d28c733f6.	2022-12-01 13:29:24 -05:00
Jonas Paulsson	6d12599fd4	[CodeGen] Add new pass for late cleanup of redundant definitions. A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by target in eliminateFrameIndex(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-01 13:21:35 -05:00
Freddy Ye	89f36dd8f3	[X86] Add ExpandLargeFpConvert Pass and enable for X86 As stated in https://discourse.llvm.org/t/rfc-llc-add-expandlargeintfpconvert-pass-for-fp-int-conversion-of-large-bitint/65528, this implementation is very similar to ExpandLargeDivRem, which expands ‘fptoui .. to’, ‘fptosi .. to’, ‘uitofp .. to’, ‘sitofp .. to’ instructions with a bitwidth above a threshold into auto-generated functions. This is useful for targets like x86_64 that cannot lower fp conversions with more than 128 bits. The expanded nodes are derived from the IR generated by `compiler-rt/lib/builtins/floattidf.c`, `compiler-rt/lib/builtins/fixdfti.c`, etc. Corner cases: 1. For fp16: there are no related builtins in compiler-rt, so I mainly used the fp32 <-> fp16 lib calls to implement it. 2. For fp80: this pass is soft fp emulation and no fp80 instructions can help with this problem. I recommend that users avoid this usage. For now, the implementation uses fp128 as the temporary conversion type and inserts fptrunc/ext at top/end of the function. 3. For bf16: as clang FE currently doesn't support bf16 arithmetic operations (convert to int, float, +, -, *, ...), this patch doesn't consider bf16 for now. 4. For unsigned FPToI: both default hardware behaviors and libgcc ignore the "returns 0 for negative input" spec, so this pass follows the same convention and ignores it for unsigned FPToI. See this example: https://gcc.godbolt.org/z/bnv3jqW1M The end-to-end tests are uploaded at https://reviews.llvm.org/D138261 Reviewed By: LuoYuanke, mgehre-amd Differential Revision: https://reviews.llvm.org/D137241	2022-12-01 13:47:43 +08:00
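
Note on commit adc7c589c3 ([SDAG] try to convert bit set/clear to signbit test when trunc is free): the rewrite `(X & Pow2MaskC) == 0 --> (trunc X) >= 0` can be sanity-checked outside the compiler. The sketch below is only an illustration, not code from the patch. It assumes the mask is the sign bit of the narrower type (here bit 7 of a 32-bit value, truncated to 8 bits), and the function names are hypothetical.

```cpp
#include <cassert>
#include <cstdint>
#include <initializer_list>

// Original form: test bit 7 of a 32-bit value with a power-of-two mask.
inline bool bit7ClearViaMask(uint32_t X) { return (X & 0x80u) == 0; }

// Rewritten form: truncate to 8 bits, then test the sign of the result.
// The unsigned-to-signed conversion wraps on all mainstream compilers
// (and is guaranteed to wrap since C++20).
inline bool bit7ClearViaTrunc(uint32_t X) {
  return static_cast<int8_t>(static_cast<uint8_t>(X)) >= 0;
}

// Compare both forms for every value of the low 16 bits, with two
// different settings of the high bits, which the truncation discards.
inline bool formsAgree() {
  for (uint32_t Low = 0; Low <= 0xFFFFu; ++Low)
    for (uint32_t High : {0x00000000u, 0xABCD0000u})
      if (bit7ClearViaMask(High | Low) != bit7ClearViaTrunc(High | Low))
        return false;
  return true;
}

// A few spot checks plus the sweep above.
inline void checkExamples() {
  assert(bit7ClearViaMask(0x7Fu) && bit7ClearViaTrunc(0x7Fu));
  assert(!bit7ClearViaMask(0x80u) && !bit7ClearViaTrunc(0x80u));
  assert(formsAgree());
}
```

Calling `checkExamples()` from a test driver exercises both forms. The `!= 0 --> < 0` direction from the commit message follows by negating both sides.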


33293 Commits