llvm-project

Author	SHA1	Message	Date
Julian Lettner	e6a789ef9b	Remove -lower-global-dtors-via-cxa-atexit flag Remove the `-lower-global-dtors-via-cxa-atexit` escape hatch introduced in D121736 [1], which switched the default lowering of global destructors on MachO to use `__cxa_atexit()` to avoid emitting deprecated `__mod_term_func` sections. I added this flag as an escape hatch in case the switch causes any problems. We didn't discover any problems so now we can remove it. [1] https://reviews.llvm.org/D121736 rdar://90277838 Differential Revision: https://reviews.llvm.org/D145715	2023-03-14 14:18:11 -07:00
Archibald Elliott	b189218d44	[ARM] Fix Chain/Glue Bug in PerformVMOVhrCombine In this optimisation, the Chain and Glue from the original CopyFromReg were being lost, which resulted in miscompiles. This fix just ensures that the input chains are correctly updated, and that any users are also updated with the new chain from the new CopyFromReg. Fixes #60510. Differential Revision: https://reviews.llvm.org/D143713	2023-03-06 11:55:54 +00:00
Archibald Elliott	c314667141	[ARM] Pre-Commit Tests for PR60510 Differential Revision: https://reviews.llvm.org/D143712	2023-03-06 11:55:53 +00:00
Archibald Elliott	20b2d11896	[ARM] Fix Crash in 't'/'w' handling without fp16/bf16 After https://reviews.llvm.org/rGff4027d152d0 and https://reviews.llvm.org/rG7d15212b8c0c we saw crashes in SelectionDAG when trying to use these constraints when you don't have the fp16 or bf16 extensions. However, it is still possible to move 16-bit floating point values into the right place in S registers with a normal `vmov`, even if we don't have fp16 instructions we can use within the inline assembly string. This patch therefore fixes the crash. I think the reason we weren't getting this crash before is because I think the __fp16 and __bf16 types got an error diagnostic in the Clang frontend when you didn't have the right architectural extensions to use them. This restriction was recently relaxed. The approach for bf16 needs a bit more explanation. Exactly how BF16 is legalized was changed in rGb769eb02b526e3966847351e15d283514c2ec767 - effectively, whether you have the right instructions to get a bf16 value into/out of a S register with MoveTo/FromHPR depends on hasFullFP16, but whether you use a HPR for a value of type MVT::bf16 depends on hasBF16. This is why the tests are not changed by `+bf16` vs `-bf16`, but I've left both sets of RUN lines in case this changes in the future. Test Changes: - Added more testing for testing inline asm (the core part) - fp16-promote.ll and pr47454.ll show improvements where unnecessary fp16-fp32 up/down-casts are no longer emitted. This results in fewer libcalls where those casts would be done with a libcall. - aes-erratum-fix.ll is fairly noisy, and I need to revisit this test so that the IR is more minimal than it is right now, because most of the changes in this commit do not relate to what AES is actually trying to verify. Differential Revision: https://reviews.llvm.org/D143711	2023-03-06 11:55:08 +00:00
Arthur Eubanks	773d663e47	[IPO] Remove various legacy passes These are part of the optimization pipeline, of which the legacy pass manager version is deprecated and being removed.	2023-02-27 19:06:08 -08:00
Nick Desaulniers	a3a84c9e25	[llvm] add CallBrPrepare pass to pipelines Capstone of https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Clang changes are still necessary to enable the use of outputs along indirect edges of asm goto statements. Link: https://github.com/llvm/llvm-project/issues/53562 Reviewed By: void Differential Revision: https://reviews.llvm.org/D140180	2023-02-16 17:58:34 -08:00
Samuel Parker	8f104a3f9a	[ARM] O3-pipeline fix	2023-02-13 11:01:00 +00:00
Tim Northover	c4ce967e34	ARM: skip debug instructions when matching jump-table patterns. When working out whether we can see a compressible jump-table pattern during ConstantIslands, we were stopping when we saw a debug instruction. Instead it's better to keep iterating backwards to the first real instruction. https://reviews.llvm.org/D142019	2023-02-10 12:27:59 +00:00
Andrew Savonichev	c65b4d64d4	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2023-02-09 18:45:20 +03:00
Anton Sidorenko	6820cb2dd5	[Test] Fix YAML mapping keys duplication. NFC. The YAML specification does not allow duplicate keys in a mapping. However, the YAML parser in LLVM does not check for this and uses only the last key entry. In this change, duplicated keys are merged to satisfy the spec. Differential Revision: https://reviews.llvm.org/D141848	2023-02-09 12:59:50 +03:00
Nick Desaulniers	07c7784d7b	[llvm][IfConversion] update successor list when merging INLINEASM_BR If this successor list is not correct, then branch-folding may incorrectly think that the indirect target is dead and remove it. This results in a dangling reference to the removed block as an operand to the INLINEASM_BR, which later will get AsmPrinted into code that doesn't assemble. This was made more obvious by, but is not a regression of https://reviews.llvm.org/D130316. Fixes: https://github.com/llvm/llvm-project/issues/60346 Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D142924	2023-02-07 10:28:11 -08:00
Nick Desaulniers	1cecfa407c	precommit test for pr60346 Link: https://github.com/llvm/llvm-project/issues/60346 Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D142923	2023-02-07 10:28:11 -08:00
Samuel Parker	7bff37783f	[SDAG] Check fminnum/fmaxnum for non-zero operand. Currently, in TargetLowering, if the target does not support fminnum, we lower to fminimum if neither operand could be a NaN. But this isn't quite correct because fminnum and fminimum treat +/-0 differently; so, we need to prove that one of the operands isn't a zero, or we don't have signed zeros. Differential Revision: https://reviews.llvm.org/D143256	2023-02-07 10:54:23 +00:00
Samuel Parker	a7de5c82bb	[NFC] minnum/maxnum intrinsic tests ARM and WebAssembly tests.	2023-02-07 10:47:40 +00:00
Samuel Parker	91f8289ff0	Revert "[DAGCombine] Fold redundant select" This reverts commit bbdf24357932b064f2aa18ea1356b474e0220dde.	2023-02-07 10:37:20 +00:00
Sanjay Patel	fb3e3ef62e	[SDAG] fix miscompiles caused by using ValueTracking matchSelectPattern to create FMINIMUM/FMAXIMUM ValueTracking attempts to match compare+select patterns to FP min/max operations, but it was created before the newer IEEE-754-2019 minimum/maximum ops were defined. I.e., matchSelectPattern() does not account for the -0.0/+0.0 behavior that is specified in the newer standard. FMINIMUM/FMAXIMUM nodes were created to map to the newer standard: /// FMINIMUM/FMAXIMUM - NaN-propagating minimum/maximum that also treat -0.0 /// as less than 0.0. While FMINNUM_IEEE/FMAXNUM_IEEE follow IEEE 754-2008 /// semantics, FMINIMUM/FMAXIMUM follow IEEE 754-2018 draft semantics. We could adjust ValueTracking to deal with signed zero, but it seems like a moot point given the divergent NaN behavior discussed in D143056, so just delete this possibility to avoid bugs when converting IR to SDAG. Differential Revision: https://reviews.llvm.org/D143106	2023-02-03 09:53:47 -05:00
Samuel Parker	bbdf243579	[DAGCombine] Fold redundant select If a chain of two selects share a true/false value and are controlled by two setcc nodes, that are never both true, we can fold away one of the selects. So, the following: (select (setcc X, const0, eq), Y, (select (setcc X, const1, eq), Z, Y)) Can be combined to: select (setcc X, const1, eq) Z, Y Differential Revision: https://reviews.llvm.org/D142535	2023-02-02 09:43:21 +00:00
Tim Northover	6e520fcf45	Revert "ARM: skip debug instructions when matching jump-table patterns." This reverts commit ce4fcea59e1d5829b4355b6401d7265be23f617a. I committed it accidentally.	2023-01-26 13:26:10 +00:00
Tim Northover	ce4fcea59e	ARM: skip debug instructions when matching jump-table patterns. When working out whether we can see a compressible jump-table pattern during ConstantIslands, we were stopping when we saw a debug instruction. Instead it's better to keep iterating backwards to the first real instruction.	2023-01-26 13:00:36 +00:00
Matt Arsenault	778cf5431c	IR: Add atomicrmw uinc_wrap and udec_wrap These are essentially add/sub 1 with a clamping value. AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use target intrinsics for these, but those do not carry the ordering and syncscope. Add these to atomicrmw so we can carry these and benefit from the regular legalization processes.	2023-01-24 17:55:11 -04:00
David Green	3770b4aa3c	[ARM] Don't emit Arm speculation hardening thunks under Thumb and vice-versa Given a patch like D129506, using instructions not valid for the current target feature set becomes an error. This means that emitting Arm instructions in a Thumb target (or vice versa) becomes an error. When running in Thumb mode only thumb thunks will be needed, and in Arm mode only arm thunks are needed. This patch limits the emitted thunks to just the ones valid for the current architecture. Differential Revision: https://reviews.llvm.org/D129693	2023-01-23 11:22:11 +00:00
Matt Arsenault	65420c8041	DAG: Use getNegatedExpression in combineMinNumMaxNum Computing the negated RHS expression just to see if it compares equal and throw it away feels dirty.	2023-01-23 06:07:23 -04:00
Matt Arsenault	3b80d02992	DAG: Look through fneg when trying to create unsafe minnum/maxnum This makes most sense for isFNegFree targets, but shouldn't make things worse without it. This avoids AMDGPU test regressions in a future patch. For some reason APFloat::compareAbsoluteValue is private, so compute the neg of the constants.	2023-01-23 06:07:22 -04:00
Matt Arsenault	0ee04a1e3c	ARM: Add baseline test for fneg + fcmp + select combine	2023-01-22 21:21:15 -04:00
Philip Reames	b3154d08e9	[ARM][AArch64] Switch to generic MEMBARRIER node This change switches both targets from using target specific CompilerBarrier nodes to the recently introduced generic MEMBARRIER instruction. A couple things to call out. First, this changes the assembly comment printed. I'm not sure this matters, but if it does, we can simply drop this patch. This is a minor clean up at best. Second, the ordering operand on the target instruction appears to be unused. We could easily add ordering to the generic instruction, but since we don't seem to have a motivating case in tree, I simply dropped the ordering when selecting to the generic instruction. Differential Revision: https://reviews.llvm.org/D141513	2023-01-20 08:54:34 -08:00
OCHyams	99c12afeb4	[Assignment Tracking] Fix tests for buildbot failure (2) Follow-up for 4ece50737d5385fb80cfa23f5297d1111f8eed39 (D142027). Assignment Tracking Analysis now always runs and is skipped internally if assignment tracking is disabled. Update these tests to expect to see the pass run. Buildbot failure: https://lab.llvm.org/buildbot/#/builders/57/builds/24094	2023-01-20 15:58:35 +00:00
Paul Kirth	557a5bc336	[codegen] Add StackFrameLayoutAnalysisPass Issue #58168 describes the difficulty diagnosing stack size issues identified by -Wframe-larger-than. For simple code, it's easy to understand the stack layout and where space is being allocated, but in more complex programs, where code may be heavily inlined, unrolled, and have duplicated code paths, it is no longer easy to manually inspect the source program and understand where stack space can be attributed. This patch implements a machine function pass that emits remarks with a textual representation of stack slots, and also outputs any available debug information to map source variables to those slots. The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout` to the compiler invocation. Like other remarks, the diagnostic information can be saved to a file in a machine-readable format by adding -fsave-optimization-record. Fixes: #58168 Reviewed By: nickdesaulniers, thegameg Differential Revision: https://reviews.llvm.org/D135488	2023-01-19 01:51:14 +00:00
Nikita Popov	9ed2f14c87	[AsmParser] Remove typed pointer auto-detection IR is now always parsed in opaque pointer mode, unless -opaque-pointers=0 is explicitly given. There is no automatic detection of typed pointers anymore. The -opaque-pointers=0 option is added to any remaining IR tests that haven't been migrated yet. Differential Revision: https://reviews.llvm.org/D141912	2023-01-18 09:58:32 +01:00
Paul Kirth	fdc0bf6adc	Revert "[codegen] Add StackFrameLayoutAnalysisPass" This breaks on some AArch64 bots This reverts commit 0a652c540556a118bbd9386ed3ab7fd9e60a9754.	2023-01-13 22:59:36 +00:00
Paul Kirth	0a652c5405	[codegen] Add StackFrameLayoutAnalysisPass Issue #58168 describes the difficulty diagnosing stack size issues identified by -Wframe-larger-than. For simple code, it's easy to understand the stack layout and where space is being allocated, but in more complex programs, where code may be heavily inlined, unrolled, and have duplicated code paths, it is no longer easy to manually inspect the source program and understand where stack space can be attributed. This patch implements a machine function pass that emits remarks with a textual representation of stack slots, and also outputs any available debug information to map source variables to those slots. The new behavior can be used by adding `-Rpass-analysis=stack-frame-layout` to the compiler invocation. Like other remarks, the diagnostic information can be saved to a file in a machine-readable format by adding -fsave-optimization-record. Fixes: #58168 Reviewed By: nickdesaulniers, thegameg Differential Revision: https://reviews.llvm.org/D135488	2023-01-13 20:52:48 +00:00
Nikita Popov	60442f0d44	[CodeGen] Convert some tests to opaque pointers (NFC) These are mostly MIR tests, which I did not handle during previous conversions.	2023-01-05 13:21:20 +01:00
Filipp Zhinkin	98265db84c	[ScheduleDAG] Support REG_SEQUENCE unscheduling A REG_SEQUENCE node requires special treatment during unscheduling because the node is untyped, so neither its class nor its cost can be retrieved the same way as for typed nodes. Related issue: https://github.com/llvm/llvm-project/issues/58911 Reviewed By: efriedma Differential Revision: https://reviews.llvm.org/D138837	2022-12-30 15:17:11 +04:00
Craig Topper	8abd70081f	[TargetLowering] Teach BuildUDIV to take advantage of leading zeros in the dividend. If the dividend has leading zeros, we can use them to reduce the size of the multiplier and avoid the fixup cases. This patch is for scalars only, but we might be able to do this for vectors in a follow up. Differential Revision: https://reviews.llvm.org/D140750	2022-12-29 13:58:46 -08:00
Nikita Popov	701890164d	[ARM] Convert some tests to opaque pointers (NFC)	2022-12-21 12:37:55 +01:00
Nikita Popov	87679b12c1	[ARM] Regenerate test checks (NFC)	2022-12-21 12:33:35 +01:00
David Green	752819e813	[AArch64][ARM] Remove load from dup and vmul tests. NFC These tests needn't use loads in their testing of dup and mul instructions, and as the load changes the test may no longer test what they are intending (as in D140069).	2022-12-20 15:23:38 +00:00
Simon Pilgrim	6161a8dd5c	DAG: Pull fneg out of select feeding fadd into fsub Enables folding fadd x, (select c, (fneg a), (fneg b)) -> fsub x, (select c, a, b) Avoids some regressions in a future AMDGPU change.	2022-12-19 11:38:30 -05:00
Matt Arsenault	ddfc8bfe07	ARM: Add baseline tests for fadd with select combine	2022-12-19 10:28:07 -05:00
Nikita Popov	bed1c7f061	[ARM] Convert some tests to opaque pointers (NFC)	2022-12-19 12:45:35 +01:00
Qiu Chaofan	a40ef656d8	[Intrinsic] Rename flt.rounds intrinsic to get.rounding Address the inconsistency between FLT_ROUNDS_ and SET_ROUNDING SDAG node. Rename FLT_ROUNDS_ to GET_ROUNDING and add llvm.get.rounding intrinsic to replace flt.rounds. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D139507	2022-12-19 15:22:39 +08:00
Ron Lieberman	38f1abef86	Revert "[SelectionDAG] Do not second-guess alignment for alloca" Breaks amdgpu buildbot https://lab.llvm.org/buildbot/#/builders/193 23491 This reverts commit ffedf47d8b793e07317f82f9c2a5f5425ebb71ad.	2022-12-15 10:55:18 -06:00
Andrew Savonichev	ffedf47d8b	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2022-12-15 18:18:12 +03:00
Roman Lebedev	d6bd732aeb	[NFC] Port codegen ARM tests that invoke opt to `-passes=` syntax	2022-12-09 01:04:46 +03:00
Roman Lebedev	b1a9584818	[opt] Disincentivize new tests from using old pass syntax Over the past day or so, I've taken a large swing at our tests, and reduced the number of tests that were still using the old syntax from ~1800 to just 200. Left to handle: (as it is seen in this patch) * Transforms/LSR * Transforms/CGP * Transforms/TypePromotion * Transforms/HardwareLoops * Analysis/* * some misc. I think this is the right point to start actively refusing to honor the old syntax, except for the old tests, to prevent the old syntax from creeping back in. Thus, let's add a temporary default-off flag, and if it is not passed, refuse to accept the old syntax. The tests that still need porting are annotated with this flag. Reviewed By: aeubanks Differential Revision: https://reviews.llvm.org/D139647	2022-12-08 23:54:03 +03:00
Peter Rong	ee31a4a702	[ARM] IselLowering unsigned overflow to crash using APInt in PerformSHLSimplify This diff fixes issue https://github.com/llvm/llvm-project/issues/59317 We should check if bitwidth is lower than the shift amount before we subtract them to avoid unsigned overflow. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D139238	2022-12-06 09:58:27 -08:00
Jonas Paulsson	5ecd363295	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." This reverts commit 122efef8ee9be57055d204d52c38700fe933c033. - Patch fixed to not reuse definitions from predecessors in EH landing pads. - Late review suggestions (by MaskRay) have been addressed. - M68k/pipeline.ll test updated. - Init captures added in processBlock() to avoid capturing structured bindings. - RISCV has this disabled for now. Original commit message: A new pass MachineLateInstrsCleanup is added to be run after PEI. This is a simple pass that removes redundant and identical instructions whenever found by scanning the MF once while keeping track of register definitions in a map. These instructions are typically immediate loads resulting from rematerialization, and address loads emitted by the target in eliminateFrameIndex(). This is enabled by default, but a target could easily disable it by means of 'disablePass(&MachineLateInstrsCleanupID);'. This late cleanup is naturally not "optimal" in removing instructions as it is done by looking at phys-regs, but still quite effective. It would be desirable to improve other parts of CodeGen and avoid these redundant instructions in the first place, but there are no ideas for this yet. Differential Revision: https://reviews.llvm.org/D123394 Reviewed By: RKSimon, foad, craig.topper, arsenm, asb	2022-12-05 12:53:50 -06:00
Dmitry Vyukov	dbe8c2c316	Use-after-return sanitizer binary metadata Currently per-function metadata consists of: (start-pc, size, features) This adds a new UAR feature and if it's set an additional element: (start-pc, size, features, stack-args-size) Reviewed By: melver Differential Revision: https://reviews.llvm.org/D136078	2022-12-05 14:40:31 +01:00
Jonas Paulsson	122efef8ee	Revert "Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions."" This reverts commit 17db0de330f943833296ae72e26fa988bba39cb3. Some more bots got broken - need to investigate.	2022-12-05 00:52:00 +01:00
Jonas Paulsson	17db0de330	Reapply "[CodeGen] Add new pass for late cleanup of redundant definitions." Init captures added in processBlock() to avoid capturing structured bindings, which caused the build problems (with clang). RISCV has this disabled for now until problems relating to post RA pseudo expansions are resolved.	2022-12-03 14:15:15 -06:00
Bjorn Pettersson	a11faeed44	[test] Switch to use -passes syntax in various test cases	2022-12-01 21:25:59 +01:00
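
Commit 778cf5431c above describes `atomicrmw uinc_wrap` and `udec_wrap` as "add/sub 1 with a clamping value". As a rough illustration only, the sketch below models the value computation of the two operations on plain Python integers. The function names and the `limit` parameter name are hypothetical, atomicity, ordering and syncscope are not modeled, and the wrap rules follow my reading of the LLVM Language Reference rather than anything stated in the commit message.

```python
def uinc_wrap(old: int, limit: int) -> int:
    """Model of the new value stored by uinc_wrap (unsigned compare assumed).

    Increments by one, wrapping to zero once the old value reaches `limit`.
    """
    return 0 if old >= limit else old + 1


def udec_wrap(old: int, limit: int) -> int:
    """Model of the new value stored by udec_wrap (unsigned compare assumed).

    Decrements by one, wrapping to `limit` when the old value is zero
    or already above `limit`.
    """
    return limit if (old == 0 or old > limit) else old - 1


if __name__ == "__main__":
    # Walk a counter through one full wrap in each direction.
    value = 0
    for _ in range(5):
        value = uinc_wrap(value, 3)
    print("after 5 increments with limit 3:", value)

    value = 0
    for _ in range(2):
        value = udec_wrap(value, 3)
    print("after 2 decrements with limit 3:", value)
```

This matches the behavior CUDA/HIP programmers expect from atomicInc/atomicDec, which the commit names as the motivating use.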


4692 Commits
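
Commits 7bff37783f and fb3e3ef62e above both turn on the point that fminnum and fminimum treat +0.0 and -0.0 differently. The sketch below is a simplified scalar model of that difference, not the SelectionDAG implementation. The two Python functions are hypothetical stand-ins for the node semantics, and the zero handling in `fminnum` is just one of the results a minnum-style operation is allowed to produce (assumed here to be order-dependent).

```python
import math


def fminimum(a: float, b: float) -> float:
    """NaN-propagating minimum that treats -0.0 as less than +0.0."""
    if math.isnan(a) or math.isnan(b):
        return math.nan
    if a == 0.0 and b == 0.0:
        # Both are zeros: return the negative one if there is one.
        return a if math.copysign(1.0, a) < 0 else b
    return a if a < b else b


def fminnum(a: float, b: float) -> float:
    """minnum-style minimum: a NaN operand is ignored, and the sign of a
    zero result is not pinned down (this model simply returns b on ties)."""
    if math.isnan(a):
        return b
    if math.isnan(b):
        return a
    return a if a < b else b


def sign(x: float) -> float:
    """Return -1.0 or 1.0 according to the sign bit, so zeros can be told apart."""
    return math.copysign(1.0, x)


if __name__ == "__main__":
    for lhs, rhs in [(0.0, -0.0), (-0.0, 0.0)]:
        print(lhs, rhs,
              "fminimum sign:", sign(fminimum(lhs, rhs)),
              "fminnum sign:", sign(fminnum(lhs, rhs)))
```

Under this model the two functions agree whenever neither operand is NaN and at least one operand is known to be non-zero, which is the extra condition commit 7bff37783f adds before lowering one to the other.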