llvm-project

Author	SHA1	Message	Date
Oleksii Lozovskyi	c72dea88b6	[AArch64][ARM][X86] Split XRay tests for Linux/macOS XRay instrumentation works for macOS running on Apple Silicon, but codegen is untested there. I'm going to make changes affecting this target, get the XRay tests running on AArch64. Data sections are going to become slightly different on x86_64 soon. I do want the tests to be specific about symbol names, so instead of having test check the common step, bifurcate tests a bit and check the full symbol names. As for ARM, XRay is not really supported on iOS at the moment, though ARM is also really used there with modern phones. Nevertheless, codegen tests exist and the output is going to change a little, make it easier to write the special case for iOS. Reviewed By: MaskRay Differential Revision: https://reviews.llvm.org/D145291	2023-06-11 12:53:29 -07:00
JP Lehr	c9998ec145	Revert "[DAGCombine] Make sure combined nodes are added back to the worklist in topological order." This reverts commit e69fa03ddd85812be3143d79a0359c3e8d43bd45. This patch led to build timeouts on the AMDGPU OpenMP runtime buildbot.	2023-06-05 10:55:58 -04:00
Amaury Séchet	e69fa03ddd	[DAGCombine] Make sure combined nodes are added back to the worklist in topological order. Currently, a node and its users are added back to the worklist in reverse topological order after it is combined. This diff changes that order to be topological. This is part of a larger migration to get the DAGCombiner to process nodes in topological order. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D127115	2023-06-05 11:09:18 +00:00
Serge Pavlov	eecaeb6f10	[FPEnv] Intrinsics for access to FP environment The change implements intrinsics 'get_fpenv', 'set_fpenv' and 'reset_fpenv'. They are used to read the floating-point environment, set it, or reset it to some default state. They do the same actions as C library functions 'fegetenv' and 'fesetenv'. By default these intrinsics are lowered to calls to these functions. The new intrinsics specify the FP environment as a value of integer type, which is convenient for most targets where the FP state is the content of some register. Some targets however use long representations. On X86 the size of the FP environment is 256 bits, and even half of this size is not a legal integer type. To facilitate legalization in such cases, two sets of DAG nodes are used. Nodes GET_FPENV and SET_FPENV are used when the FP environment may be represented by a legal integer type. Nodes GET_FPENV_MEM and SET_FPENV_MEM consider the FP environment as a region in memory, much like `fesetenv` and `fegetenv` do. They are used when the target has a long representation for floating-point state. Differential Revision: https://reviews.llvm.org/D71742	2023-06-05 13:10:01 +07:00
sgokhale	c4a60c9d34	[CodeGen][ShrinkWrap] Enable PostShrinkWrap by default This is an attempt to reland D42600 and enabling this optimisation by default. This also resolves the issue pointed out in the context of PGO build. Differential Revision: https://reviews.llvm.org/D42600	2023-05-25 13:56:29 +05:30
Fangrui Song	e018cbf720	[IR] Make stack protector symbol dso_local according to -f[no-]direct-access-external-data There are two motivations. `-fno-pic -fstack-protector -mstack-protector-guard=global` created `__stack_chk_guard` is referenced directly on all ELF OSes except FreeBSD. This patch allows referencing the symbol indirectly with -fno-direct-access-external-data. Some Linux kernel folks want `-fno-pic -fstack-protector -mstack-protector-guard-reg=gs -mstack-protector-guard-symbol=__stack_chk_guard` created `__stack_chk_guard` to be referenced directly, avoiding R_X86_64_REX_GOTPCRELX (even if the relocation may be optimized out by the linker). https://github.com/llvm/llvm-project/issues/60116 Why they need this isn't so clear to me. --- Add module flag "direct-access-external-data" and set the dso_local property of the stack protector symbol. The module flag can benefit other LLVMCodeGen synthesized symbols that are not represented in LLVM IR. Nowadays, with `-fno-pic` being uncommon, ideally we should set "direct-access-external-data" when it is true. However, doing so would require ~90 clang/test tests to be updated, which is too much. As a compromise, we set "direct-access-external-data" only when it's different from the implied default value. Reviewed By: nickdesaulniers Differential Revision: https://reviews.llvm.org/D150841	2023-05-23 09:49:57 -07:00
Craig Topper	139392c0a5	[LegalizeTypes][ARM][AArch64][RISCV][VE][WebAssembly] Add special case for smin(X, -1) and smax(X, 0) to ExpandIntRes_MINMAX. We can compute a simpler expression for Lo for these cases. This is an alternative for the test cases in D151180 that works for more targets. This is similar to some of the special cases we have for expanding setcc operands. Differential Revision: https://reviews.llvm.org/D151182	2023-05-23 09:19:55 -07:00
Tobias Hieta	f84bac329b	[NFC][Py Reformat] Reformat lit.local.cfg python files in llvm This is a follow-up to b71edfaa4ec3c998aadb35255ce2f60bba2940b0 since I forgot the lit.local.cfg files in that one. Reformatting is done with `black`. If you end up having problems merging this commit because you have made changes to a python file, the best way to handle that is to run git checkout --ours <yourfile> and then reformat it with black. If you run into any problems, post to discourse about it and we will try to help. RFC Thread below: https://discourse.llvm.org/t/rfc-document-and-standardize-python-code-style Reviewed By: barannikov88, kwk Differential Revision: https://reviews.llvm.org/D150762	2023-05-17 17:03:15 +02:00
Noah Goldstein	d294e3cb76	[SelectionDAG] Improve `computeKnownBits` implementations of `sdiv` and `udiv` Add `exact` flag handling for `udiv` and add entire `sdiv` case. Differential Revision: https://reviews.llvm.org/D150098	2023-05-16 18:58:13 -05:00
Austin Chang	d069ac035a	[DAGCombiner] Add bswap(logic_op(bswap(x), y)) optimization This is the implementation of D149782. The patch implements a helper function that matches and folds the following cases in the DAGCombiner: 1. `bswap(logic_op(x, bswap(y))) -> logic_op(bswap(x), y)` 2. `bswap(logic_op(bswap(x), y)) -> logic_op(x, bswap(y))` 3. `bswap(logic_op(bswap(x), bswap(y))) -> logic_op(x, y)` in the multiuse case, which still reduces the number of instructions. The helper function accepts an SDValue with a BSWAP or BITREVERSE opcode. This patch folds the BSWAP cases and leaves the BITREVERSE optimization for the future. Reviewed By: RKSimon, goldstein.w.n Differential Revision: https://reviews.llvm.org/D149783	2023-05-16 18:58:07 -05:00
Austin Chang	58c9ad9c85	[DAGCombiner] Add bswap(logic_op(bswap(x), y)) regression test case; NFC Fold the following case on SelectionDAG combiner This patch includes the regression test cases ``` bswap(logic_op(x, bswap(y))) -> logic_op(bswap(x), y) bswap(logic_op(bswap(x), y)) -> logic_op(x, bswap(y)) bswap(logic_op(bswap(x), bswap(y))) -> logic_op(x, y) (with multiuse) ``` Reviewed By: goldstein.w.n Differential Revision: https://reviews.llvm.org/D149782	2023-05-16 18:58:03 -05:00
Gaëtan Bossu	c4a872badb	FastRegAlloc: Fix implicit operands not rewritten This patch fixes a potential crash due to RegAllocFast not rewriting virtual registers. This essentially happens because of a call to MachineInstr::addRegisterKilled() in the process of allocating a "killed" vreg. The former can eventually delete implicit operands without RegAllocFast noticing, leading to some operands being "skipped" and not rewritten to use physical registers. Note that I noticed this crash when working on a solution for tying a register with one/multiple of its sub-registers within an instruction. (See problem description here: https://discourse.llvm.org/t/pass-to-tie-an-output-operand-to-a-subregister-of-an-input-operand/67184). Aside from this fix, I believe there could be further improvements to RegAllocFast when it comes to instructions with multiple uses of the same virtual register. You can see it in the added test where the implicit uses have been rewritten in a somewhat surprising way because of phase ordering. Ultimately, when allocating vregs for an instruction, I believe we should iterate on the vregs it uses (and then process all the operands that use these vregs), instead of directly iterating on operands and somewhat assuming each operand uses a different vreg. This would in the end be quite close to what greedy+virtregrewriter does. If that makes sense, I would probably spin off another patch (after I get more familiar with RegAllocFast). Differential Revision: https://reviews.llvm.org/D145169	2023-05-16 09:49:20 +02:00
Noah Goldstein	e36caaeeb2	[SelectionDAG] Use `computeKnownBits` if `Op` is not recognized by `isKnownNeverZero` The current logic is pretty limited unless the `Op` is a constant. This at least covers more obvious cases. Reviewed By: craig.topper, foad Differential Revision: https://reviews.llvm.org/D149196	2023-05-13 14:36:04 -05:00
Serge Pavlov	0833a9a796	[test] Use autogenerated assertions	2023-05-12 13:32:59 +07:00
Zequan Wu	3977b77a6b	[CodeGen] Fix nomerge attribute not working in tail calls. In D79537, `nomerge` was made to only apply to non-tail calls. This fixes it by also applying it to tail calls. For ARM, I only made the new MI inherit the flag under `TCRETURNdi` and `TCRETURNri`, because that's the place where tail calls get replaced. Not sure if there's any other place needed. Fixes #61545. Reviewed By: rnk Differential Revision: https://reviews.llvm.org/D146749	2023-05-10 14:25:11 -04:00
Aaron Ballman	f08b94d5c2	Further amend 945f6e65be0d20b3446e7c1537c64151de618af4 This addresses the ARM issue found by: https://lab.llvm.org/buildbot/#/builders/109/builds/63726 (This test wouldn't run for me locally, hence missing it in the last fix.)	2023-05-09 16:16:07 -04:00
Alan Zhao	f4999d3535	Revert "[CodeGen][ShrinkWrap] Split restore point" This reverts commit 1ddfd1c8186735c62b642df05c505dc4907ffac4. The original commit causes a Chrome build assertion failure with ThinLTO: https://crbug.com/1443635	2023-05-08 16:27:59 -07:00
sgokhale	1ddfd1c818	[CodeGen][ShrinkWrap] Split restore point Try to reland D42600 Differential Revision: https://reviews.llvm.org/D42600	2023-05-08 13:21:07 +05:30
Zhiyao Ma	1d0ccebcd7	[ARM] Don't allocate memory if free space in segmented stack is just enough Assuming that the stack grows downwards, it is fine if the stack pointer is exactly at the stacklet boundary. We should use less-or-equal condition when deciding whether to skip new memory allocation. Differential Revision: https://reviews.llvm.org/D149315	2023-05-02 13:09:49 +01:00
Craig Topper	df017ba9d3	[TargetLowering] Don't use ISD::SELECT_CC in expandFP_TO_INT_SAT. This function gets called for vectors and ISD::SELECT_CC was never intended to support vectors. Some updates were made to support it when this function started getting used for vectors. Overall, using separate ISD::SETCC and ISD::SELECT looks like an improvement even for scalar. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D149481	2023-04-29 10:23:08 -07:00
Julian Lettner	c3f0153ec2	[MachO] Disable atexit()-based lowering when LTO'ing kernel/kext code The kernel and kext environments do not provide the `__cxa_atexit()` function, so we can't use it for lowering global module destructors. Unfortunately, just querying for "compiling for kernel/kext?" in the LTO pipeline isn't possible (kernel/kext identifier isn't part of the triple yet) so we need to pass down a CodeGen flag. rdar://93536111 Differential Revision: https://reviews.llvm.org/D148967	2023-04-25 12:13:40 -07:00
David Green	15d2821263	[ARM] Fix qsat for armv5te/armv6 + thumb-mode This is a Thumb1 target, so will not have qsat instructions available. There was a mismatch between hasBaseDSP and the instruction patterns when +dsp was present, which is set by clang (but maybe shouldn't be). The target being thumb1-only should override that, implying that it does not have any qadds. Fixes #62273	2023-04-23 17:20:28 +01:00
Archibald Elliott	9ee4fe63bc	[ARM] Fix Crashes in fp16/bf16 Inline Asm We were still seeing occasional crashes with inline assembly blocks using fp16/bf16 after my previous patches: - https://reviews.llvm.org/rGff4027d152d0 - https://reviews.llvm.org/rG7d15212b8c0c - https://reviews.llvm.org/rG20b2d11896d9 It turns out: - The original two commits were wrong, and we should have always been choosing the SPR register class, not the HPR register class, so that LLVM's SelectionDAGBuilder correctly did the right splits/joins. - The `splitValueIntoRegisterParts`/`joinRegisterPartsIntoValue` changes from rG20b2d11896d9 are still correct, even though they sometimes result in inefficient codegen of casts between fp16/bf16 and i32/f32 (which is visible in these tests). This patch fixes crashes in `getCopyToParts` and when trying to select `(bf16 (bitconvert (fp16 ...)))` dags when Neon is enabled. This patch also adds support for passing fp16/bf16 values using the 'x' constraint that is LLVM-specific. This should broadly match how we pass with 't' and 'w', but with a different set of valid S registers. Differential Revision: https://reviews.llvm.org/D147715	2023-04-13 15:34:04 +01:00
Archibald Elliott	eeb4fe093d	[NFC][ARM] Fix Type in Test I landed this test with a typo, the callsites all show `fp16_inner` returning `half`, so the declaration should too.	2023-04-13 13:43:51 +01:00
David Green	9bd7b149c2	[ARM] Replace some uses of -mcpu=cortex-m33 with architectures features. NFC This adjusts some of the tests to use the architecture features directly as opposed to -mcpu=cortex-m33 names.	2023-04-13 11:57:32 +01:00
sgokhale	bb5befefc6	Revert "[CodeGen][ShrinkWrap] Split restore point" This reverts commit 5f0bccc3d1a74111458c71f009817c9995f4bf83. An issue has been reported here: https://github.com/ClangBuiltLinux/linux/issues/1833	2023-04-13 10:52:28 +05:30
Momchil Velikov	4ac6f99ae0	[LiveInterval] Fix live range overlap check Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D145707	2023-04-11 11:11:30 +01:00
Nikita Popov	cd91992de8	[ARM] Convert test to opaque pointers When converting this test to opaque pointers, we get a register move between the call and the inline asm. However, the test comment specifically says that there should be nothing between them. As far as I can tell, this is fine, both in that the inline asm doesn't use the relevant registers, but also more generally because the inline asm doesn't declare any clobbers, so really LLVM can do whatever, side effects or not. The test was added by 618ce3e85ed1c68e89dc696b7c9ab94a6a910797 with only a reference to Apple's internal issue tracker. Differential Revision: https://reviews.llvm.org/D147512	2023-04-11 10:28:40 +02:00
sgokhale	5f0bccc3d1	[CodeGen][ShrinkWrap] Split restore point This patch splits a restore point to allow it to only post-dominate blocks reachable by use or def of CSRs(Callee Saved Registers)/FI(Frame Index). Benchmarking this on SPEC2017, this gives around 4% improvement on povray and no significant change for others. Co-authored-by: junbuml Differential Revision: https://reviews.llvm.org/D42600	2023-04-11 11:58:50 +05:30
Nikita Popov	ebd579ccae	[ARM] Regenerate test checks (NFC)	2023-04-04 11:25:13 +02:00
Nikita Popov	f45b22eeb5	[ARM] Convert some tests to opaque pointers (NFC)	2023-04-04 11:22:08 +02:00
Nikita Popov	24906aa83e	[ARM] Regenerate test checks (NFC)	2023-04-04 11:22:08 +02:00
Nikita Popov	bc2de67f25	[ARM] Name instructions in test (NFC)	2023-04-04 11:22:08 +02:00
Zequan Wu	321d02cc6b	[NFC] Update CodeGen/*/nomerge.ll tests with utils/update_llc_test_checks.py. Precommit this patch for better diff view on D146749. Reviewed By: nemanjai Differential Revision: https://reviews.llvm.org/D147454	2023-04-03 19:52:39 -04:00
Martin Storsjö	c5383536cb	[ARM] Handle generating SEH unwind info for t2STR_PRE/t2LDR_POST This fixes compiling some uncommon cases. Differential Revision: https://reviews.llvm.org/D147212	2023-03-31 10:22:28 +03:00
Sergei Barannikov	1f5e9a3502	[MCP] Do not try forward non-existent sub-register of a copy In this example: ``` $d14 = COPY killed $d18 $s0 = MI $s28 ``` $s28 is a sub-register of $d14. However, $d18 does not have sub-registers and thus cannot be forwarded. Previously, this resulted in $noreg being substituted in place of the use of $s28, which later led to an assertion failure. Fixes https://github.com/llvm/llvm-project/issues/60908, a regression that was introduced in D141747. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D146930	2023-03-30 06:11:00 +03:00
David Green	3ac5a123d9	[ARM] Add Thumb Attributes for thumb thunks created in SLSHarding Without this the function will use an Arm subtarget, meaning the instructions in it will be invalid for the current subtarget. Differential Revision: https://reviews.llvm.org/D144733	2023-03-24 18:11:54 +00:00
Caleb Zulawski	71dc3de533	[ARM] Improve min/max vector reductions on Arm This patch adds some more efficient lowering for vecreduce.min/max under NEON, using sequences of pairwise vpmin/vpmax to reduce to a single value. This nearly resolves issues such as #50466, #40981, #38190. Differential Revision: https://reviews.llvm.org/D146404	2023-03-22 16:00:19 +00:00
Julian Lettner	e6a789ef9b	Remove -lower-global-dtors-via-cxa-atexit flag Remove the `-lower-global-dtors-via-cxa-atexit` escape hatch introduced in D121736 [1], which switched the default lowering of global destructors on MachO to use `__cxa_atexit()` to avoid emitting deprecated `__mod_term_func` sections. I added this flag as an escape hatch in case the switch causes any problems. We didn't discover any problems so now we can remove it. [1] https://reviews.llvm.org/D121736 rdar://90277838 Differential Revision: https://reviews.llvm.org/D145715	2023-03-14 14:18:11 -07:00
Archibald Elliott	b189218d44	[ARM] Fix Chain/Glue Bug in PerformVMOVhrCombine In this optimisation, the Chain and Glue from the original CopyFromReg were being lost, which resulted in miscompiles. This fix just ensures that the input chains are correctly updated, and that any users are also updated with the new chain from the new CopyFromReg. Fixes #60510. Differential Revision: https://reviews.llvm.org/D143713	2023-03-06 11:55:54 +00:00
Archibald Elliott	c314667141	[ARM] Pre-Commit Tests for PR60510 Differential Revision: https://reviews.llvm.org/D143712	2023-03-06 11:55:53 +00:00
Archibald Elliott	20b2d11896	[ARM] Fix Crash in 't'/'w' handling without fp16/bf16 After https://reviews.llvm.org/rGff4027d152d0 and https://reviews.llvm.org/rG7d15212b8c0c we saw crashes in SelectionDAG when trying to use these constraints when you don't have the fp16 or bf16 extensions. However, it is still possible to move 16-bit floating point values into the right place in S registers with a normal `vmov`, even if we don't have fp16 instructions we can use within the inline assembly string. This patch therefore fixes the crash. I think the reason we weren't getting this crash before is because I think the __fp16 and __bf16 types got an error diagnostic in the Clang frontend when you didn't have the right architectural extensions to use them. This restriction was recently relaxed. The approach for bf16 needs a bit more explanation. Exactly how BF16 is legalized was changed in rGb769eb02b526e3966847351e15d283514c2ec767 - effectively, whether you have the right instructions to get a bf16 value into/out of a S register with MoveTo/FromHPR depends on hasFullFP16, but whether you use a HPR for a value of type MVT::bf16 depends on hasBF16. This is why the tests are not changed by `+bf16` vs `-bf16`, but I've left both sets of RUN lines in case this changes in the future. Test Changes: - Added more testing for testing inline asm (the core part) - fp16-promote.ll and pr47454.ll show improvements where unnecessary fp16-fp32 up/down-casts are no longer emitted. This results in fewer libcalls where those casts would be done with a libcall. - aes-erratum-fix.ll is fairly noisy, and I need to revisit this test so that the IR is more minimal than it is right now, because most of the changes in this commit do not relate to what AES is actually trying to verify. Differential Revision: https://reviews.llvm.org/D143711	2023-03-06 11:55:08 +00:00
Arthur Eubanks	773d663e47	[IPO] Remove various legacy passes These are part of the optimization pipeline, of which the legacy pass manager version is deprecated and being removed.	2023-02-27 19:06:08 -08:00
Nick Desaulniers	a3a84c9e25	[llvm] add CallBrPrepare pass to pipelines Capstone of https://discourse.llvm.org/t/rfc-syncing-asm-goto-with-outputs-with-gcc/65453/8 Clang changes are still necessary to enable the use of outputs along indirect edges of asm goto statements. Link: https://github.com/llvm/llvm-project/issues/53562 Reviewed By: void Differential Revision: https://reviews.llvm.org/D140180	2023-02-16 17:58:34 -08:00
Samuel Parker	8f104a3f9a	[ARM] O3-pipeline fix	2023-02-13 11:01:00 +00:00
Tim Northover	c4ce967e34	ARM: skip debug instructions when matching jump-table patterns. When working out whether we can see a compressible jump-table pattern during ConstantIslands, we were stopping when we saw a debug instruction. Instead it's better to keep iterating backwards to the first real instruction. https://reviews.llvm.org/D142019	2023-02-10 12:27:59 +00:00
Andrew Savonichev	c65b4d64d4	[SelectionDAG] Do not second-guess alignment for alloca Alignment of an alloca in IR can be lower than the preferred alignment on purpose, but this override essentially treats the preferred alignment as the minimum alignment. The patch changes this behavior to always use the specified alignment. If alignment is not set explicitly in LLVM IR, it is set to DL.getPrefTypeAlign(Ty) in computeAllocaDefaultAlign. Tests are changed as well: explicit alignment is increased to match the preferred alignment if it changes output, or omitted when it is hard to determine the right value (e.g. for pointers, some structs, or weird types). Differential Revision: https://reviews.llvm.org/D135462	2023-02-09 18:45:20 +03:00
Anton Sidorenko	6820cb2dd5	[Test] Fix YAML mapping keys duplication. NFC. The YAML specification does not allow duplicate keys in a mapping. However, the YAML parser in LLVM does not check for that and uses only the last key entry. In this change duplicated keys are merged to satisfy the spec. Differential Revision: https://reviews.llvm.org/D141848	2023-02-09 12:59:50 +03:00
Nick Desaulniers	07c7784d7b	[llvm][IfConversion] update successor list when merging INLINEASM_BR If this successor list is not correct, then branch-folding may incorrectly think that the indirect target is dead and remove it. This results in a dangling reference to the removed block as an operand to the INLINEASM_BR, which later will get AsmPrinted into code that doesn't assemble. This was made more obvious by, but is not a regression of https://reviews.llvm.org/D130316. Fixes: https://github.com/llvm/llvm-project/issues/60346 Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D142924	2023-02-07 10:28:11 -08:00
Nick Desaulniers	1cecfa407c	precommit test for pr60346 Link: https://github.com/llvm/llvm-project/issues/60346 Reviewed By: efriedma, void Differential Revision: https://reviews.llvm.org/D142923	2023-02-07 10:28:11 -08:00
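
The three bswap folds listed in the d069ac035a and 58c9ad9c85 entries above rest on a simple property: a byte swap only permutes bit positions, and AND/OR/XOR act on each bit position independently, so the swap can be moved across the logic op. The following is a minimal sketch that checks the identities on random 32-bit values. It is not the DAGCombiner code; `bswap32`, `LOGIC_OPS` and `check_folds` are hypothetical names chosen for illustration, and restricting `logic_op` to AND/OR/XOR is an assumption.

```python
import random

MASK32 = 0xFFFFFFFF


def bswap32(x: int) -> int:
    """Reverse the byte order of a 32-bit unsigned value."""
    return int.from_bytes((x & MASK32).to_bytes(4, "little"), "big")


# Bitwise logic ops assumed to be covered by "logic_op" in the commit text.
LOGIC_OPS = {
    "and": lambda a, b: a & b,
    "or": lambda a, b: a | b,
    "xor": lambda a, b: a ^ b,
}


def check_folds(trials: int = 1000, seed: int = 0) -> bool:
    """Return True if all three folds hold for every sampled input."""
    rng = random.Random(seed)
    for _ in range(trials):
        x = rng.getrandbits(32)
        y = rng.getrandbits(32)
        for op in LOGIC_OPS.values():
            # 1. bswap(logic_op(x, bswap(y))) -> logic_op(bswap(x), y)
            if bswap32(op(x, bswap32(y))) != op(bswap32(x), y):
                return False
            # 2. bswap(logic_op(bswap(x), y)) -> logic_op(x, bswap(y))
            if bswap32(op(bswap32(x), y)) != op(x, bswap32(y)):
                return False
            # 3. bswap(logic_op(bswap(x), bswap(y))) -> logic_op(x, y)
            if bswap32(op(bswap32(x), bswap32(y))) != op(x, y):
                return False
    return True


if __name__ == "__main__":
    print(check_folds())
```

Folds 1 and 2 each replace two byte swaps with one; fold 3 removes the outer swap and, when the inner swaps have no other users, those as well.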

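The 139392c0a5 entry above says a simpler expression for Lo exists for `smin(X, -1)` and `smax(X, 0)` when a wide integer is expanded into halves. A minimal sketch of why that works, assuming a 64-bit value split into a signed 32-bit high half and an unsigned 32-bit low half: the sign of X is the sign of Hi, so Lo needs only a select on `Hi < 0`. This is an illustration of the arithmetic, not the code in `ExpandIntRes_MINMAX`; all function names are hypothetical, and the way Hi is computed here is an assumption.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def to_signed(value: int, bits: int) -> int:
    """Interpret the low `bits` bits of value as a two's-complement integer."""
    value &= (1 << bits) - 1
    return value - (1 << bits) if value >> (bits - 1) else value


def split64(x: int) -> tuple[int, int]:
    """Split a signed 64-bit value into (signed hi, unsigned lo) halves."""
    u = x & MASK64
    return to_signed(u >> 32, 32), u & MASK32


def join64(hi: int, lo: int) -> int:
    """Rebuild a signed 64-bit value from its halves."""
    return to_signed(((hi & MASK32) << 32) | (lo & MASK32), 64)


def smax_zero(x: int) -> int:
    """smax(X, 0): a negative X becomes 0, otherwise X is unchanged."""
    hi, lo = split64(x)
    new_lo = 0 if hi < 0 else lo   # select(Hi < 0, 0, Lo)
    new_hi = max(hi, 0)            # narrow smax on the high half
    return join64(new_hi, new_lo)


def smin_minus_one(x: int) -> int:
    """smin(X, -1): a non-negative X becomes -1, otherwise X is unchanged."""
    hi, lo = split64(x)
    new_lo = lo if hi < 0 else MASK32  # select(Hi < 0, Lo, all-ones)
    new_hi = min(hi, -1)               # narrow smin on the high half
    return join64(new_hi, new_lo)
```

Neither function compares the low halves, which is what makes these cases cheaper than a general wide min/max.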
4730 Commits