llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	4f95821f58	[DAG] SelectionDAG::getNode() - consistently use N1 for first operand. NFCI. This has been annoying me for years - rename Operand to N1 so it matches all the other getNode() calls, and simplifies my debug watch windows!	2023-07-17 17:17:40 +01:00
Simon Pilgrim	e9caa37e9c	[DAG] Move lshr narrowing from visitANDLike to SimplifyDemandedBits Inspired by some of the cases from D145468 Let SimplifyDemandedBits handle the narrowing of lshr to half-width if we don't require the upper bits, the narrowed shift is profitable and the zext/trunc are free. A future patch will propose the equivalent shl narrowing combine. Differential Revision: https://reviews.llvm.org/D146121	2023-07-17 15:50:09 +01:00
Jay Foad	a1a9c53ae7	[GlobalISel] Fix infinite loop in reassociation combine Don't reassociate (C1+C2)+Y -> C1+(C2+Y). Fixes https://github.com/llvm/llvm-project/issues/63849 Differential Revision: https://reviews.llvm.org/D155284	2023-07-16 14:15:24 +01:00
Matt Arsenault	c4ccd6e3d2	MachineSink: Remove unnecessary empty block check	2023-07-14 18:46:18 -04:00
Matt Arsenault	6d3027e3d1	MachineSink: Move helper function and use more const	2023-07-14 18:46:18 -04:00
Weining Lu	ef33d6cbfc	[XRay] Add initial support for loongarch64 Only support patching FunctionEntry/FunctionExit/FunctionTailExit for now. Reviewed By: MaskRay, xen0n Co-Authored-By: zhanglimin <zhanglimin@loongson.cn> Differential Revision: https://reviews.llvm.org/D140727	2023-07-14 09:27:13 +08:00
Amara Emerson	432338a673	Don't assert on a non-pointer value being used for a "p" inline asm constraint. GCC and existing codebases allow the use of integral values to be used with this constraint. A recent change D133914 in this area started causing asserts. Removing the assert is enough as the rest of the code works fine. rdar://109675485 Differential Revision: https://reviews.llvm.org/D155023	2023-07-13 10:45:56 -07:00
Oliver Stannard	aea8db8eb9	Revert "[CodeGen] Store SP adjustment in MachineBasicBlock. NFCI." This reverts commit 58d1eaa3b6ce4f7285c51f83faff7a3ac374c746.	2023-07-13 14:25:39 +01:00
Jon Roelofs	56e60bc5bb	TargetLowering: fix an infinite DAG combine in SimplifySETCC TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to canonicalize the constant to the RHS. The bug here was that it did so whether or not the RHS was already a constant, leading to an infinite loop. rdar://111847838 Differential revision: https://reviews.llvm.org/D155095 This reverts commit cdc633e4bc93d4bf241ecd4c29691ae065749313.	2023-07-12 16:13:27 -07:00
Noah Goldstein	a4c461c063	[SelectionDAG] Fill in some more cases in `isKnownNeverZero` This mostly copies cases that already exist in ValueTracking, although it skips the more complex ones. Those can be filled in as needed. Reviewed By: RKSimon Differential Revision: https://reviews.llvm.org/D149199	2023-07-12 17:17:53 -05:00
Noah Goldstein	74f0ec5e24	[DAGCombiner] Make it so that `udiv` can be folded with `(select c, NonZero, 1)` This is done by allowing speculation of `udiv` if we can prove the denominator is non-zero. https://alive2.llvm.org/ce/z/VNCt_q Differential Revision: https://reviews.llvm.org/D149198	2023-07-12 17:17:53 -05:00
Fangrui Song	86a542b005	[CodeGen] Fix -Wunused-variable in -DLLVM_ENABLE_ASSERTIONS=off builds after D154281	2023-07-12 14:37:40 -07:00
Jon Roelofs	cdc633e4bc	Revert "TargetLowering: fix an infinite DAG combine in SimplifySETCC" This reverts commit b76c85b355578d9076c22a86faf4ea8de1745bdf. It broke the RISCV-enabled bots. Oops.	2023-07-12 12:22:03 -07:00
Jon Roelofs	b76c85b355	TargetLowering: fix an infinite DAG combine in SimplifySETCC TargetLowering::SimplifySetCC wants to swap the operands of a SETCC to canonicalize the constant to the RHS. The bug here was that it did so whether or not the RHS was already a constant, leading to an infinite loop. rdar://111847838 Differential revision: https://reviews.llvm.org/D155095	2023-07-12 11:44:15 -07:00
Qiongsi Wu	41447f6fdf	[libLTO][AIX] Respect `-f[no]-integrated-as` on AIX `libLTO` currently ignores the `-f[no-]integrated-as` flags. This patch teaches `libLTO` to respect them on AIX. The implementation consists of two parts: # Migrate `llc`'s `-no-integrated-as` option to a codegen option so that the option is available to `libLTO`/`lld`/`gold`. # Teach `clang` to pass `-no-integrated-as` accordingly to `libLTO` depending on the `-f[no-]integrated-as` flags. On platforms other than AIX, the `-f[no-]integrated-as` flags are ignored. Reviewed By: MaskRay, steven_wu Differential Revision: https://reviews.llvm.org/D152924	2023-07-12 13:22:02 -04:00
Jingu Kang	33e60484d7	[MachineLICM] Handle Subloops The MachineLICM pass handles inner loops only when the outermost loop does not have a unique predecessor. If the loop has a preheader and there is loop-invariant code, the invariant code can be hoisted to the preheader in general. This patch makes the pass handle inner loops in general. Differential Revision: https://reviews.llvm.org/D154205	2023-07-12 16:32:14 +01:00
Craig Topper	45b172c838	[LegalizeDAG] Prevent LegalizeLoadOps from creating extloads that mix int and fp types. For RISC-V, getRegisterType for fp16 returns i16. i16->fp64 extload is considered legal because the LoadExtActions defaults to Legal for all entries. Only fp/fp and int/int entries are changed to Expand for RISC-V. This patch detects the FP-ness has changed and won't try to call isLoadExtLegal. Alternatively, we could add Expand for int/fp and fp/int, but that seemed a little silly. Fixes #63816 Reviewed By: asb, wangpc Differential Revision: https://reviews.llvm.org/D155040	2023-07-12 08:03:35 -07:00
Ivan Kosarev	e705b2b1f4	Fix warnings about unused variables on builds without asserts.	2023-07-12 14:45:29 +01:00
Jay Foad	58d1eaa3b6	[CodeGen] Store SP adjustment in MachineBasicBlock. NFCI. Record the SP adjustment on entry to each basic block. This is almost always zero except on targets like ARM which can split a basic block in the middle of a call sequence. This simplifies PEI::replaceFrameIndices which previously had to visit basic blocks in a specific order and had special handling for unreachable blocks. More importantly it paves the way for an equally simple implementation of a backwards version of replaceFrameIndices, which is required to fully convert PrologEpilogInserter to backwards register scavenging, which is preferred because it does not rely on accurate kill flags. Differential Revision: https://reviews.llvm.org/D154281	2023-07-12 14:29:26 +01:00
Marco Elver	de79233b2e	[X86] Complete preservation of !pcsections in X86ISelLowering https://reviews.llvm.org/D130883 introduced MIMetadata to simplify metadata propagation (DebugLoc and PCSections). However, we're currently still permitting implicit conversion of DebugLoc to MIMetadata, to allow for a gradual transition and let the old code work as-is. This manifests in lost !pcsections metadata for X86-specific lowerings. For example, 128-bit atomics. Fix the situation for X86ISelLowering by converting all BuildMI() calls to use an explicitly constructed MIMetadata. Reviewed By: dvyukov Differential Revision: https://reviews.llvm.org/D154986	2023-07-12 15:09:31 +02:00
Ivan Kosarev	15e7749e19	[Codegen] Generate fast fp64-to-fp16 conversions in unsafe mode. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154528	2023-07-12 11:55:19 +01:00
Jay Foad	f7684d8510	[DAG] Use legal shift amount type in DAGTypeLegalizer::JoinIntegers Documentation for TargetLowering::getShiftAmountTy says that LegalTypes should generally be true during type legalization, so this patch does that. On AMDGPU the effect is that we use i32 (a sane type) instead of i64 (pointer sized type) for more shift amounts, which in turn allows more formation of rotates and funnel shifts pre-legalization. Differential Revision: https://reviews.llvm.org/D154960	2023-07-12 08:12:09 +01:00
Han Shen	65ef4d4357	[CodeGen] Part II of "Fine tune MachineFunctionSplitPass (MFS) for FSAFDO". This CL adds a new discriminator pass. Also adds a new sample profile loading pass when MFS is enabled. Differential Revision: https://reviews.llvm.org/D152577	2023-07-11 22:40:25 -07:00
Matt Arsenault	3701ebe76b	AtomicExpand: Fix expanding atomics into unconstrained FP in strictfp functions Ideally the normal fadd/fmin/fmax this was creating would fail the verifier. It's probably also necessary to force off FP exception handlers in the cmpxchg loop but we don't have a generic way to do that now. Note strictfp builder is broken in the minnum/maxnum case https://reviews.llvm.org/D154993	2023-07-11 18:51:15 -04:00
Matt Arsenault	b59022b42e	DAG: Handle lowering of unordered fcZero|fcSubnormal to fcmp	2023-07-11 18:30:15 -04:00
pvanhout	8444038d16	[AMDGPU] Use GlobalISel MatchTable Combiner Backend Use the new matchtable-based combiner backend for all AMDGPU combiners. This is a drop-in from the user's perspective; there are no test changes, the new combiner behaves exactly like the old one. Depends on D153757 NOTE: This would land iff D153757 (RFC) lands too. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153758	2023-07-11 11:27:13 +02:00
pvanhout	1fe7d9c799	[GlobalISel] Generalize `InstructionSelector` Match Tables Makes `InstructionSelector.h`/`InstructionSelectorImpl.h` generic so the match tables can also be used for the combiner. Some notes: - Coverage was made an optional parameter of `executeMatchTable`, combines won't use it for now. - `GIPFP_` -> `GICXXPred_` so it's more generic. Those are just C++ predicates and aren't PatFrag-specific. - Pass the MatcherState directly to testMIPredicate_MI, the combiner will need it. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D153755	2023-07-11 09:42:30 +02:00
Matt Arsenault	1d92b68ead	DAG: Correct chain management for frexp libcalls We need to replace the other uses of the call chain with the new load chain. Fixes not preserving the return def with unused x86_fp80 results. Regression reported here: https://reviews.llvm.org/rGb15bf305ca3e9ce63aaef7247d32fb3a75174531#1224999	2023-07-10 21:39:15 -04:00
Han Shen	8df75969ae	[CodeGen] Fine tune MachineFunctionSplitPass (MFS) for FSAFDO. The original MFS work D85368 shows good performance improvement with Instrumented FDO. However, AutoFDO or Flow-Sensitive AutoFDO (FSAFDO) does not show performance gain. This is mainly caused by a less accurate profile compared to the iFDO profile. For the past few months, we have been working to improve FSAFDO quality, like in D145171. Taking advantage of this improvement, MFS now shows performance improvements over FSAFDO profiles. That being said, 2 minor changes need to be made, 1) An FS-AutoFDO profile generation pass needs to be added right before MFS pass and an FSAFDO profile load pass is needed when FS-AutoFDO is enabled and the MFS flag is present. 2) MFS only applies to hot functions, because we believe (and experiment also shows) FS-AutoFDO is more accurate about functions that have plenty of samples than those with no or very few samples. With this improvement, we see a 1.2% performance improvement in clang benchmark, 0.9% QPS improvement in our internal search benchmark, and 3%-5% improvement in internal storage benchmark. This is #1 of the two patches that enables the improvement. Reviewed By: wenlei, snehasish, xur Differential Revision: https://reviews.llvm.org/D152399	2023-07-10 16:00:30 -07:00
Fangrui Song	0b69cc8bcb	[AArch64] Improve sanitize_memtag test The ELFObjectWriter::shouldRelocateWithSymbol change in D128958 is untested. Add the testing. Also, change a diagnostic to follow the convention (no capitalization or trailing period). Test it.	2023-07-10 13:25:09 -07:00
Jake Egan	bbd0d123d3	Implement -frecord-command-line for XCOFF This patch extends support of the option `-frecord-command-line` to XCOFF. XCOFF doesn’t have custom sections like ELF, so the command line data is emitted to a .info section instead. A C_INFO symbol is generated with the .info section to preserve the command line data past the link step. Multiple command lines are separated by newlines and null bytes. The command line data can be retrieved on AIX with command `what file_name`. Reviewed By: scott.linder Differential Revision: https://reviews.llvm.org/D153600	2023-07-10 12:47:07 -04:00
Igor Kirillov	0aecf7ff0d	[CodeGen] Fix incorrectly detected reduction bug in ComplexDeinterleaving pass Using ACLE intrinsics, it is possible to create a loop that the deinterleaving pass incorrectly classified as a reduction loop. For example, for fixed-width vectors the loop was like below: vector.body: %a = phi <4 x float> [ %init.a, %entry ], [ %updated.a, %vector.body ] %b = phi <4 x float> [ %init.b, %entry ], [ %updated.b, %vector.body ] ... ; Does not depend on %a or %b: %updated.a = ... %updated.b = ... Differential Revision: https://reviews.llvm.org/D154598	2023-07-10 12:54:38 +00:00
Amara Emerson	3a80bdb316	[GlobalISel] Remove an erroneous oneuse check in the G_ADD reassociation combine. This check was unnecessary/incorrect, it was already being done by the target hook default implementation, and the one in the matcher was checking for a completely different thing. This change: 1) Removes the check and updates affected tests which now do some more reassociations. 2) Modifies the AMDGPU hooks which were stubbed with "return true" to also do the oneuse check. Not sure why I didn't do this the first time.	2023-07-10 01:03:12 -07:00
David Green	758c4640c9	[CGP] Enable CodeGenPrepare's phi type conversion. This is a recommit of 67121d7, enabling the CodeGenPrepare OptimizePhiTypes option that can help with the type of phi instructions into ISel.	2023-07-09 10:32:11 +01:00
Matt Arsenault	310f839612	DAG: Lower is.fpclass fcInf to fcmp of fabs InstCombine should have taken care of this, but I think this is more useful in the future when the expansion tries to handle multiple cases at a time with fcmp. x87 looks worse to me but the only thing I know about it is that I aggressively do not care about it. https://reviews.llvm.org/D143198	2023-07-07 17:00:10 -04:00
Jay Foad	fa78983bcb	[PEI][Mips] Switch to backwards frame index elimination This adds support for running PEI::replaceFrameIndicesBackward with no RegisterScavenger, and basic support for eliminating call frame pseudo instructions. Differential Revision: https://reviews.llvm.org/D154347	2023-07-07 18:30:08 +01:00
Jay Foad	4fd186d804	[PEI] Simplify iterator handling in replaceFrameIndicesBackward. NFCI. Differential Revision: https://reviews.llvm.org/D154346	2023-07-07 18:30:08 +01:00
Yashwant Singh	b7836d8562	[CodeGen] Allow targets to use target specific COPY instructions for live range splitting Replacing D143754. Right now the LiveRangeSplitting during register allocation uses TargetOpcode::COPY instruction for splitting. For AMDGPU target that creates a problem as we have both vector and scalar copies. Vector copies perform a copy over a vector register but only on the lanes (threads) that are active. This is mostly sufficient however we do run into cases when we have to copy the entire vector register and not just active lane data. One major place where we need that is live range splitting. Allowing targets to use their own copy instructions (if defined) will provide a lot of flexibility and ease to lower these pseudo instructions to correct MIR. - Introduce getTargetCopyOpcode() virtual function and use it to generate copy in Live range splitting. - Replace necessary MI.isCopy() checks with TII.isCopyInstr() in register allocator pipeline. Reviewed By: arsenm, cdevadas, kparzysz Differential Revision: https://reviews.llvm.org/D150388	2023-07-07 22:29:50 +05:30
Matt Arsenault	64df9573a7	DAG: Handle inversion of fcSubnormal | fcZero There are a number of more test combinations here that can be done together and reduce the number of instructions. https://reviews.llvm.org/D143191	2023-07-06 21:19:44 -04:00
Matt Arsenault	61820f8b5d	CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel currently doesn't attempt the inversion optimization. https://reviews.llvm.org/D143182	2023-07-06 13:03:57 -04:00
Matt Arsenault	1588e18b2d	DAG: Check isCondCodeLegal in is_fpclass expansion to fcmp eq 0 Results in some x86 codegen diffs. Some look better, some look worse. https://reviews.llvm.org/D152094	2023-07-06 13:00:52 -04:00
Matt Arsenault	e8ed6e35bd	DAG: Implement soften float for ffrexp Fixes #63661 https://reviews.llvm.org/D154555	2023-07-05 21:42:27 -04:00
Matt Arsenault	20964c901a	DAG: Fix dropping flags when widening unary vector ops	2023-07-05 17:25:24 -04:00
Oskar Wirga	198df5f682	Weaken MFI Max Call Frame Size Assertion A year ago when I was not invested at all into compilers, I found an assertion error when building an AArch64 debug build with LTO + CFI, among other combinations. It was posted as a github issue here: https://github.com/llvm/llvm-project/issues/54088 I took it upon myself to revisit the issue now that I have spent some more time working on LLVM. Reviewed By: MatzeB Differential Revision: https://reviews.llvm.org/D151276	2023-07-05 14:02:51 -07:00
Igor Kirillov	7f20407cee	[CodeGen] Add support for Splats in ComplexDeinterleaving pass This commit allows generating of complex number intrinsics for expressions with constants or loops invariants, which are represented as splats. For instance, after vectorizing loops in the following code snippets, the ComplexDeinterleaving pass will be able to generate complex number intrinsics: ``` complex<> x = ...; for (int i = 0; i < N; ++i) c[i] = a[i] * b[i] * x; ``` or ``` for (int i = 0; i < N; ++i) c[i] = a[i] * b[i] * (11.0 + 3.0i); ``` Differential Revision: https://reviews.llvm.org/D153355	2023-07-05 17:02:52 +00:00
Amaury Séchet	ee2d10cd16	[NFC] Reorder functions in DAGCombiner so all UADDO_CARRY related functions are next to each other.	2023-07-04 14:55:11 +00:00
Yashwant Singh	7aebe4eaaa	[CodeGen] Move lowerCopy from expandPostRA to TII This will allow targets to lower their 'copy' instructions easily. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D152261	2023-07-04 09:04:49 +05:30
Christudasan Devadasan	aa82b562b7	[CodeGen] MRI call back in TargetMachine It is needed for target specific initializations. Reviewed By: qcolombet Differential Revision: https://reviews.llvm.org/D143758	2023-07-03 21:29:37 +05:30
Igor Kirillov	b4f9c3a933	[CodeGen] Refactor ComplexDeinterleaving to run identification on Values instead of Instructions This change will make it easier to add identification of complex constants in future patches. Differential Revision: https://reviews.llvm.org/D153446	2023-07-03 10:35:14 +00:00
David Green	f55d96b9a2	[DAG][AArch64] Handle vector types when expanding sdiv/udiv into mulh The aarch64 backend will benefit from expanding 64vector sdiv/udiv into mulh using shift(mul(ext, ext)), as the larger type size is legal and the mul(ext, ext) can efficiently use smull/umull instructions. This extends the existing code in GetMULHS to handle vector types for it. Differential Revision: https://reviews.llvm.org/D154049	2023-07-02 15:02:52 +01:00
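
Commit f55d96b9a2 describes forming mulh as shift(mul(ext, ext)). The following is only a scalar sketch of that identity in Python, not the LLVM implementation: the helper names, the 32-bit width, and the divide-by-3 magic constant are illustrative assumptions chosen to show why a high multiply is enough to implement division by a constant.

```python
# Scalar model of mulh built from a widened multiply and a shift.
# All names here are hypothetical; the commit itself works on DAG vector nodes.
MASK32 = 0xFFFFFFFF


def mulhu32(a: int, b: int) -> int:
    """High 32 bits of the unsigned product: trunc(lshr(mul(zext a, zext b), 32))."""
    return ((a & MASK32) * (b & MASK32)) >> 32


def sext32(x: int) -> int:
    """Interpret the low 32 bits of x as a signed two's-complement value."""
    x &= MASK32
    return x - (1 << 32) if x & 0x80000000 else x


def mulhs32(a: int, b: int) -> int:
    """High 32 bits of the signed product: trunc(ashr(mul(sext a, sext b), 32))."""
    return ((sext32(a) * sext32(b)) >> 32) & MASK32


def udiv_by_3(x: int) -> int:
    """Unsigned 32-bit x / 3 as a high multiply plus shift.

    0xAAAAAAAB is ceil(2**33 / 3); the total shift is 32 (mulh) + 1.
    """
    return mulhu32(x, 0xAAAAAAAB) >> 1
```

On a target where the wider multiply is cheap, which the commit message says is the case for AArch64 via smull/umull, this widened form avoids a hardware divide.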

34332 Commits
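
Commit 61820f8b5d combines the fcZero and fcSubnormal tests into a single check that the exponent bits are 0. A minimal Python sketch of that idea for IEEE-754 binary64 follows; the function names and the use of `struct` to read the bit pattern are assumptions for illustration and are not taken from the LLVM sources.

```python
import struct

# Exponent field of an IEEE-754 binary64 value (11 bits below the sign bit).
EXP_MASK = 0x7FF0000000000000


def float_bits(x: float) -> int:
    """Return the raw 64-bit pattern of a Python float."""
    return struct.unpack("<Q", struct.pack("<d", x))[0]


def is_zero_or_subnormal(x: float) -> bool:
    """True for +/-0 and subnormals: exactly the values whose exponent field is 0."""
    return (float_bits(x) & EXP_MASK) == 0
```

The sign and mantissa bits are ignored, so one mask and one compare cover both classes; infinities and NaNs have an all-ones exponent and are rejected.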