llvm-project

Author	SHA1	Message	Date
Hervé Poussineau	6fa1647a47	[MC][Mips] Rename MipsMCAsmInfo to MipsELFMCAsmInfo (#112592 ) Also change MipsAsmPrinter::emitStartOfAsmFile to emit ELF-related sections only when using ELF output file format.	2024-11-01 08:42:34 +08:00
Vladimir Radosavljevic	401d123a1f	[MCP] Optimize copies when src is used during backward propagation (#111130 ) Before this patch, redundant COPY couldn't be removed for the following case: ``` $R0 = OP ... ... // Read of %R0 $R1 = COPY killed $R0 ``` This patch adds support for tracking the users of the source register during backward propagation, so that we can remove the redundant COPY in the above case and optimize it to: ``` $R1 = OP ... ... // Replace all uses of %R0 with $R1 ```	2024-10-23 13:37:02 +02:00
Alex Rønne Petersen	5785cbb405	[llvm] Ensure that soft float targets don't emit `fma()` libcalls. (#106615 ) The previous behavior could be harmful in some edge cases, such as emitting a call to `fma()` in the `fma()` implementation itself. Do this by just being more accurate in `isFMAFasterThanFMulAndFAdd()`. This was already done for PowerPC; this commit just extends that to Arm, z/Arch, and x86. MIPS and SPARC already got it right, but I added tests for them too, for good measure. Note: I don't have commit access.	2024-10-19 06:13:15 -07:00
Alex Rønne Petersen	ad4a582fd9	[llvm] Consistently respect `naked` fn attribute in `TargetFrameLowering::hasFP()` (#106014 ) Some targets (e.g. PPC and Hexagon) already did this. I think it's best to do this consistently so that frontend authors don't run into inconsistent results when they emit `naked` functions. For example, in Zig, we had to change our emit code to also set `frame-pointer=none` to get reliable results across targets. Note: I don't have commit access.	2024-10-18 09:35:42 +04:00
Nikita Popov	9f81acf4ef	[Mips] Regenerate test checks (NFC) Some of these check lines are insufficient to determine correctness. Generate full check lines instead. To reduce noise, add nounwind and use static relocation model.	2024-10-01 14:49:14 +02:00
yingopq	debc325bb1	[MIPS] Fix failing to legalize load+call with vector of non-p2 integer (#109625 ) Add a condition to check whether the vector element type is a power of 2. Fixes #102870.	2024-09-24 09:38:38 +02:00
yingopq	677177bb60	[Mips] Fix mfhi/mflo hazard miscompilation about div and mult (#91449 ) Fix issue1: In mips1-4, require a minimum of 2 instructions between a mflo/mfhi and the next mul/dmult/div/ddiv/divu/ddivu instruction. Fix issue2: In mips1-4, should not put mflo into the delay slot for the return. Fix https://github.com/llvm/llvm-project/issues/81291	2024-09-23 19:07:13 +08:00
futog	3e0a76b1fd	[Codegen][LegalizeIntegerTypes] Improve shift through stack (#96151 ) Minor improvement on cc39c3b17fb2598e20ca0854f9fe6d69169d85c7. Use an aligned stack slot to store the shifted value. Use the native register width as shifting unit, so the load of the shift result is aligned. If the shift amount is a multiple of the native register width, there is no need to do a follow-up shift after the load. I added new tests for these cases. Co-authored-by: Gergely Futo <gergely.futo@hightec-rt.com>	2024-09-23 11:45:43 +02:00
yingopq	72cacf1d99	[MIPS] Fix -msingle-float doesn't work with double on O32 (#107543 ) Skip the following function 'CustomLowerNode' when the operand has already gone through `SoftenFloatResult`. Fix #93052	2024-09-20 07:37:18 +08:00
anbbna	b847076f55	[Mips] Add test file for 'xor' and 'and' instructions (#106679 ) Part of #99783 This test is meant to reflect the oncoming change as this test shows the unoptimized result with unnecessary SLLs.	2024-09-20 07:34:38 +08:00
yingopq	1ad84d7961	[Mips] Optimize `or (and $src1, mask), (shl $src2, shift)` to `ins` (#103017 ) Optimize `$dst = or (and $src1, (2**size0 - 1)), (shl $src2, size0)` to `ins $src1, $src2, pos, size`, where `pos = size0, size = 32 - pos`. Fix #90325	2024-09-13 00:05:54 +08:00
Alex Rønne Petersen	c0b3e491cc	[llvm][Mips] Bail on underaligned loads/stores in FastISel. (#106231 ) We encountered this problem in Zig, causing all of our `mips(el)-linux-gnueabi*` tests to fail: https://github.com/ziglang/zig/issues/21215 For these unusual cases, let's just bail in `MipsFastISel` since `MipsTargetLowering` can handle them fine. Note: I don't have commit access.	2024-09-12 22:10:19 +08:00
YunQiang Su	c641b611f8	MIPSr6: Add llvm.is.fpclass intrinsic support (#107857 ) MIPSr6 has class.s/class.d instructions. Let's use them for the llvm.is.fpclass intrinsic.	2024-09-11 09:37:12 +08:00
YunQiang Su	1e153461c6	MIPS: Add fcanonicalize for pre-R6 (#104554 ) MIPSr6 has max.s/max.d/min.s/min.d instructions, which can be used as fcanonicalize. For pre-R6, we have no instructions that can fcanonicalize a float, so let's use `fadd Y,X,X` to quiet it if it is NaN. IEEE754-2008 requires that the result of a general-computational or quiet-computational operation not be a signaling NaN.	2024-08-27 17:13:46 +08:00
Craig Topper	ebe7265b14	[Mips] Fix fast isel for i16 bswap. (#103398 ) We need to mask the SRL result to 8 bits before ORing in the SLL. This is needed in case bits 23:16 of the input aren't zero. They will have been shifted into bits 15:8. We don't need to AND the result with 0xffff. It's ok if the upper 16 bits of the register are garbage. Fixes #103035.	2024-08-16 14:54:51 -07:00
YunQiang Su	fb9e685fc4	Intrinsic: introduce minimumnum and maximumnum for IR and SelectionDAG (#96649 ) C23 introduced new functions fminimum_num and fmaximum_num, and they follow the minimumNumber and maximumNumber of IEEE754-2019. Let's introduce new intrinsics to support them. This patch introduces support only for scalar values. Support for vector (vp, vp.reduce, vector.reduce) and experimental.constrained will be added in future patches. With this patch, MIPSr6 and LoongArch can work out of the box with fcanonical and fmax/fmin. Aarch64/PowerPC64 can use the same logic as MIPSr6 and LoongArch, but they have no fcanonical support yet. I will add it in future patches. The FMIN/FMAX of RISC-V instructions follows the minimumNumber/maximumNumber of IEEE754-2019. We can just add it in a future patch. Background https://discourse.llvm.org/t/rfc-fix-llvm-min-f-and-llvm-max-f-intrinsics/79735 Currently we have fminnum/fmaxnum, which behave differently on different platforms for NUM vs sNaN: 1) Fallback to fmin(3)/fmax(3): return qNaN. 2) ARM64/ARM32+Neon: same as libc. 3) MIPSr6/LoongArch/RISC-V: return NUM. The fix of fminnum/fmaxnum to follow minNUM/maxNUM of IEEE754-2008 will be submitted as separate patches.	2024-08-15 14:09:36 +08:00
Craig Topper	abc1acf8df	[TargetLowering][AMDGPU][ARM][RISCV][X86] Teach SimplifyDemandedBits to combine (srl (sra X, C1), ShAmt) -> sra(X, C1+ShAmt) (#101751 ) If the upper bits of the shr aren't demanded. This helps with cases where the outer srl was originally an sra and was converted to a srl by SimplifyDemandedBits before it had a chance to combine with the inner sra. This can occur when the inner sra was part of a sign_extend_inreg expansion. There are some regressions in ARM and Thumb2.	2024-08-14 08:44:57 -07:00
Craig Topper	91c3a718b2	[Mips] ISel zext nneg the same as sext for Mips64. (#102852 ) Fixes #62587.	2024-08-12 13:47:27 -07:00
yingopq	e711a0c80f	[MIPS] Fix missing ANDI optimization (#97689 ) 1. Add MipsPat to optimize (andi (srl (truncate i64 $1), x), y) to (andi (truncate (dsrl i64 $1, x)), y). 2. Add MipsPat to optimize (ext (truncate i64 $1), x, y) to (truncate (dext i64 $1, x, y)). The assembly result is the same as gcc. Fixes https://github.com/llvm/llvm-project/issues/42826	2024-08-09 18:55:21 +01:00
yingopq	5fb20024e2	[Mips] Add test for AND optimization (#102278 ) See https://github.com/llvm/llvm-project/issues/42826	2024-08-07 20:55:13 +01:00
Nikita Popov	f2f18459d4	Revert "Intrinsic: introduce minimumnum and maximumnum (#93841 )" As far as I can tell, this pull request was not approved, and did not go through an RFC on discourse. This reverts commit 89881480030f48f83af668175b70a9798edca2fb. This reverts commit 225d8fc8eb24fb797154c1ef6dcbe5ba033142da.	2024-06-21 08:34:04 +02:00
YunQiang Su	8988148003	Intrinsic: introduce minimumnum and maximumnum (#93841 ) Currently, the behavior of llvm.minnum differs across platforms if one operand is sNaN. When we compare sNaN vs NUM: ARM/AArch64/PowerPC: follow the IEEE754-2008's minNUM: return qNaN. RISC-V/Hexagon follow the IEEE754-2019's minimumNumber: return NUM. X86: Returns NUM but not the same as IEEE754-2019's minimumNumber as +0.0 is not always greater than -0.0. MIPS/LoongArch/Generic: return NUM. LIBCALL: returns qNaN. So, let's introduce llvm.minimumnum/llvm.maximumnum, which always follow IEEE754-2019's minimumNumber/maximumNumber. Half-fix: #93033	2024-06-21 11:53:08 +08:00
Thorsten Schütt	b1f9440fa9	[GlobalIsel] Import GEP flags (#93850 ) https://github.com/llvm/llvm-project/pull/90824	2024-06-14 20:56:43 +02:00
Nikita Popov	deab451e7a	[IR] Remove support for icmp and fcmp constant expressions (#93038 ) Remove support for the icmp and fcmp constant expressions. This is part of: https://discourse.llvm.org/t/rfc-remove-most-constant-expressions/63179 As usual, many of the updated tests will no longer test what they were originally intended to -- this is hard to preserve when constant expressions get removed, and in many cases just impossible as the existence of a specific kind of constant expression was the cause of the issue in the first place.	2024-06-04 08:31:03 +02:00
paperchalice	9b0e1c2ca2	[NewPM][CodeGen] Port `finalize-isel` to new pass manager (#94214 ) It should preserve more analysis results, but it happens immediately after instruction selection.	2024-06-04 09:23:52 +08:00
Sergei Barannikov	3fee8b3469	[GISel] LegalizationArtifactCombiner: Elide redundant G_SEXT_INREG (#93687 ) This is similar to 373c343a, but for targets with zero-or-negative-one booleans. The difference in tests is mostly due to G_SEXT_INREG being illegal for some targets, in which case it gets expanded into G_SHL/G_ASHR pair, which is not currently optimized by the combiner.	2024-05-30 12:40:42 +03:00
YunQiang Su	0bf181eb34	MIPS: Fix llvm.{min,max}num for R6 (#93125 ) MIPS max.fmt/min.fmt instructions are IEEE754-2008 compatible. If either argument is sNaN, the result will be NaN. So we define fminnum_ieee instead of fminnum in Mips32r6InstrInfo.td. We should also define fcanonicalize, so that fminnum can be expanded to fcanonicalize and fminnum_ieee.	2024-05-23 22:27:17 +08:00
YunQiang Su	eac743d1b0	MIPS: Support '%w' token in inline asm template for MSA (#91920 ) MSA registers share the FPRs as their bottom half, so we can use MSA instructions to work with normal float/double: double a, b, c; asm volatile ("fmadd.d %w0, %w1, %w2" : "+f"(a) : "f"(b), "f"(c)); GCC has supported it for quite a long time.	2024-05-20 14:46:47 +08:00
YunQiang Su	8f21294897	MIPS: Use pcrel|sdata4 for eh_frame (#91291 ) Gas uses encoding DW_EH_PE_absptr for PIC, and gnu ld converts it to DW_EH_PE_sdata4|DW_EH_PE_pcrel. LLD doesn't have this workaround, and thus complains ``` relocation R_MIPS_32 cannot be used against local symbol; recompile with -fPIC relocation R_MIPS_64 cannot be used against local symbol; recompile with -fPIC ``` So, let's generate asm/obj files with `DW_EH_PE_sdata4|DW_EH_PE_pcrel` encoding. In fact, GNU ld supports such OBJs well. For N64, maybe we should use sdata8, but GNU ld doesn't support it well, and in fact sdata4 is enough for now. So we just ignore the `Large` for `MCObjectFileInfo::initELFMCObjectFileInfo`. Maybe we should switch back to sdata8 once GNU LD supports it well. Fixes: #58377.	2024-05-08 17:30:14 +08:00
Cinhi Young	715219482b	[MIPS] match llvm.{min,max}num with {min,max}.fmt for R6 (#89021 ) - The behavior is similar to UCOMISD on x86, which is also used to compare two fp values, specifically on handling of NaNs. - Update related tests regarding this change. - The further goal is to implement `llvm.minimum` and `llvm.maximum` intrinsics for MIPS R6 and Pre-R6. Part of https://github.com/llvm/llvm-project/issues/64207	2024-04-27 15:53:02 +08:00
yingopq	e1aa16299f	[Mips] Use ANDi for zero-extend in subword atomic umax/umin for both r2 and pre-R2 (#89881 ) For unsigned max/min, ANDi is available in all ISA revisions to zero-extend before the slt instruction, so we can save one instruction.	2024-04-24 22:31:51 +08:00
YunQiang Su	758d97dce0	[MIPS]: Rework atomic max/min expand for subword (#89575 ) The current code is quite buggy: it works in only a few cases. The problems include: 1. ll/sc works on a whole word, while the parts other than the one we rmw are dropped. 2. The operands are not properly zero-extended for unsigned ops. 3. It doesn't work for big-endian, as the position of the subword differs from little-endian. And in fact, we can set the return value correctly within the ll/sc scope, so we can skip the sinkMBB.	2024-04-23 02:08:12 +08:00
Shilei Tian	3a106e5b2c	[GlobalISel] Fold G_ICMP if possible (#86357 ) This patch tries to fold `G_ICMP` if possible.	2024-03-29 15:59:50 -04:00
Wang Pengcheng	610b9e23c5	[SDAG] Use shifts if ISD::MUL is illegal when lowering ISD::CTPOP (#86505 ) We can avoid libcalls. Fixes #86205	2024-03-29 15:38:39 +08:00
Simon Pilgrim	5b544b511c	[Mips] ctpop.mir - regenerate checks to improve codegen diff in #86505	2024-03-26 10:43:29 +00:00
yingopq	5d7fd6a04a	[Mips] Restore wrong deletion of instruction 'and' in unsigned min/max processing. (#85902 ) Fix #61881	2024-03-24 02:35:42 -04:00
Evgenii Kudriashov	d365a45cb3	[GlobalISel] Introduce G_TRAP, G_DEBUGTRAP, G_UBSANTRAP (#84941 ) Here we introduce three new GMIR instructions to cover a set of trap intrinsics. The idea behind it is that generic intrinsics shouldn't be used with G_INTRINSIC opcode. These new instructions can match perfectly with existing trap ISD nodes. It allows X86, AArch64, RISCV and Mips to reuse SelectionDAG patterns for selection and avoid manual selection. However AMDGPU is an exception. It selects traps during legalization regardless of SelectionDAG or GlobalISel. Since there are not many places where traps are used, this change attempts to clean up all the usages of G_INTRINSIC with trap intrinsics. So, there is no stage when both G_TRAP and G_INTRINSIC_W_SIDE_EFFECTS(@llvm.trap) are allowed.	2024-03-23 13:12:44 +01:00
YunQiang Su	d7e28cd82b	MIPS: Support -m(no-)unaligned-access for r6 (#85174 ) The MIPSr6 ISA requires normal load/store instructions to support misaligned memory access, although hardware does not always provide this. On some microarchitectures or in some corner cases it may need support from the OS. Don't confuse this with pre-R6's lwl/lwr family: MIPSr6 doesn't support them; instead, r6 requires the lw instruction to support misaligned memory access. So, if -mstrict-align is used for pre-R6, lwl/lwr won't be disabled. If -mstrict-align is used for r6 and the access is not well aligned, some lb/lh instructions will be used to replace lw. This is useful for OS kernels. To be backward-compatible with GCC, -m(no-)unaligned-access are also added as Neg-Alias of -m(no-)strict-align.	2024-03-20 14:18:24 +08:00
Jonas Paulsson	09bc6abba6	[MachineFrameInfo] Refactoring around computeMaxcallFrameSize() (NFC) (#78001 ) - Use computeMaxCallFrameSize() in PEI::calculateCallFrameInfo() instead of duplicating the code. - Set AdjustsStack in FinalizeISel instead of in computeMaxCallFrameSize().	2024-03-18 10:37:59 -04:00
yingopq	755b439694	[Mips] Fix missing sign extension in expansion of sub-word atomic max (#77072 ) Add sign extension "SEB/SEH" before compare. Fix #61881	2024-03-08 15:41:31 -05:00
YunQiang Su	c88beb4112	MIPS: Fix asm constraints "f" and "r" for softfloat (#79116 ) This includes 2 fixes: 1. Disallow 'f' for softfloat. 2. Allow 'r' for softfloat. Currently, 'f' is accepted by clang, then LLVM hits an internal error. 'r' is rejected by LLVM with: couldn't allocate input reg for constraint 'r'. Fixes: #64241, #63632 --------- Co-authored-by: Fangrui Song <i@maskray.me>	2024-02-26 22:08:36 -08:00
yingopq	96abee5eef	[Mips] Fix unable to handle inline assembly ends with compat-branch o… (#77291 ) …n MIPS Modify: Add a global variable 'CurForbiddenSlotAttr' to save the current instruction's forbidden slot and whether reorder is set. This is the condition for deciding whether to add a nop. We would add a pair of '.set noreorder' and '.set reorder' to wrap the current instruction and the next instruction. Then we can get the previous instruction's forbidden slot attribute and whether reorder is set from 'CurForbiddenSlotAttr'. If the previous instruction has a forbidden slot, .set reorder is active, and the current instruction is a CTI, then emit a NOP after it. Fix https://github.com/llvm/llvm-project/issues/61045. Because https://reviews.llvm.org/D158589 remained in the 'Needs Review' state without conclusion, we submitted the pull request again.	2024-02-24 15:13:43 +08:00
YunQiang Su	c007fbb198	MipsAsmParser/O32: Don't add redundant $ to $-prefixed symbol in the la macro (#80644 ) When parsing the `la` macro, we add a duplicate `$` prefix in `getOrCreateSymbol`, leading to `error: Undefined temporary symbol $$yy` for code like: ``` xx: la $2,$yy $yy: nop ``` Remove the duplicate prefix. In addition, recognize `.L`-prefixed symbols as local for O32. See: #65020. --------- Co-authored-by: Fangrui Song <i@maskray.me>	2024-02-14 12:48:55 -08:00
darkbuck	d0f4663f48	[GlobalISel][Mips] Global ISel for `brcond` - Enable equivalent between `brcond` and `G_BRCOND`. - Remove the manual selection of `G_BRCOND` in Mips. Revise test cases. Reviewers: petar-avramovic, bcardosolopes, arsenm Reviewed By: arsenm Pull Request: https://github.com/llvm/llvm-project/pull/81306	2024-02-10 21:44:05 -05:00
Fangrui Song	6b2fd7aed6	[MIPS] Use generic isBlockOnlyReachableByFallthrough (#80799 ) FastISel may create a redundant BGTZ terminator which falls through. ``` BGTZ %2:gpr32, %bb.1, implicit-def $at bb.1.bb1: ; predecessors: %bb.0 ``` The `!I->isBarrier()` check in MipsAsmPrinter::isBlockOnlyReachableByFallthrough will incorrectly not print a label, leading to an `Undefined temporary symbol ` error when we try assembling the output assembly file. See the updated `Fast-ISel/pr40325.ll` and https://github.com/rust-lang/rust/issues/108835 In addition, the `SwitchInst` condition is too conservative and prints many unneeded labels (see the updated tests). Just use the generic isBlockOnlyReachableByFallthrough, updated by commit 1995b9fead62f2f6c0ad217bd00ce3184f741fdb for SPARC, which also handles MIPS.	2024-02-06 09:23:33 -08:00
Nikita Popov	ff9af4c43a	[CodeGen] Convert tests to opaque pointers (NFC)	2024-02-05 14:07:09 +01:00
Quentin Dian	112fba974c	[MIRPrinter] Don't print line break when there is no instructions (NFC) (#80147 ) Per #80143, we can remove the extra line break when there is no instruction.	2024-02-01 22:10:52 +08:00
Fangrui Song	f972e4d343	[MC,ELF] .section: unconditionally print section flag 'G' after 'o' * Placing 'G' before 'M' (SHF_MERGE) can be misleading as the sh_entsize argument goes before the section group name, if a reader doesn't know that the order of extra arguments is not affected by the order of flags. * 'a', 'w', and 'x' indicate basic permission-related flags. Separating them with 'G' is kinda ugly. Simplify code and move 'G' after 'o'. The new output is more similar to GCC.	2024-01-09 10:48:23 -08:00
Fangrui Song	7620f03ef7	[MC] Parse SHF_LINK_ORDER argument before section group name (#77407 ) When both SHF_LINK_ORDER | SHF_GROUP flags are set, GNU assembler from 2.35 onwards (https://sourceware.org/PR25381 https://sourceware.org/binutils/docs/as/Section.html) parses the SHF_LINK_ORDER argument before section group name, different from us. This is unfortunate, but does not matter because the `.section` flag `o` is a niche feature only used by compiler instrumentations, not adopted by hand-written assembly, and using both flags is extremely rare. Let's just match GNU assembler. There is another benefit: we now support zero-flag section group with the SHF_LINK_ORDER flag, while previously there was no syntax for it. While here, print 'G' after 'o' to be clear that the 'G' argument is parsed after the 'o' argument. To make the diff smaller, we don't print 'G' after 'w' in the absence of 'o' for now.	2024-01-09 10:42:34 -08:00
yingopq	e13e95bc44	[Mips] Optimize (shift x (and y, BitWidth - 1)) to (shift x, y) (#73889 ) Turn x >> (shift & 31/63) into a single srlv instead of andi + srlv, since the MIPS variable shift instruction already implicitly masks the shift amount, like x86, wasm and AMDGPU. Copy the X86DAGToDAGISel::isUnneededShiftMask() function to MIPS to check whether the two instructions can be combined into one.	2023-12-29 14:53:55 +05:30
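
The shift-mask fold described in e13e95bc44 can be checked with a small model. This is only a sketch, not code from the patch: `srlv32`, `shift_with_explicit_mask` and `shift_without_mask` are hypothetical helper names, and the model assumes (as the commit message states) that the 32-bit variable shift reads only the low 5 bits of the shift amount.

```python
MASK32 = 0xFFFFFFFF


def srlv32(x, amount):
    """Model of a 32-bit variable logical right shift.

    Assumption taken from the commit message: the instruction itself
    only looks at the low 5 bits of the shift amount.
    """
    return (x & MASK32) >> (amount & 31)


def shift_with_explicit_mask(x, amount):
    """andi + srlv: mask the amount first, then shift."""
    masked = amount & 31  # the separate andi
    return srlv32(x, masked)


def shift_without_mask(x, amount):
    """srlv alone: rely on the implicit masking of the shift amount."""
    return srlv32(x, amount)


# Exhaustive over a range of amounts for a few sample values:
# the explicit mask never changes the result, so it is redundant.
for x in (0, 1, 0x80000000, 0xDEADBEEF, MASK32):
    for amount in range(256):
        assert shift_with_explicit_mask(x, amount) == shift_without_mask(x, amount)
```

The same argument applies to 64-bit shifts with a mask of 63; only the constants change.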


1786 Commits
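
As an illustration of the i16 bswap fix in ebe7265b14 listed above, the following sketch models the two instruction sequences on a 32-bit register. It is a reconstruction from the commit message, not the FastISel code itself: the function names are hypothetical, and the "old" sequence (OR first, then AND with 0xffff) is an assumption inferred from the description of what the fix changes.

```python
MASK32 = 0xFFFFFFFF


def bswap16_old(reg):
    """Assumed pre-fix sequence: sll, srl, or, then andi 0xffff.

    Bits 23:16 of the input land in bits 15:8 after the srl and
    corrupt the result when they are not zero.
    """
    sll = (reg << 8) & MASK32
    srl = (reg & MASK32) >> 8
    return (sll | srl) & 0xFFFF


def bswap16_fixed(reg):
    """Post-fix sequence: mask the srl result to 8 bits before the or.

    The upper 16 bits of the returned register may hold garbage,
    which the commit message says is acceptable for an i16 value.
    """
    sll = (reg << 8) & MASK32
    srl = ((reg & MASK32) >> 8) & 0xFF
    return sll | srl


def reference_bswap16(value):
    """Plain 16-bit byte swap used as the expected answer."""
    value &= 0xFFFF
    return ((value << 8) | (value >> 8)) & 0xFFFF


# Register holding the i16 value 0x1234 with garbage (0xAB) in bits 23:16.
reg = 0x00AB1234
assert bswap16_old(reg) != reference_bswap16(reg)
assert bswap16_fixed(reg) & 0xFFFF == reference_bswap16(reg)
```

Only the low 16 bits of `bswap16_fixed` are compared, mirroring the commit's point that the final AND with 0xffff is unnecessary.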