llvm-project

Author	SHA1	Message	Date
Craig Topper	e837ef91e3	[RISCV][GISel] Re-generate legalize-vastart-rv32.mir and legalize-vastart-rv64.mir to fix buildbot failure. NFC I must have messed something up when addressing feedback on the patch that added these tests.	2023-12-08 13:08:46 -08:00
Maryam Moghadas	8f6f5ec776	[PowerPC] Move __ehinfo TOC entries to the end of the TOC section (#73586 ) On AIX, the __ehinfo toc-entry is never referenced directly using instructions, therefore we can allocate them with the TE storage mapping class to move them to the end of TOC.	2023-12-08 15:03:11 -05:00
Michael Maitland	e8dbed097a	[RISCV][GISEL] Fix RUN lines in vararg.ll The `< %s` needed to be removed. This change fixes the test introduced in 02379d19147afda413a2bc757e8d2f5249d772d1	2023-12-08 11:56:55 -08:00
Michael Maitland	02379d1914	[RISCV][GISEL] Add vararg.ll LLVM IR -> ASM test This test is added to be the counterpart of the SelectionDAG llvm/test/CodeGen/RISCV/vararg.ll test. Minor changes were made compared to the other version, all of which are commented in the test file added in this commit.	2023-12-08 11:25:54 -08:00
Arthur Eubanks	687e63a2bd	[X86] Allow accessing large globals in small code model (#74785 ) This removes some assumptions that the small code model will only reference "near" globals. There are still some missing optimizations and wrong code sequences, but I'd like to address those separately. This will require auditing any checks of the code model in the X86 backend.	2023-12-08 11:09:54 -08:00
Craig Topper	478d093e1b	[RISCV][GISel] Reverse the operands the buildStore created in legalizeVAStart. (#73989 ) We need to store the frame index to the location pointed to by the VASTART, not the other way around.	2023-12-08 10:45:53 -08:00
Michael Maitland	3a38baa0e7	[GISEL][RISCV] Legalize llvm.vacopy intrinsic (#73066 ) In the future, we can consider adding a G_VACOPY opcode instead of going through the GIntrinsic for all targets. We do the approach in this patch because that is what other targets do today.	2023-12-08 13:45:32 -05:00
Michael Maitland	6f9cb9a75c	[RISCV][GISEL] Legalize G_VAARG through expansion. (#73065 ) G_VAARG can be expanded similar to SelectionDAG::expandVAArg through LegalizerHelper::lower. This patch implements the lowering through this style of expansion. The expansion gets the head of the va_list by loading the pointer to va_list. Then, the head of the list is adjusted depending on argument alignment information. This gives a pointer to the element to be read out of the va_list. Next, the head of the va_list is bumped to the next element in the list. The new head of the list is stored back to the original pointer to the head of the va_list so that subsequent G_VAARG instructions get the next element in the list. Lastly, the element is loaded from the alignment-adjusted pointer constructed earlier. This change is stacked on #73062.	2023-12-08 13:24:27 -05:00
Jonas Paulsson	435ba72afd	[SystemZ] Simplify handling of AtomicRMW instructions. (#74789 ) Let the AtomicExpand pass do more of the job of expanding AtomicRMWInst:s in order to simplify the handling in the backend. The only cases that the backend needs to handle itself are those of subword size (8/16 bits) and those directly corresponding to a target instruction.	2023-12-08 17:19:17 +01:00
Benjamin Kramer	06ebe3b237	[NVPTX] Fix a typo that makes the output invalid PTX It's surprisingly tricky to trigger this as it's only used by abs/neg which expand into and/xor in the integer domain.	2023-12-08 14:22:07 +01:00
Jay Foad	e38c29c2b7	[AMDGPU] Add GFX11 test coverage to integer-mad-patterns.ll	2023-12-08 13:06:03 +00:00
Saiyedul Islam	5c4c199fe3	[AMDGPU][NFC] Improve testing for AMDHSA ABI Version (#74300 ) Add tests for COV4 as well as COV5 instead of only testing for the default version.	2023-12-08 18:09:45 +05:30
Simon Pilgrim	5f91335a55	[X86] canonicalizeBitSelect - always use VPTERNLOGD for sub-32bit types We were using VPTERNLOGQ for everything but i32 types, which made broadcasts wider than necessary Noticed in #73509	2023-12-08 11:38:32 +00:00
Simon Pilgrim	faecc736e2	[DAG] isSplatValue - node is a splat if all demanded elts have the same whole constant value (#74443 )	2023-12-08 10:53:51 +00:00
Simon Pilgrim	8859a4f630	[X86] LowerBUILD_VECTOR - don't use insert_element(constant, elt, idx) if we have a freeze(undef) element Fixes #74736	2023-12-08 10:28:56 +00:00
Valery Pykhtin	901c5be524	[AMDGPU] Fix GCNUpwardRPTracker: max register pressure on defs. (#74422 ) Treat a defined register as fully live "at" the instruction and update maximum pressure accordingly. Fixes #3786.	2023-12-08 11:27:08 +01:00
wanglei	cdc3732566	[LoongArch] Mark ISD::FNEG as legal	2023-12-08 15:07:58 +08:00
wanglei	9f70e708a7	[LoongArch] Make ISD::FSQRT a legal operation with lsx/lasx feature (#74795 ) And add some patterns: 1. (fdiv 1.0, vector) 2. (fdiv 1.0, (fsqrt vector))	2023-12-08 14:16:26 +08:00
Philip Reames	ffb2af3ed6	[SCEVExpander] Attempt to reinfer flags dropped due to CSE (#72431 ) LSR uses SCEVExpander to generate induction formulas. The expander internally tries to reuse existing IR expressions. To do that, it needs to strip any poison generating flags (nsw, nuw, exact, nneg, etc.) which may not be valid for the newly added users. This is conservatively correct, but has the effect that LSR will strip nneg flags on zext instructions involved in trip counts in loop preheaders. To avoid this, this patch adjusts the expander to reinfer the flags on the CSE candidate if legal for all possible users. This should fix the regression reported in https://github.com/llvm/llvm-project/issues/71200. This should arguably be done inside canReuseInstruction instead, but doing it outside is more conservative compile time wise. Both canReuseInstruction and isGuaranteedNotToBePoison walk operand lists, so right now we are performing work which is roughly O(N^2) in the size of the operand graph. We should fix that before making the per operand step more expensive. My tentative plan is to land this, and then rework the code to sink the logic into more core interfaces.	2023-12-07 13:20:36 -08:00
Craig Topper	e87f33d9ce	[RISCV][MC] Pass MCSubtargetInfo down to shouldForceRelocation and evaluateTargetFixup. (#73721 ) Instead of using the STI stored in RISCVAsmBackend, try to get it from the MCFragment. This addresses the issue raised here https://discourse.llvm.org/t/possible-problem-related-to-subtarget-usage/75283	2023-12-07 13:17:58 -08:00
Natalie Chouinard	6c6f8b1acd	[SPIR-V] Fixup tests (#73371 ) These tests are currently failing at tip-of-tree, but pass with minor FileCheck updates that look reasonable.	2023-12-07 15:23:27 -05:00
Stefan Pintilie	ea8b95d0d5	[PowerPC] Add a set of extended mnemonics that are missing from Power 10. (#73003 ) This patch adds the majority of the missing extended mnemonics that were introduced in Power 10. The only extended mnemonics that were not added are related to the plq and pstq instructions. These will be added in a separate patch as the instructions themselves would also have to be added.	2023-12-07 13:40:00 -05:00
David Green	e3720bbc08	[AArch64] Extend and cleanup vector icmp test cases. NFC	2023-12-07 18:39:33 +00:00
Simon Pilgrim	f1200ca7ac	[DAG] visitEXTRACT_VECTOR_ELT - constant fold legal fp imm values (#74304 ) If we're extracting a constant floating point value, and the constant is a legal fp imm value, then replace the extraction with a fp constant.	2023-12-07 14:56:12 +00:00
Simon Pilgrim	5384fb3d40	[X86] gep-expanded-vector.ll - regenerate checks	2023-12-07 14:07:10 +00:00
wanglei	9ff7d0ebeb	[LoongArch] Add codegen support for icmp/fcmp with lsx/lasx features (#74700 ) Mark ISD::SETCC node as legal, and add handling for the vector types condition codes.	2023-12-07 20:11:43 +08:00
Harald van Dijk	03edfe6148	Implement SoftPromoteHalf for FFREXP. (#74076 ) `llvm/test/CodeGen/RISCV/llvm.frexp.ll` and `llvm/test/CodeGen/X86/llvm.frexp.ll` contain a number of disabled tests for unimplemented functionality. This implements one missing part of it.	2023-12-07 11:10:17 +00:00
Simon Pilgrim	22df0886a1	[DAG] Don't split f64 constant stores if the fp imm is legal (#74622 ) If the target can generate a specific fp immediate constant, then don't split the store into 2 x i32 stores Another cleanup step for #74304	2023-12-07 10:33:03 +00:00
Sjoerd Meijer	3acbd38492	[AArch64] Optimise MOVI + CMGT to CMGE (#74499 ) This fixes a regression that occurred for a pattern of MOVI + CMGT instructions, which can be optimised to CMGE. I.e., when the signed greater than compare has -1 as an operand, we can rewrite that as a compare greater equal than 0, which is what CMGE does. Fixes #61836	2023-12-07 08:32:02 +00:00
Fangrui Song	39ba027f4e	[RISCV,test] Test whether MCAssembler uses function target-features Test https://discourse.llvm.org/t/possible-problem-related-to-subtarget-usage/75283 The test is similar to ARM/relax-per-target-feature.ll in spirit.	2023-12-07 00:23:42 -08:00
Chen Zheng	4b932d84f4	[PowerPC] redesign the target flags (#69695 ) 12 bits are not enough for PPC's target-specific flags. With 8 bits for the bitmask flags and 4 bits for the direct mask, PPC can have a total of 16 direct masks and 8 bitmasks. That is not enough for PPC, see this issue in https://github.com/llvm/llvm-project/pull/66316 Redesign how the PPC target sets the target-specific flags. With this patch, all PPC target flags are direct flags. There are no bitmask flags in PPC anymore. This patch aligns with some targets like X86, which also have many target-specific flags. The patch also fixes a bug related to the flags `MO_TLSGDM_FLAG` and `MO_LO`. They have the same value, and the test case changes in this PR show the bug.	2023-12-07 12:47:25 +08:00
Craig Topper	b310932f87	[RISCV] Add vmv.x.s to RISCVOptWInstrs. (#74519 ) This instruction produces a 32-bit sign extended value if the SEW is less than or equal to 32.	2023-12-06 17:06:56 -08:00
Thurston Dang	69c4930aad	Revert "Reapply "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG"" This reverts commit 1f283a60a4bb896fa2d37ce00a3018924be82b9f. Reason: breaks MSan buildbot (https://lab.llvm.org/buildbot/#/builders/74/builds/24077)	2023-12-06 19:27:21 +00:00
paperchalice	5baf66f3c2	[CodeGen] Port WasmEHPrepare to new pass manager (#74435 ) Port `WasmEHPrepare` to new pass manager, also rename `wasmehprepare` to `wasm-eh-prepare`.	2023-12-06 11:11:00 -08:00
Artem Belevich	a2d3bb1fa9	Revert "[NVPTX] Lower 16xi8 and 8xi8 stores efficiently (#73646 )" (#74518 ) This reverts commit 173fcf7da592acd284dc50749558fd36928861f0. We need to constrain the optimization to properly aligned loads/stores only. https://github.com/llvm/llvm-project/pull/73646#issuecomment-1841454559	2023-12-06 10:48:43 -08:00
Matthew Devereau	8186e1500b	[SME2] Add LUTI2 and LUTI4 single Builtins and Intrinsics (#73304 ) See https://github.com/ARM-software/acle/pull/217 Patch by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>	2023-12-06 16:35:56 +00:00
Matt Arsenault	1f283a60a4	Reapply "RegisterCoalescer: Add implicit-def of super register when coalescing SUBREG_TO_REG" This reverts commit 9e50c6e6b5741895f58f3e530004052844b6af9f. A few assertion and verifier errors have been fixed in the coalescer and allocator, so hopefully this sticks this time.	2023-12-06 23:07:22 +07:00
Matt Arsenault	546a9ce80c	CodeGen: Fix bypassing legality checks for IMPLICIT_DEF rematerialization (#73934 ) It's permitted to have extra implicit-def operands of the same main register after the main register def. If there are implicit operands, use the standard legality checks which verify the operand contents. Depends #73933	2023-12-06 21:43:19 +07:00
Shengchen Kan	b8bc2351b8	[X86][test] Simplify test avx512-broadcast-unfold.ll (#74593 ) The test was updated by opt -passes=early-cse -S llvm/test/CodeGen/X86/avx512-broadcast-unfold.ll	2023-12-06 22:38:32 +08:00
Simon Pilgrim	56eb3e738a	[X86] Set x87 fld1/fldz pseudo instructions as rematerializable (#74592 ) No need to generate/spill/restore to cpu stack Cleanup work to allow us to properly use isFPImmLegal and fix some regressions encountered while looking at #74304	2023-12-06 14:36:42 +00:00
Matthew Devereau	30faf19a88	[SME2] Add LUTI2 and LUTI4 double Builtins and Intrinsics (#73305 ) See https://github.com/ARM-software/acle/pull/217 Patch by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>	2023-12-06 14:35:11 +00:00
Simon Pilgrim	bf454839a1	[X86] vec_zero_cse.ll - replace X32 check prefix with X86 We use X32 for gnux32 triples - X86 should be used for 32-bit triples	2023-12-06 14:06:37 +00:00
Shengchen Kan	50c66600b8	[X86][test] Migrate test avx512-broadcast-unfold.ll for opaque pointers	2023-12-06 21:24:30 +08:00
Simon Pilgrim	609d980b3f	[ARM] Regenerate aapcs-hfa-code.ll	2023-12-06 12:09:30 +00:00
Simon Pilgrim	f12a0ba53e	[X86] zero-remat.ll - regenerate checks	2023-12-06 11:15:55 +00:00
Simon Pilgrim	322c7c717b	[X86] slow-unaligned-mem.ll - improve checks We can't easily convert this to use the update scripts, but we can manually improve the checks so we check for the right number of stores	2023-12-06 10:50:57 +00:00
Matthew Devereau	6704d6aadd	[SME2] Add LUTI2 and LUTI4 quad Builtins and Intrinsics (#73317 ) See https://github.com/ARM-software/acle/pull/217 Patch by: Hassnaa Hamdi <hassnaa.hamdi@arm.com>	2023-12-06 10:08:04 +00:00
Pierre van Houtryve	ecd2f56a80	[AMDGPU] Warn if 'amdgpu-waves-per-eu' target occupancy was not met (#74055 ) This should make it a bit harder to miss this type of issue. The warning only shows if amdgpu-waves-per-eu is used. See SWDEV-434482	2023-12-06 10:46:46 +01:00
Matt Arsenault	08e63dd8fe	AMDGPU: Add a MIR test to catch infinite loop This is derived from one of the regressions reported after aed1a2217a1da0c9fb7d2c0856302dee25b1d4a1	2023-12-06 15:58:32 +07:00
wanglei	de21308f78	[LoongArch] Make ISD::VSELECT a legal operation with lsx/lasx	2023-12-06 16:43:38 +08:00

52796 Commits