llvm-project

Author	SHA1	Message	Date
Thorsten Schütt	364f781344	[GlobalIsel] Combine logic of icmps (#77855) Inspired by InstCombinerImpl::foldAndOrOfICmpsUsingRanges with some adaptations to MIR.	2024-02-06 15:58:02 +01:00
Matt Arsenault	11bf02e019	DAG: Fix ABI lowering with FP promote in strictfp functions (#74405) This was emitting non-strict casts in ABI contexts for illegal types.	2024-01-18 10:57:53 +07:00
Fangrui Song	9e9907f1cf	[AMDGPU,test] Change llc -march= to -mtriple= (#75982) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ```	2024-01-16 21:54:58 -08:00
Jay Foad	a4196666ac	[AMDGPU] Revert "Preliminary patch for divergence driven instruction selection. Operands Folding 1." (#71710) This reverts commit 201f892b3b597f24287ab6a712a286e25a45a7d9.	2023-11-13 13:53:10 +00:00
Matt Arsenault	d86a7d631c	GlobalISel: Add constant fold combine for zext/sext/anyext Could use more work for vectors. https://reviews.llvm.org/D156534	2023-08-24 08:10:01 -04:00
Kevin P. Neal	76c22b18ea	[FPEnv][AMDGPU] Correct strictfp tests. Correct AMDGPU strictfp tests to follow the rules documented in the LangRef: https://llvm.org/docs/LangRef.html#constrained-floating-point-intrinsics Mostly these tests just needed the strictfp attribute on function definitions. I've also removed the strictfp attribute from uses of the constrained intrinsics because it comes by default since D154991, but I only did this in tests I was changing anyway. I also removed attributes added to declare lines of intrinsics. The attributes of intrinsics cannot be changed in a test so I eliminated attempts to do so. Test changes verified with D146845.	2023-07-25 13:24:46 -04:00
Jay Foad	7fa7a08f21	[AMDGPU] Insert s_nop before s_sendmsg sendmsg(MSG_DEALLOC_VGPRS) Differential Revision: https://reviews.llvm.org/D155681	2023-07-19 10:33:11 +01:00
Matt Arsenault	b59022b42e	DAG: Handle lowering of unordered fcZero|fcSubnormal to fcmp	2023-07-11 18:30:15 -04:00
Matt Arsenault	64df9573a7	DAG: Handle inversion of fcSubnormal | fcZero There are a number of more test combinations here that can be done together and reduce the number of instructions. https://reviews.llvm.org/D143191	2023-07-06 21:19:44 -04:00
Matt Arsenault	61820f8b5d	CodeGen: Optimize lowering of is.fpclass fcZero|fcSubnormal Combine the two checks into a check if the exponent bits are 0. The inverted case isn't reachable until a future change, and GlobalISel currently doesn't attempt the inversion optimization. https://reviews.llvm.org/D143182	2023-07-06 13:03:57 -04:00
Jay Foad	f2c164c815	[AMDGPU] Do not wait for vscnt on function entry and return SIInsertWaitcnts inserts waitcnt instructions to resolve data dependencies. The GFX10+ vscnt (VMEM store count) counter is never used in this way. It is only used to resolve memory dependencies, and that is handled by SIMemoryLegalizer. Hence there is no need to conservatively wait for vscnt to be 0 on function entry and before returns. Differential Revision: https://reviews.llvm.org/D153537	2023-07-04 12:22:38 +01:00
Chen Zheng	3f4055dec4	[GlobalISelEmitter] handle operand without MVT/class There are some patterns in td files without MVT/class set for some operands in target pattern that are from the source pattern. This prevents GlobalISelEmitter from adding them as a valid rule, because the target child operand is an unsupported kind operand. For now, for a leaf child, only IntInit and DefInit are handled in GlobalISelEmitter. This issue can be workaround by adding MVT/class to the patterns in the td files, like the workarounds for patterns anyext and setcc in PPCInstrInfo.td in D140878. To avoid adding the same workarounds for other patterns in td files, this patch tries to handle the UnsetInit case in GlobalISelEmitter. Adding the new handling allows us to remove the workarounds in the td files and also generates many selection rules for PPC target. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D141247	2023-04-19 07:00:57 +00:00
Matt Arsenault	9356ec1516	CodeGen: Reorder case handling for is.fpclass legalization Subnormal and zero checks can be combined into one, so move the code closer to reduce the diff in a future change.	2023-03-17 11:29:50 -04:00
Matt Arsenault	cd60bff329	CodeGen: Add some additional is_fpclass lowering tests Cover more cases in preparation for making greater use of fcmp based lowerings. Also add more tests for the inverted cases. Test iszero | isnan test masks. We should probably just generate every combination of test masks.	2023-03-15 01:13:08 -04:00
Matt Arsenault	0f8b3b97fd	AMDGPU: Add additional tests for is.fpclass legalization	2023-02-02 22:50:23 -04:00
Matt Arsenault	3a01b4a93d	AMDGPU: Regenerate test checks Use right prefix order to get merging. Also drop -verify-machineinstrs and add -amdgpu-enable-delay-alu=0	2023-02-02 22:50:23 -04:00
Nikita Popov	bdf2fbba9c	[AMDGPU] Convert some tests to opaque pointers (NFC)	2022-12-19 12:41:13 +01:00
Matt Arsenault	d647e252b8	InstSimplify: Add basic folding of llvm.is.fpclass intrinsic Copied from the existing llvm.amdgcn.class handling; eventually I will fold that to the generic intrinsic when legal. The tests should probably move into an instsimplify only test.	2022-12-12 21:54:04 -05:00
Janek van Oirschot	587747d8d1	[AMDGPU] G_IS_FPCLASS lower() support for IEEE fp types Simplified globalisel version of sdag's expandIS_FPCLASS. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D139128	2022-12-07 11:53:09 +00:00
Janek van Oirschot	322966f8f8	[AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU Uses existing SelectionDAG lowering of the llvm.amdgcn.class intrinsic for llvm.is.fpclass	2022-11-28 16:00:36 -05:00

20 Commits
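
Commit 61820f8b5d above describes combining the fcZero and fcSubnormal checks into a single check that the exponent bits are 0. The following is a minimal sketch of that idea for IEEE-754 binary32, written in Python rather than LLVM's C++. The function names and mask constants are hypothetical illustrations and are not taken from the LLVM sources.

```python
import struct

# IEEE-754 binary32 field masks (illustrative names, not LLVM identifiers).
EXP_MASK = 0x7F800000   # 8 exponent bits
FRAC_MASK = 0x007FFFFF  # 23 fraction bits


def f32_bits(x: float) -> int:
    """Return the binary32 bit pattern of x as an unsigned integer."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def is_zero_or_subnormal_two_checks(x: float) -> bool:
    """Test zero and subnormal separately, then OR the results."""
    bits = f32_bits(x)
    exp, frac = bits & EXP_MASK, bits & FRAC_MASK
    is_zero = exp == 0 and frac == 0
    is_subnormal = exp == 0 and frac != 0
    return is_zero or is_subnormal


def is_zero_or_subnormal_combined(x: float) -> bool:
    """Single check: both classes share an all-zero exponent field."""
    return (f32_bits(x) & EXP_MASK) == 0


# The two forms agree on every floating-point class, including signed zeros.
for v in (0.0, -0.0, 1e-40, -1e-40, 1.0, -2.5, float("inf"), float("nan")):
    assert is_zero_or_subnormal_two_checks(v) == is_zero_or_subnormal_combined(v)
```

The sign bit is masked out in both forms, so positive and negative inputs are treated the same way.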