llvm-project

Author	SHA1	Message	Date
Matt Arsenault	e7900e695e	AMDGPU: Regenerate baseline mir tests	2024-02-27 10:44:53 +05:30
Fangrui Song	9e9907f1cf	[AMDGPU,test] Change llc -march= to -mtriple= (#75982 ) Similar to 806761a7629df268c8aed49657aeccffa6bca449. For IR files without a target triple, -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple, leaving a target triple which may not make sense, e.g. amdgpu-apple-darwin. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize $unknown-apple-darwin as ELF instead of rejecting it outrightly. This patch changes AMDGPU tests to not rely on the default OS/environment components. Tests that need fixes are not changed: ``` LLVM :: CodeGen/AMDGPU/fabs.f64.ll LLVM :: CodeGen/AMDGPU/fabs.ll LLVM :: CodeGen/AMDGPU/floor.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.f64.ll LLVM :: CodeGen/AMDGPU/fneg-fabs.ll LLVM :: CodeGen/AMDGPU/r600-infinite-loop-bug-while-reorganizing-vector.ll LLVM :: CodeGen/AMDGPU/schedule-if-2.ll ```	2024-01-16 21:54:58 -08:00
Matt Arsenault	d85e849ff4	AMDGPU: Convert some assorted tests to opaque pointers	2022-12-01 21:40:30 -05:00
Matt Arsenault	bb70b5d406	CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer Previously this was assuming piontsToConstantMemory implies dereferenceable.	2022-09-12 08:38:35 -04:00
Matt Arsenault	8d0383eb69	CodeGen: Remove AliasAnalysis from regalloc This was stored in LiveIntervals, but not actually used for anything related to LiveIntervals. It was only used in one check for if a load instruction is rematerializable. I also don't think this was entirely correct, since it was implicitly assuming constant loads are also dereferenceable. Remove this and rely only on the invariant+dereferenceable flags in the memory operand. Set the flag based on the AA query upfront. This should have the same net benefit, but has the possible disadvantage of making this AA query nonlazy. Preserve the behavior of assuming pointsToConstantMemory implying dereferenceable for now, but maybe this should be changed.	2022-07-18 17:23:41 -04:00
Petar Avramovic	29f88b93fd	[GlobalISel] Rework more/fewer elements for vectors Artifact combiner is not able to access individual elements after using LCMTy style merge/unmerge, extract and insert to change vector number of elements (pad with undef or split to sub-vector instructions). Use unmerge to individual elements instead and then merge elements into requested types. Change argument lowering for vectors and moreElementsVector to use buildPadVectorWithUndefElements and buildDeleteTrailingVectorElements. FewerElementsVector had a few helpers that had different behavior, introduce new helper for most of the opcodes. FewerElementsVector helper is more flexible since it can create leftover instruction smaller then requested type (useful in case target wants to avoid pad with undef and use fewer registers). If target does not want leftover of different type it should call more elements first. Some helpers were performing more elements first to have split without leftover. Opcodes that used this helper use clampMaxNumElementsStrict (does more elements first) in LegalizerInfo to avoid test changes. Fixes failures caused by failing to combine artifacts created during more/fewer elements vector. Differential Revision: https://reviews.llvm.org/D114198	2021-12-23 14:30:02 +01:00
Jay Foad	dff3454bda	[TwoAddressInstruction] Tweak constraining of tied operands In collectTiedOperands, when handling an undef use that is tied to a def, constrain the dst reg with the actual register class of the src reg, instead of with the register class from the instructions's MCInstrDesc. This makes a difference in some AMDGPU test cases like this, before: %16:sgpr_96 = INSERT_SUBREG undef %15:sgpr_96_with_sub0_sub1(tied-def 0), killed %11:sreg_64_xexec, %subreg.sub0_sub1 After, without this patch: undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec This fails machine verification if you force it to run after TwoAddressInstruction (currently it is disabled) with: * Bad machine code: Invalid register class for subregister index * - function: s_load_constant_v3i32_align4 - basic block: %bb.0 (0xa011a88) - instruction: undef %16.sub0_sub1:sgpr_96 = COPY killed %11:sreg_64_xexec - operand 0: undef %16.sub0_sub1:sgpr_96 Register class SGPR_96 does not fully support subreg index 4 After, with this patch: undef %16.sub0_sub1:sgpr_96_with_sub0_sub1 = COPY killed %11:sreg_64_xexec See also svn r159120 which introduced the code to handle tied undef uses. Differential Revision: https://reviews.llvm.org/D110944	2021-10-01 20:57:58 +01:00
Jay Foad	61ecfc6f9d	[TwoAddressInstruction] Pre-commit a test case for D110944	2021-10-01 20:57:57 +01:00

8 Commits