llvm-project

Author	SHA1	Message	Date
Simon Pilgrim	a0c7a29655	[GlobalISel] IRTranslator::translateGetElementPtr - don't assume a gep constant offset is representable as i64 Fixes https://bugs.chromium.org/p/oss-fuzz/issues/detail?id=65052	2023-12-14 11:02:38 +00:00
Jay Foad	35ebd92d3d	[GlobalISel] Add G_PREFETCH (#74863)	2023-12-11 11:06:50 +00:00
Craig Topper	755c28a940	[GISel][Mips] Infer alignment when creating memory operand for G_VASTART. (#74004)	2023-11-30 19:55:23 -08:00
Youngsuk Kim	d8b8aa3a56	[llvm] Replace calls to Type::getPointerTo (NFC) Cleanup work towards removing the method Type::getPointerTo. If a call to Type::getPointerTo is used solely to support an unneeded pointer-cast, remove the call entirely.	2023-11-27 10:49:34 -06:00
HaohaiWen	394bba766d	[CodeGen][DebugInfo] Add missing debug info for jump table BB (#71021) visitJumpTable is called on FinishBasicBlock. At that time, getCurSDLoc will always return SDLoc without DebugLoc since CurInst was set to nullptr after visiting each instruction. This patch passes SDLoc to buildJumpTable when visiting SwitchInst so that visitJumpTable can use it later.	2023-11-18 19:17:51 +08:00
Michael Maitland	725e599637	[RISCV][GISEL] Add support for scalable vector types in lowerReturnVal (#71587) Scalable vector types from LLVM IR are lowered into physical vector registers in MIR based on calling convention for return instructions.	2023-11-15 17:30:53 -05:00
Paulo Matos	7b9d73c2f9	[NFC] Remove Type::getInt8PtrTy (#71029) Replace this with PointerType::getUnqual(). Followup to the opaque pointer transition. Fixes an in-code TODO item.	2023-11-07 17:26:26 +01:00
Amara Emerson	6b69584660	[GlobalISel] Fall back for bf16 conversions. (#71470) We don't support these correctly since we don't yet have FP types. AMDGPU tests were silently miscompiling bf16 as if they were fp16.	2023-11-06 21:18:57 -08:00
Diana	7f5d59b38d	[AMDGPU] ISel for @llvm.amdgcn.cs.chain intrinsic (#68186) The @llvm.amdgcn.cs.chain intrinsic is essentially a call. The call parameters are bundled up into 2 intrinsic arguments, one for those that should go in the SGPRs (the 3rd intrinsic argument), and one for those that should go in the VGPRs (the 4th intrinsic argument). Both will often be some kind of aggregate. Both instruction selection frameworks have some internal representation for intrinsics (G_INTRINSIC[_WITH_SIDE_EFFECTS] for GlobalISel, ISD::INTRINSIC_[VOID|WITH_CHAIN] for DAGISel), but we can't use those because aggregates are dissolved very early on during ISel and we'd lose the inreg information. Therefore, this patch shortcircuits both the IRTranslator and SelectionDAGBuilder to lower this intrinsic as a call from the very start. It tries to use the existing infrastructure as much as possible, by calling into the code for lowering tail calls. This has already gone through a few rounds of review in Phab: Differential Revision: https://reviews.llvm.org/D153761	2023-11-06 12:30:07 +01:00
Serge Pavlov	462d5830da	[GlobalISel] Add support for *_fpmode intrinsics The change implements support of the intrinsics `get_fpmode`, `set_fpmode` and `reset_fpmode` in Global Instruction Selector. Now they are lowered into library function calls. Differential Revision: https://reviews.llvm.org/D158260	2023-10-09 21:14:07 +07:00
Mirko Brkusanin	72e3713009	[IRTranslator] Set NUW flag for inbounds gep and load/store offsets Patch by: Acim Maravic Differential Revision: https://reviews.llvm.org/D159515	2023-09-22 16:16:28 +02:00
Martin Storsjö	7a91bbbb00	[GlobalISel] Check for unsupported Windows features on invoke (#65864) This matches what is done on calls, since cc981d285d1aa33df201605b9a3e22dd2311ead2 (extended for another case in 5a751e747dbf2c267e944aa961e21de7a815e7eb). Apply both those cases on invoke just like is done for call. Also update the preexisting comment which was left without update in 5a751e747dbf2c267e944aa961e21de7a815e7eb. This fixes github issue #61941.	2023-09-15 11:14:40 +03:00
Arthur Eubanks	0a1aa6cda2	[NFC][CodeGen] Change CodeGenOpt::Level/CodeGenFileType into enum classes (#66295) This will make it easy for callers to see issues with and fix up calls to createTargetMachine after a future change to the params of TargetMachine. This matches other nearby enums. For downstream users, this should be a fairly straightforward replacement, e.g. s/CodeGenOpt::Aggressive/CodeGenOptLevel::Aggressive or s/CGFT_/CodeGenFileType::	2023-09-14 14:10:14 -07:00
Matt Arsenault	b14e83d1a4	IR: Add llvm.exp10 intrinsic We currently have log, log2, log10, exp and exp2 intrinsics. Add exp10 to fix this asymmetry. AMDGPU already has most of the code for f32 exp10 expansion implemented alongside exp, so the current implementation is duplicating nearly identical effort between the compiler and library which is inconvenient. https://reviews.llvm.org/D157871	2023-09-01 19:45:03 -04:00
Felipe de Azevedo Piovezan	88417098bb	[CodeGen][DebugInfo] Append OP_deref when converting an EntryValue dbg.declare When we convert an EntryValue dbg.declare into an entry of the MF side table, we currently copy its DIExpression as is, and rely on subsequent layers to "know" that this expression is implicitly indirect. This is bad because it adds an implicit assumption to the IR representation, and requires subsequent layers to know about this assumption. This also limits the reusability of this table: what if, in the future, we want to use this table for dbg.values? This patch changes existing behavior so that the entities converting dbg_declares explicitly add an OP_deref when converting EntryValue dbg.declares. Differential Revision: https://reviews.llvm.org/D158437	2023-08-23 12:25:12 -04:00
David Green	a3f2751f78	[AArch64][GISel] Add handling for G_VECREDUCE_FMAXIMUM and G_VECREDUCE_FMINIMUM This is a lot of copy-pasting for the existing handling of G_VECREDUCE_FMAX/G_VECREDUCE_FMIN to add handling for G_VECREDUCE_FMAXIMUM/G_VECREDUCE_FMINIMUM in the same way. Differential Revision: https://reviews.llvm.org/D156615	2023-08-14 10:03:25 +01:00
Matt Arsenault	1ca0808db2	GlobalISel: Don't expand stacksave/stackrestore in IRTranslator In some edge cases (likely invalid anyway), it's not correct to directly copy the stack pointer register.	2023-08-09 18:33:55 -04:00
David Blaikie	4e429fd2a7	Few linter fixes: size() > 0 -> !empty; indentation; mismatched names on parameters in decls/defs; const on value return types	2023-07-31 18:52:57 +00:00
Sameer Sahasrabuddhe	d9847cde48	[GlobalISel] convergent intrinsics Introduced the convergent equivalent of the existing G_INTRINSIC opcodes: - G_INTRINSIC_CONVERGENT - G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS Out of the targets that currently have some support for GlobalISel, the patch assumes that the convergent intrinsics are only relevant to SPIRV and AMDGPU. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154766	2023-07-31 12:15:39 +05:30
Matt Arsenault	003b58f65b	IR: Add llvm.frexp intrinsic Add an intrinsic which returns the two pieces as multiple return values. Alternatively could introduce a pair of intrinsics to separately return the fractional and exponent parts. AMDGPU has native instructions to return the two halves, but could use some generic legalization and optimization handling. For example, we should be able to handle legalization of f16 on older targets, and for bf16. Additionally antique targets need a hardware workaround which would be better handled in the backend rather than in library code where it is now.	2023-06-28 14:50:16 -04:00
Amara Emerson	1c2c668846	[GlobalISel] Introduce G_CONSTANT_FOLD_BARRIER and use it to prevent constant folding hoisted constants. The constant hoisting pass tries to hoist large constants into predecessors and also generates remat instructions in terms of the hoisted constants. These aim to prevent codegen from rematerializing expensive constants multiple times. So we can re-use this optimization, we can preserve the no-op bitcasts that are used to anchor constants to the predecessor blocks. SelectionDAG achieves this by having the OpaqueConstant node, which is just a normal constant with an opaque flag set. I've opted to avoid introducing a new constant generic instruction here. Instead, we have a new G_CONSTANT_FOLD_BARRIER operation that constitutes a folding barrier. These are somewhat like the optimization hints, G_ASSERT_ZEXT in that they're eliminated by the generic instruction selection code. This change by itself has very minor improvements in -Os CTMark overall. What this does allow is better optimizations when future combines are added that rely on having expensive constants remain unfolded. Differential Revision: https://reviews.llvm.org/D144336	2023-06-09 11:45:06 -07:00
Matt Arsenault	eece6ba283	IR: Add llvm.ldexp and llvm.experimental.constrained.ldexp intrinsics AMDGPU has native instructions and target intrinsics for this, but these really should be subject to legalization and generic optimizations. This will enable legalization of f16->f32 on targets without f16 support. Implement a somewhat horrible inline expansion for targets without libcall support. This could be better if we could introduce control flow (GlobalISel version not yet implemented). Support for strictfp legalization is less complete but works for the simple cases.	2023-06-06 17:07:18 -04:00
Dávid Bolvanský	09515f2c20	[SDAG] Preserve unpredictable metadata, teach X86CmovConversion to respect this metadata Sometimes a developer would like to have more control over cmov vs branch. We have unpredictable metadata in LLVM IR, but currently it is ignored by X86 backend. Propagate this metadata and avoid cmov->branch conversion in X86CmovConversion for cmov with this metadata. Example: ``` int MaxIndex(int n, int *a) { int t = 0; for (int i = 1; i < n; i++) { // cmov is converted to branch by X86CmovConversion if (a[i] > a[t]) t = i; } return t; } int MaxIndex2(int n, int *a) { int t = 0; for (int i = 1; i < n; i++) { // cmov is preserved if (__builtin_unpredictable(a[i] > a[t])) t = i; } return t; } ``` Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D118118	2023-06-01 20:56:44 +02:00
Felipe de Azevedo Piovezan	e8aee45be7	[IRTranslator] Implement translation of entry_value dbg.value intrinsics For dbg.value intrinsics targeting an llvm::Argument address whose expression starts with an entry value, we lower this to a DEBUG_VALUE targeting the livein physical register corresponding to that Argument. Depends on D151328 Differential Revision: https://reviews.llvm.org/D151329	2023-05-26 06:45:01 -04:00
Felipe de Azevedo Piovezan	b5983a38cb	[IRTranslator][NFC] Refactor if/else chain into early returns This will make it easier to add more cases in a subsequent commit and also better conforms to the coding guidelines. Differential Revision: https://reviews.llvm.org/D151328	2023-05-25 06:42:44 -04:00
Krzysztof Drewniak	0bc739a4ae	[GlobalISel] Handle ptr size != index size in IRTranslator, CodeGenPrepare While the original motivation for this patch (address space 7 on AMDGPU) has been reworked and is not presently planned to reach IR translation, the incorrect (by the spec) handling of index offset width in IR translation and CodeGenPrepare is likely to trip someone - possibly future AMD, since we have a p7:160:256:256:32 now, so we convert to the other API now. Reviewed By: aemerson, arsenm Differential Revision: https://reviews.llvm.org/D143526	2023-05-12 16:21:01 +00:00
Felipe de Azevedo Piovezan	3f6e4e5b6e	[IRTranslator][DebugInfo] Implement translation of entry_value vars This commit implements IRTranslator lowering of dbg.declare intrinsics targeting swiftasync Arguments, by putting them in the MachineFunction's table of variables whose location doesn't change throughout the function. Depends on D149881 Differential Revision: https://reviews.llvm.org/D149882	2023-05-12 11:55:39 -04:00
NAKAMURA Takumi	9cfeba5b12	Restore CodeGen/LowLevelType from `Support` This is a rework of D30046 (LLT). Since I have introduced `llvm-min-tblgen` as D146352, `llvm-tblgen` may depend on `CodeGen`. `LowLevelType.h` originally belonged to `CodeGen`. Almost all users are still under `CodeGen` or `Target`. I think `CodeGen` is the right place to put `LowLevelType.h`. `MachineValueType.h` may be moved as well. (later, D149024) I have made many modules depend on `CodeGen`. It is consistent but inefficient. It will be split out later, D148769 Besides, I had to isolate MVT and LLT in modmap, since `llvm::PredicateInfo` clashes between `TableGen/CodeGenSchedule.h` and `Transforms/Utils/PredicateInfo.h`. (I think better to introduce namespace llvm::TableGen) Depends on D145937, D146352, and D148768. Differential Revision: https://reviews.llvm.org/D148767	2023-05-03 00:13:19 +09:00
NAKAMURA Takumi	d45fae6010	Move CodeGen/LowLevelType => CodeGen/LowLevelTypeUtils Before restoring `CodeGen/LowLevelType`, rename this to `LowLevelTypeUtils`. Differential Revision: https://reviews.llvm.org/D148768	2023-04-25 08:53:17 +09:00
Felipe de Azevedo Piovezan	79a1e32915	[GlobalISel] Improve stack slot tracking in dbg.values For IR like: ``` %alloca = alloca ... dbg.value(%alloca, !myvar, OP_deref(<other_ops>)) ``` GlobalISel lowers it to MIR: ``` %some_reg = G_FRAME_INDEX <stack_slot> DBG_VALUE %some_reg, !myvar, OP_deref(<other_ops>) ``` In other words, if the value of `!myvar` can be obtained by dereferencing an alloca, in MIR we say that the _location_ of a variable is obtained by dereferencing register %some_reg (plus some `<other_ops>`). We can instead remove the use of `%some_reg`: the location of `!myvar` _is_ `<stack_slot>` (plus some `<other_ops>`). This patch implements this transformation, which improves debug information handling in O0, as these registers hardly ever survive register allocation. A note about testing: similar to what was done in D76934 (f24e2e9eebde4b7a1d), this patch exposed a bug in the Builder class when using `-debug`, where we tried to print an incomplete instruction. The changes in `MachineIRBuilder.cpp` address that. Differential Revision: https://reviews.llvm.org/D147536	2023-04-05 08:21:00 -04:00
Kazu Hirata	b9c4b95b11	[llvm] Use ConstantInt::{isZero,isOne} (NFC)	2023-03-21 17:40:35 -07:00
Kazu Hirata	7e6e636fb6	Use llvm::has_single_bit<uint32_t> (NFC) This patch replaces isPowerOf2_32 with llvm::has_single_bit<uint32_t> where the argument is wider than uint32_t.	2023-02-15 22:17:27 -08:00
Kazu Hirata	55e2cd1609	Use llvm::count{lr}_{zero,one} (NFC)	2023-01-28 12:41:20 -08:00
Matt Arsenault	778cf5431c	IR: Add atomicrmw uinc_wrap and udec_wrap These are essentially add/sub 1 with a clamping value. AMDGPU has instructions for these. CUDA/HIP expose these as atomicInc/atomicDec. Currently we use target intrinsics for these, but those do not carry the ordering and syncscope. Add these to atomicrmw so we can carry these and benefit from the regular legalization processes.	2023-01-24 17:55:11 -04:00
Kazu Hirata	caa99a01f5	Use llvm::popcount instead of llvm::countPopulation(NFC)	2023-01-22 12:48:51 -08:00
Matt Arsenault	e70ae0f46b	DAG/GlobalISel: Fix broken/redundant setting of MODereferenceable This was incorrectly setting dereferenceable on unaligned operands. getLoadMemOperandFlags does the alignment dereferenceability check without alignment, and then both paths went on to check isDereferenceableAndAlignedPointer. Make getLoadMemOperandFlags check isDereferenceableAndAlignedPointer, and remove the second call.	2023-01-13 20:30:30 -05:00
James Y Knight	1ae36b1387	Remove special cases for invoke of non-throwing inline-asm. Non-throwing inline asm infers the nounwind attribute in instcombine. Thus, it can be handled in the same manner as non-throwing target functions are generally. Further special casing is unnecessary complexity.	2023-01-06 13:53:10 -05:00
serge-sans-paille	38818b60c5	Move from llvm::makeArrayRef to ArrayRef deduction guides - llvm/ part Use deduction guides instead of helper functions. The only non-automatic changes have been: 1. ArrayRef(some_uint8_pointer, 0) needs to be changed into ArrayRef(some_uint8_pointer, (size_t)0) to avoid an ambiguous call with ArrayRef((uint8_t), (uint8_t)) 2. CVSymbol sym(makeArrayRef(symStorage)); needed to be rewritten as CVSymbol sym{ArrayRef(symStorage)}; otherwise the compiler is confused and thinks we have a (bad) function prototype. There were a few similar situations across the codebase. 3. ADL doesn't seem to work the same for deduction-guides and functions, so at some point the llvm namespace must be explicitly stated. 4. The "reference mode" of makeArrayRef(ArrayRef<T> &) that acts as no-op is not supported (a constructor cannot achieve that). Per reviewers' comment, some useless makeArrayRef have been removed in the process. This is a follow-up to https://reviews.llvm.org/D140896 that introduced the deduction guides. Differential Revision: https://reviews.llvm.org/D140955	2023-01-05 14:11:08 +01:00
Amara Emerson	53445f5b1c	[GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug. We currently have a bug where the legalizer, when dealing with phi operands, may create instructions in the phi's incoming blocks at points which are effectively dead due to a possible exception throw. Say we have: throwbb: EH_LABEL x0 = %callarg1 BL @may_throw_call EH_LABEL B returnbb bb: %v = phi i1 %true, throwbb, %false.... When legalizing we may need to widen the i1 %true value, and to do that we need to create new extension instructions in the incoming block. Our insertion point currently is the MBB::getFirstTerminator() which puts the IP before the unconditional branch terminator in throwbb. These extensions may never be executed if the call throws, and therefore we need to emit them before the call (but not too early, since our new instruction may need values defined within throwbb as well). throwbb: EH_LABEL x0 = %callarg1 BL @may_throw_call EH_LABEL %true = G_CONSTANT i32 1 ; <<<-- ruh'roh, this never executes if may_throw_call() throws! B returnbb bb: %v = phi i32 %true, throwbb, %false.... To fix this, I've added two new instructions. The main idea is that G_INVOKE_REGION_START is a terminator, which tries to model the fact that in the IR, the original invoke inst is actually a terminator as well. By using that as the new insertion point, we make sure to place new instructions on always executing paths. Unfortunately we still need to make the legalizer use a new insertion point API that I've added, since the existing `getFirstTerminator()` method does a reverse walk up the block, and any non-terminator instructions cause it to bail out. To avoid impacting compile time for all `getFirstTerminator()` uses, I've added a new method that does a forward walk instead. Differential Revision: https://reviews.llvm.org/D137905	2022-12-07 10:28:51 -08:00
Krzysztof Parzyszek	ab672e9173	FPEnv: convert Optional to std::optional	2022-12-03 13:55:56 -06:00
Nicolai Hähnle	43b86bf992	AMDGPU: Remove BufferPseudoSourceValue The use of a PSV for buffer intrinsics is misleading because it may be misinterpreted as all buffer intrinsics accessing the same address in memory, which is clearly not true. Instead, build MachineMemOperands without a pointer value but with an address space, so that address space-based alias analysis can still work. There is a lot of test churn because previously address space 4 (constant address space) was used as an address space for buffer intrinsics. This doesn't make much sense and seems to have been an accident -- see the change in AMDGPUTargetMachine::getAddressSpaceForPseudoSourceKind. Differential Revision: https://reviews.llvm.org/D138711	2022-11-29 22:15:11 +01:00
Janek van Oirschot	322966f8f8	[AMDGPU] Add llvm.is.fpclass intrinsic to existing SelectionDAG fp class support and introduce GlobalISel implementation for AMDGPU Uses existing SelectionDAG lowering of the llvm.amdgcn.class intrinsic for llvm.is.fpclass	2022-11-28 16:00:36 -05:00
Matt Arsenault	162d9030ab	GlobalISel: Pass through AA metadata for target memory intrinsics The corresponding change for the DAG was done in fa4aac7335ac7ecabbb634d134bd4897783bf62b	2022-11-06 22:14:12 -08:00
Peter Rong	c2e7c9cb33	[CodeGen] Using ZExt for extractelement indices. In https://github.com/llvm/llvm-project/issues/57452, we found that IRTranslator is translating `i1 true` into `i32 -1`. This is because IRTranslator uses SExt for indices. In this fix, we change the expected behavior of extractelement's index, moving from SExt to ZExt. This change includes both documentation, SelectionDAG and IRTranslator. We also included a test for AMDGPU, updated tests for AArch64, Mips, PowerPC, RISCV, VE, WebAssembly and X86 This patch fixes issue #57452. Differential Revision: https://reviews.llvm.org/D132978	2022-10-15 15:45:35 -07:00
Matt Arsenault	34fb7803f8	GlobalISel: Pass through AssumptionCache	2022-09-19 19:10:51 -04:00
Matt Arsenault	0d8ffcc532	Analysis: Add AssumptionCache argument to isDereferenceableAndAlignedPointer This does not try to pass it through from the end users.	2022-09-19 18:57:33 -04:00
Matt Arsenault	bb70b5d406	CodeGen: Set MODereferenceable from isDereferenceableAndAlignedPointer Previously this was assuming pointsToConstantMemory implies dereferenceable.	2022-09-12 08:38:35 -04:00
Marco Elver	31a548021b	[GlobalISel] Propagate PCSections metadata to MachineInstr Propagate (most) PC sections metadata to MachineInstr when GlobalISel is doing instruction selection. This change results in support for architectures using GlobalISel (such as -O0 with AArch64). Not all instructions may be supported yet, and requires further target-specific handling (such as done for AArch64 pseudo-atomics). Expanding supported instructions is planned on a case-by-case basis and new use cases for PC sections metadata. Reviewed By: vitalybuka Differential Revision: https://reviews.llvm.org/D130886	2022-09-07 11:36:02 +02:00
Markus Böck	2fdf963daf	[GlobalISel] Explicitly fail trying to translate `gc.statepoint` and related intrinsics The provided testcase would previously fail with an assertion due to later down below trying to allocate registers for `token` return types and arguments. This is especially problematic as the process would then exit instead of falling back to using FastIsel. This patch fixes that by simply explicitly failing translation if either of these intrinsics are encountered. Fixes https://github.com/llvm/llvm-project/issues/57349 Differential Revision: https://reviews.llvm.org/D132974	2022-08-31 00:47:17 +02:00
Eli Friedman	cfd2c5ce58	Untangle the mess which is MachineBasicBlock::hasAddressTaken(). There are two different senses in which a block can be "address-taken". There can be a BlockAddress involved, which means we need to map the IR-level value to some specific block of machine code. Or there can be constructs inside a function which involve using the address of a basic block to implement certain kinds of control flow. Mixing these together causes a problem: if target-specific passes are marking random blocks "address-taken", if we have a BlockAddress, we can't actually tell which MachineBasicBlock corresponds to the BlockAddress. So split this into two separate bits: one for BlockAddress, and one for the machine-specific bits. Discovered while trying to sort out related stuff on D102817. Differential Revision: https://reviews.llvm.org/D124697	2022-08-16 16:15:44 -07:00
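
Note on c2e7c9cb33 (ZExt for extractelement indices): the `i1 true` to `i32 -1` behaviour described in that commit follows directly from how sign extension treats a 1-bit value. The sketch below is only an illustration in plain C++ and does not use any LLVM API; `lowBitsMask`, `zeroExtend` and `signExtend` are hypothetical helper names, and the bit width is assumed to be between 1 and 32.

```cpp
#include <cstdint>

// Hypothetical helpers (not LLVM APIs). `bits` must be in the range [1, 32].

// Mask selecting the low `bits` bits of a 32-bit value.
constexpr uint32_t lowBitsMask(unsigned bits) {
  return bits >= 32 ? 0xFFFFFFFFu : ((1u << bits) - 1u);
}

// Zero extension: the upper bits are filled with zeros.
constexpr uint32_t zeroExtend(uint32_t raw, unsigned bits) {
  return raw & lowBitsMask(bits);
}

// Sign extension: the upper bits are filled with copies of the top bit
// of the narrow value.
constexpr int32_t signExtend(uint32_t raw, unsigned bits) {
  const int64_t value = raw & lowBitsMask(bits);
  const int64_t signBit = int64_t{1} << (bits - 1);
  return static_cast<int32_t>((value ^ signBit) - signBit);
}

// A 1-bit "true" has its only bit set, which is also its sign bit.
static_assert(signExtend(1, 1) == -1, "sign-extended i1 true is -1");
static_assert(zeroExtend(1, 1) == 1u, "zero-extended i1 true is 1");
```

With sign extension, an index of `i1 true` would therefore select element -1 rather than element 1, which is the mismatch the commit message reports.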


532 Commits