llvm-project

Author	SHA1	Message	Date
Jonas Paulsson	481bb44baa	[SystemZ] Emit a .gnu_attribute for an externally visible vector abi. On SystemZ, the vector ABI changes depending on the presence of hardware vector support. Therefore, each binary compiled with a visible vector ABI (e.g. one that calls an external function with a vector argument) should be marked with a .gnu_attribute describing this. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D105067	2022-12-06 12:53:40 -06:00
Fangrui Song	4b1b9e22b3	Remove unused #include "llvm/ADT/Optional.h"	2022-12-05 04:21:08 +00:00
Fangrui Song	f4c16c4473	[MC] llvm::Optional => std::optional https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-04 21:36:08 +00:00
Fangrui Song	bac974278c	CodeGen/CommandFlags: Convert Optional to std::optional	2022-12-03 18:38:12 +00:00
Krzysztof Parzyszek	8c7c20f033	Convert Optional<CodeModel> to std::optional<CodeModel>	2022-12-03 12:08:47 -06:00
Kazu Hirata	20cde15415	[Target] Use std::nullopt instead of None (NFC) This patch mechanically replaces None with std::nullopt where the compiler would warn if None were deprecated. The intent is to reduce the amount of manual work required in migrating from Optional to std::optional. This is part of an effort to migrate from llvm::Optional to std::optional: https://discourse.llvm.org/t/deprecating-llvm-optional-x-hasvalue-getvalue-getvalueor/63716	2022-12-02 20:36:06 -08:00
Krzysztof Parzyszek	864aaa21b4	TargetLowering: convert Optional to std::optional	2022-12-01 16:19:10 -08:00
Jonas Paulsson	ca51529487	[SystemZ] Extend combineGET_CCMASK() to handle a truncated SELECT_CCMASK. In cases where the SELECT_CCMASK has an additional user of the carry, a truncated SELECT_CCMASK may result as the input to the GET_CCMASK, which need to be recognized. Fixes https://github.com/llvm/llvm-project/issues/59054 Reviewed By: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D138324	2022-11-23 09:53:07 -05:00
Alexander Timofeev	32bd75716c	PEI should be able to use backward walk in replaceFrameIndicesBackward. The backward register scavenger has correct register liveness information. PEI should leverage the backward register scavenger. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D137574	2022-11-18 15:57:34 +01:00
Stanislav Mekhanoshin	bcaf31ec3f	[AMDGPU] Allow finer grain control of an unaligned access speed A target can return if a misaligned access is 'fast' as defined by the target or not. In reality there can be different levels of 'fast' and 'slow'. This patch changes the boolean 'Fast' argument of the allowsMisalignedMemoryAccesses family of functions to an unsigned representing its speed. A target can still define it as it wants and the direct translation of the current code uses 0 and 1 for current false and true. This makes the change an NFC. Subsequent patch will start using an actual value of speed in the load/store vectorizer to compare if a vectorized access going to be not just fast, but not slower than before. Differential Revision: https://reviews.llvm.org/D124217	2022-11-17 09:23:53 -08:00
Matt Arsenault	5baa4b8e11	SystemZ: Register null target streamer Fixes at least one null dereference.	2022-11-01 11:11:22 -07:00
Ulrich Weigand	96482ee434	[SystemZInstPrinter] Introduce markup tags emission SystemZ assembly syntax emission now leverages markup tags, if enabled. Author: Antonio Frighetto Differential Revision: https://reviews.llvm.org/D129868	2022-10-25 18:59:50 +02:00
Josh Stone	4dcfb09e40	[NFC][CodeGen] Use const MF in TargetLowering stack probe functions This makes them callable from places like canUseAsPrologue. Differential Revision: https://reviews.llvm.org/D134492	2022-09-23 09:30:32 -07:00
Sergei Barannikov	c6acb4eb0f	[SDAG] Add `getCALLSEQ_END` overload taking `uint64_t`s All in-tree targets pass pointer-sized ConstantSDNodes to the method. This overload reduced amount of boilerplate code a bit. This also makes getCALLSEQ_END consistent with getCALLSEQ_START, which already takes uint64_ts.	2022-09-15 14:02:12 -04:00
Jonas Paulsson	de0e3117d4	[SystemZ] Improve handling of vector alignments. Make the DataLayout string always hold a vector alignment of 8 bytes, regardless of the vector ABI. This makes the datalayout depend only on the target triple which is the general expectation (in assertions). On older architectures where vectors use the natural alignment (16 bytes), the front end will maintain the same behavior and produce an overalignment compared to the datalayout. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D131158	2022-09-08 17:33:05 +02:00
Markus Böck	f049b2c3fc	[MC] Emit Stackmaps before debug info This patch is essentially an alternative to https://reviews.llvm.org/D75836 and was mentioned by @lhames in a comment. The gist of the issue is that Mach-O has restrictions on which kind of sections are allowed after debug info has been emitted, which is also properly asserted within LLVM. Problem is that stack maps are currently emitted as one of the last sections in each target-specific AsmPrinter so far, which would cause the assertion to trigger. The current approach of special casing for the `__LLVM_STACKMAPS` section is not viable either, as downstream users can overwrite the stackmap format using plugins, which may want to use different sections. This patch fixes the issue by emitting the stack map earlier, right before debug info is emitted. The way this is implemented is by taking the choice when to emit the StackMap away from the target AsmPrinter and doing so in the base class. The only disadvantage of this approach is that the `StackMaps` member is now part of the base class, even for targets that do not support them. This is functionaly not a problem however, as emitting an empty `StackMaps` is a no-op. Differential Revision: https://reviews.llvm.org/D132708	2022-09-06 20:20:56 +02:00
Kazu Hirata	fedc59734a	[llvm] Use range-based for loops (NFC)	2022-09-03 11:17:40 -07:00
Kazu Hirata	8feb60756c	[llvm] Use range-based for loops (NFC)	2022-08-28 23:28:58 -07:00
Kazu Hirata	2833760c57	[Target] Qualify auto in range-based for loops (NFC)	2022-08-28 17:35:09 -07:00
Kazu Hirata	c63f823875	[llvm] Use range-based for loops (NFC)	2022-08-28 17:35:04 -07:00
Simon Pilgrim	f9de13232f	[X86] Promote i8/i16 CTTZ (BSF) instructions and remove speculation branch This patch adds a Type operand to the TLI isCheapToSpeculateCttz/isCheapToSpeculateCtlz callbacks, allowing targets to decide whether branches should occur on a type-by-type/legality basis. For X86, this patch proposes to allow CTTZ speculation for i8/i16 types that will lower to promoted i32 BSF instructions by masking the operand above the msb (we already do something similar for i8/i16 TZCNT). This required a minor tweak to CTTZ lowering - if the src operand is known never zero (i.e. due to the promotion masking) we can remove the CMOV zero src handling. Although BSF isn't very fast, most CPUs from the last 20 years don't do that bad a job with it, although there are some annoying passthrough EFLAGS dependencies. Additionally, now that we emit 'REP BSF' in most cases, we are tending towards assuming this will most likely be executed as a TZCNT instruction on any semi-modern CPU. Differential Revision: https://reviews.llvm.org/D132520	2022-08-24 17:28:18 +01:00
Philip Reames	c9608d57b8	[TTI] Plumb through OperandValueInfo in getMemoryOpCost [NFC] This has the effect of exposing the power-of-two property for use in memory op costing, but no target actually uses it yet. The main point of this change is simple consistency with the recently changes getArithmeticInstrCost, and to remove the last (interface) use of OperandValueKind.	2022-08-23 07:55:42 -07:00
Philip Reames	104fa367ee	[TTI] Use OperandValueInfo in getArithmeticInstrCost implementation [NFC] This change completes the process of replacing OperandValueKind and OperandValueProperties which were previously passed independently in this API with a single container class which contains both. This is the change which motivated the whole sequence which preceeded it. In an original spike version of this change, I'd noticed a nasty bug: I'd changed the signature without changing names, and as result, we silently passed additional information through a callsite which previously dropped the power-of-two fact. This might be harmless in most cases, but at least a couple clearly dependend for correctness on not passing that property through. I did my best to split off prior changes which reduced the scope of this one, and which made it possible to use compiler assistance. For instance, every parameter which changes type in this change also changes name. This was intentional to make sure that every call site possible effected must show up in the diff. This let me audit each one closely.	2022-08-22 15:16:39 -07:00
Simon Pilgrim	5263155d5b	[CostModel] Add CostKind argument to getShuffleCost Defaults to TCK_RecipThroughput - as most explicit calls were assuming TCK_RecipThroughput (vectorizers) or was just doing a before-vs-after comparison (vectorcombiner). Calls via getInstructionCost were just dropping the CostKind, so again there should be no change at this time (as getShuffleCost and its expansions don't use CostKind yet) - but it will make it easier for us to better account for size/latency shuffle costs in inline/unroll passes in the future. Differential Revision: https://reviews.llvm.org/D132287	2022-08-21 10:54:51 +01:00
Kazu Hirata	ec5eab7e87	Use range-based for loops (NFC)	2022-08-20 21:18:32 -07:00
Alexey Bataev	d53e245951	[COST][NFC]Introduce OperandValueKind in getMemoryOpCost, NFC. Added OperandValueKind OpdInfo parameter to getMemoryOpCost functions to better estimate cost with immediate values. Part of D126885.	2022-08-19 07:33:00 -07:00
Fangrui Song	de9d80c1c5	[llvm] LLVM_FALLTHROUGH => [[fallthrough]]. NFC With C++17 there is no Clang pedantic warning or MSVC C5051.	2022-08-08 11:24:15 -07:00
Kazu Hirata	a2d4501718	[llvm] Fix comment typos (NFC)	2022-08-07 00:16:14 -07:00
Mingming Liu	bc8f2f3649	[AArch64][TTI][NFC] Overload method 'getVectorInstrCost' to provide vector instruction itself, as a context information for cost estimation. 1) Overloaded (instruction-based) method is a wrapper around the current (opcode-based) method. 2) This patch also changes a few callsites (VectorCombine.cpp, SLPVectorizer.cpp, CodeGenPrepare.cpp) to call the overloaded method. 3) This is a split of D128302. Differential Revision: https://reviews.llvm.org/D131114	2022-08-04 12:58:25 -07:00
Kazu Hirata	95a932fb15	Remove redundaunt override specifiers (NFC) Identified with modernize-use-override.	2022-07-24 22:28:11 -07:00
Yusra Syeda	6fb27bc2e3	[SystemZ][z/OS] Introduce CCAssignToRegAndStack to calling convention Differential Revision: https://reviews.llvm.org/D127328	2022-07-19 13:55:25 -04:00
Mubariz Afzal	c444f03787	Reland "[SystemZ][z/OS] Fix f32 variadic argument assertion" This patch relands the f32 vararg assertion on z/OS fix that was reverted previously due to the testcase failing on non-z/OS platforms. It is now passing. The tablegen lines that specify the XPLINK64 calling convention for promoting an f32 vararg to an f64 are effectively overwritten by the following tablegen line which bitcast an f64 vararg to an i64 (so that it can be used in the GPRs). Thus it becomes a bitcast from f32 to i64. We don't handle bitcasts for f32s and so this causes an assertion to be thrown. We fix this by simplifying the tablegen lines to explicity show this behaviour, and allow the f32 in the bitcast case by first promoting it to an f64.	2022-07-18 14:25:17 -04:00
Neumann Hon	e8f9a74fbf	[SystemZ][z/OS] Implement detection and handling for XPLink Leaf procedures. This PR adds support for creating leaf functions when there are no CSRs used, no function calls are made, no stack frame is acquired, and contain no try/catch/throw statements. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D129687	2022-07-17 14:30:33 -04:00
Kazu Hirata	5605a1eedd	Use drop_begin (NFC)	2022-07-15 23:58:11 -07:00
David Green	3e0bf1c7a9	[CodeGen] Move instruction predicate verification to emitInstruction D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses. This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. The allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally. The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments. Recommitted with some fixes for the leftover MCII variables in release builds. Differential Revision: https://reviews.llvm.org/D129506	2022-07-14 09:33:28 +01:00
David Green	95252133e1	Revert "Move instruction predicate verification to emitInstruction" This reverts commit e2fb8c0f4b940e0285ee36c112469fa75d4b60ff as it does not build for Release builds, and some buildbots are giving more warning than I saw locally. Reverting to fix those issues.	2022-07-13 13:28:11 +01:00
David Green	e2fb8c0f4b	Move instruction predicate verification to emitInstruction D25618 added a method to verify the instruction predicates for an emitted instruction, through verifyInstructionPredicates added into <Target>MCCodeEmitter::encodeInstruction. This is a very useful idea, but the implementation inside MCCodeEmitter made it only fire for object files, not assembly which most of the llvm test suite uses. This patch moves the code into the <Target>_MC::verifyInstructionPredicates method, inside the InstrInfo. The allows it to be called from other places, such as in this patch where it is called from the <Target>AsmPrinter::emitInstruction methods which should trigger for both assembly and object files. It can also be called from other places such as verifyInstruction, but that is not done here (it tends to catch errors earlier, but in reality just shows all the mir tests that have incorrect feature predicates). The interface was also simplified slightly, moving computeAvailableFeatures into the function so that it does not need to be called externally. The ARM, AMDGPU (but not R600), AVR, Mips and X86 backends all currently show errors in the test-suite, so have been disabled with FIXME comments. Differential Revision: https://reviews.llvm.org/D129506	2022-07-13 12:53:32 +01:00
Neumann Hon	c45ec53e7b	[SystemZ] [z/OS] Use assignCalleeSavedSpillSlots() to mark handle special registers in CSR list instead of determineCalleeSave This PR moves the handling of special registers that need to be saved/restored in the prolog/epilog respectively from determineCalleeSaves to assignCalleeSavedSpillSlots. The documentation of the parent function of assignCalleeSavedSpillSlots explicitly allows the modification of the CSI hence adding the special registers (the stack pointer register, the return address register, and the entry point register) to the CSI list at that stage should be permissible. This cleans up the code a bit and makes it so that we do not have to place registers that are not actually considered CSRs by the spec in the CSR list, which is something of a hack. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D125044	2022-07-06 22:22:25 -04:00
Kai Nacke	50b26de3c5	[SystemZ] Add support for tune-cpu attribute clang (like gcc) has the `-mtune=` command line option. This option adds the `"tune-cpu"` attribute to a function. The intended functionality is that the scheduling model of that cpu is used. E.g. `-mtune=z15 -march=z14` generates only instructions supported on z14 but uses the scheduling model of z15 for it. This PR adds the infrastructure to support this. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D128910	2022-06-30 12:50:11 -04:00
Jonas Paulsson	bfca9a0b99	[SystemZ] Fix the cost function for vector zero extend. Zero extend of a vector is done with either a single unpack or a vector permute, and the TTI cost function should reflect this. Review: Ulrich Weigand	2022-06-21 16:42:05 +02:00
Kazu Hirata	d66cbc565a	Don't use Optional::hasValue (NFC)	2022-06-20 20:26:05 -07:00
Kazu Hirata	e0e687a615	[llvm] Don't use Optional::hasValue (NFC)	2022-06-20 10:38:12 -07:00
Jonas Paulsson	3432d40c7f	[SystemZ] Remove unnecessary casts to SystemZInstrInfo (NFC). Review: Ulrich Weigand	2022-06-20 14:52:06 +02:00
Jonas Paulsson	4065ea8c0b	[SystemZ] Remove stray enum value in SystemZInstrInfo.h (NFC). Review: Ulrich Weigand	2022-06-20 14:52:06 +02:00
Fangrui Song	adf4142f76	[MC] De-capitalize SwitchSection. NFC Add SwitchSection to return switchSection. The API will be removed soon.	2022-06-10 22:50:55 -07:00
Yusra Syeda	487ace4c73	[SystemZ][z/OS] Add llvm.read_register() intrinsic support for zOS Differential Revision: https://reviews.llvm.org/D127412	2022-06-10 12:30:07 -04:00
Guillaume Chatelet	38637ee477	[clang] Add support for __builtin_memset_inline In the same spirit as D73543 and in reply to https://reviews.llvm.org/D126768#3549920 this patch is adding support for `__builtin_memset_inline`. The idea is to get support from the compiler to easily write efficient memory function implementations. This patch could be split in two: - one for the LLVM part adding the `llvm.memset.inline.*` intrinsics. - and another one for the Clang part providing the instrinsic as a builtin. Differential Revision: https://reviews.llvm.org/D126903	2022-06-10 13:13:59 +00:00
Jonas Paulsson	88c1cd86ee	[SystemZ] Use STDY/STEY/LDY/LEY for VR32/VR64 in eliminateFrameIndex(). When e.g. a VR64 register is spilled to a stack slot requiring a long (20-bit) displacement, it is possible to use an FP opcode if the allocated phys reg allows it. This eliminates the use of a separate LAY instruction. Reviewed By: Ulrich Weigand Differential Revision: https://reviews.llvm.org/D115406	2022-06-08 17:10:31 +02:00
Matt Arsenault	cc5a1b3dd9	llvm-reduce: Add cloning of target MachineFunctionInfo MIR support is totally unusable for AMDGPU without this, since the set of reserved registers is set from fields here. Add a clone method to MachineFunctionInfo. This is a subtle variant of the copy constructor that is required if there are any MIR constructs that use pointers. Specifically, at minimum fields that reference MachineBasicBlocks or the MachineFunction need to be adjusted to the values in the new function.	2022-06-07 10:14:48 -04:00
Guillaume Chatelet	0788186182	[Alignment][NFC] Remove usage of MemSDNode::getAlignment I can't remove the function just yet as it is used in the generated .inc files. I would also like to provide a way to compare alignment with TypeSize since it came up a few times. Differential Revision: https://reviews.llvm.org/D126910	2022-06-07 13:52:20 +00:00

1 2 3 4 5 ...

1793 Commits