llvm-project

Author	SHA1	Message	Date
Chen Zheng	1bf05fbc98	[PowerPC] refactor rewriteLoadStores for reusing; nfc This is split from https://reviews.llvm.org/D108750. Refactor rewriteLoadStores() so that we can reuse the outlined functions. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D110314	2021-10-07 12:59:20 +00:00
David Green	73346f5848	[ARM] Introduce a MQPRCopy Currently when creating tail predicated loops, we need to validate that all the live-outs of a loop will be equivalent with and without tail predication, and if they are not we cannot legally create a tail-predicated loop, leaving expensive vctp and vpst instructions in the loop. These notably can include register-allocation instructions like stack loads and stores, and copies lowered from COPYs to MVE_VORRs. Instead of trying to prove this is valid late in the pipeline, this patch introduces a MQPRCopy pseudo instruction that COPY is lowered to. This can then either be converted to a MVE_VORR where possible, or to a couple of VMOVD instructions if not. This way they do not behave differently within and outside of tail-predication regions, and we can know by construction that they are always valid. The idea is that we can do the same with stack load and stores, converting them to VLDR/VSTR or VLDM/VSTM where required to prove tail predication is always valid. This does unfortunately mean inserting multiple VMOVD instructions, instead of a single MVE_VORR, but my experiments show it to be an improvement in general. Differential Revision: https://reviews.llvm.org/D111048	2021-10-07 12:52:12 +01:00
Carl Ritson	b5d6ad20e1	[MachineCopyPropagation] Handle propagation of undef copies When propagating undefined copies the undef flag must also be propagated. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D111219	2021-10-07 20:34:27 +09:00
Simon Pilgrim	430ab92910	FunctionLoweringInfo.h - remove unused Optional.h include	2021-10-07 12:07:18 +01:00
Simon Pilgrim	b4f4bc0a68	TargetSchedule.h - remove unused Optional.h include	2021-10-07 12:07:17 +01:00
Simon Pilgrim	bb8dfefb23	MCSchedule.h - remove unused Optional.h include	2021-10-07 12:07:16 +01:00
Simon Pilgrim	e5fa68457a	[ExecutionEngine] remove unused <string> includes	2021-10-07 12:07:15 +01:00
Simon Pilgrim	05910b6beb	ScalarEvolution.h - remove unused Hashing.h include	2021-10-07 12:07:14 +01:00
David Green	bf916cdbd2	[ARM] Add tests for code that spills in tail predicate loops.	2021-10-07 11:35:02 +01:00
Jay Foad	df2d4bc4cb	[TwoAddressInstruction] Fix ReplacedAllUntiedUses in processTiedPairs Fix the calculation of ReplacedAllUntiedUses when any of the tied defs are early-clobber. The effect of this is to fix the placement of kill flags on an instruction like this (from @f2 in test/CodeGen/SystemZ/asm-18.ll): INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], killed %4:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit After TwoAddressInstruction without this patch: %3:grh32bit = COPY killed %4:grh32bit INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit Note that the COPY kills %4, even though there is a later use of %4 in the INLINEASM. This fails machine verification if you force it to run after TwoAddressInstruction (currently it is disabled for other reasons). After TwoAddressInstruction with this patch: %3:grh32bit = COPY %4:grh32bit INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit Differential Revision: https://reviews.llvm.org/D110848	2021-10-07 10:10:11 +01:00
Jay Foad	85abedd750	[TwoAddressInstruction] Pre-commit a test case for D110848	2021-10-07 10:10:11 +01:00
Cullen Rhodes	14cb138b15	[AArch64][SME] Update DUP (predicate) instruction Changes in architecture revision 00eac1: * Renamed to PSEL. * Copies whole source register. * Element type suffix removed from destination. * Element index no longer optional and '#' prefix has been removed. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-09 Depends on D111212. Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D111213	2021-10-07 08:55:11 +00:00
Cullen Rhodes	42ba79b7b0	[AArch64][SME] Update tile slice index offset Changes in architecture revision 00eac1: * Tile slice index offset no longer prefixed with '#'. * The syntax for 128-bit (.Q) ZA tile slice accesses must now include an explicit zero index. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-09 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D111212	2021-10-07 08:55:10 +00:00
Florian Hahn	09fdfd03ea	[VPlan] Replace hard-coded VPValue ids with patterns in tests. This makes the tests a bit more robust with respect to small changes in the value numbering.	2021-10-07 09:52:01 +01:00
Mikael Holmen	9bf5d91361	[GlobalISel] Silence gcc warning about unused variable	2021-10-07 07:18:04 +02:00
Itay Bookstein	40ec1c0f16	[IR][NFC] Rename getBaseObject to getAliaseeObject To better reflect the meaning of the now-disambiguated {GlobalValue, GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction (D109792), the function is renamed to getAliaseeObject.	2021-10-06 19:33:10 -07:00
David Blaikie	f6a561c4d6	DebugInfo: Use clang's preferred names for integer types This reverts c7f16ab3e3f27d944db72908c9c1b1b7366f5515 / r109694 - which suggested this was done to improve consistency with the gdb test suite. Possible that at the time GCC did not canonicalize integer types, and so matching types was important for cross-compiler validity, or that it was only a case of over-constrained test cases that printed out/tested the exact names of integer types. In any case neither issue seems to exist today based on my limited testing - both gdb and lldb canonicalize integer types (in a way that happens to match Clang's preferred naming, incidentally) and so never print the original text name produced in the DWARF by GCC or Clang. This canonicalization appears to be in `integer_types_same_name_p` for GDB and in `TypeSystemClang::GetBasicTypeEnumeration` for lldb. (I tested this with one translation unit defining 3 variables - `long`, `long (*)()`, and `int (*)()`, and another translation unit that had main, and a function that took `long (*)()` as a parameter - then compiled them with mismatched compilers (either GCC+Clang, or Clang+(Clang with this patch applied)) and no matter the combination, despite the debug info for one CU naming the type "long int" and the other naming it "long", both debuggers printed out the name as "long" and were able to correctly perform overload resolution and pass the `long int (*)()` variable to the `long (*)()` function parameter) Did find one hiccup, identified by the lldb test suite - that CodeView was relying on these names to map them to builtin types in that format. So added some handling for that in LLVM. (these could be split out into separate patches, but seems small enough to not warrant it - will do that if there ends up needing any reverting/revisiting) Differential Revision: https://reviews.llvm.org/D110455	2021-10-06 16:02:34 -07:00
Kuba Mracek	7329abf2f8	[GlobalDCE] In VFE, replace the whole 'sub' expression of unused relative-pointer-based vtable slots Differential Revision: https://reviews.llvm.org/D109114	2021-10-06 15:55:55 -07:00
LLVM GN Syncbot	ae4c0c7cfc	[gn build] Port ccfb0555f76b	2021-10-06 22:24:38 +00:00
Arthur Eubanks	72dddce652	More size_t -> uint64_t fixes after 05392466 Fixes some bots where the two differ.	2021-10-06 15:13:47 -07:00
Philip Reames	1183d65b4d	[SCEV] Search operand tree for scope bound when inferring flags from IR When checking to see if we can apply IR flags to a SCEV, we need to identify a bound on the defining scope of the SCEV to be produced. We'd previously added support for a couple SCEVExpr types which trivially imply bounds, but hadn't handled types such as umax where the bounds come from the bounds of the operands. This does the obvious thing, and recurses through operands searching for a tighter bound on the defining scope. I'm honestly surprised by how little this seems to matter on existing tests, but it's worth doing for completeness' sake alone. Differential Revision: https://reviews.llvm.org/D111191	2021-10-06 15:10:02 -07:00
Craig Topper	58b68e70eb	[X86] Don't use popcnt for parity if only bits 7:0 of the input can be non-zero. Without popcnt we had a special case for using the parity flag from a single test i8 test instruction if only bits 7:0 could be non-zero. That special case is still useful when we have popcnt. To reach this special case, we enable custom lowering of parity for i16/i32/i64 even when popcnt is enabled. The check for POPCNT being enabled is now after the special case in LowerPARITY. Fixes PR52093 Differential Revision: https://reviews.llvm.org/D111249	2021-10-06 14:39:57 -07:00
Arthur Eubanks	ab7d421869	size_t -> uint64_t after 05392466 Fixes some bots where the two differ.	2021-10-06 13:52:04 -07:00
Arthur Eubanks	05392466f0	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 13:29:23 -07:00
LLVM GN Syncbot	9f5c70c7ad	[gn build] Port 67231650e6ef	2021-10-06 20:22:05 +00:00
Nikita Popov	17c20a6dfb	[SCEV] Avoid unnecessary domination checks (NFC) When determining the defining scope, avoid repeatedly querying domination against the function entry instruction. This ends up being a very common case that we can handle more efficiently.	2021-10-06 22:14:04 +02:00
Roman Lebedev	62d67d9e7c	[NFC][X86][LoopVectorize] Autogenerate check lines in a few tests for ease of updating For D111220	2021-10-06 22:54:15 +03:00
Nico Weber	07e5394c63	[gn build] (manually) port 77d5ccdc6f460 (similar to 64f623d4c37c6)	2021-10-06 15:41:38 -04:00
Chris Lattner	ad37a45a2e	[APInt] Fix isAllOnes and extractBits for zero width values. isAllOnes() should return true for zero bit values because there are no zeros in it. Thanks to Jay Foad for pointing this out. Differential Revision: https://reviews.llvm.org/D111241	2021-10-06 12:37:53 -07:00
Philip Reames	a7ae227baf	[scev] minor style improvement [nfc]	2021-10-06 12:15:16 -07:00
Philip Reames	2b3d913cc5	[tests] precommit test changes for D111191	2021-10-06 12:12:49 -07:00
Philip Reames	67896f494e	Returning poison from a function w/ noundef return attribute is UB This does for readability of returns within said function as what we do for the caller side when reasoning about what might be poison. Differential Revision: https://reviews.llvm.org/D111180	2021-10-06 11:52:18 -07:00
Wolfgang Pieb	f53d05135e	[UBSAN][PS4] For the PS4 target, emit the ud2 opcode for ubsan traps. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D111172	2021-10-06 11:49:36 -07:00
wlei	16516f8925	[llvm-profgen] Support symbol list for accurate profile Differential Revision: https://reviews.llvm.org/D110859	2021-10-06 11:41:39 -07:00
Arthur Eubanks	569346f274	Revert "Reland [IR] Increase max alignment to 4GB" This reverts commit 8d64314ffea55f2ad94c1b489586daa8ce30f451.	2021-10-06 11:38:11 -07:00
Arthur Eubanks	1b76312e98	Update some types after D110451 To fix mismatched size_t vs uint64_t on some platforms.	2021-10-06 11:27:48 -07:00
Stefan Pintilie	740086596c	[PowerPC] Fix issue with lowering byval parameters. Lowering of byval parameters with sizes that are not represented by a single store requires multiple stores to properly address the correct size of the parameter. Sizes that cannot be done with a single store are 3 bytes, 5 bytes, 6 bytes, 7 bytes. It is not correct to simply perform an 8 byte store for these elements because then the store would be larger than the element and alias analysis would assume that this is undefined behaviour and return NoAlias for them. This patch adds the correct stores so that the size of the store is not larger than the size of the element. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D108795	2021-10-06 13:19:15 -05:00
Philip Reames	0658bab870	[SCEV] Infer flags from add/gep in any block This patch removes a compile time restriction from isSCEVExprNeverPoison. We've strengthened our ability to reason about flags on scopes other than addrecs, and this bailout prevents us from using it. The comment is also suspect as well in that we're in the middle of constructing a SCEV for I. As such, we're going to visit all operands anyways. Differential Revision: https://reviews.llvm.org/D111186	2021-10-06 11:11:54 -07:00
Simon Pilgrim	2ced9a42be	[CostModel][TTI] Replace BAD_ICMP_PREDICATE with ICMP_NE for generic smulo/umulo cost expansion Match the predicate used in TargetLowering::expandMULO to detect overflow	2021-10-06 19:11:33 +01:00
Simon Pilgrim	7bd097fd1e	[CostModel][TTI] Fix ops used for generic smulo/umulo cost expansion Fix copy+pasta that was checking for smul_fix instead of smul_with_overflow to detect signed values. The LShr is performed on the extended type as we use it to truncate+extract the upper/hi bits of the extended multiply. More closely matches the default expansion from TargetLowering::expandMULO	2021-10-06 19:11:32 +01:00
Simon Pilgrim	81b5da8c97	[CostModel][TTI] Replace BAD_ICMP_PREDICATE with ICMP_ULT/UGT for generic uadd/usubo cost expansion Match the predicates used in TargetLowering::expandUADDSUBO	2021-10-06 19:11:32 +01:00
Arthur Eubanks	8d64314ffe	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 11:03:51 -07:00
Craig Topper	05de0ab431	[X86] Add X86 and X64 prefixes to parity.ll to reduce duplicate check lines. NFC	2021-10-06 11:01:03 -07:00
LLVM GN Syncbot	665662a71e	[gn build] Port 10f16bc7b2bf	2021-10-06 17:59:12 +00:00
Kevin P. Neal	f86c930cc9	[FPEnv][InstSimplify] Fold constrained X + -0.0 ==> X Currently the fadd optimizations in InstSimplify don't know how to do this "X + -0.0 ==> X" fold when using the constrained intrinsics. This adds the support. This commit is derived from D106362 with some improvements from D107285. Differential Revision: https://reviews.llvm.org/D111085	2021-10-06 13:52:31 -04:00
Craig Topper	fa7a1bea2d	[X86] Add test cases for PR52093. NFC	2021-10-06 10:33:15 -07:00
Arthur Eubanks	72cf8b6044	Revert "[IR] Increase max alignment to 4GB" This reverts commit df84c1fe78130a86445d57563dea742e1b85156a. Breaks some bots	2021-10-06 10:21:35 -07:00
Shivam Gupta	7a189333ed	[NFC] Add doxygen comment for hasFp in RISCVFrameLowering.cpp	2021-10-06 22:35:28 +05:30
Pengxuan Zheng	b0045f5595	[ARM] Fix a bug in finding a pair of extracts to create VMOVRRD D100244 missed a check on the ResNo of the extract's operand 0 when finding a pair of extracts to combine into a VMOVRRD (extract(x, n); extract(x, n+1) -> VMOVRRD(extract x, n/2)). As a result, it can incorrectly pair an extract(x, n) with another extract(x:3, n+1) for example. This patch fixes the bug by adding the proper check on ResNo. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D111188	2021-10-06 10:03:32 -07:00
Arthur Eubanks	df84c1fe78	[IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 09:54:14 -07:00
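
The "[IR] Increase max alignment to 4GB" message above notes that only 33 of the 64 values available to the alignment field are used. A minimal sketch of that arithmetic follows, assuming the field holds the base-2 logarithm of a power-of-two alignment in 6 bits. The names `MAX_LOG2`, `encode_alignment`, and `decode_alignment` are hypothetical and are not LLVM APIs; this is an illustration of the counting argument, not the actual implementation.

```python
# Illustrative sketch only: models an alignment stored as log2(alignment).
# All names here are hypothetical and do not correspond to LLVM functions.

MAX_LOG2 = 32  # 2**32 bytes = 4GB, the limit stated in the commit message


def encode_alignment(align: int) -> int:
    """Return log2(align) for a power-of-two alignment of at most 4GB."""
    if align <= 0 or align & (align - 1):
        raise ValueError("alignment must be a positive power of two")
    log2 = align.bit_length() - 1
    if log2 > MAX_LOG2:
        raise ValueError("alignment exceeds the 4GB limit")
    return log2


def decode_alignment(encoded: int) -> int:
    """Return the alignment in bytes for an encoded log2 value."""
    if not 0 <= encoded <= MAX_LOG2:
        raise ValueError("encoded value out of range")
    return 1 << encoded


# Exponents 0..32 give 33 distinct encodings, which fit in a 6-bit field
# (64 possible values), matching the "33 of the 64 values" remark.
USED_ENCODINGS = MAX_LOG2 + 1
FIELD_CAPACITY = 1 << 6
```

Under this assumption, going from a 1GB limit (exponents 0..30) to 4GB (exponents 0..32) crosses the 5-bit boundary of 32 values, which is consistent with the message's statement that an extra bit is needed.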


222473 Commits