llvm-project

Author	SHA1	Message	Date
Chen Zheng	1bf05fbc98	[PowerPC] refactor rewriteLoadStores for reusing; nfc This is split from https://reviews.llvm.org/D108750. Refactor rewriteLoadStores() so that we can reuse the outlined functions. Reviewed By: jsji Differential Revision: https://reviews.llvm.org/D110314	2021-10-07 12:59:20 +00:00
David Green	73346f5848	[ARM] Introduce a MQPRCopy Currently when creating tail predicated loops, we need to validate that all the live-outs of a loop will be equivalent with and without tail predication, and if they are not we cannot legally create a tail-predicated loop, leaving expensive vctp and vpst instructions in the loop. These notably can include register-allocation instructions like stack loads and stores, and copies lowered from COPYs to MVE_VORRs. Instead of trying to prove this is valid late in the pipeline, this patch introduces a MQPRCopy pseudo instruction that COPY is lowered to. This can then either be converted to a MVE_VORR where possible, or to a couple of VMOVD instructions if not. This way they do not behave differently within and outside of tail-predication regions, and we can know by construction that they are always valid. The idea is that we can do the same with stack load and stores, converting them to VLDR/VSTR or VLDM/VSTM where required to prove tail predication is always valid. This does unfortunately mean inserting multiple VMOVD instructions, instead of a single MVE_VORR, but my experiments show it to be an improvement in general. Differential Revision: https://reviews.llvm.org/D111048	2021-10-07 12:52:12 +01:00
Carl Ritson	b5d6ad20e1	[MachineCopyPropagation] Handle propagation of undef copies When propagating undefined copies the undef flag must also be propagated. Reviewed By: foad Differential Revision: https://reviews.llvm.org/D111219	2021-10-07 20:34:27 +09:00
Simon Pilgrim	e5fa68457a	[ExecutionEngine] remove unused <string> includes	2021-10-07 12:07:15 +01:00
Jay Foad	df2d4bc4cb	[TwoAddressInstruction] Fix ReplacedAllUntiedUses in processTiedPairs Fix the calculation of ReplacedAllUntiedUses when any of the tied defs are early-clobber. The effect of this is to fix the placement of kill flags on an instruction like this (from @f2 in test/CodeGen/SystemZ/asm-18.ll): INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], killed %4:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit After TwoAddressInstruction without this patch: %3:grh32bit = COPY killed %4:grh32bit INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit Note that the COPY kills %4, even though there is a later use of %4 in the INLINEASM. This fails machine verification if you force it to run after TwoAddressInstruction (currently it is disabled for other reasons). After TwoAddressInstruction with this patch: %3:grh32bit = COPY %4:grh32bit INLINEASM &"stepb $1, $2" [attdialect], $0:[regdef-ec:GRH32Bit], def early-clobber %3:grh32bit, $1:[reguse tiedto:$0], %3:grh32bit(tied-def 3), $2:[reguse:GRH32Bit], %4:grh32bit Differential Revision: https://reviews.llvm.org/D110848	2021-10-07 10:10:11 +01:00
Cullen Rhodes	14cb138b15	[AArch64][SME] Update DUP (predicate) instruction Changes in architecture revision 00eac1: * Renamed to PSEL. * Copies whole source register. * Element type suffix removed from destination. * Element index no longer optional and '#' prefix has been removed. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-09 Depends on D111212. Reviewed By: kmclaughlin Differential Revision: https://reviews.llvm.org/D111213	2021-10-07 08:55:11 +00:00
Cullen Rhodes	42ba79b7b0	[AArch64][SME] Update tile slice index offset Changes in architecture revision 00eac1: * Tile slice index offset no longer prefixed with '#'. * The syntax for 128-bit (.Q) ZA tile slice accesses must now include an explicit zero index. The reference can be found here: https://developer.arm.com/documentation/ddi0602/2021-09 Reviewed By: david-arm Differential Revision: https://reviews.llvm.org/D111212	2021-10-07 08:55:10 +00:00
Mikael Holmen	9bf5d91361	[GlobalISel] Silence gcc warning about unused variable	2021-10-07 07:18:04 +02:00
Itay Bookstein	40ec1c0f16	[IR][NFC] Rename getBaseObject to getAliaseeObject To better reflect the meaning of the now-disambiguated {GlobalValue, GlobalAlias}::getBaseObject after breaking off GlobalIFunc::getResolverFunction (D109792), the function is renamed to getAliaseeObject.	2021-10-06 19:33:10 -07:00
David Blaikie	f6a561c4d6	DebugInfo: Use clang's preferred names for integer types This reverts c7f16ab3e3f27d944db72908c9c1b1b7366f5515 / r109694 - which suggested this was done to improve consistency with the gdb test suite. Possible that at the time GCC did not canonicalize integer types, and so matching types was important for cross-compiler validity, or that it was only a case of over-constrained test cases that printed out/tested the exact names of integer types. In any case neither issue seems to exist today based on my limited testing - both gdb and lldb canonicalize integer types (in a way that happens to match Clang's preferred naming, incidentally) and so never print the original text name produced in the DWARF by GCC or Clang. This canonicalization appears to be in `integer_types_same_name_p` for GDB and in `TypeSystemClang::GetBasicTypeEnumeration` for lldb. (I tested this with one translation unit defining 3 variables - `long`, `long ()()`, and `int ()()`, and another translation unit that had main, and a function that took `long ()()` as a parameter - then compiled them with mismatched compilers (either GCC+Clang, or Clang+(Clang with this patch applied)) and no matter the combination, despite the debug info for one CU naming the type "long int" and the other naming it "long", both debuggers printed out the name as "long" and were able to correctly perform overload resolution and pass the `long int ()()` variable to the `long (*)()` function parameter) Did find one hiccup, identified by the lldb test suite - that CodeView was relying on these names to map them to builtin types in that format. So added some handling for that in LLVM. (these could be split out into separate patches, but seems small enough to not warrant it - will do that if there ends up needing any reverting/revisiting) Differential Revision: https://reviews.llvm.org/D110455	2021-10-06 16:02:34 -07:00
Kuba Mracek	7329abf2f8	[GlobalDCE] In VFE, replace the whole 'sub' expression of unused relative-pointer-based vtable slots Differential Revision: https://reviews.llvm.org/D109114	2021-10-06 15:55:55 -07:00
Arthur Eubanks	72dddce652	More size_t -> uint64_t fixes after 05392466 Fixes some bots where the two differ.	2021-10-06 15:13:47 -07:00
Philip Reames	1183d65b4d	[SCEV] Search operand tree for scope bound when inferring flags from IR When checking to see if we can apply IR flags to a SCEV, we need to identify a bound on the defining scope of the SCEV to be produced. We'd previously added support for a couple SCEVExpr types which trivially imply bounds, but hadn't handled types such as umax where the bounds come from the bounds of the operands. This does the obvious thing, and recurses through operands searching for a tighter bound on the defining scope. I'm honestly surprised by how little this seems to matter on existing tests, but it's worth doing for completeness' sake alone. Differential Revision: https://reviews.llvm.org/D111191	2021-10-06 15:10:02 -07:00
Craig Topper	58b68e70eb	[X86] Don't use popcnt for parity if only bits 7:0 of the input can be non-zero. Without popcnt we had a special case for using the parity flag from a single i8 test instruction if only bits 7:0 could be non-zero. That special case is still useful when we have popcnt. To reach this special case, we enable custom lowering of parity for i16/i32/i64 even when popcnt is enabled. The check for POPCNT being enabled is now after the special case in LowerPARITY. Fixes PR52093 Differential Revision: https://reviews.llvm.org/D111249	2021-10-06 14:39:57 -07:00
Arthur Eubanks	ab7d421869	size_t -> uint64_t after 05392466 Fixes some bots where the two differ.	2021-10-06 13:52:04 -07:00
Arthur Eubanks	05392466f0	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 13:29:23 -07:00
Nikita Popov	17c20a6dfb	[SCEV] Avoid unnecessary domination checks (NFC) When determining the defining scope, avoid repeatedly querying domination against the function entry instruction. This ends up being a very common case that we can handle more efficiently.	2021-10-06 22:14:04 +02:00
Chris Lattner	ad37a45a2e	[APInt] Fix isAllOnes and extractBits for zero width values. isAllOnes() should return true for zero bit values because there are no zeros in it. Thanks to Jay Foad for pointing this out. Differential Revision: https://reviews.llvm.org/D111241	2021-10-06 12:37:53 -07:00
Philip Reames	a7ae227baf	[scev] minor style improvement [nfc]	2021-10-06 12:15:16 -07:00
Philip Reames	67896f494e	Returning poison from a function w/ noundef return attribute is UB This does for readability of returns within said function as what we do for the caller side when reasoning about what might be poison. Differential Revision: https://reviews.llvm.org/D111180	2021-10-06 11:52:18 -07:00
Wolfgang Pieb	f53d05135e	[UBSAN][PS4] For the PS4 target, emit the ud2 opcode for ubsan traps. Reviewed By: probinson Differential Revision: https://reviews.llvm.org/D111172	2021-10-06 11:49:36 -07:00
Arthur Eubanks	569346f274	Revert "Reland [IR] Increase max alignment to 4GB" This reverts commit 8d64314ffea55f2ad94c1b489586daa8ce30f451.	2021-10-06 11:38:11 -07:00
Arthur Eubanks	1b76312e98	Update some types after D110451 To fix mismatched size_t vs uint64_t on some platforms.	2021-10-06 11:27:48 -07:00
Stefan Pintilie	740086596c	[PowerPC] Fix issue with lowering byval parameters. Lowering of byval parameters with sizes that are not represented by a single store requires multiple stores to properly address the correct size of the parameter. Sizes that cannot be done with a single store are 3 bytes, 5 bytes, 6 bytes, 7 bytes. It is not correct to simply perform an 8 byte store for these elements because then the store would be larger than the element and alias analysis would assume that this is undefined behaviour and return NoAlias for them. This patch adds the correct stores so that the size of the store is not larger than the size of the element. Reviewed By: nemanjai, #powerpc Differential Revision: https://reviews.llvm.org/D108795	2021-10-06 13:19:15 -05:00
Philip Reames	0658bab870	[SCEV] Infer flags from add/gep in any block This patch removes a compile time restriction from isSCEVExprNeverPoison. We've strengthened our ability to reason about flags on scopes other than addrecs, and this bailout prevents us from using it. The comment is also suspect as well in that we're in the middle of constructing a SCEV for I. As such, we're going to visit all operands anyways. Differential Revision: https://reviews.llvm.org/D111186	2021-10-06 11:11:54 -07:00
Arthur Eubanks	8d64314ffe	Reland [IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 11:03:51 -07:00
Kevin P. Neal	f86c930cc9	[FPEnv][InstSimplify] Fold constrained X + -0.0 ==> X Currently the fadd optimizations in InstSimplify don't know how to do this "X + -0.0 ==> X" fold when using the constrained intrinsics. This adds the support. This commit is derived from D106362 with some improvements from D107285. Differential Revision: https://reviews.llvm.org/D111085	2021-10-06 13:52:31 -04:00
Arthur Eubanks	72cf8b6044	Revert "[IR] Increase max alignment to 4GB" This reverts commit df84c1fe78130a86445d57563dea742e1b85156a. Breaks some bots	2021-10-06 10:21:35 -07:00
Shivam Gupta	7a189333ed	[NFC] Add doxygen comment for hasFp in RISCVFrameLowering.cpp	2021-10-06 22:35:28 +05:30
Pengxuan Zheng	b0045f5595	[ARM] Fix a bug in finding a pair of extracts to create VMOVRRD D100244 missed a check on the ResNo of the extract's operand 0 when finding a pair of extracts to combine into a VMOVRRD (extract(x, n); extract(x, n+1) -> VMOVRRD(extract x, n/2)). As a result, it can incorrectly pair an extract(x, n) with another extract(x:3, n+1) for example. This patch fixes the bug by adding the proper check on ResNo. Reviewed By: dmgreen Differential Revision: https://reviews.llvm.org/D111188	2021-10-06 10:03:32 -07:00
Arthur Eubanks	df84c1fe78	[IR] Increase max alignment to 4GB Currently the max alignment representable is 1GB, see D108661. Setting the align of an object to 4GB is desirable in some cases to make sure the lower 32 bits are clear which can be used for some optimizations, e.g. https://crbug.com/1016945. This uses an extra bit in instructions that carry an alignment. We can store 15 bits of "free" information, and with this change some instructions (e.g. AtomicCmpXchgInst) use 14 bits. We can increase the max alignment representable above 4GB (up to 2^62) since we're only using 33 of the 64 values, but I've just limited it to 4GB for now. The one place we have to update the bitcode format is for the alloca instruction. It stores its alignment into 5 bits of a 32 bit bitfield. I've added another field which is 8 bits and should be future proof for a while. For backward compatibility, we check if the old field has a value and use that, otherwise use the new field. Updating clang's max allowed alignment will come in a future patch. Reviewed By: hans Differential Revision: https://reviews.llvm.org/D110451	2021-10-06 09:54:14 -07:00
Nikita Popov	1301a8b473	[BasicAA] Don't unnecessarily extend pointer size BasicAA GEP decomposition currently performs all calculation on the maximum pointer size, but at least 64-bit, with an option to double the size. The code comment claims that this improves analysis power when working with uint64_t indices on 32-bit systems. However, I don't see how this can be, at least while maintaining correctness: When working on canonical code, the GEP indices will have GEP index size. If the original code worked on uint64_t with a 32-bit size_t, then there will be truncs inserted before use as a GEP index. Linear expression decomposition does not look through truncs, so this will be an opaque value as far as GEP decomposition is concerned. Working on a wider pointer size does not help here (or have any effect at all). When working on non-canonical code (before first InstCombine), the GEP indices are implicitly truncated to GEP index size. The BasicAA code currently just ignores this fact completely, and pretends that this truncation doesn't happen. This is incorrect and will be addressed by D110977. I believe that for correctness reasons, it is important to work on the actual GEP index size to properly model potential overflow. BasicAA tries to patch over the fact that it uses the wrong size (see adjustToPointerSize), but it only does that in limited cases (only for constant values, and not all of them either). I'd like to move this code towards always working on the correct size, and dropping these artificial pointer size adjustments is the first step towards that. Differential Revision: https://reviews.llvm.org/D110657	2021-10-06 18:40:21 +02:00
Sanjay Patel	e36d351d19	[InstSimplify] (x | y) & (x | !y) --> x https://alive2.llvm.org/ce/z/QagQMn This fold is handled by instcombine via SimplifyUsingDistributiveLaws(), but we are missing the sibling fold for 'logical and' (implemented with 'select'). Retrofitting the code in instcombine looks much harder than just adding a small adjustment here, and this is potentially more efficient and beneficial to other passes.	2021-10-06 12:31:25 -04:00
Clement Courbet	3255015407	Fix incomplete conflict resolution in ff41fc07b12bd7bf3c8cd238824b16b1066fe5a0	2021-10-06 16:55:14 +02:00
Clement Courbet	ff41fc07b1	Revert "[AA] Teach BasicAA to recognize basic GEP range information." We have found a miscompile with this change, reverting while working on a reproducer. This reverts commit 455b60ccfbfdbb5d2b652666050544c31e6673b1.	2021-10-06 16:49:10 +02:00
Simon Pilgrim	0dcd2b40e6	[TTI] Remove default condition type and predicate arguments from getCmpSelInstrCost We need to be better at exposing the comparison predicate to getCmpSelInstrCost calls as some targets (e.g. X86 SSE) have very different costs for different comparisons (PR48337), and we can't always rely on the optional Instruction argument. This initial commit requires explicit condition type and predicate arguments. The next step will be to review a lot of the existing getCmpSelInstrCost calls which have used BAD_ICMP_PREDICATE even when the predicate is known. Differential Revision: https://reviews.llvm.org/D111024	2021-10-06 15:40:35 +01:00
luxufan	b384736b20	Revert "[JITLink][NFC] Add TableManager to replace PerGraph...Builder pass" This reverts commit 50a278c2aef21bf9b78865ad7c7554e506434b9c.	2021-10-06 21:34:18 +08:00
Simon Pilgrim	f6fa95b77f	[Support] ErrorHandling.h - Remove report_fatal_error(std::string) As described on D111049, removing the <string> dependency from error handling removes considerable build overhead; it's recommended that the report_fatal_error(Twine) variant is used instead.	2021-10-06 14:32:38 +01:00
luxufan	50a278c2ae	[JITLink][NFC] Add TableManager to replace PerGraph...Builder pass This patch adds a TableManager which is responsible for fixing edges that need entries to reference the target symbol and constructing such entries. In the past, the PerGraphGOTAndPLTStubsBuilder pass was used to build GOT and PLT entries, and the PerGraphTLSInfoEntryBuilder pass was used to build TLSInfo entries. By generalizing the behavior of building entries, I added a TableManager which can be reused when building GOT, PLT and TLSInfo entries. If this patch makes sense and can be accepted, I will apply the TableManager to other targets (MachO_x86_64, MachO_arm64, ELF_riscv), and delete the file PerGraphGOTAndPLTStubsBuilder.h Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D110383	2021-10-06 21:24:34 +08:00
Sanjay Patel	db231ebdb0	[InstCombine] fold fake vector extract to shift+trunc We already handle more complicated cases like: extelt (bitcast (inselt poison, X, 0)) --> trunc (lshr X) But we missed this simpler pattern: https://alive2.llvm.org/ce/z/D55h64 / https://alive2.llvm.org/ce/z/GKzzRq This is part of solving: https://llvm.org/PR52057 I made the transform depend on legal/desirable int type to avoid creating a shift of an illegal type (for example i128). I'm not sure if that restriction is actually necessary, but we can change that as a follow-up if the backend can deal with integer ops on too-wide illegal types. The pile of AVX512 test changes are all neutral AFAICT - the x86 backend seems to know how to turn that into the expected "kmov" instructions. Differential Revision: https://reviews.llvm.org/D111082	2021-10-06 08:12:05 -04:00
Amara Emerson	79d13bf22c	Revert "Revert "[GlobalISel][IRTranslator] Emit trap intrinsic for "unreachable""" This reverts commit d95cd81141a4e398e0d3337cb2e6617281d06278. Re-land the original patch now that the bug this exposed in selection has been fixed by 6bc64e24c38a	2021-10-06 04:16:19 -07:00
Simon Pilgrim	e9f4fa75ed	[llvm] Unix.h - Replace report_fatal_error(std::string) with report_fatal_error(Twine) As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.	2021-10-06 12:13:40 +01:00
Simon Pilgrim	21661607ca	[llvm] Replace report_fatal_error(std::string) uses with report_fatal_error(Twine) As described on D111049, we're trying to remove the <string> dependency from error handling and replace uses of report_fatal_error(const std::string&) with the Twine() variant which can be forward declared.	2021-10-06 12:04:30 +01:00
Nathan Sidwell	c11e7b59d2	[X86][NFC] structure-return simplification The X86 backend only needs to know whether structure return is via an sret pointer. This removes the categorization enumeration and adjusts, templatizes and renames the related functions. Differential Revision: https://reviews.llvm.org/D109966	2021-10-06 03:12:48 -07:00
Simon Pilgrim	0776924a17	[CostModel][X86] getCmpSelInstrCost - treat BAD_PREDICATEs the same as the worst case cost predicates for ICMP/FCMP instructions As suggested on D111024, we should treat getCmpSelInstrCost calls without a specific predicate as matching the worst case predicate cost. These regressions will be addressed with a mixture of D111024 and fixing other specific getCmpSelInstrCost calls to have realistic predicates.	2021-10-06 10:14:56 +01:00
Jonas Paulsson	3562076dfc	[SystemZ] Temporarily revert memcmp and memcpy patches Seem to cause test failures in compiler-rt. Revert "[SystemZ] Implement memcmp of variable length with CLC." This reverts commit 7a4e9a0c73667cb80e4572d41535a9e48f1ed9ef. Revert "[SystemZ] Implement memcpy of variable length with MVC." This reverts commit c6c13c58eebda605a9a05f1f13cac1e46407afc7.	2021-10-06 11:05:18 +02:00
David Spickett	fc36fb4d23	Revert "Second Recommit "[AArch64] Split bitmask immediate of bitwise AND operation"" This reverts commit 13f3c39f3658fa28cb008eb56a58d8e34697cd5d. Due to test failures in stage 2 clang tests on AArch64 bots.	2021-10-06 08:39:48 +00:00
David Sherwood	37edb7d3e2	[SVE] Fix incorrect DAG combines when extracting fixed-width from scalable vectors We were previously silently generating incorrect code when extracting a fixed-width vector from a scalable vector. This is worse than crashing, since the user will have no indication that this is currently unsupported behaviour. I have fixed the code to only perform DAG combines when safe to do so, i.e. the input and output vectors are both fixed-width or both scalable. Test added here: CodeGen/AArch64/sve-extract-scalable-vector.ll Differential revision: https://reviews.llvm.org/D110624	2021-10-06 09:27:44 +01:00
Paulo Matos	0c7495848a	[WebAssembly] Fix call_indirect on funcrefs The current implementation of funcrefs is broken since it is putting the funcref itself on the stack before the call_indirect. Instead what should be on the stack is the constant 0, which is the index at which we store the funcref in __funcref_call_table. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D111152	2021-10-06 10:11:53 +02:00
Paulo Matos	91fe069c35	[WebAssembly] De-duplicate WasmAddressSpace and refactor reftype predicates This is a non-functional change to remove the duplicate WasmAddressSpace enum and refactor reftype predicates by moving them to the Utilities source file. Reviewed By: tlively Differential Revision: https://reviews.llvm.org/D111144	2021-10-06 09:56:23 +02:00
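
The identity behind the InstSimplify entry in this log (e36d351d19), (x | y) & (x | !y) --> x, can be checked by brute force. The sketch below is not code from that commit: it reads `!y` as bitwise NOT, tests only the plain bitwise form over 8-bit values, and does not model the 'logical and' (select) variant or poison semantics that the commit message mentions. The names foldLhs and foldHoldsForAllBytes are hypothetical.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the fold, with `!y` interpreted as bitwise NOT (assumption).
constexpr std::uint8_t foldLhs(std::uint8_t x, std::uint8_t y) {
  return static_cast<std::uint8_t>((x | y) & (x | static_cast<std::uint8_t>(~y)));
}

// One hand-checked sample: x = 0x0A, y = 0x06.
static_assert(foldLhs(0x0A, 0x06) == 0x0A, "fold should reduce to x");

// Returns true if (x | y) & (x | ~y) == x for every pair of 8-bit values.
bool foldHoldsForAllBytes() {
  for (unsigned x = 0; x < 256; ++x)
    for (unsigned y = 0; y < 256; ++y)
      if (foldLhs(static_cast<std::uint8_t>(x), static_cast<std::uint8_t>(y)) != x)
        return false;
  return true;
}
```

The check works bit by bit: by distributivity the left-hand side equals x | (y & ~y), and y & ~y is always zero, so only x remains.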


151414 Commits