llvm-project

Author	SHA1	Message	Date
Krzysztof Parzyszek	49e75ebd85	[Bitcode(Reader|Writer)] Convert Optional to std::optional	2022-12-07 15:27:38 -08:00
Jon Chesterfield	d77ae7f251	[amdgpu] Reimplement LDS lowering Renames the current lowering scheme to "module" and introduces two new ones, "kernel" and "table", plus a "hybrid" that chooses between those three on a per-variable basis. Unit tests are set up to pass with the default lowering of "module" or "hybrid" with this patch defaulting to "module", which will be a less dramatic codegen change relative to the current. This reflects the sparsity of test coverage for the table lowering method. Hybrid is better than module in every respect and will be default in a subsequent patch. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D139433	2022-12-07 22:02:54 +00:00
Chris Bieneman	c861ea8736	Generate DXIL Shader hash DXIL shader bitcode is hashed and the hash is placed into the final output object file in its own data part. This change modifies the DXContainerGlobals pass to compute the shader hash (just an MD5 of the bitcode) and put the shader hash data into a global for the HASH part. This also sets the hash flag as appropriate for if the hashed shader contained debug information. There is additional handling required to get debug information in shaders working correctly with our tooling, but that will be addressed in subsequent patches. Reviewed By: python3kgae Differential Revision: https://reviews.llvm.org/D139357	2022-12-07 15:22:55 -06:00
Alexander Yermolovich	f2f8f70953	Revert "[llvm][dwwarf] Change CU/TU index to 64-bit" This reverts commit 5ebd28f3e56c00a739fda46c72c9e0f6528add87.	2022-12-07 13:14:23 -08:00
Alexander Yermolovich	a77376479d	Revert "[DWARFLibrary] Add support to re-construct cu-index" This reverts commit a5bd76a6e3119af9dd9c1d8af89e2b89f5267deb.	2022-12-07 13:14:11 -08:00
Alexander Yermolovich	a5bd76a6e3	[DWARFLibrary] Add support to re-construct cu-index Summary: According to DWARF5 specification and gnu specification for DWARF4 the offset entry in the CU/TU Index is 32 bits. This presents a problem when .debug_info.dwo in DWP file grows beyond 4GB. The CU Index becomes partially corrupted. This diff adds manual parsing of .debug_info.dwo/.debug_abbrev.dwo to reconstruct CU index in general, and TU index for DWARF5. This is a work around until DWARF6 spec is finalized. Next patch will change internal CU/TU struct to 64 bit, and change uses as necessary. The plan is to land all the patches in one go after all are approved. This patch originates from the discussion in: https://discourse.llvm.org/t/dwarf-dwp-4gb-limit/63902 Differential Revision: https://reviews.llvm.org/D137882	2022-12-07 13:08:35 -08:00
Alexander Yermolovich	5ebd28f3e5	[llvm][dwwarf] Change CU/TU index to 64-bit Summary: Changed contribution data structure to 64 bit. I added the 32bit and 64bit accessors to make it explicit where we use 32bit and where we use 64bit. This also makes sure we catch all the cases where this data structure is used.	2022-12-07 13:08:35 -08:00
Craig Topper	2c52d516da	Revert "[RISCV] Return InstSeq from generateInstSeqImpl instead of using an output parameter. NFC" This reverts commit d24915207c631b7cf637081f333b41bc5159c700. Thinking about this more this probably chewed up 100+ bytes of stack for each recursive call. So this probably needs more thought. The code simplification wasn't that much.	2022-12-07 12:59:31 -08:00
Matt Arsenault	90f60a6a73	NVPTX: Cleanup check for denormal mode Go through the common query and be explicit about the supported flush type.	2022-12-07 15:56:21 -05:00
Nicolai Hähnle	1598dc84bd	GISel/Combiner: maintain created instructions in a SetVector This is not a correctness fix because the set is only used for debug output. However, it helps avoid noise when looking at diffs between compiler runs. The set is only maintained with debug output enabled, so the added cost should be acceptable. Differential Revision: https://reviews.llvm.org/D139465	2022-12-07 21:40:34 +01:00
Koakuma	f8f41c3fcd	[SPARC] Lower SELECT_CC to MOVr on 64-bit target whenever possible On 64-bit targets, when doing an i64 SELECT_CC where one of the comparison operands is a constant zero, try to fold the compare and MOVcc into a MOVr instruction. EQ and NE comparisons are available for all integers; for signed integers, GT, GE, LT, and LE are also available. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138922	2022-12-07 15:34:58 -05:00
Brad Smith	7806f86a5e	Revert "[SPARC] Mark the %g0 register as constant & use it to materialize zeros" 2 of the Sparc tests are now failing. This reverts commit 2c41310fc146a1f609147c65ac5f30e5a57e84a8.	2022-12-07 15:27:57 -05:00
Craig Topper	258bb453fb	[RISCV] Without Zfh, promote f16 inputs before creating RISCVISD::FCVT_W(U)_RV64 nodes. This allows us to remove a couple more Zfhmin isel patterns.	2022-12-07 12:25:30 -08:00
Craig Topper	e3540fb948	[RISCV] Promote f16 fp_to_int_sat with Zfhmin during lowering instead of isel. We already have a custom handler for FP_TO_(S/U)INT_SAT. It's easy enough to inject an FP_EXTEND in there.	2022-12-07 11:58:30 -08:00
James Y Knight	099001979f	[SPARC] Simplify instruction decoder. After https://reviews.llvm.org/D137653 named sub-operands can be used in the auto-generated instruction decoders. This allows the auto-generated decoders to work properly, so all the hand-coded decoders in the sparc target can be removed. In some instances, a manually-written decoder had not been implemented for an instruction, and thus that instruction was not decoded properly. These have been fixed (and tests added). Differential Revision: https://reviews.llvm.org/D137727	2022-12-07 14:37:08 -05:00
James Y Knight	372240dfe3	[TableGen] More named sub-operands work. Commit a538d1f13a13 first added support for named sub-operands in CodeEmitterGen. We now add a few more features to that, enabling further target cleanups. 1. Adds support for handling an EncoderMethod in a sub-operand in CodeEmitterGen. Previously, the specified encoder of a sub-operand was ignored, and only the default used. 2. Adds support for sub-operands in DecoderEmitter, along with support for tied sub-operands. The changes to the decoder required a few minor tweaks to a few targets, where existing brokenness was exposed. In order to keep this patch small, I left FIXMEs which will be addressed in upcoming patches. (Except MIPS16, since its object file emission/decoding is totally broken). Differential Revision: https://reviews.llvm.org/D137653	2022-12-07 14:37:08 -05:00
Craig Topper	b12fe0d429	[RISCV] Consolidate identical (fcopysign FPR32:, FPR16:) isel patterns. NFC	2022-12-07 11:35:55 -08:00
Keith Smiley	c9b6d641f0	Fix @llvm.global_ctors docs (NFC)	2022-12-07 11:24:08 -08:00
Tim Northover	6b98824a58	AArch64: emit `fcmp ord %a, zeroinitializer` as a single fcmeq. Most "ord" checks need two real-world compares to implement, but this is the canonical form of a "!isnan" check, which is equivalent to comparing the input for equality against itself.	2022-12-07 19:17:30 +00:00
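The equivalence the commit above relies on can be sketched in scalar C++. This is only an illustration, not the AArch64 lowering itself: the helper names are hypothetical, and the code assumes IEEE-754 semantics (no `-ffast-math`), under which a value compares equal to itself exactly when it is not NaN.

```cpp
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: an "ordered against zero" check written as a single
// self-comparison. x == x is false only when x is NaN.
inline bool isOrderedSelfCompare(float x) { return x == x; }

// Reference definition: "ord" is true when neither operand is NaN.
// The constant 0.0 is never NaN, so only x matters.
inline bool isOrderedReference(float x) { return !std::isnan(x); }

// Compare both forms on a handful of representative inputs.
inline bool formsAgreeOnSamples() {
  const float samples[] = {0.0f, -0.0f, 1.5f,
                           std::numeric_limits<float>::infinity(),
                           -std::numeric_limits<float>::infinity(),
                           std::numeric_limits<float>::quiet_NaN()};
  for (float x : samples)
    if (isOrderedSelfCompare(x) != isOrderedReference(x))
      return false;
  return true;
}

// Small self-check that can be called from any test harness.
inline void selfCheckOrdered() { assert(formsAgreeOnSamples()); }
```

Calling `selfCheckOrdered()` exercises both forms; infinities count as ordered, and only NaN fails the self-comparison.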
Koakuma	2c41310fc1	[SPARC] Mark the %g0 register as constant & use it to materialize zeros Materialize zeros by copying from %g0, which is now marked as constant. This makes it possible for some common operations (like integer negation) to be performed in fewer instructions. This continues @arichardson's patch at D132561. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D138887	2022-12-07 13:34:13 -05:00
Joe Nash	bbfbec94b1	[AMDGPU] Enable OMod on more VOP3 instructions OMod was disabled if OpSel was enabled, but that restriction is more specific than necessary. Any VOP3 with float operands can use OMod. On GFX11, FMAC_F16_e64 can use op_sel. Previously, SIFoldOperands and convertToThreeAddress were accidentally correct when they reinterpreted the zero OMod operand on V_FMAC_F16_e64 as the OpSel operand on V_FMA_F16_gfx9_e64. Now we explicitly add op_sel if required. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D139469	2022-12-07 13:30:33 -05:00
Alex Richardson	9114ac67a9	Overload all llvm.annotation intrinsics for globals argument The global constant arguments could be in a different address space than the first argument, so we have to add another overloaded argument. This patch was originally made for CHERI LLVM (where globals can be in address space 200), but it also appears to be useful for in-tree targets as can be seen from the test diffs. Differential Revision: https://reviews.llvm.org/D138722	2022-12-07 18:29:18 +00:00
Amara Emerson	53445f5b1c	[GlobalISel] Add a new G_INVOKE_REGION_START instruction to fix an EH bug. We currently have a bug where the legalizer, when dealing with phi operands, may create instructions in the phi's incoming blocks at points which are effectively dead due to a possible exception throw. Say we have: throwbb: EH_LABEL x0 = %callarg1 BL @may_throw_call EH_LABEL B returnbb bb: %v = phi i1 %true, throwbb, %false.... When legalizing we may need to widen the i1 %true value, and to do that we need to create new extension instructions in the incoming block. Our insertion point currently is the MBB::getFirstTerminator() which puts the IP before the unconditional branch terminator in throwbb. These extensions may never be executed if the call throws, and therefore we need to emit them before the call (but not too early, since our new instruction may need values defined within throwbb as well). throwbb: EH_LABEL x0 = %callarg1 BL @may_throw_call EH_LABEL %true = G_CONSTANT i32 1 ; <<<-- ruh'roh, this never executes if may_throw_call() throws! B returnbb bb: %v = phi i32 %true, throwbb, %false.... To fix this, I've added two new instructions. The main idea is that G_INVOKE_REGION_START is a terminator, which tries to model the fact that in the IR, the original invoke inst is actually a terminator as well. By using that as the new insertion point, we make sure to place new instructions on always executing paths. Unfortunately we still need to make the legalizer use a new insertion point API that I've added, since the existing `getFirstTerminator()` method does a reverse walk up the block, and any non-terminator instructions cause it to bail out. To avoid impacting compile time for all `getFirstTerminator()` uses, I've added a new method that does a forward walk instead. Differential Revision: https://reviews.llvm.org/D137905	2022-12-07 10:28:51 -08:00
Philip Reames	14ea545a7d	[RISCV][InsertVSETVLI] Generalize scalar move rule for when AVL is unchanged By definition, if the AVL of the scalar move is the same value as the prior AVL, then it is zero exactly when the prior AVL is zero. This generalizes the existing code to the case where the scalar move has a register AVL which is unknown, but unchanged from the preceding instruction. This doesn't cause any interesting diffs on its own, but another patch makes this case much more common. Split off to reduce a future diff.	2022-12-07 10:28:31 -08:00
Craig Topper	f1fd5c9b36	[RISCV] Remove pseudos for whole register load, store, and move. The MC layer instructions have the correct register classes, and the pseudos don't have any additional operands. So there doesn't seem to be any reason for them to exist. The pseudos were incorrectly going through code in RISCVMCInstLower that converted LMUL>1 register classes to LMUL1 register class. This makes the MCInst technically malformed, and prevented the vl2r.v, vl4r.v, and vl8r.v InstAliases from matching. This accounts for all of the .ll test diffs. Differential Revision: https://reviews.llvm.org/D139511	2022-12-07 10:19:58 -08:00
Haojian Wu	50daddf279	Fix an -Wunused-variable warning in release build, NFC	2022-12-07 18:59:17 +01:00
Craig Topper	938d0d6d7b	[RISCV] Replace uses of hasStdExtC with COrZca. Except MakeCompressible which will need more work. Reviewed By: reames Differential Revision: https://reviews.llvm.org/D139504	2022-12-07 09:34:01 -08:00
Mirko Brkusanin	fe42ebe442	[AMDGPU][GlobalISel] Fix legalizing image intrinsics for new types We no longer need to increase vector size to 16 for intrinsics that use more than 8 vgprs for addr. There is no image intrinsic that needs more than 12 so all currently existing cases will be covered. Using incorrect size was causing an error in instruction selection because instructions were updated to require new types (9x32, 10x32, 11x32, 12x32). Differential Revision: https://reviews.llvm.org/D139546	2022-12-07 18:20:58 +01:00
Krzysztof Parzyszek	110fe4f495	[IRReader] Convert Optional in DataLayoutCallbackTy to std::optional	2022-12-07 08:47:25 -08:00
serge-sans-paille	40ade845be	Revert "Store OptTable::Info::Name as a StringRef" Another revert, for another set of issues I don't reproduce locally... see https://lab.llvm.org/buildbot/#/builders/139/builds/32327 This reverts commit bdfa3100dc3ea9e9ce4d3d4100ea6bb4c3fa2b81.	2022-12-07 17:29:53 +01:00
Dmitry Kurtaev	a2c9f12dd6	[RISCV][JitLink] Propagate error from Expected<T> result during R_RISCV_PCREL_HI20 parsing related issue: https://github.com/llvm/llvm-project/issues/59139 Differential Revision: https://reviews.llvm.org/D138781	2022-12-07 08:26:38 -08:00
chenglin.bi	b4c8cfc7c2	[InstCombine] fold more icmp + select patterns by distributive laws follow up D139076, add icmp with not only eq/ne, but also gt/lt/ge/le. Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D139253	2022-12-07 23:55:49 +08:00
chenglin.bi	10c3df728c	[Instcombine] Canonicalize ~((A & B) ^ (A | ?)) -> (A & B) | ~(A | ?) ~((A & B) ^ (A | ?)) -> (A & B) | ~(A | ?) https://alive2.llvm.org/ce/z/JHN2p4 Reviewed By: spatel Differential Revision: https://reviews.llvm.org/D139299	2022-12-07 23:52:07 +08:00
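The bitwise identity in the commit above can be checked by brute force. The sketch below is not the InstCombine code; it assumes 8-bit operands, writes the wildcard `?` as a third operand `c`, and uses hypothetical function names. Because the identity holds bit by bit, checking every byte triple is sufficient for this width.

```cpp
#include <cassert>
#include <cstdint>

// Left-hand side of the canonicalization: ~((A & B) ^ (A | C)).
inline uint8_t beforeFold(uint8_t a, uint8_t b, uint8_t c) {
  return static_cast<uint8_t>(~((a & b) ^ (a | c)));
}

// Right-hand side of the canonicalization: (A & B) | ~(A | C).
inline uint8_t afterFold(uint8_t a, uint8_t b, uint8_t c) {
  return static_cast<uint8_t>((a & b) | ~(a | c));
}

// Exhaustively compare both sides over all 256^3 operand combinations.
inline bool identityHoldsForAllBytes() {
  for (int a = 0; a < 256; ++a)
    for (int b = 0; b < 256; ++b)
      for (int c = 0; c < 256; ++c)
        if (beforeFold(static_cast<uint8_t>(a), static_cast<uint8_t>(b),
                       static_cast<uint8_t>(c)) !=
            afterFold(static_cast<uint8_t>(a), static_cast<uint8_t>(b),
                      static_cast<uint8_t>(c)))
          return false;
  return true;
}

// Small self-check that can be called from any test harness.
inline void selfCheckFold() { assert(identityHoldsForAllBytes()); }
```

Per bit: when a bit of A is 0, both sides reduce to the complement of the C bit; when it is 1, both sides reduce to the B bit.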
serge-sans-paille	bdfa3100dc	Store OptTable::Info::Name as a StringRef This is a recommit of 8ae18303f97d5dcfaecc90b4d87effb2011ed82e, with a few cleanups. This avoids implicit conversion to StringRef at several points, which in turn avoids redundant calls to strlen. As a side effect, this greatly simplifies the implementation of StrCmpOptionNameIgnoreCase. It also eventually gives a consistent, humble speedup in compilation time (timing updated since original commit). https://llvm-compile-time-tracker.com/compare.php?from=76fcfea283472a80356d87c89270b0e2d106b54c&to=b70eb1f347f22fe4d2977360c4ed701eabc43994&stat=instructions:u Differential Revision: https://reviews.llvm.org/D139274	2022-12-07 16:32:37 +01:00
Guillaume Chatelet	7203a8614a	[reland][Alignment] Use Align in MCStreamer emitZeroFill/emitLocalCommonSymbol Before performing this change, I checked that `ByteAlignment` was never `0` inside `MCAsmStreamer::emitZeroFill` and `MCAsmStreamer::emitLocalCommonSymbol`. I believe it is NFC as `0` values are illegal in `emitZeroFill` anyway, since `Log2(ByteAlignment)` would be undefined. And currently, all calls to `emitLocalCommonSymbol` are provably `>0`. Differential Revision: https://reviews.llvm.org/D139439	2022-12-07 14:54:03 +00:00
Guillaume Chatelet	b822063669	Revert D139439 "[Alignment] Use Align in MCStreamer emitZeroFill/emitLocalCommonSymbol" This breaks Windows bots with `warning C4334: '<<': result of 32-bit shift implicitly converted to 64 bits (was 64-bit shift intended?)` Some shift operators are lacking a proper literal suffix ('1ULL' instead of '1'). Will reland once fixed. This reverts commit c621c1a8e81856e6bf2be79714767d80466e9ede.	2022-12-07 14:51:26 +00:00
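The C4334 warning quoted above concerns shifts evaluated at 32-bit width and widened only afterwards. The sketch below is not the code from the reverted patch; the helper name is hypothetical and it only shows the general shape of the fix, which is to make the shifted operand 64 bits wide before shifting.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper: turn a log2 alignment into a byte alignment.
// Writing `1 << log2` here would shift a 32-bit int and widen the result
// afterwards, which is the pattern MSVC reports as C4334. Starting from a
// 64-bit one (equivalently `1ULL`) performs the shift at full width.
inline uint64_t byteAlignmentFromLog2(unsigned log2) {
  assert(log2 < 64 && "shift amount must be below the operand width");
  return uint64_t{1} << log2;
}
```

With the 64-bit operand, `byteAlignmentFromLog2(32)` yields 4294967296, a value a 32-bit shift cannot represent.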
Nikita Popov	4d97a914d7	[SCEV] Use umin_seq for symbolic max BE count We were using umin_seq when computing the exact BE count, but not when computing the symbolic max BE count.	2022-12-07 15:32:49 +01:00
Guillaume Chatelet	c621c1a8e8	[Alignment] Use Align in MCStreamer emitZeroFill/emitLocalCommonSymbol Before performing this change, I checked that `ByteAlignment` was never `0` inside `MCAsmStreamer::emitZeroFill` and `MCAsmStreamer::emitLocalCommonSymbol`. I believe it is NFC as `0` values are illegal in `emitZeroFill` anyway, since `Log2(ByteAlignment)` would be undefined. And currently, all calls to `emitLocalCommonSymbol` are provably `>0`. Differential Revision: https://reviews.llvm.org/D139439	2022-12-07 14:29:16 +00:00
Simon Pilgrim	de0ff1fcb1	X86SelectionDAGInfo.cpp - move dyn_cast check inside if(). NFC. Minor cleanup - we only use the non-null pointer inside the if() block	2022-12-07 14:21:24 +00:00
Florian Hahn	8fa81cfad4	[ConstraintElim] Address comments from D137840. Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D139482	2022-12-07 14:12:53 +00:00
Krzysztof Parzyszek	a81a0c97f1	[Remarks] Convert Optional to std::optional	2022-12-07 08:11:11 -06:00
Jay Foad	632118a090	[AMDGPU] Use SOP_Pseudo more consistently. NFC. SOPK_Pseudo was not inheriting from SOP_Pseudo at all, and some other Pseudo classes were needlessly redefining things that were already defined by SOP_Pseudo. Differential Revision: https://reviews.llvm.org/D139527	2022-12-07 12:56:42 +00:00
Haojian Wu	a2215149ae	Add implementation isTargetCanonicalConstantNode for hexagon. This fixes an infinite compiling loop caused by https://reviews.llvm.org/D137140 Differential Revision: https://reviews.llvm.org/D139525	2022-12-07 13:21:29 +01:00
Stephen Thomas	ab2e27faa4	[AMDGPU] Small cleanup in insertWaitcntInBlock() Move some code that checks if an instruction is a waitcount into a separate function, mainly to aid readability in the logic where it is used. Differential Revision: https://reviews.llvm.org/D139522	2022-12-07 11:58:59 +00:00
Janek van Oirschot	587747d8d1	[AMDGPU] G_IS_FPCLASS lower() support for IEEE fp types Simplified globalisel version of sdag's expandIS_FPCLASS. Reviewed By: arsenm, #amdgpu Differential Revision: https://reviews.llvm.org/D139128	2022-12-07 11:53:09 +00:00
Max Kazantsev	07de5d18c9	[SCEV] Remember blocks for which we know symbolic exit count but not exact The old code didn't bother to memoize blocks for which the exact exit count is not known. As a result, when the exact count wasn't known but the symbolic one was, this info was lost. This patch fixes the situation: now we memoize when symbolic is known (exact always implies symbolic, so this is a strict superset of what was before). Differential Revision: https://reviews.llvm.org/D139515 Reviewed By: nikic	2022-12-07 17:51:30 +07:00
Anton Sidorenko	f8ed709345	[MachineCombiner] Extend reassociation logic to handle inverse instructions Machine combiner supports generic reassociation only of associative and commutative instructions, for example (A + X) + Y => (X + Y) + A. However, we can extend this generic support to handle patterns like (X + A) - Y => (X - Y) + A, where `-` is the inverse of `+`. This patch adds interface functions to process reassociation patterns of associative/commutative instructions and their inverse variants with minimal changes in backends. Differential Revision: https://reviews.llvm.org/D136754	2022-12-07 13:50:28 +03:00
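The inverse-instruction pattern described above can be illustrated at source level. This is not the MachineCombiner interface: the function names are hypothetical, and unsigned 32-bit integers are assumed so that wraparound keeps both forms exactly equal (floating-point reassociation would additionally need the relevant fast-math permissions).

```cpp
#include <cassert>
#include <cstdint>

// Original shape: (X + A) - Y. The subtraction depends on the addition.
inline uint32_t originalForm(uint32_t x, uint32_t a, uint32_t y) {
  return (x + a) - y;
}

// Reassociated shape: (X - Y) + A. X - Y no longer waits for A, which can
// shorten the critical path when A is produced late.
inline uint32_t reassociatedForm(uint32_t x, uint32_t a, uint32_t y) {
  return (x - y) + a;
}

// Both forms agree for every input under modular unsigned arithmetic.
inline bool formsAgree(uint32_t x, uint32_t a, uint32_t y) {
  return originalForm(x, a, y) == reassociatedForm(x, a, y);
}

// Small self-check that can be called from any test harness.
inline void selfCheckReassociation() { assert(formsAgree(5u, 7u, 3u)); }
```

The equality also holds when intermediate results wrap around, for example when `y` is larger than `x`.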
Wenzel Jakob	cb5b25c587	[llvm-c] Added a C-API binding to query the LLVM version The LLVM C bindings currently offer no way to query the version string dynamically. This is a useful feature in situations where a program isn't compiled against a specific version of LLVM but rather loads it dynamically (e.g. using dlopen()). In situations where the shared library filename doesn't reveal the version (e.g. LLVM-C.dll) and to adapt to version-specific API differences, it is then useful to be able to query the version string by calling the proposed LLVMGetVersion function. Differential Revision: https://reviews.llvm.org/D139381	2022-12-07 11:18:32 +01:00
David Sherwood	93d9c2e563	[SVE] Commonise bfmlal* and fmlal* instruction classes Given the significant commonality between the bfmlal* and fmlal* instructions it makes sense to use just a single class for both. We can do this now that the bfmlal* lane intrinsics take a i32 index. Differential Revision: https://reviews.llvm.org/D138906	2022-12-07 09:30:32 +00:00
Piotr Sobczak	1e3abd82b9	[AMDGPU] Fix wide spills Update spill code to account for new vector types with bit widths: 288, 320, 352, 384. Related to D138205. Differential Revision: https://reviews.llvm.org/D139203	2022-12-07 10:23:52 +01:00


164513 Commits