This allows formatting large integers in a human-friendly way, e.g.
"5321584" -> "5.32M".
Use it where such human-readable numbers are currently generated manually.
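A minimal standalone sketch of the kind of formatting this provides (the helper name and exact rounding here are assumptions for illustration, not the actual API added by the PR):
```
#include <cstdint>
#include <cstdio>
#include <string>

// Hypothetical sketch: format a large integer with a metric suffix,
// e.g. 5321584 -> "5.32M".
std::string formatHuman(uint64_t N) {
  static const char *Suffixes[] = {"", "k", "M", "G", "T"};
  double Value = static_cast<double>(N);
  unsigned Idx = 0;
  while (Value >= 1000.0 && Idx < 4) {
    Value /= 1000.0;
    ++Idx;
  }
  char Buf[32];
  std::snprintf(Buf, sizeof(Buf), "%.2f%s", Value, Suffixes[Idx]);
  return Buf;
}
```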
This PR re-introduces the functionality of
https://github.com/llvm/llvm-project/pull/113064, which was reverted in
0a68171b3c
due to memory lifetime issues.
Note that I was not able to reproduce the ASan results myself, so I
have not been able to verify that this PR really fixes the issue.
---
Currently it is unsupported to:
1. Convert an MlirAttribute with type i1 to a numpy array
2. Convert a boolean numpy array to an MlirAttribute
In both cases, the entire Python application crashes with a rather poor
error message (https://github.com/pybind/pybind11/issues/3336).
The complication in handling these conversions is that MlirAttribute
represents booleans as a bit-packed i1 type, whereas numpy represents
booleans as a byte array with 8 bits used per boolean.
This PR proposes the following approach:
1. When converting an i1-typed MlirAttribute to a numpy array, we cannot
directly use the underlying raw data backing the MlirAttribute as a
buffer to Python, as is done for other types. Instead, a copy of the data
is generated using numpy's unpackbits function, and the result is sent
back to Python.
2. When constructing an MlirAttribute from a numpy array, the Python
data is first read as uint8_t to convert it to the endianness used
internally in MLIR. Then the booleans are bit-packed using numpy's
packbits function, and the bit-packed array is stored as the MlirAttribute
representation.
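To illustrate the two layouts involved, here is a plain C++ sketch of the bit unpacking/packing that the PR performs via numpy's unpackbits/packbits (the LSB-first bit order is an assumption of this sketch, not necessarily what MLIR uses):
```
#include <cstdint>
#include <vector>

// i1 MlirAttribute storage -> numpy-style bool array: expand each bit
// into its own byte.
std::vector<uint8_t> unpackBits(const std::vector<uint8_t> &Packed,
                                size_t NumBools) {
  std::vector<uint8_t> Bytes(NumBools);
  for (size_t I = 0; I < NumBools; ++I)
    Bytes[I] = (Packed[I / 8] >> (I % 8)) & 1;
  return Bytes;
}

// numpy-style bool array -> bit-packed i1 storage: 8 booleans per byte.
std::vector<uint8_t> packBits(const std::vector<uint8_t> &Bytes) {
  std::vector<uint8_t> Packed((Bytes.size() + 7) / 8, 0);
  for (size_t I = 0; I < Bytes.size(); ++I)
    if (Bytes[I])
      Packed[I / 8] |= uint8_t(1) << (I % 8);
  return Packed;
}
```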
Add a new helper function `isReachable` to `Block`. This function
traverses all successors of a block to determine if another block is
reachable from the current block.
This functionality has been reimplemented in multiple places in MLIR,
and there are possibly additional copies in downstream projects.
Therefore, move it to a common place.
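A self-contained sketch of the traversal (the `Node` type is a stand-in used for illustration; the real helper walks `Block` successors):
```
#include <set>
#include <vector>

// Stand-in for a block with successor edges; illustration only.
struct Node {
  std::vector<Node *> Successors;
};

// Returns true if Target can be reached by following successor edges
// starting from From's successors (worklist-based DFS with a visited set).
bool isReachable(Node *From, Node *Target) {
  std::set<Node *> Visited;
  std::vector<Node *> Worklist(From->Successors.begin(),
                               From->Successors.end());
  while (!Worklist.empty()) {
    Node *N = Worklist.back();
    Worklist.pop_back();
    if (!Visited.insert(N).second)
      continue;
    if (N == Target)
      return true;
    for (Node *Succ : N->Successors)
      Worklist.push_back(Succ);
  }
  return false;
}
```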
In the case of Neon, if there exists an extractelement from lane != 0 such that
1. the extractelement does not necessitate a move from vector_reg -> GPR,
2. the extractelement result feeds into an fmul, and
3. the other operand of the fmul is a scalar or an extractelement from lane 0
or a lane equivalent to 0,
then the extractelement can be merged with the fmul in the backend and it
incurs no cost.
e.g.
```
define double @foo(<2 x double> %a) {
%1 = extractelement <2 x double> %a, i32 0
%2 = extractelement <2 x double> %a, i32 1
%res = fmul double %1, %2
ret double %res
}
```
`%2` and `%res` can be merged in the backend to generate:
`fmul d0, d0, v0.d[1]`
The change was tested with SPEC FP (C/C++) on Neoverse V2.
**Compile time impact**: None
**Performance impact**: Observing a 1.3-1.7% uplift on the lbm benchmark with -flto, depending on the config.
This patch teaches extractCallsFromIR to recognize heap allocation
functions. Specifically, when we encounter a callee that is known to
be a heap allocation function like "new", we set the callee GUID to 0.
Note that I am planning to do the same for the caller-callee pairs
extracted from the profile. That is, when we encounter a frame that
does not have a callee, we will assume that the frame is calling some heap
allocation function with GUID 0.
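A hedged sketch of the special-casing described above (the helper and allocator names are illustrative assumptions, not the actual extractCallsFromIR code, which consults TargetLibraryInfo):
```
#include <cstdint>
#include <string>

// Illustration only: a real implementation asks TargetLibraryInfo whether
// the callee is a known heap allocation function.
bool isHeapAllocationName(const std::string &Name) {
  return Name == "malloc" || Name == "_Znwm"; // _Znwm: operator new(size_t)
}

// Callees recognized as heap allocators are collapsed onto the sentinel
// GUID 0, matching profile frames that have no callee.
uint64_t calleeGUID(const std::string &CalleeName, uint64_t RealGUID) {
  return isHeapAllocationName(CalleeName) ? 0 : RealGUID;
}
```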
Technically, I'm not recognizing enough functions in this patch.
TCMalloc is known to drop certain frames in the call stack immediately
above new. This patch is meant to lay the groundwork, setting up
GetTLI, plumbing it to extractCallsFromIR, and adjusting the unit
tests. I'll address remaining issues in subsequent patches.
This patch introduces a new generic target, `gfx9-4-generic`. Since it doesn’t support FP8 and XF32-related instructions, the patch includes several code reorganizations to accommodate these changes.
Initially landed in 3ed4b0b0efca7a9467ce83fc62de9413da38006d.
Reverted in 375d1925dbd0c051fe2d4a86fe98ed08f4a502c5 because the
[`load-store.ll`](https://github.com/llvm/llvm-project/blob/main/llvm/test/CodeGen/NVPTX/load-store.ll)
test was not updated after 5e75880165553e9afb721239689a9c79ec84a108.
The test has now been updated for 5e75880165553e9afb721239689a9c79ec84a108 in
7a99f2322c324972f2c5091dddd7752fa21d5a78.
Multiple `func.return` ops inside of a `func.func` op are now supported
during bufferization. This PR extends the code base in 3 places:
- When inferring function return types, `memref.cast` ops are folded
away only if all `func.return` ops have matching buffer types. (E.g., we
don't fold if two `return` ops have operands with different layout
maps.)
- The alias sets of all `func.return` ops are merged. That's because
aliasing is a "may be" property.
- The equivalence sets of all `func.return` ops are taken only if they
match. If different `func.return` ops have different equivalence sets
for their operands, the equivalence information is dropped. That's
because equivalence is a "must be" property.
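The distinction between the last two points can be sketched in plain C++ (the set type and names here are hypothetical, not the actual bufferization code):
```
#include <set>
#include <string>

using ValueSet = std::set<std::string>;

// Aliasing is a "may be" property, so information from all return ops is
// accumulated: the merged alias set is the union.
ValueSet mergeAliasSets(const ValueSet &A, const ValueSet &B) {
  ValueSet Merged = A;
  Merged.insert(B.begin(), B.end());
  return Merged;
}

// Equivalence is a "must be" property, so it is kept only if every return
// op agrees; otherwise the information is dropped.
ValueSet mergeEquivalenceSets(const ValueSet &A, const ValueSet &B) {
  return A == B ? A : ValueSet{};
}
```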
This commit is in preparation for removing the deprecated
`func-bufferize` pass, which can already bufferize functions with multiple
`return` ops.
The biggest change is assigning vector crypto instructions to the
correct processor resource.
The majority of these changes are guided by our RVV-capable
llvm-exegesis.
The dead code analysis crashed because a symbol that is called/used did not
appear in the symbol table.
This patch fixes the crash by adding a nullptr check after the symbol table lookup.
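The shape of the fix, sketched with stand-in types (the actual code guards the result of the MLIR symbol table lookup):
```
#include <map>
#include <string>

struct Operation {}; // stand-in for the symbol's defining op

// Stand-in symbol table; the real analysis uses MLIR's SymbolTable.
std::map<std::string, Operation *> SymbolTable;

void visitCall(const std::string &CalleeName) {
  auto It = SymbolTable.find(CalleeName);
  // The fix: check for a missing symbol instead of dereferencing the
  // lookup result unconditionally, which previously crashed the analysis.
  if (It == SymbolTable.end() || It->second == nullptr)
    return; // conservatively treat the callee as unknown
  Operation *Callee = It->second;
  (void)Callee; // ... continue the analysis with the callee ...
}
```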
into anyext x
; CHECK-NEXT: [[MV1:%[0-9]+]]:_(s64) = G_MERGE_VALUES [[TRUNC]](s32), [[DEF]](s32)
Please continue padding merge values.
// %bits_8_15:_(s8) = G_IMPLICIT_DEF
// %0:_(s16) = G_MERGE_VALUES %bits_0_7:(s8), %bits_8_15:(s8)
%bits_8_15 is defined by undef. Its value is undefined and we can pick
an arbitrary value. For optimization, we pick anyext, which plays well
with the undefinedness.
// %0:_(s16) = G_ANYEXT %bits_0_7:(s8)
The upper bits of %0 are undefined and the lower bits come from
%bits_0_7.
Preserving the case order is not strictly necessary to preserve
semantics (for example, operations like SwitchInst::removeCase will
happily swap cases around). However, I'm planning to introduce an
optional verification step for SandboxIR that will use StructuralHash to
compare the IR after a revert against the original IR to help catch tracker
bugs, and the order difference triggers a mismatch there.
`std::copy` doesn't use the `_AlgPolicy` for anything other than calling
itself with it, so we can just remove the argument. This also removes
the need for it in a few other algorithms that had an `_AlgPolicy` argument
only to call `copy`.
In `staticallyExtractSubvector`, when the extracted slice is the same
as the source vector, there is no need to emit `vector.extract_strided_slice`.
This fixes the lit test case `@vector_store_i4` in
`mlir/test/Dialect/Vector/vector-emulate-narrow-type.mlir`, where
converting from `vector<8xi4>` to `vector<4xi8>` does not need slice
extraction.
The issue was introduced in #113411 and #115070; CI failure link:
https://buildkite.com/llvm-project/github-pull-requests/builds/118845
This PR does not include a lit test case because it is a fix and the
above-mentioned `@vector_store_i4` test already exercises the mechanism.
Signed-off-by: Alan Li <me@alanli.org>
If we linearize values (with an assertion that they are disjoint) and
then delinearize that linear index with the exact same basis, we know
that these operations are exact inverses of each other and can be
replaced with the original inputs to the linearization.
Similarly, if we take a linear index, delinearize it with some basis,
and then re-linearize it with that same basis (noting that the outputs
of the delinearization are guaranteed to be `disjoint`, even if this is
not asserted on the linearize_index operation), the re-linearization is
the inverse of the delinearization, so those two operations can also be
canceled out.
This commit adds canonicalization patterns for these simple
cancellations.
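The round-trip property being exploited, sketched as plain C++ arithmetic (row-major convention assumed; this is just the index math, not the MLIR ops or patterns):
```
#include <cassert>
#include <cstdint>
#include <vector>

// Linearize multi-dimensional indices against a basis.
int64_t linearize(const std::vector<int64_t> &Idx,
                  const std::vector<int64_t> &Basis) {
  int64_t Lin = 0;
  for (size_t I = 0; I < Idx.size(); ++I)
    Lin = Lin * Basis[I] + Idx[I];
  return Lin;
}

// Delinearize a linear index with the same basis.
std::vector<int64_t> delinearize(int64_t Lin,
                                 const std::vector<int64_t> &Basis) {
  std::vector<int64_t> Idx(Basis.size());
  for (size_t I = Basis.size(); I-- > 0;) {
    Idx[I] = Lin % Basis[I];
    Lin /= Basis[I];
  }
  return Idx;
}

int main() {
  std::vector<int64_t> Basis = {4, 5, 6};
  std::vector<int64_t> Idx = {3, 2, 1};
  // Delinearizing the linearized index with the same basis recovers the
  // original (in-bounds, i.e. "disjoint") indices, so the pair cancels.
  assert(delinearize(linearize(Idx, Basis), Basis) == Idx);
  return 0;
}
```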
* Enforce that Tosa_ElementwiseUnaryOp requires output tensors to match
the input tensor's type and shape.
* Update the following ops to conform to Tosa_ElementwiseUnaryOp: clamp,
erf, sigmoid, tanh, cos, sin, abs, bitwise_not, ceil, clz, exp, floor,
log, logical_not, negate, reciprocal, rsqrt.
* Add invalid tests for each operator to ensure compliance with the TOSA
v1.0 specification.
Signed-off-by: Peng Sun <peng.sun@arm.com>
This change is part of this proposal:
https://discourse.llvm.org/t/rfc-all-the-math-intrinsics/78294
- `Builtins.td` - Add f16 support for libm atan2 builtin
- `CGBuiltin.cpp` - Emit constrained atan2 intrinsic for clang builtin
- `clang/test/CodeGenCXX/builtin-calling-conv.cpp` - Use erff instead of
atan2 for clang builtin to lib call calling convention check, now that
atan2 maps to an intrinsic.
- add atan2 cases to llvm.experimental.constrained tests for more
backends: ARM, PowerPC, RISCV, SystemZ.
- LangRef.rst: add llvm.experimental.constrained.atan2, revise
llvm.atan2 description.
Last part of implementing the atan2 HLSL function. Fixes #70096.
The -time option prints timing information for the subcommands
(compiler, linker) in a format similar to that used by gcc/gfortran.
This partially addresses requests from #89888.
After #115084, the 80-bit long double tests error if sizeof(long double)
isn't 96 or 128 bits. This caused failures on systems where long double is
double (since long double is then 64 bits), so I've disabled the 80-bit long
double tests on systems that don't use them.
When the chain is not the entry node, there is a risk that the stores are
within a (CALLSEQ_START, CALLSEQ_END) pair, which, when the node is expanded,
will lead to nested call sequences.
It should be possible to check for this and allow more cases, but for
now, let's limit this to cases where it's definitely safe.
Fixes #115323
Absolute thunks generated by LLD reference function addresses recorded
as data in code. Since they are generated by the linker, they don't have
relocations associated with them and thus the addresses are left
undetected. Use pattern matching to detect such thunks and handle them
in the VeneerElimination pass.
According to the Doxygen documentation,
the `relates`, `related`, `relatesalso`, and `relatedalso` commands all
have a single argument. This patch changes their classification from
`VerbatimLineCommand` to `InlineCommand` so the argument is correctly
parsed.
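For example, in a doc comment like the following (a generic illustration, not taken from the patch), only the class name `Container` is the command's single argument; the surrounding text is ordinary comment content:
```
class Container;

/// Swaps the contents of two containers.
/// \relates Container
void swap(Container &A, Container &B);
```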