llvm-project

Author	SHA1	Message	Date
Joao Saffran	fb248daeef	rename visibility	2025-08-19 16:36:56 -07:00
Joao Saffran	567a3d4efe	remove default constructor	2025-08-19 16:35:55 -07:00
Joao Saffran	3b25b34cd9	save a copy	2025-08-19 16:35:05 -07:00
Joao Saffran	d38c00d09a	remove unused	2025-08-19 16:33:07 -07:00
Joao Saffran	8eb82fdac3	adding missing import	2025-08-19 10:56:17 -07:00
Joao Saffran	1d29111c4f	fix whitespace in test	2025-08-19 10:47:15 -07:00
Joao Saffran	6539364fa7	clean up	2025-08-19 10:45:49 -07:00
Joao Saffran	f6f2e61d5d	removing root parameter header from MC	2025-08-19 10:34:02 -07:00
Joao Saffran	1690a9c04d	clean up	2025-08-18 18:52:48 -07:00
Joao Saffran	31ec5e50ff	making parameter type and shader visibility use enums	2025-08-18 18:49:45 -07:00
Phoebe Wang	b0d2b57f7e	[Headers][X86] Remove more duplicated typedefs (#153820 ) They are defined in mmintrin.h	2025-08-16 00:21:40 +08:00
Shubham Sandeep Rastogi	cd0bf2735b	Revert "[LLDB] Update DIL handling of array subscripting. (#151605 )" This reverts commit 6d3ad9d9fd830eef0ac8a9d558e826b8b624e17d. This was reverted because it broke the LLDB greendragon bot.	2025-08-15 09:17:33 -07:00
Craig Topper	853094fd81	[VirtRegMap] Use TRI member variable. NFC	2025-08-15 09:14:09 -07:00
George Burgess IV	c10766cf49	[utils] add `stop_at_sha` to revert_checker's API (#152011 ) This is useful for downstream consumers of this as a module. It's unclear if interactive use wants this lever, but support can easily be added if so.	2025-08-15 16:13:29 +00:00
Nikita Popov	01bc742185	[CodeGen] Give ArgListEntry a proper constructor (NFC) (#153817 ) This ensures that the required fields are set, and also makes the construction more convenient.	2025-08-15 18:06:07 +02:00
Daniel Paoliello	1d1e52e614	[win][x64] Allow push/pop for stack alloc when unwind v2 is required (#153621 ) While attempting to enable Windows x64 unwind v2, compilation failed with the following error: ``` fatal error: error in backend: Windows x64 Unwind v2 is required, but LLVM has generated incompatible code in function '<redacted>': Cannot pop registers before the stack allocation has been deallocated ``` I traced this down to an optimization in `X86FrameLowering`: <`6961139ce9/llvm/lib/Target/X86/X86FrameLowering.cpp (L324-L340)`> Technically, using `push`/`pop` to adjust the stack is permitted under unwind v2: the requirement for a "canonical" epilog is that the stack is fully adjusted before the registers listed as pushed in the unwind table are popped. So, as long as the `.seh_unwindv2start` pseudo is after the pops that adjust the stack, then everything will work correctly. One other side effect of this change is that the stack is now allowed to be adjusted across multiple instructions, which would be needed for extremely large stack frames.	2025-08-15 09:03:44 -07:00
Leandro Lacerda	08ff017fb0	[libc] Improve GPU benchmarking (#153512 ) This patch improves the GPU benchmarking in this way: * Replace `rand`/`srand` with a deterministic per-thread RNG seeded by `call_index`: reproducible, apples-to-apples libc vs vendor comparisons. * Fix input generation: sample the unbiased exponent uniformly in `[min_exp, max_exp]`, clamp bounds, and skip `Inf`, `NaN`, `-0.0`, and `+0.0`. * Fix standard deviation: use an explicit estimator from sums and sums-of-squares (`sqrt(E[x^2] − E[x]^2)`) across samples. * Fix throughput overhead: subtract a loop-only baseline inside NVPTX/AMDGPU timing backends so `benchmark()` gets cycles-per-call already corrected (no `overhead()` call). * Adapt existing math benchmarks to the new RNG/timing plumbing (plumb `call_index`, drop `rand/srand`, clean includes). * Correct inter-thread aggregation: use iteration-weighted pooling to compute the global mean/variance, ensuring statistically sound `Cycles (Mean)` and `Stddev`. * Remove `Time / Iteration` column from the results table: it reported per-thread convergence time (not per-call latency) and was redundant/misleading next to `Cycles (Mean)`. * Remove unused `BenchmarkLogger` files: dead code that added maintenance and cognitive overhead without providing functionality. --- ## TODO (before merge) * [ ] Investigate compiler warnings and address their root causes. * [x] Review how per-thread results are aggregated into the overall result. ## Follow-ups (future PRs) * Add support to run throughput benchmarks with uniform (linear) input distributions, alongside the current log2-uniform scheme. * Review/adjust the configuration and coverage of existing math benchmarks. * Add more math benchmarks (e.g., `exp`/`expf`, others).	2025-08-15 11:00:17 -05:00
Ramkumar Ramachandra	f34326dac8	[VPlan] Introduce vputils::onlyScalarValuesUsed (NFC) (#153577 )	2025-08-15 15:55:59 +00:00
Shafik Yaghmour	868efdcf38	[Clang][Bytecode][NFC] Move Result into APSInt constructor (#153664 ) Static analysis flagged this line because we are copying Result instead of moving it.	2025-08-15 08:52:49 -07:00
Dave Lee	ae7e1b82fe	[lldb] Print ValueObject when GetObjectDescription fails (#152417 ) This fixes a few bugs, effectively through a fallback to `p` when `po` fails. The motivating bug this fixes is when an error within the compiler causes `po` to fail. Previously when that happened, only its value (typically an object's address) was printed – and problematically, no compiler diagnostics were shown. With this change, compiler diagnostics are shown, _and_ the object is fully printed (ie `p`). Another bug this fixes is when `po` is used on a type that doesn't provide an object description (such as a struct). Again, the normal `ValueObject` printing is used. Additionally, this also improves how lldb handles an object description method that fails in some way. Now an error will be shown (it wasn't before), and the value will be printed normally.	2025-08-15 08:37:26 -07:00
Tim Gymnich	ffaba758fb	[MLIR][ROCDL] Add permlane16.swap and permanlane32.swap (#153804 ) add rocdl.permlane16.swap and rocdl.permanlane32.swap	2025-08-15 17:35:31 +02:00
Simon Pilgrim	38eb14f27c	[X86] avx512vbmi2-builtins.c / avx512vlvbmi2-builtins.c - add C/C++ and 32/64-bit test coverage	2025-08-15 16:35:16 +01:00
Simon Pilgrim	7df862818e	[X86] avx512vbmi-builtins.c / avx512vbmivl-builtin.c - add C/C++ and 32/64-bit test coverage	2025-08-15 16:35:15 +01:00
Tim Renouf	f279c47cb3	AMDGPU gfx12: Add _dvgpr$ symbols for dynamic VGPRs (#148251 ) For each function with the AMDGPU_CS_Chain calling convention, with dynamic VGPRs enabled, add a _dvgpr$ symbol, with the value of the function symbol, plus an offset encoding one less than the number of VGPR blocks used by the function (16 VGPRs per block, no more than 128) in bits 5..3 of the symbol value. This is used by a front-end to have functions that are chained rather than called, and a dispatcher that dynamically resizes the VGPR count before dispatching to a function.	2025-08-15 16:33:06 +01:00
Aiden Grossman	0b04168948	[CI] Add Basic Bazel Checks (#153740 ) Having basic checks (like running buildifier) on the upstream bazel files would be helpful for contributors maintaining the bazel build. Add basic checks (currently just buildifier) to a workflow that runs whenever the bazel build files change.	2025-08-15 08:30:07 -07:00
cmtice	6d3ad9d9fd	[LLDB] Update DIL handling of array subscripting. (#151605 ) This updates the DIL code for handling array subscripting to more closely match and handle all the cases from the original 'frame var' implementation. Also updates the DIL array subscripting test. This particularly fixes some issues with handling synthetic children, objc pointers, and accessing specific bits within scalar data types.	2025-08-15 08:26:45 -07:00
Nikita Popov	11c2240049	[SDAGBuilder] Rename RetTys -> RetVTs (NFC) Make it clearer that this is a vector of EVTs, not IR types. Based on: https://github.com/llvm/llvm-project/pull/153798#discussion_r2279066696	2025-08-15 17:06:33 +02:00
Philip Reames	606937474e	[SDAG] Remove IndexType manipulation in getUniformBase and callers (#151578 ) All paths set it to the same value, just propagate that value to the consumer.	2025-08-15 08:00:47 -07:00
Florian Hahn	2b1e06598f	[LV] Regenerate some more check lines. (NFC)	2025-08-15 15:53:19 +01:00
Alexey Bataev	13b54f7dc1	[SLP] Recalculate dependencies for potential control dependencies if cleared If the control dependencies are cleared after calculation of the copyables, need to recalculate them unconditionally. Fixes #153754 #153676	2025-08-15 07:52:10 -07:00
Phoebe Wang	f24d91eb2c	[Headers][X86] Remove duplicate __v8hu, NFCI (#153734 ) Newly added in xmmintrin.h by c8312bdd1665225c585dd2b0bff5e46d569edd45	2025-08-15 22:48:59 +08:00
David Green	144f3c4cbf	[AArch64] Adjust the scheduling info of SVE FCMP on Cortex-A510. (#153810 ) According to the SWOG, these have a lower throughput than other instructions. Mark them as taking multiple cycles to model that.	2025-08-15 15:45:33 +01:00
Mikhail R. Gadelha	d7199544af	[libc] Fix mbrtowc test (#153721 ) Previously, we were trying to memset a pointer that wasn't being initialized, and the test would randomly fail. This PR replaces the pointers with actual objects.	2025-08-15 11:44:33 -03:00
Akash Banerjee	1fd1d63463	[MLIR][OpenMP] Add a new AutomapToTargetData conversion pass in FIR (#153048 ) Add a new AutomapToTargetData pass. This gathers the declare target enter variables which have the AUTOMAP modifier and adds omp.declare_target_enter/exit mapping directives for fir.alloca and fir.free operations on the AUTOMAP-enabled variables. Automap Ref: OpenMP 6.0 section 7.9.7.	2025-08-15 15:41:41 +01:00
Simon Pilgrim	09267f6720	[X86] avx512vp2intersect-builtins.c / avx512vlvp2intersect-builtins.c - add C/C++ and 32/64-bit test coverage	2025-08-15 15:39:12 +01:00
Krishna Pandey	6602d6c7a7	[libc][math][docs] Add documentation for BFloat16 type (#153475 ) Signed-off-by: Krishna Pandey <kpandey81930@gmail.com>	2025-08-15 20:07:33 +05:30
Matt Arsenault	9a14b1d254	RuntimeLibcalls: Generate table of libcall name lengths (#153210 ) Avoids strlen when constructing the returned StringRef. We were emitting these in the libcall name lookup anyway, so split out the offsets for general use. Currently emitted as a separate table, not sure if it would be better to change the string offset table to store pairs of offset and width instead.	2025-08-15 23:29:10 +09:00
Benjamin Chetioui	8c0914d826	[mlir][bazel] Fix Bazel build after 6bb8f6f2d0ed672217e0a0521afc5b86913b717e (#153811 )	2025-08-15 14:28:44 +00:00
Kazu Hirata	f4bc3151bb	[mlir] Fix warnings This patch fixes: mlir/lib/Target/Wasm/TranslateFromWasm.cpp:82:1: error: unused variable 'wasmSectionName<(anonymous namespace)::WasmSectionType::DATACOUNT>' [-Werror,-Wunused-const-variable] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:100:5: error: unused variable 'valueTypesEncodings' [-Werror,-Wunused-const-variable] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:735:13: error: unused function 'buildLiteralType<unsigned int>' [-Werror,-Wunused-function] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:740:13: error: unused function 'buildLiteralType<unsigned long>' [-Werror,-Wunused-function] mlir/lib/Target/Wasm/TranslateFromWasm.cpp:292:33: error: private field 'symbols' is not used [-Werror,-Wunused-private-field]	2025-08-15 07:24:31 -07:00
Simon Pilgrim	17dd57b00e	[X86] avxvnni-builtins.c / avxvnniint8-builtins.c / avxvnniint16-builtins.c - add C/C++ and 32/64-bit test coverage	2025-08-15 15:17:15 +01:00
Guray Ozen	4c389178ee	[MLIR][NVVM] Print readable modifier (NFC) (#153779 ) Currently, the modifier is printed as an address, so it is not readable and not useful. This PR adds readable printing for it. --------- Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>	2025-08-15 15:47:39 +02:00
Guray Ozen	af92cabdef	[MLIR][NVVM] Combine griddepcontrol Ops (#152525 ) We've 2 ops: 1. nvvm.griddepcontrol.wait 2. nvvm.griddepcontrol.launch_dependents They are related to Grid Dependent Launch (or programmatic dependent launch in CUDA) and same concept. This PR unifies both ops into a single one.	2025-08-15 15:47:12 +02:00
Erich Keane	15d7a95ea9	[CIR] Refactor recipe init generation, cleanup after init (#153610 ) In preparation for the firstprivate implementation, this separates out some functions to make it easier to read. Additionally, it cleans up the VarDecl->alloca relationship, which will prevent issues if we have to re-use the same vardecl for a future generated recipe (and causes concerns in firstprivate later).	2025-08-15 06:41:42 -07:00
Gaëtan Bossu	9828745661	[AArch64][ISel] Select constructive EXT_ZZI pseudo instruction (#152554 ) The patch adds patterns to select the EXT_ZZI_CONSTRUCTIVE pseudo instead of the EXT_ZZI destructive instruction for vector_splice. This only works when the two inputs to vector_splice are identical. Given that registers aren't tied anymore, this gives the register allocator more freedom and a lot of MOVs get replaced with MOVPRFX. In some cases however, we could have just chosen the same input and output register, but regalloc preferred not to. This means we end up with some test cases now having more instructions: there is now a MOVPRFX while no MOV was previously needed.	2025-08-15 14:30:24 +01:00
David Green	649762cb04	Revert "[AArch64][GlobalISel] Add additional vecreduce.fadd and fadd 0.0 tests. NFC" This reverts commit 16314eb7312dab38d721c70f247f2117e9800704 as the test cases are failing under EXPENSIVE_CHECKS. Scalar vecreduce.fadd are not valid in GISel.	2025-08-15 14:23:53 +01:00
Stephen Tozer	bc216b057d	[Debugify] Improve reduction of debugify coverage build output (#150212 ) In current DebugLoc coverage builds, the output for any reasonably large build can become very large if any missing DebugLocs are present; this happens because single errors in LLVM may result in many errors being reported in the output report. The main cause of this is that the empty locations attached to instructions may be propagated to other instructions in later passes, which will each be reported as new errors. This patch prevents this by adding an "unknown" annotation to instructions after reporting them once, ensuring that any other DebugLocs copied or derived from the original empty location will not be marked as new errors. As a separate but related change, this patch updates the report generation script to deduplicate results using the recorded stacktrace if they are available, instead of the pass+instruction combination. This reduces the size of the reduction, but makes the reduction highly reliable, as the stacktrace allows us to very precisely identify when two bugs have originated from the same place.	2025-08-15 14:01:04 +01:00
Simon Pilgrim	bcb4984a0b	[X86] select-smin-smax.ll - add i128 tests Helps check quality of legality codegen (all we had was x86 i64 handling)	2025-08-15 13:48:13 +01:00
Simon Pilgrim	263e458273	[X86] select-smin-smax.ll - add i8/i16 test coverage (#153788 ) Pulled out of #151893 to show 32/64-bit target coverage	2025-08-15 13:37:11 +01:00
Erick Ochoa Lopez	61caab7789	[mlir][llvm] Add `align` attribute to `llvm.intr.masked.{expandload,compressstore}` (#153063 ) * Add `requiresArgsAndResultsAttr` to `LLVM_OneResultIntrOp` * Add `args_attrs` to `llvm.intr.masked.{expandload,compressstore}` The LLVM intrinsics [`llvm.intr.masked.expandload`](https://llvm.org/docs/LangRef.html#llvm-masked-expandload-intrinsics) and [`llvm.intr.masked.compressstore`](https://llvm.org/docs/LangRef.html#llvm-masked-compressstore-intrinsics) both allow an optional align parameter attribute to be set which defaults to one. Inlining the documentation below for [`llvm.intr.masked.expandload` 's ](https://llvm.org/docs/LangRef.html#id1522) and [`llvm.intr.masked.compressstore`'s](https://llvm.org/docs/LangRef.html#id1522) arguments respectively > The `align` parameter attribute can be provided for the first argument. The pointer alignment defaults to 1. > The `align` parameter attribute can be provided for the second argument. The pointer alignment defaults to 1.	2025-08-15 08:34:14 -04:00
Mehdi Amini	69453d7021	[MLIR] Fix memory leak in importWebAssemblyToModule when it fails to import (#153794 )	2025-08-15 12:33:25 +00:00
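
The standard-deviation and aggregation fixes described in 08ff017fb0 can be illustrated with a small sketch. It computes `sqrt(E[x^2] − E[x]^2)` from running sums and pools per-thread results by adding the raw sums, which weights each thread by its iteration count. The `ThreadStats` and `Summary` types and the `summarize` and `pool` functions are hypothetical names chosen for illustration, and the pooling rule is one plausible reading of "iteration-weighted pooling"; none of this is taken from the libc benchmark sources.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

// Hypothetical per-thread summary; not the libc benchmark's real type.
struct ThreadStats {
  std::uint64_t iterations; // number of timed calls on this thread
  double sum;               // sum of the cycles-per-call samples
  double sum_of_squares;    // sum of the squared samples
};

struct Summary {
  double mean;
  double stddev;
};

// Population estimator sqrt(E[x^2] - E[x]^2) computed from running sums.
inline Summary summarize(std::uint64_t n, double sum, double sum_of_squares) {
  assert(n > 0 && "need at least one sample");
  const double mean = sum / static_cast<double>(n);
  const double mean_sq = sum_of_squares / static_cast<double>(n);
  // Clamp at zero: rounding can make the difference slightly negative.
  const double variance = std::max(0.0, mean_sq - mean * mean);
  return {mean, std::sqrt(variance)};
}

// Assumed pooling rule: adding the raw sums weights every thread by its
// iteration count, so threads that ran longer contribute more.
inline Summary pool(const std::vector<ThreadStats> &threads) {
  std::uint64_t n = 0;
  double sum = 0.0;
  double sum_sq = 0.0;
  for (const ThreadStats &t : threads) {
    n += t.iterations;
    sum += t.sum;
    sum_sq += t.sum_of_squares;
  }
  return summarize(n, sum, sum_sq);
}
```

Pooling the raw sums rather than averaging per-thread means avoids giving a thread with few iterations the same weight as one with many.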

548739 Commits
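
The `_dvgpr$` symbol encoding described in f279c47cb3 is simple arithmetic, sketched below. The constants (16 VGPRs per block, at most 128 VGPRs, bits 5..3) come from the commit message. The helper names `dvgprSymbolValue` and `decodeVGPRBlocks` are hypothetical and are not LLVM APIs. The sketch also assumes that the VGPR count is rounded up to whole blocks, that a function uses at least one VGPR, and that bits 5..3 of the function symbol value are zero.

```cpp
#include <cassert>
#include <cstdint>

// From the commit message: 16 VGPRs per block, no more than 128 VGPRs.
constexpr std::uint64_t kVGPRsPerBlock = 16;
constexpr std::uint64_t kMaxVGPRs = 128;

// Hypothetical helper (not an LLVM API): value of the _dvgpr$ symbol for a
// function whose symbol value is `func_value` and which uses `num_vgprs`.
inline std::uint64_t dvgprSymbolValue(std::uint64_t func_value,
                                      std::uint64_t num_vgprs) {
  // Assumption: at least one VGPR, so the block count is in 1..8.
  assert(num_vgprs >= 1 && num_vgprs <= kMaxVGPRs);
  // Assumption: bits 5..3 of the function value are free (for example, the
  // function is 64-byte aligned), so adding the offset cannot carry.
  assert((func_value & 0x38) == 0);
  // Assumption: round the VGPR count up to whole blocks.
  const std::uint64_t blocks =
      (num_vgprs + kVGPRsPerBlock - 1) / kVGPRsPerBlock;
  // Store (blocks - 1) in bits 5..3 of the symbol value.
  return func_value + ((blocks - 1) << 3);
}

// Recover the block count a dispatcher would read back from bits 5..3.
inline std::uint64_t decodeVGPRBlocks(std::uint64_t dvgpr_value) {
  return ((dvgpr_value >> 3) & 0x7) + 1;
}
```

Storing one less than the block count lets the eight possible block counts (1 through 8) fit in three bits.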