llvm-project

Author	SHA1	Message	Date
Aiden Grossman	869bce23fd	[CI] Setup generate_report to describe ninja failures This patch makes it so that generate_report will add information about failed build actions to the summary report. This makes it significantly easier to find compilation failures, especially given we run ninja with -k 0. This patch only does the integration into generate_report (along with testing). Actual utilization in the script is split into a separate patch to try and keep things clean. Reviewers: dschuff, cmtice, DavidSpickett, Keenuts, lnihlen, gburgessiv Reviewed By: cmtice, DavidSpickett Pull Request: https://github.com/llvm/llvm-project/pull/152621	2025-08-08 09:44:04 -07:00
Ellis Hoag	3b32893cd9	[InstrProf][NFC] Refactor profdata trace tests (#152550 ) Refactor some llvm-profdata tests to read text profiles which are easier to match with FileCheck	2025-08-08 09:39:58 -07:00
Slava Gurevich	0f59b8d4e3	Fix improper alignment of static buffer for placement-new of BufferQueue (#152408 ) No behavioral change, but eliminates potential UB in strict-alignment systems. The previous commit (llvm#94171) bulk-updated alignment usage to C++23 spec, but missed this occurrence.	2025-08-08 09:36:22 -07:00
Chao Chen	c96223434c	[mlir][xegpu] Add definition of SliceAttr (#150146 ) --------- Co-authored-by: Charitha Saumya <136391709+charithaintc@users.noreply.github.com>	2025-08-08 11:27:17 -05:00
Min-Yih Hsu	b4e8b8ee91	[mlir][vector] Canonicalize broadcast of shape_cast (#150523 ) Fold `broadcast(shape_cast(x))` into `broadcast(x)` if the type of x is compatible with broadcast's result type and the shape_cast only adds or removes ones in the leading dimensions. --------- Co-authored-by: Andrzej Warzyński <andrzej.warzynski@gmail.com> Co-authored-by: James Newling <james.newling@gmail.com>	2025-08-08 09:25:32 -07:00
Alexey Bataev	0419b459be	Revert "[SLP]Initial FMAD support (#149102 )" This reverts commit 0bcf45ea3458ba79eb4257afcfd6af954292c9ce to fix the regressions reported in https://github.com/llvm/llvm-project/issues/152683	2025-08-08 09:17:59 -07:00
James Newling	b574bcf036	[mlir][TD] Support padding with poison (#152003 ) Signed-off-by: James Newling <james.newling@gmail.com>	2025-08-08 09:09:03 -07:00
Simon Pilgrim	45b4f1b438	[Headers][X86] Allow _mm512_set1_epi8/16/pd/ps intrinsics to be used in constexpr (#152746 ) Pulled out of #152288 as I need this to proceed with several other patches	2025-08-08 17:04:08 +01:00
Orlando Cazalet-Hyams	1778669739	[KeyInstr] Remove LLVM_EXPERIMENTAL_KEY_INSTRUCTIONS CMake flag (#152735 ) The CMake flag has been on by default for a month without any issues. This makes the feature support in LLVM unconditional (but does not enable the feature by default).	2025-08-08 17:03:28 +01:00
Simon Pilgrim	c8312bdd16	[Headers][X86] Enable constexpr handling for pmulhw/pmulhuw intrinsics (#152540 ) This patch updates the pmulhw/pmulhuw builtins to support constant expression handling - extending the VectorExprEvaluator::VisitCallExpr handling code that handles elementwise integer binop builtins. Hopefully this can be used as reference patch to show how to add future target specific constexpr handling with minimal code impact. I've also enabled pmullw constexpr handling (which are tagged on #152490) as they all use very similar tests. I've also had to tweak the MMX -> SSE2 wrapper as undefs are not permitted in constexpr shuffle masks Fixes #152524	2025-08-08 17:02:50 +01:00
Aiden Grossman	9ea1d39ead	[CI][Github] Remove Outdated Comments 5fc3e76ec4f323c22cddf7b9458137510507847a made the pipelines fail on errors and also removed the TODO comments, but did not remove the explanatory comments on why things were set up that way. Given things no longer succeed on error, these comments are outdated and should be removed.	2025-08-08 15:59:15 +00:00
Aiden Grossman	83dd7d97bd	[CI] Add Support for Parsing Ninja Logs to generate_test_report_lib This patch adds in support for taking the CLI output from ninja and parsing it for failures. This is intended to be used in the cases where all tests pass (or none have run), but the build fails to easily surface where exactly the build failed. The actual integration will happen in a future patch. Reviewers: gburgessiv, dschuff, lnihlen, DavidSpickett, Keenuts, cmtice Reviewed By: DavidSpickett, cmtice Pull Request: https://github.com/llvm/llvm-project/pull/152620	2025-08-08 08:42:25 -07:00
Muhammad Bassiouni	45b15946b1	[libc][hdrgen] Fix hdrgen when using macros as guards in stdlib.yaml. (#152732 )	2025-08-08 18:39:47 +03:00
Ivan R. Ivanov	7c141e2118	[ValueTracking] Add missing check for two-value PN recurrence matching (#152700 ) When InstTy is a type like IntrinsicInst which can have a variable number of arguments, we can encounter a case where Operation will have fewer than two arguments and error at the getOperand() calls. Fixes: https://github.com/llvm/llvm-project/issues/152725.	2025-08-08 17:39:24 +02:00
Muhammad Bassiouni	66734f4c3c	[libc][math] Refactor cbrtf implementation to header-only in src/__support/math folder. (#151846 ) Part of #147386 in preparation for: https://discourse.llvm.org/t/rfc-make-clang-builtin-math-functions-constexpr-with-llvm-libc-to-support-c-23-constexpr-math-functions/86450	2025-08-08 18:28:50 +03:00
nicebert	09bf2c5c91	[OpenMP] Claims omp_target_is_accessible as worked on (#151507 ) Includes link to current PR. Spec requires minor clarification.	2025-08-08 10:21:16 -05:00
Jordan Rupprecht	6a8e376d82	[bazel] Extra layering_check dep for #151228 : BFloat16 (#152741 )	2025-08-08 10:11:52 -05:00
Simon Pilgrim	f169893cbf	[Headers][X86] Allow BITALG vpopcntw/vpopcntb intrinsics to be used in constexpr (#152701 ) Matches VPOPCNTDQ handling	2025-08-08 16:09:26 +01:00
Amina Chabane	478b415181	[AArch64] Enable svcompact intrinsic in streaming mode with SME2.2 (#151703 ) When the target enables +sme2p2, the svcompact intrinsic is now available in streaming SVE mode, through updating the guards in arm_sve.td. Included Sema test acle_sve_compact.cpp.	2025-08-08 16:04:54 +01:00
Mikhail R. Gadelha	e91f68487c	[RISCV] Update SpacemiT-X60 vector fixed-point arithmetic latencies (#150517 ) This PR adds hardware-measured latencies for all instructions defined in Section 12 of the RVV specification: "Vector Fixed-Point Arithmetic Instructions" to the SpacemiT-X60 scheduling model.	2025-08-08 11:57:35 -03:00
Kazu Hirata	1bc49c0c97	[AST] Remove an unused local variable (NFC) (#152647 )	2025-08-08 07:45:22 -07:00
Kazu Hirata	8afa70f1c8	[llvm] Proofread SourceLevelDebugging.rst (#152646 ) This patch takes care of the highly mechanical part of proofreading SourceLevelDebugging.rst, namely: - hyphenating "32 bit value" and similar and - hyphenating "Objective C"	2025-08-08 07:45:14 -07:00
Kazu Hirata	c11868f66c	[IR] Remove Intrinsic::getDeclaration (#152645 ) Intrinsic::getDeclaration has been deprecated for more than 9 months since: commit b9f08676abcfbb226c67b5ac2a7bc5b33254b915 Author: Rahul Joshi <rjoshi@nvidia.com> Date: Mon Oct 14 19:21:28 2024 -0700 This patch removes it. I'm not aware of any downstream use AFAIK.	2025-08-08 07:45:06 -07:00
Kazu Hirata	4e44e7c164	[Sema] Remove an unnecessary cast (NFC) (#152644 ) numTypeParams is already of unsigned. Co-authored-by: Corentin Jabot <corentinjabot@gmail.com>	2025-08-08 07:44:59 -07:00
Kazu Hirata	9beb18a6f0	[CodeGen] Remove an unnecessary cast (NFC) (#152643 ) getUnitInc() already returns int.	2025-08-08 07:44:51 -07:00
Kazu Hirata	30b0a9ec19	[ADT] Use range-based for loops in StringMap.h (NFC) (#152641 )	2025-08-08 07:44:44 -07:00
Simon Pilgrim	e64224a224	[Headers][X86] Allow AVX cast intrinsics to be used in constexpr (#152730 ) Still missing the "extend to 256-bit" casts - _mm256_castpd128_pd256 / _mm256_castps128_ps256 / _mm256_castsi128_si256 - due to constexpr not liking undefined/poison etc.	2025-08-08 15:39:39 +01:00
Guray Ozen	76a533c8ec	[MLIR][NVVM] Add pmevent (#152509 ) Add the nvvm.pmevent op, which triggers one or more of a fixed number of performance monitor events, with the event index or mask specified by an immediate operand. [For more information, see PTX ISA](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#miscellaneous-instructions-pmevent)	2025-08-08 16:34:18 +02:00
tcottin	2c4b876fa8	[clangd] introduce doxygen parser (#150790 ) Followup work of #140498 to continue the work on clangd/clangd#529 Introduce the use of the Clang doxygen parser to parse the documentation of hovered code. - ASTContext independent doxygen parsing - Parsing doxygen commands to markdown for hover information Note: after this PR I have planned another patch to rearrange the information shown in the hover info. This PR is just for the basic introduction of doxygen parsing for hover information. --------- Co-authored-by: Maksim Ivanov <emaxx@google.com>	2025-08-08 16:07:36 +02:00
Timm Baeder	1b1f352cb9	[clang][bytecode] Handle reads on zero-size arrays (#152706 )	2025-08-08 16:03:02 +02:00
Timm Baeder	3ea76af3a1	[clang][bytecode][NFC] Remove a useless local variable (#152711 ) We can just check NonNullArgs.empty().	2025-08-08 15:52:23 +02:00
Timm Baeder	d54516b9ad	[clang][bytecode][NFC] Use an existing local variable (#152710 ) Instead of calling getSize() again.	2025-08-08 15:41:58 +02:00
Yingwei Zheng	ac8295550b	[Clang][CodeGen] Move `EmitPointerArithmetic` into `CodeGenFunction`. NFC. (#152634 ) `CodeGenFunction::EmitPointerArithmetic` is needed by https://github.com/llvm/llvm-project/pull/152575. Separate the NFC changes into a new PR for smooth review.	2025-08-08 21:41:03 +08:00
David Green	26b302fd8b	[AArch64] Rename Cost -> PromotedCost to avoid shadowing error	2025-08-08 14:37:24 +01:00
Erick Ochoa Lopez	a1672d7c6a	[mlir][vector] Add alignment attribute to `maskedload` and `maskedstore` (#151690 ) These commits continue the work done in https://github.com/llvm/llvm-project/pull/144344, of adding alignment attributes to operations in the vector and memref. These commits focus on adding the alignment attribute to the `maskedload` and `maskedstore` operations. The `VectorLoadConversion` pattern in VectorToLLVM is a template for `load`, `store`, `maskedload` and `maskedstore` operations. Having the alignment attribute in all these operations would allow for an easy way to propagate the alignment attribute from the vector dialect to the LLVM dialect. This patchset also includes changes to the conversion from VectorToLLVM to propagate the alignment attribute for the vector.{,masked}{load,store} operations.	2025-08-08 09:23:44 -04:00
Szymon Piotr Milczek	fd41700962	[InstCombine] visitShuffleVectorInst assert with vector of pointers fix. (#152341 ) In visitShuffleVectorInst there's an if block that's meant to turn shufflevector followed by bitcast into extractelement where possible. It assumes that there will never be bitcasts performed on vectors of ptr as such operations are almost always illegal, and ptrtoint instructions should be used instead. There is however an edge case where a bitcast instruction can be performed on a vector of type `<1 x ptr>` to turn it into type `ptr`. In this edge case, the code initializes the variable `VecBitWidth` to 0. Then, when iterating over users that are bitcasts, an attempt is made to create a vector of size 0, which triggers an assert. This commit changes initialization of `VecBitWidth` to use datalayout to find the size of the vector instead of the getPrimitiveSizeInBits method, which results in 0 for ptr and vectors of ptr.	2025-08-08 15:23:02 +02:00
Rahul Joshi	7f0e4079c8	[NFCI][TableGen] Make `Intrinsic::getAttributes` table driven (#152349 ) This is a follow-on to https://github.com/llvm/llvm-project/pull/152219 to reduce both code and frame size of `Intrinsic::getAttributes` further. Currently, this function consists of several switch cases (one per unique argument attributes) that populate the local `AS` array with all non-empty argument attributes for that intrinsic by calling `getIntrinsicArgAttributeSet`. This change makes this code table driven and implements `Intrinsic::getAttributes` without any switch cases, which reduces the code size of this function on all platforms and in addition reduces the frame size by a factor of 10 on Windows. This is achieved by: 1. Emitting table `ArgAttrIdTable` containing a concatenated list of `<ArgNo, AttrID>` entries across all unique arguments. 2. Emitting table `ArgAttributesInfoTable` (indexed by unique arguments-ID) to store the starting index and number of non-empty arg attributes. 3. Reserving unique function-ID 255 to indicate that the intrinsic has no function attributes (to replace `HasFnAttr` setup in each switch case). 4. Using a simple table lookup and for loop to build the list of argument and function attributes for a given intrinsic. Experimental data shows that with release builds and assertions disabled, this change reduces the code size for GCC and Clang builds on Linux by ~9KB for a modest (80/152 byte) increase in frame size. For Windows, it reduces the code size by 20KB and frame size from 4736 bytes to 461 bytes which is a 10x reduction. Actual data is as follows: ``` Current trunk: Compiler gcc-13.3.0 clang-18.1.3 MSVC 19.43.34810.0 code size 0x35a9 0x370c 0x5581 frame size 0x120 0x118 0x1280 table driven Intrinsic::getAttributes: code size 0xcfb 0xcd0 0x1cf frame size 0x1b8 0x188 0x1A0 Total savings (code + data) 9212 bytes 9790 bytes 20119 bytes ``` Total savings above accounts for the additional data size for the 2 new tables, which in this experiment was: `ArgAttributesInfoTable` = 314 bytes and `ArgAttrIdTable` = 888 bytes. Coupled with the earlier https://github.com/llvm/llvm-project/pull/152219, this achieves a 46x reduction in frame size for this function in Windows release builds.	2025-08-08 06:02:43 -07:00
Timm Baeder	8d26252eec	[clang][bytecode][NFC] Dead blocks are always uninitialized (#152699 ) We always call the descriptor dtor before, so they are never initialized.	2025-08-08 14:57:38 +02:00
Yaxun (Sam) Liu	479556c720	[HIP] compressed bundle format defaults to v3 (#152600 ) HIP runtime support for compressed bundle format v3 is in place, therefore switch the default compressed bundle format to v3 in compiler. This allows both compressed and decompressed fat binary size to exceed 4GB by default. Environment variable COMPRESSED_BUNDLE_FORMAT_VERSION=2 can be used for backward compatibility for older HIP runtimes not supporting v3. Fixes: SWDEV-548879	2025-08-08 08:53:01 -04:00
sebvince	8949dc7f9c	[mlir][amdgpu] fold memref.subview/expand_shape/collapse_shape into amdgpu.gather_to_lds for DST operand (#152277 )	2025-08-08 05:47:33 -07:00
David Green	7f1638efc1	[AArch64] Generalize costing for FP16 instructions (#150033 ) This extracts the code for modelling a fp16 operation as `fptrunc(fpop(fpext,fpext))` into a new function named getFP16BF16PromoteCost so that it can be reused by the arithmetic instructions. The function takes a lambda to calculate the cost of the operation with the promoted type.	2025-08-08 13:40:07 +01:00
Lucas Ramirez	83c308f014	[AMDGPU][Scheduler] Consistent occupancy calculation during rematerialization (#149224 ) The `RPTarget`'s way of determining whether VGPRs are beneficial to save and whether the target has been reached w.r.t. VGPR usage currently assumes, if `CombinedVGPRSavings` is true, that free slots in one VGPR RC can always be used for the other. Implicitly, this makes the rematerialization stage (only current user of `RPTarget`) follow a different occupancy calculation than the "regular one" that the scheduler uses, one that assumes that ArchVGPR/AGPR usage can be balanced perfectly and at no cost, which is untrue in general. This ultimately yields suboptimal rematerialization decisions that require cross-VGPR-RC copies unnecessarily. This fixes that, making the `RPTarget`'s internal model of occupancy consistent with the regular one. The `CombinedVGPRSavings` flag is removed, and a form of cross-VGPR-RC saving implemented only for unified RFs, which is where it makes the most sense. Only when the amount of free VGPRs in a given VGPR RC (ArchVGPR or AGPR) is lower than the excess VGPR usage in the other VGPR RC does the `RPTarget` consider that a pressure reduction in the former will be beneficial to the latter.	2025-08-08 14:26:04 +02:00
Mel Chen	ab7281d896	[VPlan] Update naming in VPInterleaveRecipe constructor. nfc (#152472 )	2025-08-08 20:17:10 +08:00
Simon Pilgrim	1e9ed918dd	[X86][AVX512BITALG] add C/C++ and 32/64-bit builtins test coverage (#152693 )	2025-08-08 13:12:06 +01:00
Michael Buch	672f82a2ef	[lldb][test] TestExprDefinitionInDylib.py: add cases for calling ctors	2025-08-08 12:12:25 +01:00
Timm Baeder	fde9ee1ac2	[clang][bytecode] Don't deallocate dynamic blocks with pointers (#152672 ) This fixes the edge case we had with variables pointing to dynamic blocks, which forced us to convert basically all dynamic blocks to DeadBlock when deallocating them. We now don't run dynamic blocks through InterpState::deallocate() but instead add them to a DeadAllocations list when they are deallocated but still have pointers. As a consequence, not all blocks with Block::IsDead = true are DeadBlocks.	2025-08-08 13:02:01 +02:00
Florian Hahn	82d633e9ff	[VPlan] Materialize vector trip count using VPInstructions. (#151925 ) Materialize the vector trip count computation using VPInstruction instead of directly creating IR. This is one of the last few steps needed to model the full vector skeleton in VPlan. It also simplifies vector-trip count computations for scalable vectors, as we can re-use the UF x VF computation. PR: https://github.com/llvm/llvm-project/pull/151925	2025-08-08 11:44:32 +01:00
Sasa Vuckovic	9349484e8f	[MLIR] Make `PassPipelineOptions` virtually inheriting from PassOptions to allow diamond inheritance (#146370 ) ## Problem Given 3 pipelines, A, B, and a superset pipeline AB that runs both the A & B pipelines, it is not easy to manage their options - one needs to manually recreate all options from A and B into AB, and maintain them. This is tedious. ## Proposed solution Ideally, AB options class inherits from both A and B options, making the maintenance effortless. Today though, this causes problems as their base classes `PassPipelineOptions<A>` and `PassPipelineOptions<B>` both inherit from `mlir::detail::PassOptions`, leading to the so called "diamond inheritance problem", i.e. multiple definitions of the same symbol, in this case parseFromString that is defined in mlir::detail::PassOptions. Visually, the inheritance looks like this: ``` mlir::detail::PassOptions ↑ ↑ \| \| PassPipelineOptions<A> PassPipelineOptions<B> ↑ ↑ \| \| AOptions BOptions ↑ ↑ +---------+--------+ \| ABOptions ``` A proposed fix is to use the common solution to the diamond inheritance problem - virtual inheritance.	2025-08-08 12:33:56 +02:00
Ryotaro Kasuga	bd39ae6125	[Delinearization] Add function for fixed size array without relying on GEP (#145050 ) The existing functions `getIndexExpressionsFromGEP` and `tryDelinearizeFixedSizeImpl` provide functionality to delinearize memory accesses for fixed size array. They use the GEP source element type in their optimization heuristics. However, driving optimization heuristics based on GEP type information is not allowed. This patch introduces new functions `findFixedSizeArrayDimensions` and `delinearizeFixedSizeArray` to delinearize a fixed size array without using the type information in GEP. The new function `findFixedSizeArrayDimensions` infers the size of each dimension of the array based on the value to be added to the address as induction variables are incremented. `delinearizeFixedSizeArray` attempts to restore the subscripts of each dimension based on the estimated array size. This is an initial implementation that may not cover all cases, but is intended to replace the existing function in the future. Related: - https://discourse.llvm.org/t/enabling-loop-interchange/82589/4 - https://github.com/llvm/llvm-project/pull/124911#issuecomment-2962499501	2025-08-08 19:08:14 +09:00
Bart Chrzaszcz	92f6b15445	[clang] Fix bazel after eccc6e2. (#152681 )	2025-08-08 11:02:14 +01:00

547939 Commits