llvm-project

Author	SHA1	Message	Date
Justin Fargnoli	4d2e1e1c74	Reland "[lit] Refactor available `ptxas` features" (#155923) Reland #154439. Reverted with #155914. Account for: - Windows `ptxas` outputting error messages to `stdout` instead of `stderr` - Tests in `llvm/test/DebugInfo/NVPTX`	2025-09-02 10:42:25 -07:00
Justin Fargnoli	826780a84e	Revert "[lit] Refactor available `ptxas` features" (#155914) Reverts llvm/llvm-project#154439 in order to resolve https://github.com/llvm/llvm-project/pull/154439#issuecomment-3234638253.	2025-08-28 14:18:58 -07:00
Justin Fargnoli	d77cf579d8	[lit] Refactor available `ptxas` features (#154439) ToT `lit` currently assumes that a given `ptxas` version supports all capabilities of prior `ptxas` releases. This approach was flexible enough to support the removal of 32-bit address compilation from `ptxas` in CUDA 12.1, but it struggles with the removal of Volta and prior compilation in CUDA 13.0. To deal with this, this PR refactors how `lit` defines the set of features available for a given `ptxas` version. It invokes `ptxas` not just to get its version, but also to get the list of supported SMs, supported PTX ISA versions, and support for 32-bit compilation. This approach should be flexible enough to deal with the changing support matrix of `ptxas` as it goes forward. One obvious downside is that this relies on parsing the `stdout` of `ptxas`, something that's inherently unstable. But, IMO, this is something that we can fix as needed.	2025-08-28 09:43:49 -07:00
Alex MacLean	d494eb0fa3	[NVPTX] Skip numbering unreferenced virtual registers (readability) (#154391) When assigning numbers to registers, skip any with neither uses nor defs. This will not have any impact at all on the final SASS, but it makes for slightly more readable PTX. This change should also ensure that future minor changes are less likely to cause noisy diffs in register numbering.	2025-08-19 12:27:46 -07:00
Alex MacLean	35693daa70	[NVPTX] Fix v2i8 call lowering, use generic ld/st nodes for call params (#146930)	2025-07-28 10:41:51 -07:00
Alex MacLean	70333de6cf	[NVPTX] Consolidate and cleanup various NVPTXISD nodes (NFC) (#145581) This change consolidates and cleans up various NVPTXISD target-specific nodes in order to simplify SDAG ISel. While there are some whitespace changes in the emitted PTX, it is otherwise a non-functional change. NVPTXISD::Wrapper - This node was used to wrap external-symbol and global-address nodes. It is redundant and has been removed. Instead we use the non-target versions of these nodes and convert them appropriately during ISel. NVPTXISD::CALL - Much of the family of nodes used to represent a PTX call instruction has been replaced by this new single node. It corresponds to a single instruction and is therefore much simpler to create and lower.	2025-06-25 11:42:21 -07:00
Alex MacLean	76c9bfefa4	[NVPTX] Remove Float register classes (#140487) These classes are redundant, as the untyped "Int" classes can be used for all float operations. This change is intended to be as minimal as possible and leaves the many potential simplifications and refactors this exposes as future work.	2025-05-21 11:33:57 -07:00
Alex MacLean	831592d617	[NVPTX] Fixup under-aligned dynamic alloca lowering (#139628) The alignment on an ISD::DYNAMIC_STACKALLOC node may be 0 to indicate that the default stack alignment should be used. Prior to this change, we passed this alignment through unchanged, leading to an error in ptxas. Now, we use the stack-alignment in this case. Also did a little cleanup while I'm here.	2025-05-13 09:56:41 -07:00
Alex MacLean	369891b674	[NVPTX] use untyped loads and stores where ever possible (#137698) In most cases, the type information attached to load and store instructions is meaningless and inconsistently applied. We can usually use ".b" loads and avoid the complexity of trying to assign the correct type. The one exception is sign-extending loads, which will continue to use ".s" to ensure the sign extension into a larger register is done correctly.	2025-05-10 08:26:26 -07:00
peterbell10	0068078dca	[NVPTX] Remove `NVPTX::IMAD` opcode, and rely on intruction selection only (#121724) I noticed that NVPTX will sometimes emit `mad.lo` to multiply by 1, e.g. in https://gcc.godbolt.org/z/4j47Y9W4c. This happens when DAGCombiner operates on the add before the mul, so the imad contraction happens regardless of whether the mul could have been simplified. To fix this, I remove `NVPTXISD::IMAD` and only combine to mad during selection. This allows the default DAGCombiner patterns to simplify the graph without any NVPTX-specific intervention.	2025-01-15 20:09:18 +00:00
Fangrui Song	b279f6b098	[NVPTX,test] Change llc -march= to -mtriple= Similar to 806761a7629df268c8aed49657aeccffa6bca449 -mtriple= specifies the full target triple while -march= merely sets the architecture part of the default target triple (e.g. Windows, macOS), leaving a target triple which may not make sense. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize nvptx{,64}-apple-darwin as ELF instead of rejecting it outright.	2024-12-15 10:45:11 -08:00
Youngsuk Kim	0f0a96b862	[llvm][NVPTX] Strip unneeded '+0' in PTX load/store (#113017) Remove the extraneous '+0' immediate offset part in PTX load/stores, to improve readability of output PTX code.	2024-10-19 10:05:36 -04:00
Youngsuk Kim	5a0942cd74	[llvm][NVPTX] Don't emit unused var 'temp_param_reg' (NFC) (#89004) Don't emit unused variable 'temp_param_reg' which has been around since ae556d3ef72dfe5f40a337b7071f42b7bf5b66a4.	2024-04-17 14:45:33 -04:00
Adrian Kuegel	f0a5e50550	[llvm][NVPTX] Add missing feature guard.	2024-03-19 06:53:14 +00:00
Alex MacLean	89b7b3b995	[NVPTX] support dynamic allocas with PTX alloca instruction (#84585) Add support for dynamically sized alloca instructions with the PTX alloca instruction introduced in PTX 7.3 ([9.7.15.3. Stack Manipulation Instructions: alloca] (https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#stack-manipulation-instructions-alloca))	2024-03-15 11:51:46 -07:00
Youngsuk Kim	f9304974cc	[llvm][NVPTX] Inform that 'DYNAMIC_STACKALLOC' is unsupported (#74684) Catch the unsupported path early and emit an informative error. Motivated by the following threads: * https://discourse.llvm.org/t/nvptx-problems-with-dynamic-alloca/70745 * #64017	2023-12-14 22:06:22 -05:00

16 Commits
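
The register-numbering rule described in d494eb0fa3 (skip any virtual register with neither uses nor defs) can be illustrated with a minimal Python sketch. This is not the NVPTX backend code, which is C++; the function name, the dictionary-based use/def counts, and the choice to start numbering at 1 are all assumptions made for illustration.

```python
def number_referenced_registers(all_regs, uses, defs):
    """Assign consecutive numbers to registers, skipping unreferenced ones.

    all_regs: virtual register names in allocation order (hypothetical input).
    uses, defs: dicts mapping a register name to its use/def count; a missing
    key or a zero count means the register has no such reference.
    """
    numbering = {}
    next_id = 1  # assumption: numbering starts at 1
    for reg in all_regs:
        # A register with neither uses nor defs gets no number at all,
        # so it leaves no gap in the emitted numbering.
        if not uses.get(reg) and not defs.get(reg):
            continue
        numbering[reg] = next_id
        next_id += 1
    return numbering


regs = ["%v0", "%v1", "%v2", "%v3"]
uses = {"%v0": 2, "%v3": 1}
defs = {"%v0": 1, "%v2": 1}
numbering = number_referenced_registers(regs, uses, defs)
print(numbering)  # {'%v0': 1, '%v2': 2, '%v3': 3}
```

Here `%v1` has no uses and no defs, so it is skipped and the remaining registers are numbered densely. Deleting the last reference to a register therefore shifts only the numbers after it, which is the kind of diff noise the commit message says it wants to reduce.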