llvm-project

Author	SHA1	Message	Date
Alex MacLean	76c9bfefa4	[NVPTX] Remove Float register classes (#140487) These classes are redundant, as the untyped "Int" classes can be used for all float operations. This change is intended to be as minimal as possible and leaves the many potential simplifications and refactors this exposes as future work.	2025-05-21 11:33:57 -07:00
Alex MacLean	831592d617	[NVPTX] Fixup under-aligned dynamic alloca lowering (#139628) The alignment on an ISD::DYNAMIC_STACKALLOC node may be 0 to indicate that the default stack alignment should be used. Prior to this change, we passed this alignment through unchanged, leading to an error in ptxas. Now, we use the stack alignment in this case. Also did a little cleanup while I'm here.	2025-05-13 09:56:41 -07:00
Alex MacLean	369891b674	[NVPTX] use untyped loads and stores wherever possible (#137698) In most cases, the type information attached to load and store instructions is meaningless and inconsistently applied. We can usually use ".b" loads and avoid the complexity of trying to assign the correct type. The one exception is the sign-extending load, which will continue to use ".s" to ensure the sign extension into a larger register is done correctly.	2025-05-10 08:26:26 -07:00
peterbell10	0068078dca	[NVPTX] Remove `NVPTX::IMAD` opcode, and rely on instruction selection only (#121724) I noticed that NVPTX will sometimes emit `mad.lo` to multiply by 1, e.g. in https://gcc.godbolt.org/z/4j47Y9W4c. This happens when DAGCombiner operates on the add before the mul, so the imad contraction happens regardless of whether the mul could have been simplified. To fix this, I remove `NVPTXISD::IMAD` and only combine to mad during selection. This allows the default DAGCombiner patterns to simplify the graph without any NVPTX-specific intervention.	2025-01-15 20:09:18 +00:00
Fangrui Song	b279f6b098	[NVPTX,test] Change llc -march= to -mtriple= Similar to 806761a7629df268c8aed49657aeccffa6bca449. -mtriple= specifies the full target triple, while -march= merely sets the architecture part of the default target triple (e.g. Windows, macOS), leaving a target triple which may not make sense. Therefore, -march= is error-prone and not recommended for tests without a target triple. The issue has been benign as we recognize nvptx{,64}-apple-darwin as ELF instead of rejecting it outright.	2024-12-15 10:45:11 -08:00
Youngsuk Kim	0f0a96b862	[llvm][NVPTX] Strip unneeded '+0' in PTX load/store (#113017) Remove the extraneous '+0' immediate offset part in PTX load/stores, to improve readability of output PTX code.	2024-10-19 10:05:36 -04:00
Youngsuk Kim	5a0942cd74	[llvm][NVPTX] Don't emit unused var 'temp_param_reg' (NFC) (#89004) Don't emit unused variable 'temp_param_reg', which has been around since ae556d3ef72dfe5f40a337b7071f42b7bf5b66a4.	2024-04-17 14:45:33 -04:00
Adrian Kuegel	f0a5e50550	[llvm][NVPTX] Add missing feature guard.	2024-03-19 06:53:14 +00:00
Alex MacLean	89b7b3b995	[NVPTX] support dynamic allocas with PTX alloca instruction (#84585) Add support for dynamically sized alloca instructions with the PTX alloca instruction introduced in PTX 7.3 ([9.7.15.3. Stack Manipulation Instructions: alloca](https://docs.nvidia.com/cuda/parallel-thread-execution/index.html#stack-manipulation-instructions-alloca)).	2024-03-15 11:51:46 -07:00
Youngsuk Kim	f9304974cc	[llvm][NVPTX] Inform that 'DYNAMIC_STACKALLOC' is unsupported (#74684) Catch the unsupported path early and emit an informative error. Motivated by the following threads: * https://discourse.llvm.org/t/nvptx-problems-with-dynamic-alloca/70745 * #64017	2023-12-14 22:06:22 -05:00

10 Commits