llvm-project

Author	SHA1	Message	Date
cabreraam	a033bf242f	[flang][hlfir] work towards handling char_convert in hlfir This patch aims to address the TODO for handling character conversion in HLFIR found [here](`1defa78124/flang/lib/Lower/ConvertExprToHLFIR.cpp (L1388)`) using [this similar operation but for FIR as inspiration](`3ea673a97b/flang/lib/Lower/ConvertExpr.cpp (L1212-L1271)`). Reviewed By: vzakhari, tblah Differential Revision: https://reviews.llvm.org/D155650	2023-07-31 10:45:10 -04:00
Roger Ferrer Ibanez	896aada3b6	[NFCI][mlir][Tests] Rename identifiers minor/major to avoid clashes with system headers Identifiers major and minor are often already taken in POSIX systems due to their presence in <sys/types.h> as part of the makedev library function. This causes compilation failures on FreeBSD and Linux systems with glibc <2.28. This change renames the identifiers to major_/minor_. Differential Revision: https://reviews.llvm.org/D156683	2023-07-31 14:36:35 +00:00
Matt Arsenault	fbeda975d2	InstCombine: Drop some typed pointer cast handling	2023-07-31 10:34:31 -04:00
Alexey Bataev	662efdee9b	[SLP][NFC]Improve handling of MinBWs container, NFC. Replaced MapVector with DenseMap (the order is not important, only lookup is used) and reduced the number of lookups.	2023-07-31 07:26:55 -07:00
Nikita Popov	72ec2c007e	[InstCombine] Fix handling of irreducible loops (PR64259) Fixes a regression introduced by D75362 for irreducible control flow. In that case, we may visit the predecessor that renders the current block live only later, and incorrectly determine that a block is dead. Instead, switch to using the same DeadEdges based implementation we also use during the main InstCombine iteration. This temporarily regresses some cases that need replacement of dead phi operands with poison, which is currently only done during the main run, but not worklist population. This will be addressed in a followup, to keep it separate from the correctness fix here. Fixes https://github.com/llvm/llvm-project/issues/64259.	2023-07-31 16:20:22 +02:00
Shraiysh Vaishay	2cb6d0c70b	[mlir][OpenMP] Translating if and final clauses for task construct Support for if and final clauses for task construct. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D130704	2023-07-31 19:44:17 +05:30
Matt Arsenault	d6f9428e46	GlobalISel: Pass MachineIRBuilder to applyMappingImpl The target should not have to construct MachineIRBuilders during RegBankSelect (we should perhaps hide the constructors for it). The pass should own the builder setup with the desired CSE configuration (although currently the pass does not use the CSE builder, which is what I want to fix). https://reviews.llvm.org/D156479	2023-07-31 10:03:38 -04:00
Alexey Bataev	85635c7f60	[SLP][NFC]Use ScalarTy consistently in getEntryCost, NFC.	2023-07-31 06:52:56 -07:00
Sergei Barannikov	aeeaadd6ee	[SystemZ] Replace OperandMatchResultTy with ParseStatus (NFC) ParseStatus is slightly more convenient to use due to implicit conversion from bool, which allows writing something like: ``` return Error(L, "msg"); ``` whereas with OperandMatchResultTy it had to be: ``` Error(L, "msg"); return MatchOperand_ParseFail; ``` It also has a more appropriate name, since parse* methods are not only for parsing operands. Reviewed By: uweigand Differential Revision: https://reviews.llvm.org/D154316	2023-07-31 16:44:00 +03:00
Matthias Springer	16b75cd2bb	[mlir][vector] Use DenseI64ArrayAttr for ExtractOp/InsertOp positions `DenseI64ArrayAttr` provides a better API than `I64ArrayAttr`. E.g., accessors returning `ArrayRef<int64_t>` (instead of `ArrayAttr`) are generated. Differential Revision: https://reviews.llvm.org/D156684	2023-07-31 15:25:37 +02:00
Matthias Springer	aba0ef7059	[mlir][bufferization] Support casts in EmptyTensorElimination EmptyTensorElimination is a pre-bufferization transformation that replaces "tensor.empty" ops with "tensor.extract_slice" ops. This revision adds support for cases where the input IR contains "tensor.cast" ops. Differential Revision: https://reviews.llvm.org/D156167	2023-07-31 15:20:00 +02:00
Nikita Popov	09156b36c6	[InstCombine] Move worklist preparation into InstCombinerImpl (NFC)	2023-07-31 15:18:12 +02:00
Matthias Springer	933fde3d1c	[mlir][tensor][NFC] Simplify extract_slice(cast) folder The type computation part is not needed. Differential Revision: https://reviews.llvm.org/D156652	2023-07-31 15:07:49 +02:00
Matthias Springer	b2826c0209	[mlir][NFC] Move offsets/sizes/strides helper to dialect utils and interface header * Move `foldDynamicIndexList` to `DialectUtils` and simplify function. * Move `OpWithOffsetSizesAndStridesConstantArgumentFolder` to `ViewLikeInterface` and add documentation. Differential Revision: https://reviews.llvm.org/D156581	2023-07-31 14:53:14 +02:00
Matt Arsenault	ab6cd2d498	AMDGPU: Simplify early exit handling for libcall simplify Early exit on intrinsics and don't duplicate indirect call checks. Also let the IRBuilder constructor figure out the insert point rather than doing it manually. Also avoid debug print about trying to simplify calls in more unhandled scenarios.	2023-07-31 08:18:12 -04:00
Matt Arsenault	d74c89fdb4	InstCombine: Drop some typed pointer bitcasts	2023-07-31 08:05:58 -04:00
Matt Arsenault	055a7f2512	AMDGPU: Adjust outdated comment	2023-07-31 08:05:13 -04:00
Matt Arsenault	51ec5a2733	AMDGPU: Use available subtarget member	2023-07-31 08:05:12 -04:00
Matt Arsenault	acc163d4ab	Inliner: Regenerate test Test claims to be autogenerated but some functions are inexplicably missing checks.	2023-07-31 08:05:12 -04:00
Matt Arsenault	360a5d5612	AMDGPU: Remove some typed pointer handling	2023-07-31 08:05:12 -04:00
Matt Arsenault	d388222be2	InstCombine: Drop some typed pointer bitcast handling	2023-07-31 08:05:12 -04:00
Nimish Mishra	da1f1b2292	Prevent extraneous copy in f752265231c2d15590a53e45bcc850acf2450dfc Commit f752265231c2d15590a53e45bcc850acf2450dfc uses extraneous copy to the loop variable. Fixing the same	2023-07-31 17:31:19 +05:30
Jonas Hahnfeld	5ea647dea6	[CodeGen] Assert that EmittedDeferredDecls is empty Its contents are transferred into DeferredDecls in Release(), so it should be empty in moveLazyEmissionStates(). This matches the code downstream in Cling. Differential Revision: https://reviews.llvm.org/D156660	2023-07-31 13:40:00 +02:00
Haojian Wu	dcb28244fa	[clangd] Respect IWYU keep pragma for standard headers. see the issue https://github.com/llvm/llvm-project/issues/64191 Differential Revision: https://reviews.llvm.org/D156650	2023-07-31 13:21:54 +02:00
Nimish Mishra	f752265231	[flang][OpenMP] Support for privatization in common block This patch provides support for usage of common block in private/firstprivate and lastprivate clauses. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D156120	2023-07-31 16:46:18 +05:30
Haojian Wu	171868dc2c	[Tooling/Inclusion] Add std::range symbols in the mapping. Fixes https://github.com/llvm/llvm-project/issues/64191 Differential Revision: https://reviews.llvm.org/D156648	2023-07-31 13:05:47 +02:00
Peixin Qiao	b4c54b2027	[flang][OpenMP] Support common block in OpenMP private clause This supports the common block in the OpenMP private clause by giving each common block member host-associated privatization, and adds the test case. Reviewed By: kiranchandramohan Differential Revision: https://reviews.llvm.org/D127215	2023-07-31 16:24:12 +05:30
Simon Pilgrim	c1c86f9eae	[X86] LowerEXTRACT_VECTOR_ELT - match i8 extraction with MVT::i8 instead of getSizeInBits() Noticed on D156350	2023-07-31 11:37:26 +01:00
Jay Foad	0ef39e33d7	[StackColoring] Fix typo in comment	2023-07-31 11:35:57 +01:00
Sergio Afonso	fcb6a9c07c	[Flang][OpenMP][Lower] Refactor implementation of PFT to MLIR lowering This patch makes the following non-functional changes: - Extract OpenMP clause processing into a new internal `ClauseProcessor` class. Atomic and reduction-related clauses processing is kept unchanged, since atomic clauses are stored in `OmpAtomicClauseList` rather than `OmpClauseList` and there are many TODO comments related to the current implementation of reduction lowering. This has been left unchanged to avoid merge conflicts and work duplication. - Reorganize functions into sections in the file to improve readability. - Explicitly use mlir:: namespace everywhere, rather than just most places. - Spell out uses of `auto` in which the type wasn't explicitly stated as part of the initialization expression. - Normalize a few function names to match the rest and renamed variables in 'snake_case' to 'camelCase'. The main purpose is to reduce code duplication and simplify the implementation of upcoming work to support loop-attached target constructs and teams/ distribute lowering to MLIR. Differential Revision: https://reviews.llvm.org/D155981	2023-07-31 10:51:39 +01:00
Ingo Müller	bd17556d55	[mlir][memref][transform][python] Create .td file for bindings. This patch creates the .td files for the Python bindings of the transform ops of the MemRef dialect and integrates them into the build systems (CMake and Bazel). Reviewed By: ftynse Differential Revision: https://reviews.llvm.org/D156536	2023-07-31 09:49:28 +00:00
Vedant Paranjape	259d56d41d	[LoopAccessAnalysis] Add a const qualifier to getMaxSafeDepDistBytes() Add a const qualifier to this API call, since this is a member of MemoryDepChecker and LoopAccessInfo returns an object of this class as a const, as follows: const MemoryDepChecker &getDepChecker() const { return *DepChecker; } If one tries to use function as follows: LAI->getDepChecker().getMaxSafeDepDistBytes() results in the following error: passing ‘const llvm::MemoryDepChecker’ as ‘this’ argument discards qualifiers Reviewed By: nikic Differential Revision: https://reviews.llvm.org/D156304	2023-07-31 09:45:01 +00:00
Simon Pilgrim	076bee1020	[DAG] getNode() - fold (zext (trunc (assertzext x))) -> (assertzext x) If the pre-truncated value was the same width as the extension, and the assertzext guarantees that the extended bits are already zero, then skip the zext/trunc 'zero_extend_inreg' pattern. Addresses several regressions noticed in D155472	2023-07-31 10:43:11 +01:00
Simon Tatham	60b98363c7	Retain all jump table range checks when using BTI. This modifies the switch-statement generation in SelectionDAGBuilder, specifically the part that generates case clusters of type CC_JumpTable. A table-based branch of any kind is at risk of being a JOP gadget, if it doesn't range-check the offset into the table. For some types of table branch, such as Arm TBB/TBH, the impact of this is limited because the value loaded from the table is a relative offset of limited size; for others, such as a MOV PC,Rn computed branch into a table of further branch instructions, the gadget is fully general. When compiling for branch-target enforcement via Arm's BTI system, many of these table branch idioms use branch instructions of types that do not require a BTI instruction at the branch destination. This avoids the need to put a BTI at the start of each case handler, reducing the number of available gadgets //with// BTIs (i.e. ones which could be used by a JOP attack in spite of the BTI system). But without a range check, the use of a non-BTI-requiring branch also opens up a larger range of followup gadgets for an attacker's use. A defence against this is to avoid optimising away the range check on the table offset, even if the compiler believes that no out-of-range value should be able to reach the table branch. (Rationale: that may be true for values generated legitimately by the program, but not those generated maliciously by attackers who have already corrupted the control flow.) The effect of keeping the range check and branching to an unreachable block is that no actual code is generated at that block, so it will typically point at the end of the function. That may still cause some kind of unpredictable code execution (such as executing data as code, or falling through to the next function in the code section), but even if so, there will only be //one// possible invalid branch target, rather than giving an attacker the choice of many possibilities. 
This defence is enabled only when branch target enforcement is in use. Without branch target enforcement, the range check is easily bypassed anyway, by branching in to a location just after it. But with enforcement, the attacker will have to enter the jump table dispatcher at the initial BTI and then go through the range check. (Or, if they don't, it's because they //already// have a general BTI-bypassing gadget.) Reviewed By: MaskRay, chill Differential Revision: https://reviews.llvm.org/D155485	2023-07-31 10:39:50 +01:00
Cullen Rhodes	ce6303f0e6	[lli] Fix crash on empty entry-function Empty entry-function triggers the following assertion: llvm/lib/IR/Mangler.cpp:38: void getNameWithPrefixImpl(llvm::raw_ostream &, const llvm::Twine &, (anonymous namespace)::ManglerPrefixTy, const llvm::DataLayout &, char): Assertion `!Name.empty() && "getNameWithPrefix requires non-empty name"' failed. Throw an error if entry-function is empty. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D156516	2023-07-31 09:29:54 +00:00
Nikita Popov	41895843b5	[InstCombine] Only perform one iteration InstCombine is a worklist-driven algorithm, which works roughly as follows: * All instructions are initially pushed to the worklist. The initial order is in RPO program order. * All newly inserted instructions get added to the worklist. * When an instruction is folded, its users get added back to the worklist. * When the use-count of an instruction decreases, it gets added back to the worklist. * And a few other heuristics on when we should revisit instructions. On top of the worklist algorithm, InstCombine layers an additional fix-point iteration: If any fold was performed in the previous iteration, then InstCombine will re-populate the worklist from scratch and fold the entire function again. This continues until a fix-point is reached. In the vast majority of cases, InstCombine will reach a fix-point within a single iteration: However, a second iteration is performed to verify that this is indeed the fixpoint. We can see this in the statistics for llvm-test-suite: "instcombine.NumOneIteration": 411380, "instcombine.NumTwoIterations": 117921, "instcombine.NumThreeIterations": 236, "instcombine.NumFourOrMoreIterations": 2, The way to read these numbers is that in 411380 cases, InstCombine performs no folds. In 117921 cases it performs a fold and reaches the fix-point within one iteration (the second iteration verifies the fixpoint). In the remaining 238 cases, more than one iteration is needed to reach the fixpoint. In other words, only in 0.04% of cases are additional iterations needed to reach a fixpoint. Conversely, in 22.3% of cases InstCombine performs a completely useless extra iteration to verify the fix point. This patch removes the fixpoint iteration from InstCombine, and always performs only a single iteration. This results in a major compile-time improvement of around 4% at negligible codegen impact. This explicitly does accept that we will not reach a fixpoint in all cases. 
However, this is mitigated by two factors: First, the data suggests that this happens very rarely in practice. Second, InstCombine runs many times during the optimization pipeline (8 times even without LTO), so there are many chances to recover such cases. In order to prevent accidental optimization regressions in the future, this implements a verify-fixpoint option, which is enabled by default when instcombine is specified in -passes and disabled when InstCombinePass() is constructed from C++. This means that test cases need to explicitly use the no-verify-fixpoint option if they fail to reach a fixed point (for a well understood reason we cannot / do not want to avoid). Differential Revision: https://reviews.llvm.org/D154579	2023-07-31 10:56:49 +02:00
wangpc	19a1b67b6d	[RISCV] Fix typo in C9LeftShift It should be 9 instead of 5. Reviewed By: kazu Differential Revision: https://reviews.llvm.org/D156500	2023-07-31 16:49:47 +08:00
Alex Zinenko	b17acc08a8	[mlir][python] more python gpu transform mixins Add the Python mix-in for MapNestedForallToThreads. Fix typing annotations in MapForallToBlocks and drop the attribute wrapping rendered unnecessary by attribute builders. Reviewed By: ingomueller-net Differential Revision: https://reviews.llvm.org/D156528	2023-07-31 08:24:18 +00:00
Alex Zinenko	8e4887a12e	[mlir] use a thread-local alternative to llvm::nulls LLVM is not set up in a thread-safe way, which seems to be leading to race conditions when sending stuff to llvm::nulls in opt builds. Try a thread-local alternative. Reviewed By: pzread Differential Revision: https://reviews.llvm.org/D156421	2023-07-31 08:21:21 +00:00
Francesco Petrogalli	c4b21d57bc	[llc] Add the command line option `-sched-model-force-enable-intervals`. The option is used to force the use of resource intervals in the machine scheduler, effectively ignoring the value of `EnableIntervals` in the instance of the `SchedMachineModel`. Reviewed By: anemet Differential Revision: https://reviews.llvm.org/D156540	2023-07-31 10:10:18 +02:00
Takuya Shimizu	e90f4fc6ac	[clang][ExprConstant] Print template arguments when describing stack frame This patch adds additional printing of template argument list when the described function is a template specialization. This can be useful when handling complex template functions in constexpr evaluator. Reviewed By: cjdb, dblaikie Differential Revision: https://reviews.llvm.org/D154366	2023-07-31 17:05:56 +09:00
Nikita Popov	063b37e7b4	Reapply [IR] Mark and/or constant expressions as undesirable Reapply after D156401, which stops PatternMatch from recognizing binop constant expressions, which should avoid the infinite loops and assertion failures this patch previously exposed. ----- In preparation for removing support for and/or expressions, mark them as undesirable. As such, we will no longer implicitly create such expressions, but they still exist.	2023-07-31 09:54:24 +02:00
Alexandros Lamprineas	893d3a61c0	Reland [FuncSpec] Add Phi nodes to the InstCostVisitor. This patch allows constant folding of PHIs when estimating the user bonus. Phi nodes are a special case since some of their inputs may remain unresolved until all the specialization arguments have been processed by the InstCostVisitor. Therefore, we keep a list of dead basic blocks and then lazily visit the Phi nodes once the user bonus has been computed for all the specialization arguments. Differential Revision: https://reviews.llvm.org/D154852	2023-07-31 08:25:48 +01:00
Simi Pallipurath	3f75d38a4d	[clang] Improve hermeticity of clang header tests. At the moment the header tests below fail with the multilib error in LLVM Embedded Toolchain for Arm because no corresponding aarch64 big endian library variant exists. Specifying --sysroot to its own testing directory clang/test/Headers/Inputs (which does not have any dependency library) prevents these header tests from being located in standard library directories. 1. clang/test/Headers/arm-neon-header.c 2. clang/test/Headers/arm-fp16-header.c Reviewed By: michaelplatings Differential Revision: https://reviews.llvm.org/D156427	2023-07-31 08:25:36 +01:00
Timm Bäder	d37f1e9965	[clang][Interp] Implement __builtin_isnormal Differential Revision: https://reviews.llvm.org/D155374	2023-07-31 09:14:16 +02:00
Timm Bäder	f444f39686	[clang][Interp] Implement __builtin_isfinite Differential Revision: https://reviews.llvm.org/D155372	2023-07-31 09:12:32 +02:00
Mel Chen	5962942902	[LV][NFC] Refine comments related to reduction idioms.	2023-07-31 00:06:45 -07:00
Timm Bäder	72450a7793	[clang][Interp] Implement __builtin_isinf Differential Revision: https://reviews.llvm.org/D155371	2023-07-31 08:49:22 +02:00
Sameer Sahasrabuddhe	d9847cde48	[GlobalISel] convergent intrinsics Introduced the convergent equivalent of the existing G_INTRINSIC opcodes: - G_INTRINSIC_CONVERGENT - G_INTRINSIC_CONVERGENT_W_SIDE_EFFECTS Out of the targets that currently have some support for GlobalISel, the patch assumes that the convergent intrinsics are only relevant to SPIRV and AMDGPU. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D154766	2023-07-31 12:15:39 +05:30
Jim Lin	f2e44238ee	[RISCV] Clean up RISCVInstrInfoXTHead.td to look like the same style with other td file. NFC. Unify indent rule and add one blank line after comment block.	2023-07-31 14:38:21 +08:00
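
The percentages quoted in the InstCombine single-iteration commit (41895843b5) can be re-derived from the four statistics it lists. The sketch below is only an arithmetic check: the counter names and values are copied from that commit message, while the helper name `iteration_shares` and the result keys are made up for illustration and are not part of LLVM.

```python
# Iteration counters as quoted in the commit message for D154579.
stats = {
    "instcombine.NumOneIteration": 411380,
    "instcombine.NumTwoIterations": 117921,
    "instcombine.NumThreeIterations": 236,
    "instcombine.NumFourOrMoreIterations": 2,
}


def iteration_shares(counters):
    """Return the total run count and the two shares discussed in the message.

    Hypothetical helper for illustration only.
    """
    total = sum(counters.values())
    # Runs where the second iteration only confirmed the fixpoint.
    verify_only = counters["instcombine.NumTwoIterations"]
    # Runs that really needed more than one folding iteration.
    needs_more = (
        counters["instcombine.NumThreeIterations"]
        + counters["instcombine.NumFourOrMoreIterations"]
    )
    return {
        "total": total,
        "needs_more": needs_more,
        "needs_more_pct": 100 * needs_more / total,
        "verify_only_pct": 100 * verify_only / total,
    }


shares = iteration_shares(stats)
print(f"total runs:            {shares['total']}")
print(f"needed extra folding:  {shares['needs_more_pct']:.3f}%")
print(f"verification-only run: {shares['verify_only_pct']:.1f}%")
```

Under these numbers, 238 of 529539 runs needed more than one folding iteration, which is a little under 0.05% and matches the "0.04%" in the message if that figure is truncated rather than rounded; the verification-only share comes out at about 22.3%.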

469694 Commits