llvm-project

Author	SHA1	Message	Date
Amara Emerson	95ac3d15e9	[AArch64][GlobalISel] Add G_VECREDUCE fewerElements support for full scalarization. For some reductions like G_VECREDUCE_OR on AArch64, we need to scalarize completely if the source is <= 64b. This change adds support for that in the legalizer. If the source has a pow-2 num elements, then we can do a tree reduction using the scalar operation in the individual elements. Otherwise, we just create a sequential chain of operations. For AArch64, we only need to scalarize if the input is <64b. If it's great than 64b then we can first do a fewElements step to 64b, taking advantage of vector instructions until we reach the point of scalarization. I also had to relax the verifier checks for reductions because the intrinsics support <1 x EltTy> types, which we lower to scalars for GlobalISel. Differential Revision: https://reviews.llvm.org/D108276	2021-08-19 16:38:52 -07:00
Yonghong Song	c1169b8bd3	Revert "[DebugInfo] generate btf_tag annotations for DIComposite types" This reverts commit 2fded193e7a8fb5bd8fb339f00fd9de686390530. Builtbot reports some test failures. Revert now so I can take time to fix the issues.	2021-08-19 15:54:38 -07:00
Yonghong Song	2fded193e7	[DebugInfo] generate btf_tag annotations for DIComposite types Clang patch D106614 added attribute btf_tag support. This patch generates btf_tag annotations for DIComposite types. A field "annotations" is introduced to DIComposite, and the annotations are represented as an DINodeArray, similar to DIComposite elements. The following example illustrates how annotations are encoded in IR: distinct !DICompositeType(..., annotations: !10) !10 = !{!11, !12} !11 = !{!"btf_tag", !"a"} !12 = !{!"btf_tag", !"b"} Each btf_tag annotation is represented as a 2D array of meta strings. Each record may have more than one btf_tag annotations, as in the above example. Differential Revision: https://reviews.llvm.org/D106615	2021-08-19 15:37:44 -07:00
Arthur Eubanks	7c8206cd2a	[NFC] Cleanup AttributeList::getStackAlignment() So that we don't use a confusing index.	2021-08-19 14:21:40 -07:00
Alexandre Rames	cd28003336	[Support] Update `MD5` to follow other hashes. Introduce `StringRef final()` and `StringRef result()`. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D107781	2021-08-19 14:13:14 -07:00
Arthur Eubanks	44a3241f10	[NFC] Replace some attribute methods that use confusing indexes	2021-08-19 14:10:26 -07:00
Alexandre Rames	10a126325d	[NFC][Support] Move `MD5` members in `InternalState`. This prepares an update to follow other hashes. Reviewed By: dexonsmith Differential Revision: https://reviews.llvm.org/D108388	2021-08-19 14:04:14 -07:00
Adrian Prantl	1e586bcc3e	Move function definition out-of-line to fix the modularized build (NFC)	2021-08-19 12:26:23 -07:00
Simon Pilgrim	94e1442d78	Fix unknown parameter Wdocumentation warnings. NFC.	2021-08-19 17:49:32 +01:00
Simon Pilgrim	ff69c65b05	Fix empty paragraph passed to parameter Wdocumentation warning. NFC.	2021-08-19 16:48:28 +01:00
Simon Pilgrim	87c8c8ae97	Fix unknown parameter Wdocumentation warnings. NFC.	2021-08-19 15:40:10 +01:00
Sanjay Patel	ec54e275f5	Revert "[CVP] processSwitch: Remove default case when switch cover all possible values." This reverts commit 9934a5b2ed5aa6e6bbb2e55c3cd98839722c226e. This patch may cause miscompiles because it missed a constraint as shown in the examples from: https://llvm.org/PR51531	2021-08-19 08:43:51 -04:00
Sanjay Patel	eee0ded337	[InstCombine] add min/max intrinsics as freely invertible candidates In the optimized test, we are able to peak through the min/max that has 2 min/max operands and invert them all: https://alive2.llvm.org/ce/z/7gYMN5	2021-08-19 08:41:38 -04:00
Jon Chesterfield	77579b99e9	[openmp][nfc] Replace OMPGridValues array with struct [nfc] Replaces enum indices into an array with a struct. Named the fields to match the enum, leaves memory layout and initialization unchanged. Motivation is to later safely remove dead fields and replace redundant ones with (compile time) computation. It should also be possible to factor some common fields into a base and introduce a gfx10 amdgpu instance with less duplication than the arrays of integers require. Reviewed By: ronlieb Differential Revision: https://reviews.llvm.org/D108339	2021-08-19 13:25:42 +01:00
Bjorn Pettersson	36d5138619	[NewPM] Make some sanitizer passes parameterized in the PassRegistry Refactored implementation of AddressSanitizerPass and HWAddressSanitizerPass to use pass options similar to passes like MemorySanitizerPass. This makes sure that there is a single mapping from class name to pass name (needed by D108298), and options like -debug-only and -print-after makes a bit more sense when (despite that it is the unparameterized pass name that should be used in those options). A result of the above is that some pass names are removed in favor of the parameterized versions: - "khwasan" is now "hwasan<kernel;recover>" - "kasan" is now "asan<kernel>" - "kmsan" is now "msan<kernel>" Differential Revision: https://reviews.llvm.org/D105007	2021-08-19 12:43:37 +02:00
Renato Golin	894ad26bd5	Update {Small}BitVector size_type definition SmallBitVector implements a level of indirection over BitVector by storing a smaller bit-vector in a pointer-sized element, or in case the number of elements exceeds the bucket size, it creates a new pointer to a BitVector and uses that as its storage. However, the functions returning the vector size were using `unsigned`, which is ok for BitVector, but not for SmallBitVector, which is actually `uintptr_t`. This commit reuses the `size_type` definition to more than just `count` and propagates them into range iteration, size calculation, etc. This is a continuation of D108124. I haven't changed all occurrences of `unsigned` or `uintptr_t` to `size_type`, just those that were directly related. Following directions from clang-tidy on case of variables. Differential Revision: https://reviews.llvm.org/D108290	2021-08-19 11:13:38 +01:00
David Sherwood	f4122398e7	[LoopVectorize][AArch64] Enable ordered reductions by default for AArch64 I have added a new TTI interface called enableOrderedReductions() that controls whether or not ordered reductions should be enabled for a given target. By default this returns false, whereas for AArch64 it returns true and we rely upon the cost model to make sensible vectorisation choices. It is still possible to override the new TTI interface by setting the command line flag: -force-ordered-reductions=true\|false I have added a new RUN line to show that we use ordered reductions by default for SVE and Neon: Transforms/LoopVectorize/AArch64/strict-fadd.ll Transforms/LoopVectorize/AArch64/scalable-strict-fadd.ll Differential Revision: https://reviews.llvm.org/D106653	2021-08-19 09:29:40 +01:00
Wenlei He	eca03d2768	[CSSPGO] Track and use context-sensitive post-optimization function size to drive global pre-inliner in llvm-profgen This change enables llvm-profgen to use accurate context-sensitive post-optimization function byte size as a cost proxy to drive global preinline decisions. To do this, BinarySizeContextTracker is introduced to track function byte size under different inline context during disassembling. In preinliner, we can not query context byte size under switch `context-cost-for-preinliner`. The tracker uses a reverse trie to keep size of functions under different context (callee as parent, caller as child), and it can give best/longest possible matching context size for given input context. The new size cost is off by default. There're a few TODOs that needs to addressed: 1) avoid dangling string from `Offset2LocStackMap`, which will be addressed in split context work; 2) using inlinee's entry probe to make sure we have correct zero size for inlinee that's completely optimized away after inlining. Some tuning is also needed. Differential Revision: https://reviews.llvm.org/D108180	2021-08-18 22:50:57 -07:00
luxufan	a9095f005f	[JITLink] Optimize GOTPCRELX Relocations This patch optimize the GOTPCRELX Reloations, which is described in X86-64 psabi chapter B.2. And Not all optimization of this chapter is implemented. 1. Convert call and jmp has been implemented 2. Convert mov, but the optimization that when the symbol is defined in the lower 32-bit address space, memory operand in `mov` can be convertted into immediate operand has not been implemented. 3. Conver Test and Binop has not been implemented. The new test file named ELF_got_plt_optimizations.s has been added, and I moved some test cases about optimization of got/plt from ELF_x86_64_small_pic_relocations.s to the new test file. By referencing the lld, so, the optimization `Convert call and jmp` is not same as what psabi says, and I have explained it in the comment. Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D108280	2021-08-19 10:30:22 +08:00
Lang Hames	8a36750236	[ORC] Handle void and no-argument async wrapper calls.	2021-08-19 12:20:31 +10:00
Rong Xu	5fdaaf7fd8	[SampleFDO] Flow Sensitive Sample FDO (FSAFDO) profile loader This patch implements Flow Sensitive Sample FDO (FSAFDO) profile loader. We have two profile loaders for FS profile, one before RegAlloc and one before BlockPlacement. To enable it, when -fprofile-sample-use=<profile> is specified, add "-enable-fs-discriminator=true \ -disable-ra-fsprofile-loader=false \ -disable-layout-fsprofile-loader=false" to turn on the FS profile loaders. Differential Revision: https://reviews.llvm.org/D107878	2021-08-18 18:37:35 -07:00
Rafael Auler	d8bbfe8a48	[DWARF] Expose raw bytes in DWARFExpression This information is necessary for clients of DebugInfo that do not want to process a DWARF expression, but just treat it as a blob of data. In BOLT, for example, we need to read these expressions in CFIs and write them back to the binary, unchanged, so having access to the original expression encoding is a shortcut to avoid the need to re-encode the entire expression when re-writing exception handling info (CFIs). This patch is an alternative to https://reviews.llvm.org/D98301, in which we implement the support to re-encode these expressions. But since we don't really need to change anything in these expressions, we can just copy their bytes. Reviewed By: aprantl Differential Revision: https://reviews.llvm.org/D107515	2021-08-18 14:41:20 -07:00
Jon Chesterfield	21d91a8ef3	[libomptarget][devicertl] Replace lanemask with uint64 at interface Use uint64_t for lanemask on all GPU architectures at the interface with clang. Updates tests. The deviceRTL is always linked as IR so the zext and trunc introduced for wave32 architectures will fold after inlining. Simplification partly motivated by amdgpu gfx10 which will be wave32 and is awkward to express in the current arch-dependant typedef interface. Reviewed By: jdoerfert Differential Revision: https://reviews.llvm.org/D108317	2021-08-18 20:47:33 +01:00
Joe Nash	9dbc968ed9	[AMDGPU] Fix atomic float max/min intrinsics Hooked up raw.buffer.atomic.fmin/max.f64 This instruction should be available on GFX6, GFX7, and GFX10. It was implemented for GFX90a with a different name. Added intrinsic def for image_atomic_fmin/fmax; the instruction defs were already there. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D108208 Change-Id: I473f98d28b2afbeeb2c27822d9686b5e86634e2f	2021-08-18 14:12:42 -04:00
Nikita Popov	3dd8c9176b	[LICM] Remove AST-based implementation MSSA-based LICM has been enabled by default for a few years now. This drops the old AST-based implementation. Using loop(licm) will result in a fatal error, the use of loop-mssa(licm) is required (or just licm, which defaults to loop-mssa). Note that the core canSinkOrHoistInst() logic has to retain AST support for now, because it is shared with LoopSink. Differential Revision: https://reviews.llvm.org/D108244	2021-08-18 20:21:53 +02:00
Arthur Eubanks	2fc075948c	[NFC] Remove some unnecessary AttributeList methods These rely on methods I'm trying to cleanup.	2021-08-18 11:15:20 -07:00
Jessica Paquette	791006fb8c	[GlobalISel] Implement lowering for G_ISNAN + use it in AArch64 GlobalISel equivalent to `TargetLowering::expandISNAN`. Use it in AArch64 and add a testcase. Differential Revision: https://reviews.llvm.org/D108227	2021-08-18 10:54:25 -07:00
Jessica Paquette	d9873711cb	[GlobalISel] Add IRTranslator support for G_ISNAN Translate the `@llvm.isnan` intrinsic to G_ISNAN when we see it. This is pretty much the same as the associated SelectionDAGBuilder code. Main difference is that we don't expand it here. It makes more sense to do that during legalization in GlobalISel. GlobalISel will just legalize the generated illegal types. Differential Revision: https://reviews.llvm.org/D108226	2021-08-18 10:48:10 -07:00
Jessica Paquette	0a2b1ba33a	[GlobalISel] Add G_ISNAN Add a generic opcode equivalent to the `llvm.isnan` intrinsic + MachineVerifier support for it. We need an opcode here because we may want target-specific lowering later on. Differential Revision: https://reviews.llvm.org/D108222	2021-08-18 10:42:05 -07:00
Arthur Eubanks	7557d6c896	[NFC] Cleanup calls to CallBase::getAttribute()	2021-08-18 09:39:33 -07:00
Lang Hames	45ac5f5441	Revert "[ORC-RT][ORC] Introduce ELF/*nix Platform and runtime support." This reverts commit e256445bfff12013c3c4ad97da4aa69d25b175b5. This commit broke some of the bots (see e.g. https://lab.llvm.org/buildbot/#/builders/112/builds/8599). Reverting while I investigate.	2021-08-18 20:42:23 +10:00
Lang Hames	626a84e3b3	[ORC] Remove unused headers.	2021-08-18 18:19:05 +10:00
Petr Hosek	2d4470ab89	Revert "Allow rematerialization of virtual reg uses" This reverts commit 877572cc193a470f310eec46a7ce793a6cc97c2f which introduced PR51516.	2021-08-18 00:12:41 -07:00
Lang Hames	e256445bff	[ORC-RT][ORC] Introduce ELF/*nix Platform and runtime support. This change adds support to ORCv2 and the Orc runtime library for static initializers, C++ static destructors, and exception handler registration for ELF-based platforms, at present Linux and FreeBSD on x86_64. It is based on the MachO platform and runtime support introduced in bb5f97e3ad1. Patch by Peter Housel. Thanks very much Peter! Reviewed By: lhames Differential Revision: https://reviews.llvm.org/D108081	2021-08-18 15:00:22 +10:00
Arthur Eubanks	3f4d00bc3b	[NFC] More get/removeAttribute() cleanup	2021-08-17 21:05:41 -07:00
Arthur Eubanks	de0ae9e89e	[NFC] Cleanup more AttributeList::addAttribute()	2021-08-17 21:05:41 -07:00
Arthur Eubanks	cc327bd523	[NFC] Cleanup attribute methods in Function	2021-08-17 21:05:40 -07:00
Arthur Eubanks	ad727ab7d9	[NFC] Migrate some callers away from Function/AttributeLists methods that take an index These methods can be confusing.	2021-08-17 21:05:40 -07:00
Arthur Eubanks	46cf82532c	[NFC] Replace Function handling of attributes with less confusing calls To avoid magic constants and confusing indexes.	2021-08-17 21:05:40 -07:00
Jun Ma	9934a5b2ed	[CVP] processSwitch: Remove default case when switch cover all possible values. Differential Revision: https://reviews.llvm.org/D106056	2021-08-18 10:23:13 +08:00
Qiu Chaofan	5ca250a03d	[RegAlloc] Remove addAllocPriorityToGlobalRanges hook It was introduced in 1a6dc92 and only enabled on PowerPC/AMDGPU. That should be enabled for all targets. Reviewed By: arsenm Differential Revision: https://reviews.llvm.org/D108010	2021-08-18 10:21:27 +08:00
Wang, Pengfei	2379949aad	[X86] AVX512FP16 instructions enabling 3/6 Enable FP16 conversion instructions. Ref.: https://software.intel.com/content/www/us/en/develop/download/intel-avx512-fp16-architecture-specification.html Reviewed By: LuoYuanke Differential Revision: https://reviews.llvm.org/D105265	2021-08-18 09:03:41 +08:00
Mark Danial	4018d25da8	LoopNest Analysis expansion to return instructions that prevent a Loop Nest from being perfect Expand LoopNestAnalysis to return the full list of instructions that cause a loop nest to be imperfect. This is useful for other passes to know if they should continue for in the inner loops. Added New function getInterveningInstructions that returns a small vector with the instructions that prevent a loop for being perfect. Also added a couple of helper functions to reduce code duplication. Reviewed By: Whitney Differential Revision: https://reviews.llvm.org/D107773	2021-08-17 22:25:49 +00:00
Mikhail Borisov	f0fcd42495	[libc++abi] Fix possible infinite loop in itanium demangler A libfuzzer run has discovered some inputs for which the demangler does not terminate. When minimized, it looks like this: _Zcv1BIRT_EIS1_E Deciphered: _Z cv - conversion operator * result type 1B - "B" I - template args begin R - reference type <. T_ - forward template reference \| * E - template args end \| \| \| \| * parameter type \| \| I - template args begin \| \| S1_ - substitution #1 * <' E - template args end The reason is: template-parameter refs in conversion operator result type create forward-references, while substitutions are instantly resolved via back-references. Together these can create a reference loop. It causes an infinite loop in ReferenceType::collapse(). I see three possible ways to avoid these loops: 1. check if resolving a forward reference creates a loop and reject the invalid input (hard to traverse AST at this point) 2. check if a substitution contains a malicious forward reference and reject the invalid input (hard to traverse AST at this point; substitutions are quite common: may affect performance; hard to clearly detect loops at this point) 3. detect loops in ReferenceType::collapse() (cannot reject the input) This patch implements (3) as seemingly the least-impact change. As a side effect, such invalid input strings are not rejected and produce garbage, however there are already similar guards in `if (Printing) return;` checks. Fixes https://llvm.org/PR51407 Differential Revision: https://reviews.llvm.org/D107712	2021-08-17 18:13:26 -04:00
Mikhail Borisov	db7c68d808	[libc++abi][NFC] Move PODSmallVector definition to the top of ItaniumDemangle.h This change is needed to clean the non-relevant parts of diff from https://reviews.llvm.org/D107712 Differential Revision: https://reviews.llvm.org/D107771	2021-08-17 18:07:30 -04:00
Fraser Cormack	f3e9047249	[VP] Add vector-predicated reduction intrinsics This patch adds vector-predicated ("VP") reduction intrinsics corresponding to each of the existing unpredicated `llvm.vector.reduce.*` versions. Unlike the unpredicated reductions, all VP reductions have a start value. This start value is returned when the no vector element is active. Support for expansion on targets without native vector-predication support is included. This patch is based on the ["reduction slice"](https://reviews.llvm.org/D57504#1732277) of the LLVM-VP reference patch (https://reviews.llvm.org/D57504). Reviewed By: craig.topper Differential Revision: https://reviews.llvm.org/D104308	2021-08-17 17:56:35 +01:00
Philip Reames	982da7a20c	[SCEVExpander] Stop hoisting IR when reusing phis his is a fix for PR43678, and is an alternate patch to D105723. The basic issue we're running into is that LSR + SCEVExpander are moving the very instruction whose operand we're in the process of expanding. This breaks the subtle and ill-documented invariant which let LSR work. (Full story can be found here: https://reviews.llvm.org/D105723#2878473) Rather than attempting a fix, this change just removes the optimization entirely. The code is entirely untested, and removing it appears to have no impact I can find. This code was added back in 2014 by 1e12f8563d4b7 with a single test which does not seem to actually test the hoisting logic. From a philosophical standpoint, it also seems very strange to have the expander implementing optimizations which should live in a dedicated transform pass. Differential Revision: https://reviews.llvm.org/D106178	2021-08-17 09:38:32 -07:00
Fangrui Song	78cb1adc5c	[Object] Move llvm-nm's symbol version utility to ELFObjectFile::readDynsymVersions The utility can be reused by llvm-objdump -T. Reviewed By: jhenderson Differential Revision: https://reviews.llvm.org/D108096	2021-08-17 09:06:39 -07:00
Sebastian Neubauer	fbae34635d	[GlobalISel] Add combine for PTR_ADD with regbanks Combine two G_PTR_ADDs, but keep the register bank of the constant. That way, the combine can be used in post-regbank-select combines. Introduce two helper methods in CombinerHelper, getRegBank and setRegBank that get and set an optional register bank to a register. That way, they can be used before and after register bank selection. Differential Revision: https://reviews.llvm.org/D103326	2021-08-17 13:58:16 +02:00
Florian Mayer	0deedaa23f	[hwasan] Prevent reordering of tag checks. They were previously unconstrained, which allowed them to be reordered before the shadow memory write. Reviewed By: eugenis Differential Revision: https://reviews.llvm.org/D107901	2021-08-17 10:21:23 +01:00

1 2 3 4 5 ...

45822 Commits