25772 Commits

Author SHA1 Message Date
Matt Arsenault
91be65be65 GlobalISel: Try to make legalize rules more useful for vectors
Mostly keep the existing functions on scalars, but add versions which
also operate based on the vector element size.

llvm-svn: 353430
2019-02-07 17:25:51 +00:00
Nirav Dave
24e60819f6 [DAG] Cleanup of unused node in SimplifySelectCC.
llvm-svn: 353428
2019-02-07 17:13:55 +00:00
Nirav Dave
4b12236f7d [DAG] Cleanup unused node on failed SELECT Combine.
llvm-svn: 353426
2019-02-07 16:57:50 +00:00
Nirav Dave
724b81087d [DAG] Cleanup unused nodes on failed store-to-load forward combine.
llvm-svn: 353416
2019-02-07 15:38:14 +00:00
Craig Topper
428c14d1db [BranchFolding] Remove dead code for handling EHPad blocks
Summary: This code tries to handle the case where IBB is an EHPad, but there's an earlier check that uses PBB->hasEHPadSuccessor(). Where PBB is a predecessor of IBB. The hasEHPadSuccessor function would have visited IBB and seen that it was an EHPad and returned false. This would prevent us from reaching this code with IBB as an EHPad.

Looks like this code was originally added in rL37427 (ancient) and made dead in rL143001.

Reviewers: rnk, void, efriedma

Reviewed By: rnk

Subscribers: llvm-commits

Differential Revision: https://reviews.llvm.org/D57358

llvm-svn: 353375
2019-02-07 06:21:28 +00:00
Sam Clegg
1e71b04af6 Remove reference to non-existent function. NFC.
This comment is old. The code in question was removed in rL203174

Differential Revision: https://reviews.llvm.org/D57856

llvm-svn: 353352
2019-02-07 00:11:43 +00:00
Nirav Dave
b3506bf985 [DAG] Immediately cleanup unused nodes from extend-based combines.
llvm-svn: 353338
2019-02-06 20:12:03 +00:00
Michael Berg
f0d81a31b6 Move IR flag handling directly into builder calls for cases translated from Instructions in GlobalIsel
Reviewers: aditya_nandakumar, volkan

Reviewed By: aditya_nandakumar

Subscribers: rovka, kristof.beyls, volkan, Petar.Avramovic

Differential Revision: https://reviews.llvm.org/D57630

llvm-svn: 353336
2019-02-06 19:57:06 +00:00
Bjorn Pettersson
350352c8a5 [SelectionDAG] Cleanup some code comments. NFC
Don't repeat the function name in some doxygen
comments.

(Just a minor cleanup, while testing to push
from the git monorepo setup.)

llvm-svn: 353317
2019-02-06 17:36:18 +00:00
Jessica Paquette
e288c526f1 [GlobalISel][NFC] Gardening: Factor out code for simple unary intrinsics
There was a lot of repeated code wrt unary math intrinsics in
translateKnownIntrinsic. This factors out the repeated MIRBuilder code into
two functions: translateSimpleUnaryIntrinsic and getSimpleUnaryIntrinsicOpcode.

This simplifies adding simple unary intrinsics, since after this, all you have
to do is add the mapping to SimpleUnaryIntrinsicOpcodes.

Differential Revision: https://reviews.llvm.org/D57774

llvm-svn: 353316
2019-02-06 17:25:54 +00:00
Nirav Dave
e5c37958f9 [InlineAsm][X86] Add backend support for X86 flag output parameters.
Allow custom handling of inline assembly output parameters and add X86
flag parameter support.

llvm-svn: 353307
2019-02-06 15:26:29 +00:00
Nirav Dave
54511076d4 [SelectionDAGBuilder] Refactor Inline Asm output check. NFCI.
llvm-svn: 353305
2019-02-06 15:12:46 +00:00
Clement Courbet
5a6712b633 [DAGCombine][NFC] GatherAllAliases should take a LSBaseSDNode.
GatherAllAliases only makes sense for LSBaseSDNode. Enforce it with
static typing instead of runtime cast.

llvm-svn: 353291
2019-02-06 12:36:17 +00:00
Matt Arsenault
a3d0c5adaf GlobalISel: Verify G_GEP
llvm-svn: 353209
2019-02-05 20:04:12 +00:00
Alexey Bataev
f3a9150324 [DEBUG_INFO][NVPTX] Generate DW_AT_address_class to get the values in debugger.
Summary:
According to
https://docs.nvidia.com/cuda/archive/10.0/ptx-writers-guide-to-interoperability/index.html#cuda-specific-dwarf,
the compiler should emit the DW_AT_address_class attribute for all
variable and parameter. It means, that DW_AT_address_class attribute
should be used in the non-standard way to support compatibility with the
cuda-gdb debugger.
Clang is able to generate the information about the variable address
class. This information is emitted as the expression sequence
`DW_OP_constu <DWARF Address Space> DW_OP_swap DW_OP_xderef`. The patch
tries to find all such expressions and transform them into
`DW_AT_address_class <DWARF Address Space>` if target is NVPTX and the debugger is gdb.
If the expression is not found, then default values are used. For the
local variables <DWARF Address Space> is set to ADDR_local_space(6), for
the globals <DWARF Address Space> is set to ADDR_global_space(5). The
values are taken from the table in the same section 5.2. CUDA-Specific
DWARF Definitions.

Reviewers: echristo, probinson

Subscribers: jholewinski, aprantl, llvm-commits

Differential Revision: https://reviews.llvm.org/D57157

llvm-svn: 353203
2019-02-05 19:33:47 +00:00
Florian Hahn
3b251963c3 [CGP] Add support for sinking operands to their users, if they are free.
This patch improves code generation for some AArch64 ACLE intrinsics. It adds
support to CGP to duplicate and sink operands to their user, if they can be
folded into a target instruction, like zexts and sub into usubl. It adds a
TargetLowering hook shouldSinkOperands, which looks at the operands of
instructions to see if sinking is profitable.

I decided to add a new target hook, as for the sinking to be profitable,
at least on AArch64, we have to look at multiple operands of an
instruction, instead of looking at the users of a zext for example.

The sinking is done in CGP, because it works around an instruction
selection limitation. If instruction selection is not limited to a
single basic block, this patch should not be needed any longer.

Alternatively this could be done in the LoopSink pass, which tries to
undo LICM for instructions in blocks that are not executed frequently.

Note that we do not force the operands to sink to have a single user,
because we duplicate them before sinking. Therefore this is only
desirable if they really can be done for free. Additionally we could
consider the impact on live ranges later on.

This should fix https://bugs.llvm.org/show_bug.cgi?id=40025.

As for performance, we have internal code that uses intrinsics and can
be speed up by 10% by this change.

Reviewers: SjoerdMeijer, t.p.northover, samparker, efriedma, RKSimon, spatel

Reviewed By: samparker

Differential Revision: https://reviews.llvm.org/D57377

llvm-svn: 353152
2019-02-05 10:27:40 +00:00
Clement Courbet
17b51b655e [DAG] BaseIndexOffset: FrameIndexSDNodes with the same FrameIndex compare equal.
Reviewers: niravd

Subscribers: arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57692

llvm-svn: 353143
2019-02-05 07:36:20 +00:00
Matt Arsenault
2bf74ec8c5 GlobalISel: Fix verifier crashing on non-register operands
Also correct the wording of error on subregisters.

llvm-svn: 353128
2019-02-05 00:53:22 +00:00
Matt Arsenault
7f09fd6b04 GlobalISel: Consolidate load/store legalization
The fewerElementsVectors implementation for load/stores
handles the scalar reduction case just as well, so drop
the redundant code in narrowScalar. This also introduces
support for narrowing irregular size breakdowns for
scalars.

llvm-svn: 353125
2019-02-05 00:26:12 +00:00
Craig Topper
d4e37afe45 [DAGCombiner] Discard pointer info when combining extract_vector_elt of a vector load when the index isn't constant
Summary:
If the index isn't constant, this transform inserts a multiply and an add on the index to calculating the base pointer for a scalar load. But we still create a memory operand with an offset of 0 and the size of the scalar access. But the access is really to an unknown offset within the original access size.

This can cause the machine scheduler to incorrectly calculate dependencies between this load and other accesses. In the case we saw, there was a 32 byte vector store that was split into two 16 byte stores, one with offset 0 and one with offset 16. The size of the memory operand for both was 16. The scheduler correctly detected the alias with the offset 0 store, but not the offset 16 store.

This patch discards the pointer info so we don't incorrectly detect aliasing. I wasn't sure if we could keep using the original offset and size without risking some other transform on the load changing the size.

I tried to reduce a test case, but there's still a lot of memory operations needed to get the scheduler to do the bad reordering. So it looked pretty fragile to maintain.

Reviewers: efriedma

Reviewed By: efriedma

Subscribers: arphaman, llvm-commits

Tags: #llvm

Differential Revision: https://reviews.llvm.org/D57616

llvm-svn: 353124
2019-02-05 00:22:23 +00:00
Matt Arsenault
81511e5428 GlobalISel: Implement narrowScalar for select
Don't handle vector conditions.

I think this can be merged in the future with
fewerElementsVectorSelect, although this becomes slightly tricky with
a vector condition.

llvm-svn: 353122
2019-02-05 00:13:44 +00:00
Matt Arsenault
24f14993e8 GlobalISel: Combine g_extract with g_merge_values
Try to use the underlying source registers.

This enables legalization in more cases where some irregular
operations are widened and others narrowed.

This seems to make the test_combines_2 AArch64 test worse, since the
MERGE_VALUES has multiple uses. Since this should be required for
legalization, a hasOneUse check is probably inappropriate (or maybe
should only be used if the merge is legal?).

llvm-svn: 353121
2019-02-04 23:41:59 +00:00
Matt Arsenault
1f795e2c2a GlobalISel: Enforce operand types for constants
A number of of tests were using imm operands, not cimm. Since CSE
relies on the exact ConstantInt* pointer used, and implicit
conversions are generally evil, also enforce the bitsize of the types.

llvm-svn: 353113
2019-02-04 23:29:31 +00:00
Matt Arsenault
f2a26339e2 GlobalISel: Verify g_select
Factor the common vector element consistency check many instructions
need out, although this makes the error messages worse.

llvm-svn: 353112
2019-02-04 23:29:16 +00:00
Matt Arsenault
46f9c6cf0b MachineVerifier: Move verification of G_* instructions to function
llvm-svn: 353111
2019-02-04 23:29:11 +00:00
Sam Clegg
313f9f54f5 [WebAssembly] MC: Mark more function aliases as functions
Aliases of functions are now marked as function symbols even if
they are bitcast to some other other non-function type.
This is important for WebAssembly where object and function
symbols can't alias each other.

Fixes PR38866

Differential Revision: https://reviews.llvm.org/D57538

llvm-svn: 353109
2019-02-04 23:07:34 +00:00
Matt Arsenault
8a59b1919c MIR: Validate LLT types when parsing
llvm-svn: 353107
2019-02-04 22:59:56 +00:00
Matt Arsenault
3d6a49b0b9 GlobalISel: Fix not calling observer when legalizing bitcount ops
This was hiding bugs from never legalizing the source type.

llvm-svn: 353102
2019-02-04 22:26:33 +00:00
Wolfgang Pieb
90d856cd5f [DEBUGINFO] Reposting r352642: Handle restore instructions in LiveDebugValues
The LiveDebugValues pass recognizes spills but not restores, which can
cause large gaps in location information for some variables, depending
on control flow. This patch make LiveDebugValues recognize restores and
generate appropriate DBG_VALUE instructions.

This patch was posted previously with r352642 and reverted in r352666 due
to buildbot errors. A missing return statement was the cause for the 
failures.

Reviewers: aprantl, NicolaPrica

Differential Revision: https://reviews.llvm.org/D57271

llvm-svn: 353089
2019-02-04 20:42:45 +00:00
Matt Arsenault
8121ec26c0 GlobalISel: Fix CSE handling of buildConstant
This fixes two problems with CSE done in buildConstant. First, this
would hit an assert when used with a vector result type. Solve this by
allowing CSE on the vector elements, but not on the result vector for
now.

Second, this was also performing the CSE based on the input
ConstantInt pointer. The underlying buildConstant could potentially
convert the constant depending on the result type, giving in a
different ConstantInt*. Stop allowing the APInt and ConstantInt forms
from automatically casting to the result type to avoid any similar
problems in the future.

llvm-svn: 353077
2019-02-04 19:15:50 +00:00
Matt Arsenault
0723828675 GlobalISel: Fix moreElementsToNextPow2
This was completely broken. The condition was inverted, and changed
the element type for vectors of pointers.

Fixes bug 40592.

llvm-svn: 353069
2019-02-04 18:42:24 +00:00
Jessica Paquette
834bded9d6 Revert "[GlobalISel] Add IRTranslator support for G_FFLOOR"
This reverts commit 8bbd570fd5205a04d88d2e5513a6e4adbd028039.

Apparently adding ffloor breaks AMDGPU somehow, so I need to back this out
while I look into it.

llvm-svn: 353064
2019-02-04 17:32:43 +00:00
Leonard Chan
68d428e578 [Intrinsic] Unsigned Fixed Point Multiplication Intrinsic
Add an intrinsic that takes 2 unsigned integers with the scale of them
provided as the third argument and performs fixed point multiplication on
them.

This is a part of implementing fixed point arithmetic in clang where some of
the more complex operations will be implemented as intrinsics.

Differential Revision: https://reviews.llvm.org/D55625

llvm-svn: 353059
2019-02-04 17:18:11 +00:00
Jessica Paquette
73158e7201 [GlobalISel] Add IRTranslator support for G_FFLOOR
Follow-up to https://reviews.llvm.org/D57484

Adds G_FFLOOR to translateKnownIntrinsic and update arm64-irtranslator.ll.

Differential Revision: https://reviews.llvm.org/D57485

llvm-svn: 353058
2019-02-04 17:15:34 +00:00
Sanjay Patel
c00bdab4c8 [CGP] use IRBuilder to simplify code
This is no-functional-change-intended although there could
be intermediate variations caused by a difference in the
debug info produced by setting that from the builder's 
insertion point. 

I'm updating the IR test file associated with this code just
to show that the naming differences from using the builder
are visible.

The motivation for adding a helper function is that we are
likely to extend this code to deal with other overflow ops.

llvm-svn: 353056
2019-02-04 16:30:46 +00:00
Matt Arsenault
56edf3f344 GlobalISel: Fix formatting of debug output
There was a missing space before the instruction name, and the newline
is redundant since MI::print by default adds one.

llvm-svn: 353046
2019-02-04 14:05:33 +00:00
Simon Pilgrim
a536b89fe0 [DAGCombine] Add ADD(SUB,SUB) combines
Noticed while investigating PR40483, and fixes the basic test case from the bug - but not a more general case.

We're pretty weak at dealing with ADD/SUB combines compared to the SimplifyAssociativeOrCommutative/SimplifyUsingDistributiveLaws abilities that InstCombine can manage.

llvm-svn: 353044
2019-02-04 13:44:49 +00:00
Andrea Di Biagio
edbf06a767 [AsmPrinter] Remove hidden flag -print-schedule.
This patch removes hidden codegen flag -print-schedule effectively reverting the
logic originally committed as r300311
(https://llvm.org/viewvc/llvm-project?view=revision&revision=300311).

Flag -print-schedule was originally introduced by r300311 to address PR32216
(https://bugs.llvm.org/show_bug.cgi?id=32216). That bug was about adding "Better
testing of schedule model instruction latencies/throughputs".

These days, we can use llvm-mca to test scheduling models. So there is no longer
a need for flag -print-schedule in LLVM. The main use case for PR32216 is
now addressed by llvm-mca.
Flag -print-schedule is mainly used for debugging purposes, and it is only
actually used by x86 specific tests. We already have extensive (latency and
throughput) tests under "test/tools/llvm-mca" for X86 processor models. That
means, most (if not all) existing -print-schedule tests for X86 are redundant.

When flag -print-schedule was first added to LLVM, several files had to be
modified; a few APIs gained new arguments (see for example method
MCAsmStreamer::EmitInstruction), and MCSubtargetInfo/TargetSubtargetInfo gained
a couple of getSchedInfoStr() methods.

Method getSchedInfoStr() had to originally work for both MCInst and
MachineInstr. The original implmentation of getSchedInfoStr() introduced a
subtle layering violation (reported as PR37160 and then fixed/worked-around by
r330615).
In retrospect, that new API could have been designed more optimally. We can
always query MCSchedModel to get the latency and throughput. More importantly,
the "sched-info" string should not have been generated by the subtarget.
Note, r317782 fixed an issue where "print-schedule" didn't work very well in the
presence of inline assembly. That commit is also reverted by this change.

Differential Revision: https://reviews.llvm.org/D57244

llvm-svn: 353043
2019-02-04 12:51:26 +00:00
Clement Courbet
1bb0e5ccfb [SelectionDAG] Add a BaseIndexOffset::print() method for debugging.
llvm-svn: 353028
2019-02-04 09:30:43 +00:00
Sanjay Patel
84ceae6048 [CGP] adjust target constraints for forming uaddo
There are 2 changes visible here:
1. There's no reason to limit this transform based on number
   of condition registers. That diff allows PPC to produce 
   slightly better (dot-instructions should be generally good) 
   code.
   Note: someone that cares about PPC codegen might want to 
   look closer at that output because it seems like we could
   still improve this.

2. We (probably?) should not bother trying to form uaddo (or
   other overflow ops) when there's no target support for such
   an op. This goes beyond checking whether the op is expanded
   because both PPC and AArch64 show better codegen for standard
   types regardless of whether the op is legal/custom.

llvm-svn: 353001
2019-02-03 17:53:09 +00:00
Sanjay Patel
00fcc74e50 [CGP] refactor optimizeCmpExpression (NFCI)
This is not truly NFC because we are bailing out without
a TLI now. That should not be a real concern though because
there should be a TLI in any real-world scenario. 

That seems better than passing around a pointer and then 
checking it for null-ness all over the place.

The motivation is to fix what appears to be an unintended
restriction on the uaddo transform - 
hasMultipleConditionRegisters() shouldn't be reason to limit 
the transform.

llvm-svn: 352988
2019-02-03 13:48:03 +00:00
Matt Arsenault
888aa5dedd GlobalISel: Implement widenScalar for G_UNMERGE_VALUES
For the scalar case only.

Also move the similar G_MERGE_VALUES handling to a separate function
and cleanup to make them look more similar.

llvm-svn: 352979
2019-02-03 00:07:33 +00:00
Matt Arsenault
0e5d856eb8 GlobalISel: Implement widenScalar for G_EXTRACT vector sources
Handle the basic element extract case.

llvm-svn: 352978
2019-02-02 23:56:00 +00:00
Matt Arsenault
cbaada6bc1 GlobalISel: Legalization for inttoptr/ptrtoint
llvm-svn: 352973
2019-02-02 23:29:55 +00:00
Simon Pilgrim
bd42f97946 [SDAG] Add SDNode/SDValue getConstantOperandAPInt helper. NFCI.
We already have the getConstantOperandVal helper which returns a uint64_t, but along comes the fuzzer and inserts a i128 -1 constant or something and the whole thing asserts.......

I've updated a few obvious cases, and tried to make use of the const reference where possible, but there's more to do. A number of existing oss-fuzz tickets should be fixed if we start using APInt and perform value clamping where necessary.

llvm-svn: 352961
2019-02-02 17:35:06 +00:00
Philip Reames
00056ed0e6 [CodeGen] Be as conservative about atomic accesses as for volatile
Background: At the moment, we record the AtomicOrdering of an access in the MMO, but also mark any atomic access as volatile in SelectionDAG. I'm working towards separating that. See https://reviews.llvm.org/D57601 for context.

Update all usages of isVolatile in lib/CodeGen to preserve behaviour once atomic MMOs stop being also volatile. This is NFC in it's current form, but is essential for correctness once we make that final change.

It useful to keep in mind that AtomicSDNode is not a parent of LoadSDNode, StoreSDNode, or LSBaseSDNode. As a result, any call to isVolatile on one of those static types doesn't need a companion isAtomic check.  We should probably adjust that class hierarchy long term, but for now, that seperation is useful.

I'm deliberately being conservative about handling. I want the change to stop adding volatile to be NFC itself, and then will work through places where we can be less conservative for atomics one by one in separate changes w/tests.

Differential Revision: https://reviews.llvm.org/D57596

llvm-svn: 352937
2019-02-01 22:58:52 +00:00
Mandeep Singh Grang
70d484d94e [COFF, ARM64] Fix localaddress to handle stack realignment and variable size objects
Summary: This fixes using the correct stack registers for SEH when stack realignment is needed or when variable size objects are present.

Reviewers: rnk, efriedma, ssijaric, TomTan

Reviewed By: rnk, efriedma

Subscribers: javed.absar, kristof.beyls, llvm-commits

Differential Revision: https://reviews.llvm.org/D57183

llvm-svn: 352923
2019-02-01 21:41:33 +00:00
James Y Knight
7716075a17 [opaque pointer types] Pass value type to GetElementPtr creation.
This cleans up all GetElementPtr creation in LLVM to explicitly pass a
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57173

llvm-svn: 352913
2019-02-01 20:44:47 +00:00
James Y Knight
14359ef1b6 [opaque pointer types] Pass value type to LoadInst creation.
This cleans up all LoadInst creation in LLVM to explicitly pass the
value type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57172

llvm-svn: 352911
2019-02-01 20:44:24 +00:00
James Y Knight
7976eb5838 [opaque pointer types] Pass function types to CallInst creation.
This cleans up all CallInst creation in LLVM to explicitly pass a
function type rather than deriving it from the pointer's element-type.

Differential Revision: https://reviews.llvm.org/D57170

llvm-svn: 352909
2019-02-01 20:43:25 +00:00