433849 Commits

Author SHA1 Message Date
David Green
e29f9f7572 [AArch64][X86] Add some fixed-order-recurrence tests to check the costmodel of fixed order recurrences. NFC 2022-08-24 08:18:01 +01:00
David Green
8dc1eee77d [AArch64][SVE] Remove -O1 from SVE intrinsic tests.
This removes -O1 from the SVE ACLE intrinsics tests and replaces it with
-O0 and "opt -mem2reg -instcombine -tailcallelim". Instrcombine and
TailCallElim are only added to keep the differences smaller and can be
removed in a followup patches. The only remaining differences in the
tests are tbaa nodes not being emitted under -O0, and the removable of
some tailcall flags.
2022-08-24 08:18:01 +01:00
Adrian Kuegel
23b3bcc7a6 [mlir][Bazel] Fix bazel build.
To avoid a dependency cycle, add BytecodeImplementation.h header to the
"IR" target.
2022-08-24 08:51:44 +02:00
Mahesh Ravishankar
10841fca9e Fix warning from a7bfdc23ab3ade54da99f0f59dababe4d71ae75b 2022-08-24 06:39:19 +00:00
Alex
07a700f814 [RISCV] Add zihintntl compressed instructions
Add zihintntl compressed instructions and some files related to zihintntl.
This patch is base on {D121670}.

Reviewed By: kito-cheng

Differential Revision: https://reviews.llvm.org/D121779
2022-08-24 14:29:02 +08:00
Paweł Bylica
72faccc3f0
[DAGCombine] Add more tests for cmp to sbb combination; NFC
Add 2 more tests for potential DAG combine of cmp into sbb.

Differential Revision: https://reviews.llvm.org/D132463
2022-08-24 08:27:58 +02:00
Mahesh Ravishankar
a7bfdc23ab [mlir][Linalg] Handle multi-result operations in Elementwise op fusion.
This drops the artificial requirement of producers having a single
result value to be able to fuse with consumers.

The current default also only fuses producer with consumer when the
producer has a single use. This is a simplifying assumption. There are
legitimate use cases where a producer can be fused with consumer and
the fused o pcould be used to replace the uses of the producer as
well. This needs to be done with care to avoid use-def violations. To
allow for downstream users to explore more fusion opportunities, the
core transformation method is exposed as a utility function.

This patch also modifies the control function to take just the fused
operand as the argument. This is enough information for the callers to
get the producer and the consumer operations being considered to
fuse. It also provides information of which producer result is used.

Differential Revision: https://reviews.llvm.org/D132301
2022-08-24 05:57:30 +00:00
esmeyi
dfe55cc1cd [AIX] use the original name as the input to create the new symbol for TLS symbol.
Summary: Currently, an error was reported when a thread local symbol has an invalid name. D100956 create a new symbol to prefix the TLS symbol name with a dot. When the symbol name is renamed, the error occurs. This patch uses the original symbol name (name in the symbol table) as the input for the symbol for TOC entry.

Reviewed By: shchenz, lkail

Differential Revision: https://reviews.llvm.org/D132348
2022-08-24 01:36:40 -04:00
ZHU Zijia
9c85382ade [RISCV] Handle register spill in branch relaxation
In branch relaxation pass, `j`'s with offset over 1MiB will be relaxed
to `jump` pseudo-instructions.

This patch allocates a stack slot for functions with a size greater than
1MiB. If the register scavenger cannot find a scratch register for
`jump`, spill a register to the slot before the jump and restore it
after the jump.

.mbb:
        foo
        j       .dest_bb
        bar
        bar
        bar
.dest_bb:
        baz

The above code will be relaxed to the following code.

.mbb:
        foo
        sd      s11, 0(sp)
        jump    .restore_bb, s11
        bar
        bar
        bar
        j       .dest_bb
.restore_bb:
        ld      s11, 0(sp)
.dest_bb:
        baz

Depends on D129999.

Reviewed By: StephenFan

Differential Revision: https://reviews.llvm.org/D130560
2022-08-24 13:27:56 +08:00
ZHU Zijia
d51581ff2c [RISCV][TableGen] Mark MachineInstr with FrameIndex as not compressible
If a MachineInstr's operand should be Reg in compiler's output but is
currently FrameIndex, `isCompressibleInst()` will terminate at
`MachineOperandType::getReg()`.

This patch adds `.isReg()` checks to make `isCompressibleInst()` return
false for these MachineInstr, allowing `getInstSizeInBytes()` to return
a value and `EstimateFunctionSizeInBytes()` to work as intended.

See https://reviews.llvm.org/D129999#3694222 for details.

Reviewed By: luismarques

Differential Revision: https://reviews.llvm.org/D129999
2022-08-24 13:23:38 +08:00
Kai Sasaki
ad714d5b74 [mlir][math] Lower math.floor,ceil to libm
Lower math.floor and math.ceil to libm

Reviewed By: ftynse

Differential Revision: https://reviews.llvm.org/D131876
2022-08-24 13:28:13 +09:00
Che-Yu Wu
f250b97222 Reland "[MLIR]Extend vector.gather to support n-D result"
Reviewed By: dcaballe

Differential Revision: https://reviews.llvm.org/D132507
2022-08-24 04:18:00 +00:00
Keno Fischer
30d7d74d5c [MSAN] Handle array alloca with non-i64 size specification
The array size specification of the an alloca can be any integer,
so zext or trunc it to intptr before attempting to multiply it
with an intptr constant.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D131846
2022-08-24 03:24:21 +00:00
Keno Fischer
5739d29cde [MSAN] Correct shadow type for atomicrmw instrumentation
We were passing the type of `Val` to `getShadowOriginPtr`, rather
than the type of `Val`'s shadow resulting in broken IR. The fix
is simple.

Reviewed By: eugenis

Differential Revision: https://reviews.llvm.org/D131845
2022-08-24 03:24:19 +00:00
John Ericson
4c5114250b [Polly] Don't use llvm-config anymore (in CMake sad path)
If `LLVM_BUILD_MAIN_SRC_DIR` is not defined, just assume we are in
regular monorepo layout. Non-standard (and not really supported) layouts
can still be configured manually.

Reviewed By: beanz

Differential Revision: https://reviews.llvm.org/D132314
2022-08-23 22:47:14 -04:00
Bing1 Yu
6d8ddf53cc [X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132141
2022-08-24 10:22:46 +08:00
Simon Pilgrim
e624f8a3bb [DAG] MatchRotate - bail if we fail to match a shl/srl pair
extractShiftForRotate may fail to return canonicalized shifts due to constant folding or other simplification that can occur in getNode()

Fixes Issue #57283
2022-08-24 03:05:07 +01:00
Chris Bieneman
887bafb503 [HLSL] Infer language from file extension
This allows the language mode for HLSL to be inferred from the file
extension.
2022-08-23 20:52:29 -05:00
Chris Bieneman
9616905744 [NFC] Fix warning
This change came in a few hours ago and introduced a warning. The fix
is trivial, so I'm providing it. The original change was reviewed here:

https://reviews.llvm.org/D132331
2022-08-23 20:50:37 -05:00
Bing1 Yu
0d8f9520c5 Revert "[X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit"
This reverts commit 07e34763b02728857e1d6e8ccd2b82820eb3c0cc.
2022-08-24 09:38:46 +08:00
Bing1 Yu
07e34763b0 [X86] Emulate _rdrand64_step with two rdrand32 if it is 32bit
Reviewed By: RKSimon

Differential Revision: https://reviews.llvm.org/D132141
2022-08-24 09:28:55 +08:00
River Riddle
ae97b5acf8 [mlir:Bytecode] Move variable to inside of the lambda to fix MSVC build
MSVC is not picking up a variable capture somehow, try moving it inside.
2022-08-23 17:43:53 -07:00
Amir Ayupov
37cbbea674 [BOLT][NFC] Move out handleAArch64IndirectCall
Move the large lambda out of BinaryFunction::disassemble, reducing its size from
255 to 233 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132104
2022-08-23 17:37:01 -07:00
Amir Ayupov
c844850bdf [BOLT][NFC] Move out handleIndirectBranch
Move the large lambda out of BinaryFunction::disassemble, reducing its size from
295 to 255 LoC.

Differential Revision: https://reviews.llvm.org/D132101
2022-08-23 17:36:51 -07:00
Amir Ayupov
ec1fbf229e [BOLT][NFC] Move out handleExternalReference
Move the large lambda out of BinaryFunction::disassemble, reducing its size from
338 to 295 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132100
2022-08-23 17:36:41 -07:00
Amir Ayupov
6cd475f8ca [BOLT][NFC] Move out handlePCRelOperand
Move the large lambda out of BinaryFunction::disassemble, reducing its size from
377 to 338 LoC.

Reviewed By: rafauler

Differential Revision: https://reviews.llvm.org/D132099
2022-08-23 17:36:29 -07:00
Petr Hosek
eca29d4a37 [Clang] Avoid using unwind library in the MSVC environment
We're seeing the following warnings with --rtlib=compiler-rt:

  lld-link: warning: ignoring unknown argument '--as-needed'
  lld-link: warning: ignoring unknown argument '-lunwind'
  lld-link: warning: ignoring unknown argument '--no-as-needed'

MSVC doesn't use the unwind library, so just omit it.

Differential Revision: https://reviews.llvm.org/D132440
2022-08-24 00:09:01 +00:00
River Riddle
df4e637ca7 [mlir:Bytecode] Use UNSUPPORTED instead of XFAIL for s390x
Some tests still pass even though we don't claim big-endian support. Using
UNSUPPORTED is a better indicator than XFAIL that we don't guarantee that
the tests work.
2022-08-23 16:56:04 -07:00
River Riddle
02c2ecb9c6 [mlir:Bytecode] Add initial support for dialect defined attribute/type encodings
Dialects can opt-in to providing custom encodings by implementing the
`BytecodeDialectInterface`. This interface provides hooks, namely
`readAttribute`/`readType` and `writeAttribute`/`writeType`, that will be used
by the bytecode reader and writer. These hooks are provided a reader and writer
implementation that can be used to encode various constructs in the underlying
bytecode format. A unique feature of this interface is that dialects may choose
to only encode a subset of their attributes and types in a custom bytecode
format, which can simplify adding new or experimental components that aren't
fully baked.

Differential Revision: https://reviews.llvm.org/D132498
2022-08-23 16:56:04 -07:00
River Riddle
b3449392f5 [mlir:Bytecode][NFC] Cleanup Attribute/Type reading
This moves some parsing functionality from BytecodeReader to
AttrTypeReader, and removes some duplication between the attribute/type
code paths.

Differential Revision: https://reviews.llvm.org/D132497
2022-08-23 16:56:03 -07:00
River Riddle
83dc999948 [mlir:Bytecode][NFC] Refactor string section writing and reading
This extracts the string section writer and reader into dedicated
classes, which better separates the logic and will also simplify future
patches that want to interact with the string section.

Differential Revision: https://reviews.llvm.org/D132496
2022-08-23 16:56:03 -07:00
Teresa Johnson
d10c1b88f0 [memprof] Correct max size and access count computations
The existing code resulted in the max size and access counts being equal
to the min. Compute the max instead (max lifetime was already correct).

Differential Revision: https://reviews.llvm.org/D132515
2022-08-23 16:53:46 -07:00
isuckatcs
aac73a31ad [analyzer] Process non-POD array element destructors
The constructors of non-POD array elements are evaluated under
certain conditions. This patch makes sure that in such cases
we also evaluate the destructors.

Differential Revision: https://reviews.llvm.org/D130737
2022-08-24 01:28:21 +02:00
Slava Zakharin
af7edf1557 [flang] Keep original data type for do-variable value.
Keep the original data type of integer do-variables
for structured loops. When do-variable's data type
is an integer type shorter than IndexType, processing
the do-variable separately from the DoLoop's iteration index
allows getting rid of type casts, which can make backend
optimizations easier.

For example,
```
  do i = 2, n-1
    do j = 2, n-1
      ... = a(j-1, i)
    end do
  end do
```

If value of 'j' is computed by casting the DoLoop's iteration
index to 'i32', then Flang will produce the following LLVM IR:
```
  %1 = trunc i64 %iter_index to i32
  %2 = sub i32 %1, 1
  %3 = sext i32 %2 to i64
```

LLVM's InstCombine may try to get rid of the sign extension,
and may transform this into:
```
  %1 = shl i64 %iter_index, 32
  %2 = add i64 %1, -4294967296
  %3 = ashr exact i64 %2, 32
```

The extra computations for the element address applied on top
of this awkward pattern confuse LLVM vectorizer so that
it does not recognize the unit-strided access of 'a'.

Measured performance improvements on `SPEC CPU2000@IceLake`:
```
168.wupwise:    11.96%
171.swim:       11.22%
172.mrgid:      56.38%
178.galgel:      7.29%
301.apsi:        8.32%
```

Differential Revision: https://reviews.llvm.org/D132176
2022-08-23 15:54:54 -07:00
Louis Dionne
355e0ce3c5 [libc++] Extend check for non-ASCII characters to src/, test/ and benchmarks/
Differential Revision: https://reviews.llvm.org/D132180
2022-08-23 18:36:38 -04:00
Louis Dionne
89469df8ba [libc++] Remove trailing whitespace from libcxx includes, source, tests and benchmarks
Differential Revision: https://reviews.llvm.org/D132175
2022-08-23 18:25:54 -04:00
Siva Chandra
d00e97df0f [libc][Obvious] Fix typo is chmod implementation.
This now allows enabling the chmod function on aarch64.
2022-08-23 15:01:21 -07:00
Eli Friedman
ab0574dac3 Print more information when JSON parsing fails for unittests.
Trying to figure out intermittent failure on reverse-iteration buildbot.
2022-08-23 14:57:49 -07:00
Sanjay Patel
f8dfbea324 [SDAG] expand more is-power-of-2 patterns that use popcount
(ctpop x) == 1 --> (x != 0) && ((x & x-1) == 0)

Adjust the legality check to avoid the poor codegen on AArch64.
We probably only want to use popcount on this pattern when it
is a single instruction.

fixes #57225

Differential Revision: https://reviews.llvm.org/D132237
2022-08-23 17:53:53 -04:00
Sanjay Patel
7d670976db [AArch64] add test for popcount i32; NFC
More coverage for D132237
2022-08-23 17:53:53 -04:00
Sanjay Patel
8ccca3f3a4 [InstCombine] adjust tests for mul+add common factor; NFC
The existing tests were added with 2880d7b9e4c9a0, but
discussion in D132412 suggests that we should start
with a simpler pattern (the more complicated pattern
may not be a real problem).
2022-08-23 17:53:53 -04:00
Vitaly Buka
89476dbca1 [symbolizer] Fix build after 342e0eb
342e0eb reverted LLVM_ENABLE_RUNTIMES incompletly and missed /runtimes
part. This target has no issues with LLVM_ENABLE_RUNTIMES, so we can
keep it.
2022-08-23 14:51:39 -07:00
Vitaly Buka
11633314d8 [symbolizer] Remove check if it's monorepo 2022-08-23 14:49:05 -07:00
Vitaly Buka
3195449f2b [test][openmp] Relax condition in test
It runs 8 threads. Sometimes tsan is able to detect more than one of the
same race.
2022-08-23 14:29:06 -07:00
Louis Dionne
7dfcf9342b [CMake] Move cxx-headers to RUNTIME_DISTRIBUTION_COMPONENTS in Apple-stage2.cmake
We build libcxx using LLVM_ENABLE_RUNTIMES during Stage2, which requires
cxx-headers to be part of LLVM_RUNTIME_DISTRIBUTION_COMPONENTS instead
of LLVM_DISTRIBUTION_COMPONENTS.

rdar://99028431

Differential Revision: https://reviews.llvm.org/D132488
2022-08-23 17:02:51 -04:00
Vitaly Buka
b5a9adf1f5 [clang] Create alloca to pass into static lambda
"this" parameter of lambda if undef, notnull and differentiable.
So we need to pass something consistent.

Any alloca will work. It will be eliminated as unused later by optimizer.

Otherwise we generate code which Msan is expected to catch.

Reviewed By: efriedma

Differential Revision: https://reviews.llvm.org/D132275
2022-08-23 13:53:17 -07:00
Peter Steinfeld
86976ba7c9 [Flang] Make the TODO messages for intrinsics more consistent
This title says it all.

Differential Revision: https://reviews.llvm.org/D132179
2022-08-23 13:50:44 -07:00
Philip Reames
49547b2241 [slp] Pull out a getOperandInfo variant helper [nfc] 2022-08-23 13:46:05 -07:00
Alvin Wong
c0214db51a [llvm] Mark CFGuard fn ptr symbol as DSO local and add tests for mingw
For mingw target, if a symbol is not marked DSO local, a `.refptr` is
generated for it. This makes CFG check calls use an extra pointer
dereference, which adds extra overhead compared to the MSVC version,
so mark the CFG guard check funciton pointer DSO local to stop it.
This should have no effect on MSVC target.

Also adapt the existing cfguard tests to run for mingw targets, so that
this change is checked.

Reviewed By: rnk

Differential Revision: https://reviews.llvm.org/D132331
2022-08-23 23:39:39 +03:00
Alvin Wong
94778692ad [clang] Add support for __attribute__((guard(nocf)))
To support using Control Flow Guard with mingw-w64, Clang needs to
accept `__declspec(guard(nocf))` also for the GNU target. Since mingw
has `#define __declspec(a) __attribute__((a))` as built-in, the simplest
solution is to accept `__attribute__((guard(nocf)))` to be compatible with
MSVC and Clang's msvc target.

As a side effect, this also adds `[[clang::guard(nocf)]]` for C++.

Reviewed By: aaron.ballman

Differential Revision: https://reviews.llvm.org/D132302
2022-08-23 23:39:38 +03:00