988 Commits

Author SHA1 Message Date
Nuno Lopes
b0afa6bab9 [clang] Change some placeholders from undef to poison [NFC] 2024-11-19 15:18:40 +00:00
Justin Stitt
3dd1d888fb
[Clang] Implement labelled type filtering for overflow/truncation sanitizers w/ SSCLs (#107332)
[Related
RFC](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683/5?u=justinstitt)

### Summary

Implement type-based filtering via [Sanitizer Special Case
Lists](https://clang.llvm.org/docs/SanitizerSpecialCaseList.html) for
the arithmetic overflow and truncation sanitizers.

Currently, using the `type:` prefix with these sanitizers does nothing.
I've hooked up the SSCL parsing with Clang codegen so that we don't emit
the overflow/truncation checks if the arithmetic contains an ignored
type.

### Usefulness

You can craft ignorelists that ignore specific types that are expected
to overflow or wrap-around. For example, to ignore `my_type` from
`unsigned-integer-overflow` instrumentation:
```bash
$ cat ignorelist.txt
[unsigned-integer-overflow]
type:my_type=no_sanitize

$ cat foo.c
typedef unsigned long my_type;
void foo() {
  my_type a = ULONG_MAX;
  ++a;
}

$ clang foo.c -fsanitize=unsigned-integer-overflow -fsanitize-ignorelist=ignorelist.txt ; ./a.out
// --> no sanitizer error
```

If a type is functionally intended to overflow, like
[refcount_t](https://kernsec.org/wiki/index.php/Kernel_Protections/refcount_t)
and its associated APIs in the Linux kernel, then this type filtering
would prove useful for reducing sanitizer noise. Currently, the Linux
kernel dealt with this by
[littering](https://elixir.bootlin.com/linux/v6.10.8/source/include/linux/refcount.h#L139
) `__attribute__((no_sanitize("signed-integer-overflow")))` annotations
on all the `refcount_t` APIs. I think this serves as an example of how a
codebase could be made cleaner. We could make custom types that are
filtered out in an ignorelist, allowing for types to be more expressive
-- without the need for annotations. This accomplishes a similar goal to
https://github.com/llvm/llvm-project/pull/86618.


Yet another use case for this type filtering is whitelisting. We could
ignore _all_ types, save a few.

```bash
$ cat ignorelist.txt
[implicit-signed-integer-truncation]
type:*=no_sanitize # ignore literally all types
type:short=sanitize # except `short`

$ cat bar.c
// compile with -fsanitize=implicit-signed-integer-truncation
void bar(int toobig) {
  char a = toobig;  // not instrumented
  short b = toobig; // instrumented
}
```

### Other ways to accomplish the goal of sanitizer
allowlisting/whitelisting
* ignore list SSCL type support (this PR that you're reading)
* [my sanitize-allowlist
branch](https://github.com/llvm/llvm-project/compare/main...JustinStitt:llvm-project:sanitize-allowlist)
- this just implements a sibling flag `-fsanitize-allowlist=`, removing
some of the double negative logic present with `skip`/`ignore` when
trying to whitelist something.
* [Glob
Negation](https://discourse.llvm.org/t/rfc-support-globpattern-add-operator-to-invert-matches/80683)
- Implement a negation operator to the GlobPattern class so the
ignorelist query can use them to simulate allowlisting


Please let me know which of the three options we like best. They are not
necessarily mutually exclusive.

Here's [another related
PR](https://github.com/llvm/llvm-project/pull/86618) which implements a
`wraps` attribute. This can accomplish a similar goal to this PR but
requires in-source changes to codebases and also covers a wider variety
of integer definedness problems.

### CCs
@kees @vitalybuka @bwendling

---------

Signed-off-by: Justin Stitt <justinstitt@google.com>
2024-11-03 23:57:47 -08:00
Congcong Cai
c0c36aa018
[clang codegen] fix crash emitting __array_rank (#113186)
Fixed: #113044
the type of `ArrayTypeTraitExpr` can be changed, use i32 directly is
incorrect.

---------

Co-authored-by: Eli Friedman <efriedma@quicinc.com>
2024-10-22 17:03:51 +08:00
Erich Keane
d412cea8c4
[OpenACC] Implement 'tile' attribute AST (#110999)
The 'tile' clause shares quite a bit of the rules with 'collapse', so a
followup patch will add those tests/behaviors. This patch deals with
adding the AST node.

The 'tile' clause takes a series of integer constant expressions, or *.
The asterisk is now represented by a new OpenACCAsteriskSizeExpr node,
else this clause is very similar to others.
2024-10-03 08:34:43 -07:00
Nikita Popov
ecb98f9fed [IRBuilder] Remove uses of CreateGlobalStringPtr() (NFC)
Since the migration to opaque pointers, CreateGlobalStringPtr()
is the same as CreateGlobalString(). Normalize to the latter.
2024-09-23 16:30:50 +02:00
Chris B
a29afb754f
[HLSL] Allow truncation to scalar (#104844)
HLSL allows implicit conversions to truncate vectors to scalar
pr-values. These conversions are scored as vector truncations and should
warn appropriately.

This change allows forming a truncation cast to a pr-value, but not an
l-value. Truncating a vector to a scalar is performed by loading the
first element of the vector and disregarding the remaining elements.

Fixes #102964
2024-09-11 17:27:09 -05:00
JinjinLi868
56905dab7d
[clang] fix half && bfloat16 convert node expr codegen (#89051)
Data type conversion between fp16 and bf16 will generate fptrunc and
fpextend nodes, but they are actually bitcast nodes.
2024-09-10 10:47:33 +08:00
Hari Limaye
7eca38ce76
Reland "[clang] Add nuw attribute to GEPs (#105496)" (#107257)
Add nuw attribute to inbounds GEPs where the expression used to form the
GEP is an addition of unsigned indices.

Relands #105496, which was reverted because it exposed a miscompilation
arising from #98608. This is now fixed by #106512.
2024-09-05 16:13:11 +01:00
Yingwei Zheng
9fef09fd29
[Clang][CodeGen] Fix type for atomic float incdec operators (#107075)
`llvm::ConstantFP::get(llvm::LLVMContext&, APFloat(float))` always
returns a f32 constant.
Fix https://github.com/llvm/llvm-project/issues/107054.
2024-09-04 12:19:46 +08:00
Vitaly Buka
69437a392e
Revert "[clang] Add nuw attribute to GEPs" (#106343)
Reverts llvm/llvm-project#105496

This patch breaks:
https://lab.llvm.org/buildbot/#/builders/25/builds/1952
https://lab.llvm.org/buildbot/#/builders/52/builds/1775

Somehow output is different with sanitizers.
Maybe non-determinism in the code?
2024-08-28 12:14:04 +02:00
Hari Limaye
3d2fd31c8f
[clang] Add nuw attribute to GEPs (#105496)
Add nuw attribute to inbounds GEPs where the expression used to form the
GEP is an addition of unsigned indices.
2024-08-27 14:20:48 +01:00
Justin Stitt
76236fafda
[Clang] Overflow Pattern Exclusion - rename some patterns, enhance docs (#105709)
From @vitalybuka's review on
https://github.com/llvm/llvm-project/pull/104889:
- [x] remove unused variable in tests
- [x] rename `post-decr-while` --> `unsigned-post-decr-while`
- [x] split `add-overflow-test` into `add-unsigned-overflow-test` and
`add-signed-overflow-test`
- [x] be more clear about defaults within docs
- [x] add table to docs

Here's a screenshot of the rendered table so you don't have to build the
html docs yourself to inspect the layout:

![image](https://github.com/user-attachments/assets/5d3497c4-5f5a-4579-b29b-96a0fd192faa)


CCs: @vitalybuka

---------

Signed-off-by: Justin Stitt <justinstitt@google.com>
Co-authored-by: Vitaly Buka <vitalybuka@google.com>
2024-08-23 23:33:23 -07:00
Florian Hahn
96509bb98f
[Matrix] Preserve signedness when extending matrix index expression. (#103044)
As per [1] the indices for a matrix element access operator shall have
integral or unscoped enumeration types and be non-negative. At the
moment, the index expression is converted to SizeType irrespective of
the signedness of the index expression. This causes implicit sign
conversion warnings if any of the indices is signed.

As per the spec, using signed types as indices is allowed and should not
cause any warnings. If the index expression is signed, extend to
SignedSizeType to avoid the warning.

[1]
https://clang.llvm.org/docs/MatrixTypes.html#matrix-type-element-access-operator

PR: https://github.com/llvm/llvm-project/pull/103044
2024-08-23 10:11:52 +01:00
Justin Stitt
295fe0bd43
[Clang] Re-land Overflow Pattern Exclusions (#104889)
Introduce "-fsanitize-undefined-ignore-overflow-pattern=" which can
be used to disable sanitizer instrumentation for common overflow-dependent
code patterns.

For a wide selection of projects, proper overflow sanitization could
help catch bugs and solve security vulnerabilities. Unfortunately, in
some cases the integer overflow sanitizers are too noisy for their users
and are often left disabled. Providing users with a method to disable
sanitizer instrumentation of common patterns could mean more projects
actually utilize the sanitizers in the first place.

One such project that has opted to not use integer overflow (or
truncation) sanitizers is the Linux Kernel. There has been some
discussion[1] recently concerning mitigation strategies for unexpected
arithmetic overflow. This discussion is still ongoing and a succinct
article[2] accurately sums up the discussion. In summary, many Kernel
developers do not want to introduce more arithmetic wrappers when
most developers understand the code patterns as they are.

Patterns like:

  if (base + offset < base) { ... }

or

  while (i--) { ... }

or

  #define SOME -1UL

are extremely common in a code base like the Linux Kernel. It is
perhaps too much to ask of kernel developers to use arithmetic wrappers
in these cases. For example:

  while (wrapping_post_dec(i)) { ... }

which wraps some builtin would not fly. This would incur too many
changes to existing code; the code churn would be too much, at least too
much to justify turning on overflow sanitizers.

Currently, this commit tackles three pervasive idioms:

1. "if (a + b < a)" or some logically-equivalent re-ordering like "if (a > b + a)"
2. "while (i--)" (for unsigned) a post-decrement always overflows here
3. "-1UL, -2UL, etc" negation of unsigned constants will always overflow

The patterns that are excluded can be chosen from the following list:

- add-overflow-test
- post-decr-while
- negated-unsigned-const

These can be enabled with a comma-separated list:

  -fsanitize-undefined-ignore-overflow-pattern=add-overflow-test,negated-unsigned-const

"all" or "none" may also be used to specify that all patterns should be
excluded or that none should be.

[1] https://lore.kernel.org/all/202404291502.612E0A10@keescook/
[2] https://lwn.net/Articles/979747/

CCs: @efriedma-quic @kees @jyknight @fmayer @vitalybuka
Signed-off-by: Justin Stitt <justinstitt@google.com>
Co-authored-by: Bill Wendling <morbo@google.com>
2024-08-20 20:13:44 +00:00
Thurston Dang
e398da2b37 Revert "[Clang] Overflow Pattern Exclusions (#100272)"
This reverts commit 9a666deecb9ff6ca3a6b12e6c2877e19b74b54da.

Reason: broke buildbots

e.g., fork-ubsan.test started failing at
https://lab.llvm.org/buildbot/#/builders/66/builds/2819/steps/9/logs/stdio

  Clang :: CodeGen/compound-assign-overflow.c
  Clang :: CodeGen/sanitize-atomic-int-overflow.c
started failing with https://lab.llvm.org/buildbot/#/builders/52/builds/1570
2024-08-15 10:18:52 -07:00
Justin Stitt
9a666deecb
[Clang] Overflow Pattern Exclusions (#100272)
Introduce "-fsanitize-overflow-pattern-exclusion=" which can be used to
disable sanitizer instrumentation for common overflow-dependent code
patterns.

For a wide selection of projects, proper overflow sanitization could
help catch bugs and solve security vulnerabilities. Unfortunately, in
some cases the integer overflow sanitizers are too noisy for their users
and are often left disabled. Providing users with a method to disable
sanitizer instrumentation of common patterns could mean more projects
actually utilize the sanitizers in the first place.

One such project that has opted to not use integer overflow (or
truncation) sanitizers is the Linux Kernel. There has been some
discussion[1] recently concerning mitigation strategies for unexpected
arithmetic overflow. This discussion is still ongoing and a succinct
article[2] accurately sums up the discussion. In summary, many Kernel
developers do not want to introduce more arithmetic wrappers when
most developers understand the code patterns as they are.

Patterns like:

    if (base + offset < base) { ... }

or

    while (i--) { ... }

or

    #define SOME -1UL

are extremely common in a code base like the Linux Kernel. It is
perhaps too much to ask of kernel developers to use arithmetic wrappers
in these cases. For example:

    while (wrapping_post_dec(i)) { ... }

which wraps some builtin would not fly. This would incur too many
changes to existing code; the code churn would be too much, at least too
much to justify turning on overflow sanitizers.

Currently, this commit tackles three pervasive idioms:

1. "if (a + b < a)" or some logically-equivalent re-ordering like "if (a > b + a)"
2. "while (i--)" (for unsigned) a post-decrement always overflows here
3. "-1UL, -2UL, etc" negation of unsigned constants will always overflow

The patterns that are excluded can be chosen from the following list:

- add-overflow-test
- post-decr-while
- negated-unsigned-const

These can be enabled with a comma-separated list:

    -fsanitize-overflow-pattern-exclusion=add-overflow-test,negated-unsigned-const

"all" or "none" may also be used to specify that all patterns should be
excluded or that none should be.

[1] https://lore.kernel.org/all/202404291502.612E0A10@keescook/
[2] https://lwn.net/Articles/979747/

CCs: @efriedma-quic @kees @jyknight @fmayer @vitalybuka
Signed-off-by: Justin Stitt <justinstitt@google.com>
Co-authored-by: Bill Wendling <morbo@google.com>
2024-08-15 00:17:06 +00:00
Matt Arsenault
e108853ac8
clang: Allow targets to set custom metadata on atomics (#96906)
Use this to replace the emission of the amdgpu-unsafe-fp-atomics
attribute in favor of per-instruction metadata. In the future
new fine grained controls should be introduced that also cover
the integer cases.

Add a wrapper around CreateAtomicRMW that appends the metadata,
and update a few use contexts to use it.
2024-07-26 09:57:28 +04:00
Akira Hatanaka
f6b06b42a3
[PAC] Implement function pointer re-signing (#98847)
Re-signing occurs when function type discrimination is enabled and a
function pointer is converted to another function pointer type that
requires signing using a different discriminator. A function pointer is
re-signed using discriminator zero when it's converted to a pointer to a
non-function type such as `void*`.

---------

Co-authored-by: Ahmed Bougacha <ahmed@bougacha.org>
Co-authored-by: John McCall <rjmccall@apple.com>
2024-07-18 07:51:17 -07:00
Mariya Podchishchaeva
9ad72df55c
[clang] Use different memory layout type for _BitInt(N) in LLVM IR (#91364)
There are two problems with _BitInt prior to this patch:
1. For at least some values of N, we cannot use LLVM's iN for the type
of struct elements, array elements, allocas, global variables, and so
on, because the LLVM layout for that type does not match the high-level
layout of _BitInt(N).
Example: Currently for i128:128 targets correct implementation is
possible either for __int128 or for _BitInt(129+) with lowering to iN,
but not both, since we have now correct implementation of __int128 in
place after a21abc7.
When this happens, opaque [M x i8] types used, where M =
sizeof(_BitInt(N)).
2. LLVM doesn't guarantee any particular extension behavior for integer
types that aren't a multiple of 8. For this reason, all _BitInt types
are now have in-memory representation that is a whole number of bytes.
I.e. for example _BitInt(17) now will have memory layout type i32.

This patch also introduces concept of load/store type and adds an API to
CodeGenTypes that returns the IR type that should be used for load and
store operations. This is particularly useful for the case when a
_BitInt ends up having array of bytes as memory layout type. For
_BitInt(N), let M = sizeof(_BitInt(N)), and let BITS = M * 8. Loads and
stores of iM would both (1) produce far better code from the backends
and (2) be far more optimizable by IR passes than loads and stores of [M
x i8].

Fixes https://github.com/llvm/llvm-project/issues/85139
Fixes https://github.com/llvm/llvm-project/issues/83419

---------

Co-authored-by: John McCall <rjmccall@gmail.com>
2024-07-15 09:40:39 +02:00
Mariya Podchishchaeva
41c6e43792
Reland [clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (#95802)
This commit implements the entirety of the now-accepted [N3017
-Preprocessor
Embed](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm) and
its sister C++ paper [p1967](https://wg21.link/p1967). It implements
everything in the specification, and includes an implementation that
drastically improves the time it takes to embed data in specific
scenarios (the initialization of character type arrays). The mechanisms
used to do this are used under the "as-if" rule, and in general when the
system cannot detect it is initializing an array object in a variable
declaration, will generate EmbedExpr AST node which will be expanded by
AST consumers (CodeGen or constant expression evaluators) or expand
embed directive as a comma expression.

This reverts commit
682d461d5a.

---------

Co-authored-by: The Phantom Derpstorm <phdofthehouse@gmail.com>
Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
2024-06-20 14:38:46 +02:00
Mariya Podchishchaeva
6d973b4548
[clang][CodeGen] Return RValue from EmitVAArg (#94635)
This should simplify handling of resulting value by the callers.
2024-06-17 13:29:20 +02:00
William Junda Huang
675d8d629d
(New) Add option to generate additional debug info for expression dereferencing pointer to pointers (#95298)
This is a different implementation to #94100, which has been reverted.

When -fdebug-info-for-profiling is specified, for any Load expression if
the pointer operand is not a declared variable, clang will emit debug
info describing the type of the pointer operand (which can be an
intermediate expr)
2024-06-15 00:02:45 -04:00
Vitaly Buka
682d461d5a
Revert " [Sema, Lex, Parse] Preprocessor embed in C and C++ (and Obj-C and Obj-C++ by-proxy)" (#95299)
Reverts llvm/llvm-project#68620

Introduce or expose a memory leak and UB, see llvm/llvm-project#68620
2024-06-12 13:14:26 -07:00
The Phantom Derpstorm
5989450e00
[clang][Sema, Lex, Parse] Preprocessor embed in C and C++ (and Obj-C and Obj-C++ by-proxy) (#68620)
This commit implements the entirety of the now-accepted [N3017 -
Preprocessor
Embed](https://www.open-std.org/jtc1/sc22/wg14/www/docs/n3017.htm) and
its sister C++ paper [p1967](https://wg21.link/p1967). It implements
everything in the specification, and includes an implementation that
drastically improves the time it takes to embed data in specific
scenarios (the initialization of character type arrays). The mechanisms
used to do this are used under the "as-if" rule, and in general when the
system cannot detect it is initializing an array object in a variable
declaration, will generate EmbedExpr AST node which will be expanded
by AST consumers (CodeGen or constant expression evaluators) or
expand embed directive as a comma expression.

---------

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
Co-authored-by: cor3ntin <corentinjabot@gmail.com>
Co-authored-by: H. Vetinari <h.vetinari@gmx.com>
Co-authored-by: Podchishchaeva, Mariya <mariya.podchishchaeva@intel.com>
2024-06-12 09:16:02 +02:00
William Junda Huang
3fce14569f
Revert "Add option to generate additional debug info for expression dereferencing pointer to pointers. #94100" (#95174)
The option is causing the binary output to be different when compiled
under `-O0`, because it introduce dbg.declare on pseudovariables. Going
to change this implementation to use dbg.value instead.
2024-06-11 17:33:20 -04:00
William Junda Huang
5cb00785aa
Add option to generate additional debug info for expression dereferencing pointer to pointers. (#94100)
This is another attempt to land #81545, which was reverted. 

Fixed test case by adding a target triple so that clang generates the same IR for all platforms
2024-06-03 16:42:24 -04:00
Mehdi Amini
7d4a45d982 Revert "Add option to generate additional debug info for expression dereferencing pointer to pointers. (#81545)"
This reverts commit aeccfee348c717165541d8d895b9b0cdfe31415c, and dependents:

Revert "[NFC] Fix PPC buildbot failure https://lab.llvm.org/buildbot/#/builders/230/builds/29066"
This reverts commit 2b1d1c51f6e321267cc86e9db7808298c59caf0e.

Revert "Fix test - remove unnecessary/incorrect `-S`, in favor of `-emit-llvm`"
This reverts commit ea1ecb50fa831583241fc531153bd2c072955d29.

The test is failing on MacOs and Windows
2024-05-29 21:56:59 -07:00
William Junda Huang
aeccfee348
Add option to generate additional debug info for expression dereferencing pointer to pointers. (#81545)
Such expression does not correspond to a variable in the source code
thus does not have a debug location. When the user collects perf data on
the program, if the intermediate memory load instruction is sampled, it
could not be attributed to any variable/class member, which causes the
sampling results to be under-counted.
This patch adds an option `-fdebug_info_for_pointer_type` to generate a
psuedo variable and its debug info for intermediate expression with
pointer dereferencing, so that perf data collected on the instruction of
that expression can be attributed to the correct class member.

This is a prototype so comments are needed.
2024-05-29 18:04:11 -04:00
Ahmed Bougacha
3575d23ca8
[clang][CodeGen] Remove unused LValue::getAddress CGF arg. (#92465)
This is in effect a revert of f139ae3d93797, as we have since gained a
more sophisticated way of doing extra IRGen with the addition of
RawAddress in #86923.
2024-05-20 10:23:04 -07:00
Krishna Narayanan
f17b1fb667
[Clang][CodeGen] Optimised LLVM IR for atomic increments/decrements on floats (#89362)
Fixes #53079
2024-05-02 10:42:34 +01:00
Maciej Gabka
bfc0317153
Move several vector intrinsics out of experimental namespace (#88748)
This patch is moving out following intrinsics:
* vector.interleave2/deinterleave2
* vector.reverse
* vector.splice

from the experimental namespace.

All these intrinsics exist in LLVM for more than a year now, and are
widely used, so should not be considered as experimental.
2024-04-29 10:16:45 +01:00
Matt Arsenault
bd84f5d5d7 clang: Remove unnecessary pointer bitcast 2024-04-22 11:35:09 +02:00
Björn Pettersson
20667dbec3
[clang][CodeGen] Fix shift-exponent ubsan check for signed _BitInt (#88004)
Commit 5f87957fefb21d454f2f (pull-requst #80515) corrected some codegen
problems related to _BitInt types being used as shift exponents. But it
did not fix it properly for the special case when the shift count
operand is a signed _BitInt.

The basic problem is the same as the one solved for unsigned _BitInt. As
we use an unsigned comparison to see if the shift exponent is
out-of-bounds, then we need to find an unsigned maximum allowed shift
amount to use in the check. Normally the shift amount is limited by
bitwidth of the LHS of the shift. However, when the RHS type is small in
relation to the LHS then we need to use a value that fits inside the
bitwidth of the RHS instead.

The earlier fix simply used the unsigned maximum when deterining the max
shift amount based on the RHS type. It did however not take into
consideration that the RHS type could have a signed representation. In
such situations we need to use the signed maximum instead. Otherwise we
do not recognize a negative shift exponent as UB.
2024-04-19 08:11:40 +02:00
Axel Lundberg
708c8cd743
Fix "[clang][UBSan] Add implicit conversion check for bitfields" (#87761)
Fix since #75481 got reverted.

- Explicitly set BitfieldBits to 0 to avoid uninitialized field member
for the integer checks:
```diff
-       llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first)};
+      llvm::ConstantInt::get(Builder.getInt8Ty(), Check.first),
+      llvm::ConstantInt::get(Builder.getInt32Ty(), 0)};
```
- `Value **Previous` was erroneously `Value *Previous` in
`CodeGenFunction::EmitWithOriginalRHSBitfieldAssignment`, fixed now.
- Update following:
```diff
-     if (Kind == CK_IntegralCast) {
+     if (Kind == CK_IntegralCast || Kind == CK_LValueToRValue) {
```
CK_LValueToRValue when going from, e.g., char to char, and
CK_IntegralCast otherwise.
- Make sure that `Value *Previous = nullptr;` is initialized (see
1189e87951)
- Add another extensive testcase
`ubsan/TestCases/ImplicitConversion/bitfield-conversion.c`

---------

Co-authored-by: Vitaly Buka <vitalybuka@gmail.com>
2024-04-08 12:30:27 -07:00
Vitaly Buka
029e1d7515
Revert "Revert "Revert "[clang][UBSan] Add implicit conversion check for bitfields""" (#87562)
Reverts llvm/llvm-project#87529

Reverts #87518

https://lab.llvm.org/buildbot/#/builders/37/builds/33262 is still broken
2024-04-03 15:19:03 -07:00
Vitaly Buka
8a5a1b7704
Revert "Revert "[clang][UBSan] Add implicit conversion check for bitfields"" (#87529)
Reverts llvm/llvm-project#87518

Revert is not needed as the regression was fixed with
1189e87951e59a81ee097eae847c06008276fef1.

I assumed the crash and warning are different issues, but according to
https://lab.llvm.org/buildbot/#/builders/240/builds/26629
fixing warning resolves the crash.
2024-04-03 10:58:39 -07:00
Vitaly Buka
5822ca5a01
Revert "[clang][UBSan] Add implicit conversion check for bitfields" (#87518)
Reverts llvm/llvm-project#75481

Breaks multiple bots, see #75481
2024-04-03 10:27:09 -07:00
Axel Lundberg
450f1952ac
[clang][UBSan] Add implicit conversion check for bitfields (#75481)
This patch implements the implicit truncation and implicit sign change
checks for bitfields using UBSan. E.g.,
`-fsanitize=implicit-bitfield-truncation` and
`-fsanitize=implicit-bitfield-sign-change`.
2024-04-03 08:55:03 -04:00
Chris B
9434c08347
[HLSL] Implement array temporary support (#79382)
HLSL constant sized array function parameters do not decay to pointers.
Instead constant sized array types are preserved as unique types for
overload resolution, template instantiation and name mangling.

This implements the change by adding a new `ArrayParameterType` which
represents a non-decaying `ConstantArrayType`. The new type behaves the
same as `ConstantArrayType` except that it does not decay to a pointer.

Values of `ConstantArrayType` in HLSL decay during overload resolution
via a new `HLSLArrayRValue` cast to `ArrayParameterType`.

`ArrayParamterType` values are passed indirectly by-value to functions
in IR generation resulting in callee generated memcpy instructions.

The behavior of HLSL function calls is documented in the [draft language
specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf)
under the Expr.Post.Call heading.

Additionally the design of this implementation approach is documented in
[Clang's
documentation](https://clang.llvm.org/docs/HLSL/FunctionCalls.html)

Resolves #70123
2024-04-01 12:10:10 -05:00
Akira Hatanaka
84780af4b0
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86923)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.

This reapplies d9a685a9dd589486e882b722e513ee7b8c84870c, which was
reverted because it broke ubsan bots. There seems to be a bug in
coroutine code-gen, which is causing EmitTypeCheck to use the wrong
alignment. For now, pass alignment zero to EmitTypeCheck so that it can
compute the correct alignment based on the passed type (see function
EmitCXXMemberOrOperatorMemberCallExpr).
2024-03-28 06:54:36 -07:00
Akira Hatanaka
f75eebab88
Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)" (#86898)
This reverts commit d9a685a9dd589486e882b722e513ee7b8c84870c.

The commit broke ubsan bots.
2024-03-27 18:14:04 -07:00
Akira Hatanaka
d9a685a9dd
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#86721)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.

This reapplies 8bd1f9116aab879183f34707e6d21c7051d083b6. The commit
broke msan bots because LValue::IsKnownNonNull was uninitialized.
2024-03-27 12:24:49 -07:00
Akira Hatanaka
b311756450
Revert "[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)" (#86674)
This reverts commit 8bd1f9116aab879183f34707e6d21c7051d083b6.

It appears that the commit broke msan bots.
2024-03-26 07:37:57 -07:00
Akira Hatanaka
8bd1f9116a
[CodeGen][arm64e] Add methods and data members to Address, which are needed to authenticate signed pointers (#67454)
To authenticate pointers, CodeGen needs access to the key and
discriminators that were used to sign the pointer. That information is
sometimes known from the context, but not always, which is why `Address`
needs to hold that information.

This patch adds methods and data members to `Address`, which will be
needed in subsequent patches to authenticate signed pointers, and uses
the newly added methods throughout CodeGen. Although this patch isn't
strictly NFC as it causes CodeGen to use different code paths in some
cases (e.g., `mergeAddressesInConditionalExpr`), it doesn't cause any
changes in functionality as it doesn't add any information needed for
authentication.

In addition to the changes mentioned above, this patch introduces class
`RawAddress`, which contains a pointer that we know is unsigned, and
adds several new functions for creating `Address` and `LValue` objects.
2024-03-25 18:05:42 -07:00
gulfemsavrun
23f895f656
[InstrProf] Single byte counters in coverage (#75425)
This patch inserts 1-byte counters instead of an 8-byte counters into
llvm profiles for source-based code coverage. The origial idea was
proposed as block-cov for PGO, and this patch repurposes that idea for
coverage: https://groups.google.com/g/llvm-dev/c/r03Z6JoN7d4

The current 8-byte counters mechanism add counters to minimal regions,
and infer the counters in the remaining regions via adding or
subtracting counters. For example, it infers the counter in the if.else
region by subtracting the counters between if.entry and if.then regions
in an if statement. Whenever there is a control-flow merge, it adds the
counters from all the incoming regions. However, we are not going to be
able to infer counters by subtracting two execution counts when using
single-byte counters. Therefore, this patch conservatively inserts
additional counters for the cases where we need to add or subtract
counters.

RFC:
https://discourse.llvm.org/t/rfc-single-byte-counters-for-source-based-code-coverage/75685
2024-02-26 14:44:55 -08:00
Justin Stitt
81b4b89197
[Sanitizer] Support -fwrapv with -fsanitize=signed-integer-overflow (#82432)
Clang has a `signed-integer-overflow` sanitizer to catch arithmetic
overflow; however, most of its instrumentation [fails to
apply](https://godbolt.org/z/ee41rE8o6) when `-fwrapv` is enabled; this
is by design.

The Linux kernel enables `-fno-strict-overflow` which implies `-fwrapv`.
This means we are [currently unable to detect signed-integer
wrap-around](https://github.com/KSPP/linux/issues/26). All the while,
the root cause of many security vulnerabilities in the Linux kernel is
[arithmetic overflow](https://cwe.mitre.org/data/definitions/190.html).

To work around this and enhance the functionality of
`-fsanitize=signed-integer-overflow`, we instrument signed arithmetic
even if the signed overflow behavior is defined.

Co-authored-by: Justin Stitt <justinstitt@google.com>
2024-02-21 21:00:08 +00:00
Chris Bieneman
0065161c72 Remove assert introduced in #71098
This should effectively preserve the old behavior for non-HLSL code.
2024-02-15 18:56:35 -06:00
Chris B
5c57fd717d
[HLSL] Vector standard conversions (#71098)
HLSL supports vector truncation and element conversions as part of
standard conversion sequences. The vector truncation conversion is a C++
second conversion in the conversion sequence. If a vector truncation is
in a conversion sequence an element conversion may occur after it before
the standard C++ third conversion.

Vector element conversions can be boolean conversions, floating point or
integral conversions or promotions.

[HLSL Draft
Specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf)

---------

Co-authored-by: Aaron Ballman <aaron@aaronballman.com>
2024-02-15 14:58:06 -06:00
Craig Topper
9be7b0a539
[IRGen][AArch64][RISCV] Generalize bitcast between i1 predicate vector and i8 fixed vector. (#76548)
Instead of only handling vscale x 16 x i1 predicate vectors, handle any
scalable i1 vector where the known minimum is divisible by 8.

This is used on RISC-V where we have multiple sizes of predicate
types.
2024-02-13 09:46:50 -08:00
Cooper Partin
16d1a6486c
[DirectX] Fix HLSL bitshifts to leverage the OpenCL pipeline for bitshifting (#81030)
Fixes #55106

In HLSL bit shifts are defined to shift by shift size % type size. This
contains the following changes:

HLSL codegen bit shifts will be emitted as x << (y & (sizeof(x) - 1) and
bitshift masking leverages the OpenCL pipeline for this.

Tests were also added to validate this behavior.


Before this change the following was being emitted:
; Function Attrs: noinline nounwind optnone
define noundef i32 @"?shl32@@YAHHH@Z"(i32 noundef %V, i32 noundef %S) #0
{
entry:
  %S.addr = alloca i32, align 4
  %V.addr = alloca i32, align 4
  store i32 %S, ptr %S.addr, align 4
  store i32 %V, ptr %V.addr, align 4
  %0 = load i32, ptr %V.addr, align 4
  %1 = load i32, ptr %S.addr, align 4
  %shl = shl i32 %0, %1
  ret i32 %shl
}

After this change:
; Function Attrs: noinline nounwind optnone
define noundef i32 @"?shl32@@YAHHH@Z"(i32 noundef %V, i32 noundef %S) #0
{
entry:
  %S.addr = alloca i32, align 4
  %V.addr = alloca i32, align 4
  store i32 %S, ptr %S.addr, align 4
  store i32 %V, ptr %V.addr, align 4
  %0 = load i32, ptr %V.addr, align 4
  %1 = load i32, ptr %S.addr, align 4
  %shl.mask = and i32 %1, 31
  %shl = shl i32 %0, %shl.mask
  ret i32 %shl
}

---------

Co-authored-by: Cooper Partin <coopp@ntdev.microsoft.com>
2024-02-08 11:50:21 -06:00