1443 Commits

Author SHA1 Message Date
Chuanqi Xu
d316a0bd48 [NFC] Remove unused ASTWriter::getTypeID
As the title suggests, the `ASTWriter:getTypeID` method is not used.
This patch removes it.
2024-05-20 13:36:46 +08:00
Vlad Serebrennikov
f7b0b99c52 [clang][NFC] Further improvements to const-correctness 2024-05-18 12:10:39 +03:00
Kazu Hirata
f71749c5ef [clang] Drop explicit conversions of string literals to StringRef (NFC)
We routinely rely on implicit conversions of string literals to
StringRef so that we can use operator==(StringRef, StringRef).

The LHS here are all known to be of StringRef.
2024-05-16 22:32:06 -07:00
Vlad Serebrennikov
31a203fa8a
[clang] Introduce SemaObjC (#89086)
This is continuation of efforts to split `Sema` up, following the
example of OpenMP, OpenACC, etc. Context can be found in
https://github.com/llvm/llvm-project/pull/82217 and
https://github.com/llvm/llvm-project/pull/84184.

I split formatting changes into a separate commit to help reviewing the
actual changes.
2024-05-13 23:37:59 +04:00
erichkeane
8ef2011b2c Reapply "[OpenACC] device_type clause Sema for Compute constructs"
device_type, also spelled as dtype, specifies the applicability of the
clauses following it, and takes a series of identifiers representing the
architectures it applies to.  As we don't have a source for the valid
architectures yet, this patch just accepts all.

Semantically, this also limits the list of clauses that can be applied
after the device_type, so this implements that as well.

This reverts commit 06f04b2e27f2586d3db2204ed4e54f8b78fea74e.
This reapplies commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d.
The build failures were caused by the patch depending on the order of
evaluation of arguments to a function. This reapplication separates out
the capture of one of the values.
2024-05-13 10:29:43 -07:00
erichkeane
06f04b2e27 Revert "[OpenACC] device_type clause Sema for Compute constructs"
This reverts commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d.

This and the followup patch keep hitting an assert I wrote on the build
bots in a way that isn't clear.  Reverting so I can fix it without a
rush.
2024-05-13 08:40:43 -07:00
erichkeane
c4a9a37474 [OpenACC] device_type clause Sema for Compute constructs
device_type, also spelled as dtype, specifies the applicability of the
clauses following it, and takes a series of identifiers representing the
architectures it applies to.  As we don't have a source for the valid
architectures yet, this patch just accepts all.

Semantically, this also limits the list of clauses that can be applied
after the device_type, so this implements that as well.
2024-05-13 07:50:19 -07:00
Chuanqi Xu
e74a34b693 [NFC] [Serialization] Merge IdentID with IdentifierID
In ASTBitCodes.h, there are two type alias for the ID type of
Identifiers with the same underlying type. It is confusing. This patch
tries to merge the `IdentID` to `IdentifierID` to erase such confusion.
2024-05-13 14:05:02 +08:00
erichkeane
63177422a8 [OpenACC][NFC] Fix isa behavior for OpenACC types
I discovered while working on a different patch that I'd not implemented
the 'classof' for any of the Clauses, which resulted in 'isa' always
returning 'true' for all of the types.  This patch goes through all the
existing clauses and adds 'classof' such that it will work correctly.

Additionally, in doing this, I found a bug where I was doing a cast to
the wrong type in the ASTWriter, so this fixes that problem as well.
2024-05-10 07:03:48 -07:00
erichkeane
b1b465218d [OpenACC] 'wait' clause for compute construct sema
'wait' takes a few int-exprs (well, a series of async-arguments, but
    those are effectively just an int-expr), plus a pair of tags. This
patch adds the support for this to the AST, and does the appropriate
semantic analysis for them.
2024-05-09 06:44:12 -07:00
erichkeane
30cfe2b2ac [OpenACC] Implement 'async' clause sema for compute constructs
This is a pretty simple clause, it takes an 'async-argument', which
effectively needs to be just parsed as an 'int' argument, since it can
be an arbitrarly integer at runtime (and negative values are legal for
implementation defined values).

This patch also cleans up the async-argument parsing, so 'wait' got some
minor quality-of-life improvements for parsing (both clause and
    construct).
2024-05-07 07:14:14 -07:00
Chuanqi Xu
dfa7ff97b2 [C++20] [Modules] [Reduced BMI] Combine the signature of used modules
into the current module

Following of https://github.com/llvm/llvm-project/pull/86912. After
https://github.com/llvm/llvm-project/pull/86912, with reduced BMI, the
BMI can keep unchange if the dependent modules only changes the
implementation (without introduing new decls). However, this is not
strictly correct.

For example:

```
// a.cppm
export module a;
export inline int a() { ... }

// b.cppm
export module b;
import a;
export inline int b() { return a(); }
```

Since both `a()` and `b()` are inline, we need to make sure the BMI of
`b.pcm` will change after the implementation of `a()` changes.

We can't get that naturally since we won't record the body of `a()`
during the writing process. We can't reuse ODRHash here since ODRHash
won't calculate the called function recursively. So ODRHash will be
problematic if `a()` calls other inline functions.

Probably we can solve this by a new hash mechanism. But the safety and
efficiency may a problem too. Here we just combine the hash value of the
used modules conservatively.
2024-05-07 11:41:08 +08:00
erichkeane
48c8a5791a [OpenACC] Implement 'deviceptr' and 'attach' sema for compute constructs
These two are very similar to the other 'var-list' variants, except they
require that the type of the variable be a pointer.  This patch
implements that restriction.
2024-05-06 09:29:04 -07:00
Chuanqi Xu
947b062823 Reland "[Modules] No transitive source location change (#86912)"
This relands 6c31104.

The patch was reverted due to incorrectly introduced alignment. And the
patch was re-commited after fixing the alignment issue.

Following off are the original message:

This is part of "no transitive change" patch series, "no transitive
source location change". I talked this with @Bigcheese in the tokyo's
WG21 meeting.

The idea comes from @jyknight posted on LLVM discourse. That for:

```
// A.cppm
export module A;
...

// B.cppm
export module B;
import A;
...

//--- C.cppm
export module C;
import C;
```

Almost every time A.cppm changes, we need to recompile `B`. Due to we
think the source location is significant to the semantics. But it may be
good if we can avoid recompiling `C` if the change from `A` wouldn't
change the BMI of B.

This patch only cares source locations. So let's focus on source
location's example. We can see the full example from the attached test.

```
//--- A.cppm
export module A;
export template <class T>
struct C {
    T func() {
        return T(43);
    }
};
export int funcA() {
    return 43;
}

//--- A.v1.cppm
export module A;

export template <class T>
struct C {
    T func() {
        return T(43);
    }
};
export int funcA() {
    return 43;
}

//--- B.cppm
export module B;
import A;

export int funcB() {
    return funcA();
}

//--- C.cppm
export module C;
import A;
export void testD() {
    C<int> c;
    c.func();
}
```

Here the only difference between `A.cppm` and `A.v1.cppm` is that
`A.v1.cppm` has an additional blank line. Then the test shows that two
BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other
specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same
contents.

However, it is a different story for C, since C instantiates templates
from A, and the instantiation records the source information from module
A, which is different from `A` and `A.v1`, so it is expected that the
BMI `C.pcm` and `C.v1.pcm` can and should differ.

To fully understand the patch, we need to understand how we encodes
source locations and how we serialize and deserialize them.

For source locations, we encoded them as:

```
|
|
| _____ base offset of an imported module
|
|
|
|_____ base offset of another imported module
|
|
|
|
| ___ 0
```

As the diagram shows, we encode the local (unloaded) source location
from 0 to higher bits. And we allocate the space for source locations
from the loaded modules from high bits to 0. Then the source locations
from the loaded modules will be mapped to our source location space
according to the allocated offset.

For example, for,

```
// a.cppm
export module a;
...

// b.cppm
export module b;
import a;
...
```

Assuming the offset of a source location (let's name the location as
`S`) in a.cppm is 45 and we will record the value `45` into the BMI
`a.pcm`. Then in b.cppm, when we import a, the source manager will
allocate a space for module 'a' (according to the recorded number of
source locations) as the base offset of module 'a' in the current source
location spaces. Let's assume the allocated base offset as 90 in this
example. Then when we want to get the location in the current source
location space for `S`, we can get it simply by adding `45` to `90` to
`135`. Finally we can get the source location for `S` in module B as
`135`.

And when we want to write module `b`, we would also write the source
location of `S` as `135` directly in the BMI. And to clarify the
location `S` comes from module `a`, we also need to record the base
offset of module `a`, 90 in the BMI of `b`.

Then the problem comes. Since the base offset of module 'a' is computed
by the number source locations in module 'a'. In module 'b', the
recorded base offset of module 'a' will change every time the number of
source locations in module 'a' increase or decrease. In other words, the
contents of BMI of B will change every time the number of locations in
module 'a' changes. This is pretty sensitive. Almost every change will
change the number of locations. So this is the problem this patch want
to solve.

Let's continue with the existing design to understand what's going on.
Another interesting case is:

```
// c.cppm
export module c;
import whatever;
import a;
import b;
...
```

In `c.cppm`, when we import `a`, we still need to allocate a base
location offset for it, let's say the value becomes to `200` somehow.
Then when we reach the location `S` recorded in module `b`, we need to
translate it into the current source location space. The solution is
quite simple, we can get it by `135 + (200 - 90) = 245`. In another
word, the offset of a source location in current module can be computed
as `Recorded Offset + Base Offset of the its module file - Recorded Base
Offset`.

Then we're almost done about how we handle the offset of source
locations in serializers.

From the abstract level, what we want to do is to remove the hardcoded
base offset of imported modules and remain the ability to calculate the
source location in a new module unit. To achieve this, we need to be
able to find the module file owning a source location from the encoding
of the source location.

So in this patch, for each source location, we will store the local
offset of the location and the module file index. For the above example,
in `b.pcm`, the source location of `S` will be recorded as `135`
directly. And in the new design, the source location of `S` will be
recorded as `<1, 45>`. Here `1` stands for the module file index of `a`
in module `b`. And `45` means the offset of `S` to the base offset of
module `a`.

So the trade-off here is that, to make the BMI more independent, we need
to record more abstract information. And I feel it is worthy. The
recompilation problem of modules is really annoying and there are still
people complaining this. But if we can make this (including stopping
other changes transitively), I think this may be a killer feature for
modules. And from @Bigcheese , this should be helpful for clang explicit
modules too.

And the benchmarking side, I tested this patch against
https://github.com/alibaba/async_simple/tree/CXX20Modules. No
significant change on compilation time. The size of .pcm files becomes
to 204M from 200M. I think the trade-off is pretty fair.

I didn't use another slot to record the module file index. I tried to
use the higher 32 bits of the existing source location encodings to
store that information. This design may be safe. Since we use `unsigned`
to store source locations but we use uint64_t in serialization. And
generally `unsigned` is 32 bit width in most platforms. So it might not
be a safe problem. Since all the bits we used to store the module file
index is not used before. So the new encodings may be:

```
   |-----------------------|-----------------------|
   |           A           |         B         | C |

  * A: 32 bit. The index of the module file in the module manager + 1.
  * The +1
          here is necessary since we wish 0 stands for the current
module file.
  * B: 31 bit. The offset of the source location to the module file
  * containing it.
  * C: The macro bit. We rotate it to the lowest bit so that we can save
  * some
          space in case the index of the module file is 0.
```

(The B and C is the existing raw encoding for source locations)

Another reason to reuse the same slot of the source location is to
reduce the impact of the patch. Since there are a lot of places assuming
we can store and get a source location from a slot. And if I tried to
add another slot, a lot of codes breaks. I don't feel it is worhty.

Another impact of this decision is that, the existing small
optimizations for encoding source location may be invalided. The key of
the optimization is that we can turn large values into small values then
we can use VBR6 format to reduce the size. But if we decided to put the
module file index into the higher bits, then maybe it simply doesn't
work. An example may be the `SourceLocationSequence` optimization.

This will only affect the size of on-disk .pcm files. I don't expect
this impact the speed and memory use of compilations. And seeing my
small experiments above, I feel this trade off is worthy.

The mental model for handling source location offsets is not so complex
and I believe we can solve it by adding module file index to each stored
source location.

For the practical side, since the source location is pretty sensitive,
and the patch can pass all the in-tree tests and a small scale projects,
I feel it should be correct.

I'll continue to work on no transitive decl change and no transitive
identifier change (if matters) to achieve the goal to stop the
propagation of unnecessary changes. But all of this depends on this
patch. Since, clearly, the source locations are the most sensitive
thing.

---

The release nots and documentation will be added seperately.
2024-05-06 13:35:16 +08:00
David Stone
7d913c5ea9
[clang][Modules] Make Module::Requirement a struct (NFC) (#67900)
`Module::Requirement` was defined as a `std::pair<std::string, bool>`.
This required a comment to explain what the data members mean and makes
the usage harder to understand. Replace this with a struct with two
members, `FeatureName` and `RequiredState`.

---------

Co-authored-by: cor3ntin <corentinjabot@gmail.com>
2024-05-05 09:45:04 +02:00
erichkeane
01e91a2dde [OpenACC] Implement copyin, copyout, create clauses for compute construct
Like 'copy', these also have alternate names, so this implements that as
well.  Additionally, these have an optional tag of either 'readonly' or
'zero' depending on the clause.

Otherwise, this is a pretty rote implementation of the clause, as there
aren't any special rules for it.
2024-05-03 07:51:25 -07:00
erichkeane
054f7c0565 [OpenACC] Implement copy clause for compute constructs.
Like present, no_create, and first_private, copy is a clause that takes
just a var-list, and follows the same rules as the others.

The one unique part of this clause is that it ALSO supports two
deprecated/backwards-compatibility spellings, so this patch adds them
and implements them.
2024-05-03 07:20:41 -07:00
erichkeane
bd909d2e6f [OpenACC] Implement no_create and present clauses on compute constructs
These two are, from a semantic checking perspective, identical to
first-private/private/etc, other than appertainment. This patch
implements both.
2024-05-03 06:51:54 -07:00
erichkeane
a13c5140a2 [OpenACC] Implement firstprivate clause for compute constructs
This clause is pretty nearly copy/paste from private, except that it
doesn't support 'loop', and thus 'kernelsloop' for appertainment.
2024-05-03 06:33:35 -07:00
Jan Svoboda
477c705cb0
[clang][modules] Allow including module maps to be non-affecting (#89992)
The dependency scanner only puts top-level affecting module map files on
the command line for explicitly building a module. This is done because
any affecting child module map files should be referenced by the
top-level one, meaning listing them explicitly does not have any meaning
and only makes the command lines longer.

However, a problem arises whenever the definition of an affecting module
lives in a module map that is not top-level. Considering the rules
explained above, such module map file would not make it to the command
line. That's why 83973cf157f7850eb133a4bbfa0f8b7958bad215 started
marking the parents of an affecting module map file as affecting too.
This way, the top-level file does make it into the command line.

This can be problematic, though. On macOS, for example, the Darwin
module lives in "/usr/include/Darwin.modulemap" one of many module map
files included by "/usr/include/module.modulemap". Reporting the parent
on the command line forces explicit builds to parse all the other module
map files included by it, which is not necessary and can get expensive
in terms of file system traffic.

This patch solves that performance issue by stopping marking parent
module map files as affecting, and marking module map files as top-level
whenever they are top-level among the set of affecting files, not among
the set of all known files. This means that the top-level
"/usr/include/module.modulemap" is now not marked as affecting and
"/usr/include/Darwin.modulemap" is.
2024-05-01 09:38:16 -07:00
Erich Keane
fa67986d5b
[OpenACC] Private Clause on Compute Constructs (#90521)
The private clause is the first that takes a 'var-list', thus this has a
lot of additional work to enable the var-list type. A 'var' is a
traditional variable reference, subscript, member-expression, or
array-section, so checking of these is pretty minor.

Note: This ran into some issues with array-sections (aka sub-arrays)
that will be fixed in a follow-up patch.
2024-04-30 11:28:37 -07:00
Chuanqi Xu
d333a0de68 Revert "[Modules] No transitive source location change (#86912)"
This reverts commit 6c3110464bac3600685af9650269b0b2b8669d34.

Required by the post commit comments: https://github.com/llvm/llvm-project/pull/86912
2024-04-30 22:32:02 +08:00
Chuanqi Xu
b2b463bd8f [C++20] [Modules] Add signature to the BMI recording export imported
modules

After https://github.com/llvm/llvm-project/pull/86912,
for the following example,

```
export module A;
export import B;
```

The generated BMI of `A` won't change if the source location in `A`
changes. Further, we plan avoid more such changes.

However, it is slightly problematic since `export import` should
propagate all the changes.

So this patch adds a signature to the BMI of C++20 modules so that we
can propagate the changes correctly.
2024-04-30 16:33:34 +08:00
Chuanqi Xu
6c3110464b
[Modules] No transitive source location change (#86912)
This is part of "no transitive change" patch series, "no transitive
source location change". I talked this with @Bigcheese in the tokyo's
WG21 meeting.

The idea comes from @jyknight posted on LLVM discourse. That for:

```
// A.cppm
export module A;
...

// B.cppm
export module B;
import A;
...

//--- C.cppm
export module C;
import C;
```

Almost every time A.cppm changes, we need to recompile `B`. Due to we
think the source location is significant to the semantics. But it may be
good if we can avoid recompiling `C` if the change from `A` wouldn't
change the BMI of B.

# Motivation Example

This patch only cares source locations. So let's focus on source
location's example. We can see the full example from the attached test.

```
//--- A.cppm
export module A;
export template <class T>
struct C {
    T func() {
        return T(43);
    }
};
export int funcA() {
    return 43;
}

//--- A.v1.cppm
export module A;

export template <class T>
struct C {
    T func() {
        return T(43);
    }
};
export int funcA() {
    return 43;
}

//--- B.cppm
export module B;
import A;

export int funcB() {
    return funcA();
}

//--- C.cppm
export module C;
import A;
export void testD() {
    C<int> c;
    c.func();
}
```

Here the only difference between `A.cppm` and `A.v1.cppm` is that
`A.v1.cppm` has an additional blank line. Then the test shows that two
BMI of `B.cppm`, one specified `-fmodule-file=A=A.pcm` and the other
specified `-fmodule-file=A=A.v1.pcm`, should have the bit-wise same
contents.

However, it is a different story for C, since C instantiates templates
from A, and the instantiation records the source information from module
A, which is different from `A` and `A.v1`, so it is expected that the
BMI `C.pcm` and `C.v1.pcm` can and should differ.

# Internal perspective of status quo

To fully understand the patch, we need to understand how we encodes
source locations and how we serialize and deserialize them.

For source locations, we encoded them as:

```
|
|
| _____ base offset of an imported module
|
|
|
|_____ base offset of another imported module
|
|
|
|
| ___ 0
```

As the diagram shows, we encode the local (unloaded) source location
from 0 to higher bits. And we allocate the space for source locations
from the loaded modules from high bits to 0. Then the source locations
from the loaded modules will be mapped to our source location space
according to the allocated offset.

For example, for,

```
// a.cppm
export module a;
...

// b.cppm
export module b;
import a;
...
```

Assuming the offset of a source location (let's name the location as
`S`) in a.cppm is 45 and we will record the value `45` into the BMI
`a.pcm`. Then in b.cppm, when we import a, the source manager will
allocate a space for module 'a' (according to the recorded number of
source locations) as the base offset of module 'a' in the current source
location spaces. Let's assume the allocated base offset as 90 in this
example. Then when we want to get the location in the current source
location space for `S`, we can get it simply by adding `45` to `90` to
`135`. Finally we can get the source location for `S` in module B as
`135`.

And when we want to write module `b`, we would also write the source
location of `S` as `135` directly in the BMI. And to clarify the
location `S` comes from module `a`, we also need to record the base
offset of module `a`, 90 in the BMI of `b`.

Then the problem comes. Since the base offset of module 'a' is computed
by the number source locations in module 'a'. In module 'b', the
recorded base offset of module 'a' will change every time the number of
source locations in module 'a' increase or decrease. In other words, the
contents of BMI of B will change every time the number of locations in
module 'a' changes. This is pretty sensitive. Almost every change will
change the number of locations. So this is the problem this patch want
to solve.

Let's continue with the existing design to understand what's going on.
Another interesting case is:

```
// c.cppm
export module c;
import whatever;
import a;
import b;
...
```

In `c.cppm`, when we import `a`, we still need to allocate a base
location offset for it, let's say the value becomes to `200` somehow.
Then when we reach the location `S` recorded in module `b`, we need to
translate it into the current source location space. The solution is
quite simple, we can get it by `135 + (200 - 90) = 245`. In another
word, the offset of a source location in current module can be computed
as `Recorded Offset + Base Offset of the its module file - Recorded Base
Offset`.

Then we're almost done about how we handle the offset of source
locations in serializers.

# The high level design of current patch

From the abstract level, what we want to do is to remove the hardcoded
base offset of imported modules and remain the ability to calculate the
source location in a new module unit. To achieve this, we need to be
able to find the module file owning a source location from the encoding
of the source location.

So in this patch, for each source location, we will store the local
offset of the location and the module file index. For the above example,
in `b.pcm`, the source location of `S` will be recorded as `135`
directly. And in the new design, the source location of `S` will be
recorded as `<1, 45>`. Here `1` stands for the module file index of `a`
in module `b`. And `45` means the offset of `S` to the base offset of
module `a`.

So the trade-off here is that, to make the BMI more independent, we need
to record more abstract information. And I feel it is worthy. The
recompilation problem of modules is really annoying and there are still
people complaining this. But if we can make this (including stopping
other changes transitively), I think this may be a killer feature for
modules. And from @Bigcheese , this should be helpful for clang explicit
modules too.

And the benchmarking side, I tested this patch against
https://github.com/alibaba/async_simple/tree/CXX20Modules. No
significant change on compilation time. The size of .pcm files becomes
to 204M from 200M. I think the trade-off is pretty fair.

# Some low level details

I didn't use another slot to record the module file index. I tried to
use the higher 32 bits of the existing source location encodings to
store that information. This design may be safe. Since we use `unsigned`
to store source locations but we use uint64_t in serialization. And
generally `unsigned` is 32 bit width in most platforms. So it might not
be a safe problem. Since all the bits we used to store the module file
index is not used before. So the new encodings may be:

```
   |-----------------------|-----------------------|
   |           A           |         B         | C |

  * A: 32 bit. The index of the module file in the module manager + 1. The +1
          here is necessary since we wish 0 stands for the current module file.
  * B: 31 bit. The offset of the source location to the module file containing it.
  * C: The macro bit. We rotate it to the lowest bit so that we can save some 
          space in case the index of the module file is 0.
```

(The B and C is the existing raw encoding for source locations)

Another reason to reuse the same slot of the source location is to
reduce the impact of the patch. Since there are a lot of places assuming
we can store and get a source location from a slot. And if I tried to
add another slot, a lot of codes breaks. I don't feel it is worhty.

Another impact of this decision is that, the existing small
optimizations for encoding source location may be invalided. The key of
the optimization is that we can turn large values into small values then
we can use VBR6 format to reduce the size. But if we decided to put the
module file index into the higher bits, then maybe it simply doesn't
work. An example may be the `SourceLocationSequence` optimization.

This will only affect the size of on-disk .pcm files. I don't expect
this impact the speed and memory use of compilations. And seeing my
small experiments above, I feel this trade off is worthy.

# Correctness

The mental model for handling source location offsets is not so complex
and I believe we can solve it by adding module file index to each stored
source location.

For the practical side, since the source location is pretty sensitive,
and the patch can pass all the in-tree tests and a small scale projects,
I feel it should be correct.

# Future Plans

I'll continue to work on no transitive decl change and no transitive
identifier change (if matters) to achieve the goal to stop the
propagation of unnecessary changes. But all of this depends on this
patch. Since, clearly, the source locations are the most sensitive
thing.

---

The release nots and documentation will be added seperately.
2024-04-30 15:57:58 +08:00
Chuanqi Xu
38067c50a9 [C++20] [Modules] [Reduced BMI] Avoid force writing static declarations
within module purview

Close https://github.com/llvm/llvm-project/issues/90259

Technically, the static declarations shouldn't be leaked from the module
interface, otherwise it is an illegal program according to the spec. So
we can get rid of the static declarations from the reduced BMI
technically. Then we can close the above issue.

However, there are too many `static inline` codes in existing headers.
So it will be a pretty big breaking change if we do this globally.
2024-04-30 11:34:34 +08:00
Chuanqi Xu
d86cc73bbf [NFC] [Serialization] Avoid using DeclID directly as much as possible
This patch tries to remove all the direct use of DeclID except the real
low level reading and writing. All the use of DeclID is converted to
the use of LocalDeclID or GlobalDeclID. This is helpful to increase the
readability and type safety.
2024-04-25 14:59:09 +08:00
Chuanqi Xu
72b58146b1 Revert "[NFC] [Serialization] Avoid using DeclID directly as much as possible"
This reverts commit 42070a5c092ed420bf92ebf38229c594885e94c7.

I forgot to touch lldb.
2024-04-25 14:26:07 +08:00
Chuanqi Xu
42070a5c09 [NFC] [Serialization] Avoid using DeclID directly as much as possible
This patch tries to remove all the direct use of DeclID except the real
low level reading and writing. All the use of DeclID is converted to
the use of LocalDeclID or GlobalDeclID. This is helpful to increase the
readability and type safety.
2024-04-25 14:14:05 +08:00
Chuanqi Xu
c2a98fdeb3
[NFC] Move DeclID from serialization/ASTBitCodes.h to AST/DeclID.h (#89873)
Previously, the DeclID is defined in serialization/ASTBitCodes.h under
clang::serialization namespace. However, actually the DeclID is not
purely used in serialization part. The DeclID is already widely used in
AST and all around the clang project via classes like `LazyPtrDecl` or
calling `ExternalASTSource::getExernalDecl()`. All such uses are via the
raw underlying type of `DeclID` as `uint32_t`. This is not pretty good.

This patch moves the DeclID class family to a new header `AST/DeclID.h`
so that the whole project can use the wrapped class `DeclID`,
`GlobalDeclID` and `LocalDeclID` instead of the raw underlying type.
This can improve the readability and the type safety.
2024-04-25 13:53:22 +08:00
Jan Svoboda
d609029d6c
[clang][modules] Allow module maps with textual headers to be non-affecting (#89441)
When writing out a PCM, we skip serializing headers' `HeaderFileInfo`
struct whenever this condition evaluates to `true`:

```c++
!HFI || (HFI->isModuleHeader && !HFI->isCompilingModuleHeader)
```

However, when Clang parses a module map file, each textual header gets a
`HFI` with `isModuleHeader=false`, `isTextualModuleHeader=true` and
`isCompilingModuleHeader=false`. This means the condition evaluates to
`false` even if the header was never included and the module map did not
affect the compilation. Each PCM file that happened to parse such module
map then contains a copy of the `HeaderFileInfo` struct for all textual
headers, and considers the containing module map affecting.

This patch makes it so that we skip headers that have not been included,
essentially removing the virality of textual headers when it comes to
PCM serialization.
2024-04-24 09:05:56 -07:00
Vlad Serebrennikov
662ef86042 [clang][NFC] Remove useless code in ASTWriter
A follow-up to #71709, addressing the static analysis finding reported in https://github.com/llvm/llvm-project/pull/71709/files#r1576846306
2024-04-24 11:09:06 +03:00
Pranav Kant
e1321fafbc Revert "Reapply "[Clang][Sema] placement new initializes typedef array with correct size (#83124)" (#89036)"
This reverts commit 74cab546825b32f24e44d69942cdbdd129160471.
2024-04-23 22:08:50 +00:00
mahtohappy
74cab54682
Reapply "[Clang][Sema] placement new initializes typedef array with correct size (#83124)" (#89036)
When in-place new-ing a local variable of an array of trivial type, the
generated code calls 'memset' with the correct size of the array,
earlier it was generating size (squared of the typedef array + size).

The cause: typedef TYPE TArray[8]; TArray x; The type of declarator is
Tarray[8] and in SemaExprCXX.cpp::BuildCXXNew we check if it's of
typedef and of constant size then we get the original type and it works
fine for non-dependent cases.
But in case of template we do TreeTransform.h:TransformCXXNEWExpr and
there we again check the allocated type which is TArray[8] and it stays
that way, so ArraySize=(Tarray[8] type, alloc Tarray[8*type]) so the
squared size allocation.

ArraySize gets calculated earlier in TreeTransform.h so that
if(!ArraySize) condition was failing.
fix: I changed that condition to if(ArraySize).
fixes https://github.com/llvm/llvm-project/issues/41441

---------

Co-authored-by: erichkeane <ekeane@nvidia.com>
2024-04-23 07:37:07 -07:00
Chuanqi Xu
b467c6b536 [NFC] [Serialization] Turn type alias GlobalDeclID into a class
Succsessor of b8e3b2ad66cf78ad2b. This patch also converts the type
alias GlobalDeclID to a class to improve the readability and type
safety.
2024-04-23 17:52:58 +08:00
Chuanqi Xu
07b1177eed [NFC] [Serialization] Use semantical type DeclID instead of raw type 'uint32_t'
This patch tries to use DeclID in the code bases to avoid use the raw
type 'uint32_t'. It is problematic to use the raw type 'uint32_t' if we
want to change the type of DeclID some day.
2024-04-23 12:44:00 +08:00
Erich Keane
dc20a0ea1f
[OpenACC] Implement 'num_gangs' sema for compute constructs (#89460)
num_gangs takes an 'int-expr-list', for 'parallel', and an 'int-expr'
for 'kernels'. This patch changes the parsing to always parse it as an
'int-expr-list', then correct the expression count during Sema. It also
implements the rest of the semantic analysis changes for this clause.
2024-04-22 08:57:25 -07:00
Jan Svoboda
3ea5dff0ef
[clang][modules] Only avoid pruning module maps when asked to (#89428)
Pruning non-affecting module maps is useful even when passing module
maps explicitly via `-fmodule-map-file=<path>`. For this situation, this
patch reinstates the behavior we had prior to #87849. For the situation
where the explicit module map file arguments were generated by the
dependency scanner (which already pruned the non-affecting ones), this
patch introduces new `-cc1` flag
`-fno-modules-prune-non-affecting-module-map-files` that avoids the
extra work.
2024-04-19 12:45:40 -07:00
Erich Keane
b8adf169bb [OpenACC] Implement 'vector_length' clause On compute constructs
The 'vector_length' clause is semantically identical to the
'num_workers' clause, in that it takes a mandatory single int-expr. This
is implemented identically to it.
2024-04-18 13:27:42 -07:00
Erich Keane
76600aee9d
[OpenACC] Implement 'num_workers' clause for compute constructs (#89151)
This clause just takes an 'int expr', which is not optional. This patch
implements the clause on compute constructs.
2024-04-18 12:42:22 -07:00
Jie Fu
609ee9fe6a [clang] Remove unused lambda capture (NFC)
llvm-project/clang/lib/Serialization/ASTWriter.cpp:5077:59:
error: lambda capture 'SemaRef' is not used [-Werror,-Wunused-lambda-capture]
    auto AddEmittedDeclRefOrZero = [this, &SemaDeclRefs, &SemaRef](Decl *D) {
                                                       ~~~^~~~~~~
1 error generated.
2024-04-18 15:49:52 +08:00
Chuanqi Xu
5d4e072a25 [C++20] [Modules] [Reduced BMI] Don't eagerly write static entities in
module purview

For,

```
export module A;
static int impl() { ... }
export int func() { return impl(); }
```

Previously, even with reduced BMI, the function `impl` will be emitted
into the BMI. After the patch, the static entities in module purview
won't get emitted eagerly. Now the static entities may only be emitted
if required.

Note that, this restriction is actually more relaxed than the language
standard required. The language spec said, the program is ill-formed if
any TU-local entities get exposed. However, we can't do this since there
are many static entities in the headers of existing libraries.
Forbidding that will cause many existing program fail immediately.

Another note here is, we can't do this for non-static non-exported
entities, they can be used for other module units within the same
module.
2024-04-18 15:38:02 +08:00
Chuanqi Xu
7ec342ba16 [C++20] [Modules] [Reduced BMI] Write Special Decl Lazily
ASTWrite would write some special records for the consumer of the AST
can get the information easily. This is fine. But currently all the
special records are written eagerly, which is conflicting with reduced
BMI.

In reduced BMI, we hope to write things as less as possible. It is not a
goal for reduced BMI to retain all the informations. So in this patch we
won't record the special declarations eagerly before we starting write.
But only writing the sepcial records after we writes decls and types,
then only the reached declarations will be collected.
2024-04-18 14:35:38 +08:00
Chuanqi Xu
83cc60565a [NFC] [Serialization] Extract logics to write special decls from
ASTWriter::WriteASTCore

ASTWriter::WriteASTCore is a big function and slightly hard to read.
This patch extracts the logics to force emitting special declarations
and writing special records for these declarations into member
functions.

This is helpful to improve the readability and also helpful for the
following patch to write special declarations lazily.
2024-04-18 14:34:24 +08:00
Chuanqi Xu
f2695a1c2f [C++20] [Modules] Avoid writing untouched DeclUpdates from GMF in
Reduced BMI

Mitigate https://github.com/llvm/llvm-project/issues/61447

The root cause of the above problem is that when we write a declaration,
we need to lookup all the redeclarations in the imported modules. Then
it will be pretty slow if there are too many redeclarations in different
modules. This patch doesn't solve the porblem.

What the patchs mitigated is, when we writing a named module, we shouldn't
write the declarations from GMF if it is unreferenced **in current
module unit**. The difference here is that, if the declaration is used
in the imported modules, we used to emit it as an update. But we
definitely want to avoid that after this patch.

For that reproducer in
https://github.com/llvm/llvm-project/issues/61447, it used to take 2.5s
to compile and now it only takes 0.49s to compile, which is a big win.
2024-04-18 11:00:28 +08:00
Erich Keane
6133878227
[OpenACC] Implement self clause for compute constructs (#88760)
`self` clauses on compute constructs take an optional condition
expression. We again limit the implementation to ONLY compute constructs
to ensure we get all the rules correct for others. However, this one
will be particularly complicated, as it takes a `var-list` for `update`,
so when we get to that construct/clause combination, we need to do that
as well.

This patch also furthers uses of the `OpenACCClauses.def` as it became
useful while implementing this (as well as some other minor refactors as
I went through).

Finally, `self` and `if` clauses have an interaction with each other, if
an `if` clause evaluates to `true`, the `self` clause has no effect.
While this is intended and can be used 'meaningfully', we are warning on
this with a very granular warning, so that this edge case will be
noticed by newer users, but can be disabled trivially.
2024-04-16 06:57:36 -07:00
Chuanqi Xu
d26dd58ca5 [StmtProfile] Don't profile the body of lambda expressions
Close https://github.com/llvm/llvm-project/issues/87609

We tried to profile the body of the lambda expressions in
https://reviews.llvm.org/D153957. But as the original comments show,
it is indeed dangerous. After we tried to skip calculating the ODR
hash values recently, we have fall into this trap twice.

So in this patch, I choose to not profile the body of the lambda
expression. The signature of the lambda is still profiled.
2024-04-16 15:41:26 +08:00
Vlad Serebrennikov
0a6f6df5b0
[clang] Introduce SemaCUDA (#88559)
This patch moves CUDA-related `Sema` function into new `SemaCUDA` class,
following the recent example of SYCL, OpenACC, and HLSL. This is a part
of the effort to split Sema. Additional context can be found in
https://github.com/llvm/llvm-project/pull/82217,
https://github.com/llvm/llvm-project/pull/84184,
https://github.com/llvm/llvm-project/pull/87634.
2024-04-13 08:54:25 +04:00
Erich Keane
daa88364df
[OpenACC] Implement 'if' clause for Compute Constructs (#88411)
Like with the 'default' clause, this is being applied to only Compute
Constructs for now. The 'if' clause takes a condition expression which
is used as a runtime value.

This is not a particularly complex semantic implementation, as there
isn't much to this clause, other than its interactions with 'self',
  which will be managed in the patch to implement that.
2024-04-12 14:13:31 -07:00
Chuanqi Xu
f21ead0675
[C++20] [Modules] [Reduced BMI] Remove unreachable decls GMF in redued BMI (#88359)
Following of https://github.com/llvm/llvm-project/pull/76930

This follows the idea of "only writes what we writes", which I think is
the most natural and efficient way to implement this optimization.

We start writing the BMI from the first declaration in module purview
instead of the global module fragment, so that everything in the GMF
untouched won't be written in the BMI naturally.

The exception is, as I said in
https://github.com/llvm/llvm-project/pull/76930, when we write a
declaration we need to write its decl context, and when we write the
decl context, we need to write everything from it. So when we see
`std::vector`, we basically need to write everything under namespace
std. This violates our intention. To fix this, this patch delays the
writing of namespace in the GMF.

From my local measurement, the size of the BMI decrease to 90M from 112M
for a local modules build. I think this is significant.

This feature will be covered under the experimental reduced BMI so that
it won't affect any existing users. So I'd like to land this when the CI
gets green.

Documents will be added seperately.
2024-04-12 12:51:58 +08:00
Jan Svoboda
84df7a09f8
[clang][modules] Do not resolve HeaderFileInfo externally in ASTWriter (#87848)
Clang uses the `HeaderFileInfo` struct to track bits of information on
header files, which gets used throughout the compiler. We also use this
to compute the set of affecting module maps in `ASTWriter` and in the
end serialize the information into the `HEADER_SEARCH_TABLE` record of a
PCM file, allowing clients to learn about headers from the module. In
doing so, Clang asks for existing `HeaderFileInfo` for all known
`FileEntries`. Note that this asks the loaded PCM files for the
information they have on each header file in question. This seems
unnecessary: we only want to serialize information on header files that
either belong to the current module or that got included textually.
Loaded PCM files can't provide us with any useful information.

For explicit modules with lazy loading (using `-fmodule-map-file=<path>`
with `-fmodule-file=<name>=<path>`) the compiler knows about header
files listed in the module map files on the command-line. This can be a
large number.

Asking for existing `HeaderFileInfo` can trigger deserialization of
`HEADER_SEARCH_TABLE` from loaded PCM files. Keys of the on-disk hash
table consist of the header file size and modification time. However,
with explicit modules Clang zeroes out the modification time. Moreover,
if you import lots of modules, some of their header files end up having
identical sizes. This means lots of hash collisions that can only be
resolved by running the serialized filename through `FileManager` and
comparing equality of the `FileEntry`. This ends up being super
expensive, essentially re-stating lots of the transitively loaded SDK
header files.

This patch cleans up the API for getting `HeaderFileInfo` and makes sure
`ASTWriter` uses the version that doesn't ask loaded PCM files for more
information. This removes the excessive stat traffic coming from
`ASTWriter` hopefully without changing observable behavior.
2024-04-11 14:44:55 -07:00