llvm-project

Author	SHA1	Message	Date
erichkeane	8ef2011b2c	Reapply "[OpenACC] device_type clause Sema for Compute constructs" device_type, also spelled as dtype, specifies the applicability of the clauses following it, and takes a series of identifiers representing the architectures it applies to. As we don't have a source for the valid architectures yet, this patch just accepts all. Semantically, this also limits the list of clauses that can be applied after the device_type, so this implements that as well. This reverts commit 06f04b2e27f2586d3db2204ed4e54f8b78fea74e. This reapplies commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d. The build failures were caused by the patch depending on the order of evaluation of arguments to a function. This reapplication separates out the capture of one of the values.	2024-05-13 10:29:43 -07:00
erichkeane	06f04b2e27	Revert "[OpenACC] device_type clause Sema for Compute constructs" This reverts commit c4a9a374749deb5f2a932a7d4ef9321be1b2ae5d. This and the followup patch keep hitting an assert I wrote on the build bots in a way that isn't clear. Reverting so I can fix it without a rush.	2024-05-13 08:40:43 -07:00
erichkeane	c4a9a37474	[OpenACC] device_type clause Sema for Compute constructs device_type, also spelled as dtype, specifies the applicability of the clauses following it, and takes a series of identifiers representing the architectures it applies to. As we don't have a source for the valid architectures yet, this patch just accepts all. Semantically, this also limits the list of clauses that can be applied after the device_type, so this implements that as well.	2024-05-13 07:50:19 -07:00
Chuanqi Xu	e74a34b693	[NFC] [Serialization] Merge IdentID with IdentifierID In ASTBitCodes.h, there are two type aliases for the ID type of identifiers with the same underlying type, which is confusing. This patch merges `IdentID` into `IdentifierID` to remove that confusion.	2024-05-13 14:05:02 +08:00
erichkeane	b1b465218d	[OpenACC] 'wait' clause for compute construct sema 'wait' takes a few int-exprs (well, a series of async-arguments, but those are effectively just an int-expr), plus a pair of tags. This patch adds the support for this to the AST, and does the appropriate semantic analysis for them.	2024-05-09 06:44:12 -07:00
Ellis Hoag	2ad6917c4c	[modules] Accept equivalent module caches from different symlink (#90925 ) Use `VFS.equivalent()`, which follows symlinks, to check if two module cache paths are equivalent. This prevents a PCH error when building from a different path that is a symlink of the original. ``` error: PCH was compiled with module cache path '/home/foo/blah/ModuleCache/2IBP1TNT8OR8D', but the path is currently '/data/users/foo/blah/ModuleCache/2IBP1TNT8OR8D' 1 error generated. ```	2024-05-07 13:55:44 -07:00
erichkeane	30cfe2b2ac	[OpenACC] Implement 'async' clause sema for compute constructs This is a pretty simple clause, it takes an 'async-argument', which effectively needs to be just parsed as an 'int' argument, since it can be an arbitrary integer at runtime (and negative values are legal for implementation defined values). This patch also cleans up the async-argument parsing, so 'wait' got some minor quality-of-life improvements for parsing (both clause and construct).	2024-05-07 07:14:14 -07:00
erichkeane	48c8a5791a	[OpenACC] Implement 'deviceptr' and 'attach' sema for compute constructs These two are very similar to the other 'var-list' variants, except they require that the type of the variable be a pointer. This patch implements that restriction.	2024-05-06 09:29:04 -07:00
Chuanqi Xu	947b062823	Reland "[Modules] No transitive source location change (#86912 )" This relands 6c31104. The patch was reverted because it introduced an incorrect alignment, and it is re-committed after fixing the alignment issue. The original message follows: This is part of the "no transitive change" patch series: "no transitive source location change". I discussed this with @Bigcheese at the WG21 meeting in Tokyo. The idea comes from a post by @jyknight on the LLVM Discourse. Given: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import B; ``` Almost every time A.cppm changes, we need to recompile `B`, because we treat source locations as significant to the semantics. But it would be good if we could avoid recompiling `C` when the change to `A` doesn't change the BMI of `B`. This patch only concerns source locations, so let's focus on a source location example. The full example is in the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. The test then shows that the two BMIs of `B.cppm`, one built with `-fmodule-file=A=A.pcm` and the other with `-fmodule-file=A=A.v1.pcm`, should have bit-wise identical contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records source information from module A, which differs between `A` and `A.v1`, so it is expected that the BMIs `C.pcm` and `C.v1.pcm` can and should differ.
To fully understand the patch, we need to understand how we encode source locations and how we serialize and deserialize them. Source locations are encoded as: ``` | | | _____ base offset of an imported module | | | |_____ base offset of another imported module | | | | | ___ 0 ``` As the diagram shows, we encode local (unloaded) source locations from 0 upwards, and we allocate the space for source locations from loaded modules from the high end down towards 0. The source locations from the loaded modules are then mapped into our source location space according to the allocated offset. For example, given: ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assume the offset of a source location (call it `S`) in a.cppm is 45, so we record the value `45` in the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager allocates a space for module 'a' (according to the recorded number of source locations) and uses it as the base offset of module 'a' in the current source location space. Let's assume the allocated base offset is 90 in this example. When we want the location of `S` in the current source location space, we get it simply by adding `45` to `90`, which gives `135`. So the source location of `S` in module B is `135`. When we write module `b`, we also write the source location of `S` as `135` directly in the BMI. And to make clear that the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90, in the BMI of `b`. This is where the problem arises. The base offset of module 'a' is computed from the number of source locations in module 'a', so the base offset of module 'a' recorded in module 'b' changes every time the number of source locations in module 'a' increases or decreases. In other words, the contents of the BMI of B change every time the number of locations in module 'a' changes. This is pretty sensitive.
Almost every change will change the number of locations. So this is the problem this patch wants to solve. Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it; let's say the value is now `200`. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple: we can get it by `135 + (200 - 90) = 245`. In other words, the offset of a source location in the current module can be computed as `Recorded Offset + Base Offset of its module file - Recorded Base Offset`. That nearly completes the picture of how the serializers handle source location offsets. At an abstract level, what we want to do is remove the hardcoded base offset of imported modules while retaining the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we store the local offset of the location and the module file index. For the above example, in the existing design the source location of `S` is recorded in `b.pcm` as `135` directly. In the new design, the source location of `S` is recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`, and `45` is the offset of `S` relative to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. I think it is worth it. The recompilation problem of modules is really annoying and people still complain about it. If we can make this work (including stopping other changes from propagating transitively), I think this may be a killer feature for modules.
According to @Bigcheese, this should be helpful for clang explicit modules too. On the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. There was no significant change in compilation time. The size of the .pcm files grows from 200M to 204M. I think the trade-off is pretty fair. I didn't use another slot to record the module file index. Instead, I used the higher 32 bits of the existing source location encoding to store that information. This design should be safe, since we use `unsigned` to store source locations but `uint64_t` in serialization, and `unsigned` is 32 bits wide on most platforms. So safety should not be a problem, since none of the bits used to store the module file index were used before. So the new encoding may be: ``` |-----------------------|-----------------------| | A | B | C | * A: 32 bit. The index of the module file in the module manager + 1. The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save some space in case the index of the module file is 0. ``` (B and C are the existing raw encoding for source locations.) Another reason to reuse the same slot of the source location is to reduce the impact of the patch, since a lot of places assume we can store and get a source location from a single slot. When I tried to add another slot, a lot of code broke, which I don't think is worth it. Another impact of this decision is that the existing small optimizations for encoding source locations may be invalidated. The key to these optimizations is turning large values into small ones so that the VBR6 format can reduce the size. But if we put the module file index into the higher bits, that may simply no longer work. An example may be the `SourceLocationSequence` optimization.
This will only affect the size of on-disk .pcm files. I don't expect this to impact the speed or memory use of compilations. And given my small experiments above, I think this trade-off is worthwhile. The mental model for handling source location offsets is not so complex and I believe we can solve it by adding the module file index to each stored source location. On the practical side, since source locations are pretty sensitive, and the patch passes all the in-tree tests and a small-scale project, I feel it should be correct. I'll continue to work on no transitive decl change and no transitive identifier change (if it matters) to achieve the goal of stopping the propagation of unnecessary changes. But all of this depends on this patch, since source locations are clearly the most sensitive thing. --- The release notes and documentation will be added separately.	2024-05-06 13:35:16 +08:00
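The A/B/C layout described in the reland message above can be sketched in a few lines of C++. This is only an illustration under stated assumptions: `DecodedLoc`, `encodeLoc`, and `decodeLoc` are hypothetical names and are not Clang's actual serialization API; only the field widths and their order come from the commit message.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical illustration type; not part of Clang.
struct DecodedLoc {
  uint32_t ModuleFileIndexPlusOne; // field A: 0 means "the current module file"
  uint32_t Offset;                 // field B: 31-bit offset within the owning module file
  bool IsMacro;                    // field C: the macro bit
};

// Pack as | A (32 bits) | B (31 bits) | C (1 bit) |, with C in the lowest bit.
inline uint64_t encodeLoc(const DecodedLoc &L) {
  assert(L.Offset < (1u << 31) && "offset must fit in 31 bits");
  return (static_cast<uint64_t>(L.ModuleFileIndexPlusOne) << 32) |
         (static_cast<uint64_t>(L.Offset) << 1) |
         static_cast<uint64_t>(L.IsMacro);
}

// Split a packed value back into its three fields.
inline DecodedLoc decodeLoc(uint64_t Raw) {
  return DecodedLoc{static_cast<uint32_t>(Raw >> 32),
                    static_cast<uint32_t>((Raw >> 1) & 0x7FFFFFFFu),
                    (Raw & 1u) != 0};
}
```

With this layout, a non-macro location at offset 45 in the current module file (A = 0) packs to the small value 90, which stays cheap under a variable-width format such as VBR6, while the `<1, 45>` example from the message sets bit 32 and becomes a large value. That matches the message's remark that the small-value optimizations may stop helping for imported locations.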
Younan Zhang	7a484d3a1f	[clang] Distinguish unresolved templates in UnresolvedLookupExpr (#89019 ) This patch revolves around the misuse of UnresolvedLookupExpr in BuildTemplateIdExpr. Basically, we build up an UnresolvedLookupExpr not only for function overloads but for "unresolved" templates wherever we need an expression for template decls. For example, a dependent VarTemplateDecl can be wrapped with such an expression before template instantiation. (See `617007240c`) Also, one important thing is that UnresolvedLookupExpr uses a "canonical" QualType to describe the containing unresolved decls: a DependentTy is for dependent expressions and an OverloadTy otherwise. Therefore, this modeling for non-dependent templates leaves a problem in that the expression is marked and perceived as if describing overload functions. The consumer then expects functions for every such expression, although the fact is the reverse. Hence, we run into crashes. As to the patch, I added a new canonical type "UnresolvedTemplateTy" to model these cases. Given that we have been using this model (intentionally or accidentally) and it is pretty baked in throughout the code, I think extending the role of UnresolvedLookupExpr is reasonable. Further, I added some diagnostics for the direct occurrence of these expressions, which are supposed to be ill-formed. As a bonus, this patch also fixes some typos in the diagnostics and creates RecoveryExprs rather than nothing in the hope of a better error-recovery for clangd. Fixes https://github.com/llvm/llvm-project/issues/88832 Fixes https://github.com/llvm/llvm-project/issues/63243 Fixes https://github.com/llvm/llvm-project/issues/48673	2024-05-05 11:38:49 +08:00
erichkeane	01e91a2dde	[OpenACC] Implement copyin, copyout, create clauses for compute construct Like 'copy', these also have alternate names, so this implements that as well. Additionally, these have an optional tag of either 'readonly' or 'zero' depending on the clause. Otherwise, this is a pretty rote implementation of the clause, as there aren't any special rules for it.	2024-05-03 07:51:25 -07:00
erichkeane	054f7c0565	[OpenACC] Implement copy clause for compute constructs. Like present, no_create, and first_private, copy is a clause that takes just a var-list, and follows the same rules as the others. The one unique part of this clause is that it ALSO supports two deprecated/backwards-compatibility spellings, so this patch adds them and implements them.	2024-05-03 07:20:41 -07:00
erichkeane	bd909d2e6f	[OpenACC] Implement no_create and present clauses on compute constructs These two are, from a semantic checking perspective, identical to first-private/private/etc, other than appertainment. This patch implements both.	2024-05-03 06:51:54 -07:00
erichkeane	a13c5140a2	[OpenACC] Implement firstprivate clause for compute constructs This clause is pretty nearly copy/paste from private, except that it doesn't support 'loop', and thus 'kernelsloop' for appertainment.	2024-05-03 06:33:35 -07:00
Ivan Murashko	9a9cff15a1	[Modules] Process include files changes (#90319 ) There were two diffs that introduced some options useful when you build modules externally and cannot rely on file modification time as the key for detecting input file changes: - [D67249](https://reviews.llvm.org/D67249) introduced the `-fmodules-validate-input-files-content` option, which allows the use of file content hash in addition to the modification time. - [D141632](https://reviews.llvm.org/D141632) propagated the use of `-fno-pch-timestamps` with Clang modules. There is a problem when the size of the input file (header) is not modified but the content is. In this case, Clang cannot detect the file change when the `-fno-pch-timestamps` option is used. The `-fmodules-validate-input-files-content` option should help, but there is an issue with its application: it's not applied when the modification time is stored as zero, which is the case for `-fno-pch-timestamps`. The issue can be fixed using the same trick that was applied during the processing of `ForceCheckCXX20ModulesInputFiles`: ``` // When ForceCheckCXX20ModulesInputFiles and ValidateASTInputFilesContent // enabled, it is better to check the contents of the inputs. Since we can't // get correct modified time information for inputs from overridden inputs. if (HSOpts.ForceCheckCXX20ModulesInputFiles && ValidateASTInputFilesContent && F.StandardCXXModule && FileChange.Kind == Change::None) FileChange = HasInputContentChanged(FileChange); ``` The patch proposes a solution similar to the one presented above and includes a LIT test to verify it.	2024-05-01 09:07:57 +01:00
Erich Keane	fa67986d5b	[OpenACC] Private Clause on Compute Constructs (#90521 ) The private clause is the first that takes a 'var-list', thus this has a lot of additional work to enable the var-list type. A 'var' is a traditional variable reference, subscript, member-expression, or array-section, so checking of these is pretty minor. Note: This ran into some issues with array-sections (aka sub-arrays) that will be fixed in a follow-up patch.	2024-04-30 11:28:37 -07:00
Chuanqi Xu	d333a0de68	Revert "[Modules] No transitive source location change (#86912 )" This reverts commit 6c3110464bac3600685af9650269b0b2b8669d34. Required by the post commit comments: https://github.com/llvm/llvm-project/pull/86912	2024-04-30 22:32:02 +08:00
Chuanqi Xu	6c3110464b	[Modules] No transitive source location change (#86912 ) This is part of the "no transitive change" patch series: "no transitive source location change". I discussed this with @Bigcheese at the WG21 meeting in Tokyo. The idea comes from a post by @jyknight on the LLVM Discourse. Given: ``` // A.cppm export module A; ... // B.cppm export module B; import A; ... //--- C.cppm export module C; import B; ``` Almost every time A.cppm changes, we need to recompile `B`, because we treat source locations as significant to the semantics. But it would be good if we could avoid recompiling `C` when the change to `A` doesn't change the BMI of `B`. # Motivation Example This patch only concerns source locations, so let's focus on a source location example. The full example is in the attached test. ``` //--- A.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- A.v1.cppm export module A; export template <class T> struct C { T func() { return T(43); } }; export int funcA() { return 43; } //--- B.cppm export module B; import A; export int funcB() { return funcA(); } //--- C.cppm export module C; import A; export void testD() { C<int> c; c.func(); } ``` Here the only difference between `A.cppm` and `A.v1.cppm` is that `A.v1.cppm` has an additional blank line. The test then shows that the two BMIs of `B.cppm`, one built with `-fmodule-file=A=A.pcm` and the other with `-fmodule-file=A=A.v1.pcm`, should have bit-wise identical contents. However, it is a different story for C, since C instantiates templates from A, and the instantiation records source information from module A, which differs between `A` and `A.v1`, so it is expected that the BMIs `C.pcm` and `C.v1.pcm` can and should differ. # Internal perspective of status quo To fully understand the patch, we need to understand how we encode source locations and how we serialize and deserialize them.
Source locations are encoded as: ``` | | | _____ base offset of an imported module | | | |_____ base offset of another imported module | | | | | ___ 0 ``` As the diagram shows, we encode local (unloaded) source locations from 0 upwards, and we allocate the space for source locations from loaded modules from the high end down towards 0. The source locations from the loaded modules are then mapped into our source location space according to the allocated offset. For example, given: ``` // a.cppm export module a; ... // b.cppm export module b; import a; ... ``` Assume the offset of a source location (call it `S`) in a.cppm is 45, so we record the value `45` in the BMI `a.pcm`. Then in b.cppm, when we import a, the source manager allocates a space for module 'a' (according to the recorded number of source locations) and uses it as the base offset of module 'a' in the current source location space. Let's assume the allocated base offset is 90 in this example. When we want the location of `S` in the current source location space, we get it simply by adding `45` to `90`, which gives `135`. So the source location of `S` in module B is `135`. When we write module `b`, we also write the source location of `S` as `135` directly in the BMI. And to make clear that the location `S` comes from module `a`, we also need to record the base offset of module `a`, 90, in the BMI of `b`. This is where the problem arises. The base offset of module 'a' is computed from the number of source locations in module 'a', so the base offset of module 'a' recorded in module 'b' changes every time the number of source locations in module 'a' increases or decreases. In other words, the contents of the BMI of B change every time the number of locations in module 'a' changes. This is pretty sensitive. Almost every change will change the number of locations. So this is the problem this patch wants to solve.
Let's continue with the existing design to understand what's going on. Another interesting case is: ``` // c.cppm export module c; import whatever; import a; import b; ... ``` In `c.cppm`, when we import `a`, we still need to allocate a base location offset for it; let's say the value is now `200`. Then when we reach the location `S` recorded in module `b`, we need to translate it into the current source location space. The solution is quite simple: we can get it by `135 + (200 - 90) = 245`. In other words, the offset of a source location in the current module can be computed as `Recorded Offset + Base Offset of its module file - Recorded Base Offset`. That nearly completes the picture of how the serializers handle source location offsets. # The high level design of current patch At an abstract level, what we want to do is remove the hardcoded base offset of imported modules while retaining the ability to calculate the source location in a new module unit. To achieve this, we need to be able to find the module file owning a source location from the encoding of the source location. So in this patch, for each source location, we store the local offset of the location and the module file index. For the above example, in the existing design the source location of `S` is recorded in `b.pcm` as `135` directly. In the new design, the source location of `S` is recorded as `<1, 45>`. Here `1` stands for the module file index of `a` in module `b`, and `45` is the offset of `S` relative to the base offset of module `a`. So the trade-off here is that, to make the BMI more independent, we need to record more abstract information. I think it is worth it. The recompilation problem of modules is really annoying and people still complain about it. If we can make this work (including stopping other changes from propagating transitively), I think this may be a killer feature for modules. According to @Bigcheese, this should be helpful for clang explicit modules too.
On the benchmarking side, I tested this patch against https://github.com/alibaba/async_simple/tree/CXX20Modules. There was no significant change in compilation time. The size of the .pcm files grows from 200M to 204M. I think the trade-off is pretty fair. # Some low level details I didn't use another slot to record the module file index. Instead, I used the higher 32 bits of the existing source location encoding to store that information. This design should be safe, since we use `unsigned` to store source locations but `uint64_t` in serialization, and `unsigned` is 32 bits wide on most platforms. So safety should not be a problem, since none of the bits used to store the module file index were used before. So the new encoding may be: ``` |-----------------------|-----------------------| | A | B | C | * A: 32 bit. The index of the module file in the module manager + 1. The +1 here is necessary since we wish 0 stands for the current module file. * B: 31 bit. The offset of the source location to the module file containing it. * C: The macro bit. We rotate it to the lowest bit so that we can save some space in case the index of the module file is 0. ``` (B and C are the existing raw encoding for source locations.) Another reason to reuse the same slot of the source location is to reduce the impact of the patch, since a lot of places assume we can store and get a source location from a single slot. When I tried to add another slot, a lot of code broke, which I don't think is worth it. Another impact of this decision is that the existing small optimizations for encoding source locations may be invalidated. The key to these optimizations is turning large values into small ones so that the VBR6 format can reduce the size. But if we put the module file index into the higher bits, that may simply no longer work. An example may be the `SourceLocationSequence` optimization. This will only affect the size of on-disk .pcm files.
I don't expect this to impact the speed or memory use of compilations. And given my small experiments above, I think this trade-off is worthwhile. # Correctness The mental model for handling source location offsets is not so complex and I believe we can solve it by adding the module file index to each stored source location. On the practical side, since source locations are pretty sensitive, and the patch passes all the in-tree tests and a small-scale project, I feel it should be correct. # Future Plans I'll continue to work on no transitive decl change and no transitive identifier change (if it matters) to achieve the goal of stopping the propagation of unnecessary changes. But all of this depends on this patch, since source locations are clearly the most sensitive thing. --- The release notes and documentation will be added separately.	2024-04-30 15:57:58 +08:00
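The offset arithmetic in the commit message above (45, 90, 135, 200, 245) can be checked with a small C++ sketch. The function names and the `Bases` table are hypothetical and chosen only for illustration; they do not correspond to Clang's reader or writer code.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Existing design: the BMI stores an absolute offset plus the base offset that
// the imported module had when the BMI was written.
inline uint32_t translateRecorded(uint32_t RecordedOffset, uint32_t RecordedBase,
                                  uint32_t CurrentBase) {
  assert(RecordedOffset >= RecordedBase && "location must lie in its module's range");
  return RecordedOffset - RecordedBase + CurrentBase;
}

// New design: the BMI stores <module file index, local offset>.
// Bases[i] is assumed to hold the base offset, in the current compilation, of
// the module file that the BMI being read refers to by index i (hypothetical table).
inline uint32_t translateIndexed(uint32_t ModuleFileIndex, uint32_t LocalOffset,
                                 const std::vector<uint32_t> &Bases) {
  assert(ModuleFileIndex < Bases.size() && "unknown module file index");
  return Bases[ModuleFileIndex] + LocalOffset;
}
```

For the example in the message, `translateRecorded(135, 90, 200)` and `translateIndexed(1, 45, Bases)` with `Bases[1] == 200` both give 245. The difference is that the second form never needs the value 90, so nothing stored in `b.pcm` depends on how many source locations module `a` has.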
Erich Keane	39adc8f423	[NFC] Generalize ArraySections to work for OpenACC in the future (#89639 ) OpenACC is going to need an array sections implementation that is a simpler version/more restrictive version of the OpenMP version. This patch moves `OMPArraySectionExpr` to `Expr.h` and renames it `ArraySectionExpr`, then adds an enum to choose between the two. This also fixes a couple of 'drive-by' issues that I discovered on the way, but leaves the OpenACC Sema parts reasonably unimplemented (no semantic analysis implementation), as that will be a followup patch.	2024-04-25 10:22:03 -07:00
Chuanqi Xu	d86cc73bbf	[NFC] [Serialization] Avoid using DeclID directly as much as possible This patch tries to remove all the direct use of DeclID except the real low level reading and writing. All the use of DeclID is converted to the use of LocalDeclID or GlobalDeclID. This is helpful to increase the readability and type safety.	2024-04-25 14:59:09 +08:00
Chuanqi Xu	72b58146b1	Revert "[NFC] [Serialization] Avoid using DeclID directly as much as possible" This reverts commit 42070a5c092ed420bf92ebf38229c594885e94c7. I forgot to touch lldb.	2024-04-25 14:26:07 +08:00
Chuanqi Xu	42070a5c09	[NFC] [Serialization] Avoid using DeclID directly as much as possible This patch tries to remove all the direct use of DeclID except the real low level reading and writing. All the use of DeclID is converted to the use of LocalDeclID or GlobalDeclID. This is helpful to increase the readability and type safety.	2024-04-25 14:14:05 +08:00
Chuanqi Xu	c2a98fdeb3	[NFC] Move DeclID from serialization/ASTBitCodes.h to AST/DeclID.h (#89873 ) Previously, DeclID was defined in serialization/ASTBitCodes.h under the clang::serialization namespace. However, DeclID is not used purely in the serialization part. It is already widely used in AST and all around the clang project via classes like `LazyPtrDecl` or calls to `ExternalASTSource::GetExternalDecl()`. All such uses go through the raw underlying type of `DeclID`, `uint32_t`, which is not ideal. This patch moves the DeclID class family to a new header `AST/DeclID.h` so that the whole project can use the wrapper classes `DeclID`, `GlobalDeclID` and `LocalDeclID` instead of the raw underlying type. This improves readability and type safety.	2024-04-25 13:53:22 +08:00
Chuanqi Xu	b467c6b536	[NFC] [Serialization] Turn type alias GlobalDeclID into a class Successor of b8e3b2ad66cf78ad2b. This patch also converts the type alias GlobalDeclID to a class to improve the readability and type safety.	2024-04-23 17:52:58 +08:00
Chuanqi Xu	b8e3b2ad66	[NFC] [Serialization] Turn type alias LocalDeclID into class Previously, LocalDeclID and GlobalDeclID were defined as: ``` using LocalDeclID = DeclID; using GlobalDeclID = DeclID; ``` This is somewhat concerning, since we may misuse LocalDeclID and GlobalDeclID without realizing it. There is also a FIXME saying this. This patch turns LocalDeclID into a class to improve the type safety here.	2024-04-23 16:56:14 +08:00
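To see why two aliases of the same type are easy to mix up, and how wrapper classes help, here is a minimal C++ sketch. It assumes nothing about the real class definitions in Clang: the `get()` accessor and the `toGlobal` helper with its `BaseDeclID` parameter are hypothetical and exist only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

using DeclID = uint32_t;

// Distinct wrapper types: each holds a DeclID but is not implicitly
// convertible to or from the raw integer or the other wrapper.
class LocalDeclID {
public:
  explicit LocalDeclID(DeclID Value) : Value(Value) {}
  DeclID get() const { return Value; }

private:
  DeclID Value;
};

class GlobalDeclID {
public:
  explicit GlobalDeclID(DeclID Value) : Value(Value) {}
  DeclID get() const { return Value; }

private:
  DeclID Value;
};

// With plain aliases both checks would fail, because the types would be identical.
static_assert(!std::is_convertible<LocalDeclID, GlobalDeclID>::value,
              "local and global IDs must not convert implicitly");
static_assert(!std::is_convertible<DeclID, LocalDeclID>::value,
              "raw integers must be wrapped explicitly");

// Hypothetical conversion: a local ID becomes global only through an explicit step.
inline GlobalDeclID toGlobal(LocalDeclID Local, DeclID BaseDeclID) {
  assert(Local.get() <= UINT32_MAX - BaseDeclID && "ID would overflow");
  return GlobalDeclID(BaseDeclID + Local.get());
}
```

The design point is that passing a `LocalDeclID` where a `GlobalDeclID` is expected now fails to compile, so the mix-up the commit message worries about is caught by the type checker.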
Chuanqi Xu	07b1177eed	[NFC] [Serialization] Use semantical type DeclID instead of raw type 'uint32_t' This patch uses DeclID in the code base to avoid using the raw type 'uint32_t'. Using the raw type 'uint32_t' is problematic if we want to change the type of DeclID some day.	2024-04-23 12:44:00 +08:00
Erich Keane	dc20a0ea1f	[OpenACC] Implement 'num_gangs' sema for compute constructs (#89460 ) num_gangs takes an 'int-expr-list', for 'parallel', and an 'int-expr' for 'kernels'. This patch changes the parsing to always parse it as an 'int-expr-list', then correct the expression count during Sema. It also implements the rest of the semantic analysis changes for this clause.	2024-04-22 08:57:25 -07:00
Erich Keane	b8adf169bb	[OpenACC] Implement 'vector_length' clause On compute constructs The 'vector_length' clause is semantically identical to the 'num_workers' clause, in that it takes a mandatory single int-expr. This is implemented identically to it.	2024-04-18 13:27:42 -07:00
Erich Keane	76600aee9d	[OpenACC] Implement 'num_workers' clause for compute constructs (#89151 ) This clause just takes an 'int expr', which is not optional. This patch implements the clause on compute constructs.	2024-04-18 12:42:22 -07:00
Volodymyr Sapsai	22e6bf77ad	[unused-includes][Serialization] Remove unused includes. NFC. (#88790 )	2024-04-16 10:12:26 -07:00
Erich Keane	6133878227	[OpenACC] Implement `self` clause for compute constructs (#88760 ) `self` clauses on compute constructs take an optional condition expression. We again limit the implementation to ONLY compute constructs to ensure we get all the rules correct for others. However, this one will be particularly complicated, as it takes a `var-list` for `update`, so when we get to that construct/clause combination, we need to do that as well. This patch also furthers uses of the `OpenACCClauses.def` as it became useful while implementing this (as well as some other minor refactors as I went through). Finally, `self` and `if` clauses have an interaction with each other, if an `if` clause evaluates to `true`, the `self` clause has no effect. While this is intended and can be used 'meaningfully', we are warning on this with a very granular warning, so that this edge case will be noticed by newer users, but can be disabled trivially.	2024-04-16 06:57:36 -07:00
Chuanqi Xu	d26dd58ca5	[StmtProfile] Don't profile the body of lambda expressions Close https://github.com/llvm/llvm-project/issues/87609 We tried to profile the body of the lambda expressions in https://reviews.llvm.org/D153957. But as the original comments show, it is indeed dangerous. Since we recently tried to skip calculating the ODR hash values, we have fallen into this trap twice. So in this patch, I choose not to profile the body of the lambda expression. The signature of the lambda is still profiled.	2024-04-16 15:41:26 +08:00
Kazu Hirata	89071f3559	[clang] Drop unaligned from calls to readNext (NFC) (#88842 ) Now readNext defaults to unaligned accesses. This patch drops unaligned to improve readability.	2024-04-16 00:09:41 -07:00
Vlad Serebrennikov	0a6f6df5b0	[clang] Introduce `SemaCUDA` (#88559 ) This patch moves CUDA-related `Sema` function into new `SemaCUDA` class, following the recent example of SYCL, OpenACC, and HLSL. This is a part of the effort to split Sema. Additional context can be found in https://github.com/llvm/llvm-project/pull/82217, https://github.com/llvm/llvm-project/pull/84184, https://github.com/llvm/llvm-project/pull/87634.	2024-04-13 08:54:25 +04:00
Erich Keane	daa88364df	[OpenACC] Implement 'if' clause for Compute Constructs (#88411 ) Like with the 'default' clause, this is being applied to only Compute Constructs for now. The 'if' clause takes a condition expression which is used as a runtime value. This is not a particularly complex semantic implementation, as there isn't much to this clause, other than its interactions with 'self', which will be managed in the patch to implement that.	2024-04-12 14:13:31 -07:00
Chuanqi Xu	f21ead0675	[C++20] [Modules] [Reduced BMI] Remove unreachable decls GMF in reduced BMI (#88359 ) Follow-up to https://github.com/llvm/llvm-project/pull/76930 This follows the idea of "only write what we write", which I think is the most natural and efficient way to implement this optimization. We start writing the BMI from the first declaration in the module purview instead of the global module fragment, so that anything in the GMF that is left untouched is naturally not written to the BMI. The exception is, as I said in https://github.com/llvm/llvm-project/pull/76930, when we write a declaration we need to write its decl context, and when we write the decl context, we need to write everything from it. So when we see `std::vector`, we basically need to write everything under namespace std. This violates our intention. To fix this, this patch delays the writing of namespaces in the GMF. From my local measurement, the size of the BMI decreases from 112M to 90M for a local modules build. I think this is significant. This feature will be covered under the experimental reduced BMI so that it won't affect any existing users. So I'd like to land this when the CI gets green. Documents will be added separately.	2024-04-12 12:51:58 +08:00
Bill Wendling	fca51911d4	[NFC][Clang] Improve const correctness for IdentifierInfo (#79365 ) The IdentifierInfo isn't typically modified. Use 'const' wherever possible.	2024-04-11 00:33:40 +00:00
erichkeane	3d468566eb	[NFC] Remove unneeded 'maybe_unused' attributes This was added while we only had a partial implementation of clauses, so we don't need these anymore.	2024-04-10 09:37:01 -07:00
Jan Svoboda	fc3dff9b46	[clang][modules] Stop eagerly reading files with diagnostic pragmas (#87442 ) This makes it so that the importer doesn't need to stat all input files of a module that contain diagnostic pragmas, reducing file system traffic.	2024-04-10 09:08:01 -07:00
Erich Keane	0c7b92a42a	[OpenACC] Implement Default clause for Compute Constructs (#88135 ) As a followup to my previous commits, this is an implementation of a single clause, in this case the 'default' clause. This implements all semantic analysis for it on compute clauses, and continues to leave it rejected for all others (some as 'doesnt appertain', others as 'not implemented' as appropriate). This also implements and tests the TreeTransform as requested in the previous patch.	2024-04-10 07:10:24 -07:00
Ian Anderson	0cd0aa0296	[clang][modules] Headers meant to be included multiple times can be completely invisible in clang module builds (#83660 ) Once a file has been `#import`'ed, it gets stamped as if it was `#pragma once` and will not be re-entered, even on #include. This means that any errant #import of a file designed to be included multiple times, such as <assert.h>, will incorrectly mark it as include-once and break the multiple include functionality. Normally this isn't a big problem, e.g. <assert.h> can't have its NDEBUG mode changed after the first #import, but it is still mostly functional. However, when clang modules are involved, this can cause the header to be hidden entirely. Objective-C code most often uses #import for everything, because it's required for most Objective-C headers to prevent double inclusion and redeclaration errors. (It's rare for Objective-C headers to use macro guards or `#pragma once`.) The problem arises when a submodule includes a multiple-include header. The "already included" state is global across all modules (which is necessary so that non-modular headers don't get compiled into multiple translation units and cause redeclaration errors). If another module or the main file #import's the same header, it becomes invisible from then on. If the original submodule is not imported, the include of the header will effectively do nothing and the header will be invisible. The only way to actually get the header's declarations is to somehow figure out which submodule consumed the header, and import that instead. That's basically impossible since it depends on exactly which modules were built in which order. #import is a poor indicator of whether a header is actually include-once, as the #import is external to the header it applies to, and requires that all inclusions correctly and consistently use #import vs #include. 
When modules are enabled, consider a header marked `textual` in its module as a stronger indicator of multiple-include than #import's indication of include-once. This will allow headers like <assert.h> to always be included when modules are enabled, even if #import is erroneously used somewhere.	2024-04-05 10:13:42 -07:00
Erich Keane	30f6eafaa9	[OpenACC][NFC] Add OpenACC Clause AST Nodes/infrastructure (#87675 ) As a first step in adding clause support for OpenACC to Semantic Analysis, this patch adds the 'base' AST nodes required for clauses. This patch has no functional effect at the moment, but followup patches will add the semantic analysis of clauses (plus individual clauses).	2024-04-05 10:06:44 -07:00
Jan Svoboda	c925c1646d	[clang][modules] NFCI: Pragma diagnostic mappings: write/read `FileID` instead of `SourceLocation` (#87427 ) For pragma diagnostic mappings, we always write/read `SourceLocation` with offset 0. This is equivalent to just writing a `FileID`, which is exactly what this patch starts doing. Originally reviewed here: https://reviews.llvm.org/D137213	2024-04-03 03:36:53 +02:00
Chris B	9434c08347	[HLSL] Implement array temporary support (#79382 ) HLSL constant sized array function parameters do not decay to pointers. Instead constant sized array types are preserved as unique types for overload resolution, template instantiation and name mangling. This implements the change by adding a new `ArrayParameterType` which represents a non-decaying `ConstantArrayType`. The new type behaves the same as `ConstantArrayType` except that it does not decay to a pointer. Values of `ConstantArrayType` in HLSL decay during overload resolution via a new `HLSLArrayRValue` cast to `ArrayParameterType`. `ArrayParameterType` values are passed indirectly by-value to functions in IR generation resulting in callee generated memcpy instructions. The behavior of HLSL function calls is documented in the [draft language specification](https://microsoft.github.io/hlsl-specs/specs/hlsl.pdf) under the Expr.Post.Call heading. Additionally the design of this implementation approach is documented in [Clang's documentation](https://clang.llvm.org/docs/HLSL/FunctionCalls.html) Resolves #70123	2024-04-01 12:10:10 -05:00
Yeoul Na	3eb9ff3095	Turn 'counted_by' into a type attribute and parse it into 'CountAttributedType' (#78000 ) In `-fbounds-safety`, bounds annotations are considered type attributes rather than declaration attributes. Constructing them as type attributes allows us to extend the attribute to apply to nested pointers, which is essential to annotate functions that involve out parameters: `void foo(int __counted_by(out_count) out_buf, int out_count)`. We introduce a new sugar type to support bounds annotated types, `CountAttributedType`. In order to maintain extra data (the bounds expression and the dependent declaration information) that is not trackable in `AttributedType` we create a new type dedicated to this functionality. This patch also extends the parsing logic to parse the `counted_by` argument as an expression, which will allow us to extend the model to support arguments beyond an identifier, e.g., `__counted_by(n + m)` in the future as specified by `-fbounds-safety`. This also adjusts `__bdos` and array-bounds sanitizer code that already uses `CountedByAttr` to check `CountAttributedType` instead to get the field referred to by the attribute.	2024-03-20 13:36:56 +09:00
Chuanqi Xu	3f6bc1adf8	[C++20] [Modules] Avoid computing odr hash for functions from comparing constraint expression Previously we disabled computing the ODR hash for declarations from the global module fragment. However, we missed the case where the function lives in a concept requirement (see the attached test files for an example), and the mismatch can cause a crash. After deserializing a function we set its body as lazy and only take the body when needed, but we don't allow taking the body during deserialization. So it is problematic to set the body as lazy first and then compute the hash value of the function, which requires deserializing its body; this is where we crash. This patch tries to solve the issue by not taking the body of the function from the GMF. Note that we can't skip comparing the constraint expression from the GMF directly since it is a key part of function selection, and that may be the reason why we can't return 0 directly for `FunctionDecl::getODRHash()` from the GMF.	2024-03-11 11:39:21 +08:00
Vlad Serebrennikov	502756905c	[clang][NFC] Use "notable" for "interesting" identifiers in `IdentifierInfo` (#81542 ) This patch expands the notion of "interesting" in `IdentifierInfo` to also cover ObjC keywords and builtins, which matches the notion of "interesting" in the serialization layer. What was previously "interesting" in `IdentifierInfo` is now called "notable". Beyond clearing up confusion between serialization and the rest of the compiler, it also resolves a naming problem: ObjC keywords, notable identifiers, and builtin IDs are all stored in the same bit-field. Now we can use "interesting" to name it and its corresponding type, instead of the `ObjCKeywordOrInterestingOrBuiltin` abomination.	2024-02-14 16:39:00 +04:00
Vlad Serebrennikov	30338223e4	[clang] Refactor `IdentifierInfo::ObjcOrBuiltinID` (#71709 ) This patch refactors how values are stored inside the `IdentifierInfo::ObjcOrBuiltinID` bit-field, and annotates it with `preferred_type`. In order to make the value easier to interpret by debuggers, a new `ObjCKeywordOrInterestingOrBuiltin` enum is added. The previous "layout" of this field couldn't be represented with this new enum, because it skipped over some arbitrary enumerators, so a new "layout" was invented, which is reflected in the `ObjCKeywordOrInterestingOrBuiltin` enum. I believe the new layout is simpler than the old one.	2024-02-12 20:40:57 +04:00
Chuanqi Xu	8eea582dcb	[C++20] [Modules] Introduce -fskip-odr-check-in-gmf (#79959 ) Close https://github.com/llvm/llvm-project/issues/79240 Cite the comment from @mizvekov in https://github.com/llvm/llvm-project/issues/79240: "There are two kinds of bugs / issues relevant here. Clang bugs that this change hides: Here we can add a Frontend flag that disables the GMF ODR check, just so we can keep tracking, testing and fixing these issues. The Driver would just always pass that flag. We could add that flag in this current issue. Bugs in user code: I don't think it's worth adding a corresponding Driver flag for controlling the above Frontend flag, since we intend its behavior to become default as we fix the problems, and users interested in testing the more strict behavior can just use the Frontend flag directly." This patch follows the suggestion: - Introduce the CC1 flag `-fskip-odr-check-in-gmf`, which is off by default, so that every existing test will still be tested with checking ODR violations. - Pass `-fskip-odr-check-in-gmf` in the driver to keep the behavior we intended. - Edit the document to tell users who are still interested in stricter checks that they can use `-Xclang -fno-skip-odr-check-in-gmf` to get the existing behavior.	2024-02-01 13:44:32 +08:00
SunilKuravinakop	a74e9ce5dc	[OpenMP] atomic compare weak : Parser & AST support (#79475 ) This adds support for "#pragma omp atomic compare weak". It has Parser & AST support for now. Authored-by: Sunil Kuravinakop <kuravina@pe28vega.us.cray.com>	2024-01-31 06:32:06 -05:00

1723 Commits