
Following up on https://github.com/llvm/llvm-project/pull/86912

The motivation of the patch series is that, for a module interface unit `X`, when the modules that `X` depends on change, and the changes are not relevant to `X`, we hope the BMI of `X` won't change. For this specific patch, we hope that changes to irrelevant declarations leave the BMI of `X` unchanged.

**However**, I found that the patch by itself is not very useful in practice, since adding or removing declarations changes the state of identifiers and types in most cases.

That said, for the simplest example,

```
// partA.cppm
export module m:partA;

// partA.v1.cppm
export module m:partA;
export void a() {}

// partB.cppm
export module m:partB;
export void b() {}

// m.cppm
export module m;
export import :partA;
export import :partB;

// onlyUseB.cppm
export module onlyUseB;
import m;
export inline void onlyUseB() {
    b();
}
```

the BMI of `onlyUseB` will change after we change the implementation of `partA.cppm` to `partA.v1.cppm`, since `partA.v1.cppm` introduces new identifiers and types (the function prototype).

So in this patch, we have to write the tests as:

```
// partA.cppm
export module m:partA;
export int getA() { ... }
export int getA2(int) { ... }

// partA.v1.cppm
export module m:partA;
export int getA() { ... }
export int getA(int) { ... }
export int getA2(int) { ... }

// partB.cppm
export module m:partB;
export void b() {}

// m.cppm
export module m;
export import :partA;
export import :partB;

// onlyUseB.cppm
export module onlyUseB;
import m;
export inline void onlyUseB() {
    b();
}
```

so that the newly introduced declaration `int getA(int)` doesn't introduce new identifiers or types, and the BMI of `onlyUseB` can stay unchanged.

While this doesn't look great, the patch should serve as the base for the patches that erase the transitive change for identifiers and types, since I don't know how we could introduce new types and identifiers without introducing new declarations. Given how tightly declarations, types and identifiers are related, I think we can only reach the ideal state after we finish the series for all three entities.

The design of the patch is similar to https://github.com/llvm/llvm-project/pull/86912, which extends the 32-bit DeclID to 64 bits and uses the higher bits to store the module file index and the lower bits to store the local Decl ID.

A slight difference is that we only use 48 bits to store the new DeclID, since we use the higher 16 bits to store the module ID in the prefix of the Decl class. Previously, we used 32 bits to store the module ID and 32 bits to store the DeclID. I didn't want to allocate additional space, so I kept the total at the same 64 bits. A potentially interesting point here is the relationship between the module ID and the module file index. I feel we can get the module file index from the module ID, but I didn't prove it or implement it, since I want to keep the patch itself as small as possible. We can do that in the future if we want.

Another change in the patch is the new concept of a Decl Index, which is the index into the very big array `DeclsLoaded` in ASTReader. Previously, the index of a loaded declaration was simply the Decl ID minus PREDEFINED_DECL_NUMs, so there were some places where the two were used interchangeably. This patch tries to separate the two concepts.

As with https://github.com/llvm/llvm-project/pull/86912, the change will increase the on-disk PCM file sizes. Since declaration IDs are probably the most numerous IDs in the PCM file, this change can have the biggest impact on size. In my experiments, this change brings a 6.6% increase in on-disk PCM size. No compile-time performance regression was observed. Given the benefits in the motivating example, I think the cost is worthwhile.
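To make the bit layout described above concrete, here is a minimal standalone sketch. It is not the code from the patch. The helper names (`makeDeclID`, `packPrefix`, and so on) are hypothetical, and the exact split inside the 48-bit DeclID (16 bits of module file index over 32 bits of local Decl ID) is an assumption made for illustration; the description above only states that the higher bits hold the module file index and the lower bits hold the local Decl ID.

```cpp
#include <cassert>
#include <cstdint>

// Assumed layout of the 64-bit word (illustration only, not taken from the patch):
//   bits 63..48 : owning module ID (kept in the prefix of Decl)
//   bits 47..32 : module file index
//   bits 31..0  : local Decl ID within that module file
constexpr unsigned LocalBits = 32;   // assumption
constexpr unsigned DeclIDBits = 48;  // stated in the description
constexpr std::uint64_t LocalMask = (std::uint64_t{1} << LocalBits) - 1;
constexpr std::uint64_t DeclIDMask = (std::uint64_t{1} << DeclIDBits) - 1;

// Build a 48-bit DeclID from a module file index and a local Decl ID.
// Anything that does not fit in 48 bits is masked off.
constexpr std::uint64_t makeDeclID(std::uint64_t ModuleFileIndex,
                                   std::uint64_t LocalID) {
  return ((ModuleFileIndex << LocalBits) | (LocalID & LocalMask)) & DeclIDMask;
}

// Recover the two components of a DeclID.
constexpr std::uint64_t moduleFileIndex(std::uint64_t DeclID) {
  return (DeclID & DeclIDMask) >> LocalBits;
}
constexpr std::uint64_t localDeclID(std::uint64_t DeclID) {
  return DeclID & LocalMask;
}

// Pack a 16-bit module ID and a 48-bit DeclID into one 64-bit prefix word,
// so the prefix occupies the same 64 bits as the old 32 + 32 split.
constexpr std::uint64_t packPrefix(std::uint16_t ModuleID,
                                   std::uint64_t DeclID) {
  return (std::uint64_t{ModuleID} << DeclIDBits) | (DeclID & DeclIDMask);
}
constexpr std::uint16_t owningModuleID(std::uint64_t Prefix) {
  return static_cast<std::uint16_t>(Prefix >> DeclIDBits);
}
constexpr std::uint64_t declIDOf(std::uint64_t Prefix) {
  return Prefix & DeclIDMask;
}

// Round-trip check evaluated at compile time.
static_assert(declIDOf(packPrefix(7, makeDeclID(3, 42))) == makeDeclID(3, 42),
              "DeclID survives packing into the prefix word");
```

Under these assumptions, a declaration's ID depends only on which module file it came from and its position inside that file, which is the property the description relies on to keep unrelated changes from shifting the IDs.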
//===- ModuleFile.cpp - Module description --------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the ModuleFile class, which describes a module that
// has been loaded from an AST file.
//
//===----------------------------------------------------------------------===//

#include "clang/Serialization/ModuleFile.h"
#include "ASTReaderInternals.h"
#include "clang/Serialization/ContinuousRangeMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Compiler.h"
#include "llvm/Support/raw_ostream.h"

using namespace clang;
using namespace serialization;
using namespace reader;

ModuleFile::~ModuleFile() {
  delete static_cast<ASTIdentifierLookupTable *>(IdentifierLookupTable);
  delete static_cast<HeaderFileInfoLookupTable *>(HeaderFileInfoTable);
  delete static_cast<ASTSelectorLookupTable *>(SelectorLookupTable);
}

template<typename Key, typename Offset, unsigned InitialCapacity>
static void
dumpLocalRemap(StringRef Name,
               const ContinuousRangeMap<Key, Offset, InitialCapacity> &Map) {
  if (Map.begin() == Map.end())
    return;

  using MapType = ContinuousRangeMap<Key, Offset, InitialCapacity>;

  llvm::errs() << " " << Name << ":\n";
  for (typename MapType::const_iterator I = Map.begin(), IEnd = Map.end();
       I != IEnd; ++I) {
    llvm::errs() << " " << I->first << " -> " << I->second << "\n";
  }
}

LLVM_DUMP_METHOD void ModuleFile::dump() {
  llvm::errs() << "\nModule: " << FileName << "\n";
  if (!Imports.empty()) {
    llvm::errs() << " Imports: ";
    for (unsigned I = 0, N = Imports.size(); I != N; ++I) {
      if (I)
        llvm::errs() << ", ";
      llvm::errs() << Imports[I]->FileName;
    }
    llvm::errs() << "\n";
  }

  // Remapping tables.
  llvm::errs() << " Base source location offset: " << SLocEntryBaseOffset
               << '\n';

  llvm::errs() << " Base identifier ID: " << BaseIdentifierID << '\n'
               << " Number of identifiers: " << LocalNumIdentifiers << '\n';
  dumpLocalRemap("Identifier ID local -> global map", IdentifierRemap);

  llvm::errs() << " Base macro ID: " << BaseMacroID << '\n'
               << " Number of macros: " << LocalNumMacros << '\n';
  dumpLocalRemap("Macro ID local -> global map", MacroRemap);

  llvm::errs() << " Base submodule ID: " << BaseSubmoduleID << '\n'
               << " Number of submodules: " << LocalNumSubmodules << '\n';
  dumpLocalRemap("Submodule ID local -> global map", SubmoduleRemap);

  llvm::errs() << " Base selector ID: " << BaseSelectorID << '\n'
               << " Number of selectors: " << LocalNumSelectors << '\n';
  dumpLocalRemap("Selector ID local -> global map", SelectorRemap);

  llvm::errs() << " Base preprocessed entity ID: " << BasePreprocessedEntityID
               << '\n'
               << " Number of preprocessed entities: "
               << NumPreprocessedEntities << '\n';
  dumpLocalRemap("Preprocessed entity ID local -> global map",
                 PreprocessedEntityRemap);

  llvm::errs() << " Base type index: " << BaseTypeIndex << '\n'
               << " Number of types: " << LocalNumTypes << '\n';
  dumpLocalRemap("Type index local -> global map", TypeRemap);

  llvm::errs() << " Base decl index: " << BaseDeclIndex << '\n'
               << " Number of decls: " << LocalNumDecls << '\n';
}