Chuanqi Xu 2f2ea3557b
[Serialization] No transitive identifier change (#92085)
Following of https://github.com/llvm/llvm-project/pull/92083

The motivation is still cutting of the unnecessary change in the
dependency chain. See the above link (recursively) for details.

After this patch, (and the above patch), we can already do something
pretty interesting. For example,

#### Motivation example

```

//--- m-partA.cppm
export module m:partA;

export inline int getA() {
    return 43;
}

export class A {
public:
    int getMem();
};

export template <typename T>
class ATempl {
public:
    T getT();
};

//--- m-partA.v1.cppm
export module m:partA;

export inline int getA() {
    return 43;
}

// Now we add a new declaration without introducing a new type.
// The consuming module which didn't use m:partA completely is expected to be
// not changed.
export inline int getA2() {
    return 88;
}

export class A {
public:
    int getMem();
    // Now we add a new declaration without introducing a new type.
    // The consuming module which didn't use m:partA completely is expected to be
    // not changed.
    int getMem2();
};

export template <typename T>
class ATempl {
public:
    T getT();
    // Add a new declaration without introducing a new type.
    T getT2();
};

//--- m-partB.cppm
export module m:partB;

export inline int getB() {
    return 430;
}

//--- m.cppm
export module m;
export import :partA;
export import :partB;

//--- useBOnly.cppm
export module useBOnly;
import m;

export inline int get() {
    return getB();
}
```

In this example, module `m` exports two partitions `:partA` and
`:partB`. And a consumer `useBOnly` only consumes the entities from
`:partB`. So we don't hope the BMI of `useBOnly` changes if only
`:partA` changes. After this patch, we can make it if the change of
`:partA` doesn't introduce new types. (And we can get rid of this if we
make no-transitive-type-change).

As the example shows, when we change the implementation of `:partA` from
`m-partA.cppm` to `m-partA.v1.cppm`, we add new function declaration
`getA2()` at the global namespace, add a new member function `getMem2()`
to class `A` and add a new member function to `getT2()` to class
template `ATempl`. And since `:partA` is not used by `useBOnly`
completely, the BMI of `useBOnly` won't change after we made above
changes.

#### Design details

Method used in this patch is similar with
https://github.com/llvm/llvm-project/pull/92083 and
https://github.com/llvm/llvm-project/pull/86912. It extends the 32 bit
IdentifierID to 64 bits and use the higher 32 bits to store the module
file index. So that the encoding of the identifier won't get affected by
other modules.

#### Overhead

Similar with https://github.com/llvm/llvm-project/pull/92083 and
https://github.com/llvm/llvm-project/pull/86912. The change is only
expected to increase the size of the on-disk .pcm files and not affect
the compile-time performances. And from my experiment, the size of the
on-disk change only increase 1%+ and observe no compile-time impacts.

#### Future Plans

I'll try to do the same thing for type ids. IIRC, it won't change the
dependency graph if we add a new type in an unused units. I do think
this is a significant win. And this will be a pretty good answer to "why
modules are better than headers."
2024-06-20 13:30:05 +08:00

92 lines
3.4 KiB
C++

//===- ModuleFile.cpp - Module description --------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
//
// This file implements the ModuleFile class, which describes a module that
// has been loaded from an AST file.
//
//===----------------------------------------------------------------------===//
#include "clang/Serialization/ModuleFile.h"
#include "ASTReaderInternals.h"
#include "clang/Serialization/ContinuousRangeMap.h"
#include "llvm/ADT/StringRef.h"
#include "llvm/Support/Compiler.h"
#include "llvm/Support/raw_ostream.h"
using namespace clang;
using namespace serialization;
using namespace reader;
ModuleFile::~ModuleFile() {
delete static_cast<ASTIdentifierLookupTable *>(IdentifierLookupTable);
delete static_cast<HeaderFileInfoLookupTable *>(HeaderFileInfoTable);
delete static_cast<ASTSelectorLookupTable *>(SelectorLookupTable);
}
template<typename Key, typename Offset, unsigned InitialCapacity>
static void
dumpLocalRemap(StringRef Name,
const ContinuousRangeMap<Key, Offset, InitialCapacity> &Map) {
if (Map.begin() == Map.end())
return;
using MapType = ContinuousRangeMap<Key, Offset, InitialCapacity>;
llvm::errs() << " " << Name << ":\n";
for (typename MapType::const_iterator I = Map.begin(), IEnd = Map.end();
I != IEnd; ++I) {
llvm::errs() << " " << I->first << " -> " << I->second << "\n";
}
}
LLVM_DUMP_METHOD void ModuleFile::dump() {
llvm::errs() << "\nModule: " << FileName << "\n";
if (!Imports.empty()) {
llvm::errs() << " Imports: ";
for (unsigned I = 0, N = Imports.size(); I != N; ++I) {
if (I)
llvm::errs() << ", ";
llvm::errs() << Imports[I]->FileName;
}
llvm::errs() << "\n";
}
// Remapping tables.
llvm::errs() << " Base source location offset: " << SLocEntryBaseOffset
<< '\n';
llvm::errs() << " Base identifier ID: " << BaseIdentifierID << '\n'
<< " Number of identifiers: " << LocalNumIdentifiers << '\n';
llvm::errs() << " Base macro ID: " << BaseMacroID << '\n'
<< " Number of macros: " << LocalNumMacros << '\n';
dumpLocalRemap("Macro ID local -> global map", MacroRemap);
llvm::errs() << " Base submodule ID: " << BaseSubmoduleID << '\n'
<< " Number of submodules: " << LocalNumSubmodules << '\n';
dumpLocalRemap("Submodule ID local -> global map", SubmoduleRemap);
llvm::errs() << " Base selector ID: " << BaseSelectorID << '\n'
<< " Number of selectors: " << LocalNumSelectors << '\n';
dumpLocalRemap("Selector ID local -> global map", SelectorRemap);
llvm::errs() << " Base preprocessed entity ID: " << BasePreprocessedEntityID
<< '\n'
<< " Number of preprocessed entities: "
<< NumPreprocessedEntities << '\n';
dumpLocalRemap("Preprocessed entity ID local -> global map",
PreprocessedEntityRemap);
llvm::errs() << " Base type index: " << BaseTypeIndex << '\n'
<< " Number of types: " << LocalNumTypes << '\n';
dumpLocalRemap("Type index local -> global map", TypeRemap);
llvm::errs() << " Base decl index: " << BaseDeclIndex << '\n'
<< " Number of decls: " << LocalNumDecls << '\n';
}