llvm-project

shylie/llvm-project

Commit Graph


Author

SHA1

Message

Date

Chuanqi Xu

03921b979d

[serialization] No transitive type change (#92511)

Follow-up to https://github.com/llvm/llvm-project/pull/92085.

#### motivation

The motivation is still to cut off unnecessary changes in the
dependency chain. See the above link (recursively) for details.

This will be the last patch of the `no-transitive-*-change` series.
Any follow-up patches will likely be specific to C++20 named modules,
handling special cases such as `ADL` (see the reply in
https://discourse.llvm.org/t/rfc-c-20-modules-introduce-thin-bmi-and-decls-hash/74755/53
for example), so they won't affect the serialization code as broadly as
the patches in this series did.

#### example

After this patch, we are finally able to cut off unnecessary changes
caused by types. For example,

```

//--- m-partA.cppm
export module m:partA;

//--- m-partA.v1.cppm
export module m:partA;

namespace NS {
    class A {
        public:
            int getValue() {
                return 43;
            }
    };
}

//--- m-partB.cppm
export module m:partB;

export inline int getB() {
    return 430;
}

//--- m.cppm
export module m;
export import :partA;
export import :partB;

//--- useBOnly.cppm
export module useBOnly;
import m;

export inline int get() {
    return getB();
}
```

The BMI of `useBOnly.cppm` is expected not to change if we only add a
new class to `m:partA`. This should be quite useful in practice.

#### implementation details

The key idea of this patch is similar to the previous patches: extend
the 32-bit type ID to 64 bits so that we can store the module file index
in the higher bits. The encoding of the type ID is then independent of
the imported modules.

But there are two differences from the previous patches:
- TypeID is not purely an index of serialized types. The lower 3 bits
store the qualifiers.
- TypeID doesn't take part in any lookup process, so it has far fewer
uses than the IDs handled in the previous patches.

The first difference requires some slightly more complex bit
operations. The second difference makes the patch much simpler than
the previous ones.
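
The bit layout described above can be sketched as follows. This is only an illustration, not Clang's actual serialization code: every name (`TypeID`, `makeTypeID`, the `Qual*` constants) is hypothetical, and the split into a 32-bit module file index and a 29-bit local type index is an assumption, since the message only says the index lives "in the higher bits" and the qualifiers in the lower 3 bits.

```cpp
#include <cassert>
#include <cstdint>

// All names below are hypothetical and chosen for illustration;
// they are not Clang's actual serialization API.
using TypeID = std::uint64_t;

constexpr unsigned QualifierBits = 3;     // lower 3 bits hold the qualifiers
constexpr unsigned ModuleIndexShift = 32; // assumed position of the module file index
constexpr std::uint32_t QualifierMask = (1u << QualifierBits) - 1;

// Assumed assignment of the three qualifier bits.
constexpr std::uint32_t QualConst = 1u << 0;
constexpr std::uint32_t QualRestrict = 1u << 1;
constexpr std::uint32_t QualVolatile = 1u << 2;

// Pack (module file index, local type index, qualifiers) into one 64-bit ID.
constexpr TypeID makeTypeID(std::uint32_t moduleFileIndex,
                            std::uint32_t localTypeIndex,
                            std::uint32_t qualifiers) {
  // The local index must fit in the 29 bits left above the qualifiers.
  assert(localTypeIndex < (1u << (ModuleIndexShift - QualifierBits)));
  // Only the three qualifier bits may be set.
  assert((qualifiers & ~QualifierMask) == 0);
  return (static_cast<TypeID>(moduleFileIndex) << ModuleIndexShift) |
         (static_cast<TypeID>(localTypeIndex) << QualifierBits) | qualifiers;
}

// Module file index: the upper 32 bits.
constexpr std::uint32_t getTypeModuleFileIndex(TypeID id) {
  return static_cast<std::uint32_t>(id >> ModuleIndexShift);
}

// Local type index: the lower 32 bits with the qualifier bits shifted out.
constexpr std::uint32_t getLocalTypeIndex(TypeID id) {
  return static_cast<std::uint32_t>((id & 0xFFFFFFFFu) >> QualifierBits);
}

// Qualifiers: the lowest 3 bits.
constexpr std::uint32_t getQualifiers(TypeID id) {
  return static_cast<std::uint32_t>(id & QualifierMask);
}
```

Under this assumed layout, `makeTypeID(2, 5, QualConst | QualVolatile)` yields `0x20000002D`. The extra shift and mask for the qualifier bits are the "slightly more complex bit operations" compared with a plain index.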

2024-06-21 09:21:40 +08:00

Chuanqi Xu

2f2ea3557b

[Serialization] No transitive identifier change (#92085)

Follow-up to https://github.com/llvm/llvm-project/pull/92083

The motivation is still to cut off unnecessary changes in the
dependency chain. See the above link (recursively) for details.

After this patch (and the above patch), we can already do something
pretty interesting. For example,

#### Motivation example

```

//--- m-partA.cppm
export module m:partA;

export inline int getA() {
    return 43;
}

export class A {
public:
    int getMem();
};

export template <typename T>
class ATempl {
public:
    T getT();
};

//--- m-partA.v1.cppm
export module m:partA;

export inline int getA() {
    return 43;
}

// Now we add a new declaration without introducing a new type.
// A consuming module that doesn't use m:partA at all is expected to remain
// unchanged.
export inline int getA2() {
    return 88;
}

export class A {
public:
    int getMem();
    // Now we add a new declaration without introducing a new type.
    // A consuming module that doesn't use m:partA at all is expected to remain
    // unchanged.
    int getMem2();
};

export template <typename T>
class ATempl {
public:
    T getT();
    // Add a new declaration without introducing a new type.
    T getT2();
};

//--- m-partB.cppm
export module m:partB;

export inline int getB() {
    return 430;
}

//--- m.cppm
export module m;
export import :partA;
export import :partB;

//--- useBOnly.cppm
export module useBOnly;
import m;

export inline int get() {
    return getB();
}
```

In this example, module `m` exports two partitions, `:partA` and
`:partB`, and the consumer `useBOnly` only uses entities from
`:partB`. So we don't want the BMI of `useBOnly` to change if only
`:partA` changes. After this patch, we can achieve that as long as the
change to `:partA` doesn't introduce new types. (This restriction can be
lifted once no-transitive-type-change is implemented.)

As the example shows, when we change the implementation of `:partA` from
`m-partA.cppm` to `m-partA.v1.cppm`, we add a new function declaration
`getA2()` at the global namespace, a new member function `getMem2()`
to class `A`, and a new member function `getT2()` to class
template `ATempl`. Since `useBOnly` doesn't use `:partA` at all,
the BMI of `useBOnly` won't change after these
changes.

#### Design details

The method used in this patch is similar to
https://github.com/llvm/llvm-project/pull/92083 and
https://github.com/llvm/llvm-project/pull/86912. It extends the 32-bit
IdentifierID to 64 bits and uses the higher 32 bits to store the module
file index, so that the encoding of an identifier isn't affected by
other modules.
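
A minimal sketch of this encoding, with hypothetical names (`makeIdentifierID`, `getModuleFileIndex`, `getLocalIndex`) that are not Clang's API. For contrast it also includes `flatID`, a simplified scheme assumed purely for illustration, in which IDs are numbered by concatenating the identifier tables of all modules. It is not claimed to be the exact pre-patch encoding, but it shows how an ID can depend on the size of an unrelated module.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical names for illustration only; not Clang's actual API.
using IdentifierID = std::uint64_t;

// Higher 32 bits: module file index. Lower 32 bits: index local to that module.
constexpr IdentifierID makeIdentifierID(std::uint32_t moduleFileIndex,
                                        std::uint32_t localIndex) {
  return (static_cast<IdentifierID>(moduleFileIndex) << 32) | localIndex;
}

constexpr std::uint32_t getModuleFileIndex(IdentifierID id) {
  return static_cast<std::uint32_t>(id >> 32);
}

constexpr std::uint32_t getLocalIndex(IdentifierID id) {
  return static_cast<std::uint32_t>(id & 0xFFFFFFFFu);
}

// Contrast scheme (an assumption for illustration): a flat 32-bit ID computed
// as the total identifier count of all preceding modules plus the local index.
inline std::uint32_t flatID(const std::vector<std::uint32_t> &identifierCounts,
                            std::size_t moduleFileIndex,
                            std::uint32_t localIndex) {
  assert(moduleFileIndex < identifierCounts.size());
  std::uint32_t base = 0;
  for (std::size_t i = 0; i < moduleFileIndex; ++i)
    base += identifierCounts[i];
  return base + localIndex;
}
```

With the flat scheme, growing the first module from 10 to 11 identifiers moves the identifier at local index 3 of the second module from `flatID({10, 4}, 1, 3) == 13` to `flatID({11, 4}, 1, 3) == 14`. With the packed scheme, `makeIdentifierID(1, 3)` stays `0x100000003` no matter how many identifiers the first module contains.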

#### Overhead

Similar to https://github.com/llvm/llvm-project/pull/92083 and
https://github.com/llvm/llvm-project/pull/86912. The change is only
expected to increase the size of the on-disk .pcm files and not to affect
compile-time performance. In my experiments, the on-disk size
increased by only 1%+ and I observed no compile-time impact.

#### Future Plans

I'll try to do the same thing for type IDs. IIRC, adding a new type in
an unused unit won't change the dependency graph. I do think
this is a significant win, and it will be a pretty good answer to "why
modules are better than headers."

2024-06-20 13:30:05 +08:00

2 Commits