Haojian Wu 6a9f79e102 [pseudo] Eliminate the type-name identifier ambiguities in the grammar.
See https://reviews.llvm.org/D130626 for motivation.

Identifiers in the grammar fall into different categories (type-name, template-name,
namespace-name), which require semantic information to resolve. This patch
eliminates the "local" ambiguities in type-name and namespace-name, which
gives the parser a performance boost:

  - eliminate the separate type rules (class-name, enum-name, typedef-name) and
    fold them into a unified type-name; this removes the #1 type-name ambiguity and
    gives us a big performance boost;
  - remove the namespace-alias rules, as they're hard and uninteresting.
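
As a rough illustration of why these categories are only "locally" ambiguous, the
sketch below uses hypothetical names (C, E, Td, N, A) that are not part of the
patch. Each declaration and each qualified call has the same token shape, so a
parser without semantic information cannot tell the categories apart:

```cpp
#include <cassert>

// Hypothetical names, for illustration only (not taken from the patch).
struct C { int v = 0; };  // would be a class-name
enum E { kZero };         // would be an enum-name
typedef int Td;           // would be a typedef-name

// Each declaration below is `IDENTIFIER IDENTIFIER { } ;` at the token level.
// Only name lookup can tell a class from an enum from a typedef.
inline int fromClass()   { C a{};  return a.v; }
inline int fromEnum()    { E a{};  return static_cast<int>(a); }
inline int fromTypedef() { Td a{}; return a; }

namespace N { inline int g() { return 1; } }
namespace A = N;  // namespace alias

// `N::g()` and `A::g()` are both `IDENTIFIER :: IDENTIFIER ( )`.
inline int viaNamespace() { return N::g(); }
inline int viaAlias()     { return A::g(); }
```

Since the syntax is identical in every case, a single unified rule loses no
syntactic information; the distinction has to come from a later semantic step.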

Note that we could eliminate more and gain more performance (e.g. by folding template-name,
type-name, and namespace-name together), but at the current stage we'd like to keep all existing
categories of identifier (as they might assist in correlated disambiguation and
keep the representation of important concepts uniform).

| file                 | ambiguous nodes | forest size      | glrParse performance |
| SemaCodeComplete.cpp | 11k -> 5.7k     | 10.4MB -> 7.9MB  | 7.1MB/s -> 9.98MB/s  |
| AST.cpp              | 1.3k -> 0.73k   | 0.99MB -> 0.77MB | 6.7MB/s -> 8.4MB/s   |

Differential Revision: https://reviews.llvm.org/D130747
2022-08-17 14:30:53 +02:00

31 lines
1.5 KiB
C++

// RUN: clang-pseudo -grammar=cxx -source=%s --print-forest -print-statistics | FileCheck %s
void foo() {
T* a; // a multiply expression or a pointer declaration?
// CHECK: statement-seq~statement := <ambiguous>
// CHECK-NEXT: ├─statement~expression-statement := expression ;
// CHECK-NEXT: │ ├─expression~multiplicative-expression := multiplicative-expression * pm-expression
// CHECK-NEXT: │ │ ├─multiplicative-expression~IDENTIFIER := tok[5]
// CHECK-NEXT: │ │ ├─* := tok[6]
// CHECK-NEXT: │ │ └─pm-expression~id-expression := unqualified-id #1
// CHECK-NEXT: │ │   └─unqualified-id~IDENTIFIER := tok[7]
// CHECK-NEXT: │ └─; := tok[8]
// CHECK-NEXT: └─statement~simple-declaration := decl-specifier-seq init-declarator-list ;
// CHECK-NEXT:   ├─decl-specifier-seq~simple-type-specifier := <ambiguous>
// CHECK-NEXT:   │ ├─simple-type-specifier~IDENTIFIER := tok[5]
// CHECK-NEXT:   │ └─simple-type-specifier~IDENTIFIER := tok[5]
// CHECK-NEXT:   ├─init-declarator-list~ptr-declarator := ptr-operator ptr-declarator
// CHECK-NEXT:   │ ├─ptr-operator~* := tok[6]
// CHECK-NEXT:   │ └─ptr-declarator~id-expression =#1
// CHECK-NEXT:   └─; := tok[8]
}
// CHECK: 2 Ambiguous nodes:
// CHECK-NEXT: 1 simple-type-specifier
// CHECK-NEXT: 1 statement
// CHECK-EMPTY:
// CHECK-NEXT: 0 Opaque nodes:
// CHECK-EMPTY:
// CHECK-NEXT: Ambiguity: 0.20 misparses/token
// CHECK-NEXT: Unparsed: 0.00%
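
For readers unfamiliar with the `T* a;` ambiguity exercised above, here is a minimal
sketch of the two readings in ordinary C++. The namespace and function names are
hypothetical and exist only to keep the two contexts apart; the test itself
declares neither `T` nor `a`, which is why the forest keeps both alternatives.

```cpp
#include <cassert>

namespace t_is_a_type {
struct T {};
inline bool declaresPointer() {
  T* a = nullptr;  // declaration reading: `a` is a pointer to T
  return a == nullptr;
}
}  // namespace t_is_a_type

namespace t_is_a_value {
inline int multiplies() {
  int T = 6, a = 7;
  return T * a;  // expression reading: the product of T and a
}
}  // namespace t_is_a_value
```

Both bodies contain the token sequence `T * a`; only the earlier declaration of
`T` determines which reading is valid.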