llvm-project/llvm/tools/llvm-rc/ResourceScriptTokenList.def
Martin Storsjö 3673cc7a42
[llvm-rc] Don't interpret integer literals as octal numbers in rc.exe mode (#166915)
It turns out that rc.exe doesn't interpret integer literals as octal
numbers - but GNU windres does. Previously, llvm-rc did interpret them
as octal.

Fix the issue by stripping away the leading zeros during tokenization.
The alternative (which would be somewhat cleaner, as visible in
tokenizer.test) would be to retain them in the RCToken object, but strip
them out before calling
StringRef::getAsInteger. Alternatively to handle the radix detection
locally in llvm-rc code and not rely on getAsInteger to autodetect it.
Both of those solutions require propagating the IsWindres flag so that
it is available within RCToken, or at least when calling
RCToken::intValue().

Fixes: https://github.com/llvm/llvm-project/issues/144723
2025-11-08 22:24:29 +02:00

42 lines
2.0 KiB
C++

//===-- ResourceScriptTokenList.h -------------------------------*- C++-*-===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===---------------------------------------------------------------------===//
//
// This is a part of llvm-rc tokenizer. It lists all the possible tokens
// that might occur in a correct .rc script.
//
//===---------------------------------------------------------------------===//
// Long tokens. They might consist of more than one character.
TOKEN(Invalid) // Invalid token. Should not occur in a valid script.
TOKEN(Int) // Integer (decimal or hexadecimal, and possibly octal for windres).
TOKEN(String) // String value.
TOKEN(Identifier) // Script identifier (resource name or type).
TOKEN(LineComment) // Beginning of single-line comment.
TOKEN(StartComment) // Beginning of multi-line comment.
// Short tokens. They usually consist of exactly one character.
// The definitions are of the form SHORT_TOKEN(TokenName, TokenChar).
// TokenChar is the one-character token representation occuring in the correct
// .rc scripts.
SHORT_TOKEN(BlockBegin, '{') // Start of the script block; can also be BEGIN.
SHORT_TOKEN(BlockEnd, '}') // End of the block; can also be END.
SHORT_TOKEN(Comma, ',') // Comma - resource arguments separator.
SHORT_TOKEN(Plus, '+') // Addition operator.
SHORT_TOKEN(Minus, '-') // Subtraction operator.
SHORT_TOKEN(Asterisk, '*') // Multiplication operator.
SHORT_TOKEN(Slash, '/') // Division operator.
SHORT_TOKEN(Pipe, '|') // Bitwise-OR operator.
SHORT_TOKEN(Amp, '&') // Bitwise-AND operator.
SHORT_TOKEN(Tilde, '~') // Bitwise-NOT operator.
SHORT_TOKEN(LeftParen, '(') // Left parenthesis in the script expressions.
SHORT_TOKEN(RightParen, ')') // Right parenthesis.
#undef TOKEN
#undef SHORT_TOKEN