crt: Avoid best-fit mapping when constructing argv for main()

__getmainargs() parses the command line from the return value of
GetCommandLineA() which uses best-fit mapping when converting the
native wide-char command line to the process code page. This can
create security issues. For example, fullwidth quotation mark (U+FF02)
may get converted to ASCII quotation mark (U+0022), which will break
argument quoting and can result in argument injection, for example,
if malicious filenames are passed as an argument to a program. There
are other security issues with best-fit mapping too.

Call __wgetmainargs() to get wide-argv and convert it to narrow-argv
without best-fit mapping. If conversion isn't lossless, print an
error message and _exit(255) without calling main() at all. While
this might not be ideal with every application, with most applications
a lossy conversion would be a "garbage in, garbage out" situation.
For example, lossy conversion of filenames doesn't make any sense.

Note that if _dowildcard is set, then filenames from wildcard expansion
can prevent the application from running if those filenames contain
characters that cannot be converted losslessly.

Setting the process code page to UTF-8 using an application manifest
would also fix the issue (apart from unpaired surrogates which are
invalid UTF-16 but legal on Windows command line and in filenames).
Setting UTF-8 in a manifest is only supported on Windows 10 version 1903
and later, and switching to UTF-8 could create new issues in some apps.
The method in this commit works on old Windows versions too. Even with
UTF-8, this commit matters because it blocks unpaired surrogates on the
command line.

The best-fit conversion issue affects a large number of applications
that use main() instead of wmain(). It's better to fix the issue at
toolchain level instead of trying to fix every application separately.
Examples of applications where this has already been reported:

  - The report about the issue in curl has more technical details:
    https://hackerone.com/reports/2550951

  - In XZ Utils the issue was already solved by setting UTF-8 code page:
    https://tukaani.org/xz/#_argument_injection_on_windows
    (CVE-2024-47611)

Thanks to Orange Tsai and splitline from DEVCORE Research Team
for discovering this issue.

Signed-off-by: Lasse Collin <lasse.collin@tukaani.org>
Signed-off-by: LIU Hao <lh_mouse@126.com>
1 file changed