-rw-r--r--  content/mingw-code-golfing-quirks/asm_diff.png  bin 0 -> 180641 bytes
-rw-r--r--  content/mingw-code-golfing-quirks/correct_output.png  bin 0 -> 190224 bytes
-rw-r--r--  content/mingw-code-golfing-quirks/erroneous_output.png  bin 0 -> 149310 bytes
-rw-r--r--  content/mingw-code-golfing-quirks/index.md  68
-rw-r--r--  content/mingw-code-golfing-quirks/xmm1.png  bin 0 -> 221637 bytes
5 files changed, 68 insertions, 0 deletions
diff --git a/content/mingw-code-golfing-quirks/asm_diff.png b/content/mingw-code-golfing-quirks/asm_diff.png
new file mode 100644
index 0000000..6757c99
--- /dev/null
+++ b/content/mingw-code-golfing-quirks/asm_diff.png
Binary files differ
diff --git a/content/mingw-code-golfing-quirks/correct_output.png b/content/mingw-code-golfing-quirks/correct_output.png
new file mode 100644
index 0000000..fda3079
--- /dev/null
+++ b/content/mingw-code-golfing-quirks/correct_output.png
Binary files differ
diff --git a/content/mingw-code-golfing-quirks/erroneous_output.png b/content/mingw-code-golfing-quirks/erroneous_output.png
new file mode 100644
index 0000000..1f47bf5
--- /dev/null
+++ b/content/mingw-code-golfing-quirks/erroneous_output.png
Binary files differ
diff --git a/content/mingw-code-golfing-quirks/index.md b/content/mingw-code-golfing-quirks/index.md
new file mode 100644
index 0000000..d7e8ff5
--- /dev/null
+++ b/content/mingw-code-golfing-quirks/index.md
@@ -0,0 +1,68 @@
++++
+title = "MinGW, the x86_64 Calling Convention, and Code Golf"
+date = 2025-03-28
+authors = ["135e2 (Mole Shang)"]
+[taxonomies]
+tags = ["技术"]
++++
+
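+A friend and I were recently digging into code golf, and something bizarre happened while we were running tests with MinGW. I am writing it down here for the record.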
+
+<!-- more -->
+
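+The common code-golf online judges all compile with [TCC](https://www.bellard.org/tcc/). To mimic its behavior and save characters, we chose the C90 standard (which allows calling a function without declaring it first) and enabled the `-fno-builtin` compiler option (to silence annoying compiler warnings).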
+
+After running a few tests, black magic appeared:
+
+![❓](./erroneous_output.png)
+
+At first we both assumed MinGW was misbehaving, so we dragged the program into GDB and debugged away:
+
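+According to the [Windows x64 calling convention](https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170#example-of-argument-passing-2---all-floats), the second argument of `printf`, being a floating-point value, should be passed in `$xmm1`. Debugging showed that the value had in fact been written there correctly.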
+
+![xmm1 register](./xmm1.png)
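
To make the register rule concrete, here is a minimal C sketch of how the Windows x64 convention assigns the first four argument slots. The helper `arg_register` and the two lookup tables are hypothetical names introduced only for illustration. They model the rule from the linked documentation and are not part of any toolchain.

```c
#include <stddef.h>

/* Registers for the first four argument slots under the Windows x64
   convention. Integer/pointer and floating-point arguments share the
   same slot index; only the register file differs. */
static const char *const INT_REGS[4]   = {"rcx", "rdx", "r8", "r9"};
static const char *const FLOAT_REGS[4] = {"xmm0", "xmm1", "xmm2", "xmm3"};

/* Hypothetical helper: returns the register holding the argument at
   zero-based position `slot`, or NULL when it is passed on the stack. */
const char *arg_register(int slot, int is_floating_point)
{
    if (slot < 0 || slot >= 4)
        return NULL;
    return is_floating_point ? FLOAT_REGS[slot] : INT_REGS[slot];
}
```

In this model, `arg_register(1, 1)` yields `"xmm1"`, which matches where GDB showed the `double` passed to `printf`.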
+
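+I stepped into the `printf` implementation with GDB and skimmed through it, but noticed nothing odd. Even after poking at all sorts of registers, we were still baffled.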
+
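+Then my friend thought of checking the compiler flags. After removing `-fno-builtin`, MinGW suddenly produced the correct result, albeit with quite a few more warnings: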
+
+![❗](./correct_output.png)
+
+So here is the question: what does enabling or disabling builtins have to do with the output of `printf`?
+
+~~Time for everyone's favorite compiler theory pop quiz~~
+
+The GCC documentation explains the `-fno-builtin` option as follows:
+```
+-fno-builtin
+-fno-builtin-function
+
+ Don't recognize built-in functions that do not begin with
+ __builtin_ as prefix. GCC normally generates special code to
+ handle certain built-in functions more efficiently; for
+ instance, calls to "alloca" may become single instructions
+ which adjust the stack directly, and calls to "memcpy" may
+ become inline copy loops. The resulting code is often both
+ smaller and faster, but since the function calls no longer
+ appear as such, you cannot set a breakpoint on those calls, nor
+ can you change the behavior of the functions by linking with a
+ different library. In addition, when a function is recognized
+ as a built-in function, GCC may use information about that
+ function to warn about problems with calls to that function, or
+ to generate more efficient code, even if the resulting code
+ still contains calls to that function. For example, warnings
+ are given with -Wformat for bad calls to "printf" when "printf"
+ is built in and "strlen" is known not to modify global memory.
+```
+
+Rereading the [Windows x64 calling convention](https://learn.microsoft.com/en-us/cpp/build/x64-calling-convention?view=msvc-170#varargs) carefully, I noticed this line of fine print in the `Varargs` section:
+
+> For floating-point values only, both the integer register and the floating-point register must contain the value, in case the callee expects the value in the integer registers.
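
The rule quoted above can be illustrated with a small C sketch. It assumes IEEE 754 doubles (as on x86_64) and uses two hypothetical helpers, `double_to_bits` and `bits_to_double`. The first shows the bit pattern a caller is expected to mirror into the integer register. The second shows what a callee would read back from that register, for example if it happened to hold zero instead.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: the 64-bit pattern a caller would copy into the
   integer register so that it mirrors the value in the XMM register. */
uint64_t double_to_bits(double value)
{
    uint64_t bits;
    memcpy(&bits, &value, sizeof bits);
    return bits;
}

/* Hypothetical helper: the double a callee obtains when it interprets
   the contents of an integer register slot as a floating-point value. */
double bits_to_double(uint64_t bits)
{
    double value;
    memcpy(&value, &bits, sizeof value);
    return value;
}
```

For example, `double_to_bits(1.0)` is `0x3FF0000000000000`, while `bits_to_double(0)` is `0.0`. An integer register that was never filled in and happens to hold zero would therefore be read as `0.0`.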
+
+Comparing the assembly generated in the two cases, the reason becomes easy to see:
+
+![asm diff](./asm_diff.png)
+
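+Evidently, with no function prototype in scope and `-fno-builtin` enabled, the compiler had no way to learn the signature of `printf`. It could only fall back on an implicit, unprototyped declaration (effectively `int printf()`) and pass the arguments as it saw them at the call site, as if calling a fixed-argument function taking `(const char*, double)`. It therefore did not follow the `varargs` convention of also copying the value into the general-purpose register `$rdx`, so the `printf` in the Windows runtime library, as the callee, always read its second argument from `$rdx` and got all zeros (the stale value of `$rdx`). Once the option is turned off, even without a written prototype, the compiler obtains the full signature of `printf` (`int printf(const char*, ...)`) through its built-in function matching. It then generates a correct call that sets `$rdx`, and the result is naturally fine.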
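
One way to keep `-fno-builtin` and still give the compiler what it needs, sketched under the assumption that spending a few extra characters is acceptable, is to declare the variadic prototype by hand. The helper `print_double` is a hypothetical name used only for this example.

```c
/* Declaring the prototype by hand (instead of including <stdio.h>)
   is enough to tell the compiler that printf is variadic. */
int printf(const char *format, ...);

/* Hypothetical helper: prints one double followed by a newline and
   returns printf's result, i.e. the number of characters written. */
int print_double(double value)
{
    return printf("%f\n", value);
}
```

Because the prototype ends in `...`, the compiler knows the call is variadic and can apply the varargs rule when passing `value`.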
+
+C is wonderful, isn't it? xDDD
\ No newline at end of file
diff --git a/content/mingw-code-golfing-quirks/xmm1.png b/content/mingw-code-golfing-quirks/xmm1.png
new file mode 100644
index 0000000..009719c
--- /dev/null
+++ b/content/mingw-code-golfing-quirks/xmm1.png
Binary files differ