zhangguanzhang's Blog

Building arm64 ClickHouse for Kunpeng 920 and older Phytium CPUs

2025/02/13

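The official ClickHouse Docker image does not run on older arm64 CPUs, so ClickHouse has to be compiled for them.

Background

Our business team uses ClickHouse, at the time on version 22.8.5.29. When arm64 requirements came along later, we deployed ClickHouse (ck) on arm64 machines. Even with no business data its memory usage was very high, and it kept logging the following: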

__1::__function::__policy_storage const*) @ 0x8e8379c in /usr/bin/clickhouse
26. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x8e7f950 in /usr/bin/clickhouse
27. ? @ 0x8e82844 in /usr/bin/clickhouse
28. start_thread @ 0x7624 in /usr/lib/aarch64-linux-gnu/libpthread-2.31.so
29. ? @ 0xd149c in /usr/lib/aarch64-linux-gnu/libc-2.31.so
(version 22.8.5.29 (official build))
2025.02.08 16:10:33.504961 [ 54 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 241. DB::Exception: Memory limit (total) exceeded: would use 18.44 GiB (attempt to allocate chunk of 8709611 bytes), maximum: 18.00 GiB. OvercommitTracker decision: Memory overcommit isn't used. Waiting time or overcommit denominator are set to zero. (MEMORY_LIMIT_EXCEEDED), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x8dcd368 in /usr/bin/clickhouse
1. DB::Exception::Exception<char const*, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, long&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string_view<char, std::__1::char_traits<char> > >(int, fmt::v8::basic_format_string<char, fmt::v8::type_identity<char const*>::type, fmt::v8::type_identity<char const*>::type, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::type, fmt::v8::type_identity<long&>::type, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::type, fmt::v8::type_identity<std::__1::basic_string_view<char, std::__1::char_traits<char> > >::type>, char const*&&, char const*&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, long&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, std::__1::basic_string_view<char, std::__1::char_traits<char> >&&) @ 0x8dbfe08 in /usr/bin/clickhouse
2. MemoryTracker::allocImpl(long, bool, MemoryTracker*) @ 0x8dbf6d4 in /usr/bin/clickhouse
3. MemoryTracker::allocImpl(long, bool, MemoryTracker*) @ 0x8dbf0f0 in /usr/bin/clickhouse
4. MemoryTracker::allocImpl(long, bool, MemoryTracker*) @ 0x8dbf0f0 in /usr/bin/clickhouse
5. DB::Memory<Allocator<false, false> >::alloc(unsigned long) @ 0x8e26118 in /usr/bin/clickhouse
6. DB::WriteBufferFromFile::WriteBufferFromFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, int, unsigned int, char*, unsigned long) @ 0x8e46138 in /usr/bin/clickhouse
7. DB::DiskLocal::writeFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, DB::WriteMode, DB::WriteSettings const&) @ 0x1141bc88 in /usr/bin/clickhouse
8. DB::DataPartStorageBuilderOnDisk::writeFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, DB::WriteSettings const&) @ 0x1230e6e8 in /usr/bin/clickhouse
9. DB::MergeTreeDataPartWriterOnDisk::Stream::Stream(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::IDataPartStorageBuilder> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::ICompressionCodec> const&, unsigned long, DB::WriteSettings const&) @ 0x12441ef0 in /usr/bin/clickhouse

A search turned up several issues, all of which recommend upgrading.

Process

arm64 runtime environment

$ lscpu
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 16
NUMA node(s): 1
Vendor ID: Phytium
Model: 3
Model name: ARMv8 CPU
Stepping: 0x1
CPU max MHz: 2100.0000
CPU min MHz: 2100.0000
BogoMIPS: 100.00
L1d cache: 1 MiB
L1i cache: 1 MiB
L2 cache: 8 MiB
L3 cache: 512 MiB
NUMA node0 CPU(s): 0-15
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
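Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

Fails to start after upgrading

According to the official ClickHouse update documentation, upstream tries to maintain a one-year compatibility window: an upgrade is supported when the two versions are less than one year apart, or fewer than two LTS versions apart. The 23.x LTS release is v23.8.16.40-lts. After switching to that image, the container would not start: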

/entrypoint.sh: line 40:    23 Illegal instruction     (core dumped) clickhouse extract-from-config --config-file "$CLICKHOUSE_CONFIG" --key='storage_configuration.disks.*.path'
/entrypoint.sh: line 41: 25 Illegal instruction (core dumped) clickhouse extract-from-config --config-file "$CLICKHOUSE_CONFIG" --key='storage_configuration.disks.*.metadata_path'

I then tried the following versions:

23.5.5.92
23.3.22.3
23.3.9.55
23.3.1
22.12.6.22
22.8.21.38
22.10.7.13
22.10.1
22.9.1.2603
22.9.7.34

Only the 22.9 releases (and older) would start. The 22.10 changelog explains why: https://clickhouse.com/docs/en/whats-new/changelog/2022#-clickhouse-release-2210-2022-10-25

Aarch64 binaries now require at least ARMv8.2, released in 2016. Most notably, this enables use of ARM LSE, i.e. native atomic operations. \
Also, CMake build option "NO_ARMV81_OR_HIGHER" has been added to allow compilation of binaries for older ARMv8.0 hardware, e.g. Raspberry Pi 4. #41610 (Robert Schulze).

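Starting with 22.10, ClickHouse uses the ARM LSE instructions for atomic operations. LSE was introduced with ARMv8.1 and the official binaries now target ARMv8.2, so they crash with "Illegal instruction" on ARMv8.0 CPUs. The same change added the CMake option NO_ARMV81_OR_HIGHER so that binaries can still be compiled for ARMv8.0.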
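As a rough pre-flight check, the two conditions above can be combined in a few lines of Python. This is only a sketch: the function names are made up for illustration, it assumes the arm64 Linux kernel reports LSE support as the `atomics` flag in `lscpu` or `/proc/cpuinfo`, and it takes 22.10 as the first release whose official arm64 binaries need LSE, as the changelog entry states.

```python
# Hypothetical helper names; a sketch, not part of ClickHouse.

def parse_version(text):
    """Turn '23.8.16.40' or 'v23.8.16.40-lts' into a tuple of ints."""
    core = text.lstrip("v").split("-")[0]
    return tuple(int(part) for part in core.split("."))


def cpu_has_lse(flags_line):
    """Assumption: arm64 Linux lists LSE atomics as the 'atomics' flag."""
    return "atomics" in flags_line.split()


def official_binary_should_run(version, flags_line):
    """Official arm64 builds from 22.10 on need LSE; older ones do not."""
    if parse_version(version)[:2] < (22, 10):
        return True
    return cpu_has_lse(flags_line)


if __name__ == "__main__":
    # Flags copied from the lscpu output of the Phytium machine above.
    phytium_flags = "fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid"
    for v in ("22.9.7.34", "22.10.1", "v23.8.16.40-lts"):
        print(v, official_binary_should_run(v, phytium_flags))
```

If the check comes back false for the version you want, you need a build compiled with NO_ARMV81_OR_HIGHER.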

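Reading the build documentation

The official documentation is extensive but feels scattered. Following PR #41987 from the changelog entry above, I saw that upstream builds with GitHub Actions. GitHub Actions only keeps a limited build history, so the BuilderBinAarch64V80Compat job from that PR could no longer be found. In the current .github/workflows/master.yml I found build_arm_v80compat instead, but it uses the following build command: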

python3 -m praktika run 'Build (arm_v80compat)' --workflow "MasterCI" --ci

praktika cannot be found on PyPI, so I opened a recent run of that action to see how the build is actually done, and found the relevant log:

INFO:root:Pulling image clickhouse/binary-builder:54b46bb22708 - done
INFO:root:Going to run packager with cd /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/docker/packager && \
CMAKE_FLAGS='-DENABLE_CLICKHOUSE_SELF_EXTRACTING=1' ./packager \
--output-dir=/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build \
--package-type=binary --compiler=clang-19-aarch64-v80compat \
--cache=sccache --s3-rw-access --s3-bucket=clickhouse-builds \
--docker-image-version=54b46bb22708 --with-profiler --with-buzzhouse --version=25.2.1.1773 --official

2025-02-10 22:37:56,547 Will build ClickHouse pkg with cmd: 'docker run --network=host --user=1000:1000 --rm \
--volume=/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build:/output \
--volume=/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse:/build \
-e OUTPUT_DIR=/output -e DEB_ARCH=arm64 -e CC=clang-19 -e CXX=clang++-19 \
-e BUILD_TYPE=None -e SCCACHE_BUCKET=clickhouse-builds -e SCCACHE_S3_KEY_PREFIX=ccache/sccache \
-e VERSION_STRING='25.2.1.1773' \
-e CMAKE_FLAGS="$CMAKE_FLAGS -DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-aarch64.cmake \
-DNO_ARMV81_OR_HIGHER=1 -DCMAKE_C_COMPILER=clang-19 -DCMAKE_CXX_COMPILER=clang++-19 \
-DCOMPILER_CACHE=sccache -DENABLE_BUILD_PROFILING=1 \
-DENABLE_BUZZHOUSE=1 -DCLICKHOUSE_OFFICIAL_BUILD=1" \
-e BUILD_TARGET='clickhouse-bundle' --volume=/home/ubuntu/.cargo/registry:/rust/cargo/registry \
clickhouse/binary-builder:54b46bb22708'

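Searching the source for the keyword binary-builder gave me a rough picture of the official build steps:

  • docker/packager provides the packager executable (actually a Python script) that drives the build
  • docker/packager/binary-builder builds a Docker image containing the build environment: llvm, rust, clang and so on

CI builds this image periodically and pushes it to Docker Hub, so if you would rather not install a toolchain on the host you can build with it. For the available options, see: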

$ cd docker/packager
$ ./packager --help

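Building

The official documentation also has a building-in-docker section, so I decided to build with the official Docker image. Install Docker on the machine beforehand and leave about 50 GB of free disk space, because the source tree is very large. A stable proxy helps too. I recommend doing the whole clone and build inside a screen session.

Fetching the source

There are a lot of submodules, so use git 2.x to avoid odd problems.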

git clone https://github.com/ClickHouse/ClickHouse.git

Switch to the target tag and fetch the submodules:

cd ClickHouse
git checkout v23.8.16.40-lts
git submodule update --init --recursive

Check that the submodules are complete:

git status
git submodule status

Only continue once everything has been fetched.

The failed build

Going by the contents of docker/packager/packager, a build for ARMv8.0 arm64 is started with:

cd ./docker/packager
./packager --package-type=binary --output-dir=build_results \
--compiler=clang-16-aarch64-v80compat

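This failed with an error about the clang version, because by default it pulls clickhouse/binary-builder:latest. The packager source and --help show an option, --docker-image-version, that selects the image tag. I looked on Docker Hub for a clickhouse/binary-builder image with clang 16 and found that the tags appear to be derived from commit IDs: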

$ git log -n1
commit e143a9039ba36ad0c25f2ed85503f36e88f61063 (HEAD, tag: v23.8.16.40-lts)
Merge: 8afc5bd5d1c a60c914df11
Author: Antonio Andelic <antonio2368@users.noreply.github.com>
Date: Thu Jul 25 20:50:17 2024 +0100

Merge pull request #66717 from ClickHouse/backport/23.8/66548

Backport #66548 to 23.8: Correctly track memory for `Allocator::realloc`

Searching for e143a903 turned up the tag 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063. The image is fairly large, so I synced it through the DaoCloud mirror and pulled it from there:

docker pull m.daocloud.io/docker.io/clickhouse/binary-builder:54187-e143a9039ba36ad0c25f2ed85503f36e88f61063
docker tag m.daocloud.io/docker.io/clickhouse/binary-builder:54187-e143a9039ba36ad0c25f2ed85503f36e88f61063 \
clickhouse/binary-builder:54187-e143a9039ba36ad0c25f2ed85503f36e88f61063
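Matching a builder tag to a commit can be scripted once the tag names are available. The sketch below assumes you already have a list of tag strings (for example copied from the Docker Hub tag page). The function name is illustrative, and the `<number>-<commit>` layout is inferred only from the single tag found above.

```python
def tags_for_commit(tags, commit_prefix):
    """Return tags whose part after the first '-' starts with commit_prefix."""
    matches = []
    for tag in tags:
        _, sep, sha = tag.partition("-")
        if sep and sha.startswith(commit_prefix):
            matches.append(tag)
    return matches


if __name__ == "__main__":
    # Example input; in practice this list would come from the registry.
    known_tags = [
        "latest",
        "54187-e143a9039ba36ad0c25f2ed85503f36e88f61063",
    ]
    print(tags_for_commit(known_tags, "e143a903"))
```

Tags without a `-` separator (such as `latest`) are simply skipped, since they carry no commit ID to compare against.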

Then build:

./docker/packager/packager --package-type=binary --compiler=clang-16-aarch64-v80compat  \
--output-dir=build_results --docker-image-version 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063

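I could not get past the resulting errors and opened issue 75923. Upstream replied that this version is no longer maintained and that they would not help with it, and told me to download the official build instead of compiling my own. The downloaded binary, of course, core dumps. I then looked for the official build_arm_v80compat artifact myself: there is only a single direct download link and no older versions. The link is visible in the action log: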

2025-02-10 22:46:45,343 Output placed into /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build
INFO:root:Built successfully
INFO:root:Build finished as success, log path /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build_log/build_log.log
INFO:botocore.credentials:Found credentials from IAM Role: ec2_admin
INFO:root:Processing file without compression
INFO:root:File is too large, do not provide content type
INFO:root:Upload /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse to https://s3.amazonaws.com/clickhouse-builds/master/aarch64v80compat/clickhouse-full Meta: {}
Notice: Binary static URL (with debug info): https://s3.amazonaws.com/clickhouse-builds/master/aarch64v80compat/clickhouse-full
INFO:root:Processing file without compression
INFO:root:File is too large, do not provide content type
INFO:root:Upload /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse-stripped to https://s3.amazonaws.com/clickhouse-builds/master/aarch64v80compat/clickhouse Meta: {}
Notice: Binary static URL (compact): https://s3.amazonaws.com/clickhouse-builds/master/aarch64v80compat/clickhouse
Run action done for: [binary_aarch64_v80compat]
INFO:botocore.credentials:Found credentials from IAM Role: ec2_admin
INFO:root:Get token with 4322 remaining requests
INFO:root:User robot-clickhouse-ci-1 with 4322 remaining requests is used
{}
=== Run script finished ===

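Every action run has such a link, but the oldest runs are gone because GitHub Actions only keeps a limited build history. From the log, the builds themselves appear to be stored on S3: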

=== Post run script [Build (arm_v80compat)], workflow [MasterCI] ===
Job provides s3 artifacts [[Artifact.Config(name='CH_ARMV80C_DARWIN_BIN', type='s3', path='/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse', _provided_by='Build (arm_v80compat)', _s3_path='')]]
Run command: [ls -l /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse]
-rwxr-xr-x 1 ubuntu ubuntu 770032980 Feb 10 23:27 /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse
Run command [aws s3 cp /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse s3://clickhouse-builds/REFs/master/57b1cafdcb704280f05ee61345e8fbad9a6af5ed/build_arm_v80compat/clickhouse]
Artifact report enabled and will be uploaded: [{'build_urls': ['https://clickhouse-builds.s3.amazonaws.com/REFs/master/57b1cafdcb704280f05ee61345e8fbad9a6af5ed/build_arm_v80compat/clickhouse']}]
Run command [aws s3 cp ./ci/tmp/artifact_report_build_arm_v80compat.json s3://clickhouse-builds/REFs/master/57b1cafdcb704280f05ee61345e8fbad9a6af5ed/build_arm_v80compat/artifact_report_build_arm_v80compat.json]
Insert results to CIDB

I guessed the download URL from that pattern, but it returns 403:

wget https://clickhouse-builds.s3.amazonaws.com/REFs/master/e143a9039ba36ad0c25f2ed85503f36e88f61063/build_arm_v80compat/clickhouse
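The URL guess above follows mechanically from the two forms that appear in the CI log. As a sketch (the function names are mine, and the path layout is only what the log lines show, not a documented interface):

```python
S3_BUCKET = "clickhouse-builds"  # bucket name as it appears in the CI log


def s3_to_https(s3_uri):
    """Rewrite s3://clickhouse-builds/<key> into the https form used in the log."""
    prefix = f"s3://{S3_BUCKET}/"
    if not s3_uri.startswith(prefix):
        raise ValueError(f"unexpected bucket in {s3_uri!r}")
    return f"https://{S3_BUCKET}.s3.amazonaws.com/{s3_uri[len(prefix):]}"


def artifact_url(commit, job="build_arm_v80compat", ref="master", name="clickhouse"):
    """Guess the artifact URL for a commit, following the REFs/<ref>/<sha>/<job>/ layout."""
    return s3_to_https(f"s3://{S3_BUCKET}/REFs/{ref}/{commit}/{job}/{name}")


if __name__ == "__main__":
    print(artifact_url("e143a9039ba36ad0c25f2ed85503f36e88f61063"))
```

For the v23.8.16.40-lts commit this yields exactly the URL that returned 403, so the pattern seems to apply only to commits that this CI job actually built.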

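Since v23 was said to be unmaintained, I tried the v24 LTS release as well, with the same result. I reported those errors in issue 75960.

The successful build

All the errors came from two submodules, contrib/sysroot and contrib/thrift. Their history on GitHub shows they have barely been updated, so the modules themselves did not look like the problem. The build write-ups I could find were mostly for versions before 22.10. The version in Kylin's yum repository is also quite old, so I asked Kylin whether they had a newer rpm package. They sent over a document for building a newer version from source, which compiles version 25 natively on arm64. The rough steps are: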

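  • Install cmake 3.26 from an rpm package
  • Build and install gcc-11.3.0 and switch the std library
  • Build and install clang-18.1.0
  • Build and install llvm-18.1.0
  • Install rust 1.80.0
  • Install ccache from the src.rpm
  • Export CC=clang CXX=clang++, then: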
    • mkdir build && cd build
    • cmake .. -DCMAKE_INSTALL_PREFIX=/opt/clickhouse-bin -DNO_ARMV81_OR_HIGHER=1
    • ninja -j32
    • ninja install

On the build side, this is the packager logic behind --compiler=clang-16-aarch64-v80compat:

ARM_V80COMPAT_SUFFIX = "-aarch64-v80compat"
...
is_cross_arm_v80compat = compiler.endswith(ARM_V80COMPAT_SUFFIX)
...
elif is_cross_arm_v80compat:
    cc = compiler[: -len(ARM_V80COMPAT_SUFFIX)]
    cmake_flags.append(
        "-DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-aarch64.cmake"
    )
    cmake_flags.append("-DNO_ARMV81_OR_HIGHER=1")
    result.append("DEB_ARCH=arm64")
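To make the effect of the suffix concrete, here is a self-contained sketch of just this branch. The function name and the fallback branch are my own simplification. The real packager handles several other compiler suffixes that are not modelled here.

```python
ARM_V80COMPAT_SUFFIX = "-aarch64-v80compat"


def split_compiler(compiler):
    """Return (cc, cmake_flags, env) for a --compiler value.

    Only the v80compat branch quoted above is modelled. Anything else is
    treated as a plain native compiler name, which is a simplification.
    """
    cmake_flags = []
    env = []
    if compiler.endswith(ARM_V80COMPAT_SUFFIX):
        # Strip the suffix to get the real compiler, e.g. clang-16.
        cc = compiler[: -len(ARM_V80COMPAT_SUFFIX)]
        # The cross toolchain file is what turns this into a cross build.
        cmake_flags.append(
            "-DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-aarch64.cmake"
        )
        cmake_flags.append("-DNO_ARMV81_OR_HIGHER=1")
        env.append("DEB_ARCH=arm64")
    else:
        cc = compiler  # no toolchain file and no ARMv8.0 flag are injected
    return cc, cmake_flags, env


if __name__ == "__main__":
    print(split_compiler("clang-16-aarch64-v80compat"))
    print(split_compiler("clang-16"))
```

The second call shows why a native arm64 build has to pass -DNO_ARMV81_OR_HIGHER=1 by hand: without the suffix, nothing adds it.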

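-DNO_ARMV81_OR_HIGHER=1 is mandatory, while the injected -DCMAKE_TOOLCHAIN_FILE configures the cross-compilation toolchain. Since cross-compiling kept failing, one option was to do what Kylin does: build clang and friends from source on an arm64 machine and then compile ClickHouse there. But the freshly provisioned Kylin arm64 machine had no proxy, and downloading gcc and llvm was very slow. While looking at clickhouse/binary-builder I noticed that it also ships an arm64 image. So on an arm64 host, leaving out ARM_V80COMPAT_SUFFIX should make the build use the image's native arm64 compiler, and adding the -DNO_ARMV81_OR_HIGHER=1 option on top would match what Kylin does. As before, on a well-specced arm64 machine:

  • Clone the source, check out v23.8.16.40-lts and fetch the submodules
  • Pull the matching clickhouse/binary-builder image and tag it

Reading the script gives the build command and options: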

CMAKE_FLAGS='-DNO_ARMV81_OR_HIGHER=1' ./docker/packager/packager --package-type=binary \
--output-dir=build_results --docker-image-version 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063

This time the build finally got past contrib/sysroot and contrib/thrift, but it failed later with:

Feb 12 08:22:09 [3590/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/SubtargetFeatureInfo.cpp.o
Feb 12 08:22:10 [3591/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/CompressInstEmitter.cpp.o
Feb 12 08:22:10 [3592/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/X86DisassemblerTables.cpp.o
Feb 12 08:22:10 [3593/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/CodeExpander.cpp.o
Feb 12 08:22:11 [3594/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagEdge.cpp.o
Feb 12 08:22:11 [3595/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/WebAssemblyDisassemblerEmitter.cpp.o
Feb 12 08:22:11 [3596/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/X86RecognizableInstr.cpp.o
Feb 12 08:22:11 [3596/10884] cd /build/build_docker/rust/skim && /usr/bin/cmake -E make_directory /build/build_docker/rust/skim/RelWithDebInfo && /usr/bin/cmake -E env CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=/usr/bin/clang++-16 CC_aarch64-unknown-linux-gnu=/usr/bin/clang-16 HOST_CC=/usr/bin/clang-16 CXX_aarch64-unknown-linux-gnu=/usr/bin/clang++-16 HOST_CXX=/usr/bin/clang++-16 CORROSION_BUILD_DIR=/build/build_docker/rust/skim CARGO_BUILD_RUSTC=/rust/rustup/toolchains/nightly-2023-07-04-aarch64-unknown-linux-gnu/bin/rustc /rust/rustup/toolchains/nightly-2023-07-04-aarch64-unknown-linux-gnu/bin/cargo rustc --target=aarch64-unknown-linux-gnu --package _ch_rust_skim_rust --manifest-path /build/build_docker/rust/skim/Cargo.toml --target-dir /build/build_docker/RelWithDebInfo/cargo/build --profile=release -- -Cdefault-linker-libraries=no -Clink-args=--target=aarch64-linux-gnu && /usr/bin/cmake -E copy_if_different /build/build_docker/RelWithDebInfo/cargo/build/aarch64-unknown-linux-gnu/release/lib_ch_rust_skim_rust.a /build/build_docker/rust/skim/RelWithDebInfo
Feb 12 08:22:11 error: package `cxx v1.0.140` cannot be built because it requires rustc 1.73 or newer, while the currently active rustc version is 1.72.0-nightly
Feb 12 08:22:11 Either upgrade to rustc 1.73 or newer, or use
Feb 12 08:22:11 cargo update -p cxx@1.0.140 --precise ver
Feb 12 08:22:11 where `ver` is the latest version of `cxx` supporting rustc 1.72.0-nightly
Feb 12 08:22:12 [3600/10884] Linking CXX static library contrib/ulid-c-cmake/lib_ulid.a
Feb 12 08:22:12 FAILED: rust/skim/CMakeFiles/cargo-build__ch_rust_skim_rust rust/skim/RelWithDebInfo/lib_ch_rust_skim_rust.a /build/build_docker/rust/skim/CMakeFiles/cargo-build__ch_rust_skim_rust /build/build_docker/rust/skim/RelWithDebInfo/lib_ch_rust_skim_rust.a
Feb 12 08:22:12 cd /build/build_docker/rust/skim && /usr/bin/cmake -E make_directory /build/build_docker/rust/skim/RelWithDebInfo && /usr/bin/cmake -E env CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=/usr/bin/clang++-16 CC_aarch64-unknown-linux-gnu=/usr/bin/clang-16 HOST_CC=/usr/bin/clang-16 CXX_aarch64-unknown-linux-gnu=/usr/bin/clang++-16 HOST_CXX=/usr/bin/clang++-16 CORROSION_BUILD_DIR=/build/build_docker/rust/skim CARGO_BUILD_RUSTC=/rust/rustup/toolchains/nightly-2023-07-04-aarch64-unknown-linux-gnu/bin/rustc /rust/rustup/toolchains/nightly-2023-07-04-aarch64-unknown-linux-gnu/bin/cargo rustc --target=aarch64-unknown-linux-gnu --package _ch_rust_skim_rust --manifest-path /build/build_docker/rust/skim/Cargo.toml --target-dir /build/build_docker/RelWithDebInfo/cargo/build --profile=release -- -Cdefault-linker-libraries=no -Clink-args=--target=aarch64-linux-gnu && /usr/bin/cmake -E copy_if_different /build/build_docker/RelWithDebInfo/cargo/build/aarch64-unknown-linux-gnu/release/lib_ch_rust_skim_rust.a /build/build_docker/rust/skim/RelWithDebInfo
Feb 12 08:22:12 [3602/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/SearchableTableEmitter.cpp.o
Feb 12 08:22:12 [3603/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/CTagsEmitter.cpp.o
Feb 12 08:22:12 [3604/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/VarLenCodeEmitterGen.cpp.o
Feb 12 08:22:12 [3605/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/InstrInfoEmitter.cpp.o
Feb 12 08:22:12 [3606/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagOperands.cpp.o
Feb 12 08:22:12 [3607/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagInstr.cpp.o
Feb 12 08:22:12 [3608/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDag.cpp.o
Feb 12 08:22:12 [3609/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/X86FoldTablesEmitter.cpp.o
Feb 12 08:22:12 [3610/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagPredicateDependencyEdge.cpp.o
Feb 12 08:22:12 [3611/10884] Building CXX object contrib/llvm-project/llvm/lib/BinaryFormat/CMakeFiles/LLVMBinaryFormat.dir/COFF.cpp.o
Feb 12 08:22:13 [3612/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagPredicate.cpp.o
Feb 12 08:22:13 [3613/10884] Building CXX object contrib/llvm-project/llvm/lib/BinaryFormat/CMakeFiles/LLVMBinaryFormat.dir/AMDGPUMetadataVerifier.cpp.o
Feb 12 08:22:14 [3614/10884] Building CXX object contrib/llvm-project/llvm/lib/BinaryFormat/CMakeFiles/LLVMBinaryFormat.dir/Dwarf.cpp.o
Feb 12 08:22:15 [3615/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/SubtargetEmitter.cpp.o
Feb 12 08:22:18 [3616/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/RegisterInfoEmitter.cpp.o
Feb 12 08:22:19 [3617/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchTree.cpp.o
Feb 12 08:22:24 [3618/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/GlobalISelEmitter.cpp.o

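The error says the Rust version, 1.72, is too old and 1.73 is required. Searching for nightly-2023-07-04 found nothing useful: nightly is a daily release, and I could not work out which date corresponds to 1.73. Using a newer binary-builder image directly would not work either, because its clang would not match the ClickHouse version being built. So I went through the history of the clickhouse/binary-builder Dockerfile and found that the next Rust update there is nightly-2024-12-01: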

RUN curl https://sh.rustup.rs -sSf | bash -s -- -y && \
chmod 777 -R /rust && \
rustup toolchain install nightly-2024-12-01 && \
rustup default nightly-2024-12-01 && \
rustup toolchain remove stable && \
rustup component add rust-src && \
rustup target add x86_64-unknown-linux-gnu && \
rustup target add aarch64-unknown-linux-gnu && \
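Because the cargo failure is buried in the middle of thousands of ninja lines, it can help to pull it out of the build log automatically. This is a sketch that assumes the message wording shown in the log above. The regular expression and function name are mine, and other cargo versions may phrase the error differently.

```python
import re

# Pattern written against the exact cargo message seen in this build log.
RUSTC_ERROR = re.compile(
    r"package `(?P<pkg>\S+) v(?P<pkg_ver>[\d.]+)` cannot be built because it "
    r"requires rustc (?P<need>[\d.]+) or newer, while the currently active "
    r"rustc version is (?P<have>\S+)"
)


def parse_rustc_error(line):
    """Return the package and rustc versions from a cargo MSRV error, or None."""
    match = RUSTC_ERROR.search(line)
    return match.groupdict() if match else None


if __name__ == "__main__":
    log_line = (
        "Feb 12 08:22:11 error: package `cxx v1.0.140` cannot be built because it "
        "requires rustc 1.73 or newer, while the currently active rustc version "
        "is 1.72.0-nightly"
    )
    print(parse_rustc_error(log_line))
```

Running it over each line of the saved build log and printing the non-None results surfaces the required and active rustc versions without scrolling.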

The failed build prints a complete docker run ... command at the end. Append bash to it, start the container, and upgrade Rust inside it to try this out:

# Remember to keep using screen
docker run .... bash
rustup toolchain install nightly-2024-12-01
rustup default nightly-2024-12-01
rustup component add rust-src #<--- probably not needed
rustup target add aarch64-unknown-linux-gnu #<--- probably not needed

Both the clickhouse/binary-builder image and its Dockerfile use this CMD:

CMD ["bash" "-c" "/build.sh 2>&1"]

So running /build.sh inside that container continues with the build. While it was compiling I read build.sh and found this logic:

if check_prebuild_exists /build/packages/pre-build
then
    # Execute all commands
    for file in /build/packages/pre-build/*.sh ;
    do
        # The script may want to modify environment variables. Why not to allow it to do so?
        # shellcheck disable=SC1090
        source "$file"
    done
else
    echo "There are no subcommands to execute :)"
fi

So for a non-interactive, scripted build, the steps after pulling the matching clickhouse/binary-builder image can be:

mkdir -p packages/pre-build
cat << EOF > packages/pre-build/update-rust.sh
rustup toolchain install nightly-2024-12-01
rustup default nightly-2024-12-01
EOF
CMAKE_FLAGS='-DNO_ARMV81_OR_HIGHER=1' ./docker/packager/packager --package-type=binary \
--output-dir=build_results --docker-image-version 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063
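The same hook can be generated from a script, which is handy if the toolchain date needs to vary between builds. This is a sketch with made-up function names. It assumes, as the docker run log earlier shows, that the source checkout is mounted at /build, so packages/pre-build in the source tree becomes /build/packages/pre-build in the container.

```python
from pathlib import Path
import tempfile


def write_rust_hook(source_root, toolchain="nightly-2024-12-01"):
    """Create packages/pre-build/update-rust.sh under the source checkout."""
    hook_dir = Path(source_root) / "packages" / "pre-build"
    hook_dir.mkdir(parents=True, exist_ok=True)
    hook = hook_dir / "update-rust.sh"
    hook.write_text(
        f"rustup toolchain install {toolchain}\n"
        f"rustup default {toolchain}\n"
    )
    return hook


def hooks_in_order(source_root):
    """List hooks roughly in the order the shell glob in build.sh would source them."""
    return sorted((Path(source_root) / "packages" / "pre-build").glob("*.sh"))


if __name__ == "__main__":
    # A temporary directory stands in for the ClickHouse checkout.
    with tempfile.TemporaryDirectory() as checkout:
        print(write_rust_hook(checkout).read_text())
        print([p.name for p in hooks_in_order(checkout)])
```

Since build.sh sources every *.sh file in that directory, a numeric prefix on the file names is an easy way to control ordering if more hooks are added.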

The result is big, about 2.9G by my measurement (the listing shows 3336977696 bytes for the clickhouse binary):

Feb 12 10:39:51 Average compiler                  0.000 s
Feb 12 10:39:51 Average cache read hit 0.000 s
Feb 12 10:39:51 Failed distributed compilations 0
Feb 12 10:39:51 Cache location Local disk: "/root/.cache/sccache"
Feb 12 10:39:51 Version (client) 0.5.4
Feb 12 10:39:51 + ccache --evict-older-than 1d
Feb 12 10:39:51 + '[' '' == 1 ']'
Feb 12 10:39:51 + '[' -n '' ']'
Feb 12 10:39:51 + ls -l /output
Feb 12 10:39:51 total 6200956
Feb 12 10:39:51 -rwxr-xr-x 1 root root 3336977696 Feb 12 10:39 clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-benchmark -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-client -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-compressor -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-copier -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-disks -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-extract-from-config -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-format -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-git-import -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-keeper -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-keeper-client -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-keeper-converter -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-local -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-obfuscator -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-server -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-static-files-disk-uploader -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-su -> clickhouse
Feb 12 10:39:51 -rwxr-xr-x 1 root root 3012799232 Feb 12 10:39 unit_tests_dbms

Then I rebuilt with a few extra options:

CMAKE_FLAGS='-DNO_ARMV81_OR_HIGHER=1 -DCMAKE_BUILD_TYPE=Release -DENABLE_TESTS=OFF -DENABLE_DEBUG=OFF -DSPLIT_DEBUG_SYMBOLS=ON' \
./docker/packager/packager --package-type=binary \
--output-dir=build_results --docker-image-version 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063

Feb 13 01:24:21 Failed distributed compilations       0
Feb 13 01:24:21 Cache location Local disk: "/root/.cache/sccache"
Feb 13 01:24:21 Version (client) 0.5.4
Feb 13 01:24:21 + ccache --evict-older-than 1d
Feb 13 01:24:21 + '[' '' == 1 ']'
Feb 13 01:24:21 + '[' -n '' ']'
Feb 13 01:24:21 + ls -l /output
Feb 13 01:24:21 total 680996
Feb 13 01:24:21 -rwxr-xr-x 1 root root 697336944 Feb 13 01:24 clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-benchmark -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-client -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-compressor -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-copier -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-disks -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-extract-from-config -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-format -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-git-import -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-keeper -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-keeper-client -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-keeper-converter -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-local -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-obfuscator -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-server -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-static-files-disk-uploader -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-su -> clickhouse
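To compare the two builds, the byte counts in the two ls -l listings above can be converted to GiB. The sketch below parses log lines in the format shown. The helper names are mine, and the parsing assumes the timestamp-prefixed layout of this particular build log.

```python
def size_from_ls_line(line):
    """Extract the size field from a timestamp-prefixed `ls -l` log line."""
    tokens = line.split()
    for i, token in enumerate(tokens):
        # The permissions column is 10 characters and starts with '-' or 'l'.
        if len(token) == 10 and token[0] in "-l":
            return int(tokens[i + 4])
    raise ValueError("no ls -l entry found in line")


def gib(size_bytes):
    """Convert bytes to GiB (1 GiB = 1024**3 bytes)."""
    return size_bytes / 1024 ** 3


if __name__ == "__main__":
    default_build = size_from_ls_line(
        "Feb 12 10:39:51 -rwxr-xr-x 1 root root 3336977696 Feb 12 10:39 clickhouse"
    )
    release_build = size_from_ls_line(
        "Feb 13 01:24:21 -rwxr-xr-x 1 root root 697336944 Feb 13 01:24 clickhouse"
    )
    print(f"default: {gib(default_build):.2f} GiB")
    print(f"release: {gib(release_build):.2f} GiB")
    print(f"ratio:   {default_build / release_build:.1f}x")
```

With the extra options, unit_tests_dbms is also no longer produced, so the output directory shrinks by more than the binary alone suggests.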

I packed up the files above and tested them on the machine where the official binary core dumps. They work:

./clickhouse local -q 'select 1'
1
./clickhouse-server --version
ClickHouse server version 23.8.16.1.

Building the Docker image

docker history --no-trunc confirmed that the official image is built from docker/server/Dockerfile.ubuntu, but the build image is based on ubuntu:22.04:

# cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

To avoid surprises and CVEs, ubuntu:22.04 is the safer base. Using the latest 22.04-based Dockerfile on the master branch as a reference, I put together docker/server/Dockerfile.fix:

FROM ubuntu:22.04

# see https://github.com/moby/moby/issues/4032#issuecomment-192327844
# It could be removed after we move on a version 23:04+
ARG DEBIAN_FRONTEND=noninteractive

# ARG for quick switch to a given ubuntu mirror
ARG apt_archive="http://archive.ubuntu.com"

# We shouldn't use `apt upgrade` to not change the upstream image. It's updated biweekly

# user/group precreated explicitly with fixed uid/gid on purpose.
# It is especially important for rootless containers: in that case entrypoint
# can't do chown and owners of mounted volumes should be configured externally.
# We do that in advance at the begining of Dockerfile before any packages will be
# installed to prevent picking those uid / gid by some unrelated software.
# The same uid / gid (101) is used both for alpine and ubuntu.
RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list \
&& groupadd -r clickhouse --gid=101 \
&& useradd -r -g clickhouse --uid=101 --home-dir=/var/lib/clickhouse --shell=/bin/bash clickhouse \
&& apt-get update \
&& apt-get install --yes --no-install-recommends \
ca-certificates \
locales \
tzdata \
wget \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*

ARG VERSION="25.1.3.23"

ARG single_binary_location_url=""

# install from a single binary
RUN if [ -n "${single_binary_location_url}" ]; then \
echo "installing from single binary url: ${single_binary_location_url}" \
&& rm -rf /tmp/clickhouse_binary \
&& mkdir -p /tmp/clickhouse_binary \
&& wget --progress=bar:force:noscroll "${single_binary_location_url}" -O /tmp/clickhouse_binary/clickhouse \
&& chmod +x /tmp/clickhouse_binary/clickhouse \
&& /tmp/clickhouse_binary/clickhouse install --user "clickhouse" --group "clickhouse" \
&& rm -rf /tmp/* ; \
fi

# The rest is the same in the official docker and in our build system
#docker-official-library:on

# post install
# we need to allow "others" access to clickhouse folder, because docker container
# can be started with arbitrary uid (openshift usecase)
RUN clickhouse-local -q 'SELECT * FROM system.build_options' \
&& mkdir -p /var/lib/clickhouse /var/log/clickhouse-server /etc/clickhouse-server /etc/clickhouse-client \
&& chmod ugo+Xrw -R /var/lib/clickhouse /var/log/clickhouse-server /etc/clickhouse-server /etc/clickhouse-client

RUN locale-gen en_US.UTF-8
ENV LANG=en_US.UTF-8
ENV TZ=UTC

RUN mkdir /docker-entrypoint-initdb.d

COPY docker_related_config.xml /etc/clickhouse-server/config.d/
COPY entrypoint.sh /entrypoint.sh

EXPOSE 9000 8123 9009
VOLUME /var/lib/clickhouse

ENV CLICKHOUSE_CONFIG=/etc/clickhouse-server/config.xml

ENTRYPOINT ["/entrypoint.sh"]

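Copying the binary into the image and then running clickhouse install would waste space, because the copied file stays behind in an overlay layer. So the binary is served over HTTP and downloaded during the image build instead: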

RANDOM_PORT=50358
# Serve build_results over HTTP so the Dockerfile can wget the binary
docker run -d --name ck -v "$PWD/build_results":/usr/share/nginx/html/ \
-p "${RANDOM_PORT}":80 \
m.daocloud.io/docker.io/library/nginx:alpine

cd docker/server/
# Build args are case-sensitive: the Dockerfile declares ARG VERSION
docker build . --build-arg VERSION="23.8.16.40" --network host \
--build-arg single_binary_location_url="http://127.0.0.1:${RANDOM_PORT}/clickhouse" \
-t clickhouse/clickhouse-server:23.8.16.40 -f Dockerfile.fix
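If pulling an nginx image just to serve one file feels heavy, Python's standard library can do the same job. This swaps nginx for `http.server` and is my substitution, not what I used for the build above. The function name is illustrative, and passing port 0 asks the OS for a free port instead of hard-coding one.

```python
import functools
import http.server
import pathlib
import tempfile
import threading
import urllib.request


def serve_directory(directory, port=0):
    """Serve `directory` on 127.0.0.1 in a background thread; return (server, base_url)."""
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=str(directory)
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server, f"http://127.0.0.1:{server.server_address[1]}"


if __name__ == "__main__":
    # Demo with a temporary directory standing in for build_results.
    with tempfile.TemporaryDirectory() as build_results:
        pathlib.Path(build_results, "clickhouse").write_bytes(b"placeholder")
        server, base_url = serve_directory(build_results)
        print("single_binary_location_url =", base_url + "/clickhouse")
        with urllib.request.urlopen(base_url + "/clickhouse", timeout=5) as resp:
            print("fetched", len(resp.read()), "bytes")
        server.shutdown()
        server.server_close()
```

For a real build the server would have to stay up while docker build runs with --network host, so the shutdown call belongs after the build finishes.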

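Other notes

Newer clickhouse/binary-builder tags no longer seem to be commit IDs. These are the tag-to-clang mappings I found:

  • dd5e777b6745: clang 18
  • 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063: clang 16

file output for the official binary and the self-built one, for comparison: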

clickhouse: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, BuildID[sha1]=3e70066de2db0f08f97ca310827184d61111dc22, not stripped
clickhouse: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 3.7.0, BuildID[sha1]=586fe5a059c988b60f99dcd794e68f0e4cf69619, not stripped

When switching to another version, the submodules need to be reset as well. Relevant commands:

git submodule update --recursive --checkout

git submodule foreach --recursive git reset --hard
git submodule update --recursive

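Also remember to clean the build_docker directory when switching versions.

References

I did not actually rely on much; these are simply the arm64 build write-ups that a search turns up.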
