zhangguanzhang's Blog

Building arm64 ClickHouse for Kunpeng 920 and older Phytium CPUs

2025/02/13

The official ClickHouse Docker images do not run on older arm64 CPUs, so you have to build your own.

Background

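A business team uses ClickHouse, at the time on version 22.8.5.29. When arm64 requirements came along later, we deployed ClickHouse (ck) on an arm64 machine and found its memory usage very high even with no business data, while it kept flooding the log with the following: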

__1::__function::__policy_storage const*) @ 0x8e8379c in /usr/bin/clickhouse
26. ThreadPoolImpl<std::__1::thread>::worker(std::__1::__list_iterator<std::__1::thread, void*>) @ 0x8e7f950 in /usr/bin/clickhouse
27. ? @ 0x8e82844 in /usr/bin/clickhouse
28. start_thread @ 0x7624 in /usr/lib/aarch64-linux-gnu/libpthread-2.31.so
29. ? @ 0xd149c in /usr/lib/aarch64-linux-gnu/libc-2.31.so
(version 22.8.5.29 (official build))
2025.02.08 16:10:33.504961 [ 54 ] {} <Error> void DB::MergeTreeBackgroundExecutor<DB::MergeMutateRuntimeQueue>::routine(DB::TaskRuntimeDataPtr) [Queue = DB::MergeMutateRuntimeQueue]: Code: 241. DB::Exception: Memory limit (total) exceeded: would use 18.44 GiB (attempt to allocate chunk of 8709611 bytes), maximum: 18.00 GiB. OvercommitTracker decision: Memory overcommit isn't used. Waiting time or overcommit denominator are set to zero. (MEMORY_LIMIT_EXCEEDED), Stack trace (when copying this message, always include the lines below):

0. DB::Exception::Exception(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, int, bool) @ 0x8dcd368 in /usr/bin/clickhouse
1. DB::Exception::Exception<char const*, char const*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, long&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >, std::__1::basic_string_view<char, std::__1::char_traits<char> > >(int, fmt::v8::basic_format_string<char, fmt::v8::type_identity<char const*>::type, fmt::v8::type_identity<char const*>::type, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::type, fmt::v8::type_identity<long&>::type, fmt::v8::type_identity<std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > >::type, fmt::v8::type_identity<std::__1::basic_string_view<char, std::__1::char_traits<char> > >::type>, char const*&&, char const*&&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, long&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> >&&, std::__1::basic_string_view<char, std::__1::char_traits<char> >&&) @ 0x8dbfe08 in /usr/bin/clickhouse
2. MemoryTracker::allocImpl(long, bool, MemoryTracker*) @ 0x8dbf6d4 in /usr/bin/clickhouse
3. MemoryTracker::allocImpl(long, bool, MemoryTracker*) @ 0x8dbf0f0 in /usr/bin/clickhouse
4. MemoryTracker::allocImpl(long, bool, MemoryTracker*) @ 0x8dbf0f0 in /usr/bin/clickhouse
5. DB::Memory<Allocator<false, false> >::alloc(unsigned long) @ 0x8e26118 in /usr/bin/clickhouse
6. DB::WriteBufferFromFile::WriteBufferFromFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, int, unsigned int, char*, unsigned long) @ 0x8e46138 in /usr/bin/clickhouse
7. DB::DiskLocal::writeFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, DB::WriteMode, DB::WriteSettings const&) @ 0x1141bc88 in /usr/bin/clickhouse
8. DB::DataPartStorageBuilderOnDisk::writeFile(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, unsigned long, DB::WriteSettings const&) @ 0x1230e6e8 in /usr/bin/clickhouse
9. DB::MergeTreeDataPartWriterOnDisk::Stream::Stream(std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::IDataPartStorageBuilder> const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::shared_ptr<DB::ICompressionCodec> const&, unsigned long, DB::WriteSettings const&) @ 0x12441ef0 in /usr/bin/clickhouse

Searching turned up several issues, all of which recommended upgrading.

Process

arm64 runtime environment

$ lscpu
Architecture: aarch64
CPU op-mode(s): 64-bit
Byte Order: Little Endian
CPU(s): 16
On-line CPU(s) list: 0-15
Thread(s) per core: 1
Core(s) per socket: 1
Socket(s): 16
NUMA node(s): 1
Vendor ID: Phytium
Model: 3
Model name: ARMv8 CPU
Stepping: 0x1
CPU max MHz: 2100.0000
CPU min MHz: 2100.0000
BogoMIPS: 100.00
L1d cache: 1 MiB
L1i cache: 1 MiB
L2 cache: 8 MiB
L3 cache: 512 MiB
NUMA node0 CPU(s): 0-15
Vulnerability Itlb multihit: Not affected
Vulnerability L1tf: Not affected
Vulnerability Mds: Not affected
Vulnerability Meltdown: Not affected
Vulnerability Mmio stale data: Not affected
Vulnerability Spec store bypass: Not affected
Vulnerability Spectre v1: Mitigation; __user pointer sanitization
Vulnerability Spectre v2: Not affected
Vulnerability Srbds: Not affected
Vulnerability Tsx async abort: Not affected
Flags: fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid

It won't start after the upgrade

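According to the official ClickHouse update documentation, upstream tries to maintain a one-year compatibility window: you can upgrade when the two versions are less than a year apart or fewer than two LTS versions apart. The 23.x LTS release is v23.8.16.40-lts, but after switching to that image the container would not start: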

/entrypoint.sh: line 40:    23 Illegal instruction     (core dumped) clickhouse extract-from-config --config-file "$CLICKHOUSE_CONFIG" --key='storage_configuration.disks.*.path'
/entrypoint.sh: line 41: 25 Illegal instruction (core dumped) clickhouse extract-from-config --config-file "$CLICKHOUSE_CONFIG" --key='storage_configuration.disks.*.metadata_path'

I then tried the following versions:

23.5.5.92
23.3.22.3
23.3.9.55
23.3.1
22.12.6.22
22.8.21.38
22.10.7.13
22.10.1
22.9.1.2603
22.9.7.34

Nothing newer than 22.9 would start. The 22.10 changelog explains why: https://clickhouse.com/docs/en/whats-new/changelog/2022#-clickhouse-release-2210-2022-10-25

Aarch64 binaries now require at least ARMv8.2, released in 2016. Most notably, this enables use of ARM LSE, i.e. native atomic operations. \
Also, CMake build option "NO_ARMV81_OR_HIGHER" has been added to allow compilation of binaries for older ARMv8.0 hardware, e.g. Raspberry Pi 4. #41610 (Robert Schulze).

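Starting with 22.10, the official aarch64 binaries target ARMv8.2 and use the ARM LSE instructions (introduced in ARMv8.1) for atomic operations, which ARMv8.0 CPUs lack. The same change added a CMake option, NO_ARMV81_OR_HIGHER, so that binaries can still be built for ARMv8.0.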
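
As a rough sanity check, both conditions can be sketched in Python: whether a release falls under the 22.10+ requirement quoted from the changelog, and whether a CPU's flags advertise LSE. The function names are made up for illustration, and the sketch assumes the Linux kernel exposes LSE as the "atomics" flag on aarch64. It is a necessary-condition check, not a guarantee that a given binary will run.

```python
# Versions tried in this post, in the order listed.
TRIED = [
    "23.5.5.92", "23.3.22.3", "23.3.9.55", "23.3.1", "22.12.6.22",
    "22.8.21.38", "22.10.7.13", "22.10.1", "22.9.1.2603", "22.9.7.34",
]

# Flags reported by lscpu on the Phytium machine in this post.
PHYTIUM_FLAGS = "fp asimd evtstrm aes pmull sha1 sha2 crc32 cpuid"


def needs_armv82(version: str) -> bool:
    """Official aarch64 builds from 22.10 on require ARMv8.2 (per the changelog)."""
    major, minor = (int(part) for part in version.split(".")[:2])
    return (major, minor) >= (22, 10)


def has_lse_atomics(flags: str) -> bool:
    """Assumption: the kernel reports LSE support as the 'atomics' flag."""
    return "atomics" in flags.split()


if __name__ == "__main__":
    # Releases that predate the ARMv8.2 requirement.
    print([v for v in TRIED if not needs_armv82(v)])
    # Whether the Phytium CPU advertises LSE atomics.
    print(has_lse_atomics(PHYTIUM_FLAGS))
```

On a live machine the flags string would come from the Features line of /proc/cpuinfo rather than a constant.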

Reading the build documentation

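The official documentation is extensive but feels fragmented. Following PR #41987 from the changelog entry above, I saw that upstream builds with GitHub Actions. GitHub Actions only keeps a limited build history, so the BuilderBinAarch64V80Compat job from that PR could no longer be found, but the newer .github/workflows/master.yml contains a build_arm_v80compat job. It builds with the following command, though: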

python3 -m praktika run 'Build (arm_v80compat)' --workflow "MasterCI" --ci

praktika is not available on PyPI, so I opened the latest run of that action to see how the build actually works and found this in its log:

INFO:root:Pulling image clickhouse/binary-builder:54b46bb22708 - done
INFO:root:Going to run packager with cd /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/docker/packager && \
CMAKE_FLAGS='-DENABLE_CLICKHOUSE_SELF_EXTRACTING=1' ./packager \
--output-dir=/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build \
--package-type=binary --compiler=clang-19-aarch64-v80compat \
--cache=sccache --s3-rw-access --s3-bucket=clickhouse-builds \
--docker-image-version=54b46bb22708 --with-profiler --with-buzzhouse --version=25.2.1.1773 --official

2025-02-10 22:37:56,547 Will build ClickHouse pkg with cmd: 'docker run --network=host --user=1000:1000 --rm \
--volume=/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build:/output \
--volume=/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse:/build \
-e OUTPUT_DIR=/output -e DEB_ARCH=arm64 -e CC=clang-19 -e CXX=clang++-19 \
-e BUILD_TYPE=None -e SCCACHE_BUCKET=clickhouse-builds -e SCCACHE_S3_KEY_PREFIX=ccache/sccache \
-e VERSION_STRING='25.2.1.1773' \
-e CMAKE_FLAGS="$CMAKE_FLAGS -DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-aarch64.cmake \
-DNO_ARMV81_OR_HIGHER=1 -DCMAKE_C_COMPILER=clang-19 -DCMAKE_CXX_COMPILER=clang++-19 \
-DCOMPILER_CACHE=sccache -DENABLE_BUILD_PROFILING=1 \
-DENABLE_BUZZHOUSE=1 -DCLICKHOUSE_OFFICIAL_BUILD=1" \
-e BUILD_TARGET='clickhouse-bundle' --volume=/home/ubuntu/.cargo/registry:/rust/cargo/registry \
clickhouse/binary-builder:54b46bb22708'

Searching the source tree for the keyword binary-builder gave me a rough picture of the official build steps:

  • docker/packager contains the packager executable (actually a Python script) that drives the build
  • docker/packager/binary-builder builds a Docker image containing the build environment: llvm, rust, clang and so on

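CI builds this image and pushes it to Docker Hub regularly, so you can use it to build without installing a toolchain on the host. For the available options, see: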

$ cd docker/packager
$ ./packager --help

Building

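The official documentation also has a building-in-docker section, so I decided to build with the official Docker image. Install Docker on the machine beforehand and make sure the disk has about 50 GB free, because the source tree is very large. You will also want a stable proxy. I recommend doing the whole checkout and build inside a screen session.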

Fetching the source

There are a great many submodules, so use Git 2.x to avoid odd problems.

git clone https://github.com/ClickHouse/ClickHouse.git

Check out the target tag and fetch the submodules:

cd ClickHouse
git checkout v23.8.16.40-lts
git submodule update --init --recursive

Check that the submodules are complete:

git status
git submodule status

Make sure everything has been fetched before moving on.
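
With this many submodules the status output is long, so a small filter helps. The sketch below relies on the prefix characters that git submodule status prints: "-" for a submodule that is not initialized, "+" for one whose checked-out commit differs from the recorded one, and "U" for merge conflicts. The function name, paths, and hashes are hypothetical placeholders.

```python
def incomplete_submodules(status_output: str) -> list:
    """Return 'prefix path' entries for submodules that are not cleanly checked out."""
    problems = []
    for line in status_output.splitlines():
        if not line.strip():
            continue
        prefix = line[0]             # ' ', '-', '+' or 'U'
        fields = line[1:].split()    # [sha, path, optional describe]
        if prefix in ("-", "+", "U"):
            problems.append(f"{prefix} {fields[1]}")
    return problems


# Hypothetical sample in the format printed by `git submodule status`.
SAMPLE = (
    " 1111111111111111111111111111111111111111 contrib/ok (v1.0)\n"
    "-2222222222222222222222222222222222222222 contrib/missing\n"
    "+3333333333333333333333333333333333333333 contrib/drifted (v2.0-1-g3333333)\n"
)

if __name__ == "__main__":
    for entry in incomplete_submodules(SAMPLE):
        print(entry)
```

In practice the input would be the captured output of git submodule status --recursive; an empty result means every submodule is at its recorded commit.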

The failed builds

From reading docker/packager/packager, building for ARMv8.0 arm64 comes down to the following command:

cd ./docker/packager
./packager --package-type=binary --output-dir=build_results \
--compiler=clang-16-aarch64-v80compat

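This failed with an error about the clang version, because by default it pulls clickhouse/binary-builder:latest. The packager source and --help show a --docker-image-version option for choosing the image tag, so I looked on Docker Hub for a clickhouse/binary-builder image with clang 16 and noticed that the tags seem to be tied to commit IDs: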

$ git log -n1
commit e143a9039ba36ad0c25f2ed85503f36e88f61063 (HEAD, tag: v23.8.16.40-lts)
Merge: 8afc5bd5d1c a60c914df11
Author: Antonio Andelic <antonio2368@users.noreply.github.com>
Date: Thu Jul 25 20:50:17 2024 +0100

Merge pull request #66717 from ClickHouse/backport/23.8/66548

Backport #66548 to 23.8: Correctly track memory for `Allocator::realloc`

Searching for e143a903 turned up the tag 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063. The image is fairly large, so I synced it through the DaoCloud mirror and pulled it from there:

docker pull m.daocloud.io/docker.io/clickhouse/binary-builder:54187-e143a9039ba36ad0c25f2ed85503f36e88f61063
docker tag m.daocloud.io/docker.io/clickhouse/binary-builder:54187-e143a9039ba36ad0c25f2ed85503f36e88f61063 \
clickhouse/binary-builder:54187-e143a9039ba36ad0c25f2ed85503f36e88f61063
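
Because the same pull-and-retag pair is needed again for every other builder tag, it can be generated rather than typed. This is only a sketch: the function name is hypothetical, and the mirror prefix is simply the one used in the commands above.

```python
MIRROR_PREFIX = "m.daocloud.io/docker.io"  # mirror prefix used in this post


def mirror_pull_commands(image: str, tag: str, mirror: str = MIRROR_PREFIX) -> list:
    """Return the docker pull / docker tag pair for fetching image:tag via a mirror."""
    upstream = f"{image}:{tag}"
    mirrored = f"{mirror}/{upstream}"
    return [
        f"docker pull {mirrored}",
        # Retag so that packager finds the image under its upstream name.
        f"docker tag {mirrored} {upstream}",
    ]


if __name__ == "__main__":
    tag = "54187-e143a9039ba36ad0c25f2ed85503f36e88f61063"
    for cmd in mirror_pull_commands("clickhouse/binary-builder", tag):
        print(cmd)
```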

Then build:

./docker/packager/packager --package-type=binary --compiler=clang-16-aarch64-v80compat  \
--output-dir=build_results --docker-image-version 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063

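I could not get past the resulting errors and filed issue 75923. Upstream replied that this version is no longer maintained and that they would not help, and told me to download the official build instead of compiling my own. The downloaded binary, of course, core dumps. I then looked for the official build_arm_v80compat artifacts myself: there is only a single direct download link, with no older versions. The link is visible in the action log: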

2025-02-10 22:46:45,343 Output placed into /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build
INFO:root:Built successfully
INFO:root:Build finished as success, log path /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build_log/build_log.log
INFO:botocore.credentials:Found credentials from IAM Role: ec2_admin
INFO:root:Processing file without compression
INFO:root:File is too large, do not provide content type
INFO:root:Upload /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse to https://s3.amazonaws.com/clickhouse-builds/master/aarch64v80compat/clickhouse-full Meta: {}
Notice: Binary static URL (with debug info): https://s3.amazonaws.com/clickhouse-builds/master/aarch64v80compat/clickhouse-full
INFO:root:Processing file without compression
INFO:root:File is too large, do not provide content type
INFO:root:Upload /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse-stripped to https://s3.amazonaws.com/clickhouse-builds/master/aarch64v80compat/clickhouse Meta: {}
Notice: Binary static URL (compact): https://s3.amazonaws.com/clickhouse-builds/master/aarch64v80compat/clickhouse
Run action done for: [binary_aarch64_v80compat]
INFO:botocore.credentials:Found credentials from IAM Role: ec2_admin
INFO:root:Get token with 4322 remaining requests
INFO:root:User robot-clickhouse-ci-1 with 4322 remaining requests is used
{}
=== Run script finished ===

Every action run has such a link, but the oldest runs are gone because GitHub Actions only keeps a limited build history. The builds themselves appear to be stored on S3:

=== Post run script [Build (arm_v80compat)], workflow [MasterCI] ===
Job provides s3 artifacts [[Artifact.Config(name='CH_ARMV80C_DARWIN_BIN', type='s3', path='/home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse', _provided_by='Build (arm_v80compat)', _s3_path='')]]
Run command: [ls -l /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse]
-rwxr-xr-x 1 ubuntu ubuntu 770032980 Feb 10 23:27 /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse
Run command [aws s3 cp /home/ubuntu/actions-runner/_work/ClickHouse/ClickHouse/ci/tmp/build/clickhouse s3://clickhouse-builds/REFs/master/57b1cafdcb704280f05ee61345e8fbad9a6af5ed/build_arm_v80compat/clickhouse]
Artifact report enabled and will be uploaded: [{'build_urls': ['https://clickhouse-builds.s3.amazonaws.com/REFs/master/57b1cafdcb704280f05ee61345e8fbad9a6af5ed/build_arm_v80compat/clickhouse']}]
Run command [aws s3 cp ./ci/tmp/artifact_report_build_arm_v80compat.json s3://clickhouse-builds/REFs/master/57b1cafdcb704280f05ee61345e8fbad9a6af5ed/build_arm_v80compat/artifact_report_build_arm_v80compat.json]
Insert results to CIDB

I inferred the download URL for my commit from that pattern, but it returns 403:

wget https://clickhouse-builds.s3.amazonaws.com/REFs/master/e143a9039ba36ad0c25f2ed85503f36e88f61063/build_arm_v80compat/clickhouse
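
The URL pattern read off the CI log can be written down as a small helper. This is a guess at the layout based only on the log lines above, not a documented scheme, and the function name is hypothetical; as noted, the URL it produces for the v23.8.16.40-lts commit answered with 403.

```python
S3_BASE = "https://clickhouse-builds.s3.amazonaws.com"


def guess_build_url(commit: str, ref: str = "master",
                    job: str = "build_arm_v80compat") -> str:
    """Build the artifact URL following the REFs/<ref>/<commit>/<job>/ layout seen in the log."""
    return f"{S3_BASE}/REFs/{ref}/{commit}/{job}/clickhouse"


if __name__ == "__main__":
    # Commit of tag v23.8.16.40-lts, taken from the git log output earlier.
    print(guess_build_url("e143a9039ba36ad0c25f2ed85503f36e88f61063"))
```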

Since v23 is said to be unmaintained, I tried a v24 LTS release and hit the same errors, which I reported in issue 75960.

The successful build

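The errors all came from two submodules, contrib/sysroot and contrib/thrift. Their history on GitHub shows they have hardly been updated, so the submodules themselves did not seem to be the problem. Most of the build write-ups I found cover versions before 22.10. The version in Kylin's yum repository is also quite old, so I asked Kylin whether they had a newer rpm. They sent over a document on building and installing a newer version from source, which builds version 25 on arm64. The rough steps are: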

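  • Install cmake 3.26 from an rpm package
  • Build and install gcc-11.3.0 and switch to its standard library
  • Build and install clang-18.1.0
  • Build and install llvm-18.1.0
  • Install rust 1.80.0
  • Install ccache from its src.rpm
  • Export CC=clang and CXX=clang++, then: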
    • mkdir build && cd build
    • cmake .. -DCMAKE_INSTALL_PREFIX=/opt/clickhouse-bin -DNO_ARMV81_OR_HIGHER=1
    • ninja -j32
    • ninja install

On the build side, this is the packager logic behind --compiler=clang-16-aarch64-v80compat:

ARM_V80COMPAT_SUFFIX = "-aarch64-v80compat"
...
is_cross_arm_v80compat = compiler.endswith(ARM_V80COMPAT_SUFFIX)
...
elif is_cross_arm_v80compat:
    cc = compiler[: -len(ARM_V80COMPAT_SUFFIX)]
    cmake_flags.append(
        "-DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-aarch64.cmake"
    )
    cmake_flags.append("-DNO_ARMV81_OR_HIGHER=1")
    result.append("DEB_ARCH=arm64")

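-DNO_ARMV81_OR_HIGHER=1 is mandatory, while the added -DCMAKE_TOOLCHAIN_FILE configures the cross-compilation toolchain. Since cross-compiling kept failing, I wondered whether I could do what Kylin does: build clang and friends from source on an arm64 machine and then build ClickHouse there. But the freshly provisioned Kylin arm64 machine had no proxy, and downloading gcc and llvm was painfully slow. In the meantime I noticed that clickhouse/binary-builder is also published as an arm64 image. Running it directly on arm64 without the ARM_V80COMPAT_SUFFIX would use the image's native arm64 toolchain, and adding the -DNO_ARMV81_OR_HIGHER=1 build option would make it equivalent to Kylin's procedure. So, as before, on a well-specced arm64 machine: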

  • Clone the source, check out v23.8.16.40-lts, and fetch the submodules
  • Pull the matching clickhouse/binary-builder image and tag it
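
To see the difference between the cross build and the native build in terms of what packager injects, the suffix handling quoted earlier can be reduced to a standalone function. This is a simplified sketch, not the real script: parse_compiler is a hypothetical name, and the else branch stands in for all the other compiler variants that the actual packager handles.

```python
ARM_V80COMPAT_SUFFIX = "-aarch64-v80compat"


def parse_compiler(compiler: str):
    """Return (cc, cmake_flags, env) for a packager-style --compiler value."""
    cmake_flags = []
    env = []
    if compiler.endswith(ARM_V80COMPAT_SUFFIX):
        # Cross build: strip the suffix and inject the aarch64 toolchain file.
        cc = compiler[: -len(ARM_V80COMPAT_SUFFIX)]
        cmake_flags.append(
            "-DCMAKE_TOOLCHAIN_FILE=/build/cmake/linux/toolchain-aarch64.cmake"
        )
        cmake_flags.append("-DNO_ARMV81_OR_HIGHER=1")
        env.append("DEB_ARCH=arm64")
    else:
        # Simplification: treat everything else as a native build with no extras.
        cc = compiler
    return cc, cmake_flags, env


if __name__ == "__main__":
    print(parse_compiler("clang-16-aarch64-v80compat"))
    print(parse_compiler("clang-16"))
```

In the native case nothing is injected, which is why -DNO_ARMV81_OR_HIGHER=1 has to be supplied by hand through CMAKE_FLAGS.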

Reading the script gives the build command and options:

CMAKE_FLAGS='-DNO_ARMV81_OR_HIGHER=1' ./docker/packager/packager --package-type=binary \
--output-dir=build_results --docker-image-version 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063

This time the build finally got past contrib/sysroot and contrib/thrift, but it failed later with:

Feb 12 08:22:09 [3590/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/SubtargetFeatureInfo.cpp.o
Feb 12 08:22:10 [3591/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/CompressInstEmitter.cpp.o
Feb 12 08:22:10 [3592/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/X86DisassemblerTables.cpp.o
Feb 12 08:22:10 [3593/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/CodeExpander.cpp.o
Feb 12 08:22:11 [3594/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagEdge.cpp.o
Feb 12 08:22:11 [3595/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/WebAssemblyDisassemblerEmitter.cpp.o
Feb 12 08:22:11 [3596/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/X86RecognizableInstr.cpp.o
Feb 12 08:22:11 [3596/10884] cd /build/build_docker/rust/skim && /usr/bin/cmake -E make_directory /build/build_docker/rust/skim/RelWithDebInfo && /usr/bin/cmake -E env CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=/usr/bin/clang++-16 CC_aarch64-unknown-linux-gnu=/usr/bin/clang-16 HOST_CC=/usr/bin/clang-16 CXX_aarch64-unknown-linux-gnu=/usr/bin/clang++-16 HOST_CXX=/usr/bin/clang++-16 CORROSION_BUILD_DIR=/build/build_docker/rust/skim CARGO_BUILD_RUSTC=/rust/rustup/toolchains/nightly-2023-07-04-aarch64-unknown-linux-gnu/bin/rustc /rust/rustup/toolchains/nightly-2023-07-04-aarch64-unknown-linux-gnu/bin/cargo rustc --target=aarch64-unknown-linux-gnu --package _ch_rust_skim_rust --manifest-path /build/build_docker/rust/skim/Cargo.toml --target-dir /build/build_docker/RelWithDebInfo/cargo/build --profile=release -- -Cdefault-linker-libraries=no -Clink-args=--target=aarch64-linux-gnu && /usr/bin/cmake -E copy_if_different /build/build_docker/RelWithDebInfo/cargo/build/aarch64-unknown-linux-gnu/release/lib_ch_rust_skim_rust.a /build/build_docker/rust/skim/RelWithDebInfo
Feb 12 08:22:11 error: package `cxx v1.0.140` cannot be built because it requires rustc 1.73 or newer, while the currently active rustc version is 1.72.0-nightly
Feb 12 08:22:11 Either upgrade to rustc 1.73 or newer, or use
Feb 12 08:22:11 cargo update -p cxx@1.0.140 --precise ver
Feb 12 08:22:11 where `ver` is the latest version of `cxx` supporting rustc 1.72.0-nightly
Feb 12 08:22:12 [3600/10884] Linking CXX static library contrib/ulid-c-cmake/lib_ulid.a
Feb 12 08:22:12 FAILED: rust/skim/CMakeFiles/cargo-build__ch_rust_skim_rust rust/skim/RelWithDebInfo/lib_ch_rust_skim_rust.a /build/build_docker/rust/skim/CMakeFiles/cargo-build__ch_rust_skim_rust /build/build_docker/rust/skim/RelWithDebInfo/lib_ch_rust_skim_rust.a
Feb 12 08:22:12 cd /build/build_docker/rust/skim && /usr/bin/cmake -E make_directory /build/build_docker/rust/skim/RelWithDebInfo && /usr/bin/cmake -E env CARGO_TARGET_AARCH64_UNKNOWN_LINUX_GNU_LINKER=/usr/bin/clang++-16 CC_aarch64-unknown-linux-gnu=/usr/bin/clang-16 HOST_CC=/usr/bin/clang-16 CXX_aarch64-unknown-linux-gnu=/usr/bin/clang++-16 HOST_CXX=/usr/bin/clang++-16 CORROSION_BUILD_DIR=/build/build_docker/rust/skim CARGO_BUILD_RUSTC=/rust/rustup/toolchains/nightly-2023-07-04-aarch64-unknown-linux-gnu/bin/rustc /rust/rustup/toolchains/nightly-2023-07-04-aarch64-unknown-linux-gnu/bin/cargo rustc --target=aarch64-unknown-linux-gnu --package _ch_rust_skim_rust --manifest-path /build/build_docker/rust/skim/Cargo.toml --target-dir /build/build_docker/RelWithDebInfo/cargo/build --profile=release -- -Cdefault-linker-libraries=no -Clink-args=--target=aarch64-linux-gnu && /usr/bin/cmake -E copy_if_different /build/build_docker/RelWithDebInfo/cargo/build/aarch64-unknown-linux-gnu/release/lib_ch_rust_skim_rust.a /build/build_docker/rust/skim/RelWithDebInfo
Feb 12 08:22:12 [3602/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/SearchableTableEmitter.cpp.o
Feb 12 08:22:12 [3603/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/CTagsEmitter.cpp.o
Feb 12 08:22:12 [3604/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/VarLenCodeEmitterGen.cpp.o
Feb 12 08:22:12 [3605/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/InstrInfoEmitter.cpp.o
Feb 12 08:22:12 [3606/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagOperands.cpp.o
Feb 12 08:22:12 [3607/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagInstr.cpp.o
Feb 12 08:22:12 [3608/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDag.cpp.o
Feb 12 08:22:12 [3609/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/X86FoldTablesEmitter.cpp.o
Feb 12 08:22:12 [3610/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagPredicateDependencyEdge.cpp.o
Feb 12 08:22:12 [3611/10884] Building CXX object contrib/llvm-project/llvm/lib/BinaryFormat/CMakeFiles/LLVMBinaryFormat.dir/COFF.cpp.o
Feb 12 08:22:13 [3612/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchDagPredicate.cpp.o
Feb 12 08:22:13 [3613/10884] Building CXX object contrib/llvm-project/llvm/lib/BinaryFormat/CMakeFiles/LLVMBinaryFormat.dir/AMDGPUMetadataVerifier.cpp.o
Feb 12 08:22:14 [3614/10884] Building CXX object contrib/llvm-project/llvm/lib/BinaryFormat/CMakeFiles/LLVMBinaryFormat.dir/Dwarf.cpp.o
Feb 12 08:22:15 [3615/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/SubtargetEmitter.cpp.o
Feb 12 08:22:18 [3616/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/RegisterInfoEmitter.cpp.o
Feb 12 08:22:19 [3617/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/GlobalISel/CMakeFiles/LLVMTableGenGlobalISel.dir/GIMatchTree.cpp.o
Feb 12 08:22:24 [3618/10884] Building CXX object contrib/llvm-project/llvm/utils/TableGen/CMakeFiles/llvm-tblgen.dir/GlobalISelEmitter.cpp.o
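
The one line that matters is easy to miss among thousands of ninja progress lines. As a sketch, the cargo message can be picked out of a build log with a regular expression; the pattern is written against the exact wording in the log above, and the function names are hypothetical.

```python
import re

# Matches the cargo error wording seen in the build log above.
RUSTC_MISMATCH = re.compile(
    r"package `(?P<pkg>\S+) v(?P<pkg_ver>[\d.]+)` cannot be built because it "
    r"requires rustc (?P<required>[\d.]+) or newer, while the currently active "
    r"rustc version is (?P<active>[\d.]+)"
)


def parse_rustc_mismatch(line: str):
    """Return the package and rustc versions named in the error, or None."""
    match = RUSTC_MISMATCH.search(line)
    return match.groupdict() if match else None


def version_tuple(version: str) -> tuple:
    """Turn '1.72.0' into (1, 72, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


LOG_LINE = (
    "Feb 12 08:22:11 error: package `cxx v1.0.140` cannot be built because it "
    "requires rustc 1.73 or newer, while the currently active rustc version "
    "is 1.72.0-nightly"
)

if __name__ == "__main__":
    info = parse_rustc_mismatch(LOG_LINE)
    print(info)
    print(version_tuple(info["active"]) < version_tuple(info["required"]))
```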

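The error says rustc 1.72 is too old and 1.73 is required. Searching for nightly-2023-07-04 turned up nothing useful: nightly toolchains are released daily, and I could not work out which date corresponds to 1.73. Switching to a newer binary-builder image is not an option, because its clang would not match the ClickHouse version being built. So I checked the clickhouse/binary-builder Dockerfile, where the nearest Rust update is nightly-2024-12-01: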

RUN curl https://sh.rustup.rs -sSf | bash -s -- -y && \
chmod 777 -R /rust && \
rustup toolchain install nightly-2024-12-01 && \
rustup default nightly-2024-12-01 && \
rustup toolchain remove stable && \
rustup component add rust-src && \
rustup target add x86_64-unknown-linux-gnu && \
rustup target add aarch64-unknown-linux-gnu && \

The failed build prints the full docker run ... command near the end of its output. Append bash to it, start the container, and upgrade Rust inside it:

# Remember to run this inside screen as well
docker run .... bash
rustup toolchain install nightly-2024-12-01
rustup default nightly-2024-12-01
rustup component add rust-src # <--- probably unnecessary
rustup target add aarch64-unknown-linux-gnu # <--- probably unnecessary

Both the clickhouse/binary-builder image and its Dockerfile end up running the same CMD:

CMD ["bash", "-c", "/build.sh 2>&1"]

So running /build.sh inside the container above continues the build. While it was compiling I read build.sh and found this logic:

if check_prebuild_exists /build/packages/pre-build
then
    # Execute all commands
    for file in /build/packages/pre-build/*.sh ;
    do
        # The script may want to modify environment variables. Why not to allow it to do so?
        # shellcheck disable=SC1090
        source "$file"
    done
else
    echo "There are no subcommands to execute :)"
fi

So for a non-interactive, scripted build, the steps after pulling the matching clickhouse/binary-builder image can look like this:

mkdir -p packages/pre-build
cat << EOF > packages/pre-build/update-rust.sh
rustup toolchain install nightly-2024-12-01
rustup default nightly-2024-12-01
EOF
CMAKE_FLAGS='-DNO_ARMV81_OR_HIGHER=1' ./docker/packager/packager --package-type=binary \
--output-dir=build_results --docker-image-version 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063

The resulting clickhouse binary is about 3.1 GiB (3336977696 bytes):

Feb 12 10:39:51 Average compiler                  0.000 s
Feb 12 10:39:51 Average cache read hit 0.000 s
Feb 12 10:39:51 Failed distributed compilations 0
Feb 12 10:39:51 Cache location Local disk: "/root/.cache/sccache"
Feb 12 10:39:51 Version (client) 0.5.4
Feb 12 10:39:51 + ccache --evict-older-than 1d
Feb 12 10:39:51 + '[' '' == 1 ']'
Feb 12 10:39:51 + '[' -n '' ']'
Feb 12 10:39:51 + ls -l /output
Feb 12 10:39:51 total 6200956
Feb 12 10:39:51 -rwxr-xr-x 1 root root 3336977696 Feb 12 10:39 clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-benchmark -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-client -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-compressor -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-copier -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-disks -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-extract-from-config -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-format -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-git-import -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-keeper -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-keeper-client -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-keeper-converter -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-local -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-obfuscator -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-server -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-static-files-disk-uploader -> clickhouse
Feb 12 10:39:51 lrwxrwxrwx 1 root root 10 Feb 12 10:39 clickhouse-su -> clickhouse
Feb 12 10:39:51 -rwxr-xr-x 1 root root 3012799232 Feb 12 10:39 unit_tests_dbms

Then rebuild with a few more options:

CMAKE_FLAGS='-DNO_ARMV81_OR_HIGHER=1 -DCMAKE_BUILD_TYPE=Release -DENABLE_TESTS=OFF -DENABLE_DEBUG=OFF -DSPLIT_DEBUG_SYMBOLS=ON' \
./docker/packager/packager --package-type=binary \
--output-dir=build_results --docker-image-version 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063

Feb 13 01:24:21 Failed distributed compilations       0
Feb 13 01:24:21 Cache location Local disk: "/root/.cache/sccache"
Feb 13 01:24:21 Version (client) 0.5.4
Feb 13 01:24:21 + ccache --evict-older-than 1d
Feb 13 01:24:21 + '[' '' == 1 ']'
Feb 13 01:24:21 + '[' -n '' ']'
Feb 13 01:24:21 + ls -l /output
Feb 13 01:24:21 total 680996
Feb 13 01:24:21 -rwxr-xr-x 1 root root 697336944 Feb 13 01:24 clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-benchmark -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-client -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-compressor -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-copier -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-disks -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-extract-from-config -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-format -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-git-import -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-keeper -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-keeper-client -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-keeper-converter -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-local -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-obfuscator -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-server -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-static-files-disk-uploader -> clickhouse
Feb 13 01:24:21 lrwxrwxrwx 1 root root 10 Feb 13 01:24 clickhouse-su -> clickhouse
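
To put the two builds side by side, the byte counts from the ls -l listings can be converted to binary units. The helper below is a generic sketch with a hypothetical name; the two constants are the clickhouse file sizes copied from the logs.

```python
def human_size(num_bytes: int) -> str:
    """Format a byte count using binary units, capped at GiB."""
    size = float(num_bytes)
    for unit in ("B", "KiB", "MiB", "GiB"):
        if size < 1024 or unit == "GiB":
            return f"{size:.1f} {unit}"
        size /= 1024


DEFAULT_BUILD = 3336977696   # clickhouse from the first build (Feb 12)
RELEASE_BUILD = 697336944    # clickhouse from the build with extra options (Feb 13)

if __name__ == "__main__":
    print(human_size(DEFAULT_BUILD))
    print(human_size(RELEASE_BUILD))
    print(f"{DEFAULT_BUILD / RELEASE_BUILD:.1f}x smaller")
```

The second build also no longer produces unit_tests_dbms, since tests are switched off.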

I packaged the files above and tested them on the machine that had been core dumping; everything works:

./clickhouse local -q 'select 1'
1
./clickhouse-server --version
ClickHouse server version 23.8.16.1.

Building the Docker image

docker history --no-trunc confirmed that the official image is built from docker/server/Dockerfile.ubuntu, but the build image is based on ubuntu:22.04:

# cat /etc/os-release 
PRETTY_NAME="Ubuntu 22.04.4 LTS"
NAME="Ubuntu"
VERSION_ID="22.04"
VERSION="22.04.4 LTS (Jammy Jellyfish)"
VERSION_CODENAME=jammy
ID=ubuntu
ID_LIKE=debian
HOME_URL="https://www.ubuntu.com/"
SUPPORT_URL="https://help.ubuntu.com/"
BUG_REPORT_URL="https://bugs.launchpad.net/ubuntu/"
PRIVACY_POLICY_URL="https://www.ubuntu.com/legal/terms-and-policies/privacy-policy"
UBUNTU_CODENAME=jammy

To avoid surprises and CVEs, ubuntu:22.04 is the safer base. Using the 22.04-based Dockerfile on the latest master branch as a reference, I put together docker/server/Dockerfile.fix:

FROM ubuntu:22.04

# see https://github.com/moby/moby/issues/4032#issuecomment-192327844
# It could be removed after we move on a version 23:04+
ARG DEBIAN_FRONTEND=noninteractive

# ARG for quick switch to a given ubuntu mirror
ARG apt_archive="http://archive.ubuntu.com"

# We shouldn't use `apt upgrade` to not change the upstream image. It's updated biweekly

# user/group precreated explicitly with fixed uid/gid on purpose.
# It is especially important for rootless containers: in that case entrypoint
# can't do chown and owners of mounted volumes should be configured externally.
# We do that in advance at the beginning of Dockerfile before any packages will be
# installed to prevent picking those uid / gid by some unrelated software.
# The same uid / gid (101) is used both for alpine and ubuntu.
RUN sed -i "s|http://archive.ubuntu.com|${apt_archive}|g" /etc/apt/sources.list \
&& groupadd -r clickhouse --gid=101 \
&& useradd -r -g clickhouse --uid=101 --home-dir=/var/lib/clickhouse --shell=/bin/bash clickhouse \
&& apt-get update \
&& apt-get install --yes --no-install-recommends \
ca-certificates \
locales \
tzdata \
wget \
&& rm -rf /var/lib/apt/lists/* /var/cache/debconf /tmp/*

ARG VERSION="25.1.3.23"

ARG single_binary_location_url=""

# install from a single binary
RUN if [ -n "${single_binary_location_url}" ]; then \
echo "installing from single binary url: ${single_binary_location_url}" \
&& rm -rf /tmp/clickhouse_binary \
&& mkdir -p /tmp/clickhouse_binary \
&& wget --progress=bar:force:noscroll "${single_binary_location_url}" -O /tmp/clickhouse_binary/clickhouse \
&& chmod +x /tmp/clickhouse_binary/clickhouse \
&& /tmp/clickhouse_binary/clickhouse install --user "clickhouse" --group "clickhouse" \
&& rm -rf /tmp/* ; \
fi

# The rest is the same in the official docker and in our build system
#docker-official-library:on

# post install
# we need to allow "others" access to clickhouse folder, because docker container
# can be started with arbitrary uid (openshift usecase)
RUN clickhouse-local -q 'SELECT * FROM system.build_options' \
&& mkdir -p /var/lib/clickhouse /var/log/clickhouse-server /etc/clickhouse-server /etc/clickhouse-client \
&& chmod ugo+Xrw -R /var/lib/clickhouse /var/log/clickhouse-server /etc/clickhouse-server /etc/clickhouse-client

RUN locale-gen en_US.UTF-8
ENV LANG=en_US.UTF-8
ENV TZ=UTC

RUN mkdir /docker-entrypoint-initdb.d

COPY docker_related_config.xml /etc/clickhouse-server/config.d/
COPY entrypoint.sh /entrypoint.sh

EXPOSE 9000 8123 9009
VOLUME /var/lib/clickhouse

ENV CLICKHOUSE_CONFIG=/etc/clickhouse-server/config.xml

ENTRYPOINT ["/entrypoint.sh"]

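Copying the binary into the image and then running clickhouse install would waste space, because the copied file stays behind in its own overlay layer. Instead, serve the binary from a temporary web server so the build can download it, then build the image: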

RANDOM_PORT=50358
docker run -d --name ck -v "$PWD/build_results":/usr/share/nginx/html/ \
-p ${RANDOM_PORT}:80 \
m.daocloud.io/docker.io/library/nginx:alpine

cd docker/server/
docker build . --build-arg VERSION="23.8.16.40" --network host \
--build-arg single_binary_location_url=http://127.0.0.1:${RANDOM_PORT}/clickhouse \
-t clickhouse/clickhouse-server:23.8.16.40 -f Dockerfile.fix
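
The port above is hard-coded. If it might already be taken on the build machine, one option is to ask the kernel for a free one first; the sketch below uses only the Python standard library, and the function name is hypothetical. Note the small race: the port is released before nginx binds it, so another process could in principle grab it in between.

```python
import socket


def pick_free_port(host: str = "127.0.0.1") -> int:
    """Bind to port 0 so the OS assigns an unused port, then return its number."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.bind((host, 0))
        return sock.getsockname()[1]


if __name__ == "__main__":
    # Print a value suitable for the RANDOM_PORT shell variable.
    print(pick_free_port())
```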

Other notes

Newer clickhouse/binary-builder tags no longer seem to be commit IDs. Here are a few tags and the clang version each one ships:

  • dd5e777b6745: clang 18
  • 54187-e143a9039ba36ad0c25f2ed85503f36e88f61063: clang 16

file output for the official binary compared with my build:

clickhouse: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked, interpreter /lib/ld-linux-aarch64.so.1, for GNU/Linux 3.7.0, BuildID[sha1]=3e70066de2db0f08f97ca310827184d61111dc22, not stripped
clickhouse: ELF 64-bit LSB shared object, ARM aarch64, version 1 (SYSV), dynamically linked (uses shared libs), for GNU/Linux 3.7.0, BuildID[sha1]=586fe5a059c988b60f99dcd794e68f0e4cf69619, not stripped

When switching to another version, the submodules need to be reset as well. Useful commands:

git submodule update --recursive --checkout

git submodule foreach --recursive git reset --hard
git submodule update --recursive

When switching versions, also remember to clean the build_docker directory.

References

There was not much to refer to, really; these are just the arm64 build write-ups I could find.
