zhangguanzhang's Blog

从PTTYPE="dos"到TYPE="LVM2_member"的救援

字数统计: 2.1k阅读时长: 11 min
2019/12/03

同事叫我救援一台云主机,虽说是虚拟机,但是类比到硬件服务器还是一样的操作,这里记录下给后来者查阅

故障信息

控制台进去看到centos7的背景虚化的数字7+转圈,重启下看下完整的错误,重启选了内核然后进到图形界面的时候按下ecs取消,观察终端

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
[  OK ] Started Show Plymouth Boot Screen.
[ OK ] Reached target Paths.
[ OK ] Reached target Basic System.
[ 124.522110] dracut-initqueue[240]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 125.034736] dracut-initqueue[240]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 125.542788] dracut-initqueue[240]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 126.522110] dracut-initqueue[240]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 127.068643] dracut-initqueue[240]: Warning: dracut-initqueue timeout - starting timeout scripts
[ 127.576830] dracut-initqueue[240]: Warning: dracut-initqueue timeout - starting timeout scripts
...
[ 185.082387] dracut-initqueue[240]: Warning: Could not boot.
[ 185.118736] dracut-initqueue[240]: Warning: /dev/centos/root does not exist
[ 185.119239] dracut-initqueue[240]: Warning: /dev/mapper/centos-root does not exist
Starting Dracut Emergency Shell...
Warning: /dev/centos/root does not exist
Warning: /dev/mapper/centos-root does not exist
Generating "/run/initramfs/rdsosreport.txt"

Entering emergency mode. Exit the shell to continue.
Type "journalctl" to view system logs.
You might want to save "/run/initramfs/rdsosreport.txt" to a USB stack or /boot
after mounting them and attach it to a bug report.

dracut:/#

处理

挂载iso进救援模式

找不到根分区,关闭虚机,后台拷贝下系统盘的卷先备份下。然后给虚机的IDE光驱挂载了个centos的iso,修改虚机启动顺序到ISO,进Troubleshooting –> Rescue a CentOS Linux system

1
2
3
4
5
1) Continue
2) Read-only mount
3) Skip to shell
3) Quit (Reboot)
Please make a selection from the above: 1

选择了1后提示没有任何Linux分区

1
2
3
4
5
6
=====================================================================================
=====================================================================================
Rescue Mount

You don't have any Linux partitions. The system will reboot automatically when you exit from the shell.
Please press <return> to get a shell.

按下回车进入交互式shell

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
sh-4.2# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
vda 252:0 0 40G 0 disk
├─vda1 252:1 0 2M 0 part
├─vda2 252:2 0 200M 0 part
└─vda3 252:3 0 39.8G 0 part
vdb 252:16 0 400G 0 disk
loop0 7:1 0 420.3M 1 loop
loop1 7:1 0 2G 1 loop
├─live-rw 253:0 0 2G 0 dm /
└─live-base 253:1 0 2G 1 dm
loop2 7:2 0 512M 1 loop
└─live-rw 253:0 0 2G 0 dm /
sh-4.2# fdisk -l /dev/vda

Disk /dev/vda: 42.9 GB, 42949672960 bytes, 83886080 sectors
Units = sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk label type: dos
Disk identifier: 0x000ad4f2

Device Boot Start End Blocks Id System
/dev/vda1 2048 6143 2048 83 Linux
/dev/vda2 * 6144 415743 204800 83 Linux
/dev/vda3 415744 83886079 41735168 8e Linux LVM
sh-4.2# blkid
/run/install/repo/LiveOS/squashfs.img: TYPE="squashfs"
/dev/sr0: UUID="2018-05-03-20-55-23-00" LABEL="Centos 7 x86_64" TYPE="iso9660" PTTYPE="dos"
/dev/sr1: UUID="2019-11-01-16-33-37-00" LABEL="config-2" TYPE="iso9660"
/dev/vda1: UUID="e438c18a-c97d-432c-ae66-a538cd1fbb4d" TYPE="xfs"
/dev/vda3: PTTYPE="dos"
/dev/loop0: TYPE="squashfs"
...

查看下块,vda2是boot,vda3是lvm也就是根所在,问题是/dev/vda3: PTTYPE="dos"不知道为何变成了dos类型,正常应该是TYPE="LVM2_member"

1
/dev/vda3: UUID="xxxxxxxxxxxxxx" TYPE="LVM2_member"

看看lvm的状态

1
2
3
4
mkdir /mnt/sysimage
vgchange -a y
mount /dev/centos/root /mnt/sysimage
ls -l /mnt/sysimage

发现根分区的文件还在

重做lvm为了去掉PTTYPE=”dos”

尝试着重做pv试试

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
sh-4.2# vgremove centos
Do you really wanto ro remove volume group "centos" containing 1 logical volumes? [y/n]: y
Logical volume "root" successfully removed
Volume group "centos" successfully removed
sh-4.2# pvremove /dev/vda3
Labels on physical volume "/dev/vda3" successfully wiped.
sh-4.2# pvcreate /dev/vda3
WARNING: dos signature detected on /dev/vda3 at offset 510. Wipe it? [y/n]: y
Wiping dos signature on /dev/vda3.
Physical volume "/dev/vda3" successfully created.
sh-4.2# vgcreate centos /dev/vda3
Volume group "centos" successfully created
sh-4.2# lvcreate -n root -l 100%FREE centos
Logical volume "root" created
sh-4.2# mkdir /mnt/root
sh-4.2# mount /dev/centos/root /mnt/root

被后面的xfs_repair输出滚动冲没了,大致就是lvcreate的时候提示有xfs标签,选择不擦除,最终得到了个残缺的的/dev/centos/root,然后xfs_repair它后重启也无法正常开机。再次进救援模式挂载了后,在chroot到故障的根进不去报错/bin/bash no such file,才意识到损坏了文件,很多so文件都丢了
最开始留有备份,打算在文件层面恢复

最后也应该是最开应该做的正确操作

下发了台不是lvm的centos7.6机器,然后给该机器挂载了数据盘为vdb(50G,其实大于等于故障机器根的真实占用大小即可),备份的卷挂载为vdc。利用数据盘中转下原有的根的文件
打算把lvm的文件系统文件拷贝到数据盘vdb的文件系统上,然后在故障机器的救援模式下挂载这个数据盘,把数据盘的根文件拷贝到残缺的系统盘的根下

格式化vdb

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
[root@fix-data ~]# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sr0 11:0 1 1024M 0 rom
sr1 11:1 1 464K 0 rom
vda 253:0 0 40G 0 disk
└─vda1 253:1 0 40G 0 part /
vdb 253:16 0 50G 0 disk
[root@fix-data ~]# parted /dev/vdb
GNU Parted 3.1
Using /dev/vdb
Welcome to GNU Parted! Type 'help' to view a list of commands.
(parted) mklabel gpt
(parted) mkpart
Partition name? []? 1
File system type? [ext2]? xfs
Start? 1
End? 100%
(parted) p
Model: Virtio Block Device (virtblk)
Disk /dev/vdb: 53.7GB
Sector size (logical/physical): 512B/512B
Partition Table: gpt
Disk Flags:

Number Start End Size File system Name Flags
1 1049kB 53.7GB 53.7GB 1

(parted) q
Information: You may need to update /etc/fstab.

[root@fix-data ~]# mkfs.xfs /dev/vdb1
meta-data=/dev/vdb1 isize=512 agcount=4, agsize=3276672 blks
= sectsz=512 attr=2, projid32bit=1
= crc=1 finobt=0, sparse=0
data = bsize=4096 blocks=13106688, imaxpct=25
= sunit=0 swidth=0 blks
naming =version 2 bsize=4096 ascii-ci=0 ftype=1
log =internal log bsize=4096 blocks=6399, version=2
= sectsz=512 sunit=0 blks, lazy-count=1
realtime =none extsz=4096 blocks=0, rtextents=0
[root@fix-data ~]# mkdir -p /mnt/{root,data}

挂载需要修复的系统盘的克隆卷

后台挂载好后

1
2
3
4
5
6
7
8
9
10
11
12
# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sr0 11:0 1 1024M 0 rom
sr1 11:1 1 464K 0 rom
vda 253:0 0 40G 0 disk
└─vda1 253:1 0 40G 0 part /
vdb 253:16 0 50G 0 disk
└─vdb1 253:17 0 50G 0 part
vdc 253:32 0 40G 0 disk
├─vdc1 253:33 0 2M 0 part
├─vdc2 253:34 0 200M 0 part
└─vdc3 253:35 0 39.8G 0 part

安装lvm2工具(新机器因为不是lvm的根分区所以系统没有安装这个)

1
yum install -y lvm2

激活lvm状态

1
2
$ vgchange -a y
1 logical volume(s) in volume group "centos" now active

根 –> /mnt/root/
/boot –> /mnt/root/boot
数据盘 –> /mnt/data/

1
2
3
mount /dev/centos/root /mnt/root/
mount /dev/vdc2 /mnt/root/boot
mount /dev/vdb1 /mnt/data

然后拷贝

1
cp -a /mnt/root/* /mnt/data/

拷贝完后取消挂载并关机

1
2
3
4
umount /mnt/data/
umount /mnt/root/boot/
umount /mnt/root/
poweroff

拷贝

后台把该数据盘挂载到故障机器上,故障机器在救援模式里操作

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
sh-4.2# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
sr0 11:0 1 4.2G 0 rom /run/install/repo
sr1 11:1 1 464K 0 rom
vda 252:0 0 40G 0 disk
├─vda1 252:1 0 2M 0 part
├─vda2 252:2 0 200M 0 part /mnt/sysimage/boot
└─vda3 252:3 0 39.8G 0 part
└─centos-root 253:2 0 39.8G 0 lvm /mnt/sysimage
vdb 252:16 0 400G 0 disk
vdc 252:32 0 50G 0 disk
└─vdc1 252:33 0 50G 0 part
loop0 7:1 0 420.3M 1 loop
loop1 7:1 0 2G 1 loop
├─live-rw 253:0 0 2G 0 dm /
└─live-base 253:1 0 2G 1 dm
loop2 7:2 0 512M 1 loop
└─live-rw 253:0 0 2G 0 dm /

可以看到数据盘为vdc,挂载vdc1到/mnt/data,然后拷贝到/mnt/sysimage

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
sh-4.2# mkdir /mnt/data
sh-4.2# mount /dev/vdc1 /mnt/data
sh-4.2# ls -l /mnt/data
total 16
lrwxrwxrwx 1 root root 7 Jul 24 2018 bin -> usr/bin
dr-xr-xr-x. 5 root root 4096 Jul 24 2018 boot
drwxr-xr-x. 2 root root 18 Dec 3 04:24 dev
drwxr-xr-x. 143 root root 8192 Dec 3 03:54 etc
drwxr-xr-x. 3 root root 20 Jul 25 2018 home
lrwxrwxrwx 1 root root 7 Jul 24 2018 lib -> usr/lib
lrwxrwxrwx 1 root root 9 Jul 24 2018 lib64 -> usr/lib64
drwxr-xr-x. 2 root root 6 Apr 11 2018 media
drwxr-xr-x. 2 root root 6 Apr 11 2018 mnt
drwxr-xr-x. 4 root root 34 Nov 1 17:19 opt
drwxr-xr-x. 2 root root 6 Jul 24 2018 proc
dr-xr-x---. 9 root root 258 Nov 5 16:14 root
drwxr-xr-x. 2 root root 6 Jul 24 2018 run
lrwxrwxrwx 1 root root 8 Jul 24 2018 sbin -> usr/sbin
drwxr-xr-x. 2 root root 6 Apr 11 2018 srv
drwxr-xr-x. 2 root root 6 Jul 24 2018 sys
drwxrwxrwt. 7 root root 114 Dec 3 04:55 tmp
drwx------ 7 root root 66 Jul 24 2018 usr
drwxr-xr-x. 21 root root 4096 Jul 24 2018 var
drwxr-xr-x. 2 root root 6 Nov 4 02:09 version
sh-4.2# cd /mnt/sysimage
sh-4.2# cp -a /mnt/data/* .

重做grub.cfg,该系统不是grub2,如果是grub2则/boot下有grub2目录

1
2
3
4
5
mount -o bind /dev /mnt/sysimage/dev
mount -o bind /proc /mnt/sysimage/proc
mount -o bind /run /mnt/sysimage/run
mount -o bind /sys /mnt/sysimage/sys
mv boot/grub/grub.cfg{,.bak}

然后chroot进来生成grub.cfg

1
2
chroot .
grub-mkconfig -o /boot/grub2/grub.cfg

开机正常

CATALOG
  1. 1. 故障信息
  2. 2. 处理
    1. 2.1. 挂载iso进救援模式
    2. 2.2. 重做lvm为了去掉PTTYPE=”dos”
    3. 2.3. 最后也应该是最开应该做的正确操作
      1. 2.3.1. 格式化vdb
      2. 2.3.2. 挂载需要修复的系统盘的克隆卷
      3. 2.3.3. 拷贝