2016-08-02

Recovering RHEL in emergency mode in AWS EC2

OK, I'm sure everybody knows this, but I did not. When you have an AWS EC2 instance, say c3.8xlarge, you get 2 x 320 GB of SSD instance store. That's nice, isn't it? But besides having to manually attach that storage when launching the machine (4. Add Storage -> Add New Volume -> Volume Type: Instance Store 0, and repeat for Instance Store 1), it gets purged every time you stop and start your instance. I did not know that, so I created an LVM setup on these disks and added the logical volume to /etc/fstab to be auto-mounted on the next boot:

# pvcreate --yes /dev/xvdb
# pvcreate --yes /dev/xvdc
# vgcreate mygroup /dev/xvdb /dev/xvdc
# lvcreate --size 1G --name myvol mygroup
# echo "/dev/mapper/mygroup-myvol /mnt/test xfs defaults 0 0" >>/etc/fstab
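In hindsight, the emergency-mode trap below could have been avoided up front: marking the fstab entry with the nofail mount option lets the boot continue even when the device has vanished. A sketch, using the same paths as above:

# echo "/dev/mapper/mygroup-myvol /mnt/test xfs defaults,nofail 0 0" >>/etc/fstab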

Now, here comes the problem: on instance stop and start the instance store is wiped, so RHEL finds no logical volume to mount and drops into emergency mode. Now what? Let's recover using a second machine. Executive summary:

  1. Take some working machine
  2. Detach root device volume from broken machine
  3. Attach it to working machine
  4. From the working machine, mount the volume, fix fstab and umount
  5. Detach
  6. Attach (do not ask me why, but as a Device, I had to use /dev/sda1 instead of /dev/sda)
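For the click-averse, the detach/attach dance above can also be done with the AWS CLI. A sketch; the volume and instance IDs are hypothetical placeholders, and the broken instance must be stopped first:

# aws ec2 stop-instances --instance-ids i-0aaaabroken
# aws ec2 detach-volume --volume-id vol-0123456789abcdef0
# aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0bbbbworking --device /dev/sdf
  (mount on the working machine, fix fstab, umount)
# aws ec2 detach-volume --volume-id vol-0123456789abcdef0
# aws ec2 attach-volume --volume-id vol-0123456789abcdef0 --instance-id i-0aaaabroken --device /dev/sda1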

Because my broken and working machines were created from the same RHEL 7.2 image, when I attempted to mount the volume I got this:

# mount /dev/xvdf2 /mnt/tmp/
mount: wrong fs type, bad option, bad superblock on /dev/xvdf2,
       missing codepage or helper program, or other error

       In some cases useful info is found in syslog - try
       dmesg | tail or so.
# dmesg | tail
[  265.755327] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
[  288.106763] SELinux: initialized (dev tmpfs, type tmpfs), uses transition SIDs
[  568.495065] Adjusting xen more than 11% (9437184 vs 9311354)
[  583.752252] blkfront: xvdf: barrier or flush: disabled; persistent grants: disabled; indirect descriptors: enabled;
[  583.766118]  xvdf: xvdf1 xvdf2
[  662.933196] XFS (xvdf2): Filesystem has duplicate UUID 379de64d-ea11-4f5b-ae6a-0aa50ff7b24d - can't mount
[  752.706161] XFS (xvdf2): Filesystem has duplicate UUID 379de64d-ea11-4f5b-ae6a-0aa50ff7b24d - can't mount
[  842.806648] XFS (xvdf2): Filesystem has duplicate UUID 379de64d-ea11-4f5b-ae6a-0aa50ff7b24d - can't mount
[  879.618806] XFS (xvdf): Invalid superblock magic number
[  884.951716] XFS (xvdf2): Filesystem has duplicate UUID 379de64d-ea11-4f5b-ae6a-0aa50ff7b24d - can't mount

So I had to mount with the -o nouuid option.
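With the dmesg output above in mind, the whole fix from the working machine looked roughly like this (assuming xvdf2 is the root partition of the attached volume, as in my case):

# mount -o nouuid /dev/xvdf2 /mnt/tmp/
# vi /mnt/tmp/etc/fstab     (delete or comment out the /dev/mapper/mygroup-myvol line)
# umount /mnt/tmp/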