Recently we had to restore a volume from a snapshot. The client wanted a volume from a few months back, to recover some work that had since been deleted.
This, of course, is easy to do in the AWS Console or via the AWS CLI.
Once the snapshot is restored to a new volume, that volume can be attached as the root device. Start the instance, and you boot into the months-old volume. Right? Wrong.
Let’s take a closer look. Here are some steps to replicate the issue we were having.
# Create a Snapshot
myles@TMcC ~ aws ec2 create-snapshot --volume-id vol-0fa0b570d1ceb0377 --description 'test_volume_snapshot'
Description: test_volume_snapshot
Encrypted: false
OwnerId: '879834674196'
Progress: ''
SnapshotId: snap-0a05794531961f6ea
StartTime: '2021-07-26T18:21:21.392000+00:00'
State: pending
Tags: []
VolumeId: vol-0fa0b570d1ceb0377
VolumeSize: 8
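The snapshot starts out pending, and you can't restore from it until it completes, so it helps to block until it finishes. The aws cli ships a waiter for exactly this:
# Wait for the snapshot to complete
aws ec2 wait snapshot-completed --snapshot-ids snap-0a05794531961f6ea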
# Restore the Volume
myles@TMcC ~ aws ec2 create-volume --availability-zone us-east-1e --encrypted --snapshot-id snap-0a05794531961f6ea
AvailabilityZone: us-east-1e
CreateTime: '2021-07-26T18:31:22+00:00'
Encrypted: true
Iops: 100
MultiAttachEnabled: false
Size: 8
SnapshotId: snap-0a05794531961f6ea
State: creating
Tags: []
VolumeId: vol-0369468dd51ab1c77
VolumeType: gp2
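Likewise, the volume begins in the creating state; there is a waiter to block until it is ready to attach:
# Wait for the volume to become available
aws ec2 wait volume-available --volume-ids vol-0369468dd51ab1c77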
We now have the restored volume. Next, we attach it to the EC2 instance to make sure we can access it.
# Attach the volume
myles@TMcC ~ aws ec2 attach-volume --device /dev/sdh --instance-id i-09acafde7d0f2a734 --volume-id vol-0369468dd51ab1c77
AttachTime: '2021-07-26T18:31:57.429000+00:00'
Device: /dev/sdh
InstanceId: i-09acafde7d0f2a734
State: attaching
VolumeId: vol-0369468dd51ab1c77
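If you are scripting this, there is a waiter for the attachment as well:
# Wait until the volume is attached (state: in-use)
aws ec2 wait volume-in-use --volume-ids vol-0369468dd51ab1c77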
Now we can SSH into the instance and mount the backup drive.
# ssh
ssh inttest #use your own ssh config to log in to the instance
# Must be the root user to perform commands below
# find the drive
fdisk -l #the partition listing below comes from fdisk, not lsblk
Device Boot Start End Sectors Size Id Type
/dev/xvdh1 * 2048 16777182 16775135 8G 83 Linux
root@ip-172-31-34-181:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 32.3M 1 loop /snap/snapd/11588
loop1 7:1 0 55.5M 1 loop /snap/core18/1997
loop2 7:2 0 33.3M 1 loop /snap/amazon-ssm-agent/3552
loop3 7:3 0 32.3M 1 loop /snap/snapd/12398
loop4 7:4 0 55.5M 1 loop /snap/core18/2074
xvda 202:0 0 8G 0 disk
└─xvda1 202:1 0 8G 0 part /
xvdh 202:112 0 8G 0 disk
└─xvdh1 202:113 0 8G 0 part
# Mount the Drive
mount /dev/xvdh1 /mnt #mount it at the /mnt directory
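A quick df confirms the filesystem landed where we expect:
# Confirm the mount
df -h /mnt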
# Add some files to make it more obvious the two volumes are different
touch /mnt/copy_of_original.txt
touch /original.txt
This is where things get difficult.
- Stop the instance
# Stop the Instance
aws ec2 stop-instances --instance-ids i-09acafde7d0f2a734
StoppingInstances:
- CurrentState:
Code: 64
Name: stopping
InstanceId: i-09acafde7d0f2a734
PreviousState:
Code: 16
Name: running
# Wait for it to stop
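# (the aws cli ships a waiter that blocks until the instance has stopped)
aws ec2 wait instance-stopped --instance-ids i-09acafde7d0f2a734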
- Change the device paths so the new (restored) volume becomes the root volume
# detach the drives
aws ec2 detach-volume --volume-id vol-0369468dd51ab1c77
AttachTime: '2021-07-26T18:31:57+00:00'
Device: /dev/sdh
InstanceId: i-09acafde7d0f2a734
State: detaching
VolumeId: vol-0369468dd51ab1c77
aws ec2 detach-volume --volume-id vol-0fa0b570d1ceb0377
AttachTime: '2021-07-22T20:35:52+00:00'
Device: /dev/sda1
InstanceId: i-09acafde7d0f2a734
State: detaching
VolumeId: vol-0fa0b570d1ceb0377
# attach with swapped device paths: restored volume as root
aws ec2 attach-volume --device /dev/sda1 --instance-id i-09acafde7d0f2a734 --volume-id vol-0369468dd51ab1c77
AttachTime: '2021-07-26T18:43:51.109000+00:00'
Device: /dev/sda1
InstanceId: i-09acafde7d0f2a734
State: attaching
VolumeId: vol-0369468dd51ab1c77
aws ec2 attach-volume --device /dev/sdg --instance-id i-09acafde7d0f2a734 --volume-id vol-0fa0b570d1ceb0377
AttachTime: '2021-07-26T18:44:48.422000+00:00'
Device: /dev/sdg
InstanceId: i-09acafde7d0f2a734
State: attaching
VolumeId: vol-0fa0b570d1ceb0377
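# Optional sanity check: confirm the swapped device mappings took effect
aws ec2 describe-instances --instance-ids i-09acafde7d0f2a734 --query 'Reservations[].Instances[].BlockDeviceMappings'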
- Restart the instance
# start the instance
aws ec2 start-instances --instance-ids i-09acafde7d0f2a734
StartingInstances:
- CurrentState:
Code: 0
Name: pending
InstanceId: i-09acafde7d0f2a734
PreviousState:
Code: 80
Name: stopped
I would expect the new volume to be the root volume now. Right? WRONG!!
# Show that the instance is still booting from the old volume
# ssh to the instance
ssh inttest #use your own ssh config to log in to the instance
# Find the drive
root@ip-172-31-34-181:~# lsblk
NAME MAJ:MIN RM SIZE RO TYPE MOUNTPOINT
loop0 7:0 0 32.3M 1 loop /snap/snapd/12398
loop1 7:1 0 55.5M 1 loop /snap/core18/2074
loop2 7:2 0 55.5M 1 loop /snap/core18/1997
loop3 7:3 0 32.3M 1 loop /snap/snapd/11588
loop4 7:4 0 33.3M 1 loop /snap/amazon-ssm-agent/3552
xvda 202:0 0 8G 0 disk
└─xvda1 202:1 0 8G 0 part
xvdg 202:96 0 8G 0 disk
└─xvdg1 202:97 0 8G 0 part /
# mount the unmounted drive
mount /dev/xvda1 /mnt
# Let's look
ls -l /original.txt
-rw-r--r-- 1 root root 0 Jul 26 18:35 /original.txt
ls -l /mnt/copy_of_original.txt
-rw-r--r-- 1 root root 0 Jul 26 18:34 /mnt/copy_of_original.txt
This is backwards?!?
As long as BOTH volumes are attached to the instance, it will always boot from the old one. Why? The restored volume is a block-for-block copy of the original, so both volumes carry identical filesystem UUIDs and labels, and the boot process identifies the root filesystem by UUID or label rather than by device path. With duplicates present, there is no guarantee which volume ends up as root; in practice it kept choosing the old one.
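You can see the collision with blkid. A minimal sketch, assuming an Ubuntu cloud image and both volumes still attached (your device names, UUIDs, and labels will differ):
# Both partitions report the same UUID and LABEL, because the restored
# volume is a block-for-block copy of the original
blkid /dev/xvda1 /dev/xvdg1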
How do we fix this?
- Don’t start an instance with both volumes attached at the same time. If you need both attached at once, start the instance with the old volume detached; once it has booted, you can attach and mount that volume. This is still very dangerous, because the next time the instance reboots with both volumes attached, it will boot into the old drive again.
- Attach the old drive to a new instance, and use ssh/scp or some other tool to move the files. There are a number of ways to do this; a rough sketch appears after the walkthrough below.
# Stop the instance
aws ec2 stop-instances --instance-ids i-09acafde7d0f2a734
StoppingInstances:
- CurrentState:
Code: 64
Name: stopping
InstanceId: i-09acafde7d0f2a734
PreviousState:
Code: 16
Name: running
# Wait for it to stop
# Detach the original volume
aws ec2 detach-volume --volume-id vol-0fa0b570d1ceb0377
AttachTime: '2021-07-26T18:44:48+00:00'
Device: /dev/sdg
InstanceId: i-09acafde7d0f2a734
State: detaching
VolumeId: vol-0fa0b570d1ceb0377
# Start the instance
aws ec2 start-instances --instance-ids i-09acafde7d0f2a734
ssh inttest #use your own ssh config to log in to the instance
# This fails, which is GOOD!
ls -l /original.txt
ls: cannot access '/original.txt': No such file or directory
# This succeeds, which is double GOOD!
ls -l /copy_of_original.txt
-rw-r--r-- 1 root root 0 Jul 26 18:34 /copy_of_original.txt
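For the second option, here is a rough sketch of pulling the files over SSH once the old drive is attached and mounted on a separate recovery instance. The hostname recovery-host and the paths are placeholders, not from the session above:
# On the recovery instance: mount the attached old drive
mount /dev/xvdh1 /mnt
# From the machine that needs the files: copy them over ssh
scp -r ubuntu@recovery-host:/mnt/home/ubuntu/deleted-work ./restored/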
Tools To Help
- fdisk -l – List the partition tables of all drives attached to the instance
- lsblk – List all available block devices and their mount points
- blkid – List the UUIDs and other attributes of available block devices
- nvme id-ctrl <drive handle> – Print controller details, including the AWS volume ID of the drive
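For example, on a Nitro-based instance, where EBS volumes show up as NVMe devices, nvme id-ctrl maps a device back to its EBS volume ID (this assumes the nvme-cli package is installed; the device name is an example):
# The serial-number field (sn) holds the EBS volume ID, minus the dash
sudo nvme id-ctrl /dev/nvme1n1 | grep '^sn'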