Do you have a containerized ceph-volume environment that now requires maintenance? Have you been trying to run ceph-objectstore-tool or ceph-bluestore-tool but cannot figure out how to mount and unlock your OSD? Look no further. The following branch will exist until this feature is merged, or something comparable is implemented upstream.
You can build a maintenance container from an existing running OSD container by making the changes below and then committing the result:
- Create a new entrypoint case in `/opt/ceph-container/bin/entrypoint.sh`, after the existing `osd_ceph_volume_activate` case, that doesn't exit unless stopped manually:

```shell
osd_ceph_volume_mount)
  ami_privileged
  source /opt/ceph-container/bin/osd_volume_mount.sh
  osd_volume_mount
  STAYALIVE="1"
  do_stayalive
  ;;
```
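The case above relies on a `do_stayalive` helper to keep the container's PID 1 running. If your branch doesn't already define one, a minimal sketch could look like this (the name comes from the entrypoint above; the body is an assumption, not taken from ceph-container):

```shell
# Hypothetical do_stayalive helper -- the real implementation may differ.
# It blocks forever so the container only exits when stopped manually
# (docker stop sends SIGTERM to PID 1).
do_stayalive() {
  # Only block when the entrypoint asked for it
  if [ "${STAYALIVE}" = "1" ]; then
    echo "Maintenance mode: container will stay alive until stopped"
    while true; do
      sleep 3600 &
      wait "$!"   # wait is interruptible by signals, unlike a bare sleep
    done
  fi
}
```

Sleeping in the background and `wait`ing on it keeps the shell responsive to `docker stop`, instead of being stuck inside an uninterruptible foreground `sleep`.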
After updating `entrypoint.sh`, you'll need to create the script it references. I recommend using the activate script as a template:
new_file: /opt/ceph-container/bin/osd_volume_mount.sh

```shell
## Container shell
docker exec -it ceph-osd-<some_running_container_id> /bin/bash
cp /opt/ceph-container/bin/osd_volume_activate.sh /opt/ceph-container/bin/osd_volume_mount.sh

## Replace the osd_volume_activate name with the new one (vi command):
:1,$ s/osd_volume_activate/osd_volume_mount/g
5 substitutions on 5 lines

## Save the changes just made to a new tag:
docker commit <working_container_hash> ceph/osd:debug
```
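Since the substitution above is a pure rename, `osd_volume_mount.sh` keeps the activate script's body. Its essential shape is just a single function the entrypoint can call; everything below is an illustrative skeleton, with the real body coming from your copy of `osd_volume_activate.sh`:

```shell
# Illustrative skeleton only -- the actual body is whatever your copy of
# osd_volume_activate.sh contains (ceph-volume discovery, dmcrypt unlock,
# tmpfs mount, and so on), with the function renamed.
osd_volume_mount() {
  # OSD_ID arrives via the container environment (-e OSD_ID=<osd.id>)
  : "${OSD_ID:?OSD_ID must be set}"
  # ... body copied from osd_volume_activate.sh goes here ...
}
```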
With everything set up, you can now begin maintenance:
Maintenance Steps
```shell
## Disable and stop the old service to prevent systemd from interfering
systemctl disable ceph-osd@<osd.id>
Removed symlink /etc/systemd/system/multi-user.target.wants/ceph-osd@<osd.id>.service.
systemctl stop ceph-osd@<osd.id>

## Launch the maintenance container
docker run --rm -d --net=host --privileged=true --pid=host --ipc=host \
  --cpu-quota=400000 \
  -v /dev:/dev \
  -v /etc/localtime:/etc/localtime:ro \
  -v /var/lib/ceph:/var/lib/ceph:z \
  -v /etc/ceph:/etc/ceph:z \
  -v /var/run/ceph:/var/run/ceph:z \
  -v /var/run/udev/:/var/run/udev/ \
  -v /var/log/ceph:/var/log/ceph:z \
  -v /run/lvm/:/run/lvm/ \
  -e OSD_BLUESTORE=1 -e OSD_FILESTORE=0 -e OSD_DMCRYPT=1 \
  -e CLUSTER=ceph \
  -e CEPH_DAEMON=OSD_CEPH_VOLUME_MOUNT \
  -e OSD_ID=<osd.id> \
  --name=ceph-osd-<osd.id>-debug \
  ceph/osd:debug
```
```shell
## Jump into the maintenance container's shell
docker exec -it ceph-osd-<osd.id>-debug bash

## Repair the OSD
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-3/
fsck success
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-3/
repair success
exit
```
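The intro also mentions ceph-objectstore-tool, and the same maintenance shell works for object-level operations against the stopped, mounted OSD. A hedged sketch (the wrapper function and example OSD id are mine, not part of ceph-container; `--op list-pgs` is a standard ceph-objectstore-tool operation):

```shell
# Hypothetical convenience wrapper -- not part of ceph-container.
# --op list-pgs enumerates the placement groups held on a stopped OSD,
# which is a useful sanity check before deeper object-level surgery.
list_osd_pgs() {
  ceph-objectstore-tool --data-path "/var/lib/ceph/osd/ceph-${1}" --op list-pgs
}
# usage (inside the maintenance shell): list_osd_pgs 3
```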
```shell
## Stop the maintenance container (unmounting the OSD), then re-enable and restart the OSD service
docker stop ceph-osd-<osd.id>-debug
systemctl enable ceph-osd@<osd.id>
Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-osd@<osd.id>.service to /etc/systemd/system/ceph-osd@.service.
systemctl start ceph-osd@<osd.id>
```