Do you have a containerized ceph-volume environment that now requires maintenance? Have you been trying to run ceph-objectstore-tool or ceph-bluestore-tool but cannot figure out how to mount and unlock your OSD? Look no further. The following branch will exist until this feature is merged, or something comparable is implemented upstream.

You can build a maintenance container by making the following changes inside an existing running OSD container and then committing the result:

  • Create a new entrypoint case after osd_ceph_volume_activate that stays alive until the container is stopped manually (a sketch of the do_stayalive helper it calls follows the snippet):
/opt/ceph-container/bin/entrypoint.sh
  osd_ceph_volume_mount)
    ami_privileged
    source /opt/ceph-container/bin/osd_volume_mount.sh
    osd_volume_mount
    STAYALIVE="1"
    do_stayalive
    ;;
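
The do_stayalive helper is not part of the stock entrypoint, so the branch has to define it as well. Here is a minimal sketch, assuming all it needs to do is block until the container is stopped; the implementation below is an assumption, only the name and the STAYALIVE flag come from the snippet above:

  # Hypothetical implementation: keeps the container alive after the OSD
  # is mounted so you can `docker exec` in and run maintenance tooling.
  function do_stayalive {
    if [ "${STAYALIVE}" = "1" ]; then
      echo "OSD mounted; sleeping until this container is stopped"
      exec sleep infinity
    fi
  }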

After you finish updating entrypoint.sh, you'll need to create the script it sources. I recommend using the activate script as a template:

new_file: /opt/ceph-container/bin/osd_volume_mount.sh
## Container Shell
docker exec -it ceph-osd-<some_running_container_id> /bin/bash
cp /opt/ceph-container/bin/osd_volume_activate.sh /opt/ceph-container/bin/osd_volume_mount.sh

## Replace every occurrence of osd_volume_activate with the new name (ex command in vi):
  :1,$ s/osd_volume_activate/osd_volume_mount/g
  5 substitutions on 5 lines

## Save the changes just made as a new image tag:
docker commit <working_container_hash> ceph/osd:debug
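
For reference, here is roughly what the renamed function should end up doing. This is a minimal sketch, not the real script: the copied activate script already contains the actual OSD_FSID discovery, and the python one-liner below only illustrates it.

  function osd_volume_mount {
    : "${OSD_ID:?OSD_ID must be passed via -e OSD_ID=<osd.id>}"

    # Illustrative stand-in for the fsid lookup the activate script already does.
    OSD_FSID="$(ceph-volume lvm list --format json |
      python -c "import sys, json; print(json.load(sys.stdin)['${OSD_ID}'][0]['tags']['ceph.osd_fsid'])")"

    # ceph-volume unlocks dmcrypt (if in use) and mounts the OSD's tmpfs
    # under /var/lib/ceph/osd/. Unlike the activate path, we never exec
    # ceph-osd afterwards; the entrypoint's do_stayalive takes over.
    ceph-volume lvm activate --no-systemd "${OSD_ID}" "${OSD_FSID}"
  }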

With everything set up, you can now begin maintenance:

Maintenance Steps
## Disable and stop the old service to prevent systemd from interfering
systemctl disable ceph-osd@<osd.id>
  Removed symlink /etc/systemd/system/multi-user.target.wants/ceph-osd@<osd.id>.service.
systemctl stop ceph-osd@<osd.id>

## Launch maintenance container
docker run --rm -d --net=host --privileged=true --pid=host --ipc=host \
  --cpu-quota=400000 \
  -v /dev:/dev \
  -v /etc/localtime:/etc/localtime:ro \
  -v /var/lib/ceph:/var/lib/ceph:z \
  -v /etc/ceph:/etc/ceph:z \
  -v /var/run/ceph:/var/run/ceph:z \
  -v /var/run/udev/:/var/run/udev/ \
  -v /var/log/ceph:/var/log/ceph:z \
  -v /run/lvm/:/run/lvm/ \
  -e OSD_BLUESTORE=1 \
  -e OSD_FILESTORE=0 \
  -e OSD_DMCRYPT=1 \
  -e CLUSTER=ceph \
  -e CEPH_DAEMON=OSD_CEPH_VOLUME_MOUNT \
  -e OSD_ID=<osd.id> \
  --name=ceph-osd-<osd.id>-debug \
  ceph/osd:debug

## Jump into the maintenance container's shell
docker exec -it ceph-osd-<osd.id>-debug bash
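
## (Optional) sanity-check that the entrypoint mounted the OSD's tmpfs
df -h /var/lib/ceph/osd/ceph-<osd.id>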

## Repair OSD
ceph-bluestore-tool fsck --path /var/lib/ceph/osd/ceph-<osd.id>/
  fsck success
ceph-bluestore-tool repair --path /var/lib/ceph/osd/ceph-<osd.id>/
  repair success
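
## With the OSD mounted but not running, ceph-objectstore-tool works here
## too, e.g. to list the PGs held by this OSD:
ceph-objectstore-tool --data-path /var/lib/ceph/osd/ceph-<osd.id> --op list-pgs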
exit

## Stop the maintenance container to unmount the OSD, then restart it under systemd
docker stop ceph-osd-<osd.id>-debug
systemctl enable ceph-osd@<osd.id>
  Created symlink from /etc/systemd/system/multi-user.target.wants/ceph-osd@<osd.id>.service to /etc/systemd/system/ceph-osd@.service.
systemctl start ceph-osd@<osd.id>
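
## (Optional) confirm the OSD came back up and rejoined the cluster
ceph osd tree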