Restoring a Cloud instance
SOC2/CI-110
Follow the break glass process to ensure you have the access required to perform this playbook.
If cloud.sourcegraph.com/control-plane-mode=true is in config.yaml, extract the instance from Control Plane first. Follow the Extract instance from control plane (break glass) section from the Ops Dashboard of the instance, go/cloud-ops.
At the end, follow the Backfill instance into control plane section from the Ops Dashboard of the instance, go/cloud-ops.
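The control-plane check above can be scripted. The following is a minimal sketch, assuming the annotation is stored in config.yaml as a key/value pair on a single line; the function name is_control_plane_managed is hypothetical and not part of any existing tooling.

```shell
# Returns 0 when the control-plane-mode annotation is set to true in the given file.
# Assumption: the key and value sit on one line, e.g.
#   cloud.sourcegraph.com/control-plane-mode: "true"
# or the key=value form quoted in this playbook.
is_control_plane_managed() {
  config_file="$1"
  grep -Eq 'cloud\.sourcegraph\.com/control-plane-mode"?[:=] *"?true' "$config_file"
}

# Example usage:
# if is_control_plane_managed config.yaml; then
#   echo "extract the instance from Control Plane before continuing"
# fi
```

If the annotation is laid out differently in your config.yaml, adjust the pattern rather than trusting a negative result.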
Restoring Cloud SQL
Use cases:
- Cloud SQL data is corrupted by a broken database migration
- Cloud SQL data is deleted
Restore from automated backup
The process below is derived from GCP documentation.
The restoration process will be performed with gcloud. Learn more about why not terraform?
List all backups and note the ID of the latest SUCCESSFUL backup (or the last one taken before the database state was corrupted) as SQL_BACKUP_ID
mi2 instance sql-backup list --slug $SLUG -e $ENVIRONMENT
Restore the backup to the current instance.
mi2 instance sql-restore create --backup-id $SQL_BACKUP_ID --slug $SLUG -e $ENVIRONMENT
List operations to watch for progress
mi2 instance sql-restore list --slug $SLUG -e $ENVIRONMENT
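Because the restore targets the current instance, it may be worth confirming that every variable used in the commands above is set before running them. The following is a minimal sketch; require_vars is a hypothetical helper, not part of mi2.

```shell
# Fail early if any of the named variables is unset or empty.
require_vars() {
  missing=0
  for name in "$@"; do
    # Indirect lookup through eval keeps the helper POSIX sh compatible.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing required variable: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Example usage before the restore step:
# require_vars SLUG ENVIRONMENT SQL_BACKUP_ID && \
#   mi2 instance sql-restore create --backup-id "$SQL_BACKUP_ID" --slug "$SLUG" -e "$ENVIRONMENT"
```

The helper reports every missing name rather than stopping at the first, so a single run shows everything that still needs exporting.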
Restore GKE cluster application(s)
Use cases (tested scenarios):
- GKE cluster was deleted
- application namespace was deleted
- single application was deleted
- PV from single application was deleted
Backup and restore use the native GKE mechanism.
- Follow break glass process
- List available backups
- Extract the instance from control plane
- Assess the damage
List backups
mi2 instance backup list --slug $SLUG -e $ENVIRONMENT
Note the backup name; you will need it later.
Restore cluster and applications from backup
cd sourcegraph/cloud
cd environments/$ENVIRONMENT/deployments/$INSTANCE_ID
mi2 instance tfc deploy -auto-approve -e $ENVIRONMENT --slug $SLUG
mi2 instance workon -e $ENVIRONMENT --slug $SLUG
mi2 instance restore create --backup-name $BACKUP_NAME --restore-type full-replace --slug $SLUG -e $ENVIRONMENT
Restore the full namespace
cd sourcegraph/cloud
mi2 instance restore create --backup-name <BACKUP_NAME> --restore-type full-replace --slug $SLUG -e $ENVIRONMENT
Note: if a pod hangs with PVCs pending, run the command below:
kubectl delete sc gce-pd-gkebackup-de && kubectl get sc sourcegraph -o json | jq '.metadata.name = "gce-pd-gkebackup-de"' | kubectl apply -f -
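To confirm whether the note above applies, you can first list the PVCs that are stuck. The following is a minimal sketch, assuming kubectl's default table output where STATUS is the second column; list_pending_pvcs is a hypothetical helper name.

```shell
# Print the names of PVCs whose STATUS column reads Pending.
# Extra arguments (for example: -n <namespace>) are passed through to kubectl.
list_pending_pvcs() {
  kubectl get pvc --no-headers "$@" | awk '$2 == "Pending" { print $1 }'
}

# Example usage:
# list_pending_pvcs
```

An empty result suggests the PVCs are bound and the storage class workaround is probably not needed.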
Restore stateless application
e.g. sourcegraph-frontend
cd environments/$ENVIRONMENT/deployments/$INSTANCE_ID/kubernetes
kustomize build --load-restrictor LoadRestrictionsNone --enable-helm . | kubectl apply -f -
Restore stateful application from disk backup
e.g. gitserver, zoekt
cd sourcegraph/cloud
mi2 instance restore create --backup-name $BACKUP_NAME --restore-type [gitserver|indexed-search] --slug $SLUG -e $ENVIRONMENT
Restore stateful application with empty disk
e.g. gitserver, zoekt
cd environments/$ENVIRONMENT/deployments/$INSTANCE_ID/kubernetes
kustomize build --load-restrictor LoadRestrictionsNone --enable-helm . | kubectl apply -f -
Restoring GCP deleted project
Notes:
- The accidental deletion of the GCP project was performed using the following command, based on official GCP documentation:
gcloud projects delete <PROJECT_ID>
- According to official GCP documentation, a GCP project can be restored within 30 days of deletion.
- export environment variables
export ENVIRONMENT=[dev|prod]
export SLUG=<SLUG>
export GCP_PROJECT=$(mi2 instance get -e $ENVIRONMENT --slug $SLUG | jq -r '.status.gcp.projectId')
- perform undelete
gcloud projects undelete $GCP_PROJECT
- verify project is restored
gcloud projects describe $GCP_PROJECT
# should be: lifecycleState: ACTIVE
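The verification step can also be checked programmatically instead of read by eye. The following is a minimal sketch that uses gcloud's --format flag to extract only the lifecycleState field; project_is_active is a hypothetical helper name.

```shell
# Returns 0 when the project's lifecycleState is ACTIVE.
project_is_active() {
  state=$(gcloud projects describe "$1" --format='value(lifecycleState)')
  [ "$state" = "ACTIVE" ]
}

# Example usage:
# if project_is_active "$GCP_PROJECT"; then
#   echo "project restored"
# else
#   echo "project is not ACTIVE yet" >&2
# fi
```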