Like many of you, we run the self-hosted version of GitLab CE, and it is one of the best pieces of open source software written in Ruby on Rails. GitLab ships its own rake task for backing up to AWS S3, but it wasn’t working for us or for several other users. So we built our own solution and are sharing it here in case it helps others in need.
Once the GitLab repositories and uploads grow past 6GB, the chance of GitLab’s backup rake task working is about 50/50 in our experience. When it fails, it is usually while preparing the backup or while uploading to AWS S3. To avoid this issue, we simply split the backup into two segments.
First, we back up and upload the repositories, and then we upload the uploads directory. This is done with a simple shell script located at https://gist.github.com/balajidl/44c9389835547fa40e88e881ac1d40ee
What does the shell script do?
- Initialize the log file
- Back up the repositories using the GitLab provided rake task gitlab:backup:create, ignoring the uploads folder: gitlab:backup:create SKIP=uploads
- Backup the uploads by compressing the GitLab uploads folder
- Upload the two backup archives (the repositories .tar and the uploads .tar.gz) to AWS S3
- Periodically remove old backup files to avoid excess billing 🙂
- Send an email using SendGrid when the backup is successful.
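The last item in the list only makes sense if the earlier steps really succeeded. One way to track that is sketched here. This is a minimal sketch, not part of the gist: `run_step` and `FAILED` are hypothetical names, the log goes to a temporary file instead of /var/log/gitlab-backup.log, and `true` / `false` stand in for the real rake, tar and aws commands.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not in the original script): run one backup step,
# append its output to the log, and remember whether any step failed.
LOGFILE="$(mktemp)"   # throwaway log for this sketch
FAILED=0

run_step() {
  local name="$1"
  shift
  echo "Start: ${name}" >> "$LOGFILE"
  # Run the remaining arguments as a command, capturing stdout and stderr
  if "$@" >> "$LOGFILE" 2>&1; then
    echo "Done: ${name}" >> "$LOGFILE"
  else
    echo "FAILED: ${name}" >> "$LOGFILE"
    FAILED=1
  fi
}

# Stand-in commands; a real run would pass the rake task, tar, and aws calls.
run_step "repositories backup" true
run_step "uploads backup" false

# Only report success when every step returned a zero exit status
if [ "$FAILED" -eq 0 ]; then
  echo "all steps succeeded, safe to send the success email"
else
  echo "at least one step failed, skip the success email"
fi
```

With a wrapper like this, the SendGrid call could sit inside the success branch, so a failed upload never produces a "successful" email.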
LOGFILE="/var/log/gitlab-backup.log"
echo "**************START*************************" >> "$LOGFILE"
#---------1. GITLAB git repos backup ---------
echo "Start the gitlab backup process" >> "$LOGFILE"
/opt/gitlab/bin/gitlab-rake gitlab:backup:create SKIP=uploads >> "$LOGFILE"
echo "git backup done" >> "$LOGFILE"
echo "Now, uploading gitlab repo to s3" >> "$LOGFILE"
DATE=$(date +%Y_%m_%d)
FILENAME="*${DATE}*gitlab_backup*"
# Quote the pattern so that find, not the shell, expands the wildcards
EE="$(find /mnt/data/gitlab/ -type f -name "$FILENAME")"
# Only upload if a backup archive for today was actually found
if [ -n "$EE" ]; then
echo "Gitlab backup repo file to be uploaded is: $EE" >> "$LOGFILE"
/usr/local/bin/aws s3 cp "$EE" s3://meow/ >> "$LOGFILE"
else
echo "No gitlab backup file found for ${DATE}, nothing uploaded" >> "$LOGFILE"
fi
#---------2. GITLAB uploads backup ---------
echo "Start - gitlab uploads folder to s3" >> "$LOGFILE"
BAK_DES=/mnt/data/gitlab/backups/
BAK_SOURCES=/mnt/data/gitlab/uploads/
BAK_DATE=$(date +%F)
BAK_DATETIME=$(date +%F-%H%M)
# BAK_DES already ends with a slash
BAK_FOLDER=${BAK_DES}${BAK_DATE}
BAK_FILE=${BAK_DES}uploads-${BAK_DATETIME}.tar.gz
echo "creating tar.gz file at ${BAK_FILE}" >> "$LOGFILE"
tar czPf "${BAK_FILE}" "${BAK_SOURCES}"
echo "Now uploading gitlab uploads folder to s3" >> "$LOGFILE"
/usr/local/bin/aws s3 cp "${BAK_FILE}" s3://meow/ >> "$LOGFILE"
#---------3. Delete old files ---------
# Number of days to keep local backup files
KEEP_DAYS=3
echo "Deleting backups older than ${KEEP_DAYS} days" >> "$LOGFILE"
find /mnt/data/gitlab/backups/ -type f -name '*.tar' -mtime +"${KEEP_DAYS}" -exec rm {} \;
find /mnt/data/gitlab/backups/ -type f -name '*.tar.gz' -mtime +"${KEEP_DAYS}" -exec rm {} \;
#---------4. Send email to me to make me smile ---------
SENDGRID_API_KEY="long-key-here"
BAK_DATETIME=$(date +%F-%H:%M)
SUBJECT="Gitlab backup successful: ${BAK_DATETIME}"
REQUEST_DATA='{"personalizations": [{
"to": [{ "email": "foo@foo.bar" }],
"subject": "'"$SUBJECT"'"
}],
"from": {
"email": "foo@foo.bar",
"name": "Code.spritle.com"
},
"content": [{
"type": "text/plain",
"value": "Keep smiling"
}]
}';
curl -X "POST" "https://api.sendgrid.com/v3/mail/send" \
-H "Authorization: Bearer $SENDGRID_API_KEY" \
-H "Content-Type: application/json" \
-d "$REQUEST_DATA"
echo "Sent email notification via sendgrid" >> "$LOGFILE"
echo "***************END***************************" >> "$LOGFILE"
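Before trusting the cleanup step in the script above with real backups, it can help to preview which files it would match. The following is a rough sketch under some assumptions: it works on a throwaway directory rather than /mnt/data/gitlab/backups/, the file names are made up, and `touch -d "10 days ago"` relies on GNU coreutils syntax.

```shell
#!/usr/bin/env bash
# Sketch: list the archives the cleanup would delete, without deleting anything.
# BACKUP_DIR is a temporary directory here; the script above uses /mnt/data/gitlab/backups/.
BACKUP_DIR="$(mktemp -d)"
KEEP_DAYS=3

# One fresh archive and one that looks 10 days old (GNU touch -d syntax).
touch "${BACKUP_DIR}/uploads-new.tar.gz"
touch -d "10 days ago" "${BACKUP_DIR}/uploads-old.tar.gz"

# Same tests as the cleanup step, but with -print instead of -exec rm.
CANDIDATES="$(find "$BACKUP_DIR" -type f \( -name '*.tar' -o -name '*.tar.gz' \) -mtime +"$KEEP_DAYS" -print)"

echo "Would delete:"
echo "$CANDIDATES"
```

Once the printed list looks right, swapping `-print` back to `-exec rm {} \;` gives the behaviour of the script.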
Voila ^_^!
Amazing! Isn’t it? Please feel free to comment if you have questions or need help. Happy to help 🙂