How to train your Tiny-yoloV3 model in Google Colab

Google Colab offers free virtual machines with a 12GB GPU, for up to 12 hours at a time. If, like me, you can’t afford a GPU-enabled computer, Google Colab is a blessing.

Check out my other blog post on Real-time custom object detection using Tiny-yoloV3 and OpenCV to prepare the config files and dataset for training.

So, I’m assuming you have the following files:

custom_cfg (folder containing the following files)
  .cfg file (yolov3-tiny.cfg modified for your number of classes)
  .data file
  .names file
  train.txt (training data set)
  test.txt (testing data set)
  darknet53.conv.74 or another intermediate .weights file
custom_dataset (folder containing the training images and their YOLO bounding box label files)
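For reference, the .data file is just a short list of key = value pairs telling Darknet where everything lives. Here is a minimal sketch in Python that builds one. The function name, the default file names and the class count are my assumptions for illustration, so adjust them to match your own dataset.

```python
def make_data_file(num_classes, cfg_dir="custom_cfg", name="licence_plate", backup_dir="backup"):
    """Return the text of a Darknet .data file (all paths are illustrative assumptions)."""
    lines = [
        f"classes = {num_classes}",           # number of object classes
        f"train = {cfg_dir}/train.txt",       # list of training image paths
        f"valid = {cfg_dir}/test.txt",        # list of testing image paths
        f"names = {cfg_dir}/{name}.names",    # one class name per line
        f"backup = {backup_dir}",             # folder where weights are saved
    ]
    return "\n".join(lines) + "\n"


print(make_data_file(1))
```

The paths are relative to the darknet folder, because that is where the training command is run from.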

Let’s get started.

1. Create a Google Colab Jupyter notebook

Go to Google Colab, create a new Jupyter notebook, and set it to use a GPU (Runtime > Change runtime type > Hardware accelerator).

2. Connect your Google Drive account

We need to transfer some files to the VM. We can do this with git or with Google Drive. We will use Google Drive to transfer the large files and training images. Create a code block and run this code:

from google.colab import drive
drive.mount('/content/gdrive')

You will be asked to authenticate with your Google account. Once done, you can access your drive contents under ‘/content/gdrive/’.

%ll /content/gdrive/My\ Drive/

3. Install cuDNN

CUDA is NVIDIA’s platform and API for programming on the graphics card. cuDNN is a library of deep neural network primitives built on CUDA. It provides GPU-accelerated implementations of common operations in deep neural nets.

Download the cuDNN build that matches the CUDA version installed in your VM. To check the CUDA version, run this command:

!/usr/local/cuda/bin/nvcc --version
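If you would rather pick the version out in code than read it off the screen, here is a small sketch. It assumes the nvcc output contains a line like "Cuda compilation tools, release 10.1, V10.1.243"; the function names are mine.

```python
import re
import subprocess


def cuda_release(nvcc_output):
    """Extract the 'release X.Y' number from `nvcc --version` output, or None."""
    match = re.search(r"release (\d+\.\d+)", nvcc_output)
    return match.group(1) if match else None


def installed_cuda_release(nvcc="/usr/local/cuda/bin/nvcc"):
    """Run nvcc and return its release string; None if nvcc cannot be run."""
    try:
        result = subprocess.run([nvcc, "--version"], capture_output=True, text=True)
    except OSError:
        return None
    return cuda_release(result.stdout)


sample = "Cuda compilation tools, release 10.1, V10.1.243"
print(cuda_release(sample))  # prints 10.1
```

In the notebook, calling installed_cuda_release() should give you the number to look for in the cuDNN file name.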

Upload the cuDNN file to your Google Drive. Then extract it into the ‘/usr/local/’ folder:

!tar -xzvf /content/gdrive/My\ Drive/cudnn-10.1-linux-x64-v7.6.5.32.tgz -C /usr/local/

!chmod a+r /usr/local/cuda/include/cudnn.h
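To confirm which cuDNN version actually landed in the VM, you can read the version macros from the header. This is a sketch that assumes the header defines CUDNN_MAJOR, CUDNN_MINOR and CUDNN_PATCHLEVEL, as the cuDNN 7 cudnn.h does (in cuDNN 8 these macros live in cudnn_version.h instead). The function name is mine.

```python
import re


def cudnn_version(header_text):
    """Return (major, minor, patch) parsed from cuDNN header text, or None if not found."""
    parts = []
    for key in ("CUDNN_MAJOR", "CUDNN_MINOR", "CUDNN_PATCHLEVEL"):
        match = re.search(rf"#define\s+{key}\s+(\d+)", header_text)
        if match is None:
            return None
        parts.append(int(match.group(1)))
    return tuple(parts)


sample = "#define CUDNN_MAJOR 7\n#define CUDNN_MINOR 6\n#define CUDNN_PATCHLEVEL 5\n"
print(cudnn_version(sample))  # prints (7, 6, 5)
```

In the notebook you would pass it the real header, for example cudnn_version(open('/usr/local/cuda/include/cudnn.h').read()).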

4. Install Darknet in your VM

Darknet is an open-source neural network framework written in C and CUDA. You could clone the Darknet repo directly into the VM, but we need to change the Makefile first, and editing files inside the VM is awkward. So, fork the repo and make the changes in your fork.

In the ‘Makefile’, set the values on lines 1 and 2 (GPU and CUDNN) to 1. This makes Darknet build with GPU and cuDNN support. Commit and push the changes.
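The change itself is tiny. As a sketch, assuming the stock Darknet Makefile where the first two lines read GPU=0 and CUDNN=0, the edit amounts to this text transformation (the function name is made up for illustration):

```python
import re


def enable_gpu_flags(makefile_text):
    """Set the GPU and CUDNN flags to 1, leaving every other line untouched."""
    pattern = r"^(GPU|CUDNN)[ \t]*=[ \t]*\d+[ \t]*$"
    return re.sub(pattern, r"\1=1", makefile_text, flags=re.MULTILINE)


sample = "GPU=0\nCUDNN=0\nOPENCV=0\n"
print(enable_gpu_flags(sample))
```

Only the two named flags are touched, so other options such as OPENCV keep whatever value they had.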

Then clone your forked repo in the VM by running this command:

#clone the repo
!git clone <your repo git url>

Compile Darknet

# cd into the darknet folder
%cd darknet

# Compile Darknet
!make

5. Upload your config files and training data

Upload the custom_cfg and custom_dataset folders to your Google Drive, then copy them into the darknet folder in your VM:

!cp -a /content/gdrive/My\ Drive/custom_cfg /content/darknet/ 

!cp -a /content/gdrive/My\ Drive/custom_dataset /content/darknet/
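Before training, it is worth checking that every image made it across together with its label file. Here is a minimal sketch, assuming the images and their same-named .txt label files sit side by side in one flat folder; the function name and the list of image extensions are my assumptions.

```python
from pathlib import Path

IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # assumed image formats


def images_missing_labels(dataset_dir):
    """Return names of images in dataset_dir that have no same-named .txt label file."""
    missing = []
    for image in sorted(Path(dataset_dir).iterdir()):
        is_image = image.suffix.lower() in IMAGE_SUFFIXES
        if is_image and not image.with_suffix(".txt").exists():
            missing.append(image.name)
    return missing
```

In the notebook, images_missing_labels('/content/darknet/custom_dataset') should return an empty list if the copy went through cleanly.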

6. Let’s train

Run this command to start training

!./darknet detector train custom_cfg/licence_plate.data custom_cfg/licence_plate.cfg custom_cfg/darknet53.conv.74 -dont_show

Once training completes, you will get the final weights file. Intermediate weights are saved every 100 iterations until 900, then every 10,000 iterations, in the ‘darknet/backup/’ folder. You can copy those files to Google Drive and download them from your Google Drive account.
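The VM is wiped when the session ends, so copy the weights out before that happens. Here is a sketch of the copy step in Python. The function name is mine, and the destination folder in the usage note below is a hypothetical name.

```python
import shutil
from pathlib import Path


def copy_weights(backup_dir, dest_dir):
    """Copy every .weights file from backup_dir into dest_dir; return the copied names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)  # create the destination if needed
    copied = []
    for weights in sorted(Path(backup_dir).glob("*.weights")):
        shutil.copy2(weights, dest / weights.name)
        copied.append(weights.name)
    return copied
```

For example, copy_weights('/content/darknet/backup', '/content/gdrive/My Drive/darknet_backup') would put the files in a darknet_backup folder in your Drive.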

That’s all folks, your model is trained.

7 Replies to “How to train your Tiny-yoloV3 model in Google Colab”

Lydia says:
August 17, 2020 at 8:16 am
Hi. I am creating a dataset on one class to detect people. When I start the training it gets to layer 15 before I get a “Cuda malloc failed” and “darknet: ./src/utils.c:326: error: Assertion `0' failed.” I followed all the steps provided and even had to resolve a few other issues I ran into but I can’t seem to resolve this issue. Any recommendations?
Thanks in advance.
1. Javeed Basha says:
  August 17, 2020 at 2:14 pm
  I’m not sure, I guess the GPU doesn’t have enough memory. Can you try decreasing the batch size or increasing subdivisions in .cfg file
2. Ganesh Raj says:
  March 15, 2021 at 9:12 am
  This is because the file path to open cv ( image or video ) is wrong …. kindly check it
Geofrey Kivaru says:
October 1, 2020 at 7:52 pm
Hello,
Thank you for excellent tutorial Javeed. I’ve followed all the steps that you mention but I find myself stuck at training step. When I run step 6 I get following error:
” ./darknet: error while loading shared libraries: libcudnn.so.8: cannot open shared object file: No such file or directory “.
I’ve installed cuDNN correctly and I can see it in my Colab directory along with the libcudnn.so.8 file but unfortunately I can’t find a way to access it while training, please help.
1. Geofrey Kivaru says:
  October 4, 2020 at 8:30 pm
  Solved it by running command: “!ldconfig /usr/local/cuda/lib64” before training step. I guess might help someone
  1. Artur says:
    November 21, 2020 at 6:35 pm
    Thank you!
RAJAT says:
December 28, 2020 at 7:04 pm
Thanks bro, but I am having a problem uploading the cuDNN file to Colab. Please help.
!tar -xzvf /content/gdrive/My\ Drive/cudnn-10.1-linux-x64-v7.6.5.32.tgz -C /usr/local/
when i am downloading cuDNN file for linux from nvidia site I am getting “cudnn-10.1-linux-x64-v8.0.5.39.solitairetheme8” file downloaded. I am not getting any tgz file.