In this article, I explain how to implement a Convolutional Neural Network (CNN) using the Keras framework in Python. Keras is a high-level neural network API for building deep learning models. Deep learning models have achieved promising results on many computer vision tasks, and CNNs in particular perform well on image classification. This article walks through a Keras CNN for the handwritten digit classification task. Remember that deep networks are data-driven models, i.e., training them well requires a large amount of data. For my explanation, I use the MNIST dataset of handwritten digits.
About the MNIST dataset:
This dataset contains 60,000 instances in the training set and 10,000 instances in the test set. The following code snippet loads the MNIST dataset in Keras. The variables data_train and data_test hold the digit images of the training and test sets, whereas label_train and label_test store the corresponding labels.
#load training and testing data
from keras.datasets import mnist
(data_train, label_train), (data_test, label_test) = mnist.load_data()
We can view the dimensions of these variables using the following Python code snippet.
#printing dimensions of variables
print(data_train.shape,data_test.shape)
Output of the above code snippet: (60000, 28, 28) (10000, 28, 28)
The output shows that there are 60,000 training instances and 10,000 test instances. Each data instance (image) is 28 x 28 with no channel dimension, i.e., the training and test instances are gray scale images. The following code snippet helps to visualize the data. It displays four random images from the training set.
#displaying the random digits
import random
import cv2
for _ in range(4):
    #randrange excludes the upper bound, so the index is always valid (0 to 59999)
    rand = random.randrange(data_train.shape[0])
    cv2.imshow('digit', data_train[rand])
    #show each image for 500 ms
    cv2.waitKey(500)
Since we are working with gray scale images, the arrays have no channel dimension, but the Conv2D layer expects one. So we have to reshape the 28 x 28 images into 28 x 28 x 1 images, as shown in the following code snippet.
#reshape training and testing data to train and test model
data_train = data_train.reshape(60000,28,28,1)
data_test = data_test.reshape(10000,28,28,1)
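The effect of this reshape can be checked on a small stand-in array without downloading the dataset. This is a minimal sketch, assuming NumPy is available; the array `batch` and its size of 5 are made up for illustration. It also shows that passing -1 lets NumPy infer the number of images, so the same call would work for both sets.

```python
import numpy as np

# Hypothetical stand-in for a few MNIST images: 5 gray scale images of 28 x 28
batch = np.zeros((5, 28, 28), dtype=np.uint8)

# Add a trailing channel axis; -1 lets NumPy infer the number of images
batch_reshaped = batch.reshape(-1, 28, 28, 1)

print(batch.shape)           # (5, 28, 28)
print(batch_reshaped.shape)  # (5, 28, 28, 1)
```

The pixel values are unchanged; only the shape gains a channel axis of length 1.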
The class labels are integers (0, 1, 2, ..., 9). To train with the categorical cross-entropy loss, it is necessary to convert these integers into a binary class matrix (known as one-hot encoding). The following code converts the integer class labels into binary class matrix labels.
from keras.utils import to_categorical
#one-hot encoding of training and testing labels
y_train = to_categorical(label_train)
y_test = to_categorical(label_test)
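To see what a binary class matrix looks like, the same encoding can be written by hand. This is a minimal sketch, assuming NumPy is available; the `labels` array is made up for illustration, and indexing an identity matrix is just one way to build the encoding, not necessarily how Keras does it internally.

```python
import numpy as np

# Hypothetical labels for illustration; MNIST labels are the digits 0 to 9
labels = np.array([5, 0, 4, 1, 9])
num_classes = 10

# Row i of the identity matrix is the one-hot vector for class i
one_hot = np.eye(num_classes)[labels]

print(one_hot.shape)  # (5, 10)
print(one_hot[0])     # [0. 0. 0. 0. 0. 1. 0. 0. 0. 0.]
```

Each row contains a single 1 at the position of the class label and 0 everywhere else.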
Now it is time to design the CNN for the classification task. Here, I have chosen three convolution layers, each followed by a ReLU activation. Then the output is flattened, and one fully connected layer with a softmax activation is added at the end for classification.
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Flatten
#create model
model = Sequential()
#add model layers
model.add(Conv2D(64, kernel_size=3, activation='relu', input_shape=(28,28,1)))
model.add(Conv2D(32, kernel_size=3, activation='relu'))
model.add(Conv2D(32, kernel_size=3, activation='relu'))
model.add(Flatten())
model.add(Dense(10, activation='softmax'))
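It can help to trace how the image size and the number of trainable parameters change through these layers. The sketch below does the arithmetic in plain Python, assuming the Keras defaults that the model above relies on (padding 'valid', stride 1, and a bias term in every layer); the helper names `conv_output_size` and `conv_params` are my own.

```python
def conv_output_size(size, kernel_size):
    # 'valid' padding with stride 1 trims kernel_size - 1 pixels per dimension
    return size - kernel_size + 1

def conv_params(kernel_size, in_channels, filters):
    # One kernel_size x kernel_size x in_channels kernel plus one bias per filter
    return (kernel_size * kernel_size * in_channels + 1) * filters

size, channels, total_params = 28, 1, 0
for filters in (64, 32, 32):
    total_params += conv_params(3, channels, filters)
    size, channels = conv_output_size(size, 3), filters
    print(size, size, channels)  # 26 26 64, then 24 24 32, then 22 22 32

# Flatten turns the last feature map into a vector
flat = size * size * channels
# Dense layer: one weight per input plus one bias, for each of the 10 classes
total_params += (flat + 1) * 10
print(flat, total_params)  # 15488 183242
```

Most of the parameters sit in the final fully connected layer, because the flattened feature map is large.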
Now we have to compile the model, which sets the optimizer, the loss function and the metric used to measure performance, and then train the model with the chosen options.
#compile model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
#training the model
model.fit(data_train, y_train, validation_data=(data_test, y_test), epochs=1)
With a single epoch, we get a validation accuracy of 97.93% on the MNIST dataset (the test set is used as validation data here). This is great. Increasing the number of epochs may increase the accuracy further.
Output: loss: 0.3824 - acc: 0.9464 - val_loss: 0.0693 - val_acc: 0.9793
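The loss and accuracy figures reported during training can be related back to the softmax outputs and the one-hot labels. This is a minimal sketch, assuming NumPy is available; the `probs` and `y_true` arrays are made-up values with 3 classes instead of 10, and the formulas are the textbook definitions rather than the exact Keras implementation.

```python
import numpy as np

# Hypothetical softmax outputs for 4 images and 3 classes (MNIST has 10)
probs = np.array([[0.1, 0.7, 0.2],
                  [0.8, 0.1, 0.1],
                  [0.3, 0.3, 0.4],
                  [0.6, 0.3, 0.1]])
# Matching one-hot labels: the true classes are 1, 0, 2, 1
y_true = np.array([[0, 1, 0],
                   [1, 0, 0],
                   [0, 0, 1],
                   [0, 1, 0]])

# Accuracy: fraction of images whose most probable class is the true class
predicted = probs.argmax(axis=1)
actual = y_true.argmax(axis=1)
accuracy = (predicted == actual).mean()
print(accuracy)  # 0.75

# Categorical cross-entropy: mean negative log probability of the true class
loss = -np.mean(np.log((probs * y_true).sum(axis=1)))
print(loss)
```

The last image is misclassified, so the accuracy is 3 out of 4, and its low probability for the true class contributes most to the loss.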