轉換學習 (Dogs & Cats)

本案例使用 1,024張『貓』、和 1,024張『狗』的照片來訓練一個可以辨識貓或狗的模型, 利用『轉換學習(Transfer Learning)』的方法,我們可以在很短的時間就做到92.5%的正確率。

In [1]:
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
from scipy import ndimage, misc

n = 8; m = 4
img_file = ["data/train/cats/cat.{}.jpg".format(i) for i in range(16)] + [
    "data/train/dogs/dog.{}.jpg".format(i) for i in range(16)] 

plt.figure(figsize=(20, 12))
for i in range(m):
    for j in range(n):
        ax = plt.subplot(m, n, i*n + j + 1)
        img = ndimage.imread(img_file[i*n + j], mode="RGB")
        img = misc.imresize(img, (80, 80))
        plt.imshow(img)
        ax.get_xaxis().set_visible(False)
        ax.get_yaxis().set_visible(False)
plt.show()

In [2]:
## This notebook is built around using tensorflow as the backend for keras
##Updated to Keras 2.0
from keras import backend
import os
import numpy as np
from keras.models import Sequential
from keras.layers import Activation, Dropout, Flatten, Dense
from keras.preprocessing.image import ImageDataGenerator
from keras.layers import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras import optimizers
from keras import applications
from keras.models import Model
# from PIL import Image
# import h5py
Using TensorFlow backend.

Import Images

In [3]:
# dimensions of our images.
img_width, img_height = 150, 150
train_data_dir = 'data/train'
validation_data_dir = 'data/validation'

##preprocessing
# used to rescale the pixel values from [0, 255] to [0, 1] interval
datagen = ImageDataGenerator(rescale=1./255)
batch_size = 32

# automagically retrieve images and their classes for train and validation sets
train_generator = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='binary')

validation_generator = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='binary')
Found 2048 images belonging to 2 classes.
Found 832 images belonging to 2 classes.

Small Conv Net

Model architecture definition

In [4]:
# a simple stack of 3 convolution layers with a ReLU activation and followed by max-pooling layers.
model = Sequential()
model.add(Convolution2D(32, (3, 3), input_shape=(img_width, img_height,3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(32, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Convolution2D(64, (3, 3)))
model.add(Activation('relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Flatten())
model.add(Dense(64))
model.add(Activation('relu'))
model.add(Dropout(0.5))
model.add(Dense(1))
model.add(Activation('sigmoid'))
In [5]:
model.compile(loss='binary_crossentropy',
              optimizer='rmsprop',
              metrics=['accuracy'])

Training

In [6]:
epochs = 20
train_samples = 2048
validation_samples = 832
In [7]:
model.fit_generator(
        train_generator,
        steps_per_epoch=train_samples // batch_size,
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=validation_samples// batch_size,)
Epoch 1/20
64/64 [==============================] - 20s - loss: 0.7184 - acc: 0.5425 - val_loss: 0.6734 - val_acc: 0.5204
Epoch 2/20
64/64 [==============================] - 13s - loss: 0.6769 - acc: 0.5859 - val_loss: 0.6496 - val_acc: 0.6514
Epoch 3/20
64/64 [==============================] - 13s - loss: 0.6260 - acc: 0.6611 - val_loss: 0.5761 - val_acc: 0.7115
Epoch 4/20
64/64 [==============================] - 13s - loss: 0.5901 - acc: 0.7051 - val_loss: 0.6119 - val_acc: 0.6394
Epoch 5/20
64/64 [==============================] - 13s - loss: 0.5571 - acc: 0.7173 - val_loss: 0.5697 - val_acc: 0.6863
Epoch 6/20
64/64 [==============================] - 13s - loss: 0.5306 - acc: 0.7456 - val_loss: 0.5794 - val_acc: 0.6923
Epoch 7/20
64/64 [==============================] - 13s - loss: 0.4976 - acc: 0.7617 - val_loss: 0.5556 - val_acc: 0.7115
Epoch 8/20
64/64 [==============================] - 13s - loss: 0.4752 - acc: 0.7720 - val_loss: 0.5629 - val_acc: 0.7380
Epoch 9/20
64/64 [==============================] - 13s - loss: 0.4386 - acc: 0.7983 - val_loss: 0.5625 - val_acc: 0.7260
Epoch 10/20
64/64 [==============================] - 13s - loss: 0.3813 - acc: 0.8257 - val_loss: 0.6211 - val_acc: 0.7103
Epoch 11/20
64/64 [==============================] - 13s - loss: 0.3478 - acc: 0.8486 - val_loss: 0.6081 - val_acc: 0.7308
Epoch 12/20
64/64 [==============================] - 13s - loss: 0.3233 - acc: 0.8584 - val_loss: 0.6489 - val_acc: 0.7248
Epoch 13/20
64/64 [==============================] - 13s - loss: 0.2769 - acc: 0.8813 - val_loss: 0.5515 - val_acc: 0.7284
Epoch 14/20
64/64 [==============================] - 13s - loss: 0.2300 - acc: 0.9097 - val_loss: 0.6971 - val_acc: 0.7212
Epoch 15/20
64/64 [==============================] - 13s - loss: 0.2125 - acc: 0.9141 - val_loss: 0.7240 - val_acc: 0.7284
Epoch 16/20
64/64 [==============================] - 13s - loss: 0.1743 - acc: 0.9360 - val_loss: 0.7785 - val_acc: 0.7368
Epoch 17/20
64/64 [==============================] - 13s - loss: 0.1487 - acc: 0.9424 - val_loss: 0.8430 - val_acc: 0.7464
Epoch 18/20
64/64 [==============================] - 13s - loss: 0.1337 - acc: 0.9429 - val_loss: 0.8877 - val_acc: 0.7151
Epoch 19/20
64/64 [==============================] - 13s - loss: 0.1108 - acc: 0.9590 - val_loss: 1.0877 - val_acc: 0.7127
Epoch 20/20
64/64 [==============================] - 13s - loss: 0.1065 - acc: 0.9565 - val_loss: 1.1210 - val_acc: 0.7103
Out[7]:
<keras.callbacks.History at 0x1bd0bc596a0>

Evaluating on validation set

In [8]:
model.evaluate_generator(validation_generator, validation_samples)
Out[8]:
[1.0338689942593471, 0.72336989182692313]
In [9]:
model.save_weights('models/basic_cnn_20_epochs.h5')

Data augmentation for improving the model

In [10]:
train_datagen_augmented = ImageDataGenerator(
        rescale=1./255,        # normalize pixel values to [0,1]
        shear_range=0.2,       # randomly applies shearing transformation
        zoom_range=0.2,        # randomly applies shearing transformation
        horizontal_flip=True)  # randomly flip the images

# same code as before
train_generator_augmented = train_datagen_augmented.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode='binary')
Found 2048 images belonging to 2 classes.
In [11]:
model.fit_generator(
        train_generator_augmented,
        steps_per_epoch=train_samples // batch_size,
        epochs=epochs,
        validation_data=validation_generator,
        validation_steps=validation_samples // batch_size,)
Epoch 1/20
64/64 [==============================] - 20s - loss: 0.5257 - acc: 0.7612 - val_loss: 0.5442 - val_acc: 0.7680
Epoch 2/20
64/64 [==============================] - 18s - loss: 0.4533 - acc: 0.8022 - val_loss: 0.5603 - val_acc: 0.7404
Epoch 3/20
64/64 [==============================] - 18s - loss: 0.4534 - acc: 0.8027 - val_loss: 0.7995 - val_acc: 0.7224
Epoch 4/20
64/64 [==============================] - 18s - loss: 0.4488 - acc: 0.8022 - val_loss: 0.5067 - val_acc: 0.7608
Epoch 5/20
64/64 [==============================] - 18s - loss: 0.4493 - acc: 0.8062 - val_loss: 0.5366 - val_acc: 0.7488
Epoch 6/20
64/64 [==============================] - 18s - loss: 0.4256 - acc: 0.8174 - val_loss: 0.5034 - val_acc: 0.7728
Epoch 7/20
64/64 [==============================] - 18s - loss: 0.4064 - acc: 0.8257 - val_loss: 0.5230 - val_acc: 0.7752
Epoch 8/20
64/64 [==============================] - 18s - loss: 0.4237 - acc: 0.8232 - val_loss: 0.5076 - val_acc: 0.7704
Epoch 9/20
64/64 [==============================] - 18s - loss: 0.3883 - acc: 0.8257 - val_loss: 0.4885 - val_acc: 0.7704
Epoch 10/20
64/64 [==============================] - 18s - loss: 0.4003 - acc: 0.8257 - val_loss: 0.5256 - val_acc: 0.7656
Epoch 11/20
64/64 [==============================] - 18s - loss: 0.3850 - acc: 0.8281 - val_loss: 0.5385 - val_acc: 0.7728
Epoch 12/20
64/64 [==============================] - 17s - loss: 0.4003 - acc: 0.8281 - val_loss: 0.4912 - val_acc: 0.7921
Epoch 13/20
64/64 [==============================] - 18s - loss: 0.3768 - acc: 0.8379 - val_loss: 0.6657 - val_acc: 0.7740
Epoch 14/20
64/64 [==============================] - 17s - loss: 0.3844 - acc: 0.8345 - val_loss: 0.4896 - val_acc: 0.7752
Epoch 15/20
64/64 [==============================] - 18s - loss: 0.3970 - acc: 0.8286 - val_loss: 0.4951 - val_acc: 0.7885
Epoch 16/20
64/64 [==============================] - 18s - loss: 0.3832 - acc: 0.8467 - val_loss: 0.5684 - val_acc: 0.7656
Epoch 17/20
64/64 [==============================] - 18s - loss: 0.3796 - acc: 0.8438 - val_loss: 0.4836 - val_acc: 0.8017
Epoch 18/20
64/64 [==============================] - 18s - loss: 0.3799 - acc: 0.8354 - val_loss: 0.4962 - val_acc: 0.7788
Epoch 19/20
64/64 [==============================] - 18s - loss: 0.3611 - acc: 0.8481 - val_loss: 0.5655 - val_acc: 0.7728
Epoch 20/20
64/64 [==============================] - 18s - loss: 0.3906 - acc: 0.8428 - val_loss: 0.4696 - val_acc: 0.8089
Out[11]:
<keras.callbacks.History at 0x1bd74cc32b0>

Evaluation & Validation

In [12]:
model.evaluate_generator(validation_generator, validation_samples)
Out[12]:
[0.48666439271675277, 0.80115685096153844]
In [13]:
model.save_weights('models/augmented_20_epochs.h5')

Using a pre-trained model

VGG16 + small MLP

download VGG16

In [14]:
model_vgg = applications.VGG16(include_top=False, weights='imagenet')

Using the VGG16 model to process samples

In [15]:
train_generator_bottleneck = datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)

validation_generator_bottleneck = datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_width, img_height),
        batch_size=batch_size,
        class_mode=None,
        shuffle=False)
Found 2048 images belonging to 2 classes.
Found 832 images belonging to 2 classes.

This is a long process, so we save the output of the VGG16 once and for all.

In [16]:
bottleneck_features_train = model_vgg.predict_generator(train_generator_bottleneck, train_samples // batch_size)
np.save(open('models/bottleneck_features_train.npy', 'wb'), bottleneck_features_train)
In [17]:
bottleneck_features_validation = model_vgg.predict_generator(validation_generator_bottleneck, validation_samples // batch_size)
np.save(open('models/bottleneck_features_validation.npy', 'wb'), bottleneck_features_validation)

Now we can load it ...

In [18]:
train_data = np.load(open('models/bottleneck_features_train.npy', 'rb'))
train_labels = np.array([0] * (train_samples // 2) + [1] * (train_samples // 2))

validation_data = np.load(open('models/bottleneck_features_validation.npy', 'rb'))
validation_labels = np.array([0] * (validation_samples // 2) + [1] * (validation_samples // 2))

and define and train the custom fully connected nural network

In [19]:
model_top = Sequential()
model_top.add(Flatten(input_shape=train_data.shape[1:]))
model_top.add(Dense(256, activation='relu'))
model_top.add(Dropout(0.5))
model_top.add(Dense(1, activation='sigmoid'))

model_top.compile(optimizer='rmsprop', loss='binary_crossentropy', metrics=['accuracy'])
In [20]:
model_top.fit(train_data, train_labels,
        epochs=epochs, 
        batch_size=batch_size,
        validation_data=(validation_data, validation_labels))
Train on 2048 samples, validate on 832 samples
Epoch 1/20
2048/2048 [==============================] - 1s - loss: 0.8737 - acc: 0.7412 - val_loss: 0.4293 - val_acc: 0.7837
Epoch 2/20
2048/2048 [==============================] - 1s - loss: 0.3957 - acc: 0.8379 - val_loss: 0.3657 - val_acc: 0.8269
Epoch 3/20
2048/2048 [==============================] - 1s - loss: 0.3174 - acc: 0.8618 - val_loss: 0.2349 - val_acc: 0.9062
Epoch 4/20
2048/2048 [==============================] - 1s - loss: 0.2736 - acc: 0.8965 - val_loss: 0.3680 - val_acc: 0.8401
Epoch 5/20
2048/2048 [==============================] - 1s - loss: 0.2432 - acc: 0.9038 - val_loss: 0.4230 - val_acc: 0.8365
Epoch 6/20
2048/2048 [==============================] - 1s - loss: 0.2130 - acc: 0.9141 - val_loss: 0.3723 - val_acc: 0.8462
Epoch 7/20
2048/2048 [==============================] - 1s - loss: 0.1965 - acc: 0.9185 - val_loss: 0.2550 - val_acc: 0.9087
Epoch 8/20
2048/2048 [==============================] - 1s - loss: 0.1414 - acc: 0.9414 - val_loss: 0.2735 - val_acc: 0.9050
Epoch 9/20
2048/2048 [==============================] - 1s - loss: 0.1323 - acc: 0.9414 - val_loss: 0.2714 - val_acc: 0.9062
Epoch 10/20
2048/2048 [==============================] - 1s - loss: 0.1181 - acc: 0.9521 - val_loss: 0.3466 - val_acc: 0.8990
Epoch 11/20
2048/2048 [==============================] - 1s - loss: 0.1285 - acc: 0.9463 - val_loss: 0.3082 - val_acc: 0.9026
Epoch 12/20
2048/2048 [==============================] - 1s - loss: 0.0927 - acc: 0.9634 - val_loss: 0.3508 - val_acc: 0.8954
Epoch 13/20
2048/2048 [==============================] - 1s - loss: 0.0968 - acc: 0.9600 - val_loss: 0.3416 - val_acc: 0.9050
Epoch 14/20
2048/2048 [==============================] - 1s - loss: 0.0782 - acc: 0.9697 - val_loss: 0.5825 - val_acc: 0.8510
Epoch 15/20
2048/2048 [==============================] - 1s - loss: 0.0841 - acc: 0.9678 - val_loss: 0.4221 - val_acc: 0.8978
Epoch 16/20
2048/2048 [==============================] - 1s - loss: 0.0610 - acc: 0.9771 - val_loss: 0.3826 - val_acc: 0.9050
Epoch 17/20
2048/2048 [==============================] - 1s - loss: 0.0621 - acc: 0.9756 - val_loss: 0.4922 - val_acc: 0.8918
Epoch 18/20
2048/2048 [==============================] - 1s - loss: 0.0426 - acc: 0.9854 - val_loss: 0.4165 - val_acc: 0.9087
Epoch 19/20
2048/2048 [==============================] - 1s - loss: 0.0555 - acc: 0.9805 - val_loss: 0.4881 - val_acc: 0.9002
Epoch 20/20
2048/2048 [==============================] - 1s - loss: 0.0445 - acc: 0.9854 - val_loss: 0.6359 - val_acc: 0.8714
Out[20]:
<keras.callbacks.History at 0x1bd0c677e80>

The training process of this small neural network is very fast : ~2s per epoch

In [21]:
model_top.save_weights('models/bottleneck_20_epochs.h5')

Bottlenect model evaluation

In [22]:
model_top.evaluate(validation_data, validation_labels)
768/832 [==========================>...] - ETA: 0s
Out[22]:
[0.63587793386362201, 0.87139423076923073]

We reached a 88% accuracy on the validation after ~1m of training (~20 epochs) and 8% of the samples originally available on the Kaggle competition !

Fine-tuning the top layers of a pre-trained network

Start by instantiating the VGG base and loading its weights.

In [23]:
model_vgg = applications.VGG16(weights='imagenet', include_top=False, input_shape=(150, 150, 3))

Build a classifier model to put on top of the convolutional model. For the fine tuning, we start with a fully trained-classifer. We will use the weights from the earlier model. And then we will add this model on top of the convolutional base.

In [24]:
top_model = Sequential()
top_model.add(Flatten(input_shape=model_vgg.output_shape[1:]))
top_model.add(Dense(256, activation='relu'))
top_model.add(Dropout(0.5))
top_model.add(Dense(1, activation='sigmoid'))

top_model.load_weights('models/bottleneck_30_epochs.h5')

#model_vgg.add(top_model)
model = Model(inputs = model_vgg.input, outputs = top_model(model_vgg.output))

For fine turning, we only want to train a few layers. This line will set the first 25 layers (up to the conv block) to non-trainable

In [25]:
for layer in model.layers[:15]:
    layer.trainable = False
In [26]:
# compile the model with a SGD/momentum optimizer
# and a very slow learning rate.
model.compile(loss='binary_crossentropy',
              optimizer=optimizers.SGD(lr=1e-4, momentum=0.9),
              metrics=['accuracy'])
In [27]:
# prepare data augmentation configuration  . . . do we need this?
train_datagen = ImageDataGenerator(
        rescale=1./255,
        shear_range=0.2,
        zoom_range=0.2,
        horizontal_flip=True)

test_datagen = ImageDataGenerator(rescale=1./255)

train_generator = train_datagen.flow_from_directory(
        train_data_dir,
        target_size=(img_height, img_width),
        batch_size=batch_size,
        class_mode='binary')

validation_generator = test_datagen.flow_from_directory(
        validation_data_dir,
        target_size=(img_height, img_width),
        batch_size=batch_size,
        class_mode='binary')
Found 2048 images belonging to 2 classes.
Found 832 images belonging to 2 classes.
In [28]:
# fine-tune the model
model.fit_generator(
    train_generator,
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=validation_generator,
    validation_steps=validation_samples // batch_size)
Epoch 1/20
64/64 [==============================] - 81s - loss: 0.3221 - acc: 0.9087 - val_loss: 0.3964 - val_acc: 0.8930
Epoch 2/20
64/64 [==============================] - 80s - loss: 0.2013 - acc: 0.9331 - val_loss: 0.3099 - val_acc: 0.9147
Epoch 3/20
64/64 [==============================] - 80s - loss: 0.1220 - acc: 0.9585 - val_loss: 0.3256 - val_acc: 0.9123
Epoch 4/20
64/64 [==============================] - 80s - loss: 0.1500 - acc: 0.9424 - val_loss: 0.2944 - val_acc: 0.9135
Epoch 5/20
64/64 [==============================] - 80s - loss: 0.1231 - acc: 0.9561 - val_loss: 0.2904 - val_acc: 0.9171
Epoch 6/20
64/64 [==============================] - 80s - loss: 0.1073 - acc: 0.9595 - val_loss: 0.3145 - val_acc: 0.9123
Epoch 7/20
64/64 [==============================] - 80s - loss: 0.0727 - acc: 0.9756 - val_loss: 0.2810 - val_acc: 0.9255
Epoch 8/20
64/64 [==============================] - 80s - loss: 0.0820 - acc: 0.9731 - val_loss: 0.3489 - val_acc: 0.9159
Epoch 9/20
64/64 [==============================] - 80s - loss: 0.0836 - acc: 0.9692 - val_loss: 0.3068 - val_acc: 0.9183
Epoch 10/20
64/64 [==============================] - 80s - loss: 0.0776 - acc: 0.9731 - val_loss: 0.3212 - val_acc: 0.9147
Epoch 11/20
64/64 [==============================] - 80s - loss: 0.0779 - acc: 0.9722 - val_loss: 0.3289 - val_acc: 0.9171
Epoch 12/20
64/64 [==============================] - 80s - loss: 0.0499 - acc: 0.9810 - val_loss: 0.3055 - val_acc: 0.9231
Epoch 13/20
64/64 [==============================] - 80s - loss: 0.0443 - acc: 0.9849 - val_loss: 0.2816 - val_acc: 0.9231
Epoch 14/20
64/64 [==============================] - 80s - loss: 0.0489 - acc: 0.9795 - val_loss: 0.3457 - val_acc: 0.9243
Epoch 15/20
64/64 [==============================] - 80s - loss: 0.0524 - acc: 0.9805 - val_loss: 0.3896 - val_acc: 0.9183
Epoch 16/20
64/64 [==============================] - 80s - loss: 0.0605 - acc: 0.9785 - val_loss: 0.3538 - val_acc: 0.9207
Epoch 17/20
64/64 [==============================] - 80s - loss: 0.0353 - acc: 0.9883 - val_loss: 0.3598 - val_acc: 0.9243
Epoch 18/20
64/64 [==============================] - 80s - loss: 0.0405 - acc: 0.9839 - val_loss: 0.3135 - val_acc: 0.9351
Epoch 19/20
64/64 [==============================] - 80s - loss: 0.0359 - acc: 0.9849 - val_loss: 0.3295 - val_acc: 0.9183
Epoch 20/20
64/64 [==============================] - 80s - loss: 0.0475 - acc: 0.9834 - val_loss: 0.3621 - val_acc: 0.9207
Out[28]:
<keras.callbacks.History at 0x1bd0c8e76d8>
In [29]:
model.save_weights('models/finetuning_20epochs_vgg.h5')

Evaluating on validation set

Computing loss and accuracy :

In [30]:
model.evaluate_generator(validation_generator, validation_samples)
Out[30]:
[0.32431596470433549, 0.92529296875]

We reached a 92.5% accuracy on the validation with 8% of the samples originally available on the Kaggle competition!

In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]:
 
In [ ]: