Image Classifier Using CNN: Principles and Implementation

This article is about creating an image classifier that identifies cats vs. dogs using TFLearn in Python. The problem is hosted on Kaggle.

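Machine learning is now one of the hottest topics in the world. It could even be called the new electricity of today's world. But what exactly is machine learning? Put simply, it is a way of teaching a machine by feeding it a large amount of data. To learn more about machine learning and its algorithms, you can follow the links provided in the reference section of this article.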

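Today, we will create our own image classifier that tells whether a given picture shows a dog or a cat, or something else depending on the data you feed it. To reach that goal, we will use one of the well-known machine learning algorithms for image classification: the Convolutional Neural Network (CNN).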

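So what is a CNN, basically? It is a machine learning algorithm that lets a machine pick out the features of an image and remember them, so that it can guess the label of a new image fed to it. Since this is not an article explaining CNNs, I have added some links at the end for readers interested in how a CNN works and behaves.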

With that background in place, let's see how to create our own cat-vs-dog image classifier. For the dataset, we will use the cat-vs-dog dataset from Kaggle:

Train dataset link
Test dataset link

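After getting the dataset, we need to preprocess the data a bit and provide a label for each image in the training set. The file name of every training image starts with "cat" or "dog", so we take advantage of that and one-hot encode the label so the machine can understand it (cat [1, 0] or dog [0, 1]).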

def label_img(img):
    # The class name is the third dot-separated field from the end of the file name
    word_label = img.split('.')[-3]

    # DIY one-hot encoder
    if word_label == 'cat': return [1, 0]
    elif word_label == 'dog': return [0, 1]
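To see the labelling step in isolation, here is a small self-contained sketch. It assumes file names of the form `cat.0.jpg` and `dog.12499.jpg`; the sample names, the `CLASS_NAMES` list and the `decode` helper are illustrative additions, not part of the original script.

```python
def label_img(img):
    """Return a one-hot label taken from a file name such as 'cat.0.jpg'."""
    word_label = img.split('.')[-3]
    if word_label == 'cat':
        return [1, 0]
    elif word_label == 'dog':
        return [0, 1]
    # Fail loudly on unexpected names instead of silently returning None
    raise ValueError('unexpected file name: {}'.format(img))

# Hypothetical helper: map a one-hot label (or a score pair) back to a class name
CLASS_NAMES = ['cat', 'dog']

def decode(scores):
    return CLASS_NAMES[scores.index(max(scores))]

# Assumed sample file names, for illustration only
labels = [label_img(name) for name in ['cat.0.jpg', 'dog.12499.jpg']]
print(labels)
print([decode(label) for label in labels])
```

Raising an error for unknown prefixes is a design choice of this sketch; the version above simply returns nothing in that case.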

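Required libraries:

TFLearn – a deep learning library with a high-level API for TensorFlow, used to create the layers of our CNN
tqdm – makes your loops show a smart progress meter, purely for convenience
NumPy – to process the image matrices
OpenCV – to process the images, e.g. converting them to grayscale
os – to access the file system and read the images from the train and test directories on our machine
random – to shuffle the data and overcome bias
matplotlib – to display the results of our predictions
TensorFlow – used here only for TensorBoard, to compare the loss and Adam curves from our result data or logs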
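Before running the full script, it can help to confirm that these libraries are importable. The following is a minimal sketch using only the standard library. The import names in `REQUIRED` are an assumption based on the usual module names (OpenCV, for instance, is imported as `cv2`), and `missing_modules` is a hypothetical helper.

```python
import importlib.util

# Assumed import names for the libraries listed above
REQUIRED = ['tflearn', 'tqdm', 'numpy', 'cv2', 'matplotlib', 'tensorflow']

def missing_modules(names):
    """Return the module names that cannot be located in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]

missing = missing_modules(REQUIRED)
if missing:
    print('Not installed:', ', '.join(missing))
else:
    print('All required libraries were found.')
```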

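Set TRAIN_DIR and TEST_DIR to suit your own machine, and experiment with the basic hyperparameters, such as the number of epochs and the learning rate, to improve accuracy. I converted the images to grayscale so that we only have to deal with 2-D matrices; 3-D (colour) matrices are harder to feed directly to a CNN, which is not recommended for beginners. The heavily commented code follows; you can also find it through the link in my GitHub account.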
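The difference between the 2-D and 3-D case can be illustrated with array shapes alone. The sketch below uses random pixels as a stand-in for a real image and a plain channel average as a stand-in for grayscale conversion (OpenCV uses a weighted sum instead), so the values are illustrative only.

```python
import numpy as np

IMG_SIZE = 50

# Stand-in for a colour image: height x width x 3 channels (random pixels, not real data)
color = np.random.randint(0, 256, size=(IMG_SIZE, IMG_SIZE, 3), dtype=np.uint8)

# Simplified grayscale: average the three channels into one 2-D matrix
gray = color.mean(axis=2).astype(np.uint8)

# The network input still expects an explicit channel axis, so a batch of
# grayscale images is reshaped to (batch, height, width, 1)
batch = np.array([gray, gray]).reshape(-1, IMG_SIZE, IMG_SIZE, 1)

print(color.shape)  # (50, 50, 3)
print(gray.shape)   # (50, 50)
print(batch.shape)  # (2, 50, 50, 1)
```

Each grayscale image carries 2,500 values instead of 7,500, which is one reason the single-channel version is easier to start with.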

# Python program to create
# Image Classifier using CNN

# Importing the required libraries
import cv2
import os
import numpy as np
from random import shuffle
from tqdm import tqdm

'''Setting up the env'''

# Adjust both paths to wherever the dataset lives on your machine
TRAIN_DIR = 'E:/dataset/Cats_vs_Dogs/train'
TEST_DIR = 'E:/dataset/Cats_vs_Dogs/test1'
IMG_SIZE = 50
LR = 1e-3


'''Setting up the model which will help with tensorflow models'''
MODEL_NAME = 'dogsvscats-{}-{}.model'.format(LR, '6conv-basic')
  
'''Labelling the dataset'''
def label_img(img):
    word_label = img.split('.')[-3]
    # DIY One hot encoder
    if word_label == 'cat': return [1, 0]
    elif word_label == 'dog': return [0, 1]
  
'''Creating the training data'''
def create_train_data():
    # Creating an empty list where we should store the training data
    # after a little preprocessing of the data
    training_data = []

    # tqdm is only used to show a progress bar
    # while loading the training data
    for img in tqdm(os.listdir(TRAIN_DIR)):

        # labeling the images
        label = label_img(img)

        path = os.path.join(TRAIN_DIR, img)

        # loading the image from the path and then converting it to
        # grayscale, which is easier for the convnet to handle
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)

        # resizing the image for processing in the convnet
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))

        # final step: forming the training data list with numpy arrays of the images
        training_data.append([np.array(img), np.array(label)])

    # shuffling the training data so cats and dogs are mixed at random
    shuffle(training_data)

    # saving the processed data for later use if required; each entry pairs a
    # 50x50 image with a length-2 label, so it is stored as an object array
    np.save('train_data.npy', np.array(training_data, dtype=object))
    return training_data
  
'''Processing the given test data'''
# Almost the same as processing the training data, but
# we don't have to label it.
def process_test_data():
    testing_data = []
    for img in tqdm(os.listdir(TEST_DIR)):
        path = os.path.join(TEST_DIR, img)
        # test files are numbered, so keep the number as an identifier
        img_num = img.split('.')[0]
        img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
        img = cv2.resize(img, (IMG_SIZE, IMG_SIZE))
        testing_data.append([np.array(img), img_num])

    shuffle(testing_data)
    # each entry pairs an image array with a string, so store an object array
    np.save('test_data.npy', np.array(testing_data, dtype=object))
    return testing_data
  
'''Running the training and the testing in the dataset for our model'''
train_data = create_train_data()
test_data = process_test_data()

# If the data has already been processed and saved, load it instead:
# train_data = np.load('train_data.npy', allow_pickle=True)
# test_data = np.load('test_data.npy', allow_pickle=True)

'''Creating the neural network using tensorflow'''
# Importing the required libraries
import tflearn
from tflearn.layers.conv import conv_2d, max_pool_2d
from tflearn.layers.core import input_data, dropout, fully_connected
from tflearn.layers.estimator import regression

import tensorflow as tf
# TensorFlow 1.x API; on TensorFlow 2.x this call is tf.compat.v1.reset_default_graph()
tf.reset_default_graph()
convnet = input_data(shape=[None, IMG_SIZE, IMG_SIZE, 1], name='input')

# Five convolution + max-pooling blocks: (filters, filter size), then pool size
convnet = conv_2d(convnet, 32, 5, activation='relu')
convnet = max_pool_2d(convnet, 5)

convnet = conv_2d(convnet, 64, 5, activation='relu')
convnet = max_pool_2d(convnet, 5)

convnet = conv_2d(convnet, 128, 5, activation='relu')
convnet = max_pool_2d(convnet, 5)

convnet = conv_2d(convnet, 64, 5, activation='relu')
convnet = max_pool_2d(convnet, 5)

convnet = conv_2d(convnet, 32, 5, activation='relu')
convnet = max_pool_2d(convnet, 5)

# Fully connected layer; 0.8 is the probability of keeping a unit
convnet = fully_connected(convnet, 1024, activation='relu')
convnet = dropout(convnet, 0.8)

# Two output units, one per class (cat, dog)
convnet = fully_connected(convnet, 2, activation='softmax')
convnet = regression(convnet, optimizer='adam', learning_rate=LR,
                     loss='categorical_crossentropy', name='targets')

model = tflearn.DNN(convnet, tensorboard_dir='log')
  
# Holding out the last 500 labelled images for validation;
# the rest is used for training
train = train_data[:-500]
test = train_data[-500:]

'''Setting up the features and labels'''
# X-Features & Y-Labels

X = np.array([i[0] for i in train]).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
Y = [i[1] for i in train]
test_x = np.array([i[0] for i in test]).reshape(-1, IMG_SIZE, IMG_SIZE, 1)
test_y = [i[1] for i in test]

'''Fitting the data into our model'''
# epoch = 5 taken
model.fit({'input': X}, {'targets': Y}, n_epoch=5,
          validation_set=({'input': test_x}, {'targets': test_y}),
          snapshot_step=500, show_metric=True, run_id=MODEL_NAME)
model.save(MODEL_NAME)
  
'''Testing the data'''
import matplotlib.pyplot as plt
# if you need to create the data:
# test_data = process_test_data()
# if you already have some saved:
test_data = np.load('test_data.npy', allow_pickle=True)

fig = plt.figure()

# Show the first 20 test images with their predicted label
for num, data in enumerate(test_data[:20]):
    # cat: [1, 0]
    # dog: [0, 1]

    img_num = data[1]
    img_data = data[0]

    y = fig.add_subplot(4, 5, num + 1)
    orig = img_data
    data = img_data.reshape(IMG_SIZE, IMG_SIZE, 1)

    # predict returns one score pair per input image
    model_out = model.predict([data])[0]

    if np.argmax(model_out) == 1: str_label = 'Dog'
    else: str_label = 'Cat'

    y.imshow(orig, cmap='gray')
    plt.title(str_label)
    y.axes.get_xaxis().set_visible(False)
    y.axes.get_yaxis().set_visible(False)
plt.show()

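The output images will not be very sharp, because every image is reduced to 50×50 so that the machine can process it quickly; this is a trade-off between speed and loss.

To open TensorBoard, run the following command in your cmd (Windows users):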
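The 50×50 input size also determines how fast the feature maps shrink inside the network. The sketch below estimates the spatial size after each convolution and pooling block. It assumes, as I understand the TFLearn defaults, that `conv_2d` uses 'same' padding with stride 1 and that `max_pool_2d` uses 'same' padding with a stride equal to its kernel size; both helper functions are hypothetical.

```python
import math

def same_padding_output(size, stride):
    """Spatial size after a layer with 'same' padding and the given stride."""
    return math.ceil(size / stride)

def feature_map_sizes(input_size, n_blocks=5, pool_stride=5):
    """Return the side length of the feature map after each conv + pool block."""
    sizes = [input_size]
    size = input_size
    for _ in range(n_blocks):
        size = same_padding_output(size, 1)            # convolution, stride 1 (assumed)
        size = same_padding_output(size, pool_stride)  # max pooling, stride 5 (assumed)
        sizes.append(size)
    return sizes

print(feature_map_sizes(50))  # [50, 10, 2, 1, 1, 1]
```

Under these assumptions the map is already 1×1 after the third block, so a larger IMG_SIZE or a smaller pooling size would leave the later layers more spatial information to work with, at the cost of speed.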

tensorboard --logdir=foo:C:\Users\knapseck\Desktop\Dev\Cov_Net\log

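The output looks like this:

Reference links for beginners in machine learning:

Geeks for Geeks – Machine Learning
Siraj Raval – YouTube
Andrew Ng's Machine Learning course (Coursera)
Machine Learning: A Probabilistic Perspective by Kevin Murphy
The machine learning community on Reddit

Reference links for CNN:

Jupyter Notebook – Conv_Net
Wikipedia – Convolutional Neural Network
Stanford course – cs231n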
