
Cricket Shot Classification Using Player Pose from Images

2022-03-15 15:34

磐创AI


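Introduction

Pose detection is a branch of computer vision (CV) that predicts the position and movement of a person or object. It does so by looking at the combination of the pose and the orientation of the given person or object.

Objective

This article builds a model that classifies cricket shots from the player's pose. An image is fed to the model, the model detects the pose of the person in the image, and the detected pose is then used to classify the type of shot.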

1. Install dependencies
2. Load and preprocess the data
3. Data augmentation
4. Detect poses with detectron2
5. Classify cricket shots from the player's pose
6. Evaluate model performance

Installing dependencies for cricket shot classification

!pip install pyyaml==5.1
# install detectron2 (the wheel index below is for CUDA 10.1 and torch 1.5):
!pip install detectron2==0.1.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.5/index.html

Loading and preprocessing data for cricket shot classification

The dataset is stored on Google Drive, so we first mount the drive and then extract the shot.zip file.

# mount drive
from google.colab import drive
drive.mount('drive/')

The zip file contains images of the different types of shots, with one folder per type.

# extract files
!unzip 'drive/My Drive/shot.zip'

Next, we get the folder names, which are the classes (the shot types), using the listdir function from the os library. Printing them shows four folders: pull, cut, drive and sweep.

import os
# specify path
path = 'shot/'
# list down the folders
folders = os.listdir(path)
print(folders)

Output: ['pull', 'cut', 'drive', 'sweep']

Next, we read all the images and store them in a list called images. We also store a label for each image in a second list; the label is the name of the folder the image came from. We loop over each folder, read its images one by one, and append them to the lists.

# for dealing with images
import cv2
# create lists
images = []
labels = []
# for each folder
for folder in folders:
    # list down image names
    names = os.listdir(path + folder)
    # for each image
    for name in names:
        # read an image
        img = cv2.imread(path + folder + '/' + name)
        # append image to list
        images.append(img)
        # append folder name (type of shot) to list
        labels.append(folder)
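The loop above relies on a "one folder per class" layout. As a minimal sketch of that idea, the helper below only collects (file path, label) pairs and leaves the actual image reading to cv2. The function name list_labeled_files and the set of accepted extensions are assumptions made for this illustration; they are not part of the original notebook.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the dataset contains.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".bmp"}

def list_labeled_files(root):
    """Return sorted (file_path, label) pairs; the label is the parent folder name."""
    pairs = []
    for folder in sorted(Path(root).iterdir()):
        if not folder.is_dir():
            continue  # skip stray files that sit next to the class folders
        for file in sorted(folder.iterdir()):
            if file.is_file() and file.suffix.lower() in IMAGE_SUFFIXES:
                pairs.append((str(file), folder.name))
    return pairs
```

Filtering by extension is one way to keep non-image files out of the lists, which matters because cv2.imread returns None instead of raising an error when it cannot read a file.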

Let's quickly check the number of images with the len function. There are 290 images.

# number of images
len(images)

Output: 290

Now we visualize a few images from the dataset. For each shot type we plot five randomly chosen images, using matplotlib for plotting and the random module for sampling.

We create a grid of subplots with four rows, one per class, and five columns, one per example. For each class, we randomly pick five images and read them with cv2.imread. OpenCV loads images in BGR channel order, so each image is converted to RGB before it is displayed.

# visualization library
import matplotlib.pyplot as plt
# for randomness
import random
# create subplots with 4 rows and 5 columns
fig, ax = plt.subplots(nrows=4, ncols=5, figsize=(15, 15))
# randomly display 5 images for each folder (shot type)
for i in range(len(folders)):
    # read image names
    names = os.listdir(path + folders[i])
    # randomly select 5 image names
    names = random.sample(names, 5)
    # for each image
    for j in range(len(names)):
        # read an image
        img = cv2.imread(path + folders[i] + '/' + names[j])
        # convert BGR to RGB
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
        # display image
        ax[i, j].imshow(img)
        # set folder name as title
        ax[i, j].set_title(folders[i])
        # turn off axis
        ax[i, j].axis('off')
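The BGR-to-RGB step is just a reversal of the channel axis. The NumPy sketch below shows this on a made-up two-pixel image; the helper name bgr_to_rgb is ours, and the slice returns a view of the array rather than a copy.

```python
import numpy as np

def bgr_to_rgb(img):
    """Reverse the channel axis of an H x W x 3 image (BGR <-> RGB)."""
    return img[:, :, ::-1]

# A 1 x 2 image written in BGR order: one pure-blue and one pure-red pixel.
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)
rgb = bgr_to_rgb(bgr)

print(rgb[0, 0].tolist())  # the blue pixel, now in RGB order
print(rgb[0, 1].tolist())  # the red pixel, now in RGB order
```

The same slice, img[:, :, ::-1], appears later in the article when images are passed to and from the detectron2 Visualizer.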

These are some sample images from the dataset. Because the training set contains only a small number of images, we use data augmentation to increase its size.

Data augmentation

To enlarge the training set, we flip each image horizontally. This helps in two ways. First, players can bat right-handed or left-handed, so adding mirrored images makes the model more general. Second, it doubles the number of images available for training.

We create empty lists to hold the augmented version of every image in the dataset and its corresponding label. Each image is flipped with cv2's flip function and appended to the list.

# image augmentation
aug_images = []
aug_labels = []
# for each image in training data
for idx in range(len(images)):
    # fetch an image and label
    img = images[idx]
    label = labels[idx]
    # flip an image horizontally (flip code 1 mirrors around the vertical axis)
    img_flip = cv2.flip(img, 1)
    # append augmented image to list
    aug_images.append(img_flip)
    # append label to list
    aug_labels.append(label)
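To make the flip concrete, here is a small NumPy sketch of what a horizontal flip does to pixels, and what the matching operation would be on (x, y) coordinates. The article flips the images and lets the pose detector run again, so hflip_keypoints is only a hypothetical helper; it assumes pixel-index coordinates, where column x maps to width - 1 - x.

```python
import numpy as np

def hflip_image(img):
    """Mirror an H x W (x C) image left to right, like cv2.flip(img, 1)."""
    return img[:, ::-1]

def hflip_keypoints(xy, width):
    """Mirror (x, y) pixel coordinates for an image of the given width."""
    flipped = np.array(xy, dtype=float)  # copy, so the input is left untouched
    flipped[:, 0] = (width - 1) - flipped[:, 0]
    return flipped

img = np.arange(12).reshape(3, 4)            # toy 3 x 4 single-channel image
points = np.array([[0.0, 1.0], [3.0, 2.0]])  # two (x, y) points

print(hflip_image(img)[0].tolist())          # first row, reversed
print(hflip_keypoints(points, width=4).tolist())
```

If keypoints were mirrored directly instead of re-detected, the left and right joint labels would also need to be swapped, which is one reason flipping the image itself is the simpler route.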

Next, we visualize some augmented images next to the originals.

We randomly pick five images and create a grid of subplots as before, plotting the original image in the first column and its augmented version in the second.

Flipping an image does not change the type of shot: a pull shot that is mirrored horizontally is still a pull shot.

# display actual and augmented image for sample images
# create indices
ind = range(len(aug_images))
# randomly sample indices
ind = random.sample(ind, 5)
# create subplots with 5 rows and 2 columns
fig, ax = plt.subplots(nrows=5, ncols=2, figsize=(15, 15))
# for each row
for row in range(5):
    # for each column
    for col in range(2):
        # first column for actual image
        if col == 0:
            # display actual image (converted from BGR to RGB for matplotlib)
            ax[row, col].imshow(cv2.cvtColor(images[ind[row]], cv2.COLOR_BGR2RGB))
            # set title
            ax[row, col].set_title('Actual')
            # turn off axis
            ax[row, col].axis('off')
        # second column for augmented image
        else:
            # display augmented image (converted from BGR to RGB for matplotlib)
            ax[row, col].imshow(cv2.cvtColor(aug_images[ind[row]], cv2.COLOR_BGR2RGB))
            # set title
            ax[row, col].set_title('Augmented')
            # turn off axis
            ax[row, col].axis('off')

Now we combine the original and augmented images and check the total number of images.

# combine actual and augmented images & labels
images = images + aug_images
labels = labels + aug_labels
# number of images
len(images)

Output: 580

Detecting poses with detectron2

We now have 580 images for training, counting both original and augmented images, so the dataset is ready. Next, we use detectron2 to detect the player's pose in each of these images.

We use a pretrained model from detectron2 to detect the poses. The code below imports the required utilities, selects the model architecture through its config file, and sets the path to the pretrained weights.

After that, we set the score threshold for detections to 0.8 and create the predictor. The model is then ready to use.

# import some common detectron2 utilities
# to obtain pretrained models
from detectron2 import model_zoo
# set up predictor
from detectron2.engine import DefaultPredictor
# set config
from detectron2.config import get_cfg
# define configure instance
cfg = get_cfg()
# get a model specified by relative path under Detectron2's official configs/ directory
cfg.merge_from_file(model_zoo.get_config_file("COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x.yaml"))
# download pretrained model
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Keypoints/keypoint_rcnn_R_101_FPN_3x.yaml")
# set threshold for this model
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.8
# create predictor
predictor = DefaultPredictor(cfg)

Let's visualize some of the model's predictions. We randomly pick five images, run the predictor on each one, create a Visualizer, and draw the predictions on the image.

# for drawing predictions on images
from detectron2.utils.visualizer import Visualizer
# to obtain metadata
from detectron2.data import MetadataCatalog
# to display an image
from google.colab.patches import cv2_imshow
# randomly select images
for img in random.sample(images, 5):
    # make predictions
    outputs = predictor(img)
    # use `Visualizer` to draw the predictions on the image (it expects RGB, so the BGR channels are reversed)
    v = Visualizer(img[:, :, ::-1], MetadataCatalog.get(cfg.DATASETS.TRAIN[0]), scale=1)
    # draw prediction on image
    v = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    # display image (cv2_imshow expects BGR, so the channels are reversed back)
    cv2_imshow(v.get_image()[:, :, ::-1])

These are the model's predictions. Each detected player has a bounding box and a set of predicted keypoints. Note that the model also detects some people in the background.

Next, we define a function that detects the pose in an image and extracts its keypoints. The function takes an image as input, runs the pretrained model on it, and converts the predicted keypoints to a NumPy array.

An image can contain several people. We keep only the keypoints of the detection with the highest score and then flatten them into a one-dimensional array, because the neural network we build on top of them takes a one-dimensional input.

We then apply the function to every image and store the results in a list called keypoints. With keypoints for all images in hand, the next step is to build a neural network that classifies them into the corresponding shot type.

# for array operations
import numpy as np
# progress bar
from tqdm import tqdm

# define function that extracts the keypoints for an image
def extract_keypoints(img):
    # make predictions
    outputs = predictor(img)
    # fetch keypoints: one (x, y, score) row per keypoint for each detected person
    keypoints = outputs['instances'].pred_keypoints
    # convert to numpy array
    kp = keypoints.cpu().numpy()
    # if keypoints detected
    if len(keypoints) > 0:
        # fetch keypoints of the person with maximum confidence score
        kp = kp[0]
        # drop the third column (score), keeping only x and y
        kp = np.delete(kp, 2, 1)
        # convert 2D array to 1D array
        kp = kp.flatten()
    # return keypoints (an empty array if nobody was detected)
    return kp

# create list
keypoints = []
# for every image
for i in tqdm(range(len(images))):
    # extract keypoints
    kp = extract_keypoints(images[i])
    # append keypoints
    keypoints.append(kp)
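The function above assumes that at least one person is found in every image. As a hedged sketch of how the no-detection case could be handled, the NumPy code below works on plain arrays shaped like detectron2's pred_keypoints, that is (N, 17, 3) with (x, y, score) rows, and drops images without a detection together with their labels. The names flatten_top_pose and build_feature_matrix are hypothetical and not part of the original notebook.

```python
import numpy as np

NUM_KEYPOINTS = 17  # COCO person keypoints; 17 * 2 gives the 34 classifier inputs

def flatten_top_pose(pred_keypoints):
    """Turn an (N, 17, 3) array of (x, y, score) rows into a length-34 vector.

    Returns None when N == 0 so that the caller can drop the image.
    """
    kp = np.asarray(pred_keypoints, dtype=np.float32)
    if kp.shape[0] == 0:
        return None
    top = kp[0]          # first detection, assumed to be the highest-scoring one
    xy = top[:, :2]      # keep x and y, drop the score column
    return xy.flatten()  # [x0, y0, x1, y1, ...]

def build_feature_matrix(all_keypoints, labels):
    """Stack usable poses into an (M, 34) matrix and keep the matching labels."""
    rows, kept = [], []
    for kp, label in zip(all_keypoints, labels):
        vec = flatten_top_pose(kp)
        if vec is not None:
            rows.append(vec)
            kept.append(label)
    if not rows:
        return np.empty((0, 2 * NUM_KEYPOINTS), dtype=np.float32), kept
    return np.stack(rows), kept
```

Dropping such images is only one option; keeping the feature rows and labels aligned is the part that matters, because a single row of the wrong length would make the later scaling step fail.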

Classifying cricket shots from the player's pose

First, we standardize the keypoint values, which helps speed up training.

# for normalization
from sklearn.preprocessing import StandardScaler
# define normalizer
scaler = StandardScaler()
# normalize keypoints
keypoints = scaler.fit_transform(keypoints)
# convert to an array
keypoints = np.array(keypoints)
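Standardization subtracts each column's mean and divides by its standard deviation. The NumPy sketch below reproduces that arithmetic on made-up numbers. It also illustrates a common variant, which is our suggestion rather than what the article does: computing the statistics on the training rows only and reusing them for the validation rows, so that no validation information enters the preprocessing.

```python
import numpy as np

def fit_standardizer(x):
    """Return per-column mean and standard deviation (population std, ddof=0)."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std = np.where(std == 0, 1.0, std)  # constant columns are left unscaled
    return mean, std

def standardize(x, mean, std):
    """Apply previously fitted statistics to an array with the same columns."""
    return (x - mean) / std

# Made-up values: the second column is constant on purpose.
x_train = np.array([[0.0, 10.0], [2.0, 10.0], [4.0, 10.0]])
x_valid = np.array([[2.0, 10.0], [6.0, 10.0]])

mean, std = fit_standardizer(x_train)      # statistics from training rows only
z_train = standardize(x_train, mean, std)
z_valid = standardize(x_valid, mean, std)  # reuse the same statistics
```

With scikit-learn the same pattern would be scaler.fit_transform on the training split followed by scaler.transform on the validation split.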

The keypoint values are now standardized. Next, we use label encoding to convert the targets, which are currently text, into numbers.

# converting the target categories into numbers
from sklearn.preprocessing import LabelEncoder
le = LabelEncoder()
y = le.fit_transform(labels)
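Label encoding amounts to a lookup table from class name to integer. The pure-Python sketch below mimics it by numbering the distinct labels in sorted order, which is also how LabelEncoder assigns its indices; the helper names are ours.

```python
def fit_label_map(labels):
    """Map each distinct label to an integer, in sorted order of the labels."""
    classes = sorted(set(labels))
    return {name: index for index, name in enumerate(classes)}

def encode(labels, label_map):
    """Replace every label by its integer index."""
    return [label_map[name] for name in labels]

def decode(indices, label_map):
    """Turn integer indices back into label names."""
    inverse = {index: name for name, index in label_map.items()}
    return [inverse[i] for i in indices]

shots = ["pull", "cut", "drive", "sweep", "cut"]
label_map = fit_label_map(shots)
encoded = encode(shots, label_map)

print(label_map)  # {'cut': 0, 'drive': 1, 'pull': 2, 'sweep': 3}
print(encoded)    # [2, 0, 1, 3, 0]
```

Keeping the mapping around is useful at prediction time, when a predicted index has to be turned back into a shot name (le.inverse_transform plays that role in scikit-learn).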

After that, we split the dataset into training and validation sets with the train_test_split function. We set the test size to 0.2, so 80% of the data is used for training and 20% for validation. Passing stratify=labels keeps the proportion of each shot type the same in both sets.

# for creating training and validation sets
from sklearn.model_selection import train_test_split
# split keypoints and labels in 80:20
x_tr, x_val, y_tr, y_val = train_test_split(keypoints, y, test_size=0.2, stratify=labels, random_state=120)
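To show what stratification means, here is a simplified pure-Python sketch that holds out the same fraction of every class. It illustrates the idea only and is not scikit-learn's algorithm, so it will not reproduce the exact indices that train_test_split returns for random_state=120.

```python
import random
from collections import defaultdict

def stratified_split(labels, test_size=0.2, seed=120):
    """Return (train_idx, valid_idx) with roughly test_size of each class held out."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train_idx, valid_idx = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)  # random order within the class
        n_valid = round(len(indices) * test_size)
        valid_idx.extend(indices[:n_valid])
        train_idx.extend(indices[n_valid:])
    return sorted(train_idx), sorted(valid_idx)

# Toy labels: 10 examples of each of the four shot types.
toy_labels = ["pull"] * 10 + ["cut"] * 10 + ["drive"] * 10 + ["sweep"] * 10
train_idx, valid_idx = stratified_split(toy_labels)

print(len(train_idx), len(valid_idx))  # 32 8
```

Applied to the 580 images of the article, an 80:20 split gives 464 training and 116 validation examples.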

To train on the keypoints and targets, we have to convert them to tensors. Here we convert the keypoints and targets of both the training and validation sets to PyTorch tensors.

# converting the keypoints and target value to tensor
import torch
x_tr = torch.Tensor(x_tr)
x_val = torch.Tensor(x_val)
y_tr = torch.Tensor(y_tr)
y_tr = y_tr.type(torch.long)
y_val = torch.Tensor(y_val)
y_val = y_val.type(torch.long)

Checking the shapes of the training and validation sets shows 464 images for training and 116 for validation.

# shape of training and validation set
(x_tr.shape, y_tr.shape), (x_val.shape, y_val.shape)

Now we define the architecture of the model, importing the building blocks we need from PyTorch. It is a simple neural network with a single hidden layer of 64 neurons.

The input layer takes 34 values, an x and a y coordinate for each of the 17 keypoints. The output layer has four neurons because there are four classes, and it uses a softmax activation so that it returns probabilities.

# importing libraries for defining the architecture of model
from torch.autograd import Variable
from torch.optim import Adam
from torch.nn import Linear, ReLU, Sequential, Softmax, CrossEntropyLoss
# defining the model architecture
model = Sequential(Linear(34, 64),
                   ReLU(),
                   Linear(64, 4),
                   Softmax(dim=1)  # softmax over the class dimension
                   )
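As a rough illustration of what this architecture computes, the NumPy sketch below runs one forward pass through a 34 -> 64 -> 4 network with random weights. The weights and the input batch are made up, so the output only demonstrates shapes, the softmax property that each row sums to 1, and the parameter count.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def forward(x, w1, b1, w2, b2):
    """34 -> 64 -> 4 network: linear, ReLU, linear, softmax."""
    hidden = np.maximum(x @ w1 + b1, 0.0)
    return softmax(hidden @ w2 + b2)

rng = np.random.default_rng(0)  # arbitrary random weights, for shape checking only
w1, b1 = rng.normal(size=(34, 64)) * 0.1, np.zeros(64)
w2, b2 = rng.normal(size=(64, 4)) * 0.1, np.zeros(4)

batch = rng.normal(size=(5, 34))  # five fake standardized pose vectors
probs = forward(batch, w1, b1, w2, b2)
n_params = w1.size + b1.size + w2.size + b2.size

print(probs.shape)  # (5, 4)
print(n_params)     # 34*64 + 64 + 64*4 + 4 = 2500
```

With only 2,500 trainable parameters, the classifier is tiny compared with the pose detector that produces its inputs.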

Next, we define the optimizer as Adam and the loss as cross-entropy, since this is a multi-class classification problem. We then move the model to the GPU if one is available.

# define optimizer and loss function
optimizer = Adam(model.parameters(), lr=0.01)
criterion = CrossEntropyLoss()
# checking if GPU is available
if torch.cuda.is_available():
    model = model.cuda()
    criterion = criterion.cuda()
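One detail is worth spelling out: PyTorch's CrossEntropyLoss applies a log-softmax to its input internally, so a model that already ends in a Softmax layer has softmax applied twice. The NumPy sketch below re-implements the loss to show the effect for four classes. It is an illustration of the arithmetic under that reading of the code, not a statement about how the original results were produced.

```python
import numpy as np

def log_softmax(z):
    """Row-wise log-softmax, shifted by the row maximum for stability."""
    shifted = z - z.max(axis=1, keepdims=True)
    return shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))

def cross_entropy(scores, targets):
    """Mean negative log-likelihood after a log-softmax over the scores."""
    log_probs = log_softmax(scores)
    return float(-log_probs[np.arange(len(targets)), targets].mean())

targets = np.array([0])
uniform = np.array([[0.25, 0.25, 0.25, 0.25]])  # untrained model: equal probabilities
one_hot = np.array([[1.0, 0.0, 0.0, 0.0]])      # fully confident, correct probabilities
big_logits = np.array([[20.0, 0.0, 0.0, 0.0]])  # raw scores, no softmax layer

loss_uniform = cross_entropy(uniform, targets)    # ln(4), about 1.386
loss_one_hot = cross_entropy(one_hot, targets)    # ln(1 + 3/e), about 0.744
loss_logits = cross_entropy(big_logits, targets)  # close to 0
```

When probabilities are fed to the loss, its value stays between roughly 0.744 and 1.386 however well the model fits, and a starting loss near 1.38 is what equal probabilities over four classes would give. A common alternative is to drop the final Softmax layer during training, pass the raw scores to the loss, and apply softmax only when probabilities are needed.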

Next, we define a function for training the model. It takes the current epoch number as input. Inside it, we initialize the loss to zero and wrap the training and validation sets in PyTorch Variables.

After moving the data to the GPU, we clear the gradients of the model parameters. We then get the model's predictions for the training and validation sets and store them in separate variables.

We compute the training and validation losses, then backpropagate the gradients of the training loss and update the parameters.

In addition, we print the validation loss every 10 epochs.

def train(epoch):
    model.train()
    tr_loss = 0
    # getting the training set
    x_train, y_train = Variable(x_tr), Variable(y_tr)
    # getting the validation set
    x_valid, y_valid = Variable(x_val), Variable(y_val)
    # converting the data into GPU format
    if torch.cuda.is_available():
        x_train = x_train.cuda()
        y_train = y_train.cuda()
        x_valid = x_valid.cuda()
        y_valid = y_valid.cuda()
    # clearing the gradients of the model parameters
    optimizer.zero_grad()
    # prediction for training and validation set
    output_train = model(x_train)
    output_val = model(x_valid)
    # computing the training and validation loss
    loss_train = criterion(output_train, y_train)
    loss_val = criterion(output_val, y_valid)
    # computing the updated weights of all the model parameters
    loss_train.backward()
    optimizer.step()
    if epoch % 10 == 0:
        # printing the validation loss
        print('Epoch :', epoch + 1, '\t', 'loss :', loss_val.item())

With the function defined, we use it to train the model. The number of epochs is set by n_epochs (the code below uses 100, although the write-up refers to a 400-epoch run). The model prints the validation loss every 10 epochs.

The loss starts at 1.38 and ends at 0.97, so the model's performance improves as training proceeds.

# defining the number of epochs
n_epochs = 100
# training the model
for epoch in range(n_epochs):
    train(epoch)
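The training function follows a fixed rhythm: forward pass on the whole set, loss, gradients, parameter update, and a log line every tenth epoch. The NumPy sketch below shows the same rhythm on a deliberately simpler setup, so it is a stand-in rather than the article's method: a linear softmax classifier trained with plain gradient descent (not Adam) on made-up two-dimensional clusters.

```python
import numpy as np

def softmax(z):
    shifted = z - z.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def mean_cross_entropy(probs, targets):
    return float(-np.log(probs[np.arange(len(targets)), targets]).mean())

def train_full_batch(x, y, n_classes, n_epochs=100, lr=0.5, log_every=10):
    """Fit a linear softmax classifier with plain full-batch gradient descent."""
    w = np.zeros((x.shape[1], n_classes))
    b = np.zeros(n_classes)
    history = []
    for epoch in range(n_epochs):
        probs = softmax(x @ w + b)                # forward pass on the whole set
        loss = mean_cross_entropy(probs, y)
        grad_scores = probs.copy()                # d(loss)/d(scores) = probs - one_hot
        grad_scores[np.arange(len(y)), y] -= 1.0
        grad_scores /= len(y)
        w -= lr * (x.T @ grad_scores)             # parameter update
        b -= lr * grad_scores.sum(axis=0)
        if epoch % log_every == 0:
            history.append((epoch + 1, loss))     # same logging cadence as train()
    return w, b, history

# Made-up data: four well-separated clusters, 25 points each.
rng = np.random.default_rng(0)
centers = np.array([[2.0, 0.0], [0.0, 2.0], [-2.0, 0.0], [0.0, -2.0]])
y_toy = np.repeat(np.arange(4), 25)
x_toy = centers[y_toy] + 0.3 * rng.normal(size=(100, 2))
w, b, history = train_full_batch(x_toy, y_toy, n_classes=4)
```

The first logged loss is ln(4), because zero weights give equal probabilities for the four classes, and later entries are lower as the classifier fits the clusters.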

Evaluating model performance

Let's evaluate the model by checking its accuracy on the validation set.

We import the accuracy function from sklearn and take the validation set, which consists of the keypoints and the target variable. We first move these values to the GPU if one is available, and then use the trained model to make predictions for the validation images.

Finally, the argmax function converts the predicted probabilities into the corresponding classes.

# to check the model performance
from sklearn.metrics import accuracy_score
# get validation accuracy
x, y = Variable(x_val), Variable(y_val)
if torch.cuda.is_available():
    x_val = x.cuda()
    y_val = y.cuda()
pred = model(x_val)
final_pred = np.argmax(pred.cpu().data.numpy(), axis=1)
accuracy_score(y_val.cpu(), final_pred)

The accuracy score comes out at 0.79, or roughly 80%.
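Accuracy is a single number, so it can hide which shot types get confused with each other. The NumPy sketch below shows the argmax step on made-up probabilities and adds a small confusion matrix; the numbers are invented for illustration and are not the article's results.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true class."""
    return float(np.mean(np.asarray(y_true) == np.asarray(y_pred)))

def confusion_matrix(y_true, y_pred, n_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = np.zeros((n_classes, n_classes), dtype=int)
    for t, p in zip(y_true, y_pred):
        matrix[t, p] += 1
    return matrix

# Made-up probabilities for five validation examples and four classes.
probs = np.array([
    [0.7, 0.1, 0.1, 0.1],
    [0.2, 0.5, 0.2, 0.1],
    [0.1, 0.2, 0.6, 0.1],
    [0.4, 0.3, 0.2, 0.1],
    [0.1, 0.1, 0.2, 0.6],
])
y_true = np.array([0, 1, 2, 1, 3])
y_pred = probs.argmax(axis=1)  # most probable class per row
acc = accuracy(y_true, y_pred)
cm = confusion_matrix(y_true, y_pred, n_classes=4)

print(y_pred.tolist())  # [0, 1, 2, 0, 3]
print(acc)              # 0.8
```

In this toy case the only error is a class-1 example predicted as class 0, which shows up as the off-diagonal entry cm[1, 0]. scikit-learn offers the same view through sklearn.metrics.confusion_matrix.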

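Conclusion

To improve accuracy, you can try different hyperparameters: for example, adding more hidden layers, changing the optimizer, changing the activation function, or increasing the number of epochs. That concludes this tutorial on building a model that classifies cricket shots from the player's pose.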
