Building an Inception Network from Scratch in Python

2021-07-25 11:30

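Introduction

Deep learning architectures are evolving rapidly as more and more efficient designs appear in research papers from around the world. These papers are dense with information and open the way to new architectures, but they are often hard to parse. Understanding one may take several readings, and sometimes reading related papers as well. The Inception paper is one of them.

The Inception network was an important milestone in the development of CNN image classifiers. Before this architecture, most popular CNN classifiers simply stacked convolutional layers deeper and deeper to get better performance.

The Inception network, by contrast, was carefully engineered and is quite complex. It uses many different techniques to improve performance, in both speed and accuracy.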

What is Inception?

The Inception network (GoogLeNet) is a well-known deep learning model introduced by Christian Szegedy, Wei Liu, Yangqing Jia, Pierre Sermanet, Scott Reed, Dragomir Anguelov, Dumitru Erhan, Vincent Vanhoucke, and Andrew Rabinovich in the 2014 paper "Going deeper with convolutions" [1].

Several versions of the Inception network followed. One of them was proposed by Christian Szegedy, Vincent Vanhoucke, Sergey Ioffe, Jonathon Shlens, and Zbigniew Wojna in the 2015 paper "Rethinking the Inception Architecture for Computer Vision" [2]. Inception models rank among the most popular and widely used deep learning models.

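Design principles

– A few general design principles and optimization techniques were proposed that proved useful for scaling up convolutional networks efficiently.

– Avoid representational bottlenecks, especially early in the network.

– A network with more diverse filters, and therefore more diverse feature maps, learns faster.

– Spatial aggregation can be performed on lower-dimensional embeddings without much loss of representational power.

– The best performance is reached by balancing the width and the depth of the network.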

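Inception module

Inception module (naive)

Source: 'Going Deeper with Convolutions' paper

An approximation of an optimal local sparse structure

Processes visual/spatial information at various scales, then aggregates it

Computationally, this design is somewhat optimistic

5×5 convolutions are especially expensive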

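Inception module (with dimension reduction)

Source: 'Going Deeper with Convolutions' paper

Dimension reduction is necessary and well motivated (Network in Network)

It is achieved with 1×1 convolutions

Think of it as learned pooling across depth, rather than max/average pooling across height/width.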
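To make the cost argument concrete, the sketch below counts multiply-accumulate operations for a 5×5 branch with and without a 1×1 reduction. The channel sizes (192 in, 16 after reduction, 32 out) match the first Inception block configured later in this article; the 28×28 feature-map size and the helper name conv_macs are assumptions made for illustration, and bias and batch-norm costs are ignored.

```python
def conv_macs(h, w, k, c_in, c_out):
    """Multiply-accumulates for a k x k convolution producing an h x w output."""
    return h * w * k * k * c_in * c_out


H = W = 28  # assumed feature-map size at this stage

# Naive branch: 5x5 convolution applied directly to all 192 input channels.
naive = conv_macs(H, W, 5, 192, 32)

# Reduced branch: 1x1 convolution down to 16 channels, then the 5x5 convolution.
reduced = conv_macs(H, W, 1, 192, 16) + conv_macs(H, W, 5, 16, 32)

print(naive, reduced, round(naive / reduced, 1))  # 120422400 12443648 9.7
```

Under these assumptions the reduced branch needs roughly a tenth of the arithmetic of the naive one.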

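Inception architecture

Using the Inception module with dimension reduction, a deep neural network architecture (Inception v1) was built. The architecture is shown in the figure below:

The Inception network stacks 9 of these Inception modules linearly. It is 22 layers deep (27 if pooling layers are counted). After the last Inception module it uses global average pooling.

The auxiliary classifiers attached to intermediate layers are built from the following parts:

A 1×1 convolution with 128 filters for dimension reduction, followed by rectified linear activation.

A fully connected layer with 1024 units and rectified linear activation.

A dropout layer that drops 70% of the outputs.

A linear layer with softmax loss as the classifier.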

The popular versions of the Inception network are:

Inception v1

Inception v2

Inception v3

Inception v4

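Inception-ResNet

Let's build Inception v1 (GoogLeNet) from scratch:

The Inception architecture repeatedly uses CNN blocks with different filter sizes such as 1×1, 3×3, and 5×5. So let's create a class for the CNN block that takes the input and output channel counts and applies BatchNorm2d and ReLU activation.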

import torch
import torch.nn as nn


class conv_block(nn.Module):
    """Conv2d -> BatchNorm2d -> ReLU."""

    def __init__(self, in_channels, out_channels, **kwargs):
        super(conv_block, self).__init__()
        self.relu = nn.ReLU()
        self.conv = nn.Conv2d(in_channels, out_channels, **kwargs)
        self.batchnorm = nn.BatchNorm2d(out_channels)

    def forward(self, x):
        return self.relu(self.batchnorm(self.conv(x)))

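Next, create a class for the Inception module with dimension reduction. Following the figure above, its arguments are the output channels of the 1×1 branch (out_1x1), the 3×3 reduction and 3×3 output (red_3x3, out_3x3), the 5×5 reduction and 5×5 output (red_5x5, out_5x5), and the 1×1 convolution after pooling (out_1x1pool).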

class Inception_block(nn.Module):
    def __init__(
        self, in_channels, out_1x1, red_3x3, out_3x3, red_5x5, out_5x5, out_1x1pool
    ):
        super(Inception_block, self).__init__()
        # Branch 1: 1x1 convolution
        self.branch1 = conv_block(in_channels, out_1x1, kernel_size=(1, 1))
        # Branch 2: 1x1 reduction followed by a 3x3 convolution
        self.branch2 = nn.Sequential(
            conv_block(in_channels, red_3x3, kernel_size=(1, 1)),
            conv_block(red_3x3, out_3x3, kernel_size=(3, 3), padding=(1, 1)),
        )
        # Branch 3: 1x1 reduction followed by a 5x5 convolution
        self.branch3 = nn.Sequential(
            conv_block(in_channels, red_5x5, kernel_size=(1, 1)),
            conv_block(red_5x5, out_5x5, kernel_size=(5, 5), padding=(2, 2)),
        )
        # Branch 4: 3x3 max pooling followed by a 1x1 convolution
        self.branch4 = nn.Sequential(
            nn.MaxPool2d(kernel_size=(3, 3), stride=(1, 1), padding=(1, 1)),
            conv_block(in_channels, out_1x1pool, kernel_size=(1, 1)),
        )

    def forward(self, x):
        # Concatenate the four branch outputs along the channel dimension
        return torch.cat(
            [self.branch1(x), self.branch2(x), self.branch3(x), self.branch4(x)], 1
        )
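Because the four branches are concatenated along the channel dimension and each branch preserves height and width, the block's output channel count is simply the sum of the four branch outputs. The sketch below checks both facts with plain arithmetic; the helper names inception_out_channels and keeps_size are hypothetical and not part of the model code, and the 28-pixel side length is just an example value.

```python
def inception_out_channels(out_1x1, out_3x3, out_5x5, out_1x1pool):
    """Channel count after concatenating the four branch outputs."""
    return out_1x1 + out_3x3 + out_5x5 + out_1x1pool


def keeps_size(kernel, padding, stride=1, n=28):
    """True if a conv/pool layer with these settings leaves side length n unchanged."""
    return (n + 2 * padding - kernel) // stride + 1 == n


# Kernel/padding pairs used by the 1x1, 3x3, 5x5 and pooling layers in the block.
same_size = all(keeps_size(k, p) for k, p in [(1, 0), (3, 1), (5, 2), (3, 1)])

# Example values: out_1x1=64, out_3x3=128, out_5x5=32, out_1x1pool=32
total = inception_out_channels(64, 128, 32, 32)
print(same_size, total)  # True 256
```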

Let's keep the figure below as a reference and start building the network.

Source: 'Going Deeper with Convolutions' paper

Create a class for GoogLeNet

class GoogLeNet(nn.Module):
    def __init__(self, aux_logits=True, num_classes=1000):
        super(GoogLeNet, self).__init__()
        assert isinstance(aux_logits, bool)
        self.aux_logits = aux_logits
        # Arguments are spelled out for conv1; later layers use the compact
        # form, e.g. kernel_size=3 instead of (3, 3)
        self.conv1 = conv_block(
            in_channels=3,
            out_channels=64,
            kernel_size=(7, 7),
            stride=(2, 2),
            padding=(3, 3),
        )
        self.maxpool1 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.conv2 = conv_block(64, 192, kernel_size=3, stride=1, padding=1)
        self.maxpool2 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        # In this order: in_channels, out_1x1, red_3x3, out_3x3, red_5x5, out_5x5, out_1x1pool
        self.inception3a = Inception_block(192, 64, 96, 128, 16, 32, 32)
        self.inception3b = Inception_block(256, 128, 128, 192, 32, 96, 64)
        self.maxpool3 = nn.MaxPool2d(kernel_size=(3, 3), stride=2, padding=1)
        self.inception4a = Inception_block(480, 192, 96, 208, 16, 48, 64)
        self.inception4b = Inception_block(512, 160, 112, 224, 24, 64, 64)
        self.inception4c = Inception_block(512, 128, 128, 256, 24, 64, 64)
        self.inception4d = Inception_block(512, 112, 144, 288, 32, 64, 64)
        self.inception4e = Inception_block(528, 256, 160, 320, 32, 128, 128)
        self.maxpool4 = nn.MaxPool2d(kernel_size=3, stride=2, padding=1)
        self.inception5a = Inception_block(832, 256, 160, 320, 32, 128, 128)
        self.inception5b = Inception_block(832, 384, 192, 384, 48, 128, 128)
        self.avgpool = nn.AvgPool2d(kernel_size=7, stride=1)
        self.dropout = nn.Dropout(p=0.4)
        self.fc1 = nn.Linear(1024, num_classes)
        if self.aux_logits:
            self.aux1 = InceptionAux(512, num_classes)
            self.aux2 = InceptionAux(528, num_classes)
        else:
            self.aux1 = self.aux2 = None

    def forward(self, x):
        x = self.conv1(x)
        x = self.maxpool1(x)
        x = self.conv2(x)
        # x = self.conv3(x)
        x = self.maxpool2(x)
        x = self.inception3a(x)
        x = self.inception3b(x)
        x = self.maxpool3(x)
        x = self.inception4a(x)
        # Auxiliary Softmax classifier 1
        if self.aux_logits and self.training:
            aux1 = self.aux1(x)
        x = self.inception4b(x)
        x = self.inception4c(x)
        x = self.inception4d(x)
        # Auxiliary Softmax classifier 2
        if self.aux_logits and self.training:
            aux2 = self.aux2(x)
        x = self.inception4e(x)
        x = self.maxpool4(x)
        x = self.inception5a(x)
        x = self.inception5b(x)
        x = self.avgpool(x)
        x = x.reshape(x.shape[0], -1)
        x = self.dropout(x)
        x = self.fc1(x)
        if self.aux_logits and self.training:
            return aux1, aux2, x
        else:
            return x
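The layer arguments above only fit together if the spatial sizes and channel counts line up. The following sketch traces both with plain arithmetic, assuming a 224×224 input and the standard floor-mode output-size formula for convolution and pooling layers; the helper name out_size is hypothetical and the block configurations are copied from the class above.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Side length after a conv/pool layer (floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1


size = 224                       # assumed input resolution
size = out_size(size, 7, 2, 3)   # conv1    -> 112
size = out_size(size, 3, 2, 1)   # maxpool1 -> 56
size = out_size(size, 3, 1, 1)   # conv2    -> 56
size = out_size(size, 3, 2, 1)   # maxpool2 -> 28
size = out_size(size, 3, 2, 1)   # maxpool3 -> 14 (Inception blocks keep the size)
size = out_size(size, 3, 2, 1)   # maxpool4 -> 7
size = out_size(size, 7, 1, 0)   # avgpool  -> 1

# (in_channels, out_1x1, red_3x3, out_3x3, red_5x5, out_5x5, out_1x1pool)
blocks = [
    (192, 64, 96, 128, 16, 32, 32),      # inception3a
    (256, 128, 128, 192, 32, 96, 64),    # inception3b
    (480, 192, 96, 208, 16, 48, 64),     # inception4a
    (512, 160, 112, 224, 24, 64, 64),    # inception4b
    (512, 128, 128, 256, 24, 64, 64),    # inception4c
    (512, 112, 144, 288, 32, 64, 64),    # inception4d
    (528, 256, 160, 320, 32, 128, 128),  # inception4e
    (832, 256, 160, 320, 32, 128, 128),  # inception5a
    (832, 384, 192, 384, 48, 128, 128),  # inception5b
]

channels = 192  # output channels of conv2
for cfg in blocks:
    # Each block must accept exactly what the previous layer produced.
    assert cfg[0] == channels
    channels = cfg[1] + cfg[3] + cfg[5] + cfg[6]

print(size, channels)  # 1 1024
```

The final 1×1 map with 1024 channels is what lets the flattened tensor feed a linear layer with 1024 inputs.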

Then define a class for the auxiliary classifier, using dropout = 0.7 as stated in the paper and a linear layer that outputs num_classes scores for the softmax.

class InceptionAux(nn.Module):
    def __init__(self, in_channels, num_classes):
        super(InceptionAux, self).__init__()
        self.relu = nn.ReLU()
        self.dropout = nn.Dropout(p=0.7)
        self.pool = nn.AvgPool2d(kernel_size=5, stride=3)
        self.conv = conv_block(in_channels, 128, kernel_size=1)
        self.fc1 = nn.Linear(2048, 1024)
        self.fc2 = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = self.pool(x)
        x = self.conv(x)
        x = x.reshape(x.shape[0], -1)
        x = self.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        return x
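The 2048 inputs of fc1 follow from the pooling arithmetic. The sketch below reproduces that number, assuming the auxiliary classifier receives a 14×14 feature map (which is the case for a 224×224 input at the points where it is attached); the helper name out_size is hypothetical.

```python
def out_size(n, kernel, stride=1, padding=0):
    """Side length after a conv/pool layer (floor mode)."""
    return (n + 2 * padding - kernel) // stride + 1


side = out_size(14, kernel=5, stride=3)  # AvgPool2d(kernel_size=5, stride=3)
flat = 128 * side * side                 # 128 channels from the 1x1 convolution
print(side, flat)  # 4 2048
```

With a different input resolution the flattened size changes, and the first linear layer would need to be resized to match.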

The classes in the program should then be arranged as follows.

– Class GoogLeNet

– Class Inception_block

– Class InceptionAux

– Class conv_block

Finally, let's write a short piece of test code to check that the model works.

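if __name__ == "__main__":
    # N = 3 (mini-batch size)
    x = torch.randn(3, 3, 224, 224)
    model = GoogLeNet(aux_logits=True, num_classes=1000)
    # A new module starts in training mode, so the model returns
    # (aux1, aux2, out); index 2 is the main classifier output.
    print(model(x)[2].shape)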

The output should look like this: