How to avoid these 8 common deep learning / computer vision mistakes?

2019-10-28 09:47

PanChuang AI (磐創AI)

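People are not perfect, and we make mistakes in our programs all the time. Sometimes these mistakes are easy to spot: the code does not work at all, the application crashes, and so on. But some bugs are hidden, and that makes them more dangerous.

Deep learning problems are especially prone to this kind of bug because of the uncertainty involved: it is easy to see whether a web application endpoint routes requests correctly, but much harder to check whether a gradient descent step is correct. Still, many of the mistakes a DL practitioner runs into can be avoided.

I would like to share some of my experience with the mistakes I have seen or made during the past two years of computer vision work. I have spoken about this topic at a conference, and many people told me afterwards: "Yeah, man, I have had plenty of these bugs too." I hope this article helps you avoid at least some of them.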

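1. Flipping images and keypoints

Suppose someone is working on a keypoint detection problem. The data consists of an image and a sequence of keypoint tuples, for example [(0, 1), (2, 2)], where each keypoint is a pair of coordinates (the code below unpacks them in (y, x) order).

Let's implement a basic augmentation for this data: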

import numpy as np
from typing import Sequence

def flip_img_and_keypoints(img: np.ndarray, kpts: Sequence[Sequence[int]]):
    img = np.fliplr(img)
    h, w, *_ = img.shape
    kpts = [(y, w - x) for y, x in kpts]
    return img, kpts

It looks correct. Still, let's visualize the result:

import matplotlib.pyplot as plt

image = np.ones((10, 10), dtype=np.float32)
kpts = [(0, 1), (2, 2)]
image_flipped, kpts_flipped = flip_img_and_keypoints(image, kpts)
img1 = image.copy()
for y, x in kpts:
    img1[y, x] = 0
img2 = image_flipped.copy()
for y, x in kpts_flipped:
    img2[y, x] = 0
_ = plt.imshow(np.hstack((img1, img2)))

The asymmetry looks odd! What if we check an extreme case?

image = np.ones((10, 10), dtype=np.float32)
kpts = [(0, 0), (1, 1)]
image_flipped, kpts_flipped = flip_img_and_keypoints(image, kpts)
img1 = image.copy()
for y, x in kpts:
    img1[y, x] = 0
img2 = image_flipped.copy()
for y, x in kpts_flipped:
    img2[y, x] = 0

Out:

IndexError                                Traceback (most recent call last)
<ipython-input-5-997162463eae> in <module>
      8 img2 = image_flipped.copy()
      9 for y, x in kpts_flipped:
---> 10     img2[y, x] = 0
IndexError: index 10 is out of bounds for axis 1 with size 10

The program crashes! This is a classic off-by-one error. The correct code looks like this:

def flip_img_and_keypoints(img: np.ndarray, kpts: Sequence[Sequence[int]]):
    img = np.fliplr(img)
    h, w, *_ = img.shape
    kpts = [(y, w - x - 1) for y, x in kpts]
    return img, kpts

Visualization is one way to detect this problem, and a unit test at the point x = 0 would help as well.
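Such a unit test might look like the sketch below. It repeats the corrected function so that it runs on its own; the test name and the chosen keypoints are illustrative assumptions, not part of the original code.

```python
import numpy as np
from typing import List, Sequence, Tuple


def flip_img_and_keypoints(
    img: np.ndarray, kpts: Sequence[Sequence[int]]
) -> Tuple[np.ndarray, List[Tuple[int, int]]]:
    """Horizontally flip an image together with its (y, x) keypoints."""
    img = np.fliplr(img)
    h, w, *_ = img.shape
    # The last valid column index is w - 1, hence the "- 1".
    kpts = [(y, w - x - 1) for y, x in kpts]
    return img, kpts


def test_flip_edge_columns() -> None:
    """Mark keypoints on both border columns and check they survive a flip."""
    image = np.ones((10, 10), dtype=np.float32)
    kpts = [(0, 0), (3, 9)]  # x = 0 and x = w - 1 are the risky cases
    for y, x in kpts:
        image[y, x] = 0

    flipped, kpts_flipped = flip_img_and_keypoints(image, kpts)

    assert kpts_flipped == [(0, 9), (3, 0)]
    # Each flipped keypoint must land on a pixel that was marked.
    for y, x in kpts_flipped:
        assert flipped[y, x] == 0
    # Flipping twice must give back the original keypoints.
    _, kpts_twice = flip_img_and_keypoints(flipped, kpts_flipped)
    assert kpts_twice == kpts


test_flip_edge_columns()
```

With the buggy `w - x` version, the first assertion fails immediately, because the keypoint at x = 0 is mapped to column 10.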

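2. Keypoints again

Even after the bug above is fixed, a problem remains. This time it is more about semantics than about code.

Suppose we need to augment images that contain two palms. It looks safe: hands are still hands after a left-right flip.

But wait! We know nothing about the semantics of our keypoints. What if the keypoints actually mean something like this: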

kpts = [
    (20, 20),   # left pinky
    (20, 200),  # right pinky
    ...
]

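This means the augmentation actually changes the semantics: left becomes right and right becomes left, but we do not swap the keypoint indices in the array. This brings a lot of noise into training and leads to worse metrics.

Lessons to learn:

- know and think about the data structure and semantics before applying augmentations or other features;

- keep your experiments atomic: add one small change (for example, a new transform), check how it goes, and merge it if the score improves.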
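One possible way to keep the semantics intact is to reorder the keypoints after mirroring them. The sketch below assumes a hypothetical two-keypoint layout (index 0 is the left pinky, index 1 the right pinky); `SWAP_PAIRS` and `flip_with_semantics` are illustrative names, and a real dataset would need its own table of mirrored pairs.

```python
import numpy as np
from typing import Dict, List, Sequence, Tuple

# Hypothetical layout: index 0 is the left pinky, index 1 the right pinky.
# A real dataset would define its own pairs of mirrored keypoints.
SWAP_PAIRS: Dict[int, int] = {0: 1, 1: 0}


def flip_with_semantics(
    img: np.ndarray,
    kpts: Sequence[Tuple[int, int]],
    swap_pairs: Dict[int, int],
) -> Tuple[np.ndarray, List[Tuple[int, int]]]:
    """Flip image and (y, x) keypoints, then reorder mirrored keypoints."""
    flipped_img = np.fliplr(img)
    w = img.shape[1]
    mirrored = [(y, w - x - 1) for y, x in kpts]
    # Position i of the output must hold the keypoint whose meaning is i.
    # After mirroring, that point sits at the index of its left/right partner.
    reordered = [mirrored[swap_pairs.get(i, i)] for i in range(len(mirrored))]
    return flipped_img, reordered


image = np.ones((100, 300), dtype=np.float32)
kpts = [(20, 20), (20, 200)]  # left pinky, right pinky
_, new_kpts = flip_with_semantics(image, kpts, SWAP_PAIRS)
print(new_kpts)  # [(20, 99), (20, 279)]
```

Keypoints without a mirrored partner (not present in this toy layout) keep their index, which is why the lookup falls back to `i`.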

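3. Coding a custom loss function

Those who are familiar with semantic segmentation probably know the IoU metric. Unfortunately, we cannot optimize it directly with SGD, so a common trick is to approximate it with a differentiable loss function. Let's code one!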

def iou_continuous_loss(y_pred, y_true):
    eps = 1e-6

    def _sum(x):
        return x.sum(-1).sum(-1)

    numerator = (_sum(y_true * y_pred) + eps)
    denominator = (_sum(y_true ** 2) + _sum(y_pred ** 2)
                   - _sum(y_true * y_pred) + eps)
    return (numerator / denominator).mean()

Looks good. Let's run a small check:

In [3]: ones = np.ones((1, 3, 10, 10))
   ...: x1 = iou_continuous_loss(ones * 0.01, ones)
   ...: x2 = iou_continuous_loss(ones * 0.99, ones)
In [4]: x1, x2
Out[4]: (0.010099999897990103, 0.9998990001020204)

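In x1 we computed the loss for a prediction that is completely different from the ground truth, and in x2 for a prediction that is very close to it. We expect x1 to be large because the prediction is bad, and x2 to be close to zero. The result is the opposite. What went wrong?

The function above is a good approximation of the metric. But a metric is not a loss: usually (including this case) the higher it is, the better. Since SGD minimizes the loss, we need to flip the direction: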

def iou_continuous(y_pred, y_true):
    eps = 1e-6

    def _sum(x):
        return x.sum(-1).sum(-1)

    numerator = (_sum(y_true * y_pred) + eps)
    denominator = (_sum(y_true ** 2) + _sum(y_pred ** 2)
                   - _sum(y_true * y_pred) + eps)
    return (numerator / denominator).mean()


def iou_continuous_loss(y_pred, y_true):
    return 1 - iou_continuous(y_pred, y_true)

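There are two ways to detect such problems:

- write a unit test that checks the direction of the loss: a prediction closer to the ground truth must give a lower loss;

- run a sanity check, for example by trying to overfit the model to a single image.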
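A direction test for the loss could be as small as the sketch below. It reimplements the corrected functions with NumPy so that it runs on its own; the test name and the three sample predictions are assumptions chosen for illustration.

```python
import numpy as np


def iou_continuous(y_pred: np.ndarray, y_true: np.ndarray) -> float:
    """Soft IoU over the last two axes, averaged over the remaining ones."""
    eps = 1e-6

    def _sum(x: np.ndarray) -> np.ndarray:
        return x.sum(-1).sum(-1)

    intersection = _sum(y_true * y_pred)
    union = _sum(y_true ** 2) + _sum(y_pred ** 2) - intersection
    return float(((intersection + eps) / (union + eps)).mean())


def iou_continuous_loss(y_pred: np.ndarray, y_true: np.ndarray) -> float:
    return 1.0 - iou_continuous(y_pred, y_true)


def test_loss_direction() -> None:
    y_true = np.ones((1, 3, 10, 10))
    bad = iou_continuous_loss(y_true * 0.01, y_true)
    good = iou_continuous_loss(y_true * 0.99, y_true)
    perfect = iou_continuous_loss(y_true, y_true)
    # Closer to the ground truth must mean a lower loss.
    assert perfect < good < bad
    # A perfect prediction should cost (almost) nothing.
    assert abs(perfect) < 1e-6


test_loss_direction()
```

If the metric is returned as the loss by mistake, the ordering assertion fails, so the bug surfaces before any training run.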

4. When we meet PyTorch

Suppose someone has a pre-trained model. Let's write a Predictor class based on the ceevee API.

import torch
from ceevee.base import AbstractPredictor


class MySuperPredictor(AbstractPredictor):
    def __init__(self,
                 weights_path: str,
                 ):
        super().__init__()
        self.model = self._load_model(weights_path=weights_path)

    def process(self, x, *kw):
        with torch.no_grad():
            res = self.model(x)
        return res

    @staticmethod
    def _load_model(weights_path):
        model = ModelClass()
        weights = torch.load(weights_path, map_location='cpu')
        model.load_state_dict(weights)
        return model

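Is this code correct? Maybe! It is indeed correct for some models, for example when the model has no dropout or normalization layers such as torch.nn.BatchNorm2d.

But for most computer vision applications the code misses something important: switching to evaluation mode.

This problem is easy to notice when converting a dynamic PyTorch graph into a static one. The torch.jit module is used for such a conversion.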

In [3]: model = nn.Sequential(
   ...:     nn.Linear(10, 10),
   ...:     nn.Dropout(.5)
   ...: )
   ...:
   ...: traced_model = torch.jit.trace(model, torch.rand(10))
/Users/Arseny/.pyenv/versions/3.6.6/lib/python3.6/site-packages/torch/jit/__init__.py:914: TracerWarning: Trace had nondeterministic nodes. Did you forget call .eval() on your model? Nodes:
    %12 : Float(10) = aten::dropout(%input, %10, %11), scope: Sequential/Dropout[1] # /Users/Arseny/.pyenv/versions/3.6.6/lib/python3.6/site-packages/torch/nn/functional.py:806:0
This may cause errors in trace checking. To disable trace checking, pass check_trace=False to torch.jit.trace()
  check_tolerance, _force_outplace, True, _module_class)
/Users/Arseny/.pyenv/versions/3.6.6/lib/python3.6/site-packages/torch/jit/__init__.py:914: TracerWarning: Output nr 1. of the traced function does not match the corresponding output of the Python function. Detailed error:
Not within tolerance rtol=1e-05 atol=1e-05 at input[5] (0.0 vs. 0.5454154014587402) and 5 other locations (60.00%)
  check_tolerance, _force_outplace, True, _module_class)

A simple fix:

In [4]: model = nn.Sequential(
   ...:     nn.Linear(10, 10),
   ...:     nn.Dropout(.5)
   ...: )
   ...:
   ...: traced_model = torch.jit.trace(model.eval(), torch.rand(10))
# No more warnings!

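torch.jit.trace runs the model several times and compares the results, which is why the difference is noticeable here.
However, torch.jit.trace is not a cure-all. Switching to evaluation mode is something you should simply know and remember.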
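The effect of evaluation mode can also be checked directly, without tracing. The sketch below uses a toy two-layer model (an assumption made for illustration, not the predictor's real model) and shows that dropout becomes an identity after `eval()`. In a predictor like the one above, the same call would presumably go right after `load_state_dict`.

```python
import torch
from torch import nn

torch.manual_seed(0)

model = nn.Sequential(
    nn.Linear(10, 10),
    nn.Dropout(0.5),
)
x = torch.rand(4, 10)

# Training mode (the default): dropout zeroes random activations,
# so the output can differ from call to call.
model.train()
with torch.no_grad():
    train_out = model(x)

# Evaluation mode: dropout becomes an identity, so the forward pass
# is deterministic and equals the output of the linear layer alone.
model.eval()
with torch.no_grad():
    eval_out_1 = model(x)
    eval_out_2 = model(x)
    linear_only = model[0](x)

assert torch.equal(eval_out_1, eval_out_2)
assert torch.equal(eval_out_1, linear_only)
assert not model.training
```

Note that `torch.no_grad()` only disables gradient tracking; it does not change how dropout or batch normalization behave, so it is not a substitute for `eval()`.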

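5. Copy-paste issues

Many things come in pairs: train and validation, width and height, latitude and longitude... If you read carefully, you will easily spot a bug caused by copy-pasting from one member of a pair to the other: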

from multiprocessing import cpu_count
from torch.utils.data import DataLoader

def make_dataloaders(train_cfg, val_cfg, batch_size):
    train = Dataset.from_config(train_cfg)
    val = Dataset.from_config(val_cfg)
    shared_params = {'batch_size': batch_size, 'shuffle': True, 'num_workers': cpu_count()}
    train = DataLoader(train, **shared_params)
    val = DataLoader(train, **shared_params)
    return train, val

I am not the only one who makes silly mistakes. For example, there was a similar one in the popular albumentations library.

# https://github.com/albu/albumentations/blob/0.3.0/albumentations/augmentations/transforms.py
def apply_to_keypoint(self, keypoint, crop_height=0, crop_width=0, h_start=0, w_start=0, rows=0, cols=0, **params):
    keypoint = F.keypoint_random_crop(keypoint, crop_height, crop_width, h_start, w_start, rows, cols)
    scale_x = self.width / crop_height
    scale_y = self.height / crop_height
    keypoint = F.keypoint_scale(keypoint, scale_x, scale_y)
    return keypoint

Don't worry, it has already been fixed.
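For illustration, the intended scale computation presumably pairs width with width and height with height. The helper below is a hypothetical stand-alone sketch, not the library's actual patch.

```python
from typing import Tuple


def crop_to_target_scale(
    crop_height: int, crop_width: int, target_height: int, target_width: int
) -> Tuple[float, float]:
    """Return (scale_x, scale_y) that map a crop onto the target size."""
    # x runs along the width, y along the height; mixing them up is
    # exactly the copy-paste slip shown in the snippet above.
    scale_x = target_width / crop_width
    scale_y = target_height / crop_height
    return scale_x, scale_y


# A non-square crop makes a width/height mix-up visible immediately.
scale_x, scale_y = crop_to_target_scale(
    crop_height=50, crop_width=100, target_height=200, target_width=200
)
print(scale_x, scale_y)  # 2.0 4.0
```

Testing with a non-square crop is the important part: with a square crop both versions return the same numbers, and the bug stays hidden.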

How to avoid it? Try to write code in a way that does not require copying and pasting.

The following style is not a good one:

datasets = []
data_a = get_dataset(MyDataset(config['dataset_a']), config['shared_param'], param_a)
datasets.append(data_a)
data_b = get_dataset(MyDataset(config['dataset_b']), config['shared_param'], param_b)
datasets.append(data_b)

While this one looks much better:

datasets = []
for name, param in zip(('dataset_a', 'dataset_b'),
                       (param_a, param_b),
                       ):
    datasets.append(get_dataset(MyDataset(config[name]), config['shared_param'], param))

6. Proper data types

Let's write a new augmentation:

def add_noise(img: np.ndarray) -> np.ndarray:
    mask = np.random.rand(*img.shape) + .5
    img = img.astype('float32') * mask
    return img.astype('uint8')

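The image has been altered. Is this what we expected? Hmm, maybe it has been altered a bit too much.

There is a dangerous operation here: casting float32 to uint8. It may cause an overflow. Clipping the values before the cast fixes it: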

import cv2

def add_noise(img: np.ndarray) -> np.ndarray:
    mask = np.random.rand(*img.shape) + .5
    img = img.astype('float32') * mask
    return np.clip(img, 0, 255).astype('uint8')

img = add_noise(cv2.imread('two_hands.jpg')[:, :, ::-1])
_ = plt.imshow(img)

Looks much better, huh?
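A tiny numeric example may make the clipping step more concrete. The pixel values and the hand-picked noise factors below are assumptions chosen so that one product exceeds 255.

```python
import numpy as np

pixels = np.array([200, 200, 200], dtype=np.uint8)
factors = np.array([0.5, 1.0, 1.4], dtype=np.float32)  # hand-picked "noise"

noisy = pixels.astype("float32") * factors  # roughly [100, 200, 280]
assert noisy.max() > 255  # the last value no longer fits into uint8

# Clip to the valid range first, then cast.
safe = np.clip(noisy, 0, 255).astype("uint8")
print(safe.tolist())  # [100, 200, 255]
```

Casting the unclipped value directly gives a result that is not a meaningful brightness, which is why the bright areas of the image looked wrong.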

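By the way, there is one more way to avoid this problem: don't reinvent the wheel. Instead of writing augmentation code from scratch, use an existing one, such as albumentations.augmentations.transforms.GaussNoise.

I once made another mistake of the same kind.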

raw_mask = cv2.imread('mask_small.png')
mask = raw_mask.astype('float32') / 255
mask = cv2.resize(mask, (64, 64), interpolation=cv2.INTER_LINEAR)
mask = cv2.resize(mask, (128, 128), interpolation=cv2.INTER_CUBIC)
mask = (mask * 255).astype('uint8')
_ = plt.imshow(np.hstack((raw_mask, mask)))

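What went wrong here? First of all, resizing a mask with cubic interpolation is a bad idea. The problem is the same as with casting float32 to uint8: cubic interpolation can output values larger than the input, which leads to an overflow.

I found this problem while doing visualization. Putting assertions here and there in your training loop is also a good idea.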
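The overshoot can be reproduced in one dimension. The sketch below uses a Catmull-Rom cubic as a stand-in; that is an assumption, since OpenCV's exact kernel may differ in its coefficients, but the behavior near an edge is of the same kind.

```python
import numpy as np


def catmull_rom(p0: float, p1: float, p2: float, p3: float, t: float) -> float:
    """Cubic (Catmull-Rom) interpolation between p1 and p2 at t in [0, 1]."""
    return 0.5 * (
        2 * p1
        + (-p0 + p2) * t
        + (2 * p0 - 5 * p1 + 4 * p2 - p3) * t ** 2
        + (-p0 + 3 * p1 - 3 * p2 + p3) * t ** 3
    )


# Four neighbouring mask values right after an edge: 0, 1, 1, 1.
value = catmull_rom(0.0, 1.0, 1.0, 1.0, t=0.5)
print(value)  # 1.0625

# The interpolated value exceeds 1, so value * 255 does not fit into uint8.
assert value * 255 > 255

# Clipping back to [0, 1] before scaling keeps the cast safe.
safe = (np.clip(np.array([value]), 0.0, 1.0) * 255).astype("uint8")
assert safe[0] == 255
```

For masks, nearest-neighbour or linear interpolation never leaves the input range, so clipping is only needed when a cubic resize is kept.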

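7. Typos happen

Suppose one needs to run inference with a fully convolutional network (for example, for semantic segmentation) on a huge image. The image is so big that there is no chance to fit it on your GPU; it may be a medical or satellite image, for instance.

In such a case one can slice the image into a grid, run inference on each tile independently, and finally merge the results. Also, some overlap between the predictions can help smooth out artifacts near the tile borders.

Let's code it!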

from typing import Optional

import numpy as np
from tqdm import tqdm


class GridPredictor:
    """
    This class can be used to predict a segmentation mask for a big image
    when you have a GPU memory limit
    """
    def __init__(self, predictor: AbstractPredictor, size: int, stride: Optional[int] = None):
        self.predictor = predictor
        self.size = size
        self.stride = stride if stride is not None else size // 2

    def __call__(self, x: np.ndarray):
        h, w, _ = x.shape
        mask = np.zeros((h, w, 1), dtype='float32')
        weights = mask.copy()
        for i in tqdm(range(0, h - 1, self.stride)):
            for j in range(0, w - 1, self.stride):
                a, b, c, d = i, min(h, i + self.size), j, min(w, j + self.size)
                patch = x[a:b, c:d, :]
                mask[a:b, c:d, :] += np.expand_dims(self.predictor(patch), -1)
                weights[a:b, c:d, :] = 1
        return mask / weights

There is a one-symbol typo in this code. It is easy to detect by checking whether the code behaves correctly:

from torch import nn

class Model(nn.Module):
    def forward(self, x):
        return x.mean(axis=-1)

model = Model()
grid_predictor = GridPredictor(model, size=128, stride=64)
simple_pred = np.expand_dims(model(img), -1)
grid_pred = grid_predictor(img)
np.testing.assert_allclose(simple_pred, grid_pred, atol=.001)

AssertionError                            Traceback (most recent call last)
<ipython-input-24-a72034c717e9> in <module>
      9 grid_pred = grid_predictor(img)
     10
---> 11 np.testing.assert_allclose(simple_pred, grid_pred, atol=.001)

~/.pyenv/versions/3.6.6/lib/python3.6/site-packages/numpy/testing/_private/utils.py in assert_allclose(actual, desired, rtol, atol, equal_nan, err_msg, verbose)
   1513     header = 'Not equal to tolerance rtol=%g, atol=%g' % (rtol, atol)
   1514     assert_array_compare(compare, actual, desired, err_msg=str(err_msg),
-> 1515                          verbose=verbose, header=header, equal_nan=equal_nan)
   1516
   1517

~/.pyenv/versions/3.6.6/lib/python3.6/site-packages/numpy/testing/_private/utils.py in assert_array_compare(comparison, x, y, err_msg, verbose, header, precision, equal_nan, equal_inf)
    839                                 verbose=verbose, header=header,
    840                                 names=('x', 'y'), precision=precision)
--> 841             raise AssertionError(msg)
    842         except ValueError:
    843             import traceback

AssertionError:
Not equal to tolerance rtol=1e-07, atol=0.001
Mismatch: 99.6%
Max absolute difference: 765.
Max relative difference: 0.75000001
 x: array([[[215.333333],
        [192.666667],
        [250.      ],...
 y: array([[[ 215.33333],
        [ 192.66667],
        [ 250.     ],...

The correct version of the __call__ method is below:

    def __call__(self, x: np.ndarray):
        h, w, _ = x.shape
        mask = np.zeros((h, w, 1), dtype='float32')
        weights = mask.copy()
        for i in tqdm(range(0, h - 1, self.stride)):
            for j in range(0, w - 1, self.stride):
                a, b, c, d = i, min(h, i + self.size), j, min(w, j + self.size)
                patch = x[a:b, c:d, :]
                mask[a:b, c:d, :] += np.expand_dims(self.predictor(patch), -1)
                weights[a:b, c:d, :] += 1
        return mask / weights

If you still don't see what the problem was, pay attention to the line weights[a:b, c:d, :] += 1.
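The same check can be written as a self-contained NumPy sketch. The function names, the random test image and the channel-mean stand-in "model" below are assumptions for illustration; the tiling logic follows the corrected method.

```python
import numpy as np
from typing import Callable


def predict_by_tiles(
    x: np.ndarray,
    predictor: Callable[[np.ndarray], np.ndarray],
    size: int,
    stride: int,
) -> np.ndarray:
    """Average per-tile predictions of an (h, w, c) image into an (h, w, 1) mask."""
    h, w, _ = x.shape
    mask = np.zeros((h, w, 1), dtype="float32")
    weights = np.zeros_like(mask)
    for i in range(0, h - 1, stride):
        for j in range(0, w - 1, stride):
            a, b, c, d = i, min(h, i + size), j, min(w, j + size)
            mask[a:b, c:d, :] += np.expand_dims(predictor(x[a:b, c:d, :]), -1)
            # Count how many tiles touched each pixel ("+=", not "=").
            weights[a:b, c:d, :] += 1
    return mask / weights


def channel_mean(patch: np.ndarray) -> np.ndarray:
    """Stand-in 'model': a per-pixel function, so tiling must not change it."""
    return patch.mean(axis=-1)


rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(200, 300, 3)).astype("float32")

expected = np.expand_dims(channel_mean(img), -1)
tiled = predict_by_tiles(img, channel_mean, size=128, stride=64)
np.testing.assert_allclose(expected, tiled, atol=1e-3)
```

A per-pixel stand-in model is a convenient choice for this kind of test, because its output does not depend on the tile borders, so any mismatch must come from the merging logic.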

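8. ImageNet normalization

When one needs to do transfer learning, it is usually a good idea to normalize images the same way it was done when training on ImageNet.

Let's use the albumentations library, which we already know: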

import cv2
import numpy as np
import torch
import torch.nn.functional as F
from albumentations import Normalize

norm = Normalize()
img = cv2.imread('img_small.jpg')
mask = cv2.imread('mask_small.png', cv2.IMREAD_GRAYSCALE)
mask = np.expand_dims(mask, -1)  # shape (64, 64) -> shape (64, 64, 1)
normed = norm(image=img, mask=mask)
img, mask = [normed[x] for x in ['image', 'mask']]

def img_to_batch(x):
    x = np.transpose(x, (2, 0, 1)).astype('float32')
    return torch.from_numpy(np.expand_dims(x, 0))

img, mask = map(img_to_batch, (img, mask))
criterion = F.binary_cross_entropy

Now it is time to train a network and overfit it to a single image. As I mentioned, this is a good debugging technique:

model_a = UNet(3, 1)
optimizer = torch.optim.Adam(model_a.parameters(), lr=1e-3)
losses = []
for t in tqdm(range(20)):
    loss = criterion(model_a(img), mask)
    losses.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
_ = plt.plot(losses)

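The shape of the loss curve looks great, but -300 is not a value we would expect from a cross-entropy loss. What is the problem?

Normalization works fine for the image, but the mask needs to be scaled to the range [0, 1]: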

model_b = UNet(3, 1)
optimizer = torch.optim.Adam(model_b.parameters(), lr=1e-3)
losses = []
for t in tqdm(range(20)):
    loss = criterion(model_b(img), mask / 255.)
    losses.append(loss.item())
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
_ = plt.plot(losses)

A simple runtime assertion in the training loop (for example, assert mask.max() <= 1) would detect the problem quickly. Again, a unit test could do the same.
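Why an unscaled mask produces a negative loss can be seen with plain NumPy. The sketch below uses a hand-written binary cross-entropy and a hypothetical `check_mask` guard; both are illustrations under these assumptions, not the PyTorch implementation.

```python
import numpy as np


def binary_cross_entropy(pred: np.ndarray, target: np.ndarray) -> float:
    """Mean binary cross-entropy for predicted probabilities in (0, 1)."""
    return float(-(target * np.log(pred) + (1 - target) * np.log(1 - pred)).mean())


def check_mask(mask: np.ndarray) -> None:
    """Runtime guard: a BCE target must stay within [0, 1]."""
    assert mask.min() >= 0, f"mask.min() is {mask.min()}"
    assert mask.max() <= 1, f"mask.max() is {mask.max()}"


pred = np.full((4, 4), 0.9)
raw_mask = np.full((4, 4), 255.0)  # mask values as read from an 8-bit PNG
scaled_mask = raw_mask / 255.0     # mask scaled to [0, 1]

print(binary_cross_entropy(pred, scaled_mask) > 0)  # True
print(binary_cross_entropy(pred, raw_mask) < 0)     # True

check_mask(scaled_mask)  # passes silently
try:
    check_mask(raw_mask)  # fails before any training step is wasted
except AssertionError as err:
    print("caught:", err)
```

With a target of 255 the factor (1 - target) becomes -254, which flips the sign of the second term and lets the loss go far below zero.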