
(3) Writing YOLOv3 yourself - loss function

2022-08-06 18:09:17 zidea

Creating together, growing together! This is day 10 of my participation in the Juejin "Daily New Plan · August Writing Challenge".

First, we need to clear up one question, because the output of an object detector is far more complicated than that of a classifier. For bounding boxes, we all know that the model regresses the center coordinates together with the box width and height. So does the model directly output the center coordinates and the predicted width and height? Or does it output the width and height as ratios? Clearly, neither.

006.png

We need to understand this figure thoroughly, otherwise the code and the original source will be confusing later on. First, what does the model output? The actual output values are $t_x, t_y, t_w, t_h$, which are also called regression parameters. The real question is how to turn these values into the image-space center coordinates and the width and height of the bounding box.
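For reference, the decoding used in the YOLOv3 paper can be written as below. The symbols $c_x, c_y$ (offset of the grid cell from the top-left corner of the image), $p_w, p_h$ (anchor width and height) and $b_x, b_y, b_w, b_h$ (decoded box) come from the paper and are not defined in this post, so treat the mapping to the code here as an assumption:

$$
\begin{aligned}
b_x &= \sigma(t_x) + c_x, & b_y &= \sigma(t_y) + c_y \\
b_w &= p_w \, e^{t_w}, & b_h &= p_h \, e^{t_h}
\end{aligned}
$$

The loss code in this post applies the same sigmoid and exponential but leaves out $c_x, c_y$, which suggests that the box centers in the target are stored relative to their grid cell.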

First of all, we need to understand the concept of the grid: the real image coordinates first have to be mapped onto a grid.
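The grid mapping can be sketched roughly as follows. This is a minimal illustration, assuming box centers are normalized to [0, 1) and the grid has S x S cells; the helper name to_grid and the value S = 13 are hypothetical and do not come from the post.

```python
def to_grid(x, y, S):
    """Map a normalized center (x, y) in [0, 1) to a grid cell and an in-cell offset.

    Hypothetical helper: returns ((row, col), (x_offset, y_offset)).
    """
    col = int(S * x)  # which column of the grid the center falls into
    row = int(S * y)  # which row of the grid the center falls into
    # Offsets of the center inside that cell, each in [0, 1)
    return (row, col), (S * x - col, S * y - row)


cell, offset = to_grid(0.55, 0.30, S=13)
print(cell, offset)
```

The cell index says which grid cell is responsible for the object, and the in-cell offset is the kind of quantity that a sigmoid-activated prediction can be compared against.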

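import torch
import torch.nn as nn

class YoloLoss(nn.Module):
    def __init__(self):
        super().__init__()
        # Mean squared error, used for the box regression loss
        self.mse = nn.MSELoss()
        # Binary cross-entropy on raw logits, used for the objectness losses
        self.bce = nn.BCEWithLogitsLoss()
        # Cross-entropy over the class logits
        self.entropy = nn.CrossEntropyLoss()
        self.sigmoid = nn.Sigmoid()

        # Weights of the individual loss terms
        self.lambda_class = 1
        self.lambda_noobj = 10
        self.lambda_obj = 1
        self.lambda_box = 10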

Confidence loss

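# target[..., 0] holds the objectness label: 1 where an object is assigned, 0 otherwise
obj = target[..., 0] == 1
noobj = target[..., 0] == 0
# Objectness loss over the positions that contain no object
no_object_loss = self.bce((preds[..., 0:1][noobj]), (target[..., 0:1][noobj]))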

BCEWithLogitsLoss combines the Sigmoid function and BCELoss into a single loss function. This is numerically more stable than applying Sigmoid and BCELoss separately: because the two operations are merged into one unit, the implementation can use the log-sum-exp trick.

$$
\begin{aligned}
l(x, y) &= L = \{l_1, \cdots, l_N\}^T \\
l_n &= -w_n \left[ y_n \log \sigma(x_n) + (1 - y_n) \log(1 - \sigma(x_n)) \right]
\end{aligned}
$$
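The stability claim can be checked with a small experiment. The sketch below uses made-up logits and labels: for moderate logits the fused loss and the two-step route (Sigmoid, then BCELoss) agree, while for an extreme logit only the fused version returns the exact value.

```python
import torch
import torch.nn as nn

bce_logits = nn.BCEWithLogitsLoss()  # sigmoid + BCE fused into one op
bce_probs = nn.BCELoss()             # expects probabilities, not logits

# Moderate logits: both routes give the same value.
x = torch.tensor([-3.0, 0.5, 4.0])
y = torch.tensor([0.0, 1.0, 1.0])
fused = bce_logits(x, y)
two_step = bce_probs(torch.sigmoid(x), y)

# Extreme logit: sigmoid(-200) underflows to 0 in float32, so the
# two-step route can no longer compute log(sigmoid(x)) exactly.
x_big = torch.tensor([-200.0])
y_big = torch.tensor([1.0])
fused_big = bce_logits(x_big, y_big)
two_step_big = bce_probs(torch.sigmoid(x_big), y_big)

print(fused.item(), two_step.item())
print(fused_big.item(), two_step_big.item())
```

For the extreme case the exact loss is 200, which the fused version reproduces. The two-step route reports a different value, since BCELoss has to clamp its log term once the probability has collapsed to 0.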

The short example below illustrates how the mask obj = target[...,0] == 1 works, and how preds[...,0:1][noobj] selects the predictions at positions that contain no object.

a = torch.tensor([[[0,1,1],[1,1,2]],[[0,2,1],[0,2,2]]])
a

Here we create a tensor a with shape $2 \times 2 \times 3$.

tensor([[[0, 1, 1], [1, 1, 2]], [[0, 2, 1], [0, 2, 2]]])

obj = a[..., 0] == 1
noobj = a[..., 0] == 0
obj

Here obj is tensor([[False, True], [False, False]]) with shape torch.Size([2, 2]). It can be used as a boolean mask: only the data at positions where the mask is True is kept.

a[...,0:1][obj]
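A self-contained version of this masking example, with the resulting shapes spelled out, might look like the following. The variable names kept_obj, kept_noobj and rows_obj are only for illustration.

```python
import torch

a = torch.tensor([[[0, 1, 1], [1, 1, 2]], [[0, 2, 1], [0, 2, 2]]])

# Boolean masks of shape (2, 2), built from the first channel
obj = a[..., 0] == 1
noobj = a[..., 0] == 0

# Indexing a (2, 2, 1) slice with a (2, 2) mask keeps the last dimension
# and flattens the masked positions into the first dimension.
kept_obj = a[..., 0:1][obj]      # one True position  -> shape (1, 1)
kept_noobj = a[..., 0:1][noobj]  # three True positions -> shape (3, 1)

# The same mask applied to the full tensor returns whole rows.
rows_obj = a[obj]                # shape (1, 3)

print(kept_obj.shape, kept_noobj.shape, rows_obj)
```

This is why preds[...,0:1][noobj] and target[...,0:1][noobj] end up with matching shapes: both are flattened by the same mask.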

Object loss

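# Object loss
# Reshape the anchors to (1, 3, 1, 1, 2) so that they broadcast against preds[..., 3:5]
anchors = anchors.reshape(1, 3, 1, 1, 2)
# Decode the box: sigmoid for the center, exp scaled by the anchor for width and height
box_preds = torch.cat([self.sigmoid(preds[..., 1:3]), torch.exp(preds[..., 3:5]) * anchors], dim=-1)
# IoU between decoded predictions and targets at object positions, detached so it acts as a fixed label
ious = intersection_over_union(box_preds[obj], target[..., 1:5][obj]).detach()

# The target must be masked with obj as well, otherwise its shape does not match the prediction
object_loss = self.bce((preds[..., 0:1][obj]), (ious * target[..., 0:1][obj]))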

To compute the IoU, the predicted center and width/height regression parameters first have to be decoded into a bounding box. The IoU is then computed against the center, width and height of the target bounding box, so that the object loss takes the IoU into account.
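The post does not show intersection_over_union itself. A minimal sketch is given below, assuming boxes are stored as (center x, center y, width, height) and passed in as tensors of shape (N, 4); the eps argument is my addition to avoid division by zero.

```python
import torch


def intersection_over_union(boxes_a, boxes_b, eps=1e-6):
    """IoU of paired boxes in (cx, cy, w, h) format. Shapes: (N, 4) -> (N, 1)."""
    # Convert centers and sizes to corner coordinates
    a_x1 = boxes_a[..., 0:1] - boxes_a[..., 2:3] / 2
    a_y1 = boxes_a[..., 1:2] - boxes_a[..., 3:4] / 2
    a_x2 = boxes_a[..., 0:1] + boxes_a[..., 2:3] / 2
    a_y2 = boxes_a[..., 1:2] + boxes_a[..., 3:4] / 2
    b_x1 = boxes_b[..., 0:1] - boxes_b[..., 2:3] / 2
    b_y1 = boxes_b[..., 1:2] - boxes_b[..., 3:4] / 2
    b_x2 = boxes_b[..., 0:1] + boxes_b[..., 2:3] / 2
    b_y2 = boxes_b[..., 1:2] + boxes_b[..., 3:4] / 2

    # Overlap along each axis, clamped at zero for disjoint boxes
    inter_w = (torch.min(a_x2, b_x2) - torch.max(a_x1, b_x1)).clamp(min=0)
    inter_h = (torch.min(a_y2, b_y2) - torch.max(a_y1, b_y1)).clamp(min=0)
    inter = inter_w * inter_h

    area_a = boxes_a[..., 2:3] * boxes_a[..., 3:4]
    area_b = boxes_b[..., 2:3] * boxes_b[..., 3:4]
    return inter / (area_a + area_b - inter + eps)


a_box = torch.tensor([[0.5, 0.5, 1.0, 1.0]])
b_box = torch.tensor([[1.0, 0.5, 1.0, 1.0]])    # shifted right by half a box
far_box = torch.tensor([[5.0, 5.0, 1.0, 1.0]])  # no overlap with a_box

iou_same = intersection_over_union(a_box, a_box)
iou_half = intersection_over_union(a_box, b_box)
iou_none = intersection_over_union(a_box, far_box)
print(iou_same.item(), iou_half.item(), iou_none.item())
```

The half-shifted pair overlaps in an area of 0.5 out of a union of 1.5, so its IoU is about one third.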

Localization loss


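# Box coordinate loss
# Apply sigmoid to the predicted x, y so that they are comparable with the target center
preds[..., 1:3] = self.sigmoid(preds[..., 1:3])
# Convert the target width and height into the same log space as t_w, t_h
target[..., 3:5] = torch.log(1e-16 + target[..., 3:5] / anchors)

box_loss = self.mse(preds[..., 1:5][obj], target[..., 1:5][obj])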
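The log transform on the target width and height is the inverse of the exp-times-anchor decoding of the predictions. A small check with made-up anchor and box sizes:

```python
import torch

anchors = torch.tensor([[1.5, 2.0]])  # hypothetical anchor (width, height)
wh = torch.tensor([[3.0, 1.0]])       # hypothetical target (width, height)

# Encode: what the network is asked to regress for t_w, t_h
t_wh = torch.log(1e-16 + wh / anchors)

# Decode: exp scaled by the anchor recovers the original size
recovered = torch.exp(t_wh) * anchors
print(t_wh, recovered)
```

Comparing the raw outputs with the log-encoded target keeps both sides of the MSE in the same space.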

Class loss

# Class loss
# preds[..., 5:] are the class logits; target[..., 5] stores the class index
class_loss = self.entropy(
    (preds[..., 5:][obj]), (target[..., 5][obj].long())
)
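The call above relies on nn.CrossEntropyLoss taking raw logits of shape (N, C) and integer class indices of shape (N,), which is why the float-valued target is cast with .long(). A small sketch with made-up logits, compared against a manual log-softmax:

```python
import torch
import torch.nn as nn

logits = torch.tensor([[2.0, 0.5, -1.0],
                       [0.1, 0.2, 3.0]])  # 2 boxes, 3 classes
labels = torch.tensor([0.0, 2.0])         # class indices stored as floats

loss = nn.CrossEntropyLoss()(logits, labels.long())

# Manual version: negative log-probability of the correct class, averaged
log_probs = torch.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(2), labels.long()].mean()
print(loss.item(), manual.item())
```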
# Weighted sum of the four loss terms
return (
    self.lambda_box * box_loss
    + self.lambda_obj * object_loss
    + self.lambda_noobj * no_object_loss
    + self.lambda_class * class_loss
)
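Putting the pieces together, a complete forward pass could look roughly like the sketch below. Several things here are assumptions rather than the author's code: the forward signature, the tensor layout (batch, 3 anchors, S, S, 5 + classes), the helper iou_midpoint, and the use of out-of-place operations instead of writing into preds and target.

```python
import torch
import torch.nn as nn


def iou_midpoint(a, b, eps=1e-6):
    """Hypothetical helper: IoU of paired (cx, cy, w, h) boxes, (N, 4) -> (N, 1)."""
    a1, a2 = a[..., 0:2] - a[..., 2:4] / 2, a[..., 0:2] + a[..., 2:4] / 2
    b1, b2 = b[..., 0:2] - b[..., 2:4] / 2, b[..., 0:2] + b[..., 2:4] / 2
    wh = (torch.min(a2, b2) - torch.max(a1, b1)).clamp(min=0)
    inter = wh[..., 0:1] * wh[..., 1:2]
    union = a[..., 2:3] * a[..., 3:4] + b[..., 2:3] * b[..., 3:4] - inter
    return inter / (union + eps)


class YoloLossSketch(nn.Module):
    def __init__(self):
        super().__init__()
        self.mse = nn.MSELoss()
        self.bce = nn.BCEWithLogitsLoss()
        self.entropy = nn.CrossEntropyLoss()
        self.lambda_class, self.lambda_noobj = 1, 10
        self.lambda_obj, self.lambda_box = 1, 10

    def forward(self, preds, target, anchors):
        obj = target[..., 0] == 1
        noobj = target[..., 0] == 0

        # No-object loss on the objectness logit
        no_object_loss = self.bce(preds[..., 0:1][noobj], target[..., 0:1][noobj])

        # Object loss: the label is the (detached) IoU of the decoded box
        anchors = anchors.reshape(1, 3, 1, 1, 2)
        xy = torch.sigmoid(preds[..., 1:3])
        box_preds = torch.cat([xy, torch.exp(preds[..., 3:5]) * anchors], dim=-1)
        ious = iou_midpoint(box_preds[obj], target[..., 1:5][obj]).detach()
        object_loss = self.bce(preds[..., 0:1][obj], ious * target[..., 0:1][obj])

        # Box loss: sigmoid on x, y; log-encoded target width and height
        pred_boxes = torch.cat([xy, preds[..., 3:5]], dim=-1)
        target_wh = torch.log(1e-16 + target[..., 3:5] / anchors)
        target_boxes = torch.cat([target[..., 1:3], target_wh], dim=-1)
        box_loss = self.mse(pred_boxes[obj], target_boxes[obj])

        # Class loss on the class logits
        class_loss = self.entropy(preds[..., 5:][obj], target[..., 5][obj].long())

        return (self.lambda_box * box_loss + self.lambda_obj * object_loss
                + self.lambda_noobj * no_object_loss + self.lambda_class * class_loss)


torch.manual_seed(0)
preds = torch.randn(2, 3, 4, 4, 5 + 3, requires_grad=True)  # 3 classes, 4x4 grid
target = torch.zeros(2, 3, 4, 4, 6)
# One object: objectness, x, y, w, h, class index
target[0, 1, 2, 3] = torch.tensor([1.0, 0.5, 0.5, 2.0, 1.0, 1.0])
anchors = torch.tensor([[1.0, 1.0], [2.0, 1.0], [1.0, 2.0]])

loss = YoloLossSketch()(preds, target, anchors)
loss.backward()
print(loss.item())
```

The out-of-place variant leaves preds and target unchanged, so the same target tensor can be reused. Note that every term assumes at least one object and one empty position in the batch, since the mean over an empty selection is not defined.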

Copyright notice
Author: zidea. Please include the original link when reprinting, thank you.
https://en.cdmana.com/2022/218/202208061751323305.html
