# (3) Writing your own YOLOv3 loss function

2022-08-06 18:09:17 zidea

First, we need to clear up a question. The output of object detection is far more complicated than that of classification: for the bounding box, we all know the model regresses the box's center coordinates and its width and height. So does the model directly output the center coordinates and the width and height of the predicted box? Or ratios relative to the image width and height? Clearly neither.

We need to get this point completely clear, otherwise the code (or any source we read later) will be confusing. What does the model actually output? The raw outputs are $t_x, t_y, t_w, t_h$, also called the regression parameters. The real question is how these values are converted into the center coordinates and the width and height of the bounding box in image coordinates.

First of all, we need to understand the concept of the grid: the real image coordinates are first mapped onto a grid of cells, and each box is predicted relative to its cell.
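Concretely, the decoding used by YOLOv3 (and by the code below, which applies a sigmoid to the center offsets and `exp` times the anchor to the width/height) is, with $(c_x, c_y)$ the top-left corner of the responsible grid cell and $(p_w, p_h)$ the anchor box dimensions:

$$b_x = \sigma(t_x) + c_x,\quad b_y = \sigma(t_y) + c_y,\quad b_w = p_w e^{t_w},\quad b_h = p_h e^{t_h}$$

The sigmoid keeps the predicted center inside its cell, and the exponential makes the width/height a positive scaling of the anchor.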

```python
import torch
import torch.nn as nn

class YoloLoss(nn.Module):
    def __init__(self):
        super().__init__()

        self.mse = nn.MSELoss()
        # Sigmoid + BCELoss fused into one numerically stable class
        self.bce = nn.BCEWithLogitsLoss()
        # cross entropy for the class predictions
        self.entropy = nn.CrossEntropyLoss()
        self.sigmoid = nn.Sigmoid()

        # weights for the four loss terms
        self.lambda_class = 1
        self.lambda_noobj = 10
        self.lambda_obj = 1
        self.lambda_box = 10
```

### Confidence loss

In the target tensor, channel 0 marks whether a cell/anchor is responsible for an object. We build boolean masks from it and compute the no-object loss only over the positions without a target:

```python
obj = target[..., 0] == 1    # cells/anchors assigned to an object
noobj = target[..., 0] == 0  # cells/anchors with no object
no_object_loss = self.bce(preds[..., 0:1][noobj], target[..., 0:1][noobj])
```

`BCEWithLogitsLoss` combines the Sigmoid function and `BCELoss` into a single class. The advantage over applying Sigmoid and `BCELoss` separately is numerical stability: because the two are merged into one logical unit, the log-sum-exp trick can be used internally.

$l(x,y) = L = \{l_1,\cdots,l_N \}^T, \qquad l_n = - w_n \left[ y_n \log\sigma(x_n) + (1-y_n)\log(1 - \sigma(x_n)) \right]$
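A quick sketch of the stability difference on an extreme logit (the exact behavior of the separate version relies on PyTorch's documented clamping of `BCELoss` log outputs at -100):

```python
import torch
import torch.nn as nn

logit = torch.tensor([-200.0])   # an extreme logit
target = torch.tensor([1.0])

# fused version works in logit space: loss = -log(sigmoid(-200)) = 200 exactly
stable = nn.BCEWithLogitsLoss()(logit, target)

# separate version: sigmoid(-200) underflows to 0 in float32, and BCELoss
# clamps log(0), so the true loss value of 200 is lost
naive = nn.BCELoss()(torch.sigmoid(logit), target)
```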

The snippet below illustrates how expressions like `obj = target[...,0] == 1` and `preds[...,0:1][noobj]` work: a boolean mask built from channel 0 is used to select the predictions for boxes without a target.

```python
a = torch.tensor([[[0, 1, 1], [1, 1, 2]],
                  [[0, 2, 1], [0, 2, 2]]])
a[..., 0]           # first channel of every box: tensor([[0, 1], [0, 0]])

obj = a[..., 0] == 1    # tensor([[False,  True], [False, False]])
noobj = a[..., 0] == 0  # tensor([[ True, False], [ True,  True]])

a[..., 0:1][obj]    # tensor([[1]]) -- only the entry where the mask is True
```


### Object loss

For cells that do contain an object, the target for the confidence score is the IoU between the decoded predicted box and the ground truth:

```python
# object loss
# reshape so anchors broadcast against preds of shape (batch, 3, S, S, ...)
anchors = anchors.reshape(1, 3, 1, 1, 2)
# decode predictions: sigmoid for the center offsets, exp * anchor for w/h
box_preds = torch.cat([self.sigmoid(preds[..., 1:3]),
                       torch.exp(preds[..., 3:5]) * anchors], dim=-1)
# detached so the IoU acts as a fixed target, not a gradient path
ious = intersection_over_union(box_preds[obj], target[..., 1:5][obj]).detach()

object_loss = self.bce(preds[..., 0:1][obj], ious * target[..., 0:1][obj])
```
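`intersection_over_union` is assumed to come from earlier in this series; a minimal sketch for center-format `(x, y, w, h)` boxes could look like this:

```python
import torch

def intersection_over_union(boxes_preds, boxes_labels):
    # convert (cx, cy, w, h) center format to corner coordinates
    p_x1 = boxes_preds[..., 0:1] - boxes_preds[..., 2:3] / 2
    p_y1 = boxes_preds[..., 1:2] - boxes_preds[..., 3:4] / 2
    p_x2 = boxes_preds[..., 0:1] + boxes_preds[..., 2:3] / 2
    p_y2 = boxes_preds[..., 1:2] + boxes_preds[..., 3:4] / 2
    l_x1 = boxes_labels[..., 0:1] - boxes_labels[..., 2:3] / 2
    l_y1 = boxes_labels[..., 1:2] - boxes_labels[..., 3:4] / 2
    l_x2 = boxes_labels[..., 0:1] + boxes_labels[..., 2:3] / 2
    l_y2 = boxes_labels[..., 1:2] + boxes_labels[..., 3:4] / 2

    # corners of the intersection rectangle
    x1 = torch.max(p_x1, l_x1)
    y1 = torch.max(p_y1, l_y1)
    x2 = torch.min(p_x2, l_x2)
    y2 = torch.min(p_y2, l_y2)

    # clamp(0) handles non-overlapping boxes
    intersection = (x2 - x1).clamp(0) * (y2 - y1).clamp(0)
    p_area = (p_x2 - p_x1) * (p_y2 - p_y1)
    l_area = (l_x2 - l_x1) * (l_y2 - l_y1)
    return intersection / (p_area + l_area - intersection + 1e-6)
```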

### Localization loss

Rather than decoding the predictions into image space, the targets are mapped into the raw $t$-space: the predicted center offsets are squashed with sigmoid, and the target width/height are converted with the inverse of the `exp` decoding (the `1e-16` avoids `log(0)`):

```python
# box coordinate loss, computed in t-space
preds[..., 1:3] = self.sigmoid(preds[..., 1:3])                   # x, y offsets
target[..., 3:5] = torch.log(1e-16 + target[..., 3:5] / anchors)  # invert exp * anchor

box_loss = self.mse(preds[..., 1:5][obj], target[..., 1:5][obj])
```
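A quick check that the `log` transform really inverts the `exp` decoding (the anchor and output values here are made up for illustration):

```python
import torch

anchor = torch.tensor([0.28, 0.22])  # hypothetical anchor (w, h)
t = torch.tensor([0.5, -0.3])        # hypothetical raw outputs (t_w, t_h)

box_wh = torch.exp(t) * anchor               # decode, as in the object loss
t_back = torch.log(1e-16 + box_wh / anchor)  # transform applied to the target

# t_back recovers t up to the 1e-16 epsilon
```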

### Class loss

```python
# class loss
class_loss = self.entropy(
    preds[..., 5:][obj], target[..., 5][obj].long()
)

# weighted sum of the four loss terms
return (
    self.lambda_box * box_loss
    + self.lambda_obj * object_loss
    + self.lambda_noobj * no_object_loss
    + self.lambda_class * class_loss
)
```