# ORB-SLAM3 Algorithm Learning - Frame Construction - Binocular Feature Matching Based on SAD Sliding Window

2022-11-24 22:48:44

# 0 Overview

The geometric principle is the standard rectified stereo triangulation: depth is recovered from the horizontal disparity between the left and right observations of the same point.
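In summary, for a rectified stereo pair with focal length $f_x$ and baseline $b$ (symbols assumed here; the code later uses `mbf` $= f_x b$), a point observed at column $u_L$ in the left image and $u_R$ in the right image has disparity $d$ and depth $z$ given by

$$
d = u_L - u_R, \qquad z = \frac{f_x\, b}{d} = \frac{\texttt{mbf}}{d}.
$$

The smaller the disparity, the farther away the point; the whole stereo-matching step exists to measure $u_R$, and hence $d$, as accurately as possible.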

# 1 Stereo Matching

## 1.1 Build a band-shaped search table for each left-image feature point to restrict the search region (the images are already epipolar-rectified)

First, a band search table `vRowIndices` is initialised for stereo matching. It is a two-dimensional vector: the outer vector has one entry per image row (`nRows` in total), indexed by the row (ordinate) in the right image, and each inner vector stores the indices of the right-image feature points whose search band covers that row.

```cpp
vector<vector<size_t> > vRowIndices(nRows, vector<size_t>());
```

When matching a left-image feature point against the right image, the search is not restricted to a single row but to a horizontal search band. In short, instead of treating a feature point's ordinate as a single pixel row, the feature point is "inflated" so that its ordinate is assumed to span several rows.

```cpp
// Number of rows of the level-0 (original) left image, i.e. the image height
const int nRows = mpORBextractorLeft->mvImagePyramid[0].rows;

// Assign keypoints to row table
// Step 1: build the search-range table; each left feature point is matched inside a band-shaped region.
// When searching for a match, we do not search along a single row but inside a horizontal band.
// In short, instead of a 1-pixel ordinate, the feature point is assumed to span several rows.
// For example, if a left feature point has ordinate 20, the search on the right image covers rows 18 to 22,
// i.e. a band of +/-2 rows; the band width depends on the pyramid level of the feature point.
// Put differently, if the ordinate is 20, rows 18, 19, 20, 21 and 22 are all considered to contain this feature,
// so vRowIndices[18] ... vRowIndices[22] all store this feature point's index.
vector<vector<size_t> > vRowIndices(nRows, vector<size_t>());
for(int i=0; i<nRows; i++)
    vRowIndices[i].reserve(200);

const int Nr = mvKeysRight.size();  // number of right-image feature points

// Associate each right-image feature point index with its band of rows
for(int iR=0; iR<Nr; iR++)
{
    // Note: stereo rectification is NOT performed in this function; it is done by the caller
    const cv::KeyPoint &kp = mvKeysRight[iR];
    const float &kpY = kp.pt.y;

    // Vertical half-width of the search band: the higher the pyramid level (larger scale), the larger the range.
    // For a feature point on pyramid level 0 the range is +/-2 pixels.
    // A larger scale means higher positional uncertainty, hence a larger search radius.
    const float r = 2.0f*mvScaleFactors[mvKeysRight[iR].octave];
    const int maxr = ceil(kpY+r);   // round up
    const int minr = floor(kpY-r);  // round down

    // Register this index in every row of its band
    for(int yi=minr;yi<=maxr;yi++)
        vRowIndices[yi].push_back(iR);
}
```
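To see the idea in isolation from the Frame class, here is a minimal, self-contained sketch of the same row-band table; the names (`buildRowTable`, `Keypoint`, `kps`) are made up for illustration, but the band radius follows the same `2 * scaleFactor` rule as the code above:

```cpp
#include <algorithm>
#include <cmath>
#include <cstdio>
#include <vector>

struct Keypoint { float x, y; int octave; };

// Build a table mapping every image row to the indices of the (right-image)
// keypoints whose vertical search band covers that row.
std::vector<std::vector<size_t>> buildRowTable(const std::vector<Keypoint>& kps,
                                               const std::vector<float>& scaleFactors,
                                               int nRows)
{
    std::vector<std::vector<size_t>> rowTable(nRows);
    for (size_t i = 0; i < kps.size(); ++i) {
        const float r = 2.0f * scaleFactors[kps[i].octave];          // band half-width grows with pyramid level
        const int minr = std::max(0, (int)std::floor(kps[i].y - r));
        const int maxr = std::min(nRows - 1, (int)std::ceil(kps[i].y + r));
        for (int row = minr; row <= maxr; ++row)
            rowTable[row].push_back(i);                              // register keypoint i on every covered row
    }
    return rowTable;
}

int main()
{
    std::vector<float> scaleFactors = {1.0f, 1.2f, 1.44f};           // illustrative pyramid scale factors
    std::vector<Keypoint> kps = {{100.f, 20.f, 0}, {50.f, 21.5f, 2}};
    auto table = buildRowTable(kps, scaleFactors, 480);

    // Keypoint 0 (y=20, level 0) covers rows 18..22; keypoint 1 (y=21.5, level 2) covers roughly rows 18..25.
    std::printf("candidates on row 20: %zu\n", table[20].size());
    return 0;
}
```

Clamping `minr`/`maxr` to the image height is added here only to keep the standalone sketch safe on synthetic input.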

## 1.2 For each left-camera feature point, find the matching point in the right-image band region via descriptor distance

```cpp
const vector<size_t> &vCandidates = vRowIndices[vL];
```

```cpp
const int dist = ORBmatcher::DescriptorDistance(dL,dR);
```

```cpp
const cv::KeyPoint &kpL = mvKeys[iL];   // take one left-image feature point
const int &levelL = kpL.octave;         // pyramid level (scale) of the feature point
const float &vL = kpL.pt.y;             // ordinate of the feature point
const float &uL = kpL.pt.x;             // abscissa of the feature point

// Candidate matching points in the right image:
// using the ordinate of the left feature point, quickly look up the indices of all
// right-image feature points whose search band covers this row
const vector<size_t> &vCandidates = vRowIndices[vL];

// If none are found, this feature point has no right-image match
if(vCandidates.empty())
    continue;

// From the maximum and minimum disparity allowed by the camera, derive a search range along the x axis
const float minU = uL-maxD;     // minimum matching abscissa
const float maxU = uL-minD;     // maximum matching abscissa
if(maxU<0)                      // since minD=0, maxU<0 implies uL<0, i.e. an invalid point
    continue;

int bestDist = ORBmatcher::TH_HIGH; // best (smallest) matching distance so far, updated below
size_t bestIdxR = 0;                // index of the best matching right-image point

// Descriptor of the left feature point: each descriptor occupies one row,
// so take a reference to the row of mDescriptors corresponding to iL
const cv::Mat &dL = mDescriptors.row(iL);

// Compare descriptor to right keypoints
// Step 2.1: iterate over all candidate right-image points and keep the one with the smallest descriptor distance
for(size_t iC=0; iC<vCandidates.size(); iC++)
{
    const size_t iR = vCandidates[iC];          // index of the right-image candidate
    const cv::KeyPoint &kpR = mvKeysRight[iR];

    // Only match feature points from neighbouring pyramid levels
    if(kpR.octave<levelL-1 || kpR.octave>levelL+1)
        continue;

    const float &uR = kpR.pt.x; // x coordinate of the right-image candidate

    // Make sure the candidate's abscissa also lies within the valid disparity range
    if(uR>=minU && uR<=maxU)
    {
        const cv::Mat &dR = mDescriptorsRight.row(iR);              // descriptor of the right-image candidate
        const int dist = ORBmatcher::DescriptorDistance(dL,dR);     // descriptor distance between the pair

        // Update the minimum matching distance and the corresponding right-image index
        if(dist<bestDist)
        {
            bestDist = dist;
            bestIdxR = iR;
        }
    }
}
```
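`ORBmatcher::DescriptorDistance` returns the Hamming distance between the two 256-bit binary ORB descriptors (each stored as one 32-byte row of the descriptor matrix). A minimal sketch of the same operation, using the compiler's popcount instead of the library's bit-twiddling version:

```cpp
#include <cstdint>
#include <opencv2/core.hpp>

// Hamming distance between two 256-bit ORB descriptors (one cv::Mat row each).
// ORB-SLAM's ORBmatcher::DescriptorDistance computes the same quantity with a
// bit-twiddling popcount; __builtin_popcount (GCC/Clang) is used here for brevity.
int descriptorDistance(const cv::Mat& a, const cv::Mat& b)
{
    const int32_t* pa = a.ptr<int32_t>();
    const int32_t* pb = b.ptr<int32_t>();
    int dist = 0;
    for (int i = 0; i < 8; ++i, ++pa, ++pb) {
        uint32_t v = *pa ^ *pb;          // bits that differ between the two descriptors
        dist += __builtin_popcount(v);   // count them
    }
    return dist;
}
```

A smaller distance means more similar descriptors; `ORBmatcher::TH_HIGH` (100 bits), used above to initialise `bestDist`, acts as a loose acceptance threshold.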

## 1.3 Refine the match with a SAD sliding window

Around the descriptor-level match, a small window from the left image is slid horizontally over the right image and the sum of absolute differences (SAD) of the pixel intensities is recorded at every offset. This produces a roughly parabola-shaped curve: if an optimal correction truly exists, the closer the window is to that position, the smaller the intensity difference and the matching error; the farther away, the larger both become.
If the minimum occurs at one of the two boundaries of the sliding range, there is no valley inside the range, i.e. no real minimum was found, and the depth of this matching pair is not computed.

```cpp
// kpL.pt.x is a level-0 (bottom pyramid level) coordinate; scale the coordinates of both matched
// feature points to the pyramid level of the left feature point: (scaleduL, scaledvL) and scaleduR0
const float uR0 = mvKeysRight[bestIdxR].pt.x;            // x coordinate of the right-image match on level 0
const float scaleFactor = mvInvScaleFactors[kpL.octave]; // inverse scale factor of the left feature point's level
const float scaleduL = round(kpL.pt.x*scaleFactor);      // left-image x coordinate on that level
const float scaledvL = round(kpL.pt.y*scaleFactor);      // left-image y coordinate on that level
const float scaleduR0 = round(uR0*scaleFactor);          // right-image x coordinate on that level

// sliding window search
const int w = 5; // half window size: the window is 11x11, taken from the resized (pyramid-level) image
// Take an 11x11 image patch centred on the feature point from the pyramid level of the left feature point
cv::Mat IL = mpORBextractorLeft->mvImagePyramid[kpL.octave].rowRange(scaledvL-w,scaledvL+w+1).colRange(scaleduL-w,scaleduL+w+1);

int bestDist = INT_MAX;     // reset the best matching distance (now a SAD value)
int bestincR = 0;           // best horizontal correction amount
const int L = 5;            // the window slides over the range (-L, L)
vector<float> vDists;       // SAD value at every offset
vDists.resize(2*L+1);

// Check in advance whether the sliding window would go out of bounds while sliding over (-L, L).
// iniu and endu are the x coordinates of the left and right limits reached by the window.
const float iniu = scaleduR0+L-w;
const float endu = scaleduR0+L+w+1;
if(iniu<0 || endu >= mpORBextractorRight->mvImagePyramid[kpL.octave].cols) // out-of-bounds check
    continue;

for(int incR=-L; incR<=+L; incR++)
{
    // Slide the window horizontally over the right image.
    // Since L and w are equal here, the traversal is centred on scaleduR0,
    // with left boundary scaleduR0-L-w and right boundary scaleduR0+L+w+1.
    cv::Mat IR = mpORBextractorRight->mvImagePyramid[kpL.octave].rowRange(scaledvL-w,scaledvL+w+1).colRange(scaleduR0+incR-w,scaleduR0+incR+w+1);

    float dist = cv::norm(IL,IR,cv::NORM_L1); // L1 norm: sum of absolute pixel differences (SAD)
    if(dist<bestDist)
    {
        bestDist = dist;    // keep the smallest SAD so far
        bestincR = incR;    // and the offset where it occurred
    }
    // Normally the values in vDists should follow a parabola-like shape:
    // if an optimal correction exists, the intensity difference (and hence the matching error)
    // is smaller near that position and larger farther away from it.
    vDists[L+incR] = dist;
}

// If the best correction lies on the boundary of the sliding range, the SAD curve has no
// valley inside the range, so the whole match is discarded.
if(bestincR==-L || bestincR==L)
    continue;
```
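To make the parabola-shaped behaviour of `vDists` concrete, here is a toy, self-contained 1-D version of the same SAD sliding-window search; all of the data and names (`row`, `tmpl`, `center`) are made up for illustration:

```cpp
#include <climits>
#include <cstdio>
#include <cstdlib>
#include <vector>

int main()
{
    // A 1-D "image row" and a template taken from around index 7 of that row.
    std::vector<int> row  = {5, 5, 6, 9, 14, 20, 25, 27, 26, 22, 17, 12, 8, 6, 5, 5};
    std::vector<int> tmpl = {25, 27, 26};              // matches row[6..8] exactly

    const int L = 3;                                    // slide over offsets -3 .. +3
    const int center = 8;                               // initial integer guess, deliberately one pixel off
    int bestDist = INT_MAX, bestinc = 0;
    std::vector<int> dists(2 * L + 1);

    for (int inc = -L; inc <= L; ++inc) {
        int sad = 0;
        for (int k = -1; k <= 1; ++k)                   // window half-width w = 1
            sad += std::abs(row[center + inc + k] - tmpl[k + 1]);
        dists[L + inc] = sad;
        if (sad < bestDist) { bestDist = sad; bestinc = inc; }
    }

    for (int i = 0; i < 2 * L + 1; ++i)
        std::printf("offset %+d : SAD %d\n", i - L, dists[i]);
    std::printf("best offset %+d (SAD %d)\n", bestinc, bestDist); // expect offset -1, SAD 0
    return 0;
}
```

The printed SAD values fall towards the true offset (-1) and rise again on the other side, which is exactly the valley shape the real code relies on when it rejects minima that sit on the boundary of the search range.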

## 1.4 Fit a parabola, take its valley, and obtain the sub-pixel correction deltaR

As argued above, the SAD values form a parabola-like curve whose valley (where the matching error is smallest) corresponds to the exact x coordinate we are looking for.
Although the minimum does not necessarily occur exactly at the best integer correction `L+bestincR`, it must lie close to it. We therefore fit a parabola through the three points `(L+bestincR-1, dist1)`, `(L+bestincR, dist2)` and `(L+bestincR+1, dist3)` and take its vertex, which yields the sub-pixel correction `deltaR`. `deltaR` is a finer adjustment on top of the integer correction `bestincR`, so the final matched x coordinate `bestuR` is:

```cpp
bestuR = scaleduR0 + bestincR + deltaR  // pyramid-level coordinate; the code maps it back to level 0 via mvScaleFactors[kpL.octave]
```
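The `deltaR` formula used in the code below comes from fitting $f(x)=ax^2+bx+c$ through the three SAD samples at $x=-1,0,+1$ (values `dist1`, `dist2`, `dist3`, written $d_1,d_2,d_3$) and taking the vertex $x^{*}=-b/(2a)$:

$$
a=\frac{d_1+d_3-2d_2}{2},\qquad b=\frac{d_3-d_1}{2},\qquad
\Delta R = x^{*}=-\frac{b}{2a}=\frac{d_1-d_3}{2\,(d_1+d_3-2d_2)}.
$$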

With the precise right-image x coordinate known, the disparity can be computed, and from it the depth of this matching pair. The depth computation is simple: divide `mbf` (baseline times focal length, read from the configuration file) by the disparity `disparity`.
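A made-up numerical example: with $f_x = 400$ px and $b = 0.1$ m (so `mbf` $= 40$), a disparity of 8 pixels gives

$$
z = \frac{40}{8} = 5 \text{ m}.
$$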

At this point we have, for every left-image feature point, its matched coordinate in the right image and the corresponding depth. The code performs one final filtering step: feature points whose SAD sliding-window matching error is too large have their depth set to -1.

```cpp
// Step 2.3: fit a parabola and take its valley to obtain the sub-pixel correction deltaR.
// The parabola is fitted through (L+bestincR-1, dist1), (L+bestincR, dist2), (L+bestincR+1, dist3).
const float dist1 = vDists[L+bestincR-1];
const float dist2 = vDists[L+bestincR];
const float dist3 = vDists[L+bestincR+1];

const float deltaR = (dist1-dist3)/(2.0f*(dist1+dist3-2.0f*dist2));
// The correction from the parabola fit must not exceed one pixel;
// otherwise give up computing the depth of this feature point.
if(deltaR<-1 || deltaR>1)
    continue;

// Re-scaled coordinate
// scaleduR0 is the match position found by descriptor matching,
// bestincR the integer SAD correction and deltaR the sub-pixel correction from the parabola fit;
// multiplying by mvScaleFactors[kpL.octave] maps the result back to level-0 coordinates
float bestuR = mvScaleFactors[kpL.octave]*((float)scaleduR0+(float)bestincR+deltaR);

// Disparity, from which the depth is computed
float disparity = (uL-bestuR);

if(disparity>=minD && disparity<maxD) // finally, check that the disparity lies within the allowed range
{
    if(disparity<=0)
    {
        disparity=0.01;
        bestuR = uL-0.01;
    }
    // Depth is computed here: depth = baseline*fx/disparity = mbf/disparity
    mvDepth[iL]=mbf/disparity;                      // depth of the feature point
    mvuRight[iL] = bestuR;                          // x coordinate of the match in the right image
    vDistIdx.push_back(pair<int,int>(bestDist,iL)); // record (SAD distance, left index) for the filtering step below
}
```
```cpp
// Step 3: discard matches whose SAD matching error is too large
sort(vDistIdx.begin(),vDistIdx.end());                  // sort by SAD distance (ascending)
const float median = vDistIdx[vDistIdx.size()/2].first; // median SAD distance
const float thDist = 1.5f*1.4f*median;                  // empirical outlier threshold based on the median

// count of feature points that keep a valid depth
int count_depth = vDistIdx.size();
for(int i=vDistIdx.size()-1;i>=0;i--)
{
    if(vDistIdx[i].first<thDist)    // distances are sorted, so we can stop at the first inlier
        break;
    else
    {
        // Outlier: invalidate its right-image coordinate and depth
        count_depth--;
        mvuRight[vDistIdx[i].second]=-1;
        mvDepth[vDistIdx[i].second]=-1;
    }
}
```