
ORB-SLAM3 Algorithm Learning—Frame Construction

2022-11-24 22:51:46 · I am a happy little party dish

0 Summary

This article gives a brief overview of the image frame structure in ORB-SLAM3.

ORB-SLAM3 currently supports four kinds of camera sensors. Frame construction is further divided into a pure-vision mode and a visual-inertial (VI) mode, depending on whether an IMU is integrated.

For each sensor, the visual and VI modes share a single Frame constructor.

In Frame.h you can see three constructors, corresponding to monocular, stereo, and RGB-D cameras; stereo cameras and dual-monocular (two independent monocular) cameras share one constructor.

In the process of Frame construction, the following things are mainly done:

1. Detect ORB features and compute a BRIEF descriptor for each feature point

As can be seen in the ParseORBParamFile function, ORB-SLAM3 additionally creates an ORB feature extractor, mpIniORBextractor, used only for monocular initialization. Compared with the extractor used in normal mode, it extracts 5 times as many feature points, the purpose being to ensure a more stable initialization result.

For details, see: ORB-SLAM3 Algorithm Learning—Frame Construction: ORB feature extraction and BRIEF descriptor calculation

2. Feature point undistortion: UndistortKeyPoints()

For monocular, stereo, and RGB-D cameras, after the feature points are extracted during frame construction, distortion correction must be applied depending on the input images. The principle of undistortion is relatively simple: the distortion model, parameterized by the distortion coefficients, is applied to the pixel coordinates of the feature points, and ORB-SLAM3 simply calls OpenCV's undistortion function.
The corresponding code is in UndistortKeyPoints():

cv::undistortPoints(mat, mat, static_cast<Pinhole*>(mpCamera)->toK(), mDistCoef, cv::Mat(), mK);

The formula is the standard radial–tangential distortion model:

x_d = x(1 + k1·r² + k2·r⁴ + k3·r⁶) + 2·p1·x·y + p2·(r² + 2x²)
y_d = y(1 + k1·r² + k2·r⁴ + k3·r⁶) + p1·(r² + 2y²) + 2·p2·x·y

where (x, y) are normalized camera coordinates and r² = x² + y².

In addition, some cameras can output already-undistorted images, such as the ZED and RealSense cameras. In that case the distortion coefficients should be set to 0 in the configuration file; when the undistortion function detects that the coefficients are 0, it returns immediately and frame construction proceeds to the next step.

However, some stereo cameras output images with distortion. Stereo calibration yields a set of intrinsic and extrinsic parameters for the camera pair, with which the stereo images can be rectified (remapped) directly, so that what is passed to the SLAM system is likewise a pair of undistorted images, as with the EuRoC dataset.

For the pinhole camera model (e.g. the TUM RGB-D dataset, the EuRoC dataset, the KITTI dataset, and the ZED and RealSense cameras), the distortion coefficients are generally in the format [k1, k2, p1, p2, k3], where k3 is optional.

For the fisheye camera model (e.g. TUM's stereo + IMU dataset), the distortion coefficients are generally in the format [k1, k2, k3, k4].

3. Compute stereo matching (stereo camera mode only)

For stereo cameras, the author does not use the depth output by the camera directly. Instead, a band search along the epipolar line is used: starting from the coordinates of each feature point in the left image, the corresponding feature point is searched for in the right image. After an SAD (sum of absolute differences) search and parabola fitting, the sub-pixel coordinate in the right image corresponding to each left-image feature point is obtained, and the depth is finally computed from the resulting disparity.

4. Assign each feature point to an image grid cell

The image is divided into a grid of cells, and the feature points falling into each cell are recorded; this is mainly used to speed up feature matching searches.
