Review: Visual Odometry I [tutorial]

Scaramuzza, D., & Fraundorfer, F. (2011). Visual odometry [tutorial]. IEEE Robotics & Automation Magazine, 18(4), 80-92.

0. Introduction

Visual odometry (VO) is the process of estimating the egomotion of an agent (e.g., a vehicle, a person, or a robot) using only mono or stereo cameras. Application areas include robotics, wearables, augmented reality, and autonomous driving.

On egomotion

The term visual odometry first appeared in Nister's landmark 2004 paper and was chosen for its similarity to wheel odometry. Wheel odometry estimates the incremental motion of a vehicle from the number of wheel turns over a period of time. In the same way, VO incrementally estimates the pose of the vehicle from the changes that its motion induces in the images of an onboard camera. For VO to work effectively, the environment must be sufficiently illuminated, and the scene must be static and have enough texture for motion to be extracted reliably. In addition, consecutive frames must have sufficient scene overlap.

Compared with wheel odometry, VO is not affected by wheel slip, uneven terrain, or other adverse conditions. VO has been shown to give more accurate trajectory estimates than wheel odometry, with relative position errors in the range of 0.1 to 2%.

On trajectory

Because of these advantages, VO has emerged as an alternative or supplement to wheel odometry and to other navigation systems (GNSS, IMU, laser odometry). It is especially useful in GPS-denied environments such as underwater.

This two-part tutorial and survey covers research on visual odometry (hereafter VO) from 1980 to 2011. Many offline implementations were produced in the first two decades, but it was only in the third decade that real-time systems appeared and became mainstream, with VO being used on Mars exploration rovers. Part I reviews the first 30 years of research in the field and its fundamentals. After a brief discussion of camera modeling and calibration, it describes the motion-estimation pipelines for monocular and stereo cameras and the pros and cons of each. Part II deals with feature matching, robustness, and applications. It reviews the feature detectors most commonly used in VO and the different outlier-rejection schemes, with particular emphasis on random sample consensus (RANSAC) and ways to speed it up. It also covers error modeling, location recognition, and bundle adjustment.

The tutorial gives both experts and nonexperts guidelines and reference algorithms for building a complete VO system. Because no single ideal VO solution works in every environment, the best solution has to be chosen carefully according to the specific navigation environment and the available computational resources.

1. History of Visual Odometry

The problem of recovering relative camera poses and 3D structure from a set of camera images is known in the computer vision community as structure from motion (SfM). Its origins can be traced back to works such as [2] and [3]. VO is a particular case of SfM. SfM is more general (it has a broader scope): it tackles the 3D reconstruction of both structure and camera poses from sequentially ordered or unordered image sets. The final structure and camera poses are typically refined with an offline optimization (bundle adjustment, BA), whose computation time grows with the number of images. VO, by contrast, focuses on estimating the 3D motion of the camera sequentially, as each new frame arrives, and in real time.

The problem of estimating a vehicle's egomotion from visual input was addressed as early as the 1980s [5]. Interestingly, most of the early VO research ([5]-[9]) was done as part of the NASA Mars exploration program, in an effort to give rovers the ability to measure their 6-DoF motion in the presence of wheel slippage on uneven and rough terrain.

Moravec's work is important not only for presenting the first motion-estimation pipeline, whose main functional blocks are still in use today, but also for describing one of the earliest corner detectors (after the first one, proposed by Hannah in 1974 [10]). The Moravec corner detector is the predecessor of the Forstner corner detector [12] and the Harris-Stephens corner detector [3], [82].

Moravec tested his work on a rover equipped with what he called a slider stereo (a single camera sliding on a rail). The robot moved in a stop-and-go fashion, digitizing and analyzing images at every location. At each stop, the camera slid horizontally and took nine images at equidistant intervals. Corners were detected in one image with his feature detector and matched along the epipolar lines of the other eight images using normalized cross correlation.

epipolar?

Normalized cross correlation
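As a rough illustration of the matching step described above, the sketch below scores two grayscale patches with normalized cross correlation and slides a template along one image row. It assumes a rectified setup in which the epipolar line is a horizontal row, which fits a camera sliding horizontally on a rail; the function names `ncc` and `best_match_along_row` are hypothetical, and this is not Moravec's original implementation.

```python
import math

def ncc(patch_a, patch_b):
    """Normalized cross correlation of two equally sized grayscale patches.

    Patches are lists of rows of intensities. The score lies in [-1, 1];
    1 means the patches agree up to a gain and an offset in brightness.
    """
    a = [v for row in patch_a for v in row]
    b = [v for row in patch_b for v in row]
    if len(a) != len(b) or not a:
        raise ValueError("patches must be non-empty and the same size")
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    da = [v - mean_a for v in a]
    db = [v - mean_b for v in b]
    denom = math.sqrt(sum(x * x for x in da) * sum(y * y for y in db))
    if denom == 0.0:
        return 0.0  # flat patch: correlation is undefined, report "no match"
    return sum(x * y for x, y in zip(da, db)) / denom

def best_match_along_row(template, image, row):
    """Slide the template along one image row (assumed to be the epipolar
    line) and return (best_column, best_score)."""
    h, w = len(template), len(template[0])
    best_col, best_score = -1, -2.0
    for col in range(len(image[0]) - w + 1):
        window = [image[row + r][col:col + w] for r in range(h)]
        score = ncc(template, window)
        if score > best_score:
            best_col, best_score = col, score
    return best_col, best_score
```

Because the patch means are subtracted and the result is divided by the patch norms, the score is unaffected by a uniform change in brightness or contrast between the two images.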

Potential matches at the next robot location were found again by correlation, using a coarse-to-fine strategy to account for large scale changes. Outliers were then removed by checking for depth inconsistencies across the eight stereo pairs. Finally, the motion was computed as the rigid body transformation that aligns the triangulated 3D points seen at two consecutive robot positions.

rigid body transformation

Triangulation

The system of equations was solved via weighted least squares, with weights inversely proportional to the distance from the 3D point.

weighted least squares

Although Moravec used a single sliding camera, his work belongs to the class of stereo VO algorithms. The term means that the relative 3D positions of the features are measured directly by triangulation at every robot location and are used to derive the relative motion. Trinocular methods belong to the same class. The alternative to stereo is to use a single camera, in which case only bearing information is available. The drawback is that motion can be recovered only up to a scale factor. The absolute scale can then be obtained from direct measurements (e.g., measuring the size of an element in the scene), from motion constraints, or by integrating other sensors such as an IMU, an air-pressure sensor, or a range sensor.

Monocular methods are worth attention because stereo VO can degenerate to the monocular case when the distance to the scene is much larger than the stereo baseline (i.e., the distance between the two cameras).

Stereo baseline

In that case the stereo approach becomes ineffective and monocular methods must be used. Over the years, monocular and stereo VO have largely developed independently. The rest of this section covers both lines of work.
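One way to see why stereo loses its advantage for distant scenes is the standard rectified-stereo relation z = f b / d, which is not stated in this review and is assumed here. To first order, a fixed disparity error then produces a depth error that grows with the square of the depth. The focal length, baseline, and noise level below are hypothetical values chosen only for illustration.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth of a point seen by a rectified stereo pair: z = f * b / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px

def depth_uncertainty(focal_px, baseline_m, depth_m, disparity_noise_px=0.5):
    """First-order depth error caused by a disparity error dd:
    dz ~= z**2 / (f * b) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_noise_px

if __name__ == "__main__":
    f, b = 700.0, 0.12  # hypothetical focal length [px] and baseline [m]
    for z in (1.0, 10.0, 100.0):
        d = f * b / z
        print(f"z={z:6.1f} m  disparity={d:7.2f} px  "
              f"depth error ~ {depth_uncertainty(f, b, z):8.3f} m")
```

With these assumed numbers, a point 100 m away has a disparity below one pixel, so triangulated depth carries almost no information and the pair effectively behaves like a single camera.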

1-1. Stereo VO

Most VO research has been done with stereo cameras. Building on Moravec's work, Matthies and Shafer [6], [7] used a binocular system and Moravec's procedure to detect and track corners. Unlike Moravec, who used a scalar representation of uncertainty, they used the error covariance matrix of the triangulated features and incorporated it into the motion-estimation step. They demonstrated better rover-trajectory recovery than Moravec, with 2% relative error on a 5.5 m path. Olson et al. [9], [13] introduced an absolute orientation sensor (e.g., a compass or an omnidirectional camera) and used the Forstner corner detector, which is much faster to compute than Moravec's operator. They showed that using egomotion estimates alone leads to accumulated error that grows superlinearly with the distance traveled, which increases the orientation error. Conversely, when an absolute orientation sensor is incorporated, the error growth can be reduced to a linear function of the distance traveled. This led to a relative position error of 1.2% on a 20 m path.

Triangulated feature

error covariance matrix

Forstner corner detector

Lacroix et al. implemented a stereo VO approach for planetary rovers similar to those described above. The difference lies in how key points are selected: instead of the Forstner detector, they used dense stereo and then selected candidate key points by analyzing the correlation function around its peaks. This approach was later used in [14], [15], and other works. It rests on the observation that the shape of the correlation curve is strongly correlated with the standard deviation of the feature depth.

correlation function

correlation curve

Cheng et al. improved on Olson's approach in two ways. First, after using the Harris corner detector, they used the curvature of the correlation function around each feature (as proposed by Lacroix et al.) to define the error covariance matrix of the image point. Second, as Nister did, they used RANSAC for outlier rejection in the least-squares motion-estimation step.

A different approach to motion estimation and outlier removal for an all-terrain rover was proposed by Milella and Siegwart. They used the Shi-Tomasi detector and, similarly to Lacroix, worked with points that have high confidence in the stereo disparity map. As in the earlier methods, motion estimation was solved with least squares, and the pose was then refined with the iterative closest point (ICP) algorithm. For robustness, an outlier-removal stage was incorporated into ICP.

The works mentioned so far have one thing in common: 3D points are triangulated for every stereo pair, and the relative motion is solved as a 3D-to-3D point registration problem. A completely different approach was proposed in 2004 by Nister et al. Their paper is known not only for coining the term VO but also for providing the first real-time, long-run implementation with a robust outlier-rejection scheme. It also improved on earlier work in several ways. First, unlike previous works, they did not track features across frames. Instead, they detected features independently in each frame with the Harris corner detector and matched them.

Note: how does this differ from the earlier approach, and which one is mainstream today?

This has the advantage of avoiding drift in cross-correlation-based tracking. Second, they did not compute the relative motion as a 3D-to-3D problem but as a 3D-to-2D camera-pose estimation problem (described in detail in the Motion Estimation section). Finally, they incorporated RANSAC outlier rejection into the motion-estimation step.

Comport et al. introduced a different motion-estimation scheme. Instead of 3D-to-3D point registration or 3D-to-2D camera-pose estimation, they used the quadrifocal tensor, which allows motion to be computed from 2D-to-2D image matches without triangulating 3D points in any of the stereo pairs.

quadrifocal tensor

3D to 3D, 3D to 2D, 2D to 2D

1-2. Monocular VO

Monocular VO differs from the stereo scheme in that the 3D structure, like the relative motion, must be computed entirely from 2D bearing data. Because the absolute scale is unknown, the distance between the first two camera poses is usually set to one. As each new image arrives, the relative scale and the camera pose with respect to the first two frames are determined using either knowledge of the 3D structure or the trifocal tensor.

trifocal tensor; how ORB-SLAM determines the monocular scale

Over the last decade, successful results have been obtained with a single camera over long distances (up to several kilometers), using both perspective and omnidirectional cameras. Related works can be divided into feature-based, appearance-based, and hybrid methods. Feature-based methods rely on salient and repeatable features. Appearance-based methods use the intensity information of all the pixels in the image or in subregions of it. Hybrid methods combine the two.

Feature-based methods include [1], [24], [25], [27], and [30]-[32]. The first real-time VO with a single camera was presented by Nister et al. They used RANSAC for outlier rejection and 3D-to-2D camera-pose estimation to compute each new camera pose. The main contribution of the paper is the use of a five-point minimal solver to compute the motion hypotheses in RANSAC. Five-point RANSAC has since become mainstream in VO. Corke [24] proposed an approach to monocular VO based on optical flow and omnidirectional imagery from a catadioptric camera. Lhuillier [25] and Mouragnon [30] proposed approaches based on local windowed bundle adjustment to recover both the motion and the 3D map (that is, bundle adjustment is performed over a window of the last m frames). Again, they used five-point RANSAC to remove outliers.

Five-point RANSAC
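The appeal of small minimal solvers can be illustrated with the standard formula for the number of RANSAC iterations, N = log(1 - p) / log(1 - (1 - e)^s), where s is the sample size, e the outlier ratio, and p the desired probability of drawing at least one outlier-free sample. The formula is not given in this review; it is assumed here as the usual textbook expression, and the function name is hypothetical.

```python
import math

def ransac_iterations(sample_size, outlier_ratio, success_prob=0.99):
    """Number of random samples needed so that, with probability
    success_prob, at least one sample contains only inliers."""
    if not 0.0 <= outlier_ratio < 1.0:
        raise ValueError("outlier_ratio must be in [0, 1)")
    if not 0.0 < success_prob < 1.0:
        raise ValueError("success_prob must be in (0, 1)")
    # Probability that one random sample of this size is outlier free.
    clean_sample_prob = (1.0 - outlier_ratio) ** sample_size
    if clean_sample_prob >= 1.0:
        return 1  # no outliers at all: a single sample is enough
    return math.ceil(math.log(1.0 - success_prob)
                     / math.log(1.0 - clean_sample_prob))

if __name__ == "__main__":
    # Compare eight-point, five-point, and one-point hypotheses at 50% outliers.
    for s in (8, 5, 1):
        print(f"sample size {s}: {ransac_iterations(s, 0.5)} iterations")
```

The required number of iterations grows exponentially with the sample size, which is why a five-point solver needs far fewer hypotheses than an eight-point one, and why the one-point variant mentioned later in this section can run so fast.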

windowed-bundle adjustment

Tardif [27] presented an approach for VO over a long distance (2.5 km) without bundle adjustment. Unlike previous works, they decoupled the estimation of rotation and translation: the rotation was estimated from points at infinity and the translation from the recovered 3D map. Five-point RANSAC was used to remove erroneous correspondences.

Appearance-based and hybrid approaches include [26], [28], and [29]. Goecke [26] used the Fourier-Mellin transform to register perspective images of the ground plane taken from a car. Milford and Wyeth [28] presented a method to extract approximate rotational and translational velocity information from a single perspective camera mounted on a car, which was then used in RatSLAM. Their method used template tracking on the center of the scene. The main drawback of appearance-based approaches is that they are not robust to occlusions. For this reason, Scaramuzza and Siegwart [29] used image appearance to estimate the rotation of the car and features from the ground plane to estimate the translation and absolute scale. The feature-based approach was used to compensate for the weaknesses of the appearance-based one.

Note: which of the three approaches is mainstream?

All the approaches mentioned so far were designed for unconstrained motion in 6 DoF. Several VO works, however, were designed specifically for vehicles with motion constraints. The advantage is reduced computation time and improved motion accuracy. For example, Liang and Pears [35], Ke and Kanade [36], Wang et al. [37], and Guerrero et al. [38] used homographies to estimate egomotion on a dominant ground plane. Scaramuzza et al. [31], [39] introduced one-point RANSAC outlier rejection based on the vehicle's nonholonomic constraints to speed up egomotion estimation to 400 Hz. In follow-up work, they showed that nonholonomic constraints allow the absolute scale to be recovered from a single camera whenever the vehicle makes a turn [40]. Following that work, Pretto et al. also used vehicle nonholonomic constraints to improve feature tracking, and Fraundorfer et al. [41] used them for windowed bundle adjustment (see the next section).

nonholonomic constraint

unconstrained motion

Meaning of DoF in 6-DoF visual odometry

1-3. Reducing the Drift

Because VO computes the camera path incrementally (pose after pose), the errors introduced by each new frame-to-frame motion accumulate over time. This produces a drift of the estimated trajectory from the real path. For some applications it is important to keep the drift as small as possible. This can be done through local optimization over the last m camera poses.

local optimization

This approach, called sliding window bundle adjustment or windowed bundle adjustment, has been used in several works such as [41]-[44]. In particular, in a 10 km VO experiment, Konolige et al. [43] showed that windowed bundle adjustment can reduce the final position error by a factor of 2 to 5. VO drift can also be reduced by combining VO with other sensors such as GPS and laser, or even with only an IMU [43], [45], [46].
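The way small per-frame errors build up into drift can be illustrated with a toy 2D dead-reckoning model. The sketch assumes a robot driving straight while each estimated step picks up a small constant heading bias. A constant bias is a deliberate simplification of real, noisy VO errors, and `end_point_error` is a hypothetical helper rather than anything from the tutorial.

```python
import math

def end_point_error(n_steps, heading_bias_rad, step_m=1.0):
    """Integrate n straight steps whose estimated heading picks up a small
    constant bias at every step, and return the distance between the
    estimated end point and the true one."""
    x = y = heading = 0.0
    for _ in range(n_steps):
        heading += heading_bias_rad      # per-step rotation error accumulates
        x += step_m * math.cos(heading)
        y += step_m * math.sin(heading)
    true_x = n_steps * step_m            # ground truth: straight line along x
    return math.hypot(x - true_x, y)

if __name__ == "__main__":
    for n in (50, 100, 200, 400):
        print(n, round(end_point_error(n, 0.001), 3))
```

In this model, doubling the path length roughly quadruples the end-point error, which matches the superlinear error growth reported for egomotion-only estimation and shows why an absolute orientation reference or a local optimization step helps.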

1-4. V-SLAM

Although this tutorial focuses on VO, the parallel line of research on VSLAM deserves mention. For an in-depth study of the SLAM problem, readers are referred to the articles by Durrant-Whyte and Bailey [47], [48]. Two methodologies have become predominant in VSLAM: 1) filtering methods, which fuse the information from all the images with a probability distribution [49], and 2) nonfiltering methods (also called keyframe methods), which retain global bundle adjustment over selected keyframes. The main advantages of the two approaches are evaluated and summarized in [51].

To do: read the summary in [51]

In the last few years, successful results have been obtained with both mono and stereo cameras [49], [52]-[62]. Most of these works are limited to small indoor workspaces, and only a few recent ones were designed for large-scale areas [54], [60], [62].

To do: read [54], [60], [62]

Some of the first real-time VSLAM works, by Chiuso et al. [52], Deans [53], and Davison [49], used a full-covariance Kalman filter approach. The advantage of Davison's work [49] was that it demonstrated repeatable localization after an arbitrary amount of time. Later, Handa et al. [59] improved on that work with an active matching technique based on a probabilistic framework.

Civera et al. [60] proposed combining one-point RANSAC with a Kalman filter, using the prior probabilistic information available from the filter in the RANSAC model-hypothesis stage. Finally, Strasdat et al. [61] presented a new framework for large-scale VSLAM that takes advantage of the keyframe optimization approach [50] while taking into account the special character of SLAM.

RANSAC model-hypothesis stage

What is the "special character" of SLAM?

1-5. VO versus VSLAM

This section analyzes the relationship between VO and VSLAM. In general, the goal of VSLAM is to obtain a global, consistent estimate of the robot path. This implies keeping track of a map of the environment, because the map is needed to recognize when the robot returns to a previously visited area. (This is called loop closure. When a loop closure is detected, the information is used to reduce the drift in both the map and the camera path. Understanding when a loop closure occurs and efficiently integrating this new constraint into the current map are the two main issues in SLAM.)

Conversely, VO aims to recover the path incrementally, pose after pose, and potentially to optimize over only the last n poses of the path (also called windowed bundle adjustment). This sliding-window optimization can be regarded as equivalent to building a local map in SLAM. The philosophy, however, is different. VO is concerned only with the local consistency of the trajectory, and the local map is used only to obtain a more accurate estimate of the local trajectory (e.g., in bundle adjustment), whereas VSLAM is concerned with global map consistency.

loop closing

global map consistency

VO can be used as a building block of a complete SLAM algorithm to recover the incremental motion of the camera. To make a complete SLAM method, however, one must also add a way to detect loop closures and possibly a global optimization step in order to obtain a metrically consistent map. (Without this step, the map is still topologically consistent.)

incremental motion

If the user is interested only in the camera path and not in the full map, a complete VSLAM method can still be used instead of the VO techniques described in this tutorial. A VSLAM method is potentially much more accurate because it enforces many more constraints on the path, but it is not necessarily more robust (e.g., outliers in loop closure can severely affect map consistency). It is also more complex and computationally expensive.

In the end, the choice between VO and VSLAM depends on the tradeoff between performance and consistency, and on simplicity of implementation. Although global consistency of the camera path is sometimes desirable, VO trades off consistency for real-time performance, without the need to keep track of the whole previous history of the camera.

2. Formulation of the VO problem

An agent moves through an environment and takes images with a rigidly attached camera system at discrete time instants $k$. In the monocular case, the set of images taken at times $k$ is denoted by $I_{0:n} = \{I_0, \ldots, I_n\}$. In the stereo case, there is a left and a right image at every time instant, denoted by $I_{l,0:n} = \{I_{l,0}, \ldots, I_{l,n}\}$ and $I_{r,0:n} = \{I_{r,0}, \ldots, I_{r,n}\}$. Figure 1 illustrates this setting.

For simplicity, the camera coordinate frame is assumed to coincide with the agent's coordinate frame. In the stereo case, without loss of generality, the coordinate system of the left camera can be used as the origin.

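Two camera positions at adjacent time instants $k-1$ and $k$ are related by the rigid body transformation $T_{k,k-1} \in \mathbb{R}^{4 \times 4}$ of the following form:

$$T_{k,k-1} = \begin{bmatrix} R_{k,k-1} & t_{k,k-1} \\ 0 & 1 \end{bmatrix}$$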

Rigid body transform

where $R_{k,k-1} \in SO(3)$ is the rotation matrix and $t_{k,k-1} \in \mathbb{R}^{3 \times 1}$ is the translation vector. The set $T_{1:n} = \{T_{1,0}, \ldots, T_{n,n-1}\}$ contains all the subsequent motions. To simplify the notation, from now on $T_k$ is used instead of $T_{k,k-1}$. Finally, the set of camera poses $C_{0:n} = \{C_0, \ldots, C_n\}$ contains the transformations of the camera with respect to the initial coordinate frame at $k = 0$. The current pose $C_n$ can be computed by concatenating all the transformations $T_k$ $(k = 1, \ldots, n)$, so that $C_n = C_{n-1} T_n$, where $C_0$ is the camera pose at instant $k = 0$ and can be set arbitrarily by the user.
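The concatenation $C_n = C_{n-1} T_n$ can be sketched with plain 4x4 matrices. The example assumes that each $T_k$ expresses the pose of camera $k$ in the frame of camera $k-1$, and, for brevity, it restricts rotations to a yaw about the z axis. The helper names and the sample motion are hypothetical.

```python
import math

def make_transform(yaw_rad, tx, ty, tz):
    """4x4 rigid body transform [[R, t], [0, 1]] with R a rotation about z."""
    c, s = math.cos(yaw_rad), math.sin(yaw_rad)
    return [[c, -s, 0.0, tx],
            [s, c, 0.0, ty],
            [0.0, 0.0, 1.0, tz],
            [0.0, 0.0, 0.0, 1.0]]

def matmul4(a, b):
    """Product of two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def concatenate(c0, relative_motions):
    """Return [C_0, ..., C_n] with C_k = C_{k-1} T_k."""
    poses = [c0]
    for t_k in relative_motions:
        poses.append(matmul4(poses[-1], t_k))
    return poses

if __name__ == "__main__":
    c0 = make_transform(0.0, 0.0, 0.0, 0.0)  # C_0 chosen as the identity
    # Hypothetical motion: move 1 m along the current x axis, then turn 90 degrees.
    step = make_transform(math.pi / 2, 1.0, 0.0, 0.0)
    for k, pose in enumerate(concatenate(c0, [step] * 4)):
        print(k, [round(pose[i][3], 6) for i in range(3)])
```

Repeating the same relative motion four times traces a unit square and brings the pose back to the start. Any error in a single $T_k$ would propagate to every later pose, which is the source of drift in VO.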

The main task in VO is to compute the relative transformation $T_k$ from the images $I_k$ and $I_{k-1}$ and then to concatenate the transformations to recover the full trajectory $C_{0:n}$ of the camera. This means that VO recovers the path incrementally, pose after pose. After this step, an iterative refinement over the last m poses can be performed to obtain a more accurate estimate of the local trajectory. This refinement works by minimizing the sum of the squared reprojection errors of the reconstructed 3D points (i.e., the 3D map) over the last m images. (It is called windowed bundle adjustment because it is performed on a window of m frames. Bundle adjustment is described in detail in Part II.) The 3D points are obtained by triangulating the image points (see the "Triangulation and Keyframe Selection" section).

As mentioned in the "Monocular VO" section, there are two main approaches to computing the relative motion $T_k$: appearance-based methods, which use the intensity information of all the pixels in the two input images, and feature-based methods, which use only salient and repeatable features extracted (or tracked) across the images.

salient

Appearance-based methods are less accurate than feature-based methods and are computationally more expensive. (As seen in the "History of VO" section, most appearance-based methods have been applied to monocular VO, because they are easier to implement there than in the stereo case.) Feature-based methods require the ability to robustly match (or track) features across frames, but they are faster and more accurate than appearance-based methods. For this reason, most VO implementations are feature based.

The VO pipeline is summarized in Figure 2. For every new image $I_k$ (or image pair in the case of a stereo camera), the first two steps consist of detecting 2D features and matching them with those from the previous frames. 2D features that are the reprojection of the same 3D feature across different frames are called image correspondences. (As explained again in Part II, feature matching and feature tracking are distinguished. The former consists of detecting features independently in all the images and then matching them based on some similarity metric. The latter consists of finding features in one image and then tracking them in the next images using a local search technique such as correlation.)

Figure 2.

The third step consists of computing the relative motion $T_k$ between the time instants $k-1$ and $k$. Depending on whether the correspondences are specified in three or two dimensions, there are three distinct approaches to this problem (see the "Motion Estimation" section). The camera pose $C_k$ is then computed by concatenating $T_k$ with the previous pose. Finally, an iterative refinement (bundle adjustment) can be run over the last m frames to obtain a more accurate estimate of the local trajectory.

Motion estimation is explained in this tutorial (see the "Motion Estimation" section). Feature detection and matching and bundle adjustment are explained in Part II. For accurate motion computation, the feature correspondences should not contain outliers (i.e., wrong data associations). Ensuring accurate motion estimation in the presence of outliers is the task of robust estimation, which is also explained in Part II. Most VO implementations assume that the camera is calibrated. To this end, the next sections review the standard models and calibration procedures for perspective and omnidirectional cameras.

2-1. Perspective Camera model

The most widely used model for perspective cameras assumes a pinhole projection system: the image is formed by the intersection of the light rays from the objects, passing through the center of the lens (the projection center), with the focal plane. See Figure 3(a).

Figure 3. (a)

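Let $X = [x, y, z]^T$ be a scene point in the camera reference frame and $p = [u, v]^T$ its projection on the image plane, measured in pixels. The mapping from 3D to 2D is given by the perspective projection equation:

$$\lambda \begin{bmatrix} u \\ v \\ 1 \end{bmatrix} = K X = \begin{bmatrix} \alpha_u & 0 & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix}$$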

where $\lambda$ is the depth factor, $\alpha_u$ and $\alpha_v$ are the focal lengths, and $u_0$, $v_0$ are the image coordinates of the projection center. These parameters are called intrinsic parameters. When the field of view of the camera is larger than 45 degrees, the effects of radial distortion may become visible and can be modeled with a second- (or higher-) order polynomial. The derivation of the complete model can be found in computer vision textbooks such as [22] and [63]. Let $\tilde{p} = [\tilde{u}, \tilde{v}, 1]^T = K^{-1} [u, v, 1]^T$ be the normalized image coordinates. These coordinates are used throughout the following sections.
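A minimal sketch of the pinhole model may help make the roles of the intrinsic parameters concrete. It assumes no lens distortion and zero skew, and the function names `project` and `normalize` as well as the numeric values are hypothetical.

```python
def project(point_cam, alpha_u, alpha_v, u0, v0):
    """Pinhole projection lambda * [u, v, 1]^T = K X.

    point_cam is (x, y, z) in the camera frame; returns (u, v, lambda).
    """
    x, y, z = point_cam
    if z <= 0:
        raise ValueError("point must lie in front of the camera (z > 0)")
    u = alpha_u * x / z + u0
    v = alpha_v * y / z + v0
    return u, v, z  # for this K the depth factor lambda equals z

def normalize(u, v, alpha_u, alpha_v, u0, v0):
    """Normalized image coordinates [u~, v~, 1]^T = K^-1 [u, v, 1]^T."""
    return (u - u0) / alpha_u, (v - v0) / alpha_v, 1.0

if __name__ == "__main__":
    # Hypothetical intrinsics for a 640 x 480 image.
    au, av, u0, v0 = 500.0, 500.0, 320.0, 240.0
    u, v, lam = project((0.5, -0.25, 2.0), au, av, u0, v0)
    print(u, v, lam)
    print(normalize(u, v, au, av, u0, v0))
```

Normalizing a pixel removes the effect of the intrinsic parameters and leaves the ray direction $(x/z, y/z, 1)$, which is why the motion-estimation equations can be written independently of the particular camera.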

(The omnidirectional and spherical camera models are omitted here.)

2-2. Camera Calibration

The goal of calibration is to accurately measure the intrinsic and extrinsic parameters of the camera system. In a multicamera system, the extrinsic parameters describe the mutual position and orientation of the cameras. The most popular method uses a planar checkerboard-like pattern, in which the positions of the squares on the board are known. To compute the calibration parameters accurately, several pictures of the board must be taken at different positions and orientations, with the board filling as much of the camera's field of view as possible. The intrinsic and extrinsic parameters are then found through a least-squares minimization method. The input data are the 2D positions of the corners of the squares on the board and their corresponding pixel coordinates.

Many camera calibration toolboxes have been devised for MATLAB and C. An up-to-date list can be found in [68]. Among these, the most popular MATLAB toolboxes are [69] for perspective cameras and [70]-[72] for omnidirectional cameras. A C implementation of camera calibration for perspective cameras can be found in OpenCV [73].

2-3. Motion Estimation

Motion estimation is the core step of a VO system and is performed for every image. More precisely, the motion-estimation step computes the camera motion between the current image and the previous image. By concatenating all these single movements, the full trajectory of the camera and of the agent (assuming the camera is rigidly mounted) can be recovered. This section explains how the transformation $T_k$ between two images $I_{k-1}$ and $I_k$ can be computed from two sets of corresponding features $f_{k-1}$, $f_k$ at time instants $k-1$ and $k$, respectively. Depending on whether the feature correspondences are specified in 2D or 3D, there are three different methods.

  • 2D to 2D: both $f_{k-1}$ and $f_k$ are specified in 2D image coordinates.

  • 3D to 3D: both $f_{k-1}$ and $f_k$ are specified in 3D. This requires triangulating 3D points at every time instant, for instance with a stereo camera system.

  • 3D to 2D: $f_{k-1}$ are specified in 3D and $f_k$ are their corresponding 2D reprojections on the image $I_k$. In the monocular case, the 3D structure must be triangulated from two adjacent camera views and then matched to 2D image features in a third view. In the monocular scheme, matches over at least three views are therefore necessary.

Features can be points or lines. In general, because lines are scarce in unstructured scenes, point features are used in VO. An in-depth review of the three approaches for both point and line features can be found in [74]. This tutorial considers only point features.

2-4. 2D to 2D: Motion from Image Feature Correspondences

Estimating the Essential Matrix

The geometric relations between two images $I_k$ and $I_{k-1}$ of a calibrated camera are described by the so-called essential matrix $E$. $E$ contains the camera motion parameters, up to an unknown scale factor for the translation, in the following form:

$$E_k \simeq \hat{t}_k R_k,$$

where $t_k = [t_x, t_y, t_z]^T$ and

$$\hat{t}_k = \begin{bmatrix} 0 & -t_z & t_y \\ t_z & 0 & -t_x \\ -t_y & t_x & 0 \end{bmatrix}.$$

The symbol $\simeq$ denotes that the equality is valid up to a multiplicative scalar.

The essential matrix can be computed from 2D-to-2D feature correspondences, and the rotation and translation can be extracted directly from $E$. The main property of 2D-to-2D motion estimation is the epipolar constraint, which determines the line on which the feature point $\tilde{p}'$ corresponding to $\tilde{p}$ lies in the other image. This constraint can be formulated as

$$\tilde{p}'^{T} E \tilde{p} = 0,$$

where $\tilde{p}'$ is a feature location in one image (e.g., $I_k$) and $\tilde{p}$ is the location of its corresponding feature in the other image. $\tilde{p}$ and $\tilde{p}'$ are normalized image coordinates. For simplicity, normalized coordinates of the form $\tilde{p} = [\tilde{u}, \tilde{v}, 1]^T$ are used throughout the following sections (see the "Perspective Camera Model" section).
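The epipolar constraint $\tilde{p}'^{T} E \tilde{p} = 0$ with $E \simeq \hat{t} R$ can be checked numerically on synthetic data. The sketch below assumes the convention that a point $X$ in the first camera frame maps to $X' = R X + t$ in the second. The direction of the transformation is an assumption of this example, and all function names and numeric values are hypothetical.

```python
import math

def skew(t):
    """Skew-symmetric matrix t^ such that t^ v equals the cross product t x v."""
    tx, ty, tz = t
    return [[0.0, -tz, ty],
            [tz, 0.0, -tx],
            [-ty, tx, 0.0]]

def matmul3(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec3(a, v):
    """Product of a 3x3 matrix and a 3-vector."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def rot_y(angle_rad):
    """Rotation about the y axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]

def essential(rotation, translation):
    """E = t^ R for a motion that maps first-view points X to X' = R X + t."""
    return matmul3(skew(translation), rotation)

def normalized(point):
    """Normalized image coordinates [x/z, y/z, 1] of a 3D point."""
    x, y, z = point
    return [x / z, y / z, 1.0]

def epipolar_residual(e, p, p_prime):
    """Value of p'^T E p, which is zero for a perfect correspondence."""
    ep = matvec3(e, p)
    return sum(p_prime[i] * ep[i] for i in range(3))

if __name__ == "__main__":
    R, t = rot_y(0.1), [0.5, 0.0, 0.1]          # hypothetical camera motion
    X = [0.3, -0.2, 4.0]                        # point in the first camera frame
    X_prime = [a + b for a, b in zip(matvec3(R, X), t)]
    p, p_prime = normalized(X), normalized(X_prime)
    print(epipolar_residual(essential(R, t), p, p_prime))
```

Scaling $t$ by any nonzero factor scales $E$ but leaves the residual at zero, which is one way to see why the translation can be recovered from $E$ only up to scale.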
