[Computer Vision] Depth

👀 This example was written with VSCode on Windows 10 and Python 3.11.0.

 

Depth information describes how far the scene point behind each pixel of an image is from the camera.

 

This information is essential in 3D reconstruction, object recognition, autonomous vehicles, robot vision, and similar fields.

 

The main depth estimation methods are as follows.

  • Stereo Vision : Two cameras capture the scene from different viewpoints, and depth is computed by analyzing the difference (disparity) between the two images.
  • Depth Sensors : Sensors such as LiDAR and ToF (Time-of-Flight) cameras measure the distance to an object directly.
  • Monocular Depth Estimation : Depth is estimated from a single camera image. A deep learning model learns the patterns and structure of images and predicts depth from them.

 

Applications in Computer Vision

  • 3D reconstruction : Depth information is used to build 3D models from 2D images.
  • Object recognition and tracking : Depth information makes it possible to recognize and track the position and shape of objects more accurately.
  • Autonomous vehicles : Depth is a key input for understanding the environment around the vehicle, avoiding obstacles, and navigating.
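As a rough illustration of the 3D reconstruction use case, the sketch below back-projects a depth map into a point cloud with the pinhole camera model. The function name `depth_to_points` and the intrinsics `fx`, `fy`, `cx`, `cy` are assumptions made for this example and do not come from the original post. Real values would come from camera calibration.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project an (H, W) depth map into an (H*W, 3) array of XYZ points.

    fx, fy: focal lengths in pixels; cx, cy: principal point (assumed known).
    """
    h, w = depth.shape
    # Pixel coordinate grids: u runs along columns, v along rows
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    z = depth.astype(np.float64)
    # Pinhole model: X = (u - cx) * Z / fx, Y = (v - cy) * Z / fy
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=-1).reshape(-1, 3)

# Toy input: a flat wall 2 m away seen by a hypothetical 4x4 camera
depth = np.full((4, 4), 2.0)
points = depth_to_points(depth, fx=2.0, fy=2.0, cx=1.5, cy=1.5)
print(points.shape)
print(points[0], points[-1])
```

Each row of `points` is one pixel lifted into camera coordinates, with the same units as the depth map.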

 

Stereo Vision

Stereo vision is a technique that uses two cameras to estimate the depth of objects in 3D space.

 

It works much like the way a person's two eyes perceive depth.

 

import numpy as np
import cv2
from matplotlib import pyplot as plt

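# Load the left/right images as 8-bit grayscale
img_left = cv2.imread("./img_left.jpg", cv2.IMREAD_GRAYSCALE)
img_right = cv2.imread("./img_right.jpg", cv2.IMREAD_GRAYSCALE)
if img_left is None or img_right is None:
    raise FileNotFoundError("Could not read ./img_left.jpg or ./img_right.jpg")

# Create the stereo matching object
# numDisparities : disparity search range (the maximum disparity when minDisparity is 0); must be a multiple of 16
# blockSize : size of the block used for matching; must be odd, typically 5, 7, 9, 11, etc.
stereo = cv2.StereoBM.create(numDisparities=16, blockSize=9)

# Compute the disparity map
# StereoBM returns 16-bit fixed-point values (disparity * 16), so divide by 16 to get pixels
disparity_map = stereo.compute(img_left, img_right).astype(np.float32) / 16.0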

# Create the subplots
fig, axs = plt.subplots(3, 1, figsize=(10, 15))

# Original image (left)
axs[0].imshow(img_left, cmap='gray')
axs[0].set_title("Left Image")
axs[0].axis("off")

# Original image (right)
axs[1].imshow(img_right, cmap='gray')
axs[1].set_title("Right Image")
axs[1].axis("off")

# Disparity map (a larger disparity means the point is closer to the camera)
axs[2].imshow(disparity_map, cmap='gray')
axs[2].set_title("Disparity Map")
axs[2].axis("off")

# Show the result
plt.tight_layout()
plt.show()
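Strictly speaking, what a block matcher produces is a disparity map, not depth. For a rectified stereo pair the two are related by Z = f * B / d, where f is the focal length in pixels, B is the baseline between the cameras, and d is the disparity in pixels. The sketch below applies that relation. The helper name `disparity_to_depth` and the values of `focal_px` and `baseline_m` are hypothetical and would normally come from stereo calibration. Note that the raw output of StereoBM is 16-bit fixed point (disparity times 16), so it has to be divided by 16 before being used here if that has not already been done.

```python
import numpy as np

def disparity_to_depth(disparity, focal_px, baseline_m):
    """Convert a disparity map (pixels) to depth (meters) via Z = f * B / d."""
    disparity = np.asarray(disparity, dtype=np.float32)
    depth = np.full(disparity.shape, np.nan, dtype=np.float32)
    # Zero or negative disparity means no valid match, so depth stays NaN there
    valid = disparity > 0
    depth[valid] = focal_px * baseline_m / disparity[valid]
    return depth

# Toy disparities with assumed calibration: f = 700 px, B = 0.1 m
disp = np.array([[70.0, 35.0], [0.0, -1.0]])
depth = disparity_to_depth(disp, focal_px=700.0, baseline_m=0.1)
print(depth)
```

Because depth is inversely proportional to disparity, distant points with small disparities are the most sensitive to matching errors.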

 

 

Depth Sensors

A depth sensor is a device that measures the distance between the sensor and an object to provide depth information in 3D space.

These sensors are built on a variety of technologies and are used in many fields.

Common types include LiDAR, ToF, stereo cameras, and structured-light sensors.
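To make the time-of-flight idea concrete: a pulsed ToF sensor emits light, times how long it takes to return, and halves the round-trip distance. The snippet below is a minimal sketch of that arithmetic only. The function name `tof_distance` is made up for this example, and many real ToF cameras measure a phase shift rather than timing a pulse directly.

```python
SPEED_OF_LIGHT = 299_792_458.0  # meters per second

def tof_distance(round_trip_seconds):
    """Distance to the target given the round-trip travel time of a light pulse."""
    # The pulse travels to the object and back, so divide the path by two
    return SPEED_OF_LIGHT * round_trip_seconds / 2.0

# A 20 ns round trip corresponds to roughly 3 m
distance_m = tof_distance(20e-9)
print(f"{distance_m:.3f} m")
```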

 

 

Monocular Depth Estimation

Monocular depth estimation is a technique that estimates depth information in 3D space from a single camera image.

 

Unlike stereo vision, it needs neither a separate depth sensor nor a second image, and it is mostly done with deep learning models.

A representative example is Depth-Anything.

 

GitHub - DepthAnything/Depth-Anything-V2: Depth Anything V2. A More Capable Foundation Model for Monocular Depth Estimation (github.com)

Demo

 

Depth Anything V2 - a Hugging Face Space by depth-anything (huggingface.co)
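Monocular models typically return a per-pixel depth array in relative rather than metric units, so the values usually need rescaling before they can be shown as an image. The sketch below shows one simple way to do that with min-max normalization. The name `normalize_depth` and the toy `pred` array are assumptions for illustration and are not part of the Depth Anything API. In practice `pred` would be the model's output.

```python
import numpy as np

def normalize_depth(depth):
    """Rescale a float depth map to uint8 in [0, 255] for display."""
    depth = np.asarray(depth, dtype=np.float32)
    lo, hi = float(depth.min()), float(depth.max())
    if hi - lo < 1e-8:
        # Constant map: avoid dividing by zero
        return np.zeros(depth.shape, dtype=np.uint8)
    scaled = (depth - lo) / (hi - lo)
    return (scaled * 255.0).round().astype(np.uint8)

# Stand-in for a model prediction (relative units)
pred = np.array([[0.0, 1.0], [3.0, 4.0]])
vis = normalize_depth(pred)
print(vis)
```

The resulting array can be passed to `plt.imshow(vis, cmap='gray')` in the same way as the disparity map in the stereo example.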

 
