[Coursera] 8. Softmax

🥑 These are my personal notes, written while studying Coursera's "Improving Deep Neural Networks: Hyperparameter Tuning, Regularization and
Optimization" course.

 

Softmax is a function that transforms a given vector into another vector of the same length. It is typically used in multi-class classification problems to express the output as probabilities.

 

It maps each element of the input vector to a value between 0 and 1 and normalizes the result so that all elements sum to 1.

 

How it is computed

  • The exponential function is applied to the input values so that they can be converted into probabilities.
  • For an input vector x, the output y is computed as follows:

$$ y_i = \frac{e^{x_i}}{\sum_{j=1}^{N} e^{x_j}} $$

(y_i is the i-th element of the output vector, x_i is the i-th element of the input vector, and N is the number of dimensions of the vector.)
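The formula can be sketched in plain Python as below. The function name `softmax` and the subtraction of the maximum are my own additions and not part of the course notes. Subtracting the maximum is a common trick that leaves the result unchanged, because the common factor cancels in the ratio, while avoiding overflow in `math.exp`.

```python
import math

def softmax(x):
    """Return the softmax of a list of numbers as a list of probabilities."""
    # Shift by the maximum: the factor e^(-max) cancels between numerator
    # and denominator, so the output is the same but exp() cannot overflow.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

# Inputs this large would make math.exp(v) raise OverflowError without the shift.
y = softmax([1000.0, 1001.0, 1002.0])
print(y)        # three probabilities; the largest input gets the largest share
print(sum(y))   # approximately 1.0
```

Because only the differences between inputs matter, `softmax([1000.0, 1001.0, 1002.0])` gives the same probabilities as `softmax([0.0, 1.0, 2.0])`.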

 

Example

Input vector -> x = [2,4,1,3]

1. Apply the exponential function to each element

  • e^2 ≈ 7.3891
  • e^4 ≈ 54.5982
  • e^1 ≈ 2.7183
  • e^3 ≈ 20.0855

 

2. Compute the sum of the transformed values

  • total ≈ 7.3891 + 54.5982 + 2.7183 + 20.0855 ≈ 84.7911

 

3. Divide each value by the total to get the softmax values

  • y1 = 7.3891 / 84.7911 ≈ 0.087
  • y2 = 54.5982 / 84.7911 ≈ 0.644
  • y3 = 2.7183 / 84.7911 ≈ 0.032
  • y4 = 20.0855 / 84.7911 ≈ 0.237
  • y1 + y2 + y3 + y4 = 1
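The three steps above can be reproduced with a few lines of Python, using only the standard `math` module. This is a small check of the arithmetic rather than a general implementation. At full precision the total rounds to 84.7910, which differs in the last digit from adding the already-rounded terms.

```python
import math

x = [2, 4, 1, 3]

# Step 1: apply the exponential function to each element.
exps = [math.exp(v) for v in x]

# Step 2: sum the exponentiated values.
total = sum(exps)

# Step 3: divide each value by the total.
y = [e / total for e in exps]

print([round(e, 4) for e in exps])  # [7.3891, 54.5982, 2.7183, 20.0855]
print([round(p, 3) for p in y])     # [0.087, 0.644, 0.032, 0.237]
```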

 

Logistic regression performs binary classification, so it corresponds to the case where the number of outputs is fixed at 2.

 

In a deep learning classifier with a softmax output layer, the number of output units equals the number of classes. (Softmax can be thought of as a generalization of logistic regression.)
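One way to see the connection to logistic regression is to note that, with two classes, the first softmax output equals the sigmoid of the difference between the two scores. The sketch below checks this numerically. The scores `z1` and `z2` are hypothetical values chosen for illustration, and the helper names are my own.

```python
import math

def softmax(x):
    # Shift by the maximum for numerical stability; the result is unchanged.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    # The logistic function used by logistic regression.
    return 1.0 / (1.0 + math.exp(-z))

z1, z2 = 1.5, -0.5            # hypothetical scores for two classes
p = softmax([z1, z2])

# e^z1 / (e^z1 + e^z2) = 1 / (1 + e^-(z1 - z2)), so the two values agree.
print(p[0], sigmoid(z1 - z2))
```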

 

The softmax function is used together with the cross-entropy loss function, which measures the difference between the true labels and the predicted probabilities and is used to train the model.
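As a rough illustration of how the two fit together, the sketch below computes the cross-entropy loss for a single example with a one-hot label, in which case the loss reduces to the negative log of the probability assigned to the true class. The helper `cross_entropy` and the choice of true-class indices are assumptions made for this example only.

```python
import math

def softmax(x):
    # Shift by the maximum for numerical stability; the result is unchanged.
    m = max(x)
    exps = [math.exp(v - m) for v in x]
    total = sum(exps)
    return [e / total for e in exps]

def cross_entropy(probs, target_index):
    """Cross-entropy for one example with a one-hot label:
    the negative log-probability of the true class."""
    return -math.log(probs[target_index])

probs = softmax([2, 4, 1, 3])

# The loss is small when the true class received the highest score...
loss_if_true_is_second = cross_entropy(probs, 1)
# ...and large when the true class received the lowest score.
loss_if_true_is_third = cross_entropy(probs, 2)

print(loss_if_true_is_second, loss_if_true_is_third)
```

The loss shrinks toward 0 as the probability of the true class approaches 1, which is what training pushes the model toward.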

 

 

Softmax.ipynb

Colaboratory notebook

colab.research.google.com

 
