Understanding Smart Speakers (1/2)
- The Beginning of Voice Recognition-Based Artificial Intelligence
August 2018
김유신
Konkuk University
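Evolution of the ICT Industry
1983 (-35 yr) [ Computers and kits ]
1990 (-28 yr) [ PC communication (dial-up online services) ]
1995 (-23 yr) [ Internet ]
1997 (-21 yr) [ Internet ventures ]
2001 (-17 yr) [ Mobile Internet ]
2007 (-11 yr) [ Smartphones ]
2012 (-6 yr) [ IoT ]
2015 (-3 yr) [ Artificial intelligence ]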
Voice Recognition-Based AI Services
https://www.youtube.com/watch?v=yzK9nGwvTco (2:40)
Amazon Echo and the Alexa Service
Introduction to Amazon Echo
https://www.youtube.com/watch?v=FQn6aFQwBQU (2:53)
Amazon Echo’s Services
[Diagram labels: Service, Connectivity; center: Amazon Alexa Channel]
Build 1: Connected Home, Commerce, Music, Flash Briefing, Calendar/To-do list
Build 2 (adds): Book, Local Search, Connected Car, Sports, 85 new features, 3rd party Alexa Skills
Build 3 (adds): Movies/TV shows, Alexa Devices, “Where’s my stuff?”, Music Curation, “over 1000 skills”
Build 4 (adds): “more than 40,000 skills”
Voice Technology Innovation
[Source] https://developer.amazon.com/
Ecosystem of Skills
[Source] ‘Internet Trends 2017, 2018 by Mary Meeker’ and ‘Business Insider’
Engine for Voice Assistant Devices
[Source] https://developer.amazon.com/ and http://www.slideshare.net/firstmarkcap
Amazon Alexa Platform
[Source] http://www.slideshare.net/firstmarkcap
Voice UI and A.I.
UI/UX as Clue
Input
Output
0.3 Billion
1.3 Billion
?
Change of UI
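Machine: logical; objectivity/accuracy; structured; diversity; applied to a limited set of devices
Human: emotional; contextual relevance; unpredictable; curated; applied to a wide range of devices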
New Values
Natural language processing
Data Analytics
Knowledge Base
Machine Learning
AI Cloud
Context-aware learning
Cognition
Enjoyment/fun, empathy, time savings, friendship/relationships
“Changes in users’ lifestyles”
Technologies of Voice UI
[Source] https://lekta.ai/blog/talking-to-machines-more-naturally-than-ever-before-voice-interface-for-lekta-nlp/
Evolution of Speech Technologies
[Source] Nuance
Speech recognition, natural language understanding and text-to-speech
have all undergone statistical, cloud-based and neural network-driven
phases of development.
Outline of Voice Assistants
[Source] http://voicelabs.co/2017/01/15/the-2017-voice-report/
Apple Siri
https://www.youtube.com/watch?v=XSp0jbaSBZs (1:52)
Google Duplex
https://www.youtube.com/watch?v=D5VN56jQMWM (4:11)
Landscape of Intelligent Assistance
[Source] https://opusresearch.net/pdfs/IA_Bot_Landscape_Dec2017.pdf
Next Paradigm Shift
The KEY to Internet’s Future?
[Source] Internet Trends 2016 by Mary Meeker
“Why Amazon Echo, NOT the iPhone, may be the KEY to Internet’s Future”
(http://www.newsweek.com/why-amazon-echo-not-iphone-may-be-key-internets-future-465487)
Before the Intelligent Assistance
[Source] http://www.slideshare.net/neotevan/io-t-talkdemoday141222
Intelligent Assistant = Pseudo-Human
Physical World
Digital World
Sensor
Cloud
BigData
Analysis
Actuator
AR
Robot
Machine
Learning
NUI
✓ NUI + Network + AI
✓ Pseudo-Human, Human as a Service
Pebble Core with Alexa
https://www.youtube.com/watch?v=qYTfAXBPmro (0:39)
Apple AirPods with Siri
https://www.youtube.com/watch?v=SRkrZxNs7HE (3:25)
Enabler for Smart Home
https://www.youtube.com/watch?v=MJ3VdSXYEfk (1:57)
Issues
Future of Intelligent Voice Assistant
https://www.youtube.com/watch?v=w046TcaQu40 (4:22)
Speaker Recognition
[Source] http://www.kpvoice.com/page/sub2_1_8
✓ Voice Print
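The slide does not say how a voice print is matched. One common approach, shown here only as a minimal sketch, is to represent each speaker as a fixed-length feature vector and compare an enrolled vector with a new sample using cosine similarity. The function names, the example vectors, and the 0.8 threshold are all hypothetical; a real system would obtain the vectors from a trained speaker model and tune the threshold on data.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


def is_same_speaker(enrolled, sample, threshold=0.8):
    """Accept the sample if it is close enough to the enrolled voice print.

    The threshold is an illustrative assumption, not a recommended value.
    """
    return cosine_similarity(enrolled, sample) >= threshold


# Hypothetical 3-dimensional voice prints (real ones have many more dimensions).
enrolled_print = [0.9, 0.1, 0.3]
same_person = [0.85, 0.15, 0.35]
other_person = [0.1, 0.9, 0.2]

print(is_same_speaker(enrolled_print, same_person))
print(is_same_speaker(enrolled_print, other_person))
```

Raising the threshold reduces false accepts at the cost of rejecting the genuine speaker more often, which is the basic trade-off in any voice-print check.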
Domain Knowledge
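[Source] Institute for Information & Communications Technology Promotion (IITP), ‘Trends in Artificial Intelligence Technology for Language Processing’ (‘언어 처리를 위한 인공지능 기술 동향’)
✓ Context-awareness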
Albert Mehrabian's Communications Model
• 7% of message pertaining to feelings and attitudes is in the words that are spoken.
• 38% of message pertaining to feelings and attitudes is paralinguistic (the way that the words are said).
• 55% of message pertaining to feelings and attitudes is in facial expression.
[Source] http://jadenelsonseducationalstudiesblog.blogspot.kr/
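As a worked example of the percentages above, the sketch below treats them as channel weights and combines per-channel affect scores in the range -1 to 1. The function names and the renormalization over available channels are assumptions made for illustration; the model itself only states the three proportions, and only for messages about feelings and attitudes.

```python
# Channel weights as quoted on the slide (Mehrabian's 7% / 38% / 55%).
MEHRABIAN_WEIGHTS = {"words": 0.07, "tone": 0.38, "face": 0.55}


def coverage(channels, weights=MEHRABIAN_WEIGHTS):
    """Share of the affective message carried by the given channels."""
    return sum(weights[ch] for ch in channels)


def combined_affect(scores, weights=MEHRABIAN_WEIGHTS):
    """Weighted average of per-channel affect scores in [-1, 1].

    Channels missing from `scores` are skipped and the remaining weights
    are renormalized (an illustrative choice, not part of the model).
    """
    available = {ch: w for ch, w in weights.items() if ch in scores}
    if not available:
        raise ValueError("scores contains no known channel")
    total = sum(available.values())
    return sum(scores[ch] * w for ch, w in available.items()) / total


# A voice-only device observes words and tone but no facial expression.
voice_only_share = coverage(["words", "tone"])

# Positive words delivered in a negative tone with a negative expression.
all_channels = combined_affect({"words": 1.0, "tone": -1.0, "face": -1.0})

print(round(voice_only_share, 2), round(all_channels, 2))
```

Under these weights a voice-only assistant has access to 7% + 38% = 45% of the affective signal, and most of that comes from tone rather than from the recognized words.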
Tone Analyzer & Emotion Expression
MS Cortana’s 18 emotions
[Source] http://nordicapis.com/20-emotion-recognition-apis-that-will-leave-you-impressed-and-concerned/ and Microsoft Blog
IBM Watson’s Tone Analyzer
Privacy
✓ Hacking with inaudible commands
: ‘Dolphin attack’
: ‘Hidden in music’
[Source] https://thehackernews.com/2017/09/ai-digital-voice-assistants.html
AI_SmartSpeaker1_KKU_180813
