(→UI_Speech_Recognition_Function) |
| Line 117: | Line 117: | ||
==== UI_Touch_Sreen_Hand_Writing_Detection ==== | ==== UI_Touch_Sreen_Hand_Writing_Detection ==== | ||
==== UI_Speech_Recognition_Function ==== | ==== UI_Speech_Recognition_Function ==== | ||
| - | 1. Recognition Category | + | 1.Recognition Category |
| - | + | #discrete command - command, single digit (e.g. "dial", "1", "2", etc.) | |
| - | + | #continuous command - continuous digits (e.g. "dial 555-1212", "555-1212") | |
| - | + | #natural language understanding - flexible recognition (e.g. "I want to make a call to John", "please route to the San Francisco Regency Hyatt hotel", etc.) | |
| - | + | #dynamic vocabulary (e.g. phonebook, MP3 title) | |
| - | + | ##the number of phonebook entries and music titles is not fixed; users can add and remove entries freely, and the list is loaded dynamically for recognition. In particular, a G2P (Grapheme-To-Phoneme) technique is required to generate phonetic transcriptions for new words | |
| - | + | #VDE (Voice Destination Entry) | |
| - | + | ##Multi step (e.g. "MI", then "Troy", then "1307") | |
| - | + | ##One shot (e.g. "1307 Troy, MI") | |
| - | + | ||
| - | + | ||
| - | + | ||
| - | 2. Recognition Response time | + | 2.Recognition Response time |
| - | + | #discrete command - 300ms | |
| - | + | #continuous command - 1200ms | |
| - | + | #phone book & Music Title - 1200ms | |
| - | + | #natural language understanding, VDE - 1500ms | |
| - | 3. Recognition performance measurement | + | 3.Recognition performance measurement |
| - | + | #overall accuracy measurement | |
| - | + | ##Average Sentence Accuracy = (Total Number of Correct Sentences)/(Total Number of Sentences Attempted) | |
| - | + | #individual accuracy measurement | |
| - | + | ##Average Word Accuracy = ((Total number of attempts) - (insertions) - (deletions) - (substitutions))/Total number of attempts | |
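The two accuracy measures above can be sketched in a few lines of Python. This is only an illustration of the formulas as written; the function and parameter names are hypothetical and are not part of any recognizer API.

```python
def sentence_accuracy(correct_sentences, attempted_sentences):
    """Average Sentence Accuracy = correct sentences / sentences attempted."""
    if attempted_sentences <= 0:
        raise ValueError("attempted_sentences must be positive")
    return correct_sentences / attempted_sentences


def word_accuracy(attempts, insertions, deletions, substitutions):
    """Average Word Accuracy = (attempts - I - D - S) / attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return (attempts - insertions - deletions - substitutions) / attempts


if __name__ == "__main__":
    # 45 of 50 sentences recognized correctly
    print(sentence_accuracy(45, 50))
    # 200 word attempts with 2 insertions, 3 deletions, 5 substitutions
    print(word_accuracy(200, 2, 3, 5))
```

Because insertions are subtracted in the numerator, the word accuracy can drop below zero when the recognizer inserts many spurious words.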
| - | 4. | + | 4.Recognition performance requirement |
| - | + | #discrete command | |
| - | + | ##Idle, SNR > 20 dB (> 98%); moderate noise, SNR > 10 dB (> 95%); heavy noise, SNR > 6 dB (92%); SNR < 6 dB (rejection) | |
| - | + | #continuous command | |
| - | + | ##Idle, SNR > 20 dB (> 98%); moderate noise, SNR > 10 dB (> 95%); heavy noise, SNR > 6 dB (92%); SNR < 6 dB (rejection) | |
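As a rough sketch, the SNR tiers above can be expressed as a lookup that returns the accuracy target for a measured SNR. The function name is hypothetical. The list does not say what happens at exactly 6 dB, so treating that value as a rejection is an assumption of this sketch.

```python
def required_accuracy(snr_db):
    """Return the accuracy target for a given SNR in dB, or None for rejection.

    Tiers follow the requirement list: >20 dB, >10 dB, >6 dB, otherwise reject.
    An SNR of exactly 6 dB is treated as rejection (assumption, not in the source).
    """
    if snr_db > 20:
        return 0.98
    if snr_db > 10:
        return 0.95
    if snr_db > 6:
        return 0.92
    return None  # too noisy: the utterance should be rejected


if __name__ == "__main__":
    for snr in (25, 15, 8, 3):
        print(snr, required_accuracy(snr))
```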
| + | |||
| + | 5.Speech Recognition functionality | ||
| + | #confidence score | ||
| + | ##the recognized result can be accepted or rejected according to its confidence score, which represents, in terms of log likelihood, how confidently the result can be accepted. For example, assuming the confidence score ranges from 0 (low confidence) to 100 (high confidence), if the developer sets the confidence threshold to 40, a result is accepted only if its confidence score is greater than 40 | ||
| + | #grammar weight | ||
| + | ## | ||
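The confidence-score rule described under item 5 amounts to a simple threshold test. The sketch below assumes the 0 to 100 score range and the example threshold of 40 given there; the function name and the range check are illustrative and do not correspond to a specific recognizer API.

```python
def accept_result(confidence, threshold=40):
    """Accept a recognition result only if its score exceeds the threshold.

    Scores are assumed to lie in the range 0 (low) to 100 (high).
    """
    if not 0 <= confidence <= 100:
        raise ValueError("confidence must be between 0 and 100")
    return confidence > threshold


if __name__ == "__main__":
    for score in (10, 40, 41, 95):
        print(score, "accepted" if accept_result(score) else "rejected")
```

A score equal to the threshold is rejected here, matching the "greater than" wording of the requirement.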
=== Multiple_HMI_Languages === | === Multiple_HMI_Languages === | ||
This topic provides an outline of requirements for a head unit. The requirements are derived from a real-life development project for an in-vehicle infotainment platform. This is still a work in progress; content will be added gradually.
Some of the requirements below will be fulfilled outside of the MeeGo IVI based software. For example, the implementation of the CAN network interface and early audio functions most probably falls into this category. Decisions about implementing specific requirements in MeeGo IVI software will be made assuming a specific system architecture.
- objective testing: VDA 1.6
- subjective testing: in-vehicle live testing