(→Audio_Volume_Control) |
(→UI_Speech_Recognition_Function) |
||
| (18 intermediate revisions not shown) | |||
| Line 57: | Line 57: | ||
# Volume level set by the user is preserved through shutdown cycle. | # Volume level set by the user is preserved through shutdown cycle. | ||
## Upon start up, the previous volume level is only restored up to a defined 'start up maximum' (as contrasted to the 'absolute maximum' that can be set after the head unit is up and running in its normal mode). | ## Upon start up, the previous volume level is only restored up to a defined 'start up maximum' (as contrasted to the 'absolute maximum' that can be set after the head unit is up and running in its normal mode). | ||
| - | # Under limit volume need to be set as some low level NOT minimum just to make sure that at | + | # The lower limit must be set to a low but audible level, NOT the minimum, so that the user can still hear the audio even if the volume was set to minimum (effectively "MUTE") before the power cycle. |
# The upper limit must be set to a high level, NOT the maximum, so that the user is not annoyed by a previous maximum volume setting. | # The upper limit must be set to a high level, NOT the maximum, so that the user is not annoyed by a previous maximum volume setting. | ||
| + | # In an emergency case, the volume should not be adjustable by the user; it must be fixed at a level loud enough that the sound is not masked by ambient background noise. | ||
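The start-up rules above amount to clamping the saved volume into a start-up range, with a fixed level in the emergency case. A minimal sketch, assuming an integer volume scale; the function name and all numeric levels are hypothetical and not taken from the requirements:

```python
def startup_volume(saved_volume, startup_min=5, startup_max=20,
                   emergency=False, emergency_level=28):
    """Return the volume to apply at start-up.

    All default levels are illustrative assumptions, not specified values.
    """
    if emergency:
        # Emergency case: the user setting is ignored and a fixed loud level is used.
        return emergency_level
    # Restore the saved level, but only within [startup_min, startup_max].
    return max(startup_min, min(saved_volume, startup_max))
```

With these assumed defaults, a saved level of 0 ("MUTE") is raised to the start-up minimum and a saved level above the start-up maximum is lowered to it.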
=== Handfree Functionality === | === Handfree Functionality === | ||
| Line 116: | Line 117: | ||
==== UI_Touch_Sreen_Hand_Writing_Detection ==== | ==== UI_Touch_Sreen_Hand_Writing_Detection ==== | ||
==== UI_Speech_Recognition_Function ==== | ==== UI_Speech_Recognition_Function ==== | ||
| - | 1. Recognition Category | + | 1.Recognition Category |
| - | + | #discrete command - single command or single digit (e.g. "dial", "1", "2", etc.) |
| - | + | #continuous command - continuous digits (e.g. "dial 555-1212", "555-1212") |
| - | + | #natural language understanding - flexible recognition (e.g. "I want to make a call to John", "please route to the San Francisco Regency Hyatt hotel", etc.) |
| - | + | #Dynamic vocabulary (e.g. phonebook, MP3 Title) |
| + | ##The number of phonebook entries and music titles is not fixed. Users can freely add and remove entries, and the list is loaded dynamically for recognition. In particular, a G2P (Grapheme-To-Phoneme) technique is required to generate the phonetic transcription of new words. | ||
| + | #VDE (Voice Destination Entry) | ||
| + | ##Multi step (e.g. "MI", then "Troy", then "1307") | ||
| + | ##One shot (e.g. "1307 Troy, MI") | ||
| - | 2. Recognition Response time | + | 2.Recognition Response time |
| - | + | #discrete command - 300ms | |
| - | + | #continuous command - 1200ms | |
| - | + | #phone book & Music Title - 1200ms | |
| + | #natural language understanding, VDE - 1500ms | ||
| - | 3. Recognition performance measurement | + | 3.Recognition performance measurement |
| - | + | #overall accuracy measurement | |
| - | + | ##Average Sentence Accuracy = (Total Number of Correct Sentences)/(Total Number of Sentences Attempted) |
| - | + | #individual accuracy measurement | |
| - | + | ##Average Word Accuracy = ((Total number of attempts) - (insertions) - (deletions) - (substitutions))/Total number of attempts | |
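The two accuracy measures above can be computed directly from test counts. A minimal sketch; the function and parameter names are hypothetical, and the counts are assumed to come from an external scoring step that is not shown:

```python
def sentence_accuracy(correct_sentences, attempted_sentences):
    """Average sentence accuracy = correct sentences / attempted sentences."""
    if attempted_sentences <= 0:
        raise ValueError("attempted_sentences must be positive")
    return correct_sentences / attempted_sentences


def word_accuracy(attempts, insertions, deletions, substitutions):
    """Average word accuracy = (attempts - I - D - S) / attempts."""
    if attempts <= 0:
        raise ValueError("attempts must be positive")
    return (attempts - insertions - deletions - substitutions) / attempts
```

For example, 200 attempted words with 2 insertions, 3 deletions and 5 substitutions give a word accuracy of 190/200.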
| - | 4. | + | 4.Recognition performance requirement |
| - | + | #discrete command |
| - | + | ##Idle, SNR > 20 dB (> 98%); moderately noisy, SNR > 10 dB (> 95%); very noisy, SNR > 6 dB (92%); SNR < 6 dB (rejection) |
| - | + | #continuous command |
| - | + | ##Idle, SNR > 20 dB (> 98%); moderately noisy, SNR > 10 dB (> 95%); very noisy, SNR > 6 dB (92%); SNR < 6 dB (rejection) |
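The SNR bands above can be represented as a small lookup table, for example when checking test results against the requirement. A minimal sketch; the names are hypothetical, and because the requirement does not say what happens at exactly 6 dB, that case is treated as rejection here as an assumption:

```python
# (SNR lower bound in dB, minimum required accuracy), highest band first.
ACCURACY_BANDS = [(20.0, 0.98), (10.0, 0.95), (6.0, 0.92)]


def required_accuracy(snr_db):
    """Return the minimum accuracy for the SNR band, or None for rejection."""
    for lower_bound_db, accuracy in ACCURACY_BANDS:
        if snr_db > lower_bound_db:
            return accuracy
    # Below 6 dB the result is rejected; exactly 6 dB is unspecified in the
    # requirement and is treated as rejection here (assumption).
    return None
```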
| + | |||
| + | 5.Speech Recognition functionality | ||
| + | #confidence score | ||
| + | ##A recognized result is accepted or rejected according to its confidence score, which expresses, in terms of log likelihood, how reliably the result can be accepted. For example, assuming the confidence score ranges from 0 (low confidence) to 100 (high confidence) and the developer sets the confidence threshold to 40, a result is accepted only if its confidence score is greater than 40. | ||
| + | #grammar weight | ||
| + | ##If a command is recognized with poor accuracy compared to the other candidates in the grammar, its weight can be adjusted to equalize the results, for example ( 1.1 dial | 0.9 store | 1.0 one | 1.0 two | 1.0 three | 0.9 four | .... | 1.1 oh ). Here 1.1 means more weight, 1.0 is unity gain, and 0.9 means less weight than unity gain. | ||
| + | #SNR rejection | ||
| + | ##The amount of ambient noise can be measured by the SNR (Signal-to-Noise Ratio). If the SNR is lower than a specific threshold, the input is likely to be too corrupted by noise, so it is better to reject the recognized result. | ||
| + | ##Even when the SNR is below the level needed for a reliable result, the result can still be accepted if the confidence score is extremely high, depending on the OEM specification. | ||
| + | #Talk too soon | ||
| + | ##If the user starts the utterance before the start beep has been played back, the beginning of the utterance tends to be chopped off, so there is a high possibility that the result is misrecognized. Such a result can still be accepted if its confidence score is high enough. | ||
| + | #AGC (Automatic Gain Control) | ||
| + | ##Speaking style differs widely from user to user: some voices are very low and soft, others are strong and loud, and even the same user may speak at different levels from time to time. AGC is therefore needed to bring the level up for a soft voice and down for a loud voice. Whether this function is used depends on the requirements. AGC usually has no impact on accuracy, but it is recommended for better usability. | ||
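The confidence score and SNR rejection rules above can be combined into one accept/reject decision. A minimal sketch, assuming the 0 to 100 confidence scale and the threshold of 40 from the example; the function name, the 6 dB default and the `override_confidence` parameter (the OEM-specific "extremely high" level) are hypothetical:

```python
def accept_result(confidence, snr_db, confidence_threshold=40,
                  snr_threshold_db=6.0, override_confidence=None):
    """Decide whether a recognized result is accepted.

    override_confidence: optional OEM-specific confidence level at which a
    result is accepted even though the SNR is below the rejection threshold.
    """
    if snr_db < snr_threshold_db:
        # Too noisy: reject unless the OEM allows a high-confidence override.
        return (override_confidence is not None
                and confidence >= override_confidence)
    # Normal case: accept only if the score is greater than the threshold.
    return confidence > confidence_threshold
```

The same override pattern could be applied to the "talk too soon" case, where a chopped utterance is accepted only at a high confidence score.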
=== Multiple_HMI_Languages === | === Multiple_HMI_Languages === | ||
| Line 154: | Line 173: | ||
== Navigation == | == Navigation == | ||
=== Navigation_Engine === | === Navigation_Engine === | ||
| + | # map source | ||
| + | # routing | ||
| + | # searching (local / web service) | ||
| + | # geocoding / reverse geocoding | ||
| + | |||
=== Map_Function === | === Map_Function === | ||
# database | # database | ||
| - | # 2D/3D representation | + | # 2D/3D representation (pitch) |
| + | # zooming (IN / OUT / auto-zoom) | ||
| + | # panning (NORTH / SOUTH / EAST / WEST and combinations of these four, e.g. NORTH-WEST) | ||
| + | # orientation / northing (ON - map oriented north / OFF - map oriented in travelling direction) | ||
| + | # follow GPS signal (ON - map cursor follows GPS signal / OFF - map cursor does not follow GPS signal / timeout - number of updates to wait before the cursor follows the GPS signal on the map; useful for panning) | ||
| + | # set / clear destination | ||
| + | # center the map | ||
| + | # set/change map layout (day / night / detailed - POIs / plain simple) | ||
| + | # bookmarks (see also Destination_Import) | ||
| + | # OSD (on-screen display) information | ||
| + | # set units: metric / imperial / ... | ||
=== Real_Time_Traffic_Information === | === Real_Time_Traffic_Information === | ||
| Line 163: | Line 197: | ||
=== Points_of_Interestes_POIs === | === Points_of_Interestes_POIs === | ||
=== Destination_Import === | === Destination_Import === | ||
| + | # from file (e.g. USB) | ||
| + | # from web server / web service | ||
| + | # free input | ||
| + | # coordinate units conversion | ||
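The list above does not say which coordinate units are meant. One common case when importing destinations is converting degrees/minutes/seconds to decimal degrees; the sketch below assumes that case, and the function name and argument layout are hypothetical:

```python
def dms_to_decimal(degrees, minutes, seconds, hemisphere):
    """Convert degrees/minutes/seconds plus hemisphere to decimal degrees.

    hemisphere is one of "N", "S", "E", "W"; south and west are negative.
    """
    if hemisphere not in ("N", "S", "E", "W"):
        raise ValueError("hemisphere must be one of N, S, E, W")
    sign = -1 if hemisphere in ("S", "W") else 1
    return sign * (degrees + minutes / 60 + seconds / 3600)
```

For example, 42° 30' 0" N becomes 42.5 and 83° 15' 0" W becomes -83.25.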
== Network == | == Network == | ||
This topic provides an outline of requirements for a head unit. The requirements are derived from a real life development project for in-vehicle infotainment platform. This is still work in progress--the content will be added gradually.
Some of the requirements below will be fulfilled outside of the MeeGo IVI based software. For example, the implementation of CAN network interface and early audio functions most probably falls into this category. The decisions about implementation of specific requirements in MeeGo IVI software will be made assuming a specific system architecture.
- objective testing: VDA 1.6
- subjective testing: in-vehicle live testing