The National Highway Traffic Safety Administration (NHTSA) estimates that in the USA alone approximately 100,000 crashes each year are caused primarily by driver drowsiness or fatigue. The major cause of inattentiveness has been found to be a deficit in what we call in this paper an extended view of ergonomics, i.e. the "extended ergonomics status" of the driving process. This deficit is multidimensional, encompassing drowsiness (sleepiness), fatigue (lack of energy) and emotions/stress (for example sadness, anger, joy, pleasure, despair and irritation). Emotions are generally measured by analyzing head movement patterns, eyelid movements, facial expressions, or a combination of these. For emotion recognition, visual sensing of facial expressions is helpful but not always sufficient. Additional information that can be collected in a non-intrusive manner is therefore needed to increase robustness.
We find acoustic information to be appropriate for this purpose, provided the driver produces vocal signals by speaking, shouting, crying, etc. In this paper, appropriate visual and acoustic features of the driver are identified through experimental analysis. For both the visual and the acoustic features, Linear Discriminant Analysis (LDA) is used for dimensionality reduction and the Hausdorff distance is used for emotion classification. Performance is evaluated on the Vera am Mittag (VAM) emotion recognition database. We propose a decision-level fusion technique that combines visual sensing of facial expressions with pattern recognition from voice.
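The pipeline described above (per-modality LDA projection, Hausdorff-distance classification, decision-level fusion) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the feature matrices are synthetic stand-ins (the actual features would come from face images and speech), and the fusion rule shown here, a simple sum of normalised distance scores across modalities, is one plausible choice among several decision-level schemes.

```python
import numpy as np
from scipy.spatial.distance import directed_hausdorff
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis

rng = np.random.default_rng(0)
classes = 3  # e.g. three emotion categories

def make_modality(n_per_class, n_feat, shift):
    # Synthetic stand-in for one modality's feature matrix:
    # each class is a Gaussian cloud with a class-dependent mean.
    X = np.vstack([rng.normal(loc=c * shift, scale=1.0, size=(n_per_class, n_feat))
                   for c in range(classes)])
    y = np.repeat(np.arange(classes), n_per_class)
    return X, y

def hausdorff(A, B):
    # Symmetric Hausdorff distance between two point sets.
    return max(directed_hausdorff(A, B)[0], directed_hausdorff(B, A)[0])

def classify(class_sets, sample):
    # Nearest class template by Hausdorff distance; `sample` is a
    # small set of LDA-projected feature vectors for one observation.
    dists = {c: hausdorff(pts, sample) for c, pts in class_sets.items()}
    return min(dists, key=dists.get), dists

# "Visual" and "acoustic" training features (hypothetical dimensions).
Xv, yv = make_modality(30, 10, shift=2.0)
Xa, ya = make_modality(30, 8, shift=2.0)

# LDA projects each modality to at most (classes - 1) dimensions.
lda_v = LinearDiscriminantAnalysis(n_components=classes - 1).fit(Xv, yv)
lda_a = LinearDiscriminantAnalysis(n_components=classes - 1).fit(Xa, ya)
proj_v = {c: lda_v.transform(Xv[yv == c]) for c in range(classes)}
proj_a = {c: lda_a.transform(Xa[ya == c]) for c in range(classes)}

# One test observation (a few frames / speech segments) from class 1.
test_v = lda_v.transform(rng.normal(loc=2.0, scale=1.0, size=(5, 10)))
test_a = lda_a.transform(rng.normal(loc=2.0, scale=1.0, size=(5, 8)))
pred_v, dv = classify(proj_v, test_v)
pred_a, da = classify(proj_a, test_a)

def fuse(dv, da):
    # Decision-level fusion: smallest summed, per-modality-normalised
    # distance score wins.
    nv, na = sum(dv.values()), sum(da.values())
    scores = {c: dv[c] / nv + da[c] / na for c in dv}
    return min(scores, key=scores.get)

print("visual:", pred_v, "acoustic:", pred_a, "fused:", fuse(dv, da))
```

In this sketch each emotion class is represented by the set of its LDA-projected training vectors, so a test observation (itself a small point set) can be matched to the class whose set is closest in the Hausdorff sense; fusion then reconciles the two modalities at the decision level rather than by concatenating features.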