Text this: A HYBRID APPROACH IN ACHIEVING ACCURACY AND ROBUSTNESS FOR BIMODAL EMOTION DETECTION THROUGH SPEECH AND FACIAL EXPRESSIONS