A parallel-model speech emotion recognition network based on feature clustering