Support speech-to-text for prompt input and text-to-speech for responses