In addition to displaying or playing the various media that can be sent over text message (text, voice, GIF, image, video, etc.) in the contact card conversations window, include a "media type" field in the API data and allow different data types to be sent back through SMS.
With AI being where it is, we could have AI transcribe audio, understand a video, respond with a gif, etc., but only if we can detect the type and process the data properly.
Right now, with an AI bot, if someone sends a voice SMS, message.body can't be populated, so nothing makes it to the workflows.
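To illustrate the request, here is a rough sketch of how a workflow could branch on a media type if the API exposed one. Everything in it is an assumption made for illustration: the `attachments` list, the `media_type` and `url` fields, and the step names are hypothetical and do not exist in the current API. Only `message.body` comes from what is available today.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Attachment:
    # Hypothetical fields; not part of the current API.
    media_type: str  # MIME type, e.g. "audio/amr", "image/gif", "video/mp4"
    url: str         # where the workflow could fetch the media


@dataclass
class InboundMessage:
    body: Optional[str] = None  # empty today when someone sends a voice SMS
    attachments: List[Attachment] = field(default_factory=list)


def route(message: InboundMessage) -> List[str]:
    """Return the (hypothetical) workflow steps to run for an inbound message."""
    steps: List[str] = []
    if message.body:
        steps.append("reply_to_text")
    for attachment in message.attachments:
        # The top-level MIME type is enough to pick a handler.
        kind = attachment.media_type.split("/", 1)[0]
        if kind == "audio":
            steps.append("transcribe_audio")
        elif kind == "video":
            steps.append("describe_video")
        elif kind == "image":
            steps.append("describe_image")
        else:
            steps.append("store_unknown_media")
    return steps


# A voice SMS with no text body would still reach a workflow step.
voice = InboundMessage(
    attachments=[Attachment("audio/amr", "https://example.com/voice.amr")]
)
print(route(voice))
```

With something like this, a voice message would no longer be dropped just because message.body is empty; the workflow would see the attachment and could hand it to a transcription step.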
Allow multiple SMS media types both in the conversation window and through the API.