First of all, excellent update with the retention of WhatsApp media files and the “Add to documents” option. This solves a very big problem.
After testing the new behavior of the Conversations API, here's what I'm seeing:
Incoming WhatsApp images now correctly populate attachments [] with a downloadable URL
The files are stored and accessible (the 1-year retention works great)
Incoming WhatsApp voice notes (audio/ogg, Opus codec) still arrive as:
ContentType: text/plain
attachments [] = []
No URL or reference to the audio file
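
To make the difference concrete, here is a rough sketch of the two message shapes, reduced to the fields mentioned above. The exact key names, the assumption that attachments [] holds plain URL strings, and the example.com URL are placeholders for illustration, not actual API output.

```python
# Illustrative payloads only: key casing, the list-of-URL-strings shape,
# and the URL itself are assumptions, not captured API responses.
image_message = {
    "attachments": ["https://example.com/media/photo.jpg"],  # placeholder URL
}
voice_note_message = {
    "ContentType": "text/plain",
    "attachments": [],
}


def has_downloadable_media(message: dict) -> bool:
    """Return True when the message carries at least one attachment entry."""
    return bool(message.get("attachments"))


print(has_downloadable_media(image_message))       # True
print(has_downloadable_media(voice_note_message))  # False
```

With the current behavior, any workflow that branches on a check like this treats a voice note as a plain text message with no media.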
There is currently a lack of parity:
Images and documents → working
WhatsApp voice notes → not yet mapped to attachments []
Since ingestion and storage of the media are already implemented, this seems to be the last step needed to unlock:
Speech-to-text transcription (workflows with AI)
Audio-based support and sales flows
Full automation of WhatsApp inbound, at the same level as other channels
Request
Please expose WhatsApp voice notes in the same way as images:
Fill attachments [] with a URL (or attachmentID)
Keep the original audio format (ogg/opus is correct)
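
If voice notes were exposed this way, picking them out on the consumer side could be as simple as the sketch below. It assumes that attachments [] contains URL strings and that the URL path keeps an audio file extension. Both are guesses about a response shape that does not exist yet, and the function name and URLs are hypothetical.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Extensions commonly used for Ogg/Opus audio files.
AUDIO_SUFFIXES = {".ogg", ".oga", ".opus"}


def voice_note_urls(message: dict) -> list[str]:
    """Return attachment URLs whose path ends in an Ogg/Opus extension.

    Assumes `attachments` is a list of URL strings (hypothetical shape).
    """
    urls = []
    for url in message.get("attachments") or []:
        # Look only at the URL path so query strings (e.g. tokens) are ignored.
        suffix = PurePosixPath(urlparse(url).path).suffix.lower()
        if suffix in AUDIO_SUFFIXES:
            urls.append(url)
    return urls


example = {
    "attachments": [
        "https://example.com/media/note.ogg?token=abc",  # placeholder URL
        "https://example.com/media/photo.jpg",           # placeholder URL
    ]
}
print(voice_note_urls(example))
```

The returned URLs could then be handed to a speech-to-text step. If the API returned an attachmentID or a MIME type per attachment instead, the filter would key on that rather than on the file extension.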
This would immediately unlock many AI-powered use cases and remove the last blocker for advanced automations on WhatsApp.
Thanks for the great work. This is already very close 🙌