Last updated: February 2026
ConvertAudioToText provides a RESTful API for programmatic access to our transcription and media processing services. This disclosure outlines how the API works, what third-party services are involved, and how your data is handled during API interactions.
Our API relies on the following third-party services to deliver functionality:
When you submit a file or URL through the API, your media is processed through an automated pipeline. Audio is extracted from video files, then transcribed by our speech-to-text engine. The full response, including word-level timestamps, speaker labels, and confidence scores, is stored in our database and made available through the API.
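As an illustration of working with a stored result, here is a minimal Python sketch that reads word-level timestamps, speaker labels, and confidence scores from a response. The field names (`words`, `text`, `start`, `end`, `speaker`, `confidence`) and the sample payload are assumptions made for this example and are not the documented response schema.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Word:
    text: str
    start: float       # assumed: offset in seconds from the start of the audio
    end: float         # assumed: offset in seconds from the start of the audio
    speaker: str       # assumed: speaker label assigned by the pipeline
    confidence: float  # assumed: score between 0.0 and 1.0


def parse_words(response: dict) -> List[Word]:
    """Convert the (hypothetical) `words` array of a response into Word objects."""
    return [
        Word(
            text=w["text"],
            start=float(w["start"]),
            end=float(w["end"]),
            speaker=w["speaker"],
            confidence=float(w["confidence"]),
        )
        for w in response.get("words", [])
    ]


def text_by_speaker(words: List[Word]) -> Dict[str, str]:
    """Join each speaker's words, in order of appearance, into one string."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for w in words:
        grouped[w.speaker].append(w.text)
    return {speaker: " ".join(texts) for speaker, texts in grouped.items()}


def low_confidence(words: List[Word], threshold: float = 0.8) -> List[Word]:
    """Return words whose confidence score falls below the threshold."""
    return [w for w in words if w.confidence < threshold]


# Hypothetical payload, shaped after the fields described above.
SAMPLE_RESPONSE = {
    "words": [
        {"text": "Hello", "start": 0.0, "end": 0.4, "speaker": "A", "confidence": 0.98},
        {"text": "there", "start": 0.4, "end": 0.7, "speaker": "A", "confidence": 0.62},
        {"text": "Hi", "start": 1.1, "end": 1.3, "speaker": "B", "confidence": 0.91},
    ]
}
```

Flagging low-confidence words this way is one possible approach to choosing which passages to review by hand. The 0.8 threshold is arbitrary.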
API usage is subject to the following rate limits:
API access requires authentication with either a JWT or an API key. JWTs are obtained through the login endpoint and expire after a configurable period. API keys are long-lived credentials available to users on paid plans, with granular scope controls and per-key usage tracking.
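Because JWTs expire, a client may want to check a token before sending it. The Python sketch below reads the standard `exp` claim from a JWT payload without verifying the signature, and builds request headers for either credential type. The `Authorization: Bearer` scheme for JWTs and the `X-API-Key` header name are assumptions made for this example, as are the function names. The actual header names should be taken from the API reference.

```python
import base64
import json
import time
from typing import Dict, Optional


def jwt_expiry(token: str) -> Optional[float]:
    """Return the `exp` claim (seconds since the epoch), or None if unreadable.

    This only decodes the payload; it does NOT verify the signature.
    """
    try:
        payload_b64 = token.split(".")[1]
        # base64url payloads omit padding, so restore it before decoding.
        padded = payload_b64 + "=" * (-len(payload_b64) % 4)
        payload = json.loads(base64.urlsafe_b64decode(padded))
    except (IndexError, ValueError):
        return None
    if not isinstance(payload, dict):
        return None
    exp = payload.get("exp")
    return float(exp) if isinstance(exp, (int, float)) else None


def is_expired(token: str, now: Optional[float] = None, leeway: float = 30.0) -> bool:
    """Treat a token as expired if `exp` is missing or within `leeway` seconds."""
    exp = jwt_expiry(token)
    if exp is None:
        return True
    current = time.time() if now is None else now
    return current + leeway >= exp


def auth_headers(jwt: Optional[str] = None, api_key: Optional[str] = None) -> Dict[str, str]:
    """Build request headers for one credential type (header names are assumed)."""
    if jwt is not None:
        return {"Authorization": f"Bearer {jwt}"}  # assumed scheme for JWTs
    if api_key is not None:
        return {"X-API-Key": api_key}  # hypothetical header name
    raise ValueError("either a JWT or an API key is required")
```

A client could call `is_expired` before each request and go back to the login endpoint for a fresh token when it returns `True`. An API key needs no such check because it is long-lived.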
Transcription results are retained as long as your account is active. You can delete individual transcriptions or your entire account at any time. For unauthenticated tool usage, all data is automatically deleted within hours of processing.
For questions about API usage or this disclosure, contact us at support@convertaudiototext.com.