XA to VOX Converter

Transform Maxis XA game audio into Dialogic VOX telephony audio

Telephony Output

Move Maxis game audio into the telephony world — convert XA to VOX for voice systems and telecom applications.

Browser-Based Tool

No game modding tools or audio extractors needed. Convert XA files directly in your web browser on any device.

Secure Processing

Uploaded XA files are deleted immediately after conversion. Output files are purged within 24 hours.

How to convert XA to VOX

1. Select files from your computer, Google Drive, Dropbox, or a URL, or drag them onto the page.

2. Choose VOX, or any other output format you need (more than 200 formats are supported).

3. Let the file convert, then download your VOX file.

About formats

XA is a proprietary audio format developed by Maxis, the Electronic Arts studio behind SimCity and The Sims, first appearing with SimCity 3000 in the late 1990s. The format is a variant of EA ADPCM (Adaptive Differential Pulse-Code Modulation) tailored for game audio: it delivers acceptable sound quality at small file sizes so that music and effects can coexist with large game assets. Rather than storing absolute sample values, XA encoding stores the difference between each sample and a value predicted from the preceding samples, then quantizes those differences into a small, fixed number of bits. This approach yields significant compression while keeping decoding computationally cheap, an important consideration for games that dedicate most CPU resources to rendering and simulation. One practical advantage for developers was that XA files could be streamed from disc during gameplay without stalling the main loop, enabling continuous background music in an era when memory was scarce. The format continued in use across SimCity 4, The Sims, and other Maxis titles through the early 2000s. XA audio can be extracted and converted with tools like FFmpeg and with dedicated game-asset extractors built by the modding community. For game preservationists, XA remains a commonly encountered format when unpacking the assets of classic Maxis titles.
Initial release: 1997
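
The differential idea described above can be illustrated with a short sketch. This is a generic fixed-step DPCM example, not the actual Maxis XA codec: real EA ADPCM uses prediction filters and per-block scaling that are not reproduced here, and the function names, the step size of 256, and the sample values are assumptions chosen for illustration.

```python
def dpcm_encode(samples, step=256, bits=4):
    """Quantize sample-to-sample differences into signed `bits`-wide codes.

    Illustrative only: a fixed step size stands in for the adaptive
    scaling that a real ADPCM codec would use.
    """
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1  # -8..7 for 4 bits
    codes = []
    predicted = 0  # the encoder tracks what the decoder will reconstruct
    for sample in samples:
        diff = sample - predicted
        code = max(lo, min(hi, round(diff / step)))  # quantize and clamp
        codes.append(code)
        predicted += code * step  # mirror the decoder to avoid drift
    return codes


def dpcm_decode(codes, step=256):
    """Rebuild samples by accumulating the de-quantized differences."""
    samples = []
    predicted = 0
    for code in codes:
        predicted += code * step
        samples.append(predicted)
    return samples


original = [0, 300, 900, 1500, 1400, 800]  # hypothetical 16-bit samples
codes = dpcm_encode(original)
restored = dpcm_decode(codes)
print(codes)
print(restored)
```

Each code fits in 4 bits instead of 16, and the decoder needs only an addition and a multiplication per sample, which is why this family of codecs is cheap to play back. The restored values are close to the originals but not identical, which is the lossy part of the trade-off.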
VOX is a headerless audio format built around Dialogic ADPCM encoding, widely adopted in telephony, interactive voice response (IVR) systems, and voice mail platforms since the 1980s. Each audio sample is compressed into 4 bits using an algorithm developed by Oki Electric and implemented in hardware on Dialogic Corporation's telephony interface cards. VOX files typically use a sampling rate of 6000 or 8000 Hz, producing extremely compact recordings optimized for speech intelligibility rather than musical fidelity. Because the format carries no header, playback software must know the sample rate and encoding parameters in advance, a trade-off that reduces overhead but demands careful file management. The primary advantage of VOX is storage efficiency: a one-minute voice recording at 8 kHz occupies roughly 240 KB, making it practical for systems storing thousands of prompts. Despite the shared ADPCM name, Dialogic ADPCM is not the ITU-T G.726 codec and the two are not interchangeable, so VOX files must be handled by tools that specifically support the Dialogic/OKI variant. Even as modern call centers migrate to IP-based systems with codecs like Opus, vast libraries of VOX recordings persist in legacy IVR deployments and compliance archives worldwide.
Initial release: 1983
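
The storage figure above follows directly from 4 bits per sample. The sketch below works through that arithmetic and also shows why a headerless file needs the right sample rate: the same byte count maps to different durations depending on the rate assumed. The function names are hypothetical, and mono audio is assumed.

```python
BITS_PER_SAMPLE = 4  # Dialogic ADPCM packs two samples into each byte


def vox_size_bytes(duration_seconds, sample_rate=8000):
    """Approximate size of a mono VOX recording of the given length."""
    return int(duration_seconds * sample_rate * BITS_PER_SAMPLE) // 8


def vox_duration_seconds(size_bytes, sample_rate=8000):
    """Duration implied by a file size, for an assumed sample rate."""
    return size_bytes * 8 / (BITS_PER_SAMPLE * sample_rate)


print(vox_size_bytes(60, 8000))            # 240000 bytes, roughly 240 KB
print(vox_size_bytes(60, 6000))            # 180000 bytes
print(vox_duration_seconds(240000, 8000))  # 60.0 seconds
print(vox_duration_seconds(240000, 6000))  # 80.0 seconds if the rate is misjudged
```

A file recorded at 8000 Hz but played back as 6000 Hz would run a third longer and sound lower in pitch, so the sample rate has to be tracked alongside the file itself.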

Frequently Asked Questions

Why convert XA to VOX?

VOX is the audio format used by Dialogic telephony hardware. Converting XA to VOX lets you load game audio into IVR and call center systems that expect VOX input.

What can open VOX files?

Dialogic telephony hardware, Asterisk, and SoX play VOX files.
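
As a rough sketch of what opening a VOX file with SoX can look like, the helper below builds a command line that converts a VOX file to WAV. It assumes SoX is installed locally and that the recording was made at 8000 Hz; the helper name and the file names are hypothetical. The sample rate is passed explicitly because a headerless VOX file cannot supply it.

```python
import shlex


def sox_vox_to_wav_command(vox_path, wav_path, sample_rate=8000):
    """Build (but do not run) a SoX command that decodes VOX to WAV.

    -t vox  tells SoX the input is Dialogic/OKI ADPCM
    -r      supplies the sample rate, which the file itself does not store
    """
    return ["sox", "-t", "vox", "-r", str(sample_rate), vox_path, wav_path]


cmd = sox_vox_to_wav_command("prompt.vox", "prompt.wav")
print(shlex.join(cmd))
```

The resulting list can be handed to `subprocess.run(cmd, check=True)` on a machine where SoX is available. If the audio sounds too fast or too slow, the assumed sample rate is the first thing to check.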

What is the Maxis XA format?

XA is a proprietary audio format used for music and sound effects in Maxis games such as SimCity 3000, SimCity 4, and early The Sims titles.

Can I extract all audio from a Maxis game?

Yes, as long as the audio is available as XA files. Upload the XA files from your Maxis game directory and convert them to any modern format for listening or preservation.

Is the conversion quality-preserving?

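The converter decodes the XA audio data and re-encodes it in the target format. VOX is a lossy 4-bit format at telephony sample rates (typically 6000 or 8000 Hz), so converting XA to VOX reduces fidelity, most noticeably for music. For lossless targets, no quality is lost beyond what the XA compression already removed.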