IRCAM to SPH Converter

Repackage IRCAM research audio as SPH online

IRCAM to SPH

Convert audio from the academic IRCAM format to SPH so that research recordings can be used in speech recognition datasets.

Accessible Everywhere

Convert IRCAM files without installing Csound or academic audio tools. Process your research audio from any modern browser.

Quick Results

IRCAM files convert to SPH rapidly on our cloud servers. Upload your research audio and receive the output promptly.

How to convert IRCAM to SPH

1. Select files from your computer, Google Drive, Dropbox, or a URL, or drag them onto the page.
2. Choose sph, or any other format you need, as the output (more than 200 formats are supported).
3. Let the file convert, then download your sph file.

About formats

IRCAM sound files originate from the Institut de Recherche et Coordination Acoustique/Musique, one of the world's foremost computer music laboratories, founded by composer Pierre Boulez in Paris. The format was created in the early 1980s to serve the research needs of IRCAM and has since been adopted by academic and artistic communities working at the intersection of science and sound. An IRCAM file begins with a 1024-byte header containing a magic number, sample rate, channel count, and an encoding type field that supports linear PCM (16/32-bit integer and 32-bit float), mu-law, and A-law variants. The header block also accommodates free-form annotation text, allowing researchers to embed experiment metadata directly in the audio file. Because the payload is uncompressed by default, recordings maintain full fidelity through successive analysis and resynthesis cycles, which is essential in psychoacoustic experimentation. Software such as Csound, libsndfile, and SoX reads and writes the format natively. Key advantages include a fixed-size, well-defined header that is simple to parse, support for the floating-point samples needed in scientific DSP work, and deep roots in the computer music community that keep tooling available.
Developer: IRCAM
Initial release: 1983
SPH is the file extension for audio stored in the NIST SPHERE (SPeech HEader REsources) format, a standard created by the U.S. National Institute of Standards and Technology around 1990. Built for speech research, SPH files carry an ASCII header, typically 1024 bytes long, packed with metadata such as database identifiers, channel counts, sample rates, byte ordering, and compression type, making every recording self-describing. The underlying audio is typically 16-bit linear PCM sampled at 16 kHz, though other configurations are permitted. Researchers at NIST, DARPA, and universities worldwide rely on SPH for distributing speech corpora such as TIMIT, Switchboard, and the LDC collections that underpin modern automatic speech recognition systems. A key advantage is that the human-readable header lets scripts parse recording metadata without binary decoding. The format's strict standardization also reduces ambiguity when sharing datasets across institutions and platforms. When an SPH file stores uncompressed PCM, it preserves full audio fidelity, which is critical when training acoustic models where even small artifacts can skew results.
Initial release: 1990
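
To illustrate how a script can read SPH metadata without binary decoding, here is a minimal Python sketch. It assumes the common SPHERE header layout: a `NIST_1A` line, a header-size line, then `name -type value` fields terminated by `end_head`. The function name `parse_sph_header` and the sample header are hypothetical and shown for illustration only; real corpus files may carry more fields than this sketch expects.

```python
# Minimal sketch of parsing a NIST SPHERE (SPH) ASCII header.
# Assumed layout: "NIST_1A", a header-size line, "name -type value"
# fields, then "end_head". Names here are illustrative, not a library API.

def parse_sph_header(data: bytes) -> dict:
    """Return the declared header size and the typed header fields."""
    if not data.startswith(b"NIST_1A\n"):
        raise ValueError("not a NIST SPHERE header")
    # The second line holds the header size in bytes (commonly 1024).
    header_size = int(data[8:16].strip())
    text = data[:header_size].decode("ascii", errors="replace")

    fields = {}
    for line in text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        if not line:
            continue
        name, type_tag, value = line.split(None, 2)
        if type_tag == "-i":            # integer field
            fields[name] = int(value)
        elif type_tag == "-r":          # real (floating-point) field
            fields[name] = float(value)
        elif type_tag.startswith("-s"): # string field, "-sN" gives its length
            fields[name] = value[: int(type_tag[2:])]
        else:
            raise ValueError(f"unknown field type: {type_tag}")
    return {"header_size": header_size, "fields": fields}


# Hypothetical header for a mono, 16 kHz, 16-bit little-endian recording.
SAMPLE_HEADER = (
    b"NIST_1A\n   1024\n"
    b"channel_count -i 1\n"
    b"sample_rate -i 16000\n"
    b"sample_n_bytes -i 2\n"
    b"sample_byte_format -s2 01\n"
    b"sample_coding -s3 pcm\n"
    b"end_head\n"
).ljust(1024, b" ")

info = parse_sph_header(SAMPLE_HEADER)
print(info["fields"]["sample_rate"], info["fields"]["sample_coding"])
```

Because the header is plain text, the same fields can also be inspected by opening the first kilobyte of a file in any text viewer.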

Frequently Asked Questions

Why convert IRCAM to SPH?

SPH is the container format used by many speech research corpora. Converting IRCAM research audio to SPH makes it usable in speech recognition datasets and the tools built around them.

What opens SPH files?

NIST's SPHERE tools, sph2pipe, SoX, and HTK can read SPH files directly, and Kaldi recipes commonly read them through sph2pipe.

What is IRCAM format?

IRCAM is a specialized academic audio format from the Institut de Recherche et Coordination Acoustique/Musique in Paris, used in computational musicology and acoustic research.

Is quality preserved in the conversion?

The conversion faithfully transfers audio from IRCAM to SPH. Output quality depends on the target format encoding settings you choose.
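
For SPH output, the encoding settings are written into the header as plain-text fields. The Python sketch below shows what a lossless target might look like: mono, 16-bit, little-endian linear PCM. It is an illustration under assumptions, not a description of how this converter works internally. The function name `build_sph_mono16` is hypothetical, and the field set is a commonly seen minimum rather than a complete list.

```python
# Minimal sketch: wrap 16-bit mono PCM samples in a NIST SPHERE container.
# The header fields below carry the "encoding settings" of the output file.
# Names and the chosen field set are illustrative assumptions.
import struct

HEADER_SIZE = 1024  # common SPHERE header size, declared on line 2


def build_sph_mono16(samples, sample_rate=16000) -> bytes:
    """Return header plus little-endian 16-bit PCM payload as bytes."""
    lines = [
        "NIST_1A",
        f"{HEADER_SIZE:>7}",
        "channel_count -i 1",
        f"sample_count -i {len(samples)}",
        f"sample_rate -i {sample_rate}",
        "sample_n_bytes -i 2",        # 16-bit samples
        "sample_byte_format -s2 01",  # "01" marks little-endian byte order
        "sample_coding -s3 pcm",      # uncompressed linear PCM
        "end_head",
    ]
    header = ("\n".join(lines) + "\n").encode("ascii")
    if len(header) > HEADER_SIZE:
        raise ValueError("header fields do not fit in the header block")
    header = header.ljust(HEADER_SIZE, b" ")  # pad to the declared size
    payload = struct.pack(f"<{len(samples)}h", *samples)
    return header + payload


blob = build_sph_mono16([0, 1000, -1000], sample_rate=8000)
print(len(blob))  # header bytes plus two bytes per sample
```

If the source IRCAM file holds 32-bit float samples, writing 16-bit PCM as above reduces precision, which is the sense in which output quality depends on the chosen encoding.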

Can I convert multiple IRCAM files?

Upload several IRCAM files and batch-convert them all to SPH at once — efficient for processing research audio collections.