AV1 to SPH Converter

Extract NIST Sphere audio from AV1 video online


Speech Research Standard

SPH is the format used by major speech corpora. Extracting the audio track from an AV1 video and saving it as SPH prepares it for linguistic research and analysis.

Corpus Compatible

SPH files integrate with standard speech research tools like Kaldi, HTK, and NIST scoring utilities.

Private Files

AV1 uploads are erased right after conversion, and SPH outputs are deleted within 24 hours.

How to convert AV1 to SPH

1. Select files from your computer, Google Drive, Dropbox, or a URL, or drag them onto the page.

2. Choose sph or any other output format you need (more than 200 formats are supported).

3. Let the file convert, then download your sph file as soon as it is ready.
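
For readers who prefer to run the same kind of conversion locally, the steps above can be sketched in Python. This is not how the online converter works internally. It is a minimal sketch that assumes the ffmpeg and sox command-line tools are installed, that the ffmpeg build can decode the input, and that the SoX build includes NIST SPHERE support. The function names build_commands and convert, and the file names talk.mkv and talk.sph, are hypothetical.

```python
import subprocess
from pathlib import Path


def build_commands(video_path, sph_path, sample_rate=16000):
    """Return the two command lines: extract audio to WAV, then WAV to SPH."""
    wav_path = Path(sph_path).with_suffix(".wav")  # intermediate file
    extract = [
        "ffmpeg", "-y",
        "-i", str(video_path),
        "-vn",                    # drop the AV1 video stream
        "-ac", "1",               # mono
        "-ar", str(sample_rate),  # resample, 16 kHz by default
        "-c:a", "pcm_s16le",      # 16-bit linear PCM
        str(wav_path),
    ]
    # SoX picks the output format from the .sph extension.
    to_sph = ["sox", str(wav_path), str(sph_path)]
    return extract, to_sph


def convert(video_path, sph_path, sample_rate=16000):
    """Run both steps; raises CalledProcessError if either tool fails."""
    for command in build_commands(video_path, sph_path, sample_rate):
        subprocess.run(command, check=True)


# Build (but do not run) the commands for a hypothetical input file.
extract_cmd, sph_cmd = build_commands("talk.mkv", "talk.sph")
print(" ".join(extract_cmd))
print(" ".join(sph_cmd))
```

Calling convert("talk.mkv", "talk.sph") would run both commands in order. Pass sample_rate=8000 for telephone-style data.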

About formats

AV1 (AOMedia Video 1) is an open, royalty-free video coding format developed by the Alliance for Open Media, a consortium whose founding members include Google, Mozilla, Microsoft, Amazon, Netflix, and Intel, among others. The specification was finalized in June 2018 with the goal of providing a next-generation video codec that surpasses the compression efficiency of H.264 and HEVC while remaining free from licensing fees. In commonly cited comparisons, AV1 achieves roughly 30% better compression than HEVC and around 50% better than H.264 at equivalent visual quality, although results vary with the encoder and the content. That makes it particularly attractive for streaming platforms seeking to reduce bandwidth costs without sacrificing viewer experience. The codec supports a broad range of features including film grain synthesis, flexible tiling for parallel processing, content-adaptive resolution switching, and a rich set of intra and inter prediction modes. Hardware decoding support has expanded rapidly across mobile processors, GPUs, and smart TVs, addressing early concerns about the codec's computational demands. AV1 has seen wide adoption from major streaming services for delivering 4K and HDR content, and it can be carried in containers such as WebM, Matroska, and MP4 for web-based playback. The royalty-free status makes AV1 especially important for open web standards and accessible media distribution.
Initial release: June 25, 2018
SPH is the file extension for audio stored in the NIST SPHERE (SPeech HEader REsources) format, a standard created by the U.S. National Institute of Standards and Technology around 1990. Built for speech research, SPH files begin with an ASCII header, typically 1024 bytes long, that holds metadata such as database identifiers, channel counts, sample rates, byte ordering, and sample coding, making every recording self-describing. The underlying audio is commonly 16-bit linear PCM sampled at 8 kHz or 16 kHz, though other configurations, including mu-law encoding and Shorten compression, are permitted. Researchers at NIST, DARPA, and universities worldwide rely on SPH for distributing speech corpora such as TIMIT, Switchboard, and other LDC collections that underpin modern automatic speech recognition systems. A key advantage is that the human-readable header lets scripts parse recording metadata without binary decoding. The format's standardization also reduces ambiguity when sharing datasets across institutions and platforms. When SPH files store uncompressed PCM, they preserve full audio fidelity, which matters when training acoustic models where even small artifacts can skew results.
Initial release: 1990
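
To illustrate the self-describing header, here is a minimal Python sketch of a header reader. It assumes the usual SPHERE layout: an 8-byte NIST_1A line, an 8-byte line giving the header size, then "name type value" lines ending with end_head. The function name parse_sphere_header and the demo header contents are hypothetical, and the sketch does not decode the audio samples.

```python
import io


def parse_sphere_header(stream):
    """Read a NIST SPHERE header from a binary stream.

    Returns (header_size, fields) and leaves the stream positioned
    at the first byte of audio data.
    """
    if stream.read(8) != b"NIST_1A\n":
        raise ValueError("not a NIST SPHERE file")
    # The second 8-byte line holds the total header size in bytes.
    header_size = int(stream.read(8).decode("ascii").strip())
    body = stream.read(header_size - 16).decode("ascii", errors="replace")

    fields = {}
    for line in body.split("\n"):
        line = line.strip()
        if line == "end_head":
            break
        if not line:
            continue
        name, type_tag, value = line.split(None, 2)
        if type_tag == "-i":      # integer field
            fields[name] = int(value)
        elif type_tag == "-r":    # real-valued field
            fields[name] = float(value)
        else:                     # "-sN": string of N bytes
            fields[name] = value
    return header_size, fields


# Build a small in-memory header to exercise the parser.
demo_fields = (
    "channel_count -i 1\n"
    "sample_rate -i 16000\n"
    "sample_n_bytes -i 2\n"
    "sample_byte_format -s2 01\n"
    "sample_coding -s3 pcm\n"
    "end_head\n"
)
demo_header = ("NIST_1A\n" + "   1024\n" + demo_fields).ljust(1024).encode("ascii")
size, fields = parse_sphere_header(io.BytesIO(demo_header + b"\x00\x00"))
print(size, fields["sample_rate"], fields["sample_coding"])  # 1024 16000 pcm
```

With a real file, open it in binary mode and pass the file object in place of the BytesIO buffer. The sample_coding field shows whether the data is plain PCM or needs further decoding.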

Frequently Asked Questions

Why convert AV1 to SPH?

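SPH (NIST Sphere) is the standard format for speech research corpora and is used by linguistic datasets like TIMIT and Switchboard. Because AV1 is a video codec, the conversion extracts the audio track stored alongside it in the file and saves it in a form that speech tools can read directly.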

What opens SPH files?

NIST Sphere tools, Kaldi, HTK, and SoX handle SPH files. The format is standard in academic speech and language research.

Is SPH the same as NIST?

Not exactly. NIST is the institute that defined the Sphere format, and SPH is the file extension for audio stored in it. The terms are often used interchangeably in speech research contexts.

What sample rate is typical?

Most speech corpora use 8 kHz or 16 kHz mono: 8 kHz is standard for telephone speech, and 16 kHz is standard for wideband speech recognition data.

Is the conversion secure?

AV1 uploads are deleted immediately. SPH outputs are removed within 24 hours.