MKV to SPH Converter

Extract SPHERE audio from MKV for speech datasets

Maximum file size: 1 GB.

Research Standard

SPH (NIST SPHERE) is a long-established format for speech research corpora. Extract MKV audio in a format that speech recognition toolkits such as Kaldi and HTK can read.

Corpus Building

Convert multiple MKV files to SPH at once. Efficient for assembling large speech datasets from video recordings.

Confidential Data

All MKV uploads are deleted after processing. SPH results are purged within 24 hours — sensitive speech data stays private.

How to convert MKV to SPH

1. Select files from your computer, Google Drive, Dropbox, or a URL, or drag them onto the page.

2. Choose sph as the output format, or any other format you need (more than 200 formats are supported).

3. Let the conversion finish, then download your sph file.

About formats

MKV (Matroska Video) is an open-standard multimedia container format developed by the Matroska project, which announced the format in December 2002. Named after the Russian matryoshka nesting dolls, the format is built on the Extensible Binary Meta Language (EBML), a binary format modeled on XML that provides a flexible and forward-compatible structure. MKV can hold a virtually unlimited number of video, audio, and subtitle tracks within a single file, supporting video codecs such as H.264, HEVC, VP9, and AV1, and audio codecs such as AAC, FLAC, Opus, and DTS. A standout feature is comprehensive subtitle support, handling formats from simple SRT text to styled ASS subtitles and bitmap-based PGS tracks from Blu-ray discs. MKV also supports chapter markers, attachments (such as fonts needed for styled subtitles), and tagging metadata, making it one of the most feature-rich containers available. The open specification lets any developer implement MKV reading and writing without licensing fees, which has driven widespread adoption across media players, streaming tools, and encoding software. The ability to encapsulate almost any codec combination in a single, well-organized file has made MKV a popular container for high-quality video distribution, archival, and personal media libraries.
Developer: Matroska
Initial release: December 6, 2002
SPH is the file extension for audio stored in the NIST SPHERE (SPeech HEader REsources) format, a standard created by the U.S. National Institute of Standards and Technology around 1990. Built for speech research, SPH files begin with a plain-ASCII header, typically 1024 bytes long, that records metadata such as database identifiers, channel count, sample rate, byte order, and sample coding, so every recording is self-describing. The audio is typically 16-bit linear PCM sampled at 16 kHz, though other configurations are permitted, including 8 kHz mu-law telephone speech and Shorten-compressed data. Researchers at NIST, DARPA, and universities worldwide rely on SPH for distributing speech corpora such as TIMIT, Switchboard, and other LDC collections that underpin modern automatic speech recognition systems. A key advantage is that the human-readable header lets scripts read recording metadata without decoding the audio. The fixed header layout also reduces ambiguity when sharing datasets across institutions and platforms. When an SPH file stores uncompressed PCM, as is typical, it preserves full audio fidelity, which matters when training acoustic models where even small artifacts can skew results.
Initial release: 1990
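
To illustrate how a script can read this header without touching the audio, here is a minimal Python sketch. It assumes the common SPHERE layout: a first line `NIST_1A`, a second line giving the header size in bytes, then one `name -type value` field per line (`-i` integer, `-r` real, `-sN` string) up to an `end_head` marker. The function name `parse_sphere_header` and the sample bytes are hypothetical, made up for this example; real corpora may carry additional fields.

```python
def parse_sphere_header(data: bytes) -> dict:
    """Parse the ASCII header at the start of a SPHERE file into a dict."""
    first_lines = data.split(b"\n", 2)
    if len(first_lines) < 3 or first_lines[0].strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE file")

    # The second line declares the total header size in bytes (usually 1024).
    header_size = int(first_lines[1].strip())
    header_text = data[:header_size].decode("ascii", errors="replace")

    fields = {}
    for line in header_text.split("\n")[2:]:
        line = line.strip()
        if line == "end_head":
            break
        parts = line.split(None, 2)  # name, type code, value
        if len(parts) != 3:
            continue
        name, type_code, value = parts
        if type_code == "-i":
            fields[name] = int(value)
        elif type_code == "-r":
            fields[name] = float(value)
        else:  # "-sN": a string of N characters
            fields[name] = value
    return fields


# Hypothetical header bytes, padded with spaces to the declared size.
sample_header = (
    b"NIST_1A\n   1024\n"
    b"channel_count -i 1\n"
    b"sample_rate -i 16000\n"
    b"sample_n_bytes -i 2\n"
    b"sample_byte_format -s2 01\n"
    b"sample_coding -s3 pcm\n"
    b"end_head\n"
).ljust(1024, b" ")

header_fields = parse_sphere_header(sample_header)
```

For a real file, passing the first few kilobytes read in binary mode should be enough, since the audio samples begin only after the declared header size.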

Frequently Asked Questions

Why convert MKV to SPH?

SPH (SPHERE) is the NIST format for speech research corpora. Many Linguistic Data Consortium releases and other major speech databases are distributed in it, so tools and recipes built around those corpora often expect SPH input.

What reads SPH files?

The NIST SPHERE software package, HTK, Kaldi (typically through the sph2pipe utility), SoX, and other academic speech processing frameworks can read SPH files.

Is SPH used in AI training?

Yes — SPHERE is widely used for speech recognition training data. Many foundational ASR datasets are distributed in SPH format.

Does SPH contain metadata?

Yes — SPH files include a text header with sample rate, channel count, encoding type, and other metadata useful for automated processing.
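
As a rough illustration of what that text header looks like on disk, the Python sketch below builds a small SPH byte string from 16-bit PCM samples. It rests on several assumptions: a 1024-byte header padded with spaces, little-endian samples (`sample_byte_format` of `01`), `sample_count` counted per channel, and only the handful of fields shown. The function name `build_sphere` is hypothetical, and this is not how this converter or any particular tool writes its files.

```python
import struct


def build_sphere(samples, sample_rate=16000, channels=1):
    """Return bytes for a minimal SPHERE file holding 16-bit PCM samples."""
    fields = [
        ("channel_count", "-i", str(channels)),
        ("sample_count", "-i", str(len(samples) // channels)),
        ("sample_rate", "-i", str(sample_rate)),
        ("sample_n_bytes", "-i", "2"),
        ("sample_byte_format", "-s2", "01"),  # assumed: low byte first
        ("sample_coding", "-s3", "pcm"),
    ]
    lines = ["NIST_1A", "   1024"]
    lines += [" ".join(field) for field in fields]
    lines.append("end_head")

    header = ("\n".join(lines) + "\n").encode("ascii")
    if len(header) > 1024:
        raise ValueError("header does not fit in 1024 bytes")
    header = header.ljust(1024, b" ")  # pad to the declared header size

    # Pack the samples as little-endian signed 16-bit integers.
    body = struct.pack("<%dh" % len(samples), *samples)
    return header + body


sph_bytes = build_sphere([0, 1000, -1000])
```

Because the header is plain text, opening the resulting bytes in a text viewer shows the sample rate and channel count directly, followed by the binary audio data.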

Can I batch-process MKV files?

Yes — upload multiple MKV recordings and extract SPH audio from all of them. Ideal for building speech datasets from video sources.