Speech Recognition + Translation

Complete the entire workflow from speech to bilingual subtitles in one click.

What Is Speech Recognition + Translation

This feature runs two steps back to back: speech recognition first generates source-language subtitles, which are then automatically translated into the target language. It is ideal when you need bilingual subtitles quickly.

Output: Source-language SRT + Target-language SRT.
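For readers who post-process the output files, the following minimal Python sketch shows the general layout of an SRT file: a running index, a time range in HH:MM:SS,mmm form, and the subtitle text, with a blank line between entries. The names Cue, srt_timestamp, and to_srt are illustrative assumptions and are not part of the application.

```python
from dataclasses import dataclass


@dataclass
class Cue:
    """One subtitle entry (hypothetical helper type, not from the application)."""
    start: float  # start time in seconds
    end: float    # end time in seconds
    text: str


def srt_timestamp(seconds: float) -> str:
    """Format a time in seconds as an SRT timestamp, HH:MM:SS,mmm."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d},{ms:03d}"


def to_srt(cues: list[Cue]) -> str:
    """Serialize cues as SRT text: index, time range, text, blank separator."""
    blocks = []
    for index, cue in enumerate(cues, start=1):
        time_range = f"{srt_timestamp(cue.start)} --> {srt_timestamp(cue.end)}"
        blocks.append(f"{index}\n{time_range}\n{cue.text}\n")
    return "\n".join(blocks)
```

Under this sketch, a bilingual result is simply two such files that share the same indices and time ranges, one holding the source-language text and one holding the translated text.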

How to Use

  1. Import videos. Drag video files into the media library, or click the import button to add videos.
  2. Switch to the "Speech Recognition + Translation" tab. Click the "Speech Recognition + Translation" tab in the left settings panel.
  3. Choose source and target languages. Set the source language (the language spoken in the video) and the target language (the language you want to translate into).
  4. Choose a translation engine. Select a suitable translation engine based on your needs (refer to the engine comparison table on the Translation page).
  5. Click "Run Speech Recognition + Translation". Click the Run button at the bottom to start processing.
  6. Confirm the translation direction popup. A confirmation dialog will appear asking you to verify the source and target languages.
  7. View bilingual results. After processing is complete, click "Open Subtitle Editor" to view the generated bilingual subtitles.

Two Processing Modes

You can choose a processing mode in "More Settings":

Standard Mode (Default)

Processes the entire video at once for maximum efficiency. Best for situations where you are not in a rush to see results.

Quick Preview Mode

Processes in 2-minute segments. You can view results as soon as the first 2 minutes are done, while the rest continues processing in the background. Best when you want to see the output as soon as possible.
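To make the idea of 2-minute segments concrete, here is a minimal Python sketch that splits a video's duration into consecutive 120-second ranges. It is only an illustration: the function name segment_bounds is an assumption, and the application may choose its cut points differently (for example, at pauses in speech).

```python
def segment_bounds(duration_s: float, segment_s: float = 120.0) -> list[tuple[float, float]]:
    """Return consecutive (start, end) ranges in seconds covering the duration.

    Every range is segment_s long except possibly the last, which is shorter
    when the duration is not an exact multiple of segment_s.
    """
    bounds = []
    start = 0.0
    while start < duration_s:
        end = min(start + segment_s, duration_s)
        bounds.append((start, end))
        start = end
    return bounds
```

For a 5-minute video this yields three ranges: 0 to 120 s, 120 to 240 s, and 240 to 300 s. In a preview-style workflow, the first range would be processed and shown while the remaining ranges are still in progress.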

Advanced Settings

Setting                            Description
Recognition Model                  Same as the Speech Recognition page; choose a recognition model
Custom Translation Instructions    Same as the Translation page; add terminology and style requirements
Max Characters per Line (Source)   Controls the maximum length of a single transcribed subtitle line
Max Characters per Line (Target)   Controls the maximum length of a single translated subtitle line to keep bilingual lines aligned
AI Punctuation Correction (PRO)    Corrects punctuation from speech recognition
Variety Show Mode                  Optimized for variety shows / reality TV
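As a rough illustration of what a maximum-characters-per-line limit does, the Python sketch below breaks a long subtitle into pieces no longer than the limit, using the standard library's textwrap module. The function name split_line is an assumption, and the application's own splitting logic is not documented here; it may, for instance, split at punctuation or handle languages without spaces differently.

```python
import textwrap


def split_line(text: str, max_chars: int) -> list[str]:
    """Split text at whitespace into pieces of at most max_chars characters.

    Words longer than max_chars are kept whole rather than cut mid-word,
    so such a piece can exceed the limit.
    """
    return textwrap.wrap(text, width=max_chars, break_long_words=False)


pieces = split_line("the quick brown fox jumps over the lazy dog", 15)
# pieces == ["the quick brown", "fox jumps over", "the lazy dog"]
```

Because textwrap only breaks at whitespace, this sketch suits space-delimited languages; text in languages such as Chinese or Japanese would need a different splitting rule.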

Difference from Running Separately

"Speech Recognition + Translation" is equivalent to running "Speech Recognition" first and then "Translation", but it is more convenient because it completes the entire workflow with a single click.

The advantage of running them separately is that you can review and edit the source-language subtitles before translating, which is better suited for scenarios where source subtitle quality is critical.
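The difference between the two approaches can be pictured as a pipeline with an optional review step between recognition and translation. The Python sketch below is purely conceptual: recognize, translate, and review are hypothetical callables standing in for the application's features, not real functions it exposes.

```python
from typing import Callable, Optional

Subtitles = list[str]


def recognize_and_translate(
    video: str,
    recognize: Callable[[str], Subtitles],
    translate: Callable[[Subtitles], Subtitles],
    review: Optional[Callable[[Subtitles], Subtitles]] = None,
) -> tuple[Subtitles, Subtitles]:
    """Run recognition, then translation, with an optional edit step between.

    With review=None this mirrors the one-click workflow. Passing a review
    function mirrors running the two features separately and correcting
    the source subtitles before they are translated.
    """
    source = recognize(video)
    if review is not None:
        source = review(source)
    target = translate(source)
    return source, target
```

The point of the sketch is that any correction made in the review step flows into the translation, which is why running the features separately suits cases where source subtitle quality is critical.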

FAQ

Which is better, Standard Mode or Quick Preview Mode?

Standard Mode is more efficient and recommended for most scenarios. Quick Preview Mode is ideal for long videos or when you want to see the output as soon as possible.

Can I get only the source-language subtitles without translation?

Yes. Use the "Speech Recognition" tab, which generates source-language subtitles only and does not translate them.