Japanese transcription
Lightweight Vosk Japanese model on CPU — tuned for short clips, voice notes, and stream segments up to 90 seconds.
Transcribe Japanese audio and optionally translate to English. Built for stream overlays, mobile apps, and backend automation.
curl -X POST https://api.jptranscribe.com/v1/transcribe \
-H "Authorization: Bearer YOUR_API_KEY" \
-F "file=@recording.wav" \
-F "language=ja" \
-F "translate=true"
Features
Accurate Japanese ASR with optional English output. No ML ops — send audio, get text.
Add the form field translate=true to get English captions alongside or instead of the Japanese text.
Upload a file or send a URL. JSON in, JSON out. Bearer token auth. Standard HTTP status codes.
Download our OpenAPI 3 schema and generate clients in Python, TypeScript, Go, or any language.
HTTPS only. API keys per project. Audio deleted after processing — not used for training.
Try the endpoints in Swagger UI, or upload audio on the Try it page.
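As a rough sketch of the "send a URL, JSON in, JSON out" path, the script below posts a JSON body and branches on the HTTP status code. The JSON field names ("url", "language", "translate") and the JPT_API_KEY environment variable are assumptions made for illustration, not confirmed by this page; the OpenAPI schema is the place to check the real request shape.

```shell
#!/usr/bin/env bash
# Sketch only: submit a public HTTPS URL instead of uploading a file.
# ASSUMPTION: the JSON field names below are hypothetical.
# ASSUMPTION: JPT_API_KEY is a hypothetical env var holding your key.

API_URL="https://api.jptranscribe.com/v1/transcribe"

# Map an HTTP status code to a coarse outcome (standard HTTP semantics).
classify_status() {
  case "$1" in
    2??)     echo "ok" ;;
    401|403) echo "auth_error" ;;
    429)     echo "rate_limited" ;;
    4??)     echo "client_error" ;;
    5??)     echo "server_error" ;;
    *)       echo "unknown" ;;
  esac
}

# Build the JSON request body. $1 = audio URL, $2 = translate flag (true|false).
# Note: no JSON escaping is done, so the URL must not contain double quotes.
build_url_payload() {
  printf '{"url":"%s","language":"ja","translate":%s}' "$1" "$2"
}

# POST the payload, print the response body, return non-zero unless 2xx.
transcribe_url() {
  local body_file status
  body_file="$(mktemp)"
  status="$(curl -sS -o "$body_file" -w '%{http_code}' -X POST "$API_URL" \
    -H "Authorization: Bearer ${JPT_API_KEY}" \
    -H "Content-Type: application/json" \
    -d "$(build_url_payload "$1" "${2:-false}")")"
  cat "$body_file"
  rm -f "$body_file"
  [ "$(classify_status "$status")" = "ok" ]
}
```

Usage would look like: transcribe_url "https://example.com/clip.wav" true. A failed connection leaves the status empty, which is classified as "unknown" and reported as a failure.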
Use cases
Pipe audio chunks from OBS or your app into JP Transcribe. Display English overlays for international viewers.
Transcribe Japanese voice memos in productivity apps. Export searchable text or translated summaries.
Batch-process uploaded media via REST. Integrate with your CMS or automation pipeline.
Convert Japanese field recordings to text for analysis.
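For the batch use case, a minimal sketch might loop over a folder of recordings and save one JSON response per file. It reuses the endpoint and form fields from the curl example at the top of the page; the JPT_API_KEY variable, the directory layout, and the one-second pause between requests are assumptions chosen for illustration.

```shell
#!/usr/bin/env bash
# Sketch only: transcribe every .wav file in a directory, one request per file.
# ASSUMPTION: JPT_API_KEY is a hypothetical env var holding your key.

API_URL="https://api.jptranscribe.com/v1/transcribe"

# Derive the output path for an audio file: clips/memo.wav -> out/memo.json
output_path_for() {
  local base
  base="$(basename "$1")"
  echo "$2/${base%.*}.json"
}

# Upload one file ($1) and write the JSON response to $2.
# --fail makes curl return non-zero on HTTP errors (4xx/5xx).
transcribe_file() {
  curl -sS --fail -X POST "$API_URL" \
    -H "Authorization: Bearer ${JPT_API_KEY}" \
    -F "file=@$1" \
    -F "language=ja" \
    -F "translate=${3:-false}" \
    -o "$2"
}

# Process all .wav files in $1, writing results into $2.
batch_transcribe() {
  local in_dir="$1" out_dir="$2" f
  mkdir -p "$out_dir"
  for f in "$in_dir"/*.wav; do
    [ -e "$f" ] || continue          # glob matched nothing
    if transcribe_file "$f" "$(output_path_for "$f" "$out_dir")"; then
      echo "ok: $f"
    else
      echo "failed: $f" >&2
    fi
    sleep 1                          # crude pacing for fair-use rate limits
  done
}
```

Usage would look like: batch_transcribe ./recordings ./transcripts. Failures are logged and skipped rather than aborting the run, so one bad file does not block the rest of the batch.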
How it works
One click on get-key. No email, no credit card. Unlimited minutes.
POST a WAV, MP3, M4A, FLAC, or OGG file up to 10 MB (90 sec max), or pass a public HTTPS URL.
Response includes Japanese text and optional English translation.
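Because uploads are capped at 10 MB and 90 seconds, it can help to check a file locally before sending it. The sketch below is one way to do that. It assumes "10 MB" means decimal megabytes (the conservative reading), includes OGG because the FAQ lists it, and only checks duration when ffprobe (from FFmpeg) happens to be installed. The function names are illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: local pre-flight checks against the documented upload limits.

MAX_BYTES=10000000   # ASSUMPTION: "10 MB" read as decimal megabytes
MAX_SECONDS=90

# Succeeds if the file extension is a listed format (case-insensitive).
has_supported_extension() {
  case "$(printf '%s' "${1##*.}" | tr '[:upper:]' '[:lower:]')" in
    wav|mp3|m4a|flac|ogg) return 0 ;;
    *)                    return 1 ;;
  esac
}

# Succeeds if the file is no larger than MAX_BYTES.
within_size_limit() {
  local size
  size="$(wc -c < "$1" | tr -d ' ')"
  [ "$size" -le "$MAX_BYTES" ]
}

# Succeeds if the clip is no longer than MAX_SECONDS.
# Skips the check (succeeds) when ffprobe is not installed.
within_duration_limit() {
  command -v ffprobe >/dev/null 2>&1 || return 0
  local secs
  secs="$(ffprobe -v error -show_entries format=duration \
    -of default=noprint_wrappers=1:nokey=1 "$1")" || return 1
  # Durations are decimals such as "12.345000", so compare with awk.
  awk -v d="$secs" -v max="$MAX_SECONDS" 'BEGIN { exit !(d <= max) }'
}

# Run all checks; print the first failure to stderr.
preflight() {
  has_supported_extension "$1" || { echo "unsupported format: $1" >&2; return 1; }
  within_size_limit "$1"       || { echo "file exceeds size limit: $1" >&2; return 1; }
  within_duration_limit "$1"   || { echo "clip exceeds ${MAX_SECONDS} seconds: $1" >&2; return 1; }
}
```

Usage would look like: preflight recording.wav && curl ... with the upload command shown at the top of the page. The server remains the authority on limits; this only avoids obviously doomed uploads.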
Generate a free key in seconds. Full docs and OpenAPI spec are live.
FAQ
WAV, MP3, M4A, FLAC, and OGG. Files up to 10 MB, clips up to 90 seconds. The service runs on lightweight CPU hardware; no GPU is required.
Vosk small Japanese model on CPU. Best for clear speech in short clips. Accuracy may drop on noisy or long audio.
REST batch API is available at launch. WebSocket streaming for sub-second live captions is on the roadmap — contact us if you need beta access.
Audio is processed in memory and deleted after the request completes. We do not use customer audio to train models.
Completely free — unlimited minutes, fair-use rate limits only. See pricing.
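Since the service is rate-limited on a fair-use basis, a client may want to back off and retry when a request is throttled. The sketch below assumes the API signals throttling with HTTP 429, which is the standard code but is not confirmed on this page. The JPT_API_KEY variable, the response.json output file, and the delay values are likewise illustrative.

```shell
#!/usr/bin/env bash
# Sketch only: retry an upload with exponential backoff when throttled.
# ASSUMPTION: throttled requests return HTTP 429.
# ASSUMPTION: JPT_API_KEY is a hypothetical env var holding your key.

API_URL="https://api.jptranscribe.com/v1/transcribe"

# Delay in seconds before retrying after attempt N (1-based):
# base * 2^(N-1), capped at a maximum.
backoff_delay() {
  local attempt="$1" base="${2:-1}" cap="${3:-30}" delay
  delay=$(( base * (1 << (attempt - 1)) ))
  if [ "$delay" -gt "$cap" ]; then
    delay="$cap"
  fi
  echo "$delay"
}

# Upload file $1, retrying up to $2 times (default 4) on HTTP 429.
# Writes the response body to response.json; succeeds only on 2xx.
post_with_retry() {
  local file="$1" max_attempts="${2:-4}" attempt=1 status=""
  while [ "$attempt" -le "$max_attempts" ]; do
    status="$(curl -sS -o response.json -w '%{http_code}' -X POST "$API_URL" \
      -H "Authorization: Bearer ${JPT_API_KEY}" \
      -F "file=@${file}" \
      -F "language=ja")"
    [ "$status" = "429" ] || break   # only throttling is retried
    sleep "$(backoff_delay "$attempt")"
    attempt=$(( attempt + 1 ))
  done
  [ "${status#2}" != "$status" ]     # true when status starts with 2
}
```

Usage would look like: post_with_retry recording.wav 5. With the defaults, the waits between attempts are 1, 2, 4, and 8 seconds, and no single wait exceeds 30 seconds.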