zf-kbot/manga-translator
An automated pipeline for translating manga and comics. This project combines computer vision and NLP models to perform text detection, recognition (OCR), translation, and typesetting in a single pass.
Run zf-kbot/manga-translator with an API
Use one of our client libraries to get started quickly. Clicking on a library will take you to the Playground tab where you can tweak different inputs, see the results, and copy the corresponding code to use in your own project.
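As a rough illustration of what a client call could look like, here is a minimal Python sketch. It assumes the `replicate` package is installed and that the `REPLICATE_API_TOKEN` environment variable is set. The helper name `translate_page` is hypothetical, and the model reference may need a `:<version>` suffix that this page does not list.

```python
MODEL_REF = "zf-kbot/manga-translator"  # assumption: a ":<version>" suffix may be required


def translate_page(image_path, client=None, **options):
    """Run the model on a local image file and return the client's result.

    `options` are passed through as input fields (e.g. target_lan="en").
    `client` is any object with a `run(ref, input=...)` method; it defaults
    to the `replicate` module, imported lazily so the helper can be tested
    without the package or network access.
    """
    if client is None:
        import replicate  # requires `pip install replicate`
        client = replicate
    with open(image_path, "rb") as image_file:
        payload = {"image": image_file, **options}
        return client.run(MODEL_REF, input=payload)
```

A call such as `translate_page("page_01.png", source_lan="ja", target_lan="en")` would then send the image with two overridden fields and leave the rest at their defaults. Depending on the client version, the return value may be a URL string or a file-like object that wraps it.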
Input schema
The fields you can use to run this model with an API. If you don't give a value for a field, its default value is used.
| Field | Type | Default value | Description |
|---|---|---|---|
| source_lan | string | ja | Source language (Auto-selects the best OCR engine: MangaOCR for JA, Paddle for ZH/KO, EasyOCR for others) |
| target_lan | string | zh-CN | Target language for translation |
| translator | string | google | Translation Engine: 'google' (Free/Fast) or 'llm' (High Quality, requires API Key) |
| image | string | | Input image file |
| openai_model | string | gpt-4o | LLM Model Name. (e.g., 'gpt-4o', 'deepseek-chat', 'moonshot-v1-8k') |
| openai_api_key | string | | LLM API Key. Required if 'translator' is set to 'llm'. Supports OpenAI, DeepSeek, Moonshot, etc. |
| openai_base_url | string | https://api.openai.com/v1 | LLM Base URL. Optional. Change this if using non-OpenAI models (e.g., 'https://api.deepseek.com') |
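The `source_lan` description states a simple rule for which OCR engine is used. The sketch below only restates that rule in Python. The function name `select_ocr_engine` and the returned labels are hypothetical, and the model's actual selection code is not shown on this page.

```python
def select_ocr_engine(source_lan):
    """Map a source language code to the OCR engine named in the field description."""
    if source_lan == "ja":
        return "MangaOCR"   # Japanese
    if source_lan in ("zh", "ko"):
        return "Paddle"     # Chinese and Korean
    return "EasyOCR"        # every other supported language
```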
{
  "type": "object",
  "title": "Input",
  "required": [
    "image"
  ],
  "properties": {
    "image": {
      "type": "string",
      "title": "Image",
      "format": "uri",
      "description": "Input image file"
    },
    "source_lan": {
      "enum": [
        "ja",
        "zh",
        "ko",
        "en",
        "fr",
        "de"
      ],
      "type": "string",
      "title": "source_lan",
      "description": "Source language (Auto-selects the best OCR engine: MangaOCR for JA, Paddle for ZH/KO, EasyOCR for others)",
      "default": "ja",
      "x-order": 1
    },
    "target_lan": {
      "enum": [
        "zh-CN",
        "zh-TW",
        "en",
        "ja",
        "ko",
        "fr",
        "de",
        "es",
        "it"
      ],
      "type": "string",
      "title": "target_lan",
      "description": "Target language for translation",
      "default": "zh-CN",
      "x-order": 2
    },
    "translator": {
      "enum": [
        "google",
        "llm"
      ],
      "type": "string",
      "title": "translator",
      "description": "Translation Engine: 'google' (Free/Fast) or 'llm' (High Quality, requires API Key)",
      "default": "google",
      "x-order": 3
    },
    "openai_model": {
      "type": "string",
      "title": "Openai Model",
      "default": "gpt-4o",
      "description": "LLM Model Name. (e.g., 'gpt-4o', 'deepseek-chat', 'moonshot-v1-8k')"
    },
    "openai_api_key": {
      "type": "string",
      "title": "Openai Api Key",
      "format": "password",
      "nullable": true,
      "writeOnly": true,
      "description": "LLM API Key. Required if 'translator' is set to 'llm'. Supports OpenAI, DeepSeek, Moonshot, etc.",
      "x-cog-secret": true
    },
    "openai_base_url": {
      "type": "string",
      "title": "Openai Base Url",
      "default": "https://api.openai.com/v1",
      "description": "LLM Base URL. Optional. Change this if using non-OpenAI models (e.g., 'https://api.deepseek.com')"
    }
  }
}
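The schema's constraints can be checked on the client side before a request is sent. The following is a minimal sketch of that idea, not part of the model itself. The helper name `build_input` is hypothetical, and the enum values and defaults are copied from the schema above.

```python
SOURCE_LANGS = {"ja", "zh", "ko", "en", "fr", "de"}
TARGET_LANGS = {"zh-CN", "zh-TW", "en", "ja", "ko", "fr", "de", "es", "it"}
TRANSLATORS = {"google", "llm"}


def build_input(image, source_lan="ja", target_lan="zh-CN", translator="google",
                openai_model="gpt-4o", openai_api_key=None,
                openai_base_url="https://api.openai.com/v1"):
    """Return an input dict, raising ValueError if it violates the schema."""
    if not image:
        raise ValueError("image is required")
    if source_lan not in SOURCE_LANGS:
        raise ValueError(f"unsupported source_lan: {source_lan!r}")
    if target_lan not in TARGET_LANGS:
        raise ValueError(f"unsupported target_lan: {target_lan!r}")
    if translator not in TRANSLATORS:
        raise ValueError(f"unsupported translator: {translator!r}")
    if translator == "llm" and not openai_api_key:
        raise ValueError("openai_api_key is required when translator is 'llm'")

    payload = {
        "image": image,
        "source_lan": source_lan,
        "target_lan": target_lan,
        "translator": translator,
    }
    # The LLM fields only matter for the 'llm' engine, so send them only then.
    if translator == "llm":
        payload.update(
            openai_model=openai_model,
            openai_api_key=openai_api_key,
            openai_base_url=openai_base_url,
        )
    return payload
```

For a non-OpenAI provider, a call might look like `build_input(url, translator="llm", openai_model="deepseek-chat", openai_base_url="https://api.deepseek.com", openai_api_key=key)`, where `url` and `key` are supplied by the caller.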
Output schema
The shape of the response you’ll get when you run this model with an API.
{
  "type": "string",
  "title": "Output",
  "format": "uri"
}
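Since the output is a single URI string, a caller typically needs to download it to get the translated image. The sketch below shows one way to do that with the Python standard library. The helper name `save_output` and the fallback file name are hypothetical.

```python
import shutil
import urllib.request
from pathlib import Path
from urllib.parse import urlparse


def save_output(output_uri, dest_dir="."):
    """Download the URI returned by the model and return the local file path."""
    # Reuse the last path segment of the URI as the file name when there is one.
    name = Path(urlparse(output_uri).path).name or "translated_page"
    dest = Path(dest_dir) / name
    with urllib.request.urlopen(output_uri) as response, open(dest, "wb") as out_file:
        shutil.copyfileobj(response, out_file)
    return dest
```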