Merge branch 'dev' of https://github.com/dmMaze/BallonsTranslator into dev (aed93e95)

README.md

+47 −45

Original line number	Diff line number	Diff line
		# BallonTranslator
		简体中文 | [English](README_EN.md) | [Русский](doc/README_RU.md) | [日本語](doc/README_JA.md) | [Indonesia](doc/README_ID.md)

		深度学习辅助漫画翻译工具, 支持一键机翻和简单的图像/文本编辑
		深度学习辅助漫画翻译工具，支持一键机翻和简单的图像/文本编辑

		<img src="doc/src/ui0.jpg" div align=center>

		@@ -11,10 +11,10 @@

		# Features
		* 一键机翻
		- 译文回填参考对原文排版的估计, 包括颜色, 轮廓, 角度, 朝向, 对齐方式等
		- 最后效果取决于文本检测, 识别, 抹字, 机翻四个模块的整体表现
		- 译文回填参考对原文排版的估计，包括颜色，轮廓，角度，朝向，对齐方式等
		- 最后效果取决于文本检测，识别，抹字，机翻四个模块的整体表现
		- 支持日漫和美漫
		- 英译中, 日译英排版已优化, 文本布局以提取到的背景泡为参考, 中文基于pkuseg进行断句, 日译中竖排待改善
		- 英译中，日译英排版已优化，文本布局以提取到的背景泡为参考，中文基于 pkuseg 进行断句，日译中竖排待改善

		* 图像编辑
		支持掩膜编辑和修复画笔
		@@ -29,9 +29,9 @@
		# 使用说明

		## Windows
		如果用windows而且不想自己手动配置环境, 而且能正常访问互联网:
		从[MEGA](https://mega.nz/folder/gmhmACoD#dkVlZ2nphOkU5-2ACb5dKw) 或 [Google Drive](https://drive.google.com/drive/folders/1uElIYRLNakJj-YS0Kd3r3HE-wzeEvrWd?usp=sharing) 下载BallonsTranslator_dev_src_with_gitpython.7z, 解压并运行launch_win.bat启动程序。如果无法自动下载库和模型，手动下载data和ballontrans_pylibs_win.7z并解压到程序目录下.
		运行scripts/local_gitpull.bat获取更新.
		如果用 Windows 而且不想自己手动配置环境，而且能正常访问互联网:
		从 [MEGA](https://mega.nz/folder/gmhmACoD#dkVlZ2nphOkU5-2ACb5dKw) 或 [Google Drive](https://drive.google.com/drive/folders/1uElIYRLNakJj-YS0Kd3r3HE-wzeEvrWd?usp=sharing) 下载 BallonsTranslator_dev_src_with_gitpython.7z，解压并运行 launch_win.bat 启动程序。如果无法自动下载库和模型，手动下载 data 和 ballontrans_pylibs_win.7z 并解压到程序目录下。
		运行 scripts/local_gitpull.bat 获取更新。

		## 运行源码

		@@ -45,9 +45,9 @@ $ git clone https://github.com/dmMaze/BallonsTranslator.git ; cd BallonsTranslat
		$ python3 launch.py
		```

		第一次运行会自动安装torch等依赖项并下载所需模型和文件, 如果模型下载失败, 需要手动从[MEGA](https://mega.nz/folder/gmhmACoD#dkVlZ2nphOkU5-2ACb5dKw) 或 [Google Drive](https://drive.google.com/drive/folders/1uElIYRLNakJj-YS0Kd3r3HE-wzeEvrWd?usp=sharing)下载data文件夹(或者报错里提到缺失的文件), 并保存到源码目录下的对应位置.
		第一次运行会自动安装 torch 等依赖项并下载所需模型和文件，如果模型下载失败，需要手动从 [MEGA](https://mega.nz/folder/gmhmACoD#dkVlZ2nphOkU5-2ACb5dKw) 或 [Google Drive](https://drive.google.com/drive/folders/1uElIYRLNakJj-YS0Kd3r3HE-wzeEvrWd?usp=sharing) 下载 data 文件夹(或者报错里提到缺失的文件)，并保存到源码目录下的对应位置。

		如果要使用Sugoi翻译器(仅日译英), 下载[离线模型](https://drive.google.com/drive/folders/1KnDlfUM9zbnYFTo6iCbnBaBKabXfnVJm), 将 "sugoi_translator" 移入BallonsTranslator/ballontranslator/data/models.
		如果要使用Sugoi翻译器(仅日译英)，下载[离线模型](https://drive.google.com/drive/folders/1KnDlfUM9zbnYFTo6iCbnBaBKabXfnVJm)，将 ```sugoi_translator``` 移入 BallonsTranslator/ballontranslator/data/models。

		## 构建 macOS 应用（适用 apple silicon 芯片）
		<i>如果构建不成功也可以直接跑源码</i>
		@@ -78,13 +78,13 @@ cd ..
		curl -L https://raw.githubusercontent.com/dmMaze/BallonsTranslator/dev/scripts/macos-build-script.sh | bash
		```

		> 📌打包好的应用在`./data/BallonsTranslator/dist/BallonsTranslator.app`，将应用拖到macOS的应用程序文件夹即完成安装，开箱即用，不需要另外配置Python环境.
		> 📌打包好的应用在`./data/BallonsTranslator/dist/BallonsTranslator.app`，将应用拖到 macOS 的应用程序文件夹即完成安装，开箱即用，不需要另外配置 Python 环境。

		## 一键翻译
		建议在命令行终端下运行程序, 首次运行请先配置好源语言/目标语言, 打开一个带图片的文件夹, 点击Run等待翻译完成
		建议在命令行终端下运行程序，首次运行请先配置好源语言/目标语言，打开一个带图片的文件夹，点击 Run 等待翻译完成
		<img src="doc/src/run.gif">

		一键机翻嵌字格式如大小、颜色等默认是由程序决定的, 可以在设置面板->嵌字菜单中改用全局设置. 全局字体格式就是未编辑任何文本块时右侧字体面板显示的格式:
		一键机翻嵌字格式如大小、颜色等默认是由程序决定的，可以在设置面板->嵌字菜单中改用全局设置。全局字体格式就是未编辑任何文本块时右侧字体面板显示的格式:
		<img src="doc/src/global_font_format.png">

		## 画板
		@@ -101,9 +101,9 @@ curl -L https://raw.githubusercontent.com/dmMaze/BallonsTranslator/dev/scripts/m
		矩形工具
		</p>

		按下鼠标左键拖动矩形框抹除框内文字, 按下右键拉框清除框内修复结果.
		抹除结果取决于算法(gif中的"方法1"和"方法2")对文字区域估算的准确程度, 一般拉的框最好稍大于需要抹除的文本块. 两种方法都比较玄学, 能够应付绝大多数简单文字简单背景, 部分复杂背景简单文字/简单背景复杂文字, 少数复杂背景复杂文字, 可以多拉几次试试.
		勾选"自动"拉完框立即修复, 否则需要按下"修复"或者空格键才进行修复, 或"Ctrl+D"删除矩形选框.
		按下鼠标左键拖动矩形框抹除框内文字，按下右键拉框清除框内修复结果。
		抹除结果取决于算法(gif 中的"方法1"和"方法2")对文字区域估算的准确程度，一般拉的框最好稍大于需要抹除的文本块。两种方法都比较玄学，能够应付绝大多数简单文字简单背景，部分复杂背景简单文字/简单背景复杂文字，少数复杂背景复杂文字，可以多拉几次试试。
		勾选"自动"拉完框立即修复，否则需要按下"修复"或者空格键才进行修复，或 ```Ctrl+D``` 删除矩形选框。

		## 文本编辑
		<img src="doc/src/textedit.gif">
		@@ -124,52 +124,54 @@ OCR并翻译选中文本框
		</p>

		## 界面说明及快捷键
		* Ctrl+Z, Ctrl+Y可以撤销重做大部分操作，注意翻页后撤消重做栈会清空
		* A/D或pageUp/Down翻页, 如果当前页面未保存会自动保存
		* "T"切换到文本编辑模式下(底部最右"T"图标), W激活文本块创建模式后在画布右键拉文本框
		* "P"切换到画板模式, 右下角滑条改原图透明度
		* 底部左侧"OCR"和"A"按钮控制启用/禁用OCR翻译功能, 禁用后再Run程序就只做文本检测和抹字
		* Ctrl+Z，Ctrl+Y 可以撤销重做大部分操作，注意翻页后撤消重做栈会清空
		* A/D 或 pageUp/Down 翻页，如果当前页面未保存会自动保存
		* T 切换到文本编辑模式下(底部最右"T"图标)，W激活文本块创建模式后在画布右键拉文本框
		* P 切换到画板模式，右下角滑条改原图透明度
		* 标题栏->运行可以启用/禁用任意自动化模块，全部禁用后Run会根据全局字体样式和嵌字设置重新渲染文本
		* 设置面板配置各自动化模块参数
		* Ctrl++/- 或滚轮缩放画布
		* Ctrl+A 可选中界面中所有文本块
		* Ctrl+F查找当前页, Ctrl+G全局查找
		* Ctrl+F 查找当前页，Ctrl+G全局查找
		* 0-9调整嵌字/原图透明度
		* 文本编辑下```Ctrl+B```加粗, ```Ctrl+U```下划线, ```Ctrl+I```斜体
		* 文本编辑下 ```Ctrl+B``` 加粗，```Ctrl+U``` 下划线，```Ctrl+I``` 斜体
		* 字体样式面板-"特效"修改透明度添加阴影

		<img src="doc/src/configpanel.png">

		# 自动化模块
		本项目重度依赖[manga-image-translator](https://github.com/zyddnys/manga-image-translator), 在线服务器和模型训练需要费用, 有条件请考虑支持一下
		本项目重度依赖 [manga-image-translator](https://github.com/zyddnys/manga-image-translator)，在线服务器和模型训练需要费用，有条件请考虑支持一下
		- Ko-fi: <https://ko-fi.com/voilelabs>
		- Patreon: <https://www.patreon.com/voilelabs>
		- 爱发电: <https://afdian.net/@voilelabs>

		Sugoi翻译器作者: [mingshiba](https://www.patreon.com/mingshiba).
		Sugoi 翻译器作者: [mingshiba](https://www.patreon.com/mingshiba)

		### 文本检测
		暂时仅支持日文(方块字都差不多)和英文检测, 训练代码和说明见https://github.com/dmMaze/comic-text-detector
		暂时仅支持日文(方块字都差不多)和英文检测，训练代码和说明见https://github.com/dmMaze/comic-text-detector

		### OCR
		* 所有mit模型来自manga-image-translator, 支持日英汉识别和颜色提取
		* [manga_ocr](https://github.com/kha-white/manga-ocr)来自[kha-white](https://github.com/kha-white), 支持日语识别, 注意选用该模型程序不会提取颜色
		* 所有 mit 模型来自 manga-image-translator，支持日英汉识别和颜色提取
		* [manga_ocr](https://github.com/kha-white/manga-ocr) 来自 [kha-white](https://github.com/kha-white)，支持日语识别，注意选用该模型程序不会提取颜色

		### 图像修复
		* AOT 修复模型来自 manga-image-translator
		* patchmatch是非深度学习算法, 也是PS修复画笔背后的算法, 实现来自[PyPatchMatch](https://github.com/vacancy/PyPatchMatch), 本程序用的是我的[修改版](https://github.com/dmMaze/PyPatchMatchInpaint)
		* patchmatch 是非深度学习算法，也是PS修复画笔背后的算法，实现来自 [PyPatchMatch](https://github.com/vacancy/PyPatchMatch)，本程序用的是我的[修改版](https://github.com/dmMaze/PyPatchMatchInpaint)
		* lama* 是微调过的[lama](https://github.com/advimman/lama)


		### 翻译器

		* <s>谷歌翻译能挂代理建议把url从cn改成com</s> 谷歌翻译器已经关闭中国服务, 大陆再用需要设置全局代理, 并在设置面板把url换成*.com
		* 彩云, 需要申请[token](https://dashboard.caiyunapp.com/)
		* 谷歌翻译器已经关闭中国服务，大陆再用需要设置全局代理，并在设置面板把 url 换成*.com
		* 彩云，需要申请 [token](https://dashboard.caiyunapp.com/)
		* papago
		* DeepL 和 Sugoi(及它的CT2 Translation转换)翻译器, 感谢[Snowad14](https://github.com/Snowad14)
		* DeepL 和 Sugoi (及它的 CT2 Translation 转换)翻译器，感谢 [Snowad14](https://github.com/Snowad14)
		* 支持 [Sakura-13B-Galgame](https://github.com/SakuraLLM/Sakura-13B-Galgame)

		如需添加新的翻译器请参考[加别的翻译器](doc/加别的翻译器.md), 本程序添加新翻译器只需要继承基类实现两个接口即可不需要理会代码其他部分, 欢迎大佬提pr
		如需添加新的翻译器请参考[加别的翻译器](doc/加别的翻译器.md)，本程序添加新翻译器只需要继承基类实现两个接口即可不需要理会代码其他部分，欢迎大佬提 pr

		## 杂
		* 电脑带N卡或 Apple silicon 默认启用 GPU 加速
		* 感谢 [bropines](https://github.com/bropines) 提供俄语翻译
		* 第三方输入法可能会造成右侧编辑框显示bug, 见[#76](https://github.com/dmMaze/BallonsTranslator/issues/76), 暂时不打算修
		* 第三方输入法可能会造成右侧编辑框显示 bug，见[#76](https://github.com/dmMaze/BallonsTranslator/issues/76)，暂时不打算修
		* 选中文本迷你菜单支持聚合词典专业划词翻译[沙拉查词](https://saladict.crimx.com): [安装说明](doc/saladict_chs.md)

README_EN.md

+3 −1

		@@ -176,12 +176,13 @@ OCR & Translate Selected Area
		* ```W``` to activate text block creating mode, then drag the mouse on the canvas with the right button clicked to add a new text block. (see the text editing gif)
		* ```P``` to switch to image-editing mode.
		* In the image editing mode, use the slider on the right bottom to control the original image transparency.
		* The "OCR" and "A" button in the bottom toolbar controls whether to enable OCR and translation, if you disable them, the program will only do the text detection and removal.
		* Disable or enable any automatic modules via titlebar->run, run with all modules disabled will re-letter and re-render all text according to corresponding settings.
		* Set parameters of automatic modules in the config panel.
		* ```Ctrl++```/```Ctrl+-``` (Also ```Ctrl+Shift+=```) to resize image.
		* ```Ctrl+G```/```Ctrl+F``` to search globally/in current page.
		* ```0-9``` to adjust opacity of lettering layer
		* For text editing: bold - ```Ctrl+B```, underline - ```Ctrl+U```, Italics - ```Ctrl+I```
		* Set text shadow and transparency in the text style panel -> Effect.

		<img src="doc/src/configpanel.png">

		@@ -213,6 +214,7 @@ Available translators: Google, DeepL, ChatGPT, Sugoi, Caiyun, Baidu. Papago, and
		* [Caiyun](https://dashboard.caiyunapp.com/), [ChatGPT](https://platform.openai.com/playground), [Yandex](https://yandex.com/dev/translate/), [Baidu](http://developers.baidu.com/), and [DeepL](https://www.deepl.com/docs-api/api-access) translators require a token or API key.
		* DeepL & Sugoi translator (and its CT2 Translation conversion) thanks to [Snowad14](https://github.com/Snowad14).
		* Sugoi translates Japanese to English completely offline.
		* [Sakura-13B-Galgame](https://github.com/SakuraLLM/Sakura-13B-Galgame)

		To add a new translator, please refer to [how_to_add_new_translator](doc/how_to_add_new_translator.md). It is as simple as subclassing a BaseClass and implementing two interfaces, after which you can use it in the application. Contributions to the project are welcome.
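
The README line above describes adding a translator as subclassing a base class and implementing two interfaces. A minimal sketch of that pattern follows. The names `TranslatorBase`, `setup`, and `translate_batch` are hypothetical stand-ins, not the project's real API; the actual class and method names are in `doc/how_to_add_new_translator.md`.

```python
from abc import ABC, abstractmethod
from typing import List


class TranslatorBase(ABC):
    """Stand-in base class for illustration; NOT the project's real base class."""

    def __init__(self, lang_source: str, lang_target: str) -> None:
        self.lang_source = lang_source
        self.lang_target = lang_target
        self.setup()  # first interface: one-time initialisation

    @abstractmethod
    def setup(self) -> None:
        """Prepare clients, models, or API keys."""

    @abstractmethod
    def translate_batch(self, texts: List[str]) -> List[str]:
        """Second interface: translate a list of strings, preserving order."""


class UppercaseTranslator(TranslatorBase):
    """Toy 'translator' that only upper-cases its input."""

    def setup(self) -> None:
        self.calls = 0

    def translate_batch(self, texts: List[str]) -> List[str]:
        self.calls += 1
        return [t.upper() for t in texts]


translator = UppercaseTranslator("ja", "en")
print(translator.translate_batch(["hello", "world"]))  # ['HELLO', 'WORLD']
```

Because both methods are abstract, a subclass that forgets either one fails at instantiation rather than at translation time.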

launch.py

+1 −1

		@@ -211,7 +211,7 @@ def main():
		if fnt_idx >= 0:
		    shared.CUSTOM_FONTS.append(QFontDatabase.applicationFontFamilies(fnt_idx)[0])

		shared.FONT_FAMILIES = set(f.lower() for f in QFontDatabase.families())
		shared.FONT_FAMILIES = set(f for f in QFontDatabase.families())
		yahei = QFont('Microsoft YaHei UI')
		if yahei.exactMatch() and not sys.platform == 'darwin':
		    QGuiApplication.setFont(yahei)
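
The `launch.py` hunk above stops lower-casing family names when it builds `shared.FONT_FAMILIES`. A minimal sketch of the visible consequence follows. It assumes the set is later used for exact membership checks, which the diff does not show. The sample family list and the helper `has_family` are hypothetical, and no Qt installation is needed.

```python
# Hypothetical sample of what QFontDatabase.families() might return.
families = ["Arial", "Microsoft YaHei UI", "Noto Sans CJK JP"]

lowered = set(f.lower() for f in families)  # set contents before this commit
verbatim = set(families)                    # set contents after this commit


def has_family(name: str, known: set) -> bool:
    """Exact, case-sensitive membership test (hypothetical helper)."""
    return name in known


print(has_family("Arial", lowered))   # False: only 'arial' is stored
print(has_family("Arial", verbatim))  # True: the name is stored as reported
print(has_family("arial", verbatim))  # False: callers must no longer lower-case
```

Any caller that still lower-cases a name before the lookup would stop matching after this change, so both sides of the comparison need the same convention.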

modules/ocr/mit48px.py

+2 −2

		@@ -328,10 +328,10 @@ def transformer_encoder_forward(
		                is_causal: bool = False) -> torch.Tensor:
		    x = src
		    if self.norm_first:
		        x = x + self._sa_block(self.norm1(x), src_mask, src_key_padding_mask, is_causal=is_causal)
		        x = x + self._sa_block(self.norm1(x), src_mask, src_key_padding_mask)
		        x = x + self._ff_block(self.norm2(x))
		    else:
		        x = self.norm1(x + self._sa_block(x, src_mask, src_key_padding_mask, is_causal=is_causal))
		        x = self.norm1(x + self._sa_block(x, src_mask, src_key_padding_mask))
		        x = self.norm2(x + self._ff_block(x))

		    return x
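
The `mit48px.py` hunk above drops the `is_causal` keyword from both `_sa_block` calls. The commit does not say why. One plausible reason, stated here as an assumption, is that `_sa_block` does not accept that keyword in every supported PyTorch version. The sketch below shows an alternative that forwards a keyword only when the callee declares it. The helper `call_with_supported_kwargs` and the two stand-in functions are hypothetical and use no PyTorch.

```python
import inspect


def call_with_supported_kwargs(fn, *args, **kwargs):
    """Call fn, silently dropping keyword arguments its signature does not declare."""
    params = inspect.signature(fn).parameters
    takes_var_kw = any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())
    if not takes_var_kw:
        kwargs = {k: v for k, v in kwargs.items() if k in params}
    return fn(*args, **kwargs)


# Stand-ins for two generations of a self-attention helper (hypothetical).
def sa_block_old(x, attn_mask, key_padding_mask):
    return ("old", x)


def sa_block_new(x, attn_mask, key_padding_mask, is_causal=False):
    return ("new", x, is_causal)


print(call_with_supported_kwargs(sa_block_old, 1, None, None, is_causal=True))  # ('old', 1)
print(call_with_supported_kwargs(sa_block_new, 1, None, None, is_causal=True))  # ('new', 1, True)
```

Dropping the keyword outright, as the commit does, is simpler. The signature check only matters if the causal hint should still reach callees that understand it.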

modules/textdetector/__init__.py

+3 −3

		@@ -54,8 +54,8 @@ class ComicTextDetector(TextDetectorBase):
		params = {
		    'detect_size': {
		        'type': 'selector',
		        'options': [1024, 1152, 1280],
		        'select': 1280
		        'options': [896, 1024, 1152, 1280],
		        'select': 1024
		    },
		    'det_rearrange_max_batches': {
		        'type': 'selector',
		@@ -68,7 +68,7 @@ class ComicTextDetector(TextDetectorBase):
		_load_model_keys = {'model'}

		device = DEFAULT_DEVICE
		detect_size = 1280
		detect_size = 1024
		def __init__(self, **params) -> None:
		    super().__init__(**params)
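
The two hunks above lower the default detection size from 1280 to 1024 in two places, the selector's `'select'` entry and the class attribute `detect_size`, and add 896 as an option. A minimal sketch of a consistency check for such a definition follows. It assumes only the dictionary shape visible in the diff (`'type'`, `'options'`, `'select'`); the helper `check_selector` is hypothetical and not part of the project.

```python
def check_selector(name: str, spec: dict) -> None:
    """Raise ValueError if a selector's default is not one of its options."""
    if spec.get("type") != "selector":
        return  # other parameter types are not checked in this sketch
    if spec["select"] not in spec["options"]:
        raise ValueError(
            f"{name}: default {spec['select']!r} not in {spec['options']!r}"
        )


# Values as they appear after this commit.
params = {
    "detect_size": {
        "type": "selector",
        "options": [896, 1024, 1152, 1280],
        "select": 1024,
    },
}
detect_size = 1024  # class-level default from the second hunk

for name, spec in params.items():
    check_selector(name, spec)

# The two defaults are maintained separately, so compare them explicitly.
assert params["detect_size"]["select"] == detect_size
```

A check like this would catch the case where one of the two defaults is updated and the other is not.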