Merge pull request #447 from PiDanShouRouZhouXD/dev (a5c69501) · Commits · git-mirror / BallonsTranslator

README.md

+8 −1

Original line number	Diff line number	Diff line
		@@ -167,11 +167,18 @@ python launch.py --headless --exec_dirs "[DIR_1],[DIR_2]..."
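		Sugoi translator author: [mingshiba](https://www.patreon.com/mingshiba)

		### Text detection
		Currently only Japanese (block-style characters are all much alike) and English detection are supported; training code and instructions are at https://github.com/dmMaze/comic-text-detector
		* Currently only Japanese (block-style characters are all much alike) and English detection are supported; training code and instructions are at https://github.com/dmMaze/comic-text-detector
		* Text detection from [Stariver Cloud (Tuanzi Comics OCR)](https://cloud.stariver.org.cn/) is supported; a token must be obtained and entered
		* For parameter settings and how to obtain a token, see the [Tuanzi OCR instructions](doc/团子OCR说明.md)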


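		### OCR
		* All mit models are from manga-image-translator and support Japanese, English and Chinese recognition and color extraction
		* [manga_ocr](https://github.com/kha-white/manga-ocr) is from [kha-white](https://github.com/kha-white) and supports Japanese recognition; note that the program does not extract colors when this model is selected
		* OCR from [Stariver Cloud (Tuanzi Comics OCR)](https://cloud.stariver.org.cn/) is supported; a token must be obtained and entered
		* For parameter settings and how to obtain a token, see the [Tuanzi OCR instructions](doc/团子OCR说明.md)
		* When text detection is set to the Tuanzi Detector, it is recommended to set OCR to none_ocr so the text is read directly, saving time and requests.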


		### Inpainting
		* The AOT inpainting model is from manga-image-translator

README_EN.md

+6 −1

		@@ -203,11 +203,16 @@ This project is heavily dependent upon [manga-image-translator](https://github.c
		[Sugoi translator](https://sugoitranslator.com/) is created by [mingshiba](https://www.patreon.com/mingshiba).

		## Text detection
		Supports English and Japanese text detection; training code and more details can be found at [comic-text-detector](https://github.com/dmMaze/comic-text-detector)
		* Supports English and Japanese text detection; training code and more details can be found at [comic-text-detector](https://github.com/dmMaze/comic-text-detector)
		* Supports text detection from [Stariver Cloud (Tuanzi Comics OCR)](https://cloud.stariver.org.cn/); a token must be obtained and entered
		* For parameter settings and how to obtain the token, refer to [Tuanzi OCR Instructions (Chinese only)](doc/团子OCR说明.md)

		## OCR
		* All mit* models are from manga-image-translator and support English, Japanese and Korean recognition as well as text color extraction.
		* [manga_ocr](https://github.com/kha-white/manga-ocr) is from [kha-white](https://github.com/kha-white) and provides Japanese text recognition, with Japanese manga as its main focus.
		* Supports OCR from [Stariver Cloud (Tuanzi Comics OCR)](https://cloud.stariver.org.cn/); a token must be obtained and entered
		* For parameter settings and how to obtain the token, refer to [Tuanzi OCR Instructions (Chinese only)](doc/团子OCR说明.md)
		* When text detection is set to Tuanzi Detector, it is recommended to set OCR to none_ocr so that the text is read directly, which saves time and reduces the number of requests.

		## Inpainting
		* AOT is from [manga-image-translator](https://github.com/zyddnys/manga-image-translator).

doc/团子OCR说明.md

0 → 100644

+39 −0

		## Official request parameter reference:
		<p align = "center">
		<img src="https://github.com/PiDanShouRouZhouXD/BallonsTranslator/assets/38401147/3c3985e9-f36e-41fb-af94-d6a8088e5ccd" width="85%" height="85%">

		</p>

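		## How to obtain a token

		### Method 1: Get the token from cookies

		Log in to [Stariver Cloud OCR](https://cloud.stariver.org.cn/) in your browser, then inspect the `cookie` in the browser's developer tools. It contains a `token` field; copy its value.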
		<p align = "center">
		<img src="https://github.com/PiDanShouRouZhouXD/BallonsTranslator/assets/38401147/ae2cbcec-b426-4396-a484-62aa09f22cf6" width="50%" height="50%">

		</p>

		### Method 2: Get the token through the API

		To get a token through the API, send the following request:

		```
		POST https://capiv1.ap-sh.starivercs.cn/OCR/Admin/Login


		Request Body:
		{
		"User": "your_username",
		"Password": "your_password"
		}

		Response Body:
		{
		…
		"Token": "your_token"
		…
		}
		```

		Here, `User` and `Password` are the username and password used to log in to Tuanzi OCR, and `Token` is the token returned after a successful login.
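
A minimal Python sketch of Method 2, using only the standard library. It assumes the endpoint accepts the body as JSON and returns JSON with a top-level `Token` field, as the request and response shown above suggest; the helper names (`build_login_body`, `extract_token`, `fetch_token`) and the `Content-Type` header are illustrative assumptions, not part of the project.

```python
import json
import urllib.request

# Endpoint taken from the request shown above.
LOGIN_URL = "https://capiv1.ap-sh.starivercs.cn/OCR/Admin/Login"


def build_login_body(user: str, password: str) -> bytes:
    """Serialize the request body (assumed to be sent as JSON)."""
    return json.dumps({"User": user, "Password": password}).encode("utf-8")


def extract_token(response_text: str) -> str:
    """Return the `Token` field of the response (assumed to be top-level)."""
    data = json.loads(response_text)
    token = data.get("Token")
    if not token:
        raise ValueError("no Token field in login response")
    return token


def fetch_token(user: str, password: str, timeout: float = 10.0) -> str:
    """POST the credentials and return the token; this makes a real network call."""
    request = urllib.request.Request(
        LOGIN_URL,
        data=build_login_body(user, password),
        headers={"Content-Type": "application/json"},  # assumption
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return extract_token(response.read().decode("utf-8"))
```

Calling `fetch_token("your_username", "your_password")` would then return the value to paste into the `token` parameter. Keeping the body building and response parsing in separate functions lets both be checked without contacting the server.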

modules/ocr/__init__.py

+111 −1

		@@ -37,7 +37,9 @@ class OCRBase(BaseModule):
		            blk_list = [blk_list]

		        for blk in blk_list:
		            if self.name != 'none_ocr':
		                blk.text = []

		        self._ocr_blk_list(img, blk_list)
		        for callback_name, callback in self._postprocess_hooks.items():
		            callback(textblocks=blk_list, img=img, ocr_module=self)
		@@ -242,7 +244,115 @@ class OCRMIT48px(OCRBase):
		        if self.device != device:
		            self.model.to(device)

		from .stariver_ocr import StariverOCR
		@register_OCR('stariver_ocr')
		class OCRStariver(OCRBase):
		    params = {
		        'token': 'Replace with your token',
		        "refine": {
		            'type': 'selector',
		            'options': [True, False],
		            'select': True
		        },
		        "filtrate": {
		            'type': 'selector',
		            'options': [True, False],
		            'select': True
		        },
		        "disable_skip_area": {
		            'type': 'selector',
		            'options': [True, False],
		            'select': True
		        },
		        "detect_scale": "3",
		        "merge_threshold": "2",
		        "force_expand": {
		            'type': 'selector',
		            'options': [True, False],
		            'select': False
		        },
		        'description': 'Stariver Cloud (Tuanzi Translator) OCR API'
		    }

		    @property
		    def token(self):
		        return self.params['token']

		    @property
		    def expand_ratio(self):
		        # NOTE: params defines no 'expand_ratio' key, so reading this property raises KeyError.
		        return float(self.params['expand_ratio'])

		    # The selector properties accept both the bool default and the
		    # 'True'/'False' strings that the original comparison expected.
		    @property
		    def refine(self):
		        return str(self.params['refine']['select']) == 'True'

		    @property
		    def filtrate(self):
		        return str(self.params['filtrate']['select']) == 'True'

		    @property
		    def disable_skip_area(self):
		        return str(self.params['disable_skip_area']['select']) == 'True'

		    @property
		    def detect_scale(self):
		        return int(self.params['detect_scale'])

		    @property
		    def merge_threshold(self):
		        return float(self.params['merge_threshold'])

		    @property
		    def force_expand(self):
		        return str(self.params['force_expand']['select']) == 'True'

		    def __init__(self, **params) -> None:
		        super().__init__(**params)
		        self.client = StariverOCR(
		            self.token,
		            refine=self.refine,
		            filtrate=self.filtrate,
		            disable_skip_area=self.disable_skip_area,
		            detect_scale=self.detect_scale,
		            merge_threshold=self.merge_threshold,
		            force_expand=self.force_expand,
		        )

		    def _ocr_blk_list(self, img: np.ndarray, blk_list: List[TextBlock]):
		        im_h, im_w = img.shape[:2]
		        for blk in blk_list:
		            x1, y1, x2, y2 = blk.xyxy
		            # Only crop boxes that lie strictly inside the image.
		            if y2 < im_h and x2 < im_w and \
		                    x1 > 0 and y1 > 0 and x1 < x2 and y1 < y2:
		                blk.text = self.client.ocr(img[y1:y2, x1:x2])
		            else:
		                logging.warning('invalid textbbox to target img')
		                blk.text = ['']

		    def ocr_img(self, img: np.ndarray) -> str:
		        self.logger.debug(f'ocr_img: {img.shape}')
		        return self.client.ocr(img)

		    def updateParam(self, param_key: str, param_content):
		        super().updateParam(param_key, param_content)
		        self.client.token = self.params['token']
		        # StariverOCR.ocr() reads the token from its own params dict, so update that too.
		        self.client.params['token'] = self.params['token']

		@register_OCR('none_ocr')
		class OCRNone(OCRBase):
		    def __init__(self, **params) -> None:
		        super().__init__(**params)

		    params = {
		        'NOTICE': 'Not an OCR, just return original text.',
		        'description': 'Not an OCR, just return original text.'
		    }

		    def _ocr_blk_list(self, img: np.ndarray, blk_list: List[TextBlock]):
		        # Leave each block's existing text untouched.
		        pass

		    def ocr_img(self, img: np.ndarray) -> str:
		        return ''

		import platform
		if platform.mac_ver()[0] >= '10.15':

modules/ocr/stariver_ocr.py

0 → 100644

+56 −0

		import cv2
		import requests
		import json
		import base64
		import numpy as np


		class StariverOCR:

		    def __init__(self, token, detect_scale=3, merge_threshold=0.5, refine=True, filtrate=True, disable_skip_area=True, force_expand=False):
		        self.token = token
		        self.url = 'https://dl.ap-sh.starivercs.cn/v2/manga_trans/advanced/manga_ocr'
		        self.debug = False
		        # Request parameters; ocr() reads the token from this dict, not from self.token.
		        self.params = {
		            "token": self.token,
		            "mask": False,
		            "refine": refine,
		            "filtrate": filtrate,
		            "disable_skip_area": disable_skip_area,
		            "detect_scale": detect_scale,
		            "merge_threshold": merge_threshold,
		            "low_accuracy_mode": True,
		            "force_expand": force_expand
		        }

		    def ocr(self, img: np.ndarray) -> str:
		        if not self.params['token'] or self.params['token'] == 'Replace with your token':
		            raise ValueError('token is not set.')

		        # Encode the image as a base64 JPEG string for the request body.
		        img_base64 = base64.b64encode(
		            cv2.imencode('.jpg', img)[1]).decode('utf-8')
		        self.params["image"] = img_base64

		        response = requests.post(self.url, data=json.dumps(self.params))

		        if response.status_code != 200:
		            print(f'Request failed, status code: {response.status_code}')
		        if response.json().get('Code', -1) != 0:
		            print(f'Error message: {response.json().get("Message", "")}')
		            with open('stariver_ocr_error.txt', 'w', encoding='utf-8') as f:
		                f.write(response.text)
		            raise ValueError('Request failed.')

		        response_data = response.json()['Data']

		        if self.debug:
		            request_id = response.json().get('RequestID', '')
		            file_name = f"stariver_ocr_response_{request_id}.json"
		            print(f"Request succeeded, response data saved to {file_name}")
		            with open(file_name, 'w', encoding='utf-8') as f:
		                json.dump(response_data, f, ensure_ascii=False, indent=4)

		        # Join the texts of every block and drop the '<skip>' placeholder.
		        texts_list = ["".join(block.get('texts', '')).strip()
		                      for block in response_data.get('text_block', [])]
		        texts_str = "".join(texts_list).replace('<skip>', '')
		        return texts_str
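
The text-joining step at the end of `ocr` can be tried offline. The sketch below copies that logic into a standalone function, `join_block_texts` (a hypothetical name, not part of the PR). The sample payload is made up, and its shape (a `text_block` list of dicts, each with a `texts` list) is inferred from the code above rather than from API documentation.

```python
from typing import Any, Dict


def join_block_texts(response_data: Dict[str, Any]) -> str:
    """Concatenate the texts of all blocks and drop the '<skip>' placeholder."""
    texts_list = ["".join(block.get("texts", "")).strip()
                  for block in response_data.get("text_block", [])]
    return "".join(texts_list).replace("<skip>", "")


# Made-up payload: one normal block, one skipped block, one block without texts.
sample = {
    "text_block": [
        {"texts": ["こん", "にちは "]},
        {"texts": ["<skip>"]},
        {},
    ]
}

print(join_block_texts(sample))  # prints こんにちは
```

Note that blocks are joined with no separator, so text from separate blocks runs together in the returned string.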