Unverified commit a5c69501, authored by dmMaze, committed by GitHub

Merge pull request #447 from PiDanShouRouZhouXD/dev

Add Stariver (星河云,团子漫画OCR) as Text Detector and OCR module.
parents bb8b8ea6 0585234e
+8 −1
@@ -167,11 +167,18 @@ python launch.py --headless --exec_dirs "[DIR_1],[DIR_2]..."
Sugoi translator author: [mingshiba](https://www.patreon.com/mingshiba)
  
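### Text detection
Currently only Japanese (CJK block characters are all much alike) and English detection are supported; training code and notes are at https://github.com/dmMaze/comic-text-detector
 * Currently only Japanese (CJK block characters are all much alike) and English detection are supported; training code and notes are at https://github.com/dmMaze/comic-text-detector
 * Text detection via [Stariver Cloud (Tuanzi Comics OCR)](https://cloud.stariver.org.cn/) is supported; you need to obtain a token and enter it
   * For parameter settings and how to obtain the token, see the [Tuanzi OCR instructions](doc/团子OCR说明.md)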


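### OCR
 * All mit models come from manga-image-translator and support Japanese, English, and Chinese recognition as well as color extraction
 * [manga_ocr](https://github.com/kha-white/manga-ocr) comes from [kha-white](https://github.com/kha-white) and supports Japanese recognition; note that the program does not extract colors when this model is selected
 * OCR via [Stariver Cloud (Tuanzi Comics OCR)](https://cloud.stariver.org.cn/) is supported; you need to obtain a token and enter it
   * For parameter settings and how to obtain the token, see the [Tuanzi OCR instructions](doc/团子OCR说明.md)
   * When text detection is set to the Tuanzi Detector, it is recommended to set OCR to none_ocr so the detected text is read directly, which saves time and requests.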


### Inpainting
  * The AOT inpainting model comes from manga-image-translator
+6 −1
@@ -203,11 +203,16 @@ This project is heavily dependent upon [manga-image-translator](https://github.c
[Sugoi translator](https://sugoitranslator.com/) is created by [mingshiba](https://www.patreon.com/mingshiba).
  
## Text detection
Support English and Japanese text detection, training code and more details can be found at [comic-text-detector](https://github.com/dmMaze/comic-text-detector)
 * Supports English and Japanese text detection; training code and more details can be found at [comic-text-detector](https://github.com/dmMaze/comic-text-detector)
 * Supports text detection via [Stariver Cloud (Tuanzi Comics OCR)](https://cloud.stariver.org.cn/), which requires obtaining a token and entering it
   * For parameter settings and how to obtain the token, refer to the [Tuanzi OCR Instructions (Chinese only)](doc/团子OCR说明.md)

## OCR
 * All mit* models are from manga-image-translator and support English, Japanese and Korean recognition as well as text color extraction.
 * [manga_ocr](https://github.com/kha-white/manga-ocr) is from [kha-white](https://github.com/kha-white) and recognizes Japanese text, with a main focus on Japanese manga.
 * Supports OCR via [Stariver Cloud (Tuanzi Comics OCR)](https://cloud.stariver.org.cn/), which requires obtaining a token and entering it
   * For parameter settings and how to obtain the token, refer to the [Tuanzi OCR Instructions (Chinese only)](doc/团子OCR说明.md)
   * When text detection is set to the Tuanzi Detector, it is recommended to set OCR to none_ocr so that the detected text is read directly, which saves time and requests.

## Inpainting
  * AOT is from [manga-image-translator](https://github.com/zyddnys/manga-image-translator).

doc/团子OCR说明.md

0 → 100644
+39 −0
## Official request parameter reference:
<p align = "center">
<img src="https://github.com/PiDanShouRouZhouXD/BallonsTranslator/assets/38401147/3c3985e9-f36e-41fb-af94-d6a8088e5ccd" width="85%" height="85%">

</p>

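## How to obtain a token

### Method 1: Get the token from cookies

Log in to [Stariver Cloud OCR](https://cloud.stariver.org.cn/) in your browser, open the browser's developer tools, and inspect the `cookie`; it contains a `token` field whose value you copy.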
<p align = "center">
<img src="https://github.com/PiDanShouRouZhouXD/BallonsTranslator/assets/38401147/ae2cbcec-b426-4396-a484-62aa09f22cf6" width="50%" height="50%">

</p>

### Method 2: Get the token through the API

To obtain the token through the API, send the following request:

```
POST https://capiv1.ap-sh.starivercs.cn/OCR/Admin/Login


Request Body:
{
    "User": "your_username",
    "Password": "your_password"
}

Response Body:
{

    "Token": "your_token"

}
```

Here, `User` and `Password` are the username and password you use to log in to Tuanzi OCR, and `Token` is the token returned after a successful login.
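
The login exchange described above could be scripted roughly as follows. This is a minimal sketch using only the Python standard library; the endpoint and the `User`, `Password`, and `Token` field names come from the request shown above, while the `Content-Type` header, the timeout, and the helper names (`build_login_request`, `extract_token`, `fetch_token`) are assumptions made for illustration.

```python
import json
import urllib.request

# Endpoint and field names are taken from the request/response shown above.
LOGIN_URL = "https://capiv1.ap-sh.starivercs.cn/OCR/Admin/Login"


def build_login_request(user: str, password: str) -> urllib.request.Request:
    """Build (but do not send) the POST request for the login endpoint."""
    body = json.dumps({"User": user, "Password": password}).encode("utf-8")
    # Assumption: the server accepts a JSON body with this Content-Type header.
    return urllib.request.Request(
        LOGIN_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_token(response_body: str) -> str:
    """Pull the Token field out of the JSON response body."""
    token = json.loads(response_body).get("Token", "")
    if not token:
        raise ValueError("login response did not contain a Token")
    return token


def fetch_token(user: str, password: str, timeout: float = 10.0) -> str:
    """Send the login request and return the token (performs a network call)."""
    request = build_login_request(user, password)
    with urllib.request.urlopen(request, timeout=timeout) as resp:
        return extract_token(resp.read().decode("utf-8"))
```

Calling `fetch_token("your_username", "your_password")` would perform the actual login; the two helper functions are kept separate so the request construction and response parsing can be checked without network access.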
+111 −1
@@ -37,7 +37,9 @@ class OCRBase(BaseModule):
            blk_list = [blk_list]

        for blk in blk_list:
            if self.name != 'none_ocr':
                blk.text = []
                
        self._ocr_blk_list(img, blk_list)
        for callback_name, callback in self._postprocess_hooks.items():
            callback(textblocks=blk_list, img=img, ocr_module=self)
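
The guard in this hunk appears intended to let text that a detector has already attached to a block survive when the OCR module is `none_ocr`, which matches the README recommendation to pair the Tuanzi Detector with `none_ocr`. A toy sketch of that behaviour follows; `Block`, `run_ocr`, `fake_recognizer`, and `no_op` are hypothetical stand-ins, not the project's classes.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Block:
    """Hypothetical stand-in for TextBlock: only the text field matters here."""
    text: List[str] = field(default_factory=list)


def run_ocr(module_name: str, blocks: List[Block],
            recognize: Callable[[List[Block]], None]) -> None:
    """Mimic the guard: reset text unless the module is 'none_ocr'."""
    for blk in blocks:
        if module_name != 'none_ocr':
            blk.text = []
    recognize(blocks)


def fake_recognizer(blocks: List[Block]) -> None:
    # Placeholder recognizer that overwrites every block (not a real OCR).
    for blk in blocks:
        blk.text = ['recognized']


def no_op(blocks: List[Block]) -> None:
    # What none_ocr does in the diff: leave the blocks untouched.
    pass


detected = [Block(text=['text from detector'])]
run_ocr('none_ocr', detected, no_op)
print(detected[0].text)  # ['text from detector']
```

With any other module name the text is cleared before recognition, so a recognizer always starts from an empty list.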
@@ -242,7 +244,115 @@ class OCRMIT48px(OCRBase):
        if self.device != device:
            self.model.to(device)

from .stariver_ocr import StariverOCR
@register_OCR('stariver_ocr')
class OCRStariver(OCRBase):
    params = {
        'token': 'Replace with your token',
        "refine":{
            'type': 'selector',
            'options': [True, False],
            'select': True
        },
        "filtrate":{
            'type': 'selector',
            'options': [True, False],
            'select': True
        },
        "disable_skip_area":{
            'type': 'selector',
            'options': [True, False],
            'select': True
        },
        "detect_scale": "3",
        "merge_threshold": "2",
        "force_expand":{
            'type': 'selector',
            'options': [True, False],
            'select': False
        },
        'description': 'Stariver Cloud (Tuanzi Translator) OCR API'
    }

    @property
    def token(self):
        return self.params['token']
    
    @property
    def expand_ratio(self):
        # NOTE: params above defines no 'expand_ratio' entry, so this raises KeyError unless one is added.
        return float(self.params['expand_ratio'])

    def _selected(self, key: str) -> bool:
        # The stored choice may be a bool (the defaults above) or its string form,
        # so compare on the string representation instead of only matching 'True'/'False'.
        return str(self.params[key]['select']) == 'True'

    @property
    def refine(self):
        return self._selected('refine')

    @property
    def filtrate(self):
        return self._selected('filtrate')

    @property
    def disable_skip_area(self):
        return self._selected('disable_skip_area')

    @property
    def detect_scale(self):
        return int(self.params['detect_scale'])

    @property
    def merge_threshold(self):
        return float(self.params['merge_threshold'])

    @property
    def force_expand(self):
        return self._selected('force_expand')

    def __init__(self, **params) -> None:
        super().__init__(**params)
        self.client = StariverOCR(self.token, refine=self.refine, filtrate=self.filtrate, disable_skip_area=self.disable_skip_area, detect_scale=self.detect_scale, merge_threshold=self.merge_threshold, force_expand=self.force_expand)

    def _ocr_blk_list(self, img: np.ndarray, blk_list: List[TextBlock]):
        im_h, im_w = img.shape[:2]
        for blk in blk_list:
            x1, y1, x2, y2 = blk.xyxy
            # A box may touch the image border; it only has to be non-empty and lie inside the image.
            if 0 <= x1 < x2 <= im_w and 0 <= y1 < y2 <= im_h:
                blk.text = self.client.ocr(img[y1:y2, x1:x2])
            else:
                logging.warning('invalid text bbox for target img')
                blk.text = ['']

    def ocr_img(self, img: np.ndarray) -> str:
        self.logger.debug(f'ocr_img: {img.shape}')
        return self.client.ocr(img)

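    def updateParam(self, param_key: str, param_content):
        super().updateParam(param_key, param_content)
        self.client.token = self.params['token']
        # The client checks and sends its own params dict, so the token must be refreshed there too.
        self.client.params['token'] = self.params['token']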

@register_OCR('none_ocr')
class OCRNone(OCRBase):
    def __init__(self, **params) -> None:
        super().__init__(**params)

    params = {
        'NOTICE': 'Not an OCR, just returns the original text.',
        'description': 'Not an OCR, just returns the original text.'
    }

    def _ocr_blk_list(self, img: np.ndarray, blk_list: List[TextBlock]):
        pass

    def ocr_img(self, img: np.ndarray) -> str:
        return ''
    
import platform
if platform.mac_ver()[0] >= '10.15':
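
The selector parameters of `OCRStariver` declare boolean options, and the stored `select` value may be either a bool or, presumably after a round-trip through the settings UI, the string `'True'`/`'False'`. A minimal sketch of a normalizing helper that accepts both forms is shown below; the name `selector_to_bool` is hypothetical and the UI round-trip is an assumption, not something the diff confirms.

```python
def selector_to_bool(value) -> bool:
    """Interpret a selector value stored as a bool or as the string 'True'/'False'."""
    if isinstance(value, bool):
        return value
    if isinstance(value, str) and value in ('True', 'False'):
        return value == 'True'
    raise ValueError(f'unexpected selector value: {value!r}')


# Params shaped like the selector entries in OCRStariver.params.
params = {
    'refine': {'type': 'selector', 'options': [True, False], 'select': True},
    'force_expand': {'type': 'selector', 'options': [True, False], 'select': 'False'},
}

print(selector_to_bool(params['refine']['select']))        # True
print(selector_to_bool(params['force_expand']['select']))  # False
```

Raising on anything else, rather than silently returning `None`, makes a malformed setting visible at the point where it is read.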
+56 −0
import cv2
import requests
import json
import base64
import numpy as np


class StariverOCR:

    def __init__(self, token, detect_scale=3, merge_threshold=0.5, refine=True, filtrate=True, disable_skip_area=True, force_expand=False):
        self.token = token
        self.url = 'https://dl.ap-sh.starivercs.cn/v2/manga_trans/advanced/manga_ocr'
        self.debug = False
        self.params = {
            "token": self.token,
            "mask": False,
            "refine": refine,
            "filtrate": filtrate,
            "disable_skip_area": disable_skip_area,
            "detect_scale": detect_scale,
            "merge_threshold": merge_threshold,
            "low_accuracy_mode": True,
            "force_expand": force_expand
        }

    def ocr(self, img: np.ndarray) -> str:
        if not self.params['token'] or self.params['token'] == 'Replace with your token':
            raise ValueError('token is not set.')

        img_base64 = base64.b64encode(
            cv2.imencode('.jpg', img)[1]).decode('utf-8')
        self.params["image"] = img_base64

        response = requests.post(self.url, data=json.dumps(self.params))

        if response.status_code != 200:
            print(f'Request failed, status code: {response.status_code}')
            try:
                error = response.json()
            except ValueError:
                error = {}  # the error body was not valid JSON
            if error.get('Code', -1) != 0:
                print(f'Error message: {error.get("Message", "")}')
                with open('stariver_ocr_error.txt', 'w', encoding='utf-8') as f:
                    f.write(response.text)
            raise ValueError('Request failed.')

        response_data = response.json()['Data']

        if self.debug:
            # Avoid shadowing the built-in id().
            request_id = response.json().get('RequestID', '')
            file_name = f"stariver_ocr_response_{request_id}.json"
            print(f"Request succeeded, response data saved to {file_name}")
            with open(file_name, 'w', encoding='utf-8') as f:
                json.dump(response_data, f, ensure_ascii=False, indent=4)

        texts_list = ["".join(block.get('texts', '')).strip()
                      for block in response_data.get('text_block', [])]
        texts_str = "".join(texts_list).replace('<skip>', '')
        return texts_str
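
The text-joining step at the end of `ocr` can be exercised without calling the service. The sketch below isolates it in a standalone function; `join_ocr_texts` is a hypothetical name, the sample payload is made up to match the `text_block`/`texts` keys the client reads, and treating `texts` as a list of strings is an assumption rather than documented API behaviour.

```python
from typing import Any, Dict


def join_ocr_texts(data: Dict[str, Any]) -> str:
    """Concatenate the texts of every block and drop '<skip>' markers."""
    pieces = [
        "".join(block.get('texts', [])).strip()
        for block in data.get('text_block', [])
    ]
    return "".join(pieces).replace('<skip>', '')


# Made-up payload shaped like the 'Data' object the client reads; not real API output.
sample = {
    'text_block': [
        {'texts': ['こんにちは', '<skip>']},
        {'texts': [' 世界 ']},
        {},  # a block without 'texts' contributes nothing
    ]
}
print(join_ocr_texts(sample))  # こんにちは世界
```

Because blocks are joined with an empty separator, block boundaries are lost in the returned string; that matches the behaviour of the code in the diff.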