Commit 03d0abbc authored by dmMaze
add eng instruction on new translator

parent e17e99c6
+21 −4
@@ -458,7 +458,7 @@ class YandexTranslator(TranslatorBase):
                tr_list.append('')
        return tr_list

# # "dummy translator" is the name shown in the app
# "dummy translator" is the name shown in the app
# @register_translator('dummy translator')
# class DummyTranslator(TranslatorBase):

@@ -482,9 +482,9 @@ class YandexTranslator(TranslatorBase):
#         do the setup here.  
#         keys of lang_map are the language options shown in the app; 
#         assign the corresponding language keys accepted by the API to the supported languages.  
#         This translator only supports Chinese, Japanese, and English.
#         Only languages supported by the translator are assigned here; this translator only supports Japanese and English.
#         For a full list of languages, see LANGMAP_GLOBAL in translator.__init__
#         '''
#         self.lang_map['简体中文'] = 'zh'
#         self.lang_map['日本語'] = 'ja'
#         self.lang_map['English'] = 'en'  
        
@@ -495,7 +495,9 @@ class YandexTranslator(TranslatorBase):
#         '''
#         source = self.lang_map[self.lang_source]
#         target = self.lang_map[self.lang_target]
#         return text 
        
#         translation = text
#         return translation

#     def updateParam(self, param_key: str, param_content):
#         '''
@@ -504,6 +506,21 @@ class YandexTranslator(TranslatorBase):
#         '''
#         super().updateParam(param_key, param_content)
#         if param_key == 'device':
#             # get current state from setup_params
#             # self.model.to(self.setup_params['device']['select'])
#             pass

#     @property
#     def supported_tgt_list(self) -> List[str]:
#         '''
#         required only if the translator's language support is asymmetric, 
#         for example, this translator only supports Japanese -> English, not English -> Japanese.
#         '''
#         return ['English']

#     @property
#     def supported_src_list(self) -> List[str]:
#         '''
#         required only if the translator's language support is asymmetric.
#         '''
#         return ['日本語']
 No newline at end of file
+132 −0
If you know how to call the target translator API or translation model in Python, implement a class in ballontranslator/dl/translator/__init__.py as follows to use it in the app.      

The DummyTranslator example below is commented out in ballontranslator/dl/translator/__init__.py; uncomment it to try it in the program.


``` python

# "dummy translator" is the name shown in the app
@register_translator('dummy translator')
class DummyTranslator(TranslatorBase):

    concate_text = True

    # parameters shown in the config panel. 
    # keys are parameter names; if the value type is str, it will be a text editor (e.g. 'api_key')
    # if the value type is dict, you need to specify the 'type' of the parameter; 
    # the following 'device' is a selector, options are cpu and cuda, default is cpu
    setup_params: Dict = {
        'api_key': '', 
        'device': {
            'type': 'selector',
            'options': ['cpu', 'cuda'],
            'select': 'cpu'
        }
    }

    def _setup_translator(self):
        '''
        do the setup here.  
        keys of lang_map are the language options shown in the app; 
        assign the corresponding language keys accepted by the API to the supported languages.  
        Only languages supported by the translator are assigned here; this translator only supports Japanese and English.
        For a full list of languages, see LANGMAP_GLOBAL in translator.__init__
        '''
        self.lang_map['日本語'] = 'ja'
        self.lang_map['English'] = 'en'  
        
    def _translate(self, text: Union[str, List]) -> Union[str, List]:
        '''
        do the translation here.  
        This translator does nothing but return the original text.
        '''
        source = self.lang_map[self.lang_source]
        target = self.lang_map[self.lang_target]
        
        translation = text
        return translation

    def updateParam(self, param_key: str, param_content):
        '''
        required only if some state needs to be updated immediately after the user changes the translator params,
        for example, if this translator is a pytorch model, you can move it to cpu/gpu here.
        '''
        super().updateParam(param_key, param_content)
        if param_key == 'device':
            # get current state from setup_params
            # self.model.to(self.setup_params['device']['select'])
            pass

    @property
    def supported_tgt_list(self) -> List[str]:
        '''
        required only if the translator's language support is asymmetric, 
        for example, this translator only supports Japanese -> English, not English -> Japanese.
        '''
        return ['English']

    @property
    def supported_src_list(self) -> List[str]:
        '''
        required only if the translator's language support is asymmetric.
        '''
        return ['日本語']
```

First, the translator must be decorated with register_translator and inherit from the base class TranslatorBase. The 'dummy translator' argument passed to the decorator is the translator name displayed in the interface; make sure it does not clash with the name of an existing translator.  
```concate_text``` is explained later; **set it to False if this translator is an offline model or the target API accepts a list of str**.  
``` python
@register_translator('dummy translator')
class DummyTranslator(TranslatorBase):  
    concate_text = True
```
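
The name-clash warning above can be illustrated with a minimal sketch of a name-based registry decorator. This is not the app's actual `register_translator`; the `TRANSLATORS` dictionary and the duplicate check are assumptions made only to show the pattern.

```python
from typing import Callable, Dict, Type

# Hypothetical registry; the real register_translator may store
# or validate entries differently.
TRANSLATORS: Dict[str, Type] = {}


def register_translator(name: str) -> Callable[[Type], Type]:
    """Return a class decorator that records the class under `name`."""
    def decorator(cls: Type) -> Type:
        if name in TRANSLATORS:
            # Refuse to silently replace an existing translator.
            raise KeyError(f"translator name already registered: {name!r}")
        TRANSLATORS[name] = cls
        return cls
    return decorator


@register_translator('dummy translator')
class DummyTranslator:
    concate_text = True
```

In this sketch, registering a second class under 'dummy translator' raises a `KeyError`, which is one way a registry could surface the clash the text warns about.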

If the new translator requires user-configurable parameters, construct a dictionary named setup_params as below; otherwise leave it alone or assign None to it.  

The keys in setup_params are the parameter names displayed in the interface. If the corresponding value is a str, it is shown in the app as a text editor; in the following example, api_key is a text editor with an empty default value.  
The value of a parameter can also be a dictionary, in which case it must be described by 'type'. In the following example, the 'device' parameter is shown as a selector in the app; the valid options are 'cpu' and 'cuda'.  
``` python
    setup_params: Dict = {
        'api_key': '', 
        'device': {
            'type': 'selector',
            'options': ['cpu', 'cuda'],
            'select': 'cpu'
        }
    }
```  
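
As a rough illustration of how these two value shapes are read back, the sketch below works on a plain dictionary with the same layout. The lookups `setup_params['api_key']` and `setup_params['device']['select']` appear in the example code; the helper names `current_value` and `select_option` are hypothetical and not part of the app.

```python
from typing import Dict

setup_params: Dict = {
    'api_key': '',
    'device': {
        'type': 'selector',
        'options': ['cpu', 'cuda'],
        'select': 'cpu',
    },
}


def current_value(params: Dict, key: str):
    """Return the value a user currently sees for `key` (hypothetical helper)."""
    value = params[key]
    if isinstance(value, dict) and value.get('type') == 'selector':
        return value['select']  # selectors keep the chosen option under 'select'
    return value                # str parameters are stored directly


def select_option(params: Dict, key: str, option: str) -> None:
    """Change a selector, rejecting options that were not declared."""
    if option not in params[key]['options']:
        raise ValueError(f"{option!r} is not a valid option for {key!r}")
    params[key]['select'] = option
```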

Implement ```_setup_translator```: initialize the translator here. 

``` python
def _setup_translator(self):
    '''
    do the setup here.  
    keys of lang_map are the language options shown in the app; 
    assign the corresponding language keys accepted by the API to the supported languages.  
    Only languages supported by the translator are assigned here; this translator only supports Japanese and English.
    For a full list of languages, see LANGMAP_GLOBAL in translator.__init__
    '''
    self.lang_map['日本語'] = 'ja'
    self.lang_map['English'] = 'en'  
```

Implement ```_translate```. The lang_source and lang_target below are the languages currently selected in the interface; use the lang_map set up earlier to look up the corresponding API language keywords, then make a request or process the text and feed it into the model here.  
If the aforementioned ```concate_text``` is set to False, the input can be a list of str (all text recognized in a page) or a str; otherwise the input is the concatenated text of a str list (['text1', 'text2'] -> 'text1 \n###\n text2'). Set it to True only if this translator is an online API that does not accept a str list, so that fewer requests are made.

``` python
def _translate(self, text: Union[str, List]) -> Union[str, List]:
    '''
    do the translation here.  
    This translator does nothing but return the original text.
    '''
    source = self.lang_map[self.lang_source]
    target = self.lang_map[self.lang_target]
    
    translation = text
    return translation
```
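
The joining and splitting that ```concate_text``` implies can be sketched as follows. The '\n###\n' separator and the split on '###' follow the description in the second documentation file of this commit; the function names and the length check are hypothetical and do not reproduce the app's own implementation.

```python
from typing import List

SEPARATOR = '\n###\n'  # default separator named in the documentation


def join_blocks(blocks: List[str]) -> str:
    """Concatenate the text blocks of one page into a single request string."""
    return SEPARATOR.join(blocks)


def split_translation(translated: str, expected: int) -> List[str]:
    """Split a translated page back into blocks on '###'."""
    parts = [part.strip() for part in translated.split('###')]
    if len(parts) != expected:
        # Some translators strip the '#' characters; fail loudly in that case.
        raise ValueError(f'expected {expected} blocks, got {len(parts)}')
    return parts
```

The length check is a design choice for this sketch: when a translator drops the separator, falling back to block-by-block translation is safer than returning misaligned text.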

Re-implement ```updateParam```, ```supported_tgt_list```, ```supported_src_list``` if necessary; refer to their comments for further details.

Once the translator is implemented, it is recommended to test it following the example in tests/test_translators.py.
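
The contents of tests/test_translators.py are not shown here, so the following is only a sketch of the kind of check such a test might make. The `TranslatorBase` below is a hypothetical stand-in with an assumed constructor, and `check_translator` is an illustrative helper; the real base class and test harness may differ.

```python
from typing import Dict, List, Union


class TranslatorBase:
    """Hypothetical stand-in; the real base class has more behaviour."""

    def __init__(self, lang_source: str, lang_target: str) -> None:
        self.lang_map: Dict[str, str] = {}
        self.lang_source = lang_source
        self.lang_target = lang_target
        self._setup_translator()

    def _setup_translator(self) -> None:
        raise NotImplementedError


class DummyTranslator(TranslatorBase):
    def _setup_translator(self) -> None:
        self.lang_map['日本語'] = 'ja'
        self.lang_map['English'] = 'en'

    def _translate(self, text: Union[str, List[str]]) -> Union[str, List[str]]:
        # Look up the API language keys; a real translator would use them.
        source = self.lang_map[self.lang_source]
        target = self.lang_map[self.lang_target]
        return text


def check_translator(translator: TranslatorBase, samples: List[str]) -> List[str]:
    """Translate a list of blocks and check the output stays aligned."""
    result = translator._translate(samples)
    assert isinstance(result, list)
    assert len(result) == len(samples)
    return result
```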
 No newline at end of file
+30 −9
@@ -2,8 +2,10 @@
The DummyTranslator example below is commented out in dl/translator/__init__.py; uncomment it to see the result in the program.  

``` python
# "dummy translator" is the name shown in the app
@register_translator('dummy translator')
class DummyTranslator(TranslatorBase):

    concate_text = True

    # parameters showed in the config panel. 
@@ -11,7 +13,7 @@ class DummyTranslator(TranslatorBase):
    # if the value type is dict, you need to specify the 'type' of the parameter; 
    # the following 'device' is a selector, options are cpu and cuda, default is cpu
    setup_params: Dict = {
        'required_key': '', 
        'api_key': '', 
        'device': {
            'type': 'selector',
            'options': ['cpu', 'cuda'],
@@ -24,9 +26,9 @@ class DummyTranslator(TranslatorBase):
        do the setup here.  
        keys of lang_map are the language options shown in the app; 
        assign the corresponding language keys accepted by the API to the supported languages.  
        This translator only supports Chinese, Japanese, and English.
        Only languages supported by the translator are assigned here; this translator only supports Japanese and English.
        For a full list of languages, see LANGMAP_GLOBAL in translator.__init__
        '''
        self.lang_map['简体中文'] = 'zh'
        self.lang_map['日本語'] = 'ja'
        self.lang_map['English'] = 'en'  
        
@@ -37,7 +39,9 @@ class DummyTranslator(TranslatorBase):
        '''
        source = self.lang_map[self.lang_source]
        target = self.lang_map[self.lang_target]
        return 'translate ' + text + f'from {source} to target'
        
        translation = text
        return translation

    def updateParam(self, param_key: str, param_content):
        '''
@@ -46,12 +50,28 @@ class DummyTranslator(TranslatorBase):
        '''
        super().updateParam(param_key, param_content)
        if param_key == 'device':
            # get current state from setup_params
            # self.model.to(self.setup_params['device']['select'])
            pass

    @property
    def supported_tgt_list(self) -> List[str]:
        '''
        required only if the translator's language support is asymmetric, 
        for example, this translator only supports Japanese -> English, not English -> Japanese.
        '''
        return ['English']

    @property
    def supported_src_list(self) -> List[str]:
        '''
        required only if the translator's language support is asymmetric.
        '''
        return ['日本語']
```

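First, this translator must be decorated with register_translator and inherit from the base class TranslatorBase. The decorator argument 'dummy translator' is the translator name shown in the interface; take care not to reuse the name of an existing translator.  
concate_text is covered later.  
concate_text is covered later; **set it to False if this is an offline model or the online API accepts a list of strings**.  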
``` python
@register_translator('dummy translator')
class DummyTranslator(TranslatorBase):  
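@@ -91,7 +111,6 @@ The keys in setup_params are the parameter names shown in the interface; a value can be a str,
The translator also needs to implement _translate. The lang_source and lang_target below are the languages currently selected in the interface; use the earlier lang_map to look up the corresponding API keywords, build the API parameters, and send the request.  
Note that if concate_text above is set to False, the text passed in here is a list of strings, one per text block on the page being translated, and the output should be a list of translations in one-to-one correspondence. When it is set to True, the text passed in is a single string formed by concatenating all text blocks, and the output should be the translation of that string.  
Sending a request for every text block is too slow, so the blocks are concatenated and the whole page is translated at once. With concate_text set, joining and splitting happen automatically and need no handling here: by default the blocks are joined into one text with '\n###\n' as the separator, and the translation is then split back into a list on '###'. This works for most translators I have tested, but some translators strip the # characters; in that case, disable concate_text and translate block by block, or implement your own joining method.  
Some APIs, such as Caiyun (彩云), support posting a list of texts directly, so it can be set to False.  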
``` python
    def _translate(self, text: Union[str, List]) -> Union[str, List]:
        api_key = self.setup_params['api_key']  # this is how to get the api_key as modified by the user
@@ -100,6 +119,8 @@ setup_params里的键值是界面里显示的对应参数名, 值可以是str,
        return text
```
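This dummy translator does nothing and just returns the original text.  
Once the translator is implemented, it is recommended to write a test for it following the examples in tests/test_translators.py and check that the output is correct. Once the test passes, it can be used in the program.   

Finally, the updateParam above is called automatically when the user changes a parameter. By default it only changes the value in setup_params, such as the api_key above. It can usually be ignored, but if the translator's state needs to change, for example a local translation model that can switch between cuda and cpu, do it here.  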
 No newline at end of file

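Re-implement ```updateParam```, ```supported_tgt_list```, ```supported_src_list``` if necessary; see the comments on these functions for details.  

Once the translator is implemented, it is recommended to write a test for it following the examples in tests/test_translators.py and check that the output is correct. Once the test passes, it can be used normally in the program.   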
 No newline at end of file