Convert English phrases into phonetic Japanese kana approximations; also known as Englishru. This module does not translate English into Japanese; it transcribes English words into their approximate pronunciations in Japanese.
Based on the English to Katakana transcription code written in Python by Yoko Harada (@yokolet). Please see that repo for details on the phonetic conversion.
English to phoneme conversion is based on CMUDict. Kanji to Katakana conversion is based on KANJIDIC2. Thanks to JMDict and kana. Please refer to those licenses for non-free implementations.
This module is a port to Go with some additional functions:
- Filtering functions to split, parse, and rejoin sentences which contain punctuation or improper contractions.
- Also accepts Japanese input; converts any Kanji characters into their most common Hiragana pronunciation, converts Hiragana into Katakana, and leaves Katakana as is.
- Also accepts Romaji input.
- Strict input cleaning mode for use with TTS engines that do not understand punctuation and other special characters. See the strict mode notes below.
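As an illustration of the Hiragana to Katakana step listed above, here is a minimal sketch. It is not taken from this module's source; the function name `hiraganaToKatakana` is hypothetical. It relies only on the Unicode layout, where Hiragana U+3041 to U+3096 sit exactly 0x60 below their Katakana counterparts.

```go
package main

import "fmt"

// hiraganaToKatakana is a hypothetical helper, not this module's implementation.
// It shifts each Hiragana rune (U+3041..U+3096) up by 0x60 to reach the
// matching Katakana rune (U+30A1..U+30F6) and leaves every other rune alone.
func hiraganaToKatakana(s string) string {
	out := make([]rune, 0, len(s))
	for _, r := range s {
		if r >= 0x3041 && r <= 0x3096 {
			r += 0x60
		}
		out = append(out, r)
	}
	return string(out)
}

func main() {
	fmt.Println(hiraganaToKatakana("こんにちは")) // コンニチハ
	fmt.Println(hiraganaToKatakana("ミクです"))  // ミクデス (Katakana passes through)
}
```

In real use, `kanatrans.NewHiraganaToKana()` already covers this step; the sketch only shows why the conversion needs no lookup table.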
Below is an example Go file to test this module. It reads input from stdin, converts the English sentences into their Japanese transliteration, and prints them to stdout.
package main

import (
	"bufio"
	"fmt"
	"os"

	// The package is named kanatrans; the alias makes that explicit
	kanatrans "github.com/Luigi-Pizzolito/English2KanaTransliteration"
)

func main() {
	// Create an instance of AllToKana
	allToKana := kanatrans.NewAllToKana()

	// Read lines from stdin until EOF or a read error
	reader := bufio.NewReader(os.Stdin)
	for {
		line, err := reader.ReadString('\n')
		if err != nil {
			break // Exit loop on EOF or error
		}
		// Convert the line to kana
		result := allToKana.Convert(line)
		// Output the result
		fmt.Println(result)
	}
}
Sample Output:
❯ go run .
Hello there.
ヘロー ゼアー。
With this program, you can make Japanese text to speech speak in English!
ウィズ ジス プローラ、 ユー キャン メイク ジャーンイーズ テックスト ツー スピーチ スピーク イン イングシュ!
watashi wa miku desu~
ワタシ ワ ミク デス〜
Hello! こんにちは~ ヘロー, miki松原。
ヘロー! コンニチハ〜 ヘロー、 ミキショウゲン。
// Create an instance of AllToKana
allToKana := kanatrans.NewAllToKana()
// Usage
kana := allToKana.Convert("Hello! watashiwa 初音ミク.")
// -> ヘロー! ワタシワ ショオンミク。
// Create an instance of EngToKana
engToKana := kanatrans.NewEngToKana()
// Usage
kana := engToKana.TranscriptSentence("Hello World!")
// -> ヘローワールド
// Create an instance of KanjiToKana
kanjiToKana := kanatrans.NewKanjiToKana()
// Usage
kana := kanjiToKana.Convert("初音")
// -> ショオン
This needs some work: it takes the most common pronunciation of each Kanji instead of the correct one for the context. Pull requests are welcome!
// Create an instance of HiraganaToKana
hiraganaToKana := kanatrans.NewHiraganaToKana()
// Usage
kana := hiraganaToKana.Convert("こんにちは")
// -> コンニチハ
// Create an instance of RomajiToKana
romajiToKana := kanatrans.NewRomajiToKana()
// Usage
kana := romajiToKana.Convert("kita kita desu")
// -> キタ キタ デス
// Usage
japanesePunctuation := kanatrans.ConvertToJapanesePunctuation("Hello, World!")
// -> Hello、 World!
This module is intended to allow TTS engines which only support Japanese (such as AquesTalk, Softalk, etc.) to speak English. These engines usually limit what punctuation may be present in the input: only commas and full stops are interpreted as pauses, and all other punctuation causes an error.
To use this module for such TTS input, you may enable strict input cleaning mode (only the Japanese comma and full stop appear in the output) by passing a bool to the initialiser of the EngToKana, RomajiToKana and AllToKana types:
// Create an instance of AllToKana with strict punctuation output
allToKana := kanatrans.NewAllToKana(true)
// Create an instance of EngToKana with strict punctuation output
engToKana := kanatrans.NewEngToKana(true)
// Create an instance of RomajiToKana with strict punctuation output
romajiToKana := kanatrans.NewRomajiToKana(true)
You may also use the function kanatrans.ConvertToJapanesePunctuationRestricted instead of kanatrans.ConvertToJapanesePunctuation.
Internally, the AllToKana process function uses a KanjiSplitter type to call func(string) string functions which handle Kanji, Kana, English and punctuation respectively:
// Create an instance of KanjiSplitter with processing callbacks
kanjiSplitter := kanatrans.NewKanjiSplitter(
	kanjiToKana.Convert,                    // Kanji callback
	hiraganaToKana.Convert,                 // Hiragana & Katakana callback
	engToKana.TranscriptSentence,           // English callback
	kanatrans.ConvertToJapanesePunctuation, // Punctuation callback
)
If required, you may use a KanjiSplitter with custom callback functions to provide different processing.
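The callback pattern can be sketched without the module itself. The example below is a simplified stand-in, not the actual KanjiSplitter: the names `classify` and `splitAndDispatch` are hypothetical, and the script detection uses Go's standard `unicode` range tables. It cuts the input into runs of the same script and hands each run to the matching `func(string) string` callback.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

type script int

const (
	scriptOther script = iota // punctuation, spaces and anything else
	scriptKanji
	scriptKana
	scriptLatin
)

// classify assigns a rune to one of the four callback categories.
func classify(r rune) script {
	switch {
	case unicode.Is(unicode.Han, r):
		return scriptKanji
	case unicode.Is(unicode.Hiragana, r) || unicode.Is(unicode.Katakana, r) || r == 'ー':
		// ー (U+30FC) has Unicode script Common, so it is matched explicitly
		return scriptKana
	case unicode.Is(unicode.Latin, r):
		return scriptLatin
	default:
		return scriptOther
	}
}

// splitAndDispatch is a hypothetical stand-in for a splitter: it groups
// consecutive runes of the same script and passes each group to its callback.
func splitAndDispatch(s string, kanji, kana, latin, other func(string) string) string {
	var out strings.Builder
	runes := []rune(s)
	start := 0
	for i := 1; i <= len(runes); i++ {
		if i < len(runes) && classify(runes[i]) == classify(runes[start]) {
			continue // still inside the current run
		}
		run := string(runes[start:i])
		switch classify(runes[start]) {
		case scriptKanji:
			out.WriteString(kanji(run))
		case scriptKana:
			out.WriteString(kana(run))
		case scriptLatin:
			out.WriteString(latin(run))
		default:
			out.WriteString(other(run))
		}
		start = i
	}
	return out.String()
}

func main() {
	// Tagging callbacks make the dispatch visible in the output
	tag := func(label string) func(string) string {
		return func(s string) string { return "[" + label + ":" + s + "]" }
	}
	keep := func(s string) string { return s }
	fmt.Println(splitAndDispatch("Hello 初音ミク!", tag("K"), tag("k"), tag("L"), keep))
	// [L:Hello] [K:初音][k:ミク]!
}
```

Note that this sketch splits on every script change, including spaces between English words, so a sentence-level English callback would receive one word at a time; the module's own splitter may group runs differently.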