About Korean pronunciation conversion script

Charlotte_White · June 27, 2022, 3:53 PM

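==== in Korean (translated to English) ====
I put together a simple script for people who tune Korean vocals.
I have heard that some people use a user dictionary, but I wrote this as a script because it makes some orthography post-processing easier (moving the second consonant of a double final consonant to the next syllable, the initial sound rule, conversion to special symbols such as dy, and so on).

I am still polishing the algorithm, so the release will take a while. Before I finish it, though, I am unsure whether this way of using it is good from a user-experience standpoint, so I would like to hear your opinions.

If it turns out not to be helpful at all, it should also be possible to automate creating and updating a user dictionary with a Cartesian product.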

==== in English ====
I wrote a simple script for Korean users.
I have heard that some people use a user dictionary, but that approach seemed too repetitive to me, so I wrote a script instead.

Working in code also has the advantage of allowing flexible post-processing, and I think it would make it easy to convert to English or Chinese pronunciation as well.

So I’d like to hear whether you find this way of using it useful or inconvenient in terms of user experience.

Of course, since the data is currently divided into three arrays, it should also be possible to generate a user dictionary from them using a Cartesian product.
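A minimal sketch of the Cartesian-product idea, not the actual script. It assumes the three arrays are initial consonants, vowels, and final consonants, and it uses made-up phoneme symbols (`INITIAL_PH`, `VOWEL_PH`, `FINAL_PH` are hypothetical and cover only a tiny subset). The only fixed fact it relies on is the Unicode formula for composing a precomposed Hangul syllable.

```python
from itertools import product

# Jamo in the order Unicode uses for precomposed Hangul syllables.
INITIALS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")                  # 19 initials
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")                # 21 vowels
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")  # 28 finals, "" = none

# Hypothetical phoneme symbols for illustration only (a real table would cover every jamo).
INITIAL_PH = {"ㄱ": "g", "ㄴ": "n"}
VOWEL_PH = {"ㅏ": "a", "ㅗ": "o"}
FINAL_PH = {"": "", "ㄴ": "n"}


def compose(initial, vowel, final=""):
    """Build one Hangul syllable from its jamo using the Unicode formula."""
    i = INITIALS.index(initial)
    v = VOWELS.index(vowel)
    f = FINALS.index(final)
    return chr(0xAC00 + (i * 21 + v) * 28 + f)


def build_dictionary(initial_ph, vowel_ph, final_ph):
    """Map every initial x vowel x final combination to a phoneme string."""
    entries = {}
    for (i, ip), (v, vp), (f, fp) in product(
        initial_ph.items(), vowel_ph.items(), final_ph.items()
    ):
        entries[compose(i, v, f)] = " ".join(p for p in (ip, vp, fp) if p)
    return entries


entries = build_dictionary(INITIAL_PH, VOWEL_PH, FINAL_PH)
for syllable, phonemes in entries.items():
    print(syllable, phonemes)
```

With full tables this yields 19 × 21 × 28 = 11,172 entries, one per modern Hangul syllable. Each entry is generated in isolation, so context-dependent sound changes between neighbouring syllables are not captured by a dictionary built this way.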

DEMO

lindaxiong · July 25, 2022, 7:47 AM

It sounds like Korean, but the pronunciation is strange.
If you use only Japanese pronunciation, many words are too awkward to pronounce.
I also tried the Korean script. It is very convenient, but its output cannot be modified, and there are many differences from actual Korean pronunciation.
In Korean, pronunciation often changes depending on the preceding and following letters.
So I suggest a tool for creating and updating user dictionaries.
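To illustrate the kind of neighbour-dependent change meant here, the sketch below handles only one rule, linking: a final consonant moves into a following syllable that starts with the silent ㅇ, and a double final consonant is split so that its second part moves. It is a simplified assumption-laden example, not the script from this thread. It ignores tensing, palatalization, assimilation, and word boundaries, and the function names are made up.

```python
INITIALS = list("ㄱㄲㄴㄷㄸㄹㅁㅂㅃㅅㅆㅇㅈㅉㅊㅋㅌㅍㅎ")
VOWELS = list("ㅏㅐㅑㅒㅓㅔㅕㅖㅗㅘㅙㅚㅛㅜㅝㅞㅟㅠㅡㅢㅣ")
FINALS = [""] + list("ㄱㄲㄳㄴㄵㄶㄷㄹㄺㄻㄼㄽㄾㄿㅀㅁㅂㅄㅅㅆㅇㅈㅊㅋㅌㅍㅎ")

# Double finals as (part that stays, part that moves to the next syllable).
SPLIT = {
    "ㄳ": ("ㄱ", "ㅅ"), "ㄵ": ("ㄴ", "ㅈ"), "ㄶ": ("ㄴ", "ㅎ"), "ㄺ": ("ㄹ", "ㄱ"),
    "ㄻ": ("ㄹ", "ㅁ"), "ㄼ": ("ㄹ", "ㅂ"), "ㄽ": ("ㄹ", "ㅅ"), "ㄾ": ("ㄹ", "ㅌ"),
    "ㄿ": ("ㄹ", "ㅍ"), "ㅀ": ("ㄹ", "ㅎ"), "ㅄ": ("ㅂ", "ㅅ"),
}


def decompose(ch):
    """Return (initial, vowel, final) for a Hangul syllable, else None."""
    code = ord(ch) - 0xAC00
    if not 0 <= code < 11172:
        return None
    i, rest = divmod(code, 21 * 28)
    v, f = divmod(rest, 28)
    return INITIALS[i], VOWELS[v], FINALS[f]


def compose(initial, vowel, final=""):
    i, v, f = INITIALS.index(initial), VOWELS.index(vowel), FINALS.index(final)
    return chr(0xAC00 + (i * 21 + v) * 28 + f)


def link(text):
    """Apply the simplified linking rule from left to right."""
    chars = list(text)
    for k in range(len(chars) - 1):
        cur, nxt = decompose(chars[k]), decompose(chars[k + 1])
        if cur is None or nxt is None:
            continue
        (i1, v1, f1), (i2, v2, f2) = cur, nxt
        # Nothing to link without a final; final ㅇ never moves; next must start with ㅇ.
        if not f1 or f1 == "ㅇ" or i2 != "ㅇ":
            continue
        keep, move = SPLIT.get(f1, ("", f1))
        if move == "ㅎ":
            # ㅎ is dropped before a vowel; any remaining consonant links instead.
            keep, move = "", keep
        chars[k] = compose(i1, v1, keep)
        chars[k + 1] = compose(move or "ㅇ", v2, f2)
    return "".join(chars)


for word in ["음악이", "읽어", "좋아", "많아"]:
    print(word, "->", link(word))
```

Because the result depends on the neighbouring syllable, a per-syllable dictionary cannot express this rule on its own; it would need either whole-word entries or a pre-processing pass along these lines.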

Charlotte_White · July 25, 2022, 8:16 AM

Thanks for your feedback.
I am already well aware of these phonological characteristics, and from the beginning I planned to implement the script so that it chains the preceding and following letters.
However, because of technical limitations in the internal implementation that I discovered during development, only simple conversion is implemented at the moment. (This is mentioned on GitHub.)

Originally, I wanted to use one of the morpheme analysers that universities such as Seoul National University and KAIST have been researching and developing since 1990. But no matter how much system programming I tried, including the Lua-C FFI module, the stability was poor, so I gave up on that and quickly finalized the script.

Regarding the method you suggested, I think the problem can be solved well enough by contributing to the repository only when necessary.

Charlotte_White · July 25, 2022, 8:41 AM

As for the words that are pronounced badly, I plan to fork KoG2P’s algorithm once I have finished preparing for the hackathon I am working on right now.
I think it will be okay once we get rid of the problematic consonant assimilation.
When it is updated, please give me feedback.

lindaxiong · July 25, 2022, 10:32 AM

Sure~^^ Thank you for your development work.

Aaron_Batchoy · August 13, 2022, 1:43 PM

Could you make a Filipino/Tagalog and Malay pronunciation dictionary for SynthV, please?