Hi, there! Welcome to Zhengjia Xu's page!

Personal Toy Projects

1. Didide: A toy project for Chinese character correction

Image source: ABC MANDARIN MISSION

Didide is a project I created to correct misuse of the homophones 的, 地, and 得, which beginners and careless writers of Chinese often confuse. The source code can be found on GitHub.

python playground.py "我觉的我烦的有点难过,因为我得培根忘记吃了" "didide_model.pt"
# the output'd be: 我觉得我烦得有点难过,因为我的培根忘记吃了
python playground.py "我觉的我烦的有点难过,因为我得培根忘记吃了,而且这种东西得营养一般般,但是好吃的哟!我天天早上开心的享受它的味道,开心的受不鸟哩!我咔咔的吃,吃的要满嘴流油 ,哈哈哈,痛快放肆的吃" "didide_model.pt"
# the output'd be: 我觉得我烦得有点难过,因为我的培根忘记吃了,而且这种东西的营养一般般,但是好吃的哟!我天天早上开心地享受它的味道,开心得受不鸟哩!我咔咔地吃,吃得要满嘴流油,哈哈哈,痛快放肆地吃
python playground.py "我要飛的更高,測試一下繁體預測的對不對,分類的還不錯"
# the output'd be: 我要飛得更高,測試一下繁體預測的對不對,分類得還不錯
# also works well with traditional Chinese. Yeah, it's because they actually have the same input ids.
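To make the task concrete, here is a minimal sketch of how the correction can be framed: find every 的/地/得 in the sentence, then swap in whichever of the three is predicted for that position. This is only an illustration and not the actual Didide code; `find_candidates` and `apply_predictions` are hypothetical names, and the predictions are written by hand where a trained model would normally supply them.

```python
# Hypothetical sketch, not taken from the Didide source code.
DE_CHARS = ("的", "地", "得")


def find_candidates(text):
    """Return the index of every 的/地/得 in the text."""
    return [i for i, ch in enumerate(text) if ch in DE_CHARS]


def apply_predictions(text, predictions):
    """Replace candidate characters with the predicted ones.

    `predictions` maps a character index to one of DE_CHARS. In a real
    system these would come from a classifier; here they are an assumption.
    """
    chars = list(text)
    for i, ch in predictions.items():
        if chars[i] not in DE_CHARS or ch not in DE_CHARS:
            raise ValueError(f"position {i} is not a 的/地/得 substitution")
        chars[i] = ch
    return "".join(chars)


sentence = "我觉的我烦的有点难过"
positions = find_candidates(sentence)
# Hand-written predictions standing in for a model's output.
fixed = apply_predictions(sentence, {positions[0]: "得", positions[1]: "得"})
print(fixed)
```

Only the three target characters are ever touched, so the rest of the sentence passes through unchanged.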

My motivation for this project was to build a real-world application with my software knowledge. The Chinese characters 的, 地, and 得 share the same pinyin, “de”, in Mandarin, so it’s easy to type the wrong one. Misuse of the three characters is very common on the Internet and sometimes even creates ambiguity. Although most people seem to think it’s not a big deal, I still prefer using correct grammar, so I wrote this toy project. It is not perfect, but it’s a good start for me in learning how to make a real-world application with programming.

2. Shutheblanksup: a VSCode extension

When I first started programming, I didn’t like the spaces that appeared after the IDE reformatted my code, so I wrote this extension, which deletes those spaces. Now I’m used to them; in fact, I even add the spaces manually when formatting code.