基于对抗生成网络的多风格化的汉字

Jie-fu CHEN; Hua CHEN; Xing XU; Yan-li JI; Li-jiang CHEN

doi:10.3969/j.issn.1001-0548.2019.05.003

基于对抗生成网络的多风格化的汉字

doi: 10.3969/j.issn.1001-0548.2019.05.003

1.
电子科技大学计算机科学与技术学院成都 610054
2.
江西师范大学计算机信息工程学院南昌 330027
3.
北京阿凡题科技有限公司北京海淀区 100083

基金项目:

the National Natural Science Foundation of China under Grant 61602089

the National Natural Science Foundation of China under Grant 61673088

详细信息

作者简介:
陈杰夫(1993-), 男, 主要从事多媒体内容分析、计算机视觉和社交媒体分析等方面的研究.E-mail:790416231@qq.com

中图分类号: TN97

Learning to Write Multi-Stylized Chinese Characters by Generative Adversarial Networks

1.
College of Computer Science and Technology, University of Electronic Science and Technology of China Chengdu 610054
2.
School of Computer Information Engineering, Jiangxi Normal University Nanchang 330027
3.
Beijing Afanti Inc. Haidian Beijing 100083

Funds:

the National Natural Science Foundation of China under Grant 61602089

the National Natural Science Foundation of China under Grant 61673088

More Information

Author Bio:
CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

摘要: 随着生成对抗网络（GAN）的发展，中文字体转换领域的研究越来越多，研究者能够生成高质量的汉字图像。这些字体转换模型可以使用GAN将源字体转换为目标字体。然而，目前的方法有以下局限：1）生成的图像模糊；2）模型一次只能学习和生成一种目标字体。针对这些问题，该文开发了一种全新的模式来执行中文字体转换。首先，将字体信息附加到图像上，告诉生成器需要转换的字体；然后，通过卷积网络提取和学习特征映射，并使用转置卷积网络生成照片真实图像。使用真实图像作为监控信息，以确保生成的字符和字体与它们自身一致。这个模型只需要训练一次，就能够将一种字体转换为多种字体并生成新的字体。对7个中文字体数据集的大量实验表明，该方法在中文字体转换中优于其他几种方法。
- Chinese font styles transformation /
- generative adversarial networks /
- multiple domains /
- new font creation
Abstract: With the development of Generative Adversarial Networks (GAN), more and more researches have been conducted in the field of Chinese fonts transformation and researchers are able to generate high-quality images of Chinese characters. These font transformation models can transform a source font to a target font using GAN. However, current methods have limitations that 1) generated images are oftentimes blurry and 2) models can only learn and produce one target font at a time. To address these problems, we have developed a brand-new model to perform Chinese font transformation. First, font information is attached to images to tell the generator the fonts that we want to transform. Then, the generator extracts and learns feature mappings through convolutional networks and generates photo-realistic images using transposed convolutional networks. The ground truth images are then used as supervisory information to ensure that characters and fonts generated are consistent with themselves. This model only needs to be trained once, but it is able to transform one font to multiple fonts and produce new fonts. Extensive experiments on seven Chinese font datasets show the superiority of the proposed method over several other methods in Chinese font transformation.
- Chinese font styles transformation /
- generative adversarial networks /
- multiple domains /
- new font creation
Figure 1. Our model compare with other models

下载: 全尺寸图片幻灯片

Figure 2. Real images and our generating images

下载: 全尺寸图片幻灯片

Figure 3. Our models results compare with other models results

下载: 全尺寸图片幻灯片

Figure 4. Comparison of the results of three models

下载: 全尺寸图片幻灯片

[1]	SUN Dan-yang, REN Tong-zheng, LI Chong-xuan, et al. Learning to write stylized Chinese characters by reading a handful of examples[C]//Proceedings of the Twenty-Seventh International Joint Conference on Artificial Intelligence. Stockholm, Sweden: [s.n]. 2018: 920-927.
[2]	ISOLA P, ZHU J Y, ZHOU T, et al. Image-to-image translation with conditional adversarial networks[C]//IEEE Conference on Computer Vision & Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 5967-5976
[3]	CHO Yun-jey I, CHOI Min-je, KIM Mun-young, et al. Stargan: Unified generative adversarial networks for multi-domain image-to-image translation[C]//IEEE Conference on Computer Vision and Pattern Recognition.[S.l.]: IEEE, 2018: 8789-8797.
[4]	XU Song-hua, JIN Tao, JIANG Hao, et al. Automatic generation of personal Chinese handwriting by capturing the characteristics of personal handwriting[C]//Proceedings of the Twenty-First Conference on Innovative Applications of Artificial Intelligence. Pasadena, California, USA: [s.n.], 2009.
[5]	XU Song-hua, JIANG Hao, JIN Tao, et al. Automatic generation of Chinese calligraphic writings with style imitation[J]. IEEE Intelligent Systems, 2009, 4(2):44-53. http://cn.bing.com/academic/profile?id=b0616b77bc72a90fcb74ca578cca1ac2&encoded=0&v=paper_preview&mkt=zh-cn
[6]	MIRZA M, OSINDERO S. Conditional generative adversarial nets[EB/OL].[2014-11-06]. https://arxiv.org/abs/1411.1784.
[7]	SUN Han-fei, LUO Yi-ming, LU Zhang. Unsupervised Typography Transfer[EB/OL].[2018-02-07]. https://arxiv.org/abs/1802.02595.
[8]	CHANG Jie, GU Yu-jun, ZHANG Ya. Chinese typography transfer[EB/OL].[2017-07-16]. https://arxiv.org/abs/1707.04904.
[9]	GOODFELLOW I J, JEAN P A, MIRZA M, et al. Generative adversarial networks[EB/OL].[2014-06-04]. https://arxiv.org/abs/1406.2661v1.
[10]	ZHU Jun-yan, PARK T, ISOLA P. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]//IEEE International Conference on Computer Vision. Venice, Italy: IEEE, 2017: 2242-2251.
[11]	KIM T, CHA M, KIM H, et al. Learning to discover cross-domain relations with generative adversarial networks[C]//Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: [s.n.], 2017: 1857-1865.
[12]	TZENG E, HOFFMANM J, SAENKO K, et al. Adversarial discriminative domain adaptation[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Honolulu, HI, USA: IEEE, 2017: 2962-2971.
[13]	HE Kai-ming, ZHANG Xiang-yu, REN Shao-qing, et al. Deep residual learning for image recognition[C]//2016 IEEE Conference on Computer Vision and Pattern Recognition. Las Vegas, NV, USA: IEEE, 2016: 770-778.

[1]	王永, 王淞立, 邓江洲. 基于生成对抗网络的评分可信推荐模型 . 电子科技大学学报, 2024, 53(2): 1-8. doi: 10.12178/1001-0548.2023116
[2]	周丰丰, 孙燕杰, 范雨思. 基于miRNA组学的数据增强算法 . 电子科技大学学报, 2023, 52(2): 182-187. doi: 10.12178/1001-0548.2023002
[3]	GE Shucan, LI Hailong, MENG Lin, RAUF Abdur, ULLAH Safi. Study on the Strong Radar Echoes at Polar Mesosphere Using a New Dataset Analysis Software . 电子科技大学学报, 2021, 50(6): 868-876. doi: 10.12178/1001-0548.2021115
[4]	LI Peng, HU Jiang-ping, ZHANG Yu-ping. Design and Analysis of Distributed Optimization Algorithm for Economic Dispatch Problem of Energy-Water Hybrid Networks . 电子科技大学学报, 2020, 49(5): 652-659, 665. doi: 10.12178/1001-0548.2020238
[5]	FAN Cong-min, ZHANG Ying-jun, YUAN Xiao-jun, LI Si-xian. Machine Learning for Heterogeneous Ultra-Dense Networks with Graphical Representations . 电子科技大学学报, 2020, 49(6): 826-836. doi: 10.12178/1001-0548.2020356
[6]	SU Shuang-ping, YANG Wen, ZHAO Zhi-yun. Security Analysis of Opinion Dynamics in Social Networks . 电子科技大学学报, 2020, 49(6): 924-933. doi: 10.12178/1001-0548.2019234
[7]	Fang-yuan YU. A New Alternating Iteration Recovery Algorithm for Compressed Channel Estimation with Basis Mismatch . 电子科技大学学报, 2018, 47(4): 491-496. doi: 10.3969/j.issn.1001-0548.2018.04.003

点击查看大图

图(4)

计量

文章访问数: 4569
HTML全文浏览量: 1369
PDF下载量: 174
被引次数: 0

全文HTML

1. Introduction

Chinese font transformation and font design have always been problematic. One problem is that, compared to English or Latin letters, the total number of Chinese characters is huge. Chinese government standard GB18030-2000, there are 27 533 unique characters^[1] and the number of daily used characters is at least 3 500. Another problem is that Chinese characters have complex shapes and structures and researchers cannot simply transform them by classifying them.

Some methods, such as zi2zi and Chinese typography transfer, implement Chinese font transformation based on Pix2Pix^[2], which is an image-to-image translation. These models learn feature maps from one target font and then apply the feature maps to the source font to do the transformation. However, one limitation of doing so is that we need to train this model again if we want to transform characters to another target font, which can be very time-consuming. Another problem is that the images generated are oftentimes blurry.

Another image-to-image translation research is StarGAN ^[3]. StarGAN can learn the mappings among multiple domains using only a single generator and a discriminator, training effectively from images of all domains.

Nevertheless, this method cannot work effectively on transforming Chinese fonts because Chinese characters have different radicals, graphic components and strokes.

To address these problems, we have developed a new GAN's method. We use only one generator and one discriminator. When given an image and a label (one-hot vector) of the font of interest to the generator, the generator will generate fonts corresponding to the given label. Then given the fake image and real image to the discriminator, the discriminator will discriminate which one is the real image and give labels to both images on the font style. If given one image and more than one labels, the generator can create a new font. We will explain the experimental content about optimization process and the loss function in detail at proposed framework.

In short, our main contributions are as follows.

1) Propose a novel Chinese font transform method and a new font creation method, the former can transform from one font to multiple fonts while the latter can combine multiple fonts to generate a new font.

2) Improve the GAN's generator, let it learn the specified multiple fonts information and specify the font to be generated.

3) Produce qualitative and quantitative Chinese characters images, compared with other methods.

2. Related Work

Chinese Fonts Transformation. Some Chinese font transformation methods^[4-5] view Chinese characters as the combination of radicals and strokes. zi2zi is the first deep model. It views Chinese characters as images, uses CGAN^[6] to transfer typography style^[7-8], and can successfully transform fonts of Chinese characters. Style-Aware Auto-Encoder (SA-VAE) can capture different graphic components of Chinese characters by disentangling the latent features into content-related and style-related components. Chinese typography transfer is an end-to-end model which does not rely on the graphical components of Chinese characters or their stroke orders, and this model treats each single Chinese character as an inseparable image.

StarGAN. StarGAN is used for handling face image styles conversion. It can take in training data of multiple domains, and learn the mappings among all available domains using only one single generator^[3]. This generator can combine learned feature mappings to generate new images.

Generative Adversarial Networks. Generative adversarial network (GAN)^[9] has shown its superior performance in computer vision and image translation. A typical GAN model consists of two modules: a generator and a discriminator. The generator learns from the real samples to generate fake samples to "fool" the discriminator, and the discriminator tries to distinguish the real samples from the fake ones.

Image-to-Image Translation. In recent years with development of GAN, image-to-image translation has achieved great success in the field of image migration. For instance, Cycle-GAN^[10] and DiscoGAN^[11] preserve key attributes between the input and the translated images by utilizing a cycle consistency loss.

Motivated by zi2zi and StarGAN, we developed a new method to generate different fonts of Chinese characters. Our approach is to use different font datasets and corresponding labels to train the generator. In this way, the generator can successfully generate fonts that we specify.

5. Conclusion

In this paper, we proposed a new model to do one-to-many Chinese fonts transformation and to produce new fonts by combining existing fonts. Comparing with zi2zi and Chinese typography transfer, our model can produce higher-quality images and is reusable for different fonts. This reusable feature saves a lot of time compared to modifying and training the model again. Besides, the reuse of the same model is a major focus of transfer learning for deep learning.

参考文献 (13)

姓名
邮箱
手机号码
标题
留言内容
验证码

留言板

基于对抗生成网络的多风格化的汉字

doi: 10.3969/j.issn.1001-0548.2019.05.003

作者简介:
陈杰夫(1993-), 男, 主要从事多媒体内容分析、计算机视觉和社交媒体分析等方面的研究.E-mail:790416231@qq.com

Learning to Write Multi-Stylized Chinese Characters by Generative Adversarial Networks

Author Bio:
CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

计量

Learning to Write Multi-Stylized Chinese Characters by Generative Adversarial Networks

doi: 10.3969/j.issn.1001-0548.2019.05.003

1. 电子科技大学计算机科学与技术学院成都 610054

2. 江西师范大学计算机信息工程学院南昌 330027

3. 北京阿凡题科技有限公司北京海淀区 100083

作者简介:
CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

English Abstract

Learning to Write Multi-Stylized Chinese Characters by Generative Adversarial Networks

1. College of Computer Science and Technology, University of Electronic Science and Technology of China Chengdu 610054

2. School of Computer Information Engineering, Jiangxi Normal University Nanchang 330027

3. Beijing Afanti Inc. Haidian Beijing 100083

Author Bio:
CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

全文HTML

4.1. Experimental Settings

4.2. Network Architecture

4.3. Results on zi2zi's Dataset

目录

期刊在线

编辑办公

友情链接

留言板

基于对抗生成网络的多风格化的汉字

doi: 10.3969/j.issn.1001-0548.2019.05.003

作者简介: 陈杰夫(1993-), 男, 主要从事多媒体内容分析、计算机视觉和社交媒体分析等方面的研究.E-mail:790416231@qq.com

Learning to Write Multi-Stylized Chinese Characters by Generative Adversarial Networks

Author Bio: CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

计量

出版历程

Learning to Write Multi-Stylized Chinese Characters by Generative Adversarial Networks

doi: 10.3969/j.issn.1001-0548.2019.05.003

1. 电子科技大学计算机科学与技术学院 成都 610054 2. 江西师范大学计算机信息工程学院 南昌 330027 3. 北京阿凡题科技有限公司 北京 海淀区 100083

作者简介: CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

English Abstract

Learning to Write Multi-Stylized Chinese Characters by Generative Adversarial Networks

1. College of Computer Science and Technology, University of Electronic Science and Technology of China Chengdu 610054 2. School of Computer Information Engineering, Jiangxi Normal University Nanchang 330027 3. Beijing Afanti Inc. Haidian Beijing 100083

Author Bio: CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

全文HTML

4.1. Experimental Settings

4.2. Network Architecture

4.3. Results on zi2zi's Dataset

目录

期刊在线

编辑办公

友情链接

作者简介:
陈杰夫(1993-), 男, 主要从事多媒体内容分析、计算机视觉和社交媒体分析等方面的研究.E-mail:790416231@qq.com

Author Bio:
CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

1. 电子科技大学计算机科学与技术学院成都 610054

2. 江西师范大学计算机信息工程学院南昌 330027

3. 北京阿凡题科技有限公司北京海淀区 100083

作者简介:
CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis

1. College of Computer Science and Technology, University of Electronic Science and Technology of China Chengdu 610054

2. School of Computer Information Engineering, Jiangxi Normal University Nanchang 330027

3. Beijing Afanti Inc. Haidian Beijing 100083

Author Bio:
CHEN Jie-fu was born in 1993, and his research interests include multimedia content analysis, computer vision and social media analysis