Abstract

Objective: Sketch-to-photo translation has a wide range of applications in public safety and digital entertainment. For example, it can help the police find fugitives and missing children, or generate an avatar for a social media account. Existing sketch-to-photo translation algorithms can only translate sketches into photos within the same age group; they do not address cross-age sketch-to-photo translation. The cross-age task also has a wide range of applications. For example, when a sketch held by the police has become outdated over time, the task can generate an aged photo from the outdated sketch to help the police find the suspect. Because paired cross-age sketch and photo images are difficult to obtain, no such datasets are available. To solve this problem, this study combines dual generative adversarial networks (DualGANs) and identity-preserved conditional generative adversarial networks (IPCGANs) to propose double dual generative adversarial networks (D-DualGANs). Method: DualGANs have the advantage of two-way conversion without requiring paired samples. However, they can only perform two-way conversion of a single attribute and cannot convert two attributes at the same time. IPCGANs can age or rejuvenate a face while retaining its personalized features but cannot perform two-way conversion between different age groups. This study treats the age span as a domain-conversion problem and the cross-age sketch-to-photo translation task as joint style and age conversion. We combine the characteristics of the above networks to build D-DualGANs, setting up four generators and four discriminators for adversarial training.
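The four-generator setup just described can be illustrated with a minimal toy sketch. This is not the paper's implementation: the "generators" below are invertible pointwise maps standing in for trained CNNs, and all names (`g_sketch2photo`, `g_age_fwd`, the toy identity descriptor, and so on) are illustrative assumptions. The sketch only shows how the four-domain cycle composes and how cycle-style losses could be measured.

```python
import random

random.seed(0)

def make_generator_pair(scale):
    """Toy stand-in for a trained generator and its inverse counterpart:
    an invertible pointwise map (the real generators would be CNNs)."""
    fwd = lambda img: [p * scale for p in img]
    bwd = lambda img: [p / scale for p in img]
    return fwd, bwd

# Style pair maps between the sketch and photo domains;
# age pair maps between the source and target age groups.
g_sketch2photo, g_photo2sketch = make_generator_pair(2.0)
g_age_fwd, g_age_bwd = make_generator_pair(0.5)

def cross_age_photo(sketch_src):
    """Forward path: sketch (age A) -> photo (age A) -> photo (age B)."""
    return g_age_fwd(g_sketch2photo(sketch_src))

def reconstructed_sketch(photo_tgt):
    """Inverse path: photo (age B) -> photo (age A) -> sketch (age A)."""
    return g_photo2sketch(g_age_bwd(photo_tgt))

def full_reconstruction_loss(sketch_src):
    """Mean L1 distance between the input sketch and its four-generator cycle."""
    recon = reconstructed_sketch(cross_age_photo(sketch_src))
    return sum(abs(a - b) for a, b in zip(sketch_src, recon)) / len(sketch_src)

def identity_features(img):
    """Toy identity descriptor (a real module would be a face-recognition net)."""
    return sum(img) / len(img)

def reconstructed_identity_loss(sketch_src):
    """Distance between identity features of the input and its reconstruction."""
    recon = reconstructed_sketch(cross_age_photo(sketch_src))
    return abs(identity_features(sketch_src) - identity_features(recon))

sketch = [random.random() for _ in range(64)]  # flattened toy "image"
print(full_reconstruction_loss(sketch))   # 0.0: the toy maps are exact inverses
print(reconstructed_identity_loss(sketch))
```

With trained generators that are not exact inverses, both losses would be positive; minimizing them during adversarial training is what ties the four generators together.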
The method learns not only the mapping from the sketch domain to the photo domain and its inverse but also the mapping from the source age group to the target age group and its inverse. In D-DualGANs, the original sketch or photo image passes successively through the four generators, completing a four-domain conversion that yields a cross-age photo or sketch image and a reconstructed same-age sketch or photo image. The generators are optimized by a full reconstruction loss measuring the distance between the original image and its same-age reconstruction. We also use an identity-preservation module to introduce a reconstructed identity loss that maintains the personalized features of the face. Eventually, input sketch and photo images from different age groups are converted into photos and sketches of the other age group. This method does not require paired samples, thereby overcoming the current lack of paired cross-age sketch and photo samples. Result: The experiments combine the images of the CUFS (Chinese University of Hong Kong (CUHK) face sketch database) and CUFSF (CUHK face sketch FERET database) sketch-photo datasets and produce an age label for each image based on the results of age-estimation software. According to the age labels, the sketch and photo images in the datasets are divided into three groups, 11-30, 31-50, and 50+, with each age group evenly distributed. Six D-DualGAN models were trained to realize pairwise conversion between sketches and photo images of the three age groups, namely, 11-30 sketch and 31-50 photo, 11-30 sketch and 50+ photo, 31-50 sketch and 11-30 photo, 31-50 sketch and 50+ photo, 50+ sketch and 31-50 photo, and 50+ sketch and 11-30 photo.
Because there is little research on cross-age sketch-to-photo translation, to illustrate the effectiveness of the method, the images generated by our method are compared with those generated by applying DualGANs followed by IPCGANs. Our images are of good quality, with less distortion and noise. Using an age-estimation CNN to judge the age accuracy of the generated images, the mean absolute error (MAE) of our method is lower than that of the direct cascade of DualGANs and IPCGANs. To evaluate the similarity between the generated and original images, we invited volunteers unrelated to this study to judge whether the generated image depicts the same person as the original image. The results show that the generated aged images are similar to the originals, whereas the generated younger images are poorer. Among them, the 31-50 photos generated from 11-30 sketches were judged the same as the original images. Conclusion: The D-DualGANs proposed in this study learn the mapping and inverse mapping between the sketch domain and the photo domain as well as between different age groups, converting both the age and style attributes of the input image. Photos of different ages can be generated from a given sketch. Through the introduced reconstructed identity loss and complete identity loss, the generated image effectively retains the identity features of the original image. Thus, the problem of cross-style and cross-age image translation is solved effectively. D-DualGANs can serve as a general framework for other computer vision tasks that require converting two attributes at the same time. However, this method still has shortcomings. For example, conversion between different age-group pairs requires training different models: to achieve both 11-30 sketches to 31-50 photos and 11-30 sketches to 50+ photos, two D-DualGAN models must be trained separately.
This is cumbersome in practical applications; a direction for future improvement is to train a single network model that achieves conversion among all age groups.

Full text