Machine Learning

Aug 12, 2018

Progress in imaging driven by AI and ML has been remarkable. Microsoft, Google, Adobe and others have achieved exciting results in this field, and a growing number of small technical teams are bringing these lab technologies to users.

Content Analysis

Both Google's and Microsoft's cloud computing divisions already offer mature technical solutions, while Adobe Sensei is still in beta testing. Their vision APIs cover image analysis and content recognition: they analyze and identify the content of an image and generate tags for it. They also offer real-time content recognition for video, which is technically the same as per-frame image recognition. This is the most basic capability in machine-learning-driven imaging.
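To make the "video recognition is per-frame image recognition" point concrete, here is a minimal Python sketch. It is not any vendor's API: `tag_video`, the `tag_image` callback and the `min_share` threshold are hypothetical names, and keeping only tags that appear in a share of the frames is just one assumed way to merge per-frame results.

```python
from collections import Counter
from typing import Callable, Iterable, List, Sequence


def tag_video(
    frames: Iterable[object],
    tag_image: Callable[[object], Sequence[str]],
    min_share: float = 0.5,
) -> List[str]:
    """Tag every frame with an image tagger, then keep the tags that
    appear in at least `min_share` of the frames (an assumed rule)."""
    counts: Counter = Counter()
    total = 0
    for frame in frames:
        total += 1
        counts.update(set(tag_image(frame)))  # count each tag once per frame
    if total == 0:
        return []
    return sorted(tag for tag, n in counts.items() if n / total >= min_share)


# Stand-in tagger for illustration: each "frame" already carries its labels.
# A real system would call an image recognition model or service here.
def fake_tagger(frame):
    return frame["labels"]


frames = [
    {"labels": ["dog", "grass"]},
    {"labels": ["dog", "ball"]},
    {"labels": ["dog", "grass"]},
]
video_tags = tag_video(frames, fake_tagger)
print(video_tags)
```

Here "ball" shows up in only one of three frames, so it is dropped, while "dog" and "grass" survive as tags for the whole clip.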

Building on this technology and a database of already collected content, finer-grained recognition becomes possible. At the previous level, the machine can only tell that a picture shows a person, a man or a woman, an elderly person or a child. At this level, it can directly recognize who the person is, what kind of flower this is, or what breed of dog.

Having recognized the content, the machine can also automatically generate a one-sentence description of the picture.

Microsoft Azure also offers simple image cropping. In principle, it works by identifying the main subject of the image and trying to keep that subject in the most prominent position at different aspect ratios and sizes (while automatically leaving the text in place). The same technology is used for content-aware scaling in Photoshop, powered by Adobe Sensei.
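The cropping principle can be sketched in a few lines of Python. This is only an illustration under assumptions: the subject's bounding box is taken as given (in practice it would come from a detection model), and `crop_around` is a hypothetical helper, not how Azure or Photoshop actually implement it.

```python
from typing import Tuple

Box = Tuple[int, int, int, int]  # (left, top, right, bottom) in pixels


def crop_around(image_size: Tuple[int, int], subject: Box, aspect: float) -> Box:
    """Return the largest crop with the given width/height ratio,
    centered on the subject as far as the image borders allow."""
    img_w, img_h = image_size
    # Largest crop of the requested aspect ratio that fits in the image.
    crop_w = min(img_w, int(img_h * aspect))
    crop_h = min(img_h, int(crop_w / aspect))
    # Center of the subject's bounding box.
    cx = (subject[0] + subject[2]) / 2
    cy = (subject[1] + subject[3]) / 2
    # Center the crop on the subject, then clamp it inside the image.
    left = max(0, min(int(cx - crop_w / 2), img_w - crop_w))
    top = max(0, min(int(cy - crop_h / 2), img_h - crop_h))
    return (left, top, left + crop_w, top + crop_h)


# A 1000x600 image whose main subject sits on the right-hand side.
subject_box = (700, 200, 900, 400)
square = crop_around((1000, 600), subject_box, 1.0)
portrait = crop_around((1000, 600), subject_box, 0.5)
print(square, portrait)
```

The same subject box yields a different crop for each aspect ratio, and in both cases the subject stays inside the frame.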

In this area of recognition, these technologies have become so mature that today we hardly need to add tags to images manually in any image service; the machine analyzes the images and tags them automatically.

Image Processing

Removal and Complementation

One research team proposes a novel approach to image completion that produces images that are consistent both locally and globally. With a fully convolutional neural network, it can complete an image of arbitrary resolution by filling in missing regions of any shape. To train the completion network to be consistent, the team uses global and local context discriminators, which are trained to distinguish real images from completed ones.
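As a rough illustration of what "global" and "local" could mean here, the Python sketch below assumes that the global discriminator looks at the whole completed image while the local one looks at a patch around the filled region. The networks themselves are omitted, and `composite` and `local_patch` are hypothetical helpers working on plain nested lists, not the team's code.

```python
from typing import List

Image = List[List[float]]  # grayscale image, row-major
Mask = List[List[int]]     # 1 = missing pixel, 0 = known pixel


def composite(original: Image, generated: Image, mask: Mask) -> Image:
    """Keep known pixels from the original and take the missing
    pixels from the completion network's output."""
    return [
        [g if m else o for o, g, m in zip(o_row, g_row, m_row)]
        for o_row, g_row, m_row in zip(original, generated, mask)
    ]


def local_patch(image: Image, mask: Mask, size: int) -> Image:
    """Crop a size x size patch centered on the masked region, clamped
    to the image borders. Assumes a non-empty mask and size <= image size."""
    rows = [r for r, row in enumerate(mask) if any(row)]
    cols = [c for row in mask for c, m in enumerate(row) if m]
    cy = (min(rows) + max(rows)) // 2
    cx = (min(cols) + max(cols)) // 2
    height, width = len(image), len(image[0])
    top = max(0, min(cy - size // 2, height - size))
    left = max(0, min(cx - size // 2, width - size))
    return [row[left:left + size] for row in image[top:top + size]]


# Toy 4x4 example: one missing pixel, filled with a made-up value of 0.5.
original = [[1.0] * 4 for _ in range(4)]
generated = [[0.5] * 4 for _ in range(4)]
mask = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]

completed = composite(original, generated, mask)  # input to a global discriminator
patch = local_patch(completed, mask, 3)           # input to a local discriminator
print(completed[1], patch)
```

Under this assumption, the global view checks that the whole picture still makes sense, while the local patch checks that the filled area blends in with its immediate surroundings.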

© 2022 Chang Lisheng. All Rights reserved