Introduction.- Part I: Image and Video Conversion (VC).- Image and Video Super-resolution.- Depth Estimation and Control.- Motion Vector Estimation.- Quick 3D: 2D to 3D Conversion.- Visual Lossless Color Compression Technology.- Part II: TV and Display Applications (TVD).- On-screen Display Detection (OSD).- Adaptive Content Analysis.- Natural Effects Generation and Reproducing.- Part III: ML and AI (ML-AI).- Location Mining.- Image Categorization in the Cloud.- General Purpose Machine Learning (ML) Platform and ML Wizard.- Part IV: Mobile Algorithms (MA).- Color-coded Aperture Camera.- Mobile User Profiling.- 3D Object Reconstruction.- Motion Photo.- Iris Recognition.- Conclusion.
Michael N. Rychagov received his MS degree in acoustical imaging and his PhD degree from Moscow State University (MSU) in 1986 and 1989, respectively. In 2000, he received a Dr.Sc. degree (Habilitation) from the same university. Since 1991, he has been involved in teaching and research at the National Research University of Electronic Technology (MIET) as an associate professor in the Department of Theoretical and Experimental Physics (1998), professor in the Department of Biomedical Systems (2008), and professor in the Department of Informatics and SW for Computer Systems (2014). In 2004, he joined Samsung R&D Institute in Moscow, Russia (SRR), where he worked for almost 14 years on imaging algorithms for printing, scanning and copying, TV and display technologies, and multimedia and tomographic areas, including the last 8 years as Director of Division at SRR. Currently, he is Senior Manager of SW Development at the Moscow branch (Russia) of Align Technology, Inc. (USA). His technical and scientific interests are image and video signal processing, biomedical modelling, and engineering applications of machine learning and artificial intelligence. He is a Member of the Society for Imaging Science and Technology and a Senior Member of IEEE.
Ekaterina V. Tolstaya received her MS degree in applied mathematics from Moscow State University in 2000. In 2004, she completed her MS degree in geophysics at the University of Utah, USA, where she worked on inverse scattering in electromagnetics. From 2004, she worked on problems of image processing and reconstruction at Samsung R&D Institute in Moscow, Russia. Based on this work, she obtained her PhD degree in 2011, with research on image processing algorithms for printing. In 2014, she continued her career at the Moscow branch (Russia) of Align Technology, Inc. (USA), working on problems involving computer vision, 3D geometry and machine learning. Since 2020, she has worked at Aramco Innovations LLC in Moscow, Russia, on geophysical modelling and inversion.
Mikhail Y. Sirotenko received his engineering degree in control systems from Taganrog State University of Radio Engineering (2005) and his PhD in Robotics and AI from Don State Technical University (2009). In 2009, he co-founded the computer vision startup CVisionLab. Shortly afterwards, he joined Samsung R&D Institute in Moscow, Russia (SRR), where he led a team working on applied machine learning and computer vision research. In 2015, he joined Amazon to work as a research scientist on the Amazon Go project. In 2016, he joined the computer vision startup Dresr, which was acquired by Google in 2018, where he leads a team working on object recognition.
This book presents prospective, industrially proven methods and software solutions for storing, processing, and viewing multimedia content on digital cameras, camcorders, TVs, and mobile devices. Most of the algorithms described here are implemented as system-on-chip firmware or as software products and have low computational complexity and memory consumption. In the four parts of the book, which contains a total of 16 chapters, the authors address solutions for the conversion of images and videos by super-resolution, depth estimation and control, and mono-to-stereo (2D to 3D) conversion; display applications by video editing; the real-time detection of sports episodes; and the generation and reproduction of natural effects. The practical principles of machine learning are illustrated using technologies such as image classification as a service, mobile user profiling, and automatic view planning with dictionary-based compressed sensing in magnetic resonance imaging. The implementation of these technologies in mobile devices is discussed in relation to algorithms using a depth camera based on a colour-coded aperture, the animated graphical abstract of an image, a motion photo, and approaches and methods for iris recognition on mobile platforms. The book reflects the authors’ practical experience in the development of algorithms for industrial R&D and the commercialization of technologies.
Explains digital techniques for digital cameras, camcorders, TVs, and mobile devices;
Offers essential algorithms for the processing pipeline in multimedia devices and accompanying software tools;
Features advanced topics on data processing, addressing current technology challenges.