"I learned a lot from this book and can recommend it to anyone who believes that his company may benefit from the introduction of storage and/or network deduplication mechanisms." (G. K. Jenkins, Computing Reviews, April, 2017)
Introduction.- De-duplication Background.- Existing De-duplication Techniques.- Hybrid Email De-duplication System.- Structure-Aware File and Email De-duplication for Cloud-based Storage Systems.- Software-defined De-duplication as a Network and Storage Service.- Mobile De-duplication.- Conclusion.
Daehee Kim is an Assistant Professor in the Department of Computing and New Media Technologies at the University of Wisconsin-Stevens Point. He received his Ph.D. from the University of Missouri-Kansas City in 2015 and his master's degree from the State University of New York in 2008. His research interests lie in the broad areas of storage and networking, including data deduplication, wireless sensor networks, networked storage systems and network protocols, big data transfer and analysis, and cloud and Internet applications and deployments.
Dr. Baek-Young Choi is an Associate Professor at the University of Missouri - Kansas City. She has been a fellow of the U.S. Air Force Research Laboratory’s Visiting Faculty Research Program (AFRL-VFRP), and Korea Telecom - Advance Institute of Technology (KT-AIT). She co-authored the book, ‘Scalable Network Monitoring in High Speed Networks’, and co-edited the book, ‘High Performance Cloud Auditing and Applications’. She is an Associate Editor of Springer Journal of Telecommunication Systems, and has served on the editorial board of the Elsevier Journal of Computer Networks. She is a senior member of ACM and IEEE, and a member of IEEE Women in Engineering.
Dr. Sejun Song is an Associate Professor in the Department of Computer Science Electrical Engineering at the University of Missouri – Kansas City. He directs the Trustworthy Systems and Software Research Lab. Prior to academia, he worked for Cisco Systems and Honeywell Research Lab.
This book introduces the fundamentals and trade-offs of data de-duplication techniques. It describes emerging de-duplication techniques that remove duplicate data efficiently and effectively in both storage and networks. It explains where duplicate data originate and provides solutions that remove them. It classifies existing de-duplication techniques by the size of the data unit compared, the place of de-duplication, and the time of de-duplication. Chapter 3 considers redundancies in email servers and presents a de-duplication technique that increases data reduction with low overhead by switching between chunk-based and file-based de-duplication. Chapter 4 develops a de-duplication technique for cloud storage services in which the data units compared are logical, structured-format units rather than physical-format units, which reduces processing time. Chapter 5 presents a network de-duplication technique in which redundant data packets sent by clients are encoded (shrunk to a small payload) and decoded (restored to the original payload size) in routers or switches on the way to remote servers. Chapter 6 introduces a mobile de-duplication technique for images (JPEG) and video (MPEG) that considers the performance and overhead of the encryption algorithms used for security on mobile devices.
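
To make the distinction between file-based and chunk-based de-duplication concrete, the following is a minimal illustrative sketch, not the book's implementation. It assumes fixed-size chunks and SHA-256 fingerprints; the function names (file_level_store, chunk_level_store, restore, stored_bytes) and the tiny chunk size are hypothetical choices made only for this example.

```python
import hashlib


def fingerprint(data: bytes) -> str:
    """Return a SHA-256 hex digest used as the identity of a data unit."""
    return hashlib.sha256(data).hexdigest()


def file_level_store(files: dict, store: dict) -> dict:
    """File-based de-duplication: each distinct whole file is stored once.

    `store` maps fingerprint -> bytes. Returns a recipe per file name
    (a list of fingerprints needed to rebuild that file).
    """
    recipes = {}
    for name, data in files.items():
        fp = fingerprint(data)
        store.setdefault(fp, data)  # keep only the first copy
        recipes[name] = [fp]
    return recipes


def chunk_level_store(files: dict, store: dict, chunk_size: int = 8) -> dict:
    """Chunk-based de-duplication with fixed-size chunks (an assumption).

    Each distinct chunk is stored once, so files that share only part of
    their content can still share storage.
    """
    recipes = {}
    for name, data in files.items():
        fps = []
        for start in range(0, len(data), chunk_size):
            chunk = data[start:start + chunk_size]
            fp = fingerprint(chunk)
            store.setdefault(fp, chunk)
            fps.append(fp)
        recipes[name] = fps
    return recipes


def restore(recipe: list, store: dict) -> bytes:
    """Rebuild the original bytes by concatenating the stored units."""
    return b"".join(store[fp] for fp in recipe)


def stored_bytes(store: dict) -> int:
    """Total number of payload bytes actually kept in the store."""
    return sum(len(unit) for unit in store.values())


if __name__ == "__main__":
    # Hypothetical sample data: a.txt and c.txt are identical,
    # b.txt shares only its first 8 bytes with them.
    files = {
        "a.txt": b"AAAAAAAABBBBBBBB",
        "b.txt": b"AAAAAAAACCCCCCCC",
        "c.txt": b"AAAAAAAABBBBBBBB",
    }
    file_store, chunk_store = {}, {}
    file_level_store(files, file_store)
    chunk_level_store(files, chunk_store, chunk_size=8)
    print("raw bytes:        ", sum(len(d) for d in files.values()))
    print("file-level bytes: ", stored_bytes(file_store))
    print("chunk-level bytes:", stored_bytes(chunk_store))
```

With this sample data, the file-based store keeps 32 of the 48 raw bytes, while the chunk-based store keeps 24, at the cost of computing and indexing more fingerprints. That difference in reduction versus overhead is the kind of trade-off the description refers to.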