Practical Hadoop Ecosystem: A Definitive Guide to Hadoop-Related Frameworks and Tools

ISBN-13: 9781484221983 / English / Paperback / 2016 / 421 pp.

Deepak Vohra
Price: 221.90 zł
(net: 211.33, VAT: 5%)

Lowest price in the last 30 days: 212.02 zł
Order fulfillment time:
approx. 22 business days
No guaranteed delivery before the holidays

Free shipping!

Learn how to use the Apache Hadoop projects, including MapReduce, HDFS, Apache Hive, Apache HBase, Apache Kafka, Apache Mahout, and Apache Solr. From setting up the environment to running sample applications, each chapter in this book is a practical tutorial on using an Apache Hadoop ecosystem project. While several books on Apache Hadoop are available, most are based on the main projects, MapReduce and HDFS, and none discusses the other Apache Hadoop ecosystem projects and how they all work together as a cohesive big data development platform.
What You Will Learn:

  • Set up the environment in Linux for Hadoop projects using Cloudera Hadoop Distribution CDH 5
  • Run a MapReduce job
  • Store data with Apache Hive and Apache HBase
  • Index data in HDFS with Apache Solr
  • Develop a Kafka messaging system
  • Stream Logs to HDFS with Apache Flume
  • Transfer data from MySQL database to Hive, HDFS, and HBase with Sqoop
  • Create a Hive table over Apache Solr
  • Develop a Mahout User Recommender System

Who This Book Is For:
Apache Hadoop developers. Prerequisite knowledge of Linux and some knowledge of Hadoop is required.

Categories:
Computer Science, Internet
BISAC categories:
Computers > Database Administration & Management
Computers > System Administration - Storage & Retrieval
Publisher:
Apress
Language:
English
ISBN-13:
9781484221983
Publication year:
2016
Number of pages:
421
Weight:
0.76 kg
Dimensions:
25.4 x 17.78 x 2.29
Binding:
Paperback
Volumes:
1
Additional information:
Illustrated edition

Introduction
1. HDFS and MapReduce
Hadoop Distributed File System
MapReduce Frameworks
Setting the Environment
Hadoop Cluster Modes
Running a MapReduce Job with MR1 Framework
Running MR1 in Standalone Mode
Running MR1 in Pseudo-Distributed Mode
Running MapReduce with YARN Framework
Running YARN in Pseudo-Distributed Mode
Running Hadoop Streaming
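The Hadoop Streaming entry above refers to running map and reduce steps as plain scripts that read from stdin and write tab-separated key/value pairs to stdout. Below is a minimal word-count sketch of that contract in Python; the function names (`map_words`, `reduce_counts`) and the sample input are illustrative, not taken from the book.

```python
# Sketch of the Hadoop Streaming contract: the mapper emits (key, value)
# pairs, the framework sorts them by key, and the reducer aggregates each
# key's group. Here both phases run locally for illustration.
from itertools import groupby


def map_words(lines):
    """Mapper: emit (word, 1) for every word on every input line."""
    for line in lines:
        for word in line.split():
            yield (word.lower(), 1)


def reduce_counts(pairs):
    """Reducer: sum counts per word. Sorting stands in for the shuffle
    phase, which guarantees the reducer sees each key's pairs together."""
    for word, group in groupby(sorted(pairs), key=lambda kv: kv[0]):
        yield (word, sum(count for _, count in group))


if __name__ == "__main__":
    # Local simulation of map -> shuffle/sort -> reduce on two lines.
    sample = ["the quick brown fox", "the lazy dog"]
    for word, count in reduce_counts(map_words(sample)):
        print(f"{word}\t{count}")
```

In an actual Streaming job, the same mapper and reducer logic would read `sys.stdin` and be passed to the `hadoop jar hadoop-streaming*.jar` command as the `-mapper` and `-reducer` scripts.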

Section II Storing & Querying

2. Apache Hive
Setting the Environment
Configuring Hadoop
Configuring Hive
Starting HDFS
Starting the Hive Server
Starting the Hive CLI
Creating a Database
Using a Database
Creating a Managed Table
Loading Data into a Table
Creating a table using LIKE
Adding Data with INSERT INTO TABLE
Adding Data with INSERT OVERWRITE
Creating a Table Using AS SELECT
Altering a Table
Truncating a Table
Dropping a Table
Creating an External Table


3. Apache HBase

Setting the Environment
Configuring Hadoop
Configuring HBase
Configuring Hive
Starting HBase
Starting HBase Shell
Creating an HBase Table
Adding Data to an HBase Table
Listing All Tables
Getting a Row of Data
Scanning a Table
Counting Number of Rows in a Table
Altering a Table
Deleting a Row
Deleting a Column
Disabling and Enabling a Table
Truncating a Table
Dropping a Table
Finding if a Table exists
Creating a Hive External Table

Section III Bulk Transferring & Streaming

4. Apache Sqoop
Installing MySQL Database
Creating MySQL Database Tables
Setting the Environment
Configuring Hadoop
Starting HDFS
Configuring Hive
Configuring HBase
Importing into HDFS
Exporting from HDFS
Importing into Hive
Importing into HBase


5. Apache Flume

Setting the Environment
Configuring Hadoop
Configuring HBase
Starting HDFS
Configuring Flume
Running a Flume Agent
Configuring Flume for HBase Sink
Streaming MySQL Log to HBase Sink
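The Flume chapter above is driven by an agent properties file wiring a source to a sink through a channel. A minimal sketch of such a file, assuming a tail-style exec source feeding HDFS; the component names (`a1`, `r1`, `c1`, `k1`), the log path, and the HDFS URL are hypothetical placeholders:

```properties
# Hypothetical agent "a1": tail a MySQL log with an exec source and
# deliver events to HDFS through an in-memory channel.
a1.sources = r1
a1.channels = c1
a1.sinks = k1

a1.sources.r1.type = exec
a1.sources.r1.command = tail -F /var/log/mysql/mysql.log
a1.sources.r1.channels = c1

a1.channels.c1.type = memory

a1.sinks.k1.type = hdfs
a1.sinks.k1.hdfs.path = hdfs://localhost:8020/flume/mysql
a1.sinks.k1.channel = c1
```

For the HBase sink variant the chapter covers, the sink `type` would change to an HBase sink and name a target table and column family instead of an HDFS path.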

 
Section IV Serializing  

6. Apache Avro

Setting the Environment
Creating an Avro Schema
Creating a Hive Managed Table
Creating a Hive (version prior to 0.14) External Table Stored as Avro
Creating a Hive (version 0.14 and later) External Table Stored as Avro
Transferring MySQL Table Data as Avro Data File with Sqoop
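The Avro-backed tables above all start from a schema file. A minimal sketch of one, with a hypothetical record name and fields not taken from the book:

```json
{
  "type": "record",
  "namespace": "example",
  "name": "LogEntry",
  "fields": [
    {"name": "timestamp", "type": "string"},
    {"name": "level", "type": "string"},
    {"name": "message", "type": "string"}
  ]
}
```

A Hive table stored as Avro then references a schema like this (for example via a table property pointing at the schema file), so the table layout and the serialized data stay in sync.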


7. Apache Parquet

Setting the Environment
Creating an Oracle Database Table
Exporting Oracle Database to a CSV File
Importing the CSV File in MongoDB
Exporting MongoDB Document as CSV File
Importing a CSV File to Oracle Database

Section V Messaging & Indexing

8. Apache Kafka

Setting the Environment
Starting the Kafka Server
Creating a Topic
Starting a Kafka Producer
Starting a Kafka Consumer
Producing and Consuming Messages
Streaming Log Data to Apache Kafka with Apache Flume   
  Setting the Environment
  Creating Kafka Topics
  Configuring Flume
  Running Flume Agent
  Consuming Log Data as Kafka Messages
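The chapter outline above follows Kafka's core flow: producers append messages to a named topic and consumers read them in order, tracked by an offset. The sketch below illustrates only that topic/offset idea with an in-memory stand-in; it is not the Kafka client API, and the class and topic names are invented for illustration.

```python
# Toy stand-in for a Kafka broker: one append-only log per topic, with a
# per-(topic, group) offset so each consumer group reads every message once.
from collections import defaultdict


class MiniBroker:
    def __init__(self):
        self.topics = defaultdict(list)   # topic name -> list of messages
        self.offsets = defaultdict(int)   # (topic, group) -> next offset

    def produce(self, topic, message):
        """Append a message to the end of the topic's log."""
        self.topics[topic].append(message)

    def consume(self, topic, group):
        """Return the next unread message for this group, or None."""
        offset = self.offsets[(topic, group)]
        log = self.topics[topic]
        if offset >= len(log):
            return None
        self.offsets[(topic, group)] = offset + 1
        return log[offset]


if __name__ == "__main__":
    broker = MiniBroker()
    broker.produce("logs", "line 1")
    broker.produce("logs", "line 2")
    print(broker.consume("logs", "tail-group"))  # line 1
    print(broker.consume("logs", "tail-group"))  # line 2
```

A real deployment replaces this with the Kafka server plus producer/consumer clients, and the Flume integration in the chapter plays the producer role, publishing log lines to a topic.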



9. Apache Solr

Setting the Environment
Configuring the Solr Schema
Starting the Solr Server 
Indexing a Document in Solr
Deleting a Document from Solr 
Indexing a Document in Solr with Java Client
Searching a Document in Solr
Creating a Hive Managed Table
Creating a Hive External Table
Loading Hive External Table Data
Searching Hive Table Data Indexed in Solr

Section VI Machine Learning

10. Apache Mahout

Setting the Environment
Starting HDFS
Setting the Mahout Environment
Running a Mahout Classification Sample
Running a Mahout Clustering Sample
Developing a User Based Recommender System 
  The Sample Data
  Setting the Environment
  Creating a Maven Project in Eclipse
  Creating a User Based Recommender
  Creating a Recommender Evaluator
  Running the Recommender
  Choosing a Recommender Type
  Choosing a User Similarity Measure
  Choosing a Neighborhood Type
  Choosing a Neighborhood Size for NearestNUserNeighborhood
  Choosing a Threshold for ThresholdUserNeighborhood
  Running the Evaluator
  Choosing the Split between Training Percentage and Test Percentage
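The recommender steps above implement the user-based idea: score an unseen item for a user by combining the ratings of the most similar users, weighted by similarity. A compact pure-Python sketch of that idea follows; it does not use Mahout's classes, and the similarity measure (cosine over co-rated items) and function names are assumptions for illustration.

```python
# User-based recommendation sketch: predict a user's rating for an item
# from the k most similar users who have rated that item.
from math import sqrt


def similarity(a, b):
    """Cosine similarity over the items two users have both rated."""
    common = set(a) & set(b)
    if not common:
        return 0.0
    dot = sum(a[i] * b[i] for i in common)
    na = sqrt(sum(a[i] ** 2 for i in common))
    nb = sqrt(sum(b[i] ** 2 for i in common))
    return dot / (na * nb)


def recommend(ratings, user, item, k=2):
    """Similarity-weighted average of the top-k neighbors' ratings."""
    candidates = [
        (similarity(ratings[user], r), r[item])
        for u, r in ratings.items()
        if u != user and item in r
    ]
    candidates.sort(key=lambda sr: sr[0], reverse=True)
    top = candidates[:k]
    total = sum(sim for sim, _ in top)
    if total == 0:
        return None
    return sum(sim * rating for sim, rating in top) / total
```

Mahout's `NearestNUserNeighborhood` and `ThresholdUserNeighborhood`, mentioned in the outline, correspond to two ways of picking the neighbor set here: a fixed count `k` versus a minimum similarity cutoff.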

Deepak Vohra is a coder, developer, programmer, book author, and technical reviewer.

