Creating Hive Data Warehouse with CSV Files

Data Mining

LAB 9

LAB 9_1: Configure and run each system you choose from those below on a single-node system in pseudo-distributed mode on Ubuntu Linux.

Part 1:

Creating Hive Data Warehouse with CSV Files

1. Download and set up Hive for NoSQL big data processing on Hadoop.

2. Create database tables from the Videogame Fact table (see attachment).

3. Create a partition for every Platform, and buckets on Genre within each Platform, in the table.

4. Perform CRUD operations using HiveQL to retrieve information.
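The partitioning and bucketing step could be sketched as follows. This is only an illustration under stated assumptions: the column list, the table names videogame_staging and videogame_fact, the bucket count of 4, and the CSV path are all hypothetical, since the attachment is not reproduced here. The Python helper just assembles HiveQL text that can be saved to a file and run with hive -f.

```python
# Hypothetical schema: the real column list comes from the lab attachment.
COLUMNS = [
    ("name", "STRING"),
    ("platform", "STRING"),
    ("genre", "STRING"),
    ("global_sales", "DOUBLE"),
]


def build_hive_script(columns, partition_col, bucket_col, num_buckets, csv_path):
    """Return HiveQL that loads a CSV and copies it into a partitioned, bucketed table."""
    types = dict(columns)
    # The partition column is declared in PARTITIONED BY, not in the column list.
    data_cols = [(n, t) for n, t in columns if n != partition_col]
    staging_cols = ",\n  ".join(f"{n} {t}" for n, t in columns)
    fact_cols = ",\n  ".join(f"{n} {t}" for n, t in data_cols)
    # For a dynamic-partition insert, the partition column must come last.
    select_cols = ", ".join([n for n, _ in data_cols] + [partition_col])
    return f"""CREATE TABLE IF NOT EXISTS videogame_staging (
  {staging_cols}
)
ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
STORED AS TEXTFILE
TBLPROPERTIES ('skip.header.line.count'='1');

LOAD DATA LOCAL INPATH '{csv_path}' INTO TABLE videogame_staging;

CREATE TABLE IF NOT EXISTS videogame_fact (
  {fact_cols}
)
PARTITIONED BY ({partition_col} {types[partition_col]})
CLUSTERED BY ({bucket_col}) INTO {num_buckets} BUCKETS
STORED AS ORC;

SET hive.exec.dynamic.partition=true;
SET hive.exec.dynamic.partition.mode=nonstrict;

INSERT OVERWRITE TABLE videogame_fact PARTITION ({partition_col})
SELECT {select_cols} FROM videogame_staging;
"""


# Assumed values: partition on platform, 4 buckets on genre, CSV under /tmp.
HIVE_SCRIPT = build_hive_script(COLUMNS, "platform", "genre", 4, "/tmp/videogames.csv")

if __name__ == "__main__":
    print(HIVE_SCRIPT)
```

Loading through a plain-text staging table first is one common pattern, because LOAD DATA only moves files and does not split rows into partitions or buckets; the INSERT ... SELECT does that work. If game titles contain quoted commas, the simple comma delimiter will misparse them, and Hive's OpenCSVSerde may be a better fit for the staging table.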

Part 2:

Create a MongoDB Collection with JSON Files

1. Download and set up MongoDB for NoSQL big data processing on Hadoop.

2. Create a database with two collections in MongoDB from the JSON data provided.

3. Use MongoDB queries to perform CRUD operations in MongoDB, and store the retrieved results in Hive tables.
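As a rough sketch of the MongoDB CRUD step, the Python snippet below writes out a short mongosh script. The database name lab9, the collection name games, and the sample documents are placeholders invented for illustration; the lab's actual JSON data is not shown here.

```python
import json

# Hypothetical sample documents; the lab's JSON data is not reproduced here.
SAMPLE_DOCS = [
    {"name": "Example Game A", "platform": "Wii", "genre": "Sports"},
    {"name": "Example Game B", "platform": "NES", "genre": "Platform"},
]


def mongosh_crud_script(db_name, collection, docs):
    """Return mongosh commands covering create, read, update, and delete."""
    coll = f"db.{collection}"
    match = json.dumps({"name": docs[0]["name"]})
    by_platform = json.dumps({"platform": docs[0]["platform"]})
    change = json.dumps({"$set": {"genre": "Racing"}})
    return "\n".join([
        f"db = db.getSiblingDB({json.dumps(db_name)});",   # select the database
        f"{coll}.insertMany({json.dumps(docs)});",          # Create
        f"printjson({coll}.find({by_platform}).toArray());",  # Read
        f"{coll}.updateOne({match}, {change});",            # Update
        f"{coll}.deleteOne({match});",                      # Delete
    ])


MONGO_SCRIPT = mongosh_crud_script("lab9", "games", SAMPLE_DOCS)

if __name__ == "__main__":
    print(MONGO_SCRIPT)
```

The printed text can be saved to a .js file and passed to the mongosh shell. For getting query results into Hive, one possible route (an assumption, not something the lab prescribes) is to export the collection with mongoexport using --type=csv and then load that CSV into a Hive table.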

BASIC STEPS:

1. Configure and run the system of your choice on a single-node system in pseudo-distributed mode on your own machine.

2. Create a database/collection with your choice of sample data files.

3. Run basic CRUD queries over your database/collection.
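On the Hive side, the CRUD queries might look like the sketch below. The table name videogame_acid, its columns, and the sample row are hypothetical. Note that Hive only accepts UPDATE and DELETE on transactional (ACID) tables, which is why the setup statement creates an ORC table with 'transactional'='true'; ACID support must also be enabled in the Hive configuration.

```python
TABLE = "videogame_acid"  # hypothetical table name

# One HiveQL statement per CRUD operation, plus the table setup they rely on.
CRUD_STATEMENTS = {
    "setup": (
        f"CREATE TABLE IF NOT EXISTS {TABLE} (\n"
        "  name STRING, genre STRING, global_sales DOUBLE\n"
        ")\n"
        "CLUSTERED BY (genre) INTO 4 BUCKETS\n"
        "STORED AS ORC\n"
        "TBLPROPERTIES ('transactional'='true');"
    ),
    "create": f"INSERT INTO TABLE {TABLE} VALUES ('Example Game', 'Sports', 1.5);",
    "read": f"SELECT name, genre, global_sales FROM {TABLE} WHERE genre = 'Sports';",
    # The bucketing column (genre) cannot be updated, so a measure is changed instead.
    "update": f"UPDATE {TABLE} SET global_sales = 2.0 WHERE name = 'Example Game';",
    "delete": f"DELETE FROM {TABLE} WHERE name = 'Example Game';",
}

HIVE_CRUD_SCRIPT = "\n\n".join(CRUD_STATEMENTS.values())

if __name__ == "__main__":
    print(HIVE_CRUD_SCRIPT)
```

Running each statement separately at the Hive prompt and capturing its output is probably the easiest way to produce the screenshots the lab asks for.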

Turn in command-line screenshots showing the output of setting up and running each system, covering every procedure performed.

THE REPORT:

1. It should contain all the steps and procedures for setting up the systems.

2. It should contain screenshots of all the outcomes.

3. The steps should be explained.

4. The lab should be performed on Ubuntu Linux only.

Note: Hive has a bug that reinitializes your NameNode whenever you start Hive again, so save your data every time.
