featured work

Categories:

featured work

Content:


Impala for Hadoop

Get to know Impala

Impala is an in-memory data processing tool created at Cloudera, which was recently donated to the Apache Software Foundation. Like Hive, Impala uses a SQL-based query language.

In this demo I'm running Cloudera on a VM in VirtualBox. You can download one of several environments for running Impala from the official site.

Let's get down to some coding.

Impala views

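A view in Impala works somewhat like a variable, or a wrapper, for a longer statement: you give the statement a name and then query that name as if it were a table. This makes a whole lot more sense in the video posted below.

Next we take a look at partitioned tables in Impala. Partitioning is just what it sounds like: the data is split into separate sets based on the values of one or more columns. We do this as a preprocessing step so that queries filtering on those columns have less data to read and run faster.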
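As a rough sketch of the idea, assuming a table named hospitalspend with hospitalname, city and address columns (the source table used later in this post). The view name and the city value are made up for illustration:

```sql
-- Hypothetical view name; wraps a longer SELECT behind a short name.
CREATE VIEW hospital_locations AS
  SELECT hospitalname, city, address
  FROM hospitalspend;

-- Query the view like a table; 'BOSTON' is only an illustrative value.
SELECT hospitalname, address
FROM hospital_locations
WHERE city = 'BOSTON';

-- Dropping the view leaves the underlying table untouched.
DROP VIEW hospital_locations;
```

A view stores only the query, not the data, so it always reflects the current contents of the underlying table.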

Create a partitioned table:

CREATE TABLE partitioned_medi (hospitalname STRING, score FLOAT, city STRING, address STRING)
  PARTITIONED BY (year SMALLINT, month TINYINT);
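To see what the partition columns do, here is a minimal sketch that loads one row into a single partition and then reads it back. The row values are invented for illustration and are not from the real data set:

```sql
-- Static partition insert: the partition key values go in the PARTITION
-- clause, the remaining four columns go in VALUES.
INSERT INTO partitioned_medi PARTITION (year=2015, month=6)
  VALUES ('Example Hospital', CAST(4.5 AS FLOAT), 'Springfield', '1 Main St');

-- List the partitions that now exist for the table.
SHOW PARTITIONS partitioned_medi;

-- Filtering on the partition keys lets Impala skip all other partitions.
SELECT hospitalname, score
FROM partitioned_medi
WHERE year = 2015 AND month = 6;
```

Each distinct (year, month) pair becomes its own directory of data files, which is why a filter on those columns can avoid scanning the rest of the table.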

Let's script the import from a table:

-- Assumes hospitalspend has year and month columns matching the partition keys of partitioned_medi.
SELECT DISTINCT
concat('insert into partitioned_medi partition (year=',cast(year as string),', month=',cast(month as string),
') select hospitalname, score, city, address from hospitalspend where year=',cast(year as string),' and month=',cast(month as string),';') AS command FROM hospitalspend;
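As an alternative to generating one statement per partition, Impala also supports dynamic partition inserts, where the partition values are taken from the trailing columns of the SELECT list. The sketch below assumes hospitalspend has score, year and month columns, which the post does not confirm:

```sql
-- Dynamic partition insert: no values are given in the PARTITION clause,
-- so Impala reads them from the last two columns of the SELECT list.
-- Assumption: hospitalspend contains score, year and month columns.
INSERT INTO partitioned_medi PARTITION (year, month)
  SELECT hospitalname, score, city, address,
         CAST(year AS SMALLINT), CAST(month AS TINYINT)
  FROM hospitalspend;
```

This creates whichever partitions are needed in a single statement, at the cost of less control over how much data is written at once.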

 



Date Created 2016-01-17
Author : fb-fac3b0ok 
  