Partitioning of data is often used for distributing load horizontally, moving data
physically closer to its most frequent users, and other purposes. Hive has the
notion of partitioned tables, which can improve query performance and help
organize data in a logical fashion, such as hierarchically.
A partitioned table is created with the PARTITIONED BY clause:
Query - create table Partition2 (id1 int, name string)
partitioned by (dept string)
row format delimited
fields terminated by '\t';
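To confirm that the partition column was registered, the table definition can be inspected. This is a small sketch that uses only the table created above; the exact layout of the output varies between Hive versions.

```sql
-- Show the full table definition; the partition column (dept) is listed
-- in its own partition section, separate from the regular columns.
describe formatted Partition2;
```

Note that dept is not stored in the data files themselves. Hive takes its value from the partition directory the file lives in.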
You can view the partitions of a table by using the SHOW PARTITIONS command:
Query - show partitions Partition2;
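Each partition is stored as its own subdirectory (for example dept=cse) under
the table's directory in HDFS. You can also view the partitions created in HDFS
by opening "localhost.localdomain:50075" in a web browser.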
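The same directories can be listed without leaving the Hive CLI, which accepts dfs commands. The sketch below assumes the default warehouse location /user/hive/warehouse and that the table directory is the lower-cased table name; both are assumptions and depend on how the installation is configured.

```sql
-- List the partition directories of the table (path is an assumption:
-- default warehouse directory plus lower-cased table name).
dfs -ls /user/hive/warehouse/partition2;
```

Each entry of the form dept=<value> in the listing corresponds to one partition.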
Loading data into a partitioned table:-
Data can be loaded into a partitioned table either statically or dynamically,
depending on the data you load into the table.
Static Partition:-
Static partitioning requires a separate load for each partition, with the
partition value stated explicitly in the statement.
Query - load data local inpath '/home/training/Desktop/part1.txt'
into table Partition1 partition (dept="cse");
Dynamic Partition:-
Dynamic partitioning allows a single insert to populate several partitions of
the table. The following properties should be set before using dynamic
partitions:
Query - set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;
set hive.exec.max.dynamic.partitions.pernode=1000;
load data local inpath '/home/training/Desktop/part1.txt'
into table Partition1 partition (dept);
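In many Hive versions, LOAD DATA expects a value for every partition column, so dynamic partitions are more commonly populated with INSERT ... SELECT from an unpartitioned table. The following is a minimal sketch of that approach, not the only way to do it. The staging table part_staging is hypothetical, and the sketch assumes that part1.txt holds three tab-separated fields with the department last and that Partition1 has the same columns as Partition2.

```sql
-- Hypothetical unpartitioned staging table; dept is an ordinary column here.
create table part_staging (id1 int, name string, dept string)
row format delimited
fields terminated by '\t';

-- Load the raw file as-is (assumes three tab-separated fields per line).
load data local inpath '/home/training/Desktop/part1.txt' into table part_staging;

set hive.exec.dynamic.partition=true;
set hive.exec.dynamic.partition.mode=nonstrict;

-- The partition column must come last in the select list; Hive creates
-- one partition per distinct dept value found in the data.
insert into table Partition1 partition (dept)
select id1, name, dept from part_staging;
```

With nonstrict mode, no partition column needs a fixed value; in strict mode at least one static partition value must be given.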
To check for a particular partition of a table:-
Query - show partitions clus2 partition (dept="cse");
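SHOW PARTITIONS with a partition specification lists only the matching partition names. To read the rows stored in a partition, a SELECT that filters on the partition column can be used, as in this sketch against the same clus2 table (assuming it is partitioned by dept):

```sql
-- Filtering on the partition column lets Hive read only the
-- dept=cse directory instead of scanning the whole table.
select * from clus2 where dept = 'cse';
```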
It is also possible to add a new partition to an existing table.
Query - alter table clus2 add if not exists partition (year=2015);
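The partition specification has to name the table's own partition columns. As an illustration, the sketch below adds an empty partition to the Partition2 table created earlier, which is partitioned by dept; the value 'ece' is a made-up example.

```sql
-- Add an empty partition for a hypothetical department value.
alter table Partition2 add if not exists partition (dept='ece');

-- The new partition should now appear in the listing.
show partitions Partition2;
```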
To drop a partition from a table:-
Query - alter table clus2 drop partition (dept="cse");