Apache Pig - Handling Compression
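We can load compressed data in Apache Pig using the functions PigStorage() and TextLoader(), and store compressed data using PigStorage(). Both functions support gzip and bzip2 compression, and Pig selects the codec from the file extension (.gz for gzip, .bz or .bz2 for bzip2). BinStorage() does not support compression.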
Example
Assume we have a gzip-compressed file named employee.txt.gz in the HDFS directory /pig_data/. Then, we can load the compressed file into Pig as shown below.
Using PigStorage:
grunt> data = LOAD 'hdfs://localhost:9000/pig_data/employee.txt.gz' USING PigStorage(',');
Using TextLoader:
grunt> data = LOAD 'hdfs://localhost:9000/pig_data/employee.txt.gz' USING TextLoader();
In the same way, we can store data from Pig in compressed form as shown below.
Using PigStorage:
grunt> STORE data INTO 'hdfs://localhost:9000/pig_Output/data.bz' USING PigStorage(',');
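The load and store steps can also be combined in a single script. The following is a minimal sketch, not a definitive recipe: it assumes the input is gzip-compressed and named employee.txt.gz, and the field names, the types, and the output directory name employee_out.bz2 are hypothetical, since the layout of the employee file is not given here.

```pig
-- Sketch only: the schema (id, name, city) is an assumption for illustration.
-- The .gz extension lets PigStorage decompress the input while reading it.
employees = LOAD 'hdfs://localhost:9000/pig_data/employee.txt.gz'
            USING PigStorage(',')
            AS (id:int, name:chararray, city:chararray);

-- Once loaded, the relation behaves like one loaded from uncompressed text.
named = FILTER employees BY name IS NOT NULL;

-- The .bz2 suffix on the (hypothetical) output path asks PigStorage
-- to write bzip2-compressed part files.
STORE named INTO 'hdfs://localhost:9000/pig_Output/employee_out.bz2'
      USING PigStorage(',');
```

Saved as a file, a script like this could be run with the pig command instead of being typed line by line at the grunt> prompt. Because the codec is chosen from the extension, changing the output suffix to .gz would produce gzip output instead.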