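1. To read compressed input files and write compressed output
Compressed input needs no extra configuration: Hadoop picks the decompression codec from the file extension (here, .gz). To compress the job output, add the following to the driver class:
jobConf.setBoolean("mapred.output.compress", true);
jobConf.setClass("mapred.output.compression.codec", GzipCodec.class, CompressionCodec.class);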
Running the program over compressed input:
% hadoop jar MaxTempWithCompression input/tanu/input.txt.gz output
% gunzip -c output/part-r-00000.gz
1949 111
1950 20
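To make the gunzip step less of a black box, here is a minimal sketch of the same gzip round trip in plain Java. It uses only java.util.zip from the standard library, not Hadoop's GzipCodec, so it only illustrates the file format. The class name GzipRoundTrip and its helper methods are made up for this example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// Hypothetical helper class for illustration; not part of Hadoop.
public class GzipRoundTrip {

    // Compress text into gzip-format bytes (the same container format as a .gz file).
    static byte[] compress(String text) {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (GZIPOutputStream gzip = new GZIPOutputStream(buffer)) {
            gzip.write(text.getBytes(StandardCharsets.UTF_8));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Decompress gzip bytes back to text, an in-memory analogue of `gunzip -c`.
    static String decompress(byte[] gzipped) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (GZIPInputStream gzip = new GZIPInputStream(new ByteArrayInputStream(gzipped))) {
            byte[] chunk = new byte[4096];
            int n;
            while ((n = gzip.read(chunk)) != -1) {
                out.write(chunk, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new String(out.toByteArray(), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Sample records copied from the job output shown above.
        String records = "1949 111\n1950 20\n";
        byte[] gzipped = compress(records);
        System.out.print(decompress(gzipped));
    }
}
```

Decompressing returns the original records unchanged, which is why the reducer output can be inspected with gunzip -c even though it is stored as part-r-00000.gz.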
2. To compress the mapper output
jobConf.setCompressMapOutput(true);
jobConf.setMapOutputCompressorClass(GzipCodec.class);