Been working with Hadoop (2.4.0) and Hive (0.13.0) on HDInsight (3.1), and Hive reads GZIP-compressed CSV files by default, with no manual decompression step. Nice! So, loading data with a Hive query in PowerShell:
$response = Invoke-Hive -Query @"
LOAD DATA INPATH 'wasb://$container@$storageAccountName.blob.core.windows.net/file.csv.gz' INTO TABLE logs;
"@
No additional work is needed and there are no extra arguments to pass. I thought I would have to set something like io.compression.codecs=org.apache.hadoop.io.compress.GzipCodec, as described in this post, but apparently not.
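A quick way to check that Hive really reads the gzipped file is to select a few rows back from the table. This is only a sketch: it assumes the cluster has already been selected with Use-AzureHDInsightCluster and that the logs table from the load above exists.

```powershell
# Assumption: Use-AzureHDInsightCluster was already run for this session
# and the LOAD DATA statement into the logs table has completed.
$rows = Invoke-Hive -Query "SELECT * FROM logs LIMIT 10;"

# If the .gz file is read transparently, this prints plain CSV rows
# rather than binary garbage.
Write-Host $rows
```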
UPDATE: Just found this link: https://cwiki.apache.org/confluence/display/Hive/CompressedStorage (on keeping compressed data in Hive). It recommends inserting the data into a second table stored as a SequenceFile, because Hadoop cannot split a gzipped text file across multiple mappers.
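For what it's worth, the SequenceFile suggestion could look roughly like the sketch below. I have not run this on the cluster. The table name logs_seq is made up for illustration, the two SET lines follow the settings shown on that wiki page, and it assumes Invoke-Hive accepts several semicolon-separated statements in one query.

```powershell
# Sketch only: copy the gzip-backed text table into a block-compressed
# SequenceFile table so that later jobs can split the data across mappers.
# logs_seq is a hypothetical name; logs is the table loaded earlier.
$response = Invoke-Hive -Query @"
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;
CREATE TABLE logs_seq STORED AS SEQUENCEFILE AS SELECT * FROM logs;
"@

# Show the job output returned by Invoke-Hive.
Write-Host $response
```

Queries would then go against logs_seq instead of logs, and the original gzipped table could be dropped once the copy has been verified.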