I've been working with Hadoop (2.4.0) and Hive (0.13.0) on HDInsight (3.1), and Hive decompresses GZIP files into CSV by default.  Nice!  So, loading data with a Hive query in PowerShell:

$response = Invoke-Hive -Query @"

LOAD DATA INPATH 'wasb://$container@$' 



No additional work is needed and there are no extra arguments to pass. I thought I had to do something like what is described in this post, but apparently not.


UPDATE: Just found a page on keeping compressed data in Hive, which recommends creating a SequenceFile.
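
For reference, here is a rough sketch of what the SequenceFile route might look like in HiveQL. The table names (`staging_gz`, `staging_seq`) and the columns are placeholders made up for illustration, not taken from the cluster above. The idea is that the gzipped file goes into a plain text table first and is then rewritten into a SequenceFile table with block compression, since a single gzip file cannot be split across map tasks while a block-compressed SequenceFile can.

```sql
-- Hypothetical table and column names, for illustration only.

-- 1. Text staging table: a .gz file loaded here stays compressed on disk
--    and is decompressed by Hive when the table is queried.
CREATE TABLE staging_gz (id INT, name STRING, value DOUBLE)
  ROW FORMAT DELIMITED FIELDS TERMINATED BY ','
  STORED AS TEXTFILE;

-- (The LOAD DATA INPATH ... INTO TABLE staging_gz statement would go here.)

-- 2. Target table stored as a SequenceFile.
CREATE TABLE staging_seq (id INT, name STRING, value DOUBLE)
  STORED AS SEQUENCEFILE;

-- 3. Ask Hive to compress query output, using block-level compression.
SET hive.exec.compress.output=true;
SET io.seqfile.compression.type=BLOCK;

-- 4. Rewrite the staged rows into the SequenceFile table.
INSERT OVERWRITE TABLE staging_seq SELECT * FROM staging_gz;
```

This is untested against the HDInsight setup above, so treat it as a starting point. The statements could be passed through the same Invoke-Hive here-string used for the load.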