Getting Started with HDInsight - Part 2 - Introduction to Azure HDInsight PowerShell
By: Dattatrey Sindol | Updated: 2014-12-05
Problem
I have used PowerShell while working with the Microsoft Azure Cloud and would like to know whether it can be used with HDInsight. I would also like to learn more about how to use PowerShell for HDInsight. Can you provide some insight into the cmdlets, installation, security, etc.?
Solution
PowerShell is a popular and powerful tool in the Microsoft stack. It is used extensively for development, deployment and administration across ecosystems such as SQL Server, SharePoint, Azure Cloud and others.
PowerShell cmdlets are available for HDInsight as well. In this tip we will take a look at the fundamental aspects of Microsoft Azure PowerShell for HDInsight.
Azure PowerShell cmdlets for HDInsight
PowerShell cmdlets for HDInsight can be broadly classified into the following categories:
- Subscription Cmdlets
- Storage Cmdlets
- Cluster Cmdlets
- Job Cmdlets
- Other Cmdlets
Many common operations require combining cmdlets from more than one category. The categories in this tip are only intended to help in understanding the options available with PowerShell.
Let's take a look at the cmdlets in each of the categories listed above. Refer to the documentation for the respective cmdlet for more information.
Subscription Cmdlets
| Cmdlet | Description |
|---|---|
| Get-AzureSubscription | Retrieves information about an Azure Subscription. |
| Set-AzureSubscription | Configures the common settings for an Azure Subscription. |
| Select-AzureSubscription | Selects an Azure Subscription to use. |

Storage Cmdlets
| Cmdlet | Description |
|---|---|
| New-AzureStorageAccount | Creates a new Storage Account with the specified input parameters. |
| Get-AzureStorageAccount | Retrieves the list of Storage Accounts and related details associated with the Azure Subscription. |
| Set-AzureStorageAccount | Sets the properties of a Storage Account in an Azure Subscription. |
| Remove-AzureStorageAccount | Removes/deletes a Storage Account from an Azure Subscription. |
| New-AzureStorageKey | Regenerates the Primary or Secondary Storage Key for an Azure Storage Account based on the specified input parameter values. |
| Get-AzureStorageKey | Retrieves the Storage Account Key for an Azure Storage Account. |
| Add-AzureHDInsightStorage | Adds an Azure Blob Storage Account to an HDInsight cluster configuration. |
| Set-AzureHDInsightDefaultStorage | Sets the default Storage Account in an HDInsight cluster configuration. |
| New-AzureStorageContainer | Creates a new Azure Storage Container based on the specified input parameter values. |
| Get-AzureStorageContainer | Retrieves the list of Storage Containers in the specified Azure Storage Account. |
| Remove-AzureStorageContainer | Removes/deletes the specified container from Azure Blob Storage. |
| Get-AzureStorageBlob | Retrieves the list of blobs in an Azure Storage Container. |
| Set-AzureStorageBlobContent | Uploads a local file to Azure Blob Storage. |
| Remove-AzureStorageBlob | Removes/deletes the specified blob from Azure Blob Storage. |
| Start-AzureStorageBlobCopy | Starts copying blobs from one location to another based on the specified input parameter values. |
| Stop-AzureStorageBlobCopy | Stops an ongoing copy operation in Blob Storage. |

Cluster Cmdlets
| Cmdlet | Description |
|---|---|
| Add-AzureHDInsightMetastore | Adds a SQL Database to an HDInsight cluster configuration. |
| New-AzureHDInsightCluster | Creates a new HDInsight cluster in the current/specified Azure Subscription. |
| New-AzureHDInsightClusterConfig | Creates a configuration with parameters such as Data Node Count and Size of Head Node, to be used for creating the cluster with the New-AzureHDInsightCluster cmdlet. |
| Add-AzureHDInsightConfigValues | Adds a configuration value/customization to the Azure HDInsight cluster configuration. |
| Get-AzureHDInsightCluster | Retrieves either the list of all clusters associated with the current/specified Azure Subscription or the details of the specified HDInsight cluster. |
| Get-AzureHDInsightProperties | Retrieves properties specific to the Azure HDInsight service, such as Number of Clusters, Available Cores and Used Cores. |
| Use-AzureHDInsightCluster | Selects the specified HDInsight cluster to use for subsequent job submissions. |
| Remove-AzureHDInsightCluster | Removes/deletes the specified HDInsight cluster from the Azure Subscription. |

Job Cmdlets
| Cmdlet | Description |
|---|---|
| Get-AzureHDInsightJob | Retrieves either the list of jobs from a specified HDInsight cluster or the details of a specified HDInsight job, depending on the parameters supplied. |
| Get-AzureHDInsightJobOutput | Retrieves the output of the specified job based on the input parameter values. |
| Invoke-AzureHDInsightHiveJob | Submits a Hive job to an HDInsight cluster, tracks the progress, and retrieves the output. |
| New-AzureHDInsightHiveJobDefinition | Creates a job definition for a Hive job for HDInsight. |
| New-AzureHDInsightMapReduceJobDefinition | Creates a job definition for a MapReduce job for HDInsight. |
| New-AzureHDInsightPigJobDefinition | Creates a job definition for a Pig job for HDInsight. |
| New-AzureHDInsightSqoopJobDefinition | Creates a job definition for a Sqoop job for HDInsight. |
| New-AzureHDInsightStreamingMapReduceJobDefinition | Creates a job definition for a Streaming MapReduce job for HDInsight. |
| Start-AzureHDInsightJob | Starts an Azure HDInsight job with the specified job definition on the specified cluster. |
| Stop-AzureHDInsightJob | Stops a specified job on a specified HDInsight cluster. |
| Wait-AzureHDInsightJob | Awaits the completion of an HDInsight job and reports the progress. |

Other Cmdlets
| Cmdlet | Description |
|---|---|
| Grant-AzureHDInsightHttpServicesAccess | Grants HTTP access to the specified HDInsight cluster. |
| Revoke-AzureHDInsightHttpServicesAccess | Revokes/disables HTTP access to the specified HDInsight cluster. |
Reference: HDInsight PowerShell Cmdlets Reference Documentation and the MSDN documentation pages for the respective cmdlets.
These are the commonly used cmdlets. Various other PowerShell cmdlets can be useful depending on your requirements.
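To illustrate how cmdlets from several categories are typically combined, here is a minimal sketch rather than a definitive script. It assumes the PowerShell environment is already configured for a subscription. All names and values (subscription, storage account, container, cluster, location, node count) are hypothetical placeholders. New-AzureStorageContext and Get-Credential are not in the tables above and are used here only as helpers.

```powershell
# Hypothetical names used only for illustration; replace with your own values.
$subscriptionName   = "MySubscription"
$storageAccountName = "hdidemostorage"
$containerName      = "hdidemocontainer"
$clusterName        = "hdidemocluster"
$location           = "East US"

# Subscription cmdlet: choose the subscription for the commands that follow.
Select-AzureSubscription -SubscriptionName $subscriptionName

# Storage cmdlets: create the account, read its primary key, create the container.
New-AzureStorageAccount -StorageAccountName $storageAccountName -Location $location
$storageKey = (Get-AzureStorageKey -StorageAccountName $storageAccountName).Primary
# New-AzureStorageContext is a helper that is not listed in the tables above.
$context = New-AzureStorageContext -StorageAccountName $storageAccountName `
                                   -StorageAccountKey $storageKey
New-AzureStorageContainer -Name $containerName -Context $context

# Cluster cmdlet: create a small cluster that uses the storage created above.
$credential = Get-Credential   # prompts for the cluster admin user name and password
New-AzureHDInsightCluster -Name $clusterName -Location $location `
    -DefaultStorageAccountName "${storageAccountName}.blob.core.windows.net" `
    -DefaultStorageAccountKey $storageKey `
    -DefaultStorageContainerName $containerName `
    -ClusterSizeInNodes 2 -Credential $credential

# Job cmdlets: define a Hive job, start it, wait for it, and read its output.
$jobDefinition = New-AzureHDInsightHiveJobDefinition -Query "SHOW TABLES;"
$job = Start-AzureHDInsightJob -Cluster $clusterName -JobDefinition $jobDefinition
Wait-AzureHDInsightJob -Job $job -WaitTimeoutInSeconds 3600
Get-AzureHDInsightJobOutput -Cluster $clusterName -JobId $job.JobId -StandardOutput

# Cluster cmdlet: remove the cluster when it is no longer needed.
Remove-AzureHDInsightCluster -Name $clusterName
```

The sketch touches four of the five categories in one short script, which is why the categories are best read as a way to organize the options and not as separate workflows.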
Installing the Azure PowerShell Cmdlets
The Azure PowerShell cmdlets are bundled as a separate module that is not part of the standard PowerShell installation that comes with the operating system. They need to be downloaded and installed separately.
The recommended and easiest way to install the Azure PowerShell cmdlets is through the Microsoft Web Platform Installer. If you do not have the Microsoft Web Platform Installer, download and install it first. Then install the Azure PowerShell cmdlets by following the steps below.
- Launch the Microsoft Web Platform Installer. You will see the various products and add-ons available for installation as shown below.
- Select Microsoft Azure PowerShell, click Add, and then click Install. You will be presented with the End User License Agreement as shown below. The download size shown in the screenshot below varies depending on whether you are installing the cmdlets for the first time or updating a previous installation of the Azure PowerShell cmdlets.
- Click "I Accept". The download and installation will start, and the progress will be reported as shown below.
- Once the download, installation, and configuration are complete, a success dialog shows the status of the installation as shown below. The list of components in the dialog depends on whether you are installing the cmdlets for the first time or updating a previous installation of the Azure PowerShell cmdlets.
- Next, let us quickly verify that the cmdlets are installed and show up in PowerShell. Go to Start, locate Windows PowerShell ISE, and launch it. In the PowerShell ISE, go to Help and click "Show Command Window", or press "Ctrl+F1", as shown below.
- In the Command Window that appears, type "Azure" and verify that the Azure related cmdlets are available as shown below.
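The same check can also be done from the PowerShell prompt. The following is a minimal sketch, assuming the installer registered the module under the name Azure so that PowerShell can discover it:

```powershell
# List the Azure module if it is installed and discoverable on this machine.
# Assumption: the module is registered under the name "Azure".
Get-Module -ListAvailable -Name Azure

# List the HDInsight cmdlets that PowerShell can find, sorted by name.
Get-Command -Name *AzureHDInsight* | Sort-Object Name | Format-Table Name
```

If the first command returns nothing, the module is either not installed or not on the module path of the current session.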
Configuring PowerShell Environment for the Azure Subscription
Before interacting with Azure via PowerShell, we need to configure the PowerShell environment with some commonly used settings, such as the subscription that PowerShell should authenticate against when executing commands.
There are a couple of ways to configure the PowerShell environment with the required subscription information:
- Authenticate using Azure Active Directory
- Authenticate using Certificate (Publish Settings)
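For reference, the first option is usually a single interactive call. This is a minimal sketch, assuming an Azure PowerShell version that includes the Add-AzureAccount cmdlet:

```powershell
# Opens an interactive sign-in window and registers the account's
# subscriptions with the current PowerShell environment.
Add-AzureAccount

# List the subscriptions that are now available to PowerShell.
Get-AzureSubscription
```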
For this demonstration, we will use the second approach using a Publish Settings file. Let us configure the environment by following the steps below.
Before proceeding, note that an active Azure subscription is required. If you do not have one, you can sign up for a Microsoft Azure free trial.
- Launch Windows PowerShell ISE and execute the Get-AzurePublishSettingsFile command as shown below.
- The command opens the Azure Management Portal in a browser. Log in to the portal using the credentials associated with your subscription. After logging in, you will be prompted to download the Publish Settings file. Save the file.
- The Publish Settings file contains the credentials required to authenticate PowerShell against the Azure Subscription. Import the settings into the PowerShell environment by executing Import-AzurePublishSettingsFile "D:\HDIDemo_AzurePublishSettings.publishsettings", using the path where you saved the Publish Settings file.
- After importing the Publish Settings file, it is recommended that you delete it, as it contains security credentials. Now check which subscription the PowerShell environment is configured for by executing the Get-AzureSubscription command. Verify the details and ensure that the PowerShell environment is pointing to the correct Azure Subscription.
Now that we have configured the PowerShell environment and verified the Azure Subscription it points to, we can start issuing commands to Azure from this PowerShell environment.
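The configuration steps above can be sketched as one short script. It reuses the example path from this tip; adjust it to wherever you saved the file. The Read-Host pause is an addition for illustration, so that the import does not run before the download has finished.

```powershell
# Step 1: open the Azure Management Portal to download the Publish Settings file.
Get-AzurePublishSettingsFile

# Pause until the file has been saved (added for illustration only).
Read-Host "Save the Publish Settings file, then press Enter to continue"

# Step 2: import the settings. This is the example path used in this tip.
$publishSettingsPath = "D:\HDIDemo_AzurePublishSettings.publishsettings"
Import-AzurePublishSettingsFile -PublishSettingsFile $publishSettingsPath

# Step 3: delete the file, since it contains security credentials.
Remove-Item -Path $publishSettingsPath

# Step 4: list the configured subscriptions, then show the one currently in use.
Get-AzureSubscription
Get-AzureSubscription -Current
```

If the environment contains more than one subscription, Select-AzureSubscription can be used to switch to the one that should be used for subsequent commands.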
Next Steps
- Check out the tips on Big Data
- Check out the tips on Microsoft Azure
- Check out the tips on Windows PowerShell
- Check out my previous tips
- Stay tuned for the next tip in this series!