Inventory Microsoft Azure with CloudQuery
November 30, 2021
[Originally posted at https://www.returngis.net/2021/11/haz-inventario-de-microsoft-azure-con-cloudquery/]
I have recently discovered a tool that in different scenarios, especially in the inventory, can be useful to us. It's called CloudQuery and it allows you to export the data of the resources of your subscriptions, from the different cloud providers, to be able to execute queries on it by launching SQL statements, since the result is stored in a Postgres. In this article I tell you how to configure it for Microsoft Azure.
Install CloudQuery
The first thing you need to do is install the CloudQuery tool on your machine. In my case I am using Mac, so I have run the following command using Homebrew:
brew install cloudquery/tap/cloudquery
If you use another operating system you can see the different releases here.
Start CloudQuery
Now that you have the CloudQuery tool installed on your machine, create a directory, I have called it cloudquery, and initialize the configuration within it, with the following command:
cloudquery init azure
By doing this, it generates a file called config.hcl which we can customize, indicating which subscriptions we want to bring the data from, what types of resources, and so on.
If we do not modify anything, all the resources of all the subscriptions to which they have access will be brought. Now we have our project ready to retrieve the information of our subscriptions.
Create a Postgres database in Docker
As I already mentioned at the beginning of this article, the information retrieved is exported to a Postgres-type database , so we will need one. The simple way is using a Docker container:
#Create a database in Docker
docker run -d --name postgresdb \
-p 5432:5432 \
-e POSTGRES_PASSWORD=pass \
postgres
Create a Service Principal
Now that we know that we want to retrieve the information from Azure, in order to do so we need to create a main service that has access to the subscriptions we want to export:
SUBSCRIPTION_ID=<YOUR_SUBSCRIPTION_ID>
az account set --subscription $SUBSCRIPTION_ID
#Need to register Microsoft.Security
az provider register --namespace 'Microsoft.Security'
#Create a service principal
az ad sp create-for-rbac --name cloudquery --scopes /subscriptions/$SUBSCRIPTION_ID > auth.json
Once the response is created, and stored in the auth.json file , I use the jq
tool to store the main service information in these environment variables:
#Set variables
export AZURE_TENANT_ID=$(jq -r '.tenant' auth.json)
export AZURE_CLIENT_ID=$(jq -r '.appId' auth.json)
export AZURE_CLIENT_SECRET=$(jq -r '.password' auth.json)
export AZURE_SUBSCRIPTION_ID=$(az account show --query id -o tsv)
Assign the primary service to the Azure AD "Application Administrator" role
To finish the configuration, we need the main service that we just created to be associated with an Azure Active Directory role called Application administrator, which you can find in the Azure AD resource, in the Roles and administrators section:
Once in it, look for the main service that we have called cloudquery, through the Add assignments button, and associate it permanently.
Dump the information in Postgres with CloudQuery
Now the only thing left is to retrieve the information with CloudQuery. To do this, you only have to execute a single command:
#Fetch the information into the database
cloudquery fetch --dsn "postgres://postgres:pass@localhost:5432/postgres?sslmode=disable"
This uses the environment variables that we have previously configured, with the information of our main service, and the postgres that we have generated in Docker. Once the process finishes, you will see that you have a bunch of generated tables (for this example I have used DataGrip as GUI):
You can make queries like these, simply to retrieve resources of a specific type:
SELECT * from azure_compute_virtual_machines;
SELECT * from azure_web_apps;
Or go further and consult about these aspects that may be important to you and you need to validate or generate a report on them. For example: "Tell me which storage accounts have public access enabled." It would be something as simple as this:
SELECT * from azure_storage_accounts where allow_blob_public_access is null
Greetings!