There is something to be said for old infrastructure guys making the transition to cloud computing: namely, troubleshooting experience.

Background

I was asked to assist with a project for the Azure QuickStart templates on GitHub. The goal was to provide a template that shows the power of Cloudera Hadoop and Tableau dashboards. I was asked to create the underlying infrastructure, and another team member would work on the data ingestion and dashboards. This was a fairly simple integration of two existing templates. Cloudera had provided a very nice ARM template that created a Hadoop cluster and configured it. I added the appropriate subnets and a Tableau server on a front-end subnet. The idea was to get as close to a real-world scenario as possible without overcomplicating the template, an approach dubbed Push to Pilot.

This template environment uses big VMs. In fact, the first instruction is to raise the core quota for the region in which you will be deploying the cluster. The VM sizes, typical of a Hadoop environment, are quite large (DS13, DS14). The Tableau server uses the latest Marketplace image. Once the ARM template was configured with the additional resources (additional subnet, Tableau server, edited NSGs), we tested the deployment and everything worked like a charm. I passed off the ARM template and environment to the BI team that would ultimately configure the Tableau dashboards and connect to the Cloudera cluster for data.

The Issue

We went through multiple iterations of the ARM template. The reason will be the subject of another blog post, “How to contribute to Azure QuickStart Templates”. I had rolled off onto other projects after submitting a pull request to the Azure GitHub team. There was applause and champagne and congratulatory remarks all around (well, not exactly champagne, but you get the idea). Periodically, I received requests to make changes to pieces of the ARM template, such as adding ports to the NSG.

Initially we built everything by hand from the Marketplace and then tweaked those ARM templates to create the entire solution. Our test environment worked great. The Tableau admin was able to access the Tableau console without issues in the manually created solution. This was not the case with the completed ARM template. So, using the Tableau server directly from the Marketplace worked great, but my ARM template did not. The NSGs were identical, so that wasn’t the issue. Now I had to find out what the heck was going on.

Troubleshooting

I started by removing the NSGs to help troubleshoot, checking the public IP in the ARM template, making sure I had access from within the subnet, and so on. Then I logged onto the Tableau server and began looking at the local firewall. Both the Marketplace server and my ARM-templated server had their firewalls set identically. However, turning the VM firewall off allowed everything through, and the Tableau admin could connect to the console. Simply turning off the VM firewall was not an acceptable fix, nor did it explain the root cause. Something else was causing the issue, but we now knew it was firewall related.

Back to the ARM template and a review of the code! There was a difference in the code. It was in the “domainNameLabel” setting on the public IP resource: my ARM template had a DNS setting to configure the public domain name.
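For anyone who wants to do this kind of side-by-side comparison without eyeballing two JSON files, here is a minimal sketch in Python. It is not what I ran at the time. The two sample templates, the resource name "tableau-pip", and the label "tableau-demo" are made up for illustration. The only things taken from real ARM templates are the resource type Microsoft.Network/publicIPAddresses and the dnsSettings.domainNameLabel property.

```python
PUBLIC_IP_TYPE = "Microsoft.Network/publicIPAddresses"


def public_ip_properties(template):
    """Return {resource name: properties} for each public IP resource in a template dict."""
    found = {}
    for resource in template.get("resources", []):
        if resource.get("type", "").lower() == PUBLIC_IP_TYPE.lower():
            found[resource.get("name", "<unnamed>")] = resource.get("properties", {})
    return found


def flatten(value, prefix=""):
    """Flatten nested dicts into {"a.b.c": leaf}. Empty nested dicts are dropped."""
    if isinstance(value, dict):
        flat = {}
        for key, child in value.items():
            flat.update(flatten(child, f"{prefix}.{key}" if prefix else key))
        return flat
    return {prefix: value}


def diff_properties(left, right):
    """Return sorted (key, left value, right value) tuples for every key that differs."""
    a, b = flatten(left), flatten(right)
    absent = "<absent>"
    return [
        (key, a.get(key, absent), b.get(key, absent))
        for key in sorted(set(a) | set(b))
        if a.get(key, absent) != b.get(key, absent)
    ]


# Hypothetical, trimmed-down templates: one public IP with a DNS label, one without.
template_with_label = {
    "resources": [
        {
            "type": "Microsoft.Network/publicIPAddresses",
            "name": "tableau-pip",
            "properties": {
                "publicIPAllocationMethod": "Dynamic",
                "dnsSettings": {"domainNameLabel": "tableau-demo"},
            },
        }
    ]
}
template_without_label = {
    "resources": [
        {
            "type": "Microsoft.Network/publicIPAddresses",
            "name": "tableau-pip",
            "properties": {"publicIPAllocationMethod": "Dynamic"},
        }
    ]
}

if __name__ == "__main__":
    left = public_ip_properties(template_with_label)["tableau-pip"]
    right = public_ip_properties(template_without_label)["tableau-pip"]
    for key, left_value, right_value in diff_properties(left, right):
        print(f"{key}: {left_value!r} vs {right_value!r}")
```

With real templates you would load each file with json.load and pass the resulting dicts in. Note that this sketch compares raw values only, so ARM expressions such as "[parameters('dnsLabel')]" are treated as plain strings.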

What I found out

After removing the DNS domain name label, everything worked just fine. But why? It seems that the three Windows Firewall profiles (Domain, Private, and Public) need to be considered when you are creating the ARM template for a public IP address. Both Tableau servers, the one from the Marketplace and the one that I created in the ARM template, had the Private and Public firewalls enabled, and Domain was unchecked. That makes sense, since we were not joining a domain. But for some reason, adding a DNS domain name label to the ARM template made the server treat the traffic as if it were connected to a domain.

Learning Points

In today’s world, we do not care so much about the individual Azure VMs when we are architecting solutions. We can automate their replacement rather quickly. But because you are still working with a full operating system, you have to take that into consideration. So, I now remind myself to consider the Windows Firewall on the VMs as well as the ARM template configuration for public IPs, and to learn how the two interact. I am not sure I would have found this issue (I searched like crazy) had I not gone back to my old-school troubleshooting skills: simplifying and comparing the two environments.
