Setting Up Federated Identity Management for VMC on AWS – Install and Setup vIDM Connector

As an enterprise using VMware Cloud Services, you can set up federation with your corporate domain. Federating your corporate domain allows you to use your organization’s single sign-on and identity source to sign in to VMware Cloud Services. You can also set up multi-factor authentication as part of federation access policy settings.

Federated identity management allows you to control authentication to your organization and its services by assigning organization and service roles to your enterprise groups.

Set up federated identity with the VMware Identity Manager service and the VMware Identity Manager connector, which VMware provides at no additional charge. The following are the required high-level steps:

  1. Download the VMware Identity Manager (vIDM) connector and configure it for user attributes and group sync from your corporate identity store. Note that only the VMware Identity Manager Connector for Windows is supported.
  2. Configure your corporate identity provider instance using the VMware Identity Manager service.
  3. Register your corporate domain.

This series of blogs demonstrates how to complete the customer-side setup of Federated Identity Management for VMC on AWS:

  1. Install and set up the vIDM connector, which is required for all 3 use cases;
  2. Use Case 1: authenticate the users with On-prem Active Directory; (https://davidwzhang.com/2019/07/31/setting-up-federated-identity-management-for-vmc-on-aws-authentication-with-active-directory)
  3. Use Case 2: authenticate the users with third party IDP Okta (https://davidwzhang.com/2019/07/31/setting-up-federated-identity-management-for-vmc-on-aws-authentication-with-okta-idp/)
  4. Use Case 3: authenticate users with Active Directory Federation Services

In this first blog of the series, I will show you how to install the vIDM connector (version 19.03) on a Windows 2012 R2 server and how to achieve HA for the vIDM connector.

Prerequisites

  • A vIDM SaaS tenant. If you don’t have one, please contact your VMware customer success representative.
  • A Windows Server (Windows 2008 R2, Windows 2012, Windows 2012 R2 or Windows 2016).
  • Firewall rules allowing communication from the Windows Server to your domain controllers and to the vIDM tenant on port 443.
  • The vIDM connector for Windows installation package. The latest version of the vIDM connector is shown below.

Installation

Log in to the Windows 2012 R2 server and start the installation:

Click Yes in the “User Account Control” window.

Note that the installation package will install the latest major JRE version on the connector Windows server if a JRE has not been installed yet.

The installation process loads the Installation Wizard.

Click Next in the Installation Wizard window.

Accept the License Agreement as below:

Accept the default installation destination folder and click Next.

Click Next and leave the “Are you migrating your Connector” box unchecked.

Accept the pop-up hostname and default port for this connector.

For the purpose of VMware Cloud federated identity management, do not run the Connector service as a domain user account: leave the “Would you like to run the Connector service as a domain user account?” check box unchecked and click Next.

Click Yes in the pop-up window to confirm the choice from the previous step.

Click Install to begin the installation.

After a few minutes, the installation completes successfully.

Click Finish. A new window will pop up, which shows the Connector appliance management URL as below.

Click Yes. The browser opens and redirects to https://vidmconn01.lab.local:8443. Accept the security certificate alert and continue to the website.

In the VMware Identity Manager Appliance Setup wizard, click Continue.

Set the passwords for the appliance application admin account and click Continue.

Now go to the vIDM tenant. On the Identity & Access Management tab, click Add Connector.

Type in the Connector ID Name and click “Generate Activation Code”.

Copy the generated activation code and go back to the Connector setup wizard.

Copy the activation code into the Activate Connector Window and click Continue.

After a few minutes, the connector is activated.

Note: sometimes a 404 error will pop up like the below. In my experience, it is a false alert on Windows 2012 R2, so don’t worry about it.

In VMware Identity Manager tenant, the newly installed connector will show up as below:

Setup

Now it is time to set up our connector for user sync.

Step 1: Add Directory

Click Add Directory and select “Add Active Directory over LDAP/IWA”.

Type in the “Directory Name”, select “Active Directory over LDAP” and use this directory for user sync and authentication. For the “Directory Search Attribute”, I prefer userPrincipalName over sAMAccountName, as the userPrincipalName option works for all Federated Identity Management use cases, e.g. integration with Active Directory Federation Services and third-party IDPs.

Then provide the required Bind User Details and click “Save & Next”.

After a few minutes, the domain will pop up. Click Next.

In the Map User Attributes window, accept the setup and click Next.

Type in the group DNs and click “Find Groups”.

Click the “0 of 23” under the column “Groups to sync”.

Select the 3 user groups that need to be synced and click Save.

Click Next.

Accept the default setting in the “Select the Users you would like to sync” window and click Next.

In the Review window, click “Sync Directory”.

Now it is time to verify the synced users and groups in the vIDM tenant. Go to the “Users & Groups” tab. You can see the 10 users and 3 groups that were synced from the lab.local directory.

You can find the sync log within the configured directory.

The basic setup of the vIDM connector is now complete.

Connector HA

A single VMware Identity Manager connector is a single point of failure in an enterprise environment. To achieve high availability, simply install one or more extra connectors; installing an extra connector is exactly the same as installing the 1st connector. Here, the second connector is installed on another Windows 2012 R2 server, vidmcon02.lab.local. After the installation is completed, the activation procedure for the connector is the same as well.

Now 2 connectors will show up in the vIDM tenant.

Go to the Built-in identity provider and add the second connector.

Type in the Bind User Password and click “Add Connector”.

Then the second connector is added successfully.

Now there are 2 connectors associated with the Built-in Identity Provider.

Please note that connector HA covers only user authentication in version 19.03. Directory or user sync can be enabled on only one connector at a time. In the event of a connector instance failure, authentication is handled automatically by another connector instance. However, for directory sync, you must modify the directory settings in the VMware Identity Manager service to use another connector instance, as shown below.

Thank you very much for reading!

Integrate VMware NSX-T with Kubernetes

Kubernetes (K8s) is an open-source system for automating deployment, scaling, and management of containerized applications. K8s uses network plugins to provide the required networking functions like routing, switching, firewall and load balancing. VMware NSX-T provides a network plugin for K8s as well, called NCP (NSX Container Plug-in). If you want to know more about VMware NSX-T, please go to docs.vmware.com.

In this blog, I will show you how to integrate VMware NSX-T with Kubernetes.

Here, we will build a three-node, single-master K8s cluster. All 3 nodes are RHEL 7.5 virtual machines.

  • master node:
    • Hostname: master.k8s
    • Mgmt IP: 10.1.73.233
  • worker node1:
    • Hostname: node1.k8s
    • Mgmt IP: 10.1.73.234
  • worker node2:
    • Hostname: node2.k8s
    • Mgmt IP: 10.1.73.235

On each node, there are 2 vNICs attached. The first vNIC, ens192, is used for management; the second vNIC, ens224, is used for K8s transport and is connected to an overlay logical switch.

  • NSX-T version: 2.3.0.0.0.10085405
  • NSX-T NCP version: 2.3.1.10693410
  • Docker version: 18.03.1-ce
  • K8s version: 1.11.4

1. Prepare K8s Cluster Setup

1.1 Get Offline Packages and Docker Images

As there is no Internet access in my environment, I have to prepare my K8s cluster offline. To do that, I need to get the following packages:

  • Docker offline installation packages
  • Kubeadm offline installation packages which will be used to set up the K8s cluster;
  • Docker offline images;

1.1.1 Docker Offline Installation Packages

Regarding how to get Docker offline installation packages, please refer to my other blog: Install Docker Offline on Centos7.

1.1.2 Kubeadm Offline Installation Packages

Getting the kubeadm offline installation packages is quite straightforward as well. You can use yum with the --downloadonly option.

yum install --downloadonly --downloaddir=/root/ kubelet-1.11.0
yum install --downloadonly --downloaddir=/root/ kubeadm-1.11.0
yum install --downloadonly --downloaddir=/root/ kubectl-1.11.0

1.1.3 Docker Offline Images

Below are the required Docker images for the K8s cluster.

  • k8s.gcr.io/kube-proxy-amd64 v1.11.4
  • k8s.gcr.io/kube-apiserver-amd64 v1.11.4
  • k8s.gcr.io/kube-controller-manager-amd64 v1.11.4
  • k8s.gcr.io/kube-scheduler-amd64 v1.11.4
  • k8s.gcr.io/coredns 1.1.3
  • k8s.gcr.io/etcd-amd64 3.2.18
  • k8s.gcr.io/pause-amd64 3.1
  • k8s.gcr.io/pause 3.1

You may notice that the list above includes two identical pause images, although they have different repository names. There is a story behind this. Initially, I only loaded the first image, “k8s.gcr.io/pause-amd64”. The setup passed the “kubeadm init” pre-flight checks but failed at the actual cluster setup stage. When I checked the log, I found that the cluster setup process kept requesting the second image. I suspect it is a bug in kubeadm v1.11.0, which is the version I am using.

Here is an example of how to use the “docker pull” CLI to download a Docker image, in case you have not done it before.

docker pull k8s.gcr.io/kube-proxy-amd64:v1.11.4

Once you have all the Docker images, you need to export them as offline images via “docker save”.

docker save k8s.gcr.io/pause-amd64:3.1 -o /pause-amd64:3.1.docker
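If you would rather script this than pull and save each image one at a time, a minimal bash sketch is below; the image list comes from the list above, while the output directory /root/offline-images is my own choice and can be anything:

#!/bin/bash
# Pull every required K8s image and save it as an offline archive.
# Run on a host that has Internet access and Docker installed.
mkdir -p /root/offline-images
for image in \
  k8s.gcr.io/kube-proxy-amd64:v1.11.4 \
  k8s.gcr.io/kube-apiserver-amd64:v1.11.4 \
  k8s.gcr.io/kube-controller-manager-amd64:v1.11.4 \
  k8s.gcr.io/kube-scheduler-amd64:v1.11.4 \
  k8s.gcr.io/coredns:1.1.3 \
  k8s.gcr.io/etcd-amd64:3.2.18 \
  k8s.gcr.io/pause-amd64:3.1 \
  k8s.gcr.io/pause:3.1
do
  docker pull "$image"
  # Turn e.g. "k8s.gcr.io/pause-amd64:3.1" into "pause-amd64_3.1.docker" for the archive name.
  file="$(basename "$image" | tr ':' '_').docker"
  docker save "$image" -o "/root/offline-images/$file"
done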

Now it is time to upload all the installation packages and offline images to all 3 K8s nodes, including the master node.

1.2 Disable SELinux and Firewalld

# disable SELinux
setenforce 0
# Change SELINUX to permissive for /etc/selinux/config
vi /etc/selinux/config
SELINUX=permissive
# Stop and disable firewalld
systemctl disable firewalld && systemctl stop firewalld

1.3 Config DNS Resolution

# Update the /etc/hosts file as below on all three K8s nodes
10.1.73.233   master.k8s
10.1.73.234   node1.k8s
10.1.73.235   node2.k8s

1.4 Install Docker and Kubeadm

To install Docker and kubeadm, first put the required packages for each into its own directory. For example, all the required packages for kubeadm go into a directory called kubeadm. Then use rpm to install kubeadm as below:

[root@master kubeadm]# rpm -ivh --replacefiles --replacepkgs *.rpm
warning: 53edc739a0e51a4c17794de26b13ee5df939bd3161b37f503fe2af8980b41a89-cri-tools-1.12.0-0.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 3e1ba8d5: NOKEY
warning: socat-1.7.3.2-2.el7.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID f4a80eb5: NOKEY
Preparing...                          ########################## [100%]
Updating / installing...
   1:socat-1.7.3.2-2.el7              ########################## [ 17%]
   2:kubernetes-cni-0.6.0-0           ########################## [ 33%]
   3:kubelet-1.11.0-0                 ########################## [ 50%]
   4:kubectl-1.11.0-0                 ########################## [ 67%]
   5:cri-tools-1.12.0-0               ########################## [ 83%]
   6:kubeadm-1.11.0-0                 #########################3 [100%]

After Docker and Kubeadm are installed, you can go to enable and start docker and kubelet service:

systemctl enable docker && systemctl start docker
systemctl enable kubelet && systemctl start kubelet
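At this point you can also confirm that the installed versions match the ones listed at the top of this post; the exact output format varies a little between releases:

docker version --format '{{.Server.Version}}'   # e.g. 18.03.1-ce
kubeadm version -o short                        # e.g. v1.11.0
kubelet --version                               # e.g. Kubernetes v1.11.0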

In addition, you need to perform some OS-level setup so that your K8s environment can run properly.

# ENABLING THE NET.BRIDGE.BRIDGE-NF-CALL-IPTABLES KERNEL OPTION
sysctl -w net.bridge.bridge-nf-call-iptables=1
echo "net.bridge.bridge-nf-call-iptables=1" > /etc/sysctl.d/k8s.conf
# Disable Swap
swapoff -a && sed -i '/ swap / s/^/#/' /etc/fstab

1.5 Load Docker Offline Images

Now let us load all the offline Docker images into the local Docker repo on every K8s node via the “docker load” CLI.

docker load -i kube-apiserver-amd64:v1.11.4.docker
docker load -i coredns:1.1.3.docker
docker load -i etcd-amd64:3.2.18.docker
docker load -i kube-controller-manager-amd64:v1.11.4.docker
docker load -i kube-proxy-amd64:v1.11.4.docker
docker load -i kube-scheduler-amd64:v1.11.4.docker
docker load -i pause-amd64:3.1.docker
docker load -i pause:3.1.docker
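Equivalently, if all the archives sit in a single directory (I am assuming the /root/offline-images directory from the earlier sketch), a short loop loads them in one go:

# Load every saved image archive into the local Docker repo on this node.
for f in /root/offline-images/*.docker; do
  docker load -i "$f"
done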

1.6 NSX NCP Plugin

Now you can upload the NSX NCP plugin to all 3 nodes and load the NCP image into the local Docker repo.

1.6.1 Load NSX Container Image

docker load -i nsx-ncp-rhel-2.3.1.10693410.tar 

Now the docker image list on your K8s nodes will be similar to below:

[root@master ~]# docker image list
REPOSITORY                                   TAG                 IMAGE ID            CREATED             SIZE
registry.local/2.3.1.10693410/nsx-ncp-rhel   latest              97d54d5c80db        5 months ago        701MB
k8s.gcr.io/kube-proxy-amd64                  v1.11.4             5071d096cfcd        5 months ago        98.2MB
k8s.gcr.io/kube-apiserver-amd64              v1.11.4             de6de495c1f4        5 months ago        187MB
k8s.gcr.io/kube-controller-manager-amd64     v1.11.4             dc1d57df5ac0        5 months ago        155MB
k8s.gcr.io/kube-scheduler-amd64              v1.11.4             569cb58b9c03        5 months ago        56.8MB
k8s.gcr.io/coredns                           1.1.3               b3b94275d97c        11 months ago       45.6MB
k8s.gcr.io/etcd-amd64                        3.2.18              b8df3b177be2        12 months ago       219MB
k8s.gcr.io/pause-amd64                       3.1                 da86e6ba6ca1        16 months ago       742kB
k8s.gcr.io/pause                             3.1                 da86e6ba6ca1        16 months ago       742kB

1.6.2 Install NSX CNI

rpm -ivh --replacefiles nsx-cni-2.3.1.10693410-1.x86_64.rpm

Please note that the --replacefiles option is required due to a known bug in NSX-T 2.3. If you don’t include the --replacefiles option, you will see an error like the below:

[root@master rhel_x86_64]# rpm -i nsx-cni-2.3.1.10693410-1.x86_64.rpm
   file /opt/cni/bin/loopback from install of nsx-cni-2.3.1.10693410-1.x86_64 conflicts with file from package kubernetes-cni-0.6.0-0.x86_64

1.6.3 Install and Config OVS

# Go to OpenvSwitch directory
rpm -ivh openvswitch-2.9.1.9968033.rhel75-1.x86_64.rpm
systemctl start openvswitch.service && systemctl enable openvswitch.service
ovs-vsctl add-br br-int
ovs-vsctl add-port br-int ens224 -- set Interface ens224 ofport_request=1
ip link set br-int up
ip link set ens224 up
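Before moving on, it is worth confirming that the integration bridge and the uplink port look right; a couple of standard Open vSwitch and iproute2 checks:

# br-int should exist and list ens224 as one of its ports.
ovs-vsctl show
ovs-vsctl list-ports br-int
# Both interfaces should be administratively UP.
ip link show br-int
ip link show ens224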

2. Setup K8s Cluster

Now you are ready to set up your K8s cluster. I will use a kubeadm config file to define my K8s cluster when I initiate the cluster setup. Below is the content of my kubeadm config file.

apiVersion: kubeadm.k8s.io/v1alpha1
kind: MasterConfiguration
kubernetesVersion: v1.11.4
api:
  advertiseAddress: 10.1.73.233
  bindPort: 6443

From the above, you can see that Kubernetes version v1.11.4 will be used and the API server IP is 10.1.73.233, which is the master node IP. Run the following CLI from the K8s master node to create the K8s cluster.

kubeadm init --config kubeadm.yml

After the K8s cluster is set up, you can join the remaining two worker nodes to the cluster via the CLI below:

kubeadm join 10.1.73.233:6443 --token up1nz9.iatqv50bkrqf0rcj --discovery-token-ca-cert-hash sha256:3f9e96e70a59f1979429435caa35d12270d60a7ca9f0a8436dff455e4b8ac1da

Note: You can get the required token and discovery-token-ca-cert-hash from the output of “kubeadm init”.
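If you did not capture the token and hash, or the token has expired (bootstrap tokens are only valid for 24 hours by default), you can regenerate a complete join command on the master node; this is supported by kubeadm 1.9 and later, so it should apply to the 1.11 build used here:

# Print a fresh "kubeadm join ..." command, including a new token and the CA cert hash.
kubeadm token create --print-join-command
# List the existing bootstrap tokens and their expiry times.
kubeadm token list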

3. NSX-T and K8s Integration

3.1 Prepare NSX Resource

Before the integration, you have to make sure that you have the NSX-T resources configured in NSX Manager. The required resources include:

  • Overlay Transport Zone: overlay_tz
  • Tier 0 router: tier0_router
  • K8s Transport Logical Switch
  • IP Blocks for Kubernetes Pods: container_ip_blocks
  • IP Pool for SNAT: external_ip_pools
  • Firewall Marker Sections: top_firewall_section_marker and bottom_firewall_section_marker

Please refer to the NSX Container Plug-in for Kubernetes and Cloud Foundry – Installation and Administration Guide for details on how to create these NSX-T resources. The following are the UUIDs of all the created resources:

  • tier0_router = c86a625e-54e0-4510-9185-e9e1b7e26eb9
  • overlay_tz = f6d90300-c56e-4d26-8684-8eff64cdf5a0
  • container_ip_blocks = f9e411f5-654e-4f0d-99e8-2e5a9812f295
  • external_ip_pools = 84ffd635-640f-41c6-be85-71337e112e69
  • top_firewall_section_marker = ab07e559-79aa-4bc9-a6f0-126ea59278c2
  • bottom_firewall_section_marker = 35aaa6c5-0870-4ac4-bf47-114780863956

In addition, make sure that you tag the logical switch ports that the three K8s nodes are attached to in the following way:

{'ncp/node_name': '<node_name>'}
{'ncp/cluster': '<cluster_name>'}

Here node_name is the FQDN hostname of the K8s node, and cluster_name is whatever you call this cluster in NSX (not in the K8s cluster context). My K8s nodes’ tags are shown below.

k8s master switching port tags
k8s node1 switching port tags

k8s node2 switching port tags
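If you want to double-check the tags through the NSX-T Manager API instead of the UI, a hedged example is below; the Manager FQDN, the admin credentials and the logical-port UUID are placeholders you need to substitute with your own values:

# Retrieve one logical port and inspect its "tags" array; the ncp/node_name and ncp/cluster tags should be present.
curl -k -u admin:'<password>' \
  https://nsx-manager.lab.local/api/v1/logical-ports/<logical-port-uuid>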

3.2 Install NSX NCP Plugin

3.2.1 Create Name Space

kubectl create ns nsx-system

3.2.2 Create Service Account for NCP

kubectl apply -f rbac-ncp.yml -n nsx-system

3.2.3 Create NCP ReplicationController

kubectl apply -f ncp-rc.yml -n nsx-system

3.2.4 Create NCP nsx-node-agent and nsx-kube-proxy DaemonSet

kubectl create -f nsx-node-agent-ds.yml -n nsx-system 

You can find the above 3 YAML files on GitHub: https://github.com/insidepacket/nsxt-k8s-integration-yaml

Now you have completed the NSX-T and K8s integration. If you check the pods running in your K8s cluster, you will see something similar to the below:

[root@master ~]# k get pods --all-namespaces 
NAMESPACE     NAME                                   READY     STATUS    RESTARTS   AGE
kube-system   coredns-78fcdf6894-pg4dz               1/1       Running   0          9d
kube-system   coredns-78fcdf6894-q727q               1/1       Running   128        9d
kube-system   etcd-master.k8s                        1/1       Running   3          14d
kube-system   kube-apiserver-master.k8s              1/1       Running   2          14d
kube-system   kube-controller-manager-master.k8s     1/1       Running   3          14d
kube-system   kube-proxy-5p482                       1/1       Running   2          14d
kube-system   kube-proxy-9mnwk                       1/1       Running   0          12d
kube-system   kube-proxy-wj8qw                       1/1       Running   3          14d
kube-system   kube-scheduler-master.k8s              1/1       Running   3          14d
ns-test1000   http-echo-deployment-b5bbfbb86-j4dxq   1/1       Running   0          2d
nsx-system    nsx-ncp-rr989                          1/1       Running   0          11d
nsx-system    nsx-node-agent-kbsld                   2/2       Running   0          9d
nsx-system    nsx-node-agent-pwhlp                   2/2       Running   0          9d
nsx-system    nsx-node-agent-vnd7m                   2/2       Running   0          9d
nszhang       busybox-756b4db447-2b9kx               1/1       Running   0          5d
nszhang       busybox-deployment-5c74f6dd48-n7tp2    1/1       Running   0          9d
nszhang       http-echo-deployment-b5bbfbb86-xnjz6   1/1       Running   0          2d
nszhang       jenkins-deployment-8546d898cd-zdzs2    1/1       Running   0          11d
nszhang       whoami-deployment-85b65d8757-6m7kt     1/1       Running   0          6d
nszhang       whoami-deployment-85b65d8757-b4m99     1/1       Running   0          6d
nszhang       whoami-deployment-85b65d8757-pwwt9     1/1       Running   0          6d

In the NSX-T Manager GUI, you will see that the following resources have been created for the K8s cluster.

Logical Switches for K8s
Tier1 Router for K8s
NSX LB for K8s

Tips:

I ran into a few issues during this journey, and the following CLIs were used a lot while troubleshooting. I share them here in the hope that they help you as well.

  • How to check kubelet service’s log
journalctl -xeu kubelet
  • How to check the log of a specific pod
kubectl logs nsx-ncp-rr989 -n nsx-system

“nsx-ncp-rr989” is the name of the pod and “nsx-system” is the namespace which we created for NCP.

  • How to check the log of a specific container when there is more than one container in the pod
kubectl logs nsx-node-agent-n7n7g -c nsx-node-agent -n nsx-system

“nsx-node-agent-n7n7g” is the pod name and “nsx-node-agent” is the container name.

  • Show details of a specific pod
kubectl describe pod nsx-ncp-rr989 -n nsx-system
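One more generic kubectl command that helps here, since it shows which K8s node each nsx-node-agent pod is scheduled on (see the NODE column):

kubectl get pods -n nsx-system -o wide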

Failed to Start Libvirtd

Environment:

OS: CentOS Linux release 7.5.1804 (Core)

Error Message:

# journalctl -u libvirtd
-- Logs begin at Wed 2019-01-30 17:46:41 AEDT, end at Wed 2019-01-30 18:02:09 AEDT. --
Jan 30 17:47:09 ovs-sandbox2 systemd[1]: Starting Virtualization daemon...
Jan 30 17:47:14 ovs-sandbox2 libvirtd[1483]: 2019-01-30 06:47:14.936+0000: 1483: info : libvirt version: 4.5.0, package: 10.el7_6.3 (CentOS BuildSystem http://bugs.centos.org, 2018-11-28-20:51:39, x86-01.bsys.centos.org)
Jan 30 17:47:14 ovs-sandbox2 libvirtd[1483]: 2019-01-30 06:47:14.936+0000: 1483: info : hostname: ovs-sandbox2
Jan 30 17:47:14 ovs-sandbox2 libvirtd[1483]: 2019-01-30 06:47:14.936+0000: 1483: error : virModuleLoadFile:53 : internal error: Failed to load module '/usr/lib64/libvirt/storage-backend/libvirt_storage_backend_rbd.so': /usr/lib64/libvir
Jan 30 17:47:14 ovs-sandbox2 systemd[1]: libvirtd.service: main process exited, code=exited, status=3/NOTIMPLEMENTED
Jan 30 17:47:14 ovs-sandbox2 systemd[1]: Failed to start Virtualization daemon.
Jan 30 17:47:14 ovs-sandbox2 systemd[1]: Unit libvirtd.service entered failed state.
Jan 30 17:47:14 ovs-sandbox2 systemd[1]: libvirtd.service failed.
Jan 30 17:47:15 ovs-sandbox2 systemd[1]: libvirtd.service holdoff time over, scheduling restart.

When:

The issue happened after I inadvertently updated libvirt from 3.9.0-14.el7_5.8.x86_64 to 4.5.0-10.el7_6.3.x86_64.

Fix:

The failing module is libvirt’s RBD storage backend, so updating the Ceph client library librados2 (which also pulls in an updated librbd1) resolves the problem:

[root@ovs-sandbox2 /]# yum update librados2

[root@ovs-sandbox2 virtualmachines]# yum history info 14
Loaded plugins: fastestmirror
Transaction ID : 14
Begin time : Wed Jan 30 18:10:53 2019
Begin rpmdb : 815:0a1f6c4d93558a35ec9c3ceb9114712149f71015
End time : 18:10:54 2019 (1 seconds)
End rpmdb : 817:358974b7c1ae161fe8d05d2d23573b31eaac6582
User : root
Return-Code : Success
Command Line : update librados2
Transaction performed with:
Installed rpm-4.11.3-32.el7.x86_64 @anaconda
Installed yum-3.4.3-158.el7.centos.noarch @anaconda
Installed yum-plugin-fastestmirror-1.1.31-45.el7.noarch @anaconda
Packages Altered:
    Dep-Install boost-iostreams-1.53.0-27.el7.x86_64 @base
    Dep-Install boost-random-1.53.0-27.el7.x86_64    @base
    Updated     librados2-1:0.94.5-2.el7.x86_64      @base
    Update                1:10.2.5-4.el7.x86_64      @base
    Updated     librbd1-1:0.94.5-2.el7.x86_64        @base
    Update              1:10.2.5-4.el7.x86_64        @base
history info

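With librados2 and librbd1 updated, libvirtd should start cleanly again; a quick check:

systemctl restart libvirtd
systemctl status libvirtd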

Automate NSX-T Build with Terraform

Terraform is a widely adopted Infrastructure as Code tool that allows you to define your infrastructure using a simple, declarative configuration language and to deploy and manage that infrastructure across public cloud providers, including AWS, Azure, Google Cloud and IBM Cloud, as well as other infrastructure providers like VMware NSX-T, F5 BIG-IP, etc.

In this blog, I will show you how to leverage the Terraform NSX-T provider to define an NSX-T tenant environment in minutes.

To build the new NSX-T environment, I am going to:

  1. Create a new Tier1 router named tier1_router;
  2. Create three logical switches under newly created Tier1 router for web/app/db security zone;
  3. Connect the newly created Tier1 router to the existing Tier0 router;
  4. Create a new network service group including SSH and HTTPS;
  5. Create a new firewall section and add a firewall rule to allow outbound SSH/HTTPS traffic from any workload in the web logical switch to any workload in the app logical switch.

First, I define a Terraform module as below. Note: a Terraform module is normally used to define reusable components. For example, the module defined here can be re-used to build both non-prod and prod environments by simply providing different input.

/*
provider "nsxt" {
  allow_unverified_ssl = true
  max_retries = 10
  retry_min_delay = 500
  retry_max_delay = 5000
  retry_on_status_codes = [429]
}
*/

data "nsxt_transport_zone" "overlay_transport_zone" {
  display_name = "tz-overlay"
}

data "nsxt_logical_tier0_router" "tier0_router" {
  display_name = "t0"
}

data "nsxt_edge_cluster" "edge_cluster" {
  display_name = "edge-cluster"
}

resource "nsxt_logical_router_link_port_on_tier0" "tier0_port_to_tier1" {
  description = "TIER0_PORT1 provisioned by Terraform"
  display_name = "tier0_port_to_tier1"
  logical_router_id = "${data.nsxt_logical_tier0_router.tier0_router.id}"
  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_logical_tier1_router" "tier1_router" {
  description = "RTR1 provisioned by Terraform"
  display_name = "${var.nsxt_logical_tier1_router_name}"
  #failover_mode = "PREEMPTIVE"
  edge_cluster_id = "${data.nsxt_edge_cluster.edge_cluster.id}"
  enable_router_advertisement = true
  advertise_connected_routes = false
  advertise_static_routes = true
  advertise_nat_routes = true
  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_logical_router_link_port_on_tier1" "tier1_port_to_tier0" {
  description  = "TIER1_PORT1 provisioned by Terraform"
  display_name = "tier1_port_to_tier0"
  logical_router_id = "${nsxt_logical_tier1_router.tier1_router.id}"
  linked_logical_router_port_id = "${nsxt_logical_router_link_port_on_tier0.tier0_port_to_tier1.id}"
  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_logical_switch" "LS-terraform-web" {
  admin_state = "UP"
  description = "LogicalSwitch provisioned by Terraform"
  display_name = "${var.logicalswitch1_name}"
  transport_zone_id = "${data.nsxt_transport_zone.overlay_transport_zone.id}"
  replication_mode  = "MTEP"
  tag {
    scope = "ibm"
    tag = "blue"
  }
}

resource "nsxt_logical_switch" "LS-terraform-app" {
  admin_state = "UP"
  description = "LogicalSwitch provisioned by Terraform"
  display_name = "${var.logicalswitch2_name}"
  transport_zone_id = "${data.nsxt_transport_zone.overlay_transport_zone.id}"
  replication_mode  = "MTEP"
  tag {
    scope = "ibm"
    tag = "blue"
  }
}


resource "nsxt_logical_switch" "LS-terraform-db" {
  admin_state = "UP"
  description = "LogicalSwitch provisioned by Terraform"
  display_name = "${var.logicalswitch3_name}"
  transport_zone_id = "${data.nsxt_transport_zone.overlay_transport_zone.id}"
  replication_mode  = "MTEP"
  tag {
    scope = "ibm"
    tag = "blue"
  }
}

resource "nsxt_logical_port" "lp-terraform-web" {
  admin_state = "UP"
  description = "lp provisioned by Terraform"
  display_name = "lp-terraform-web"
  logical_switch_id = "${nsxt_logical_switch.LS-terraform-web.id}"

  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_logical_port" "lp-terraform-app" {
  admin_state = "UP"
  description = "lp provisioned by Terraform"
  display_name = "lp-terraform-app"
  logical_switch_id = "${nsxt_logical_switch.LS-terraform-app.id}"

  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_logical_port" "lp-terraform-db" {
  admin_state = "UP"
  description = "lp provisioned by Terraform"
  display_name = "lp-terraform-db"
  logical_switch_id = "${nsxt_logical_switch.LS-terraform-db.id}"

  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_logical_router_downlink_port" "lif-terraform-web" {
  description = "lif provisioned by Terraform"
  display_name = "lif-terraform-web"
  logical_router_id = "${nsxt_logical_tier1_router.tier1_router.id}"
  linked_logical_switch_port_id = "${nsxt_logical_port.lp-terraform-web.id}"
  ip_address = "${var.logicalswitch1_gw}"

  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_logical_router_downlink_port" "lif-terraform-app" {
  description = "lif provisioned by Terraform"
  display_name = "lif-terraform-app"
  logical_router_id = "${nsxt_logical_tier1_router.tier1_router.id}"
  linked_logical_switch_port_id = "${nsxt_logical_port.lp-terraform-app.id}"
  ip_address = "${var.logicalswitch2_gw}"

  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_logical_router_downlink_port" "lif-terraform-db" {
  description = "lif provisioned by Terraform"
  display_name = "lif-terraform-db"
  logical_router_id = "${nsxt_logical_tier1_router.tier1_router.id}"
  linked_logical_switch_port_id = "${nsxt_logical_port.lp-terraform-db.id}"
  ip_address = "${var.logicalswitch3_gw}"

  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_l4_port_set_ns_service" "ns_service_tcp_443_22_l4" {
  description = "Service provisioned by Terraform"
  display_name = "web_to_app"
  protocol = "TCP"
  destination_ports = ["443", "22"]
  tag {
    scope = "ibm"
    tag   = "blue"
  }
}

resource "nsxt_firewall_section" "terraform" {
  description = "FS provisioned by Terraform"
  display_name = "Web-App"
  tag {
    scope = "ibm"
    tag = "blue"
  }
  
  applied_to {
    target_type = "LogicalSwitch"
    target_id = "${nsxt_logical_switch.LS-terraform-web.id}"
  }

  section_type = "LAYER3"
  stateful = true

  rule {
    display_name = "out_rule"
    description  = "Out going rule"
    action = "ALLOW"
    logged = true
    ip_protocol = "IPV4"
    direction = "OUT"

    source {
      target_type = "LogicalSwitch"
      target_id = "${nsxt_logical_switch.LS-terraform-web.id}"
    }

    destination {
      target_type = "LogicalSwitch"
      target_id = "${nsxt_logical_switch.LS-terraform-app.id}"
    }
    service {
      target_type = "NSService"
      target_id = "${nsxt_l4_port_set_ns_service.ns_service_tcp_443_22_l4.id}"
    }
    applied_to {
      target_type = "LogicalSwitch"
      target_id = "${nsxt_logical_switch.LS-terraform-web.id}"
    }
  }
}  

output "edge-cluster-id" {
  value = "${data.nsxt_edge_cluster.edge_cluster.id}"
}

output "edge-cluster-deployment_type" {
  value = "${data.nsxt_edge_cluster.edge_cluster.deployment_type}"
}

output "tier0-router-port-id" {
  value = "${nsxt_logical_router_link_port_on_tier0.tier0_port_to_tier1.id}"
}

Then I use the configuration below to call this newly created module:

provider "nsxt" {
  allow_unverified_ssl = true
  max_retries = 10
  retry_min_delay = 500
  retry_max_delay = 5000
  retry_on_status_codes = [429]
}

module "nsxtbuild" {
  source = "/root/terraform/modules/nsxtbuild"
  nsxt_logical_tier1_router_name = "tier1-npr-vr"
  logicalswitch1_name = "npr-web"
  logicalswitch2_name = "npr-app"
  logicalswitch3_name = "npr-db"
  logicalswitch1_gw = "192.168.80.1/24"
  logicalswitch2_gw = "192.168.81.1/24"
  logicalswitch3_gw = "192.168.82.1/24"
}
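With the provider block and the module call in place, the standard Terraform workflow applies; run these from the directory that holds the root configuration above:

# Download the NSX-T provider plugin and initialise the working directory.
terraform init
# Preview the resources that will be created.
terraform plan
# Build the environment in NSX-T.
terraform apply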

After running “terraform apply”, you can see that the required environment has been built successfully in NSX Manager.

Logical Switches
T1 vRouter
Service
DFW Rules

Install Docker Offline on Centos7

Recently, I had to build an environment running a real web application in order to test an LBaaS site-affinity solution. After a few minutes, I decided to install a Jenkins container on my test CentOS 7 virtual machines.

Unfortunately, my CentOS virtual machines have no Internet access, so I spent a bit of time working out how to install Docker and run a container offline on CentOS 7. This blog may help others who face the same challenge.

The Docker version which I am going to install is:
docker-ce-18.03.1.ce-1.el7.centos

On another CentOS 7 machine (minimal install) which has Internet access, I ran the CLI below to identify all the required packages for the Docker offline installation.

repoquery -R docker-ce-18.03.1.ce-1.el7.centos

From the output, I found out that I need the following packages to complete the Docker offline installation:

1:libsepol-2.5-8.1.el7
2:libselinux-2.5-12.el7
3:audit-libs-2.8.1-3.el7_5.1
4:libsemanage-2.5-11.el7
5:libselinux-utils-2.5-12.el7
6:policycoreutils-2.5-22.el7
7:selinux-policy-3.13.1-192.el7
8:libcgroup-0.41-15.el7
9:selinux-policy-targeted-3.13.1-19
10:libsemanage-python-2.5-11.el7
11:audit-libs-python-2.8.1-3.el7_5.1
12:setools-libs-3.3.8-2.el7
13:python-IPy-0.75-6.el7
14:pigz-2.3.3-1.el7.centos
15:checkpolicy-2.5-6.el7
16:policycoreutils-python-2.5-22.el7
17:container-selinux-2:2.68-1.el7
18:docker-ce-18.03.1.ce-1.el7.centos
19:audit-2.8.1-3.el7_5.1

Then I downloaded the Docker rpm package and all dependent packages with yumdownloader:

yumdownloader --resolve docker-ce-18.03.1.ce-1.el7.centos

I archived the above packages (tar cf docker-ce.offline.tar *.rpm) and uploaded the archive to my offline CentOS 7 virtual machines. Then I used the rpm CLI to install Docker:

[root@lbaas02 ~]# rpm -ivh --replacefiles --replacepkgs *.rpm

warning: audit-2.8.1-3.el7_5.1.x86_64.rpm: Header V3 RSA/SHA256 Signature, key ID f4a80eb5: NOKEY
warning: docker-ce-18.03.1.ce-1.el7.centos.x86_64.rpm: Header V4 RSA/SHA512 Signature, key ID 621e9f35: NOKEY
Preparing...                          ################################# [100%]
Updating / installing...
   1:libsepol-2.5-8.1.el7             ################################# [  5%]
   2:libselinux-2.5-12.el7            ################################# [ 11%]
   3:audit-libs-2.8.1-3.el7_5.1       ################################# [ 16%]
   4:libsemanage-2.5-11.el7           ################################# [ 21%]
   5:libselinux-utils-2.5-12.el7      ################################# [ 26%]
   6:policycoreutils-2.5-22.el7       ################################# [ 32%]
   7:selinux-policy-3.13.1-192.el7    ################################# [ 37%]
   8:libcgroup-0.41-15.el7            ################################# [ 42%]
   9:selinux-policy-targeted-3.13.1-19################################# [ 47%]
  10:libsemanage-python-2.5-11.el7    ################################# [ 53%]
  11:audit-libs-python-2.8.1-3.el7_5.1################################# [ 58%]
  12:setools-libs-3.3.8-2.el7         ################################# [ 63%]
  13:python-IPy-0.75-6.el7            ################################# [ 68%]
  14:pigz-2.3.3-1.el7.centos          ################################# [ 74%]
  15:checkpolicy-2.5-6.el7            ################################# [ 79%]
  16:policycoreutils-python-2.5-22.el7################################# [ 84%]
  17:container-selinux-2:2.68-1.el7   ################################# [ 89%]
  18:docker-ce-18.03.1.ce-1.el7.centos################################# [ 95%]
  19:audit-2.8.1-3.el7_5.1            ################################# [100%]

After the installation completed, I enabled and started the Docker service:

[root@lbaas02 ~]# systemctl enable docker

Created symlink from /etc/systemd/system/multi-user.target.wants/docker.service to /usr/lib/systemd/system/docker.service.

[root@lbaas02 ~]# systemctl start docker

The next task is to import the offline Jenkins Docker image. First, on the Internet-connected machine, I pulled the Jenkins Docker image:

docker pull jenkins/jenkins

Then I exported the Docker image to a file and uploaded it to my test CentOS machine.

docker save -o jenkins.docker jenkins/jenkins

On my test CentOS machine, I loaded the image into Docker.

[root@lbaas01 ~]# docker load -i jenkins.docker

f715ed19c28b: Loading layer [==================================================>]  105.5MB/105.5MB
8bb25f9cdc41: Loading layer [==================================================>]  23.99MB/23.99MB
08a01612ffca: Loading layer [==================================================>]  7.994MB/7.994MB
1191b3f5862a: Loading layer [==================================================>]  146.4MB/146.4MB
097524d80f54: Loading layer [==================================================>]  2.332MB/2.332MB
685f72a7cd4f: Loading layer [==================================================>]  3.584kB/3.584kB
9c147c576d67: Loading layer [==================================================>]  1.536kB/1.536kB
e9805f9bdc9e: Loading layer [==================================================>]  356.3MB/356.3MB
8b47d19735d5: Loading layer [==================================================>]  362.5kB/362.5kB
e2a15a753d48: Loading layer [==================================================>]  338.9kB/338.9kB
287c6d658570: Loading layer [==================================================>]  3.584kB/3.584kB
5e9d64b80844: Loading layer [==================================================>]  9.728kB/9.728kB
be6e5f898997: Loading layer [==================================================>]  868.9kB/868.9kB
609adfa44126: Loading layer [==================================================>]  4.608kB/4.608kB
a26f92334a9c: Loading layer [==================================================>]  75.92MB/75.92MB
de90b90d0715: Loading layer [==================================================>]  4.608kB/4.608kB
13d8fca176c6: Loading layer [==================================================>]  9.216kB/9.216kB
be0781510eef: Loading layer [==================================================>]  4.608kB/4.608kB
d7e644ce9f14: Loading layer [==================================================>]  3.072kB/3.072kB
47dd83bc99e4: Loading layer [==================================================>]  7.168kB/7.168kB
96e3e5ce2959: Loading layer [==================================================>]  12.29kB/12.29kB
Loaded image: jenkins/jenkins:latest

[root@lbaas01 ~]# docker images

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE

jenkins/jenkins     latest              51158f0cf7bc        6 days ago          701MB

Now I am able to start my Jenkins container on this offline CentOS 7 machine.

docker run -d -p 8080:8080 -p 50000:50000 -v jenkins_home:/var/jenkins_home jenkins/jenkins
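Once the container is up, Jenkins asks for an initial admin password on the first login; you can read it from inside the running container. A small sketch, where the container ID comes from docker ps:

# Find the Jenkins container ID, then print the generated initial admin password.
docker ps
docker exec <container-id> cat /var/jenkins_home/secrets/initialAdminPassword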

Wait for 2-3 minutes. Once the Jenkins container is fully running, I can log in to my Jenkins. :)

NSX-T Routing Path

In this blog, I will show you the routing path for different NSX-T Edge cluster deployment options.

  • The 1st is the simplest scenario: we have an Edge Cluster and there is no Tier-1 SR, so we only have the Tier-0 DR and Tier-0 SR running in the NSX Edge Cluster. In the routing path diagram, I used the orange line to show the northbound path and the dark green line to show the southbound path.

Pattern1

  • In the 2nd scenario, the Tier-1 vRouter includes a Tier-1 DR and a Tier-1 SR. Both the Tier-1 SR and the Tier-0 SR run in the same NSX Edge Cluster. This design provides NAT and firewall functions at the Tier-1 level via the Tier-1 SR. In the routing path diagram, I used the orange line to show the northbound path and the dark green line to show the southbound path.

Pattern2

 

  • In the 3rd scenario, we have 2 Edge clusters:
    • NSX-T T1 Edge Cluster: dedicated to the Tier-1 SR/SRs, which run centralized services (e.g. NAT);
    • NSX-T T0 Edge Cluster: dedicated to the Tier-0 SR/SRs, which provide uplink connectivity to the physical infrastructure.

This option gives better scalability and creates isolated service domains for Tier-0 and Tier-1. Similarly, I used the orange line to show the northbound path and the dark green line to show the southbound path in the diagram below:

 

Pattern3

Setup NSX L2VPN on Standalone Edge

With NSX L2VPN, you can extend your VLAN/VXLAN networks across multiple data centers. You can achieve this even in a non-NSX environment by using the standalone edge. In this blog, I will show you how to set up an NSX L2VPN between a standalone edge and an NSX edge.

Topology:

2018-06-27_162648

As shown above, we have 1 NSX edge as the L2VPN server and 1 standalone edge which resides in the remote DC, a non-NSX environment. Our goal is to stretch two VXLAN-backed networks (172.16.136.0/24 and 172.16.137.0/24) to 2 VLAN-backed networks (VLAN 100 and VLAN 200) in the remote DC via L2VPN. In addition, we will use 4 virtual machines to test the L2VPN communication.

2 virtual machines in NSX environment:

test1000: 172.16.136.100, gw 172.16.136.1, connected to VXLAN 10032;

test1002: 172.16.137.100, gw 172.16.137.1, connected to VXLAN 10033;

2 virtual machines in non-NSX environment:

test1001: 172.16.136.101, gw 172.16.136.1, connected to a dVS port-group with access VLAN 100;

test1003: 172.16.137.101, gw 172.16.137.1, connected to a dVS port-group with access VLAN 200;

Step 1: Configure NSX Edge as L2VPN Server

  • Create 2 sub-interfaces (sub100: 172.16.136.1/24 and sub200: 172.16.137.1/24), backed by two VXLANs, under the trunk port

L2VPN Server03

Two VXLAN sub-interfaces. Please note that the 1st sub-interface is mapped to vNic10 and the 2nd sub-interface is mapped to vNic11.

L2VPN Server04

Sub-interface sub100: tunnel Id 100/172.16.136.1 (VXLAN 10032)

L2VPN Server05

Sub-interface sub200: tunnel Id 200/172.16.137.1 (VXLAN 10033)

L2VPN Server06

  • L2VPN Server setting as below:
    • Listener IP: 172.16.133.1
    • Listener Port: 443
    • Encryption Algorithm: AES128-GCM-SHA256
    • Site Configuration:
      • name: remote
      • User Id/Password: admin/credential
      • Stretched Interfaces: sub100 and sub200

L2VPN Server01

L2VPN Server02

Step 2: Deploy and Setup L2VPN virtual appliance

Use the standard process for deploying a virtual appliance.

  • Start the deploy OVF template wizard

1.2

  • Select the standalone Edge OVF file, which was downloaded from vmware.com

1.3

1.4

  • Accept extra configuration options

1.5

  • Select name and folder

1.6

1.7

  • Select storage

1.8

  • Set up Networks: here we use one dVS port-group for the standalone trunk interface. We will provide more details on the settings of this port-group later.

1.9

  • Customize the template. We will configure the L2VPN client here as well.

The configuration includes multiple parts:

Part 1: standalone edge admin credentials:

1.10

Part 2: standalone edge network settings:

1.11

Part 3: L2VPN settings, which must exactly match the L2VPN server configuration from Step 1, including the cipher suite, the L2VPN server address/service port, and the L2VPN username/password for authentication:

1.12

Part 4: L2VPN sub-interfaces:

1.13.1

Part 5: other settings, e.g. a proxy if your standalone edge needs one to establish connectivity to the L2VPN server:

1.14

  • Accept all settings and submit the standalone edge deployment.

1.14.1

Once the standalone edge is deployed and powered on, you should be able to see that the L2VPN tunnel is up on either the NSX edge L2VPN server or the standalone edge via the CLI (show service l2vpn).

On NSX edge L2VPN server:

L2VPN up

On standalone edge:

l2vpn status_client

Step 3: Verification of communication

I simply use ping to verify the communication. My initial test failed: even though the L2VPN tunnel is up, you still need to configure the port group DPortGroup_ClientTrunk to support L2VPN. You don’t need to do the same on the NSX edge side, as it is configured automatically when you set up L2VPN on it.

  • VLAN trunking with VLAN100 and VLAN200

PG_ClientTrunk03

PG_ClientTrunk02

After completing the above configuration, you will be able to ping between all of the test virtual machines:

  • test1001 to test1000 (communication within 172.16.136.0/24 via L2VPN)

test01

  • test1003 to test1002 (communication for 172.16.137.0/24 via L2VPN)

test02

  • test1001 to test1003 (communication between 172.16.136.0/24 and 172.16.137.0/24 via L2VPN)

test03

You can check the MAC address to L2VPN mapping via the CLI “show service l2vpn bridge”.

show_service_l2vpn_bridge

You may have noticed the interface called na1 in the output above; it is the tunnel interface created on the NSX edge for L2VPN. You can find more details via “show interface na1”.

interface_na1

On the standalone edge (L2VPN client) end, you will find that 2 new vNICs (vNic_110 and vNic_210) are created for VLAN 100 and VLAN 200, just like vNic10 and vNic11 on the NSX Edge L2VPN server end.

L2VPN client new vNic

In addition, you can find an L2VPN tunnel interface, tap0, on the standalone edge.

l2vpn client trunk