New Ansible F5 HTTPS Health Monitor Module

I finally got time this weekend to test the newly released development version of the Ansible F5 HTTPS health monitor module. The test results look good: most common use cases are covered properly.

Below is the first playbook I used for testing:

# This version is to create a new https health monitor
---
- name: f5 config
  hosts:  lb.davidwzhang.com
  connection: local
  vars:
    ports:
      - 443
  tasks:
    - name: create https health monitor
      bigip_monitor_https:
        state:  "present"
        #state: "absent"
        name: "ansible-httpshealthmonitor"
        password: "password"
        server: "10.1.1.122"
        user: "admin"
        validate_certs: "no"
        send: "Get /cgi-bin/env.sh HTTP/1.1\r\nHost:192.168.72.28\r\nConnection: Close\r\n"
        receive: "web"
        interval: "3"
        timeout: "10"
      delegate_to:  localhost
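
To apply the playbook, I run it with ansible-playbook as usual (the playbook file name and inventory file below are only examples, not necessarily what I used):

ansible-playbook -i hosts f5_https_monitor.yml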

After running the playbook, I logged in to my F5 BIG-IP VE and confirmed that the HTTPS health monitor had been created successfully.
f5 https healthmonitor

Next, I created another HTTPS health monitor, this time with basic authentication (admin/password) and a customized alias address and alias service port (8443).
Playbook:

# This version is to create a new https health monitor with basic authentication
---
- name: f5 config
  hosts:  lb.davidwzhang.com
  connection: local
  vars:
    ports:
      - 443
  tasks:
    - name: create https health monitor
      bigip_monitor_https:
        state:  "present"
        #state: "absent"
        name: "ansible-httpshealthmonitor02"
        password: "password"
        server: "10.1.1.122"
        user: "admin"
        validate_certs: "no"
        ip: "192.168.72.128"
        port: "8443"
        send: "Get /cgi-bin/env.sh\r\n"
        receive: "200"
        interval: "3"
        timeout: "10"
        target_username: "admin"
        target_password: "password"
      delegate_to:  localhost

In F5, you can see the result below:
f5 https healthmonitor02

In addition, you may have noticed that I commented out a line in the above two playbooks:

#state: "absent"

You can use it to remove the health monitor.

vRA7.3 and NSX Integration: Network Security Data Collection Failure

We are building vRA 7.3. We added vCenter and NSX Manager as endpoints in vRA and associated the NSX Manager with vCenter. All compute resource data collection works well, but NSX (network and security) data collection fails:

As a result, in the vRA reservation we can only see the vSphere cluster and vDS port-groups/logical switches, but not transport zones or security groups/tags.

When checking the log, we see the following:

Workflow 'vSphereVCNSInventory' failed with the following exception:

One or more errors occurred.

Inner Exception: An error occurred while sending the request.

at System.Threading.Tasks.Task`1.GetResultCore(Boolean waitCompletionNotification)

at DynamicOps.VCNSModel.Interface.NSXClient.GetDatacenters()

at DynamicOps.VCNSModel.Activities.CollectDatacenters.Execute(CodeActivityContext context)

at System.Activities.CodeActivity.InternalExecute(ActivityInstance instance, ActivityExecutor executor, BookmarkManager bookmarkManager)

at System.Activities.Runtime.ActivityExecutor.ExecuteActivityWorkItem.ExecuteBody(ActivityExecutor executor, BookmarkManager bookmarkManager, Location resultLocation)

Inner Exception:

VCNS Workflow failure

I tried deleting the NSX endpoint and recreating it from vRA, but no luck. I also raised the issue in the VMware community but didn't get any really useful feedback.

After a few hours of investigation, I finally found a fix:

Run the "Create a NSX endpoint" workflow in vRO, as below:

2017-07-26_184701

Then I restarted the network & security data collection in vRA. Everything works, and I can now see all defined NSX transport zones, security groups and DLRs in the vRA network reservations.

Hope this fix can help others who have the same issue.

Perform Packet Capture on VMware ESXi Host for NSX Troubleshooting

VMware offers a powerful tool, pktcap-uw, for performing packet captures on an ESXi host.

Pktcap-uw offers a lot of options for packet capture.

https://kb.vmware.com/selfservice/microsites/search.do?language=en_US&cmd=displayKC&externalId=2051814

Here I show the options I use most in my daily work for your reference. I normally perform a packet capture based on either a vSwitch port ID or a DV filter (NSX DFW).

To do that, I first need to find the vSwitch port ID and the DV filter ID on the ESXi host so that I can reference them in the packet capture. I normally use the "summarize-dvfilter" command to find this information.

[root@esx4005:/tmp] summarize-dvfilter | grep -C 10 1314
slowPathID: none
filter source: Dynamic Filter Creation
vNic slot 1
name: nic-18417802-eth0-dvfilter-generic-vmware-swsec.1
agentName: dvfilter-generic-vmware-swsec
state: IOChain Attached
vmState: Detached
failurePolicy: failClosed
slowPathID: none
filter source: Alternate Opaque Channel
world 18444553 vmm0:auslslnxsd1314-113585a5-f6ed-4eb3-abd2-12083901e942 vcUuid:'11 35 85 a5 f6 ed 4e b3-ab d2 12 08 39 01 e9 42'
port 33554558 (vSwitch PortID) auslslnxsd1314-113585a5-f6ed-4eb3-abd2-12083901e942.eth0
vNic slot 2
name: nic-18444553-eth0-vmware-sfw.2 (DV Filter ID)
agentName: vmware-sfw
state: IOChain Attached
vmState: Detached
failurePolicy: failClosed
slowPathID: none
filter source: Dynamic Filter Creation
vNic slot 1
name: nic-18444553-eth0-dvfilter-generic-vmware-swsec.1

 

After I have the vSwitch port ID and DV filter ID, I can start my packet capture.

  • Packet capture to a VM based on vSwitch PortID

pktcap-uw --switchport 33554558 --dir 0 -o /tmp/from1314.pcap

  • Packet capture from a VM based on vSwitch PortID

pktcap-uw --switchport 33554558 --dir 1 -o /tmp/to1314.pcap

  • Packet capture from a VM based on DV filter

pktcap-uw --capture PreDVFilter --dvfilter nic-18444553-eth0-vmware-sfw.2 -o /tmp/1314v3.pcap

Below is a brief explanation of the parameters used above.

-o (output): save the capture as a packet capture file;

--dir (direction): 0 for traffic to the VM and 1 for traffic from the VM;

--capture PreDVFilter: perform the packet capture before the DFW rules are applied;

--capture PostDVFilter: perform the packet capture after the DFW rules are applied;

In addition, you can add a filter to your capture:

pktcap-uw --switchport 33554558 --tcpport 9000 --dir 1 -o /tmp/from1314.pcap
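
Similarly, you can filter on an IP address; for example (the source IP below is only an illustration, substitute the address you are interested in):

pktcap-uw --switchport 33554558 --srcip 192.168.72.128 -o /tmp/srcip.pcap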

I list all available filter options here for your reference:

--srcmac
The Ethernet source MAC address.
--dstmac
The Ethernet destination MAC address.
--mac
The Ethernet MAC address (src or dst).
--ethtype
The Ethernet type. HEX format.
--vlan
The Ethernet VLAN ID.
--srcip
The source IP address.
--dstip
The destination IP address.
--ip
The IP address (src or dst).
--proto 0x
The IP protocol.
--srcport
The TCP source port.
--dstport
The TCP destination port.
--tcpport
The TCP port (src or dst).
--vxlan
The VXLAN ID of the flow.

Update:

Start two captures at the same time:

pktcap-uw --switchport 50331665 -o /tmp/50331665.pcap & pktcap-uw --uplink vmnic2 -o /tmp/vmnic2.pcap &

Stop all packet capture:

kill $(lsof | grep pktcap-uw | awk '{print $1}' | sort -u)

Of course, you can also perform some basic packet capture from NSX Manager via the Central CLI. If you are interested, please refer to another of my blog posts:

https://davidwzhang.com/2017/01/07/limitation-of-nsx-central-cli-packet-capture/

NSX IPSec Throughput in IBM Softlayer

To understand the real throughput capacity of NSX IPSec in Softlayer, I built a quick IPSec performance testing environment.

Below is the network topology of my testing environment:

NSX_IPSec_Performance_Topology

NSX version: 6.2.4
NSX Edge: X-Large (6 vCPUs and 8 GB memory), which is the largest size NSX offers. All Edges in this testing environment reside in the same vSphere cluster, which includes 3 ESXi hosts. Each ESXi host has 64 GB DDR4 memory and 2 processors (2.4 GHz Intel Xeon-Haswell (E5-2620-V3-HexCore)).
IPerf Client: Redhat 7.1 (2 vCPUs and 4GB Memory)
IPerf Server: Redhat 7.1 (2 vCPUs and 4GB Memory)
IPerf version: IPerf3

Two IPsec tunnels are built as shown in the above diagram. The IPsec settings are:

  • Encryption: AES-GCM
  • Diffie-Hellman Group: DH5
  • PFS (Perfect Forward Secrecy): Enabled
  • AES-NI: Enabled
I include 3 test cases in my testing:
Test1_Bandwidth_Utilisation
  • Test Case 2: 2 IPerf Clients (172.16.31.0/24) to 2 IPerf Servers (172.16.38.0/24) via 1 IPsec Tunnel. Result: around 1.6-2.3Gbit/s in total
Test2_Bandwidth_Utilisation
Test3_Bandwidth_Utilisation
Please note:
  1. Firewall function on NSX Edge is disabled in all test cases.
  2. TCP traffic is used in all 3 test cases. 10 parallel streams are used on each iPerf client to push the throughput test to its maximum (see the example iperf3 command after this list).
  3. I didn't see any CPU or memory contention in any test case: the CPU utilisation of the NSX Edge was less than 40% and memory utilisation was nearly zero.
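
For reference, the client-side command looks something like the following (the server address is only an example from the 172.16.38.0/24 range; -P 10 runs 10 parallel streams and -t sets the test duration in seconds):

iperf3 -c 172.16.38.10 -P 10 -t 60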

CPU_Mem

Simple Python Script Creating a Dynamic Membership Security Group

In this blog, I develop a very simple Python script to create an NSX security group whose membership is based on a security tag. Please note this script only shows the basics and is not ready for a production environment.

Two Python functions are included in this script:

  1. create_tag is used to create an NSX security tag;
  2. create_sg is used to create a security group and define a criterion that adds all virtual machines tagged with the specified security tag into the newly created security group.
import requests
from base64 import b64encode
import getpass
username=raw_input('Enter Your NSXManager Username: ')
yourpass = getpass.getpass('Enter Your NSXManager Password: ')
sg_name=raw_input('Enter Security Group Name: ')
vm_tag=raw_input('Enter Tag Name: ')
userandpass=username+":"+yourpass
userpass = b64encode(userandpass).decode("ascii")
auth ="Basic " + userpass
payload_tag="<securityTag>\r\n<objectTypeName>SecurityTag</objectTypeName>\r\n<type>\r\n<typeName>SecurityTag</typeName>\r\n</type>\r\n<name>"+vm_tag+"</name>\r\n<isUniversal>false</isUniversal>\r\n<description>This tage is created by API</description>\r\n<extendedAttributes></extendedAttributes>\r\n</securityTag>"
payload_sg= "<securitygroup>\r\n <objectId></objectId>\r\n <objectTypeName>SecurityGroup</objectTypeName>\r\n <type>\r\n <typeName>SecurityGroup</typeName>\r\n </type>\r\n <description></description>\r\n <name>"+sg_name+"</name>\r\n <revision>0</revision>\r\n<dynamicMemberDefinition>\r\n <dynamicSet>\r\n <operator>OR</operator>\r\n <dynamicCriteria>\r\n <operator>OR</operator>\r\n <key>VM.SECURITY_TAG</key>\r\n <criteria>contains</criteria>\r\n <value>"+vm_tag+"</value>\r\n </dynamicCriteria>\r\n </dynamicSet>\r\n</dynamicMemberDefinition>\r\n</securitygroup>"

def create_tag():
        try:
                response = requests.post(
                url="https://NSX-Manager-IP/api/2.0/services/securitytags/tag",
                verify=False,
                headers={
                        "Authorization": auth,
                        "Content-Type": "application/xml",
                    },
                data=payload_tag
                    )
                print('Response HTTP Status Code: {status_code}'.format(status_code=response.status_code))
                #print('Response HTTP Response Body: {content}'.format(content=response.content))
                if response.status_code == 403:
                        print "***********************************************************************"
                        print "WARNING: your username or password is wrong, please retry again!"
                        print "***********************************************************************"
                if  response.status_code == 201:
                        print "***********************************************************************"
                        print('Response HTTP Response Body: {content}'.format(content=response.content))
                api_response=response.text
                print api_response
        except requests.exceptions.RequestException:
                print('HTTP Request failed')

def create_sg():
        try:
                response = requests.post(
                url="https://NSX-Manager-IP/api/2.0/services/securitygroup/bulk/globalroot-0",
                verify=False,
                headers={
                        "Authorization": auth,
                        "Content-Type": "application/xml",
                    },
                data=payload_sg
                    )
                print('Response HTTP Status Code: {status_code}'.format(status_code=response.status_code))
                #print('Response HTTP Response Body: {content}'.format(content=response.content))
                if response.status_code == 403:
                        print "***********************************************************************"
                        print "WARNING: your username or password is wrong, please retry again!"
                        print "***********************************************************************"
                if  response.status_code == 201:
                        print "***********************************************************************"
                        print('Response HTTP Response Body: {content}'.format(content=response.content))
                api_response=response.text
                print api_response
        except requests.exceptions.RequestException:
                print('HTTP Request failed')

# Call the two functions: create the security tag first, then the security group
# that references it (this matches the sample run output below).
create_tag()
create_sg()

Run this script in our O-Dev:

[root]$ python create_sg_dynamic_member_20170429.py

Enter Your NSXManager Username: admin

Enter Your NSXManager Password:

Enter Security Group Name: sg_app1_web

Enter Tag Name: tag_app1_web

Response HTTP Status Code: 201

***********************************************************************

Response HTTP Response Body: securitytag-14

securitytag-14

Response HTTP Status Code: 201

***********************************************************************

Response HTTP Response Body: securitygroup-485

securitygroup-485

In NSX Manager, we can see a security group sg_app1_web is created, as below:

2017-04-30_140657

And its dynamic membership criterion is:

2017-04-30_140729
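
You could also verify the newly created objects through the NSX API instead of the UI, for example with curl (the object ID comes from the script output above; replace NSX-Manager-IP and the credentials with your own):

curl -k -u admin:password https://NSX-Manager-IP/api/2.0/services/securitygroup/securitygroup-485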

Automate F5 GSLB with Ansible

F5 BIG-IP Global Traffic Manager (GTM) provides tiered global server load balancing (GSLB). BIG-IP GTM distributes DNS name resolution requests, first to the best available pool in a wide IP, and then to the best available virtual server within that pool. GTM selects the best available resource using either a static or a dynamic load balancing method. Using a static load balancing method, BIG-IP GTM selects a resource based on a pre-defined pattern. Using a dynamic load balancing method, BIG-IP GTM selects a resource based on current performance metrics collected by the big3d agents running in each data center.

So the F5 GSLB configuration logic for a DNS record is as below:

  • Define a Data Center, e.g. “SL-SYD-Site1”;
  • Define a server which can be F5 LTM or any other kind of local load balancer or host;

GTM Server Type

  • Create virtual servers if you don't use F5 BIG-IP LTM or you don't use the "Virtual Server Discovery" feature for your F5 BIG-IP LTM;
  • Create GTM pool(s), using the virtual servers as members of the newly created pool(s);
  • Create a wide IP which points to the GTM pool(s) defined in the previous step. Note: the F5 module in Ansible 2.3 still doesn't support associating a GTM pool with a wide IP.

Unlike F5 BIG-IP LTM, the Ansible F5 modules don't support F5 BIG-IP GTM very well. The known limitations of automating F5 GSLB configuration with Ansible 2.3 include:

  1. No support for setting up a server (luckily, if you are using F5 BIG-IP LTM, this is a one-off task: you only need to perform it once for each LTM);
  2. No support for adding pool members when you create a GTM pool;
  3. No support for adding a pool when you create a wide IP;
  4. No support for health monitors when you create a GTM virtual server or GTM pool.

To accommodate the above limitations, I pre-defined an F5 LTM server called "myLTM".

2017-04-25_164141

After running my Ansible playbook, I manually added the pool member to the newly created GTM pool and added the GTM pool to the wide IP as well.

GTMPool_Adding_Member

GTM_WideIP_Pool

My playbook YAML file:

- name: f5 config
  hosts: lb.davidwzhang.com
  connection: local
  tasks:
    - name: create a GTM DC SL-SYD-Site1
      bigip_gtm_datacenter:
        password: "password"
        server: "10.1.1.122"
        user: "admin"
        name: "SL-SYD-Site1"
        validate_certs: "no"
      delegate_to: localhost

    - name: create a virtual server myVIP
      bigip_gtm_virtual_server:
        password: "password"
        server: "10.1.1.122"
        user: "admin"
        virtual_server_name: "myVIP"
        virtual_server_server: "myLTM"
        validate_certs: "no"
        port: "80"
        address: "192.168.72.199"
        state: "present"
      delegate_to: localhost

    - name: create GTM pool mypool
      bigip_gtm_pool:
        server: "10.1.1.122"
        user: "admin"
        password: "password"
        name: "mypool"
        state: "present"
        type: "a"
        validate_certs: "no"
      delegate_to: localhost

    - name: create a wideip w3.davidwzhang.com
      bigip_gtm_wide_ip:
        server: "10.1.1.122"
        user: "admin"
        password: "password"
        lb_method: "round_robin"
        name: "w3.davidwzhang.com"
        type: "a"
        state: "present"
        validate_certs: "no"
      delegate_to: localhost
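
I then run the playbook in the usual way (the file name below is only an example):

ansible-playbook f5_gslb.yml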

Ansible Playbook Output:

GTM DC

GTM_DC

GTM Virtual Server

GTM_Server_VS

GTM Pool

GTM_Pool

GTM WideIP

GTM_WideIP

NSlookup for wideip: w3.davidwzhang.com

GTM_Nslookup
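
If you want to reproduce this check from a client, the query is something like the following, pointed at the GTM listener address (the listener IP below is only a placeholder):

nslookup w3.davidwzhang.com 10.1.1.123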

Automate F5 LTM with Ansible

Ansible includes F5 modules in its extras network modules, which can help provide LBaaS using an Infrastructure-as-Code approach.

Like other Ansible modules, the Ansible F5 modules are installed in the /usr/lib/python2.7/site-packages/ansible/modules/extras/network directory.

[dzhang@localhost network]$ pwd
/usr/lib/python2.7/site-packages/ansible/modules/extras/network
[dzhang@localhost network]$ ls -al
total 512
drwxr-xr-x. 9 root root 4096 Jan 30 03:17 .
drwxr-xr-x. 20 root root 4096 Jan 30 03:17 ..
drwxr-xr-x. 2 root root 4096 Jan 30 03:17 a10
drwxr-xr-x. 2 root root 4096 Jan 30 03:17 asa
drwxr-xr-x. 2 root root 4096 Jan 30 03:17 citrix
-rw-r--r--. 1 root root 24673 Jan 16 08:48 cloudflare_dns.py
-rw-r--r--. 2 root root 18915 Jan 16 14:36 cloudflare_dns.pyc
-rw-r--r--. 2 root root 18915 Jan 16 14:36 cloudflare_dns.pyo
-rw-r--r--. 1 root root 11833 Jan 16 08:48 dnsimple.py
-rw-r--r--. 2 root root 8642 Jan 16 14:36 dnsimple.pyc
-rw-r--r--. 2 root root 8642 Jan 16 14:36 dnsimple.pyo
-rw-r--r--. 1 root root 13723 Jan 16 08:48 dnsmadeeasy.py
-rw-r--r--. 2 root root 12410 Jan 16 14:36 dnsmadeeasy.pyc
-rw-r--r--. 2 root root 12410 Jan 16 14:36 dnsmadeeasy.pyo
drwxr-xr-x. 2 root root 4096 Jan 30 03:17 exoscale
drwxr-xr-x. 2 root root 4096 Jan 30 03:17 f5
-rw-r--r--. 1 root root 13404 Jan 16 08:48 haproxy.py
-rw-r--r--. 2 root root 13504 Jan 16 14:36 haproxy.pyc
-rw-r--r--. 2 root root 13504 Jan 16 14:36 haproxy.pyo
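
You can check the options each F5 module supports with ansible-doc, for example:

ansible-doc bigip_pool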

The Ansible playbook below is to:

  • Create 2 web servers as F5 LTM nodes;
  • Create an F5 LTM pool;
  • Add the 2 web servers as members of the F5 LTM pool created in Step 2;
  • Create an F5 LTM vIP and associate the F5 LTM pool created in Step 2 with this newly created vIP.

[dzhang@localhost f5]$ cat f5-v0.4.yml

- name: f5 config
  hosts: lb.davidwzhang.com
  connection: local

  tasks:
    - name: create a pool
      bigip_pool:
        lb_method: "ratio_member"
        name: "web"
        password: "password"
        server: "10.1.1.122"
        user: "admin"
        validate_certs: "no"
        monitors:
          - /Common/http
      delegate_to: localhost

    - name: create a node1
      bigip_node:
        host: "192.168.72.168"
        name: "node-1"
        password: "password"
        server: "10.1.1.122"
        user: "admin"
        monitors:
          - /Common/icmp
        validate_certs: "no"
      delegate_to: localhost

    - name: create a node2
      bigip_node:
        host: "192.168.72.128"
        name: "node-2"
        password: "password"
        server: "10.1.1.122"
        user: "admin"
        monitors:
          - /Common/icmp
        validate_certs: "no"
      delegate_to: localhost

    - name: add a node to pool
      bigip_pool_member:
        description: "webservers"
        host: "{{ item.host }}"
        name: "{{ item.name }}"
        password: "password"
        pool: "web"
        port: "80"
        connection_limit: "0"
        monitor_state: "enabled"
        server: "10.1.1.122"
        user: "admin"
        validate_certs: "no"
      delegate_to: localhost
      with_items:
        - host: "192.168.72.168"
          name: "node-1"
        - host: "192.168.72.128"
          name: "node-2"

    - name: create a VIP
      bigip_virtual_server:
        description: "ansible-vip"
        destination: "192.168.72.199"
        name: "ansible-vip"
        pool: "web"
        port: "80"
        snat: "Automap"
        password: "password"
        server: "10.1.1.122"
        user: "admin"
        all_profiles:
          - "http"
        validate_certs: "no"
      delegate_to: localhost

Run the playbook:

[dzhang@localhost f5]$ ansible-playbook f5-v0.4.yml
/usr/lib64/python2.7/site-packages/cffi/model.py:525: UserWarning: 'point_conversion_form_t' has no values explicitly defined; guessing that it is equivalent to 'unsigned int'
% self._get_c_name())

PLAY [f5 config] ***************************************************************

TASK [setup] *******************************************************************
ok: [lb.davidwzhang.com]

TASK [create a pool] ***********************************************************
ok: [lb.davidwzhang.com -> localhost]

TASK [create a node1] **********************************************************
ok: [lb.davidwzhang.com -> localhost]

TASK [create a node2] **********************************************************
ok: [lb.davidwzhang.com -> localhost]

TASK [add a node to pool] ******************************************************
ok: [lb.davidwzhang.com -> localhost] => (item={u'host': u'192.168.72.168', u'name': u'node-1'})
ok: [lb.davidwzhang.com -> localhost] => (item={u'host': u'192.168.72.128', u'name': u'node-2'})

TASK [create a VIP] ************************************************************
ok: [lb.davidwzhang.com -> localhost]

PLAY RECAP *********************************************************************
lb.davidwzhang.com : ok=6 changed=0 unreachable=0 failed=0

In F5 LTM, you can verify the configuration:

LTM Node:

NodeList

LTM Pool

Pool1

Pool2

LTM vIP

vIP