Disaster Recovery as a Service: Ten steps to success


Disaster recovery is becoming top of mind for many CIOs. Understanding the success criteria to make the disaster recovery journey of your own organization smooth and successful is critical, but the path to getting there can be difficult.
Follow the ten key steps below to guide you on the right path to success.

  1. Understand why disaster recovery is important to your business, and what your specific disaster recovery requirements are.

The first key step is understanding why you are looking for a disaster recovery solution for your business, and what your requirements are, both from a disaster recovery perspective and for the solution you need. Running a Business Impact Analysis (BIA) will help you quantify the impact of a disruption to your business, and will also expose the effect of such a disruption on your reputation, including the effect of any loss of data or loss of staff. The BIA is very much the building block and foundation of your disaster recovery planning, and knowing the business impact of outages is probably the most important aspect in answering the “why” question. Knowing the business impact will not only drive the Service Level Agreements (SLAs) for the business processes; it will also help shape a disaster recovery plan that minimises prolonged outages, including those caused by human error during the recovery process. If these aspects are missing and haven’t been thought through yet, then running a Business Impact Analysis should be the first thing that you do, and it will put you in good stead as you move forward.
An additional aspect of the disaster recovery process is to understand your Recovery Point Objective (RPO) and Recovery Time Objective (RTO). From an SLA perspective, think about the amount of downtime and data loss your business can incur. Zero data loss is obviously ideal, but this can exponentially drive up the cost of the solution. Setting a realistic limit to the data loss your business can incur, based on each business service, is more practical. The downtime and data loss windows will translate to your RTOs and RPOs respectively.
Additionally, does your business require adherence to any regulatory compliance or operating rules? For example, do you need to provide proof of a quarterly or yearly disaster recovery test? Disaster recovery testing is important, and there are a lot of factors to take into consideration here. What kind of replication technology would you choose: expensive hardware-based replication, host-based replication, or even replication to the cloud? What you choose is based on various factors including cost, business policies, SLA requirements and, importantly, environmental factors. For instance, if your data center is located in an area which gets affected by floods, then your disaster recovery location needs to be in a separate geographic area or even in the cloud.

  2. Should you build your own or buy off the shelf?

The next step is driven by how much investment you want to make, either operationally or in capital expenditure. You have probably already invested quite heavily in infrastructure at your primary data center location: things such as server hardware, virtualization technologies and storage. You could take a simple approach and invest in another physical data center for disaster recovery, but this would mean not only doubling your software and hardware infrastructure costs but also paying for an additional physical location. A more savvy approach would be to utilize a vendor to supply disaster recovery services at a fraction of the cost of running dual locations. Keep in mind that choosing the right vendor is important too. You will want to look for a leader in the managed disaster recovery services space that has years of credible experience.

  3. Understand the difference between disaster recovery as-a-service and backup and recovery as-a-service.

Understand that disaster recovery and backup are different ball games. While backup is a necessary part of a business continuity strategy, it lends itself to SLAs of hours to days. On the other hand, disaster recovery is better suited to SLA requirements in minutes to hours. Based on the business uptime and data loss requirements specific to a business service, your business would deploy a disaster recovery solution for your business-critical applications, while backup would be sufficient for those non-critical business services which can take some downtime.  Choose a disaster recovery as-a-service solution that can protect your entire estate or at least the critical elements of it that drive your business. This includes physical and virtual systems, as well as the mix of different OSs that typically are run within enterprise businesses today. The disaster recovery as-a-service solution that you choose should also be able to provide you with the ability to run your systems within their cloud location for a period of time, until you can get your infrastructure back up and running and transfer services back to your primary site.

  4. Choose the right Cloud Hypervisor.

It may seem like an easy decision to make: you would seek a vendor that runs the same hypervisor on the back end as you run on your primary site, but keep in mind this is not a necessity. If you are using VMware vSphere or Microsoft Hyper-V, then running these types of hypervisors in the cloud is going to incur some additional licensing costs in a DR solution. Another thing to think about is whether you really need all the bells and whistles when you’ve invoked disaster recovery. Most of your time is going to be taken up with getting services up and running back at your own location as quickly as possible, so maybe not. What you basically need is a hypervisor to host your systems that provides the basic performance, scale and resilience you require. A more cost-efficient stance would be to utilise a KVM-based hypervisor running within OpenStack. This ticks the boxes in terms of being enterprise ready and, best of all, the service costs should yield a better ROI than those running proprietary hypervisor technologies, saving your business considerable money.

  5. Plan for all business services that need to be protected, including multi-tier services

Now we’re getting down to the nitty-gritty details. The business services that need to be protected will be primarily driven by the SLAs that brought you down this path. Make sure that you capture all the operating system types that these business services run on, and also think about how you handle any physical systems that have not yet been virtualized. Moving virtualized applications to the cloud is an easy process, as these are already encapsulated by the hypervisor in use. But pure physical business applications are another matter altogether. It is not impossible to move physical application data to the cloud, but when it comes to a failback scenario, if the service you select does not have this capability then you are a sitting duck. This is especially important to keep in mind in the case where a complete outage has occurred and a rebuild is needed. Another thing to think about is what happens when your business services or applications are started in the cloud: can you start or stop these systems in a particular order if a business service is made up of different processes, such as a multi-tier application, and also inject manual steps within your failover plan if required? Controlling multi-tier business applications that span across systems is going to be a high priority, not only while invoking disaster recovery but also when you’re performing a disaster recovery test.
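As a simple illustration of why start order and manual checkpoints matter, here is a minimal sketch of a tier-ordered failover or rehearsal sequence for a hypothetical three-tier service. The host names, ports and start mechanism are placeholders, not part of any particular DRaaS product.

#!/bin/bash
# Hypothetical tier-ordered start sequence for a failover rehearsal.
# Replace the echo lines with whatever your DR tooling uses to power on each tier.
wait_for_port() {                      # poll until a TCP port answers, or give up
  local host=$1 port=$2 tries=60
  until nc -z "$host" "$port"; do
    ((tries--)) || { echo "timed out waiting for $host:$port"; exit 1; }
    sleep 10
  done
}
echo "Starting database tier (db01)"       # database tier first
wait_for_port db01 1433
echo "Starting application tier (app01)"   # application tier once the DB answers
wait_for_port app01 8080
read -p "Manual check: application healthy? Press Enter to continue "   # injected manual step
echo "Starting web tier (web01, web02)"    # web tier last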

  6. Plan for your RTOs, RPOs, Bandwidth, Latency and IOPs

Understand how you can achieve your Recovery Point Objective (RPO) and Recovery Time Objective (RTO), as well as the IO load of your virtual machines and the peaky nature of writes through the business day; this data will help you work out what your required WAN bandwidth should be. Determine whether your disaster recovery service vendor can guarantee these RTOs and RPOs, because every additional minute or hour that your business is down, as quantified by the Business Impact Analysis, is going to cost you. If you aim for an RPO of 15 minutes or less, then your bandwidth to the cloud needs to be big enough to cope with extended periods of heavy IO within your systems. If your RTO is something like 4 hours, then you need to know whether your systems can recover within that time period, keeping in mind that other operations also need to be managed, such as DNS and AD/LDAP updates, plus any additional infrastructure services that your business needs.
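As a rough back-of-the-envelope example (the figures below are purely illustrative, not from any vendor sizing guide): if the busiest 15-minute window of the day produces around 20 GB of changed data across the protected systems, a 15-minute RPO means the replication link has to drain that window before the next one fills up.

# 20 GB changed in the peak 15-minute window, expressed as a sustained rate in Mbit/s
echo $(( 20 * 1024 * 8 / (15 * 60) ))   # ~182 Mbit/s, before protocol overhead and retransmits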

  7. Avoid vendor lock-in while moving data to the cloud

Understanding how your data will be sent to the cloud provider site is important. A solution that employs VMware vSphere on-premises and in the cloud limits you to a replication solution that works only for virtualized systems with no choice of protecting physical OS systems. This may seem acceptable at the time, but you will be locked into this solution and switching DR providers in the future may be difficult.  Seeking a solution that is flexible and can protect all types of major virtualization platforms as well as physical OS gives you the flexibility of choice for the future.

  8. Run successful disaster recovery rehearsals without unexpected costs

Rehearsals or exercises are probably the most important aspect of any disaster recovery solution. Not having an automated disaster recovery rehearsal process that you test on a regular basis can leave your business vulnerable. Your recovery rehearsals should not affect your running production environment. Any rehearsal system should run in parallel albeit within a separate network VLAN, but still have some type of access to infrastructure services such as AD, LDAP and DNS etc. so that full disaster recovery testing can be carried out. Once testing is complete, it is essential that the solution include a provision to easily remove and clean up the rehearsal processes.

  9. How long can you stay in the cloud?

For a moment let’s imagine that the unthinkable has happened, and you have invoked disaster recovery to your cloud service provider. The nature of the outage at your primary location will dictate the length of time you will need to keep your business applications running on your service provider’s infrastructure. It is imperative that you are aware of any clauses within your contract that pertain to the length of time you can keep your business running on the cloud provider’s site. There is also a big pull to get enterprises to think about running in the cloud and staying there, but this is a big decision to make. Performance of the systems is going to be one metric to measure, as is performance of storage, or more precisely the quality of service of the storage that the cloud vendor will provide. On the whole, it makes sense to get back onto your own infrastructure as quickly as possible, since it is custom built to support your business.

  10. How easy is it to failback business services to your own site?

Getting your data back, or reversing the replication data path, is going to be important, especially as you don’t want to affect your running systems within the cloud by injecting more downtime! Rebuilding your infrastructure is one aspect that needs to be meticulously planned, and any assistance that the solution itself can provide to make this process smoother is a bonus. Your on-premises location is going to need a full re-sync of data from the cloud location, which may take some time, so the solution should be able to handle a two-step approach to failback: the re-sync should happen in one operation and, once complete, the process to switch back your systems can be done at a time that suits your business.
Success, you’re now armed to create a robust business continuity plan.
Follow the steps above to gain an understanding of what’s needed to be successful on your disaster recovery as a service journey, and use them as checkpoints while developing your own robust business continuity plan.

VMware Integrated OpenStack 2.0 set for release before the end of Q3 2015

It’s been just six months since VMware released version 1.0 of VMware Integrated OpenStack for general availability, and now the next release is expected to be available for download before the end of Q3 2015. Here’s what’s new in the 2.0 release:

  • Kilo-based: VMware Integrated OpenStack 2.0 will be based on OpenStack Kilo release, making it current with upstream OpenStack code.
  • Seamless OpenStack Upgrade: VMware Integrated OpenStack 2.0 will introduce an industry-first seamless upgrade capability between OpenStack releases. Customers will now be able to upgrade from V1.0 (Icehouse) to V2.0 (Kilo), and even roll back if anything goes wrong, in a more operationally efficient manner.
  • Additional Language Support: VMware Integrated OpenStack 2.0 will now be available in six more languages: German, French, Traditional Chinese, Simplified Chinese, Japanese and Korean.
  • LBaaS: Load Balancing as a Service will be available, supported through VMware NSX.
  • Ceilometer Support: VMware Integrated OpenStack 2.0 will now support Ceilometer with MongoDB as the backend database.
  • App-Level Auto Scaling using Heat: Auto Scaling will enable users to set up metrics that scale up or down application components. This will enable development teams to address unpredictable changes in demand for the app services. Ceilometer will provide the alarms and triggers, Heat will orchestrate the creation (or deletion) of scale out components and LBaaS will provide load balancing for the scale out components.
  • Backup and Restore: VMware Integrated OpenStack 2.0 will include the ability to backup and restore OpenStack services and configuration data.
  • Advanced vSphere Integration: VMware Integrated OpenStack 2.0 will expose vSphere Windows Guest Customization. VMware admins will be able to specify various attributes such as ability to generate new SIDs, assign admin passwords for the VM, manage compute names etc. There will also be added support for more granular placement of VMs by leveraging vSphere features such as affinity and anti-affinity settings.
  • Qcow2 Image Support: VMware Integrated OpenStack 2.0 will support the popular qcow2 Virtual Machine image format.
  • Available through our vCloud Air Network Partners: Customers will be able to use OpenStack on top of VMware through any of the service providers in our vCloud Air Network.

OpenStack Juno - RDO Packstack deployment to an external network & config via Neutron

OpenStack is a solution that I have briefly been following over the past couple of years or so, but I never really had enough time to give it the focus it probably deserves. The current project I am working on has an element of interaction with OpenStack, so it seems a great opportunity to gain some in-depth, hands-on experience, giving me greater insight into how the various OpenStack components click together and the level of interaction required with existing environments.
Having already built a fairly solid VMware and Hyper-V lab environment meant that I wasn’t going to crash and burn what I already have; I needed to shoehorn an OpenStack deployment into the lab environment, utilizing the existing network and storage facilities already available. This blog post will endeavor to lay out the steps required to take an OpenStack deployment from start to operational build, and go over some of the hurdles I encountered along the way. As some background, my existing lab uses a typical 192.168.1.0/24 range of IPv4 addresses and also has a router to the outside world at 192.168.1.254. If your lab’s the same then it’s just a matter of running the commands; if not, modify the address ranges to suit yours.
So many flavors to choose from.
Before I go into the steps, I also wanted to highlight some of the hurdles I encountered in building the OpenStack deployment. The first question I asked myself was which distribution to choose to build the environment; initially I reviewed the OpenStack docs to see the process of building the latest Juno release. Ubuntu and CentOS seemed like the most common distributions in use, and I went for Ubuntu first because of the DevStack deployment process, which a friend of mine suggested I check out. The docs surrounding DevStack (http://docs.openstack.org/developer/devstack/) are good, but not so straightforward, as it wasn’t clear exactly which files needed creating or modifying to build the environment. For example, it wasn’t clear whether you needed to create the configuration file (local.conf or localrc) to get the options you need installed and configured. After a couple of attempts I did get a working environment going, but initially it was a basic Nova networking setup only; once I found the correct way to configure the local.conf file I got Neutron installed, although configuring it was another matter. I had many late nights trying to get a working environment but eventually gave up on it.
After ditching the Ubuntu build I then looked at building with CentOS; having used Red Hat for many years it did feel much more comfortable. I carried out some research on the options with CentOS and went for an automated installation process using RDO (https://www.rdoproject.org/Main_Page), a community project for Red Hat, Fedora and CentOS deployments, supported by users of the community. One thing I have found with both DevStack and RDO is that the information is out there, but it is spread all over the place and not all sites have up-to-date information; for example, some still focus on Havana or Icehouse and not many have info on Juno. Hopefully this guide will bring the installation steps into a single document which will help you.
Building out the OpenStack environment following steps 1 to 27
Below are the steps I have created which will build out an OpenStack deployment of Juno on a single VM or physical system based on CentOS 7. It will use Neutron and connect to the existing external lab network of 192.168.1.0/24. The OpenStack VM will have an IP of 192.168.1.150, which we will configure as a bridge. We will create a new network for the OpenStack instances which will use a private IP pool of 10.0.0.0/24 and floating IPs on 192.168.1.0/24; we will create a floating IP range of 192.168.1.201-192.168.1.220 so that I have 20 IPs available for instances if needed.
I will use vSphere 6, but really vSphere v5.x would be OK too. My vSphere servers can run nested virtualization, which is ideal as I can create a snapshot and revert to it if anything fails.
1.      Create a new VM. For my requirements I have created a VM with 16 GB of RAM, which is enough to run a couple of instances alongside OpenStack. It has a 20 GB boot disk, and I also added another 100 GB disk which I will use for Cinder (block storage). I have also attached two virtual network cards, both directly connected to the main network.
2.     Install CentOS 7.0 on the VM or physical system; I have used CentOS-7.0-1406-x86_64-Minimal.iso for my build. Install the OS, providing the configuration inputs as requested by the install process.
3.     Some additional housekeeping I do on the image is to rename the enoXXXXXXXX network devices to eth0 and eth1; I’m a bit old school with device naming.
Modify /etc/default/grub and append ‘net.ifnames=0 biosdevname=0‘ to the GRUB_CMDLINE_LINUX= statement.

# vi /etc/default/grub
GRUB_CMDLINE_LINUX="rd.lvm.lv=rootvg/usrlv rd.lvm.lv=rootvg/swaplv crashkernel=auto vconsole.keymap=us rd.lvm.lv=rootvg/rootlv vconsole.font=latarcyrheb-sun16 rhgb quiet net.ifnames=0 biosdevname=0"

4.     Next, regenerate the grub config

# grub2-mkconfig -o /boot/grub2/grub.cfg

5.     Rename the config files for both eno devices

# mv /etc/sysconfig/network-scripts/ifcfg-eno16777736 /etc/sysconfig/network-scripts/ifcfg-eth0

6.     Repeat for eth1

# mv /etc/sysconfig/network-scripts/ifcfg-eno32111211 /etc/sysconfig/network-scripts/ifcfg-eth1

7.      Reboot so that the changes take effect.

# reboot

The RDO Install process
8.     Bring the CentOS OS up to date

# yum update -y

9.     Relax SELinux a little; this is a lab environment so we can loosen the security a bit

# vi /etc/selinux/config
# change SELINUX=enforcing to:
SELINUX=permissive

10.  Install the EPEL repository

# yum install epel-release -y

11.   Modify the EPEL repo and enable the core, debuginfo and source sections.

# vi /etc/yum.repos.d/epel.repo
[epel]
enabled=1
[epel-debuginfo]
enabled=1
[epel-source]
enabled=1

12.   Install net tools

# yum install net-tools -y

13.   Install the RDO release

# yum install -y http://rdo.fedorapeople.org/rdo-release.rpm

14.   Install openstack packstack

# yum install -y openstack-packstack

15.   Install openvswitch

# yum install openvswitch -y

16.   Final update

# yum update -y

Cinder volume preparation
17.   Install lvm2

# yum install lvm2 -y

18. Build out using the Packstack puppet process

# packstack --allinone --provision-all-in-one-ovs-bridge=n

19.  Remove the 20 GB loopback file created by the Packstack install and create a new cinder-volumes volume group on the 100 GB virtual disk

# vgremove cinder-volumes
# fdisk /dev/sdb
# pvcreate /dev/sdb
# vgcreate cinder-volumes /dev/sdb
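
If you want to sanity-check the change before moving on, the standard LVM tools should now show the cinder-volumes group sitting on the 100 GB disk:

# vgs cinder-volumes
# pvs /dev/sdb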

UPDATE
Instead of the original changes to the br-ex bridge configuration, I have found a simpler method: use eth1 as the NIC attached to the OVS switch. Just remember, if the server is rebooted, to check that eth1 is still connected to the br-ex port group.
20. Add eth1 to the openvswitch br-ex ports

# ovs-vsctl add-port br-ex eth1
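
You can confirm the port is attached (and re-check it after any reboot, as noted above) with:

# ovs-vsctl list-ports br-ex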

Change network configuration for /etc/sysconfig/network-scripts/ifcfg-br-ex & /etc/sysconfig/network-scripts/ifcfg-eth1

# vi /etc/sysconfig/network-scripts/ifcfg-br-ex
DEVICE=br-ex
DEVICETYPE=ovs
TYPE=OVSBridge
BOOTPROTO=static
IPADDR=192.168.1.150
NETMASK=255.255.255.0
GATEWAY=192.168.1.254
DNS1=192.168.1.1
DNS2=192.168.1.254
ONBOOT=yes
# vi /etc/sysconfig/network-scripts/ifcfg-eth1
DEVICE=eth1
HWADDR=52:54:00:92:05:AE # your hwaddr
TYPE=OVSPort
DEVICETYPE=ovs
OVS_BRIDGE=br-ex
ONBOOT=yes

21.  Additional network configuration for the bridge: add the following settings to the Neutron Open vSwitch plugin configuration (the OVS plugin config file laid down by Packstack), so that Neutron knows the physnet1 physical network maps onto the br-ex bridge

network_vlan_ranges = physnet1
bridge_mappings = physnet1:br-ex

22.   Restart the network services so that the config takes effect

# service network restart

Configure a new network and router to connect to the external network
23.  Remove old network configuration settings

# . keystonerc_admin
# neutron router-gateway-clear router1
# neutron subnet-delete public_subnet
# neutron subnet-delete private_subnet
# neutron net-delete private
# neutron net-delete public
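
At this point a quick listing should come back empty (assuming only the default Packstack demo networks existed), confirming the old configuration is gone:

# neutron net-list
# neutron subnet-list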

24.  Open ports for ICMP pings and connections via SSH

# nova secgroup-add-rule default icmp -1 -1 0.0.0.0/0 
# nova secgroup-add-rule default tcp 22 22 0.0.0.0/0

25.  Create new private network on 10.0.0.0/24 subnet

# neutron net-create private 
# neutron subnet-create private 10.0.0.0/24 --name private --dns-nameserver 8.8.8.8

26.  Create new public network on 192.168.1.0/24 subnet

# neutron net-create homelan --router:external=True 
# neutron subnet-create homelan 192.168.1.0/24 --name homelan --enable_dhcp False --allocation_pool start=192.168.1.201,end=192.168.1.220 --gateway 192.168.1.254

27.  Create new virtual router to connect private and public networks

# HOMELAN_NETWORK_ID=`neutron net-list | grep homelan | awk '{ print $2 }'` 
# PRIVATE_SUBNET_ID=`neutron subnet-list | grep private | awk '{ print $2}'` 
# ADMIN_TENANT_ID=`keystone tenant-list | grep admin | awk '{ print $2}'` 
# neutron router-create --tenant-id $ADMIN_TENANT_ID router
# neutron router-gateway-set router $HOMELAN_NETWORK_ID
# neutron router-interface-add router $PRIVATE_SUBNET_ID
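
To verify the router now has its gateway set on the homelan network and an interface on the private subnet, you can check it with:

# neutron router-show router
# neutron router-port-list router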

That’s the install and configuration process complete. I will continue this series of blogs with deployment of instances and floating IP allocation.
Hope this has helped you deploy Openstack. Feel free to leave me a comment.

Goodbye vSphere AppHA, you were just not up to the job, enter Symantec ApplicationHA to the rescue

Well, I thought this day would come eventually, but I am surprised to see it so soon. It’s official folks: vSphere AppHA is no more as of vSphere 6.0; the official announcement is here. With the effort required to provide continual support for old and new applications, and for their updates, it looks like the task was not something that VMware wanted to focus on. Don’t think that you’re covered with backups, replication, vSphere HA or vSphere FT though; none of those will get your application back up and running automatically should it fail.
Don’t worry though…
Symantec ApplicationHA comes to the rescue…

As one of the first third-party vendors providing support for application availability within virtual machines, Symantec has always been at the forefront of providing resilience for applications running within VMware vSphere. ApplicationHA is one solution that has been doing this, and for the past four years it’s been going from strength to strength, adding functionality, automation and, importantly, resilience for mission-critical applications that enables our customers to sleep at night. If you’re unfamiliar with Symantec ApplicationHA, take a look at this comparison which I made a while back; it’s very detailed but will give you an insight into ApplicationHA’s true potential. It’s inexpensive and doesn’t need vSphere Enterprise Plus to work. It’s stable, mature technology built on Veritas Cluster Server heritage. The development effort required to keep on top of platform and application updates is a challenge, but it’s worth it; after all, it’s the applications that drive your business, and providing resilience for them should be top of mind.
More info on Symantec ApplicationHA can be found here; there’s also a free trial that you can test drive for 60 days if you like.

What’s new in VMware Fault Tolerance 6.0

VMware Fault Tolerance (FT) in vSphere 5.5 is one of those features you would love to use, but because of its vCPU limitation it was not really able to protect mission-critical applications, so for many it was left behind. With vSphere 6.0, VMware has broken the single-vCPU limitation for Fault Tolerance: an FT VM now supports up to 4 vCPUs and 64 GB of RAM. With vSMP support, FT can be used to protect your mission-critical applications. Along with vSMP FT support, let’s take a look at what else is new in vSphere 6.0 Fault Tolerance (FT).

Benefits of Fault Tolerance

  • Continuous availability with zero downtime and zero data loss
  • No TCP connection loss during failover
  • Fault Tolerance is completely transparent to the guest OS
  • FT doesn’t depend on the guest OS or application
  • Instantaneous failover from the primary VM to the secondary VM in the case of an ESXi host failure

What’s New in vSphere 6.0 Fault Tolerance

  • FT supports up to 4 vCPUs and 64 GB RAM
  • Fast checkpointing, a new scalable technology, is introduced to keep the primary and secondary in sync, replacing “Record-Replay”
  • vSphere 6.0 supports vMotion of both the primary and secondary virtual machines
  • With vSphere 6.0 you will be able to back up your FT virtual machines. FT supports the vStorage APIs for Data Protection (VADP), and therefore all leading VADP solutions on the market such as Symantec, EMC, HP, etc.
  • With vSphere 6.0, FT supports all virtual disk types: eager zeroed thick (EZT), thick or thin provisioned disks. vSphere 5.5 and earlier versions support only eager zeroed thick
  • Snapshots of FT-configured virtual machines are supported with vSphere 6.0
  • The new version of FT keeps separate copies of the VM files (such as the .vmx and .vmdk files) to protect the primary VM from both host and storage failures. You are allowed to keep the primary and secondary VM files on different datastores.

Difference between vSphere 5.5 and vSphere 6.0 Fault Tolerance (FT)

One thing to be aware of with VMware FT is that this feature does not monitor the application; it is still only virtual machine protection, so you still need to think about the application and how it will be protected.

What new features are in vSphere 6.0

Well, there has been public information out there for some time on some of the new features that would or might be in vSphere 6.0, mainly information that came from VMworld 2014 and some from the beta, which although public did carry an NDA. As VMware announces the new release, below are some of the new features that made the cut into the new version.
vSphere Platform (including ESXi)

  • Increase in vSphere Host Configuration Maximums
    • 480 Physical CPUs per Host
    • Up to 12 TB of Physical Memory
    • Up to 1000 VMs per Host
    • Up to 6000 VMs per Cluster
  • Virtual Hardware v11
    • 128 vCPUs per VM
    • 4 TB RAM per VM
    • Hot-add RAM now vNUMA aware
    • Serial and parallel port enhancements
      • A virtual machine can now have a maximum of 32 serial ports
      • Serial and parallel ports can now be removed
  • ESXi Account & Password Management
    • New ESXCLI commands to add/modify/remove local user accounts
    • Configurable account lockout policies
    • Password complexity setting via VIM API & vCenter Host Advanced System Settings
  • Improved Auditability of ESXi Admin Actions
    • Prior to vSphere 6.0, actions taken through vCenter by any user would show up as ‘vpxuser’ in ESXi logs.
    • In vSphere 6.0, actions taken through vCenter will show the actual username in the ESXi logs
  • Enhanced Microsoft Clustering (MSCS) Support
    • Support for Windows 2012 R2 and SQL 2012
    • Failover Clustering and AlwaysOn Availability Groups
    • IPv6 Support
    • PVSCSI & SCSI controller support
    • vMotion Support
      • Clustering across physical hosts with Physical Compatibility Mode RDMs (Raw Device Mapping)
      • Supported on Windows 2008, 2008 R2, 2012, and 2012 R2

vCenter 6.0

  • Scalability Improvements
    • 1000 Hosts per vCenter
    • 10,000 VMs per vCenter
    • 64 Hosts per cluster (including VSAN!)
    • 6000 VMs per cluster
    • Linked Mode no longer requires MS ADAM
  • New Simplified Architecture with Platform Services Controller
    • Centralizes common services
    • Embedded or Centralized deployment models
  • Content Library
    • Repository for vApps, VM templates, and ISOs
    • Publisher/Subscriber model with two replication models
    • Allow content to be stored in one location and replicated out to “Subscriber” vCenters
  • Certificate Management
    • Certificate management for ESXi hosts & vCenter
    • New VMware Endpoint Certificate Service (VECS)
    • New VMware Certificate Authority
  • New vMotion Capabilities
    • Cross vSwitch vMotion
    • Cross vCenter vMotion
    • Long Distance vMotion
    • vMotion across L3 boundaries

Storage & Availability

  • VMware Virtual Volumes (VVOLS)
    • Logical extension of virtualization into the storage world
    • Policy based management of storage on per-VM basis
    • Offloaded data services
    • Eliminates LUN management
  • Storage Policy-Based Management
    • Leverages VASA API to intelligently map storage to policies and capabilities
    • Polices are assigned to VMs and ensure storage performance & availability
  • Fault Tolerance
    • Multi-vCPU FT for up to 4 vCPUs
    • Enhanced virtual disk format support (thin & thick disks)
    • Ability to hot configure FT
    • Greatly increased FT host compatibility
    • Backup support with snapshots through VADP
    • Now uses copies of VMDKs for added storage redundancy (allowed to be on separate datastores)
  • vSphere Replication
    • End-to-end network compression
    • Network traffic isolation
    • Linux file system quiescing
    • Fast full sync
    • Move replicas without full sync
    • IPv6 support
  • vSphere Data Protection
    • VDP Advanced has been rolled into VDP and is no longer available for purchase (the features of VDP-A are now available for free to Essentials Plus and higher editions of vSphere!)
    • Protects up to 800 VMs per vCenter
    • Up to 20 VDP appliances per vCenter
    • Replicate backup data between VDP & EMC Avamar
    • EMC Data Domain support with DD Boost
    • Automated backup verification

So there you have it: a pretty long list of updates for vSphere 6.0. One thing that I was surprised to see is that vSphere Application HA has been removed in vSphere 6.0 due to a lack of demand for the feature; oddly, that’s not something we have seen at Symantec, as our user base still grows quarter by quarter and Symantec ApplicationHA goes on.

Providing high availability and disaster recovery for virtualized SAP within VMware the right way

Over the past couple of years I have been getting more and more involved in SAP architecture designs for HA and DR, and one of my pet hates at the start of my journey was the lack of basic information on what the SAP components were for and how they interacted with each other; it was a hard slog. For those who are venturing into SAP, or even those hardened SAP veterans out there, the paper below covers SAP in great detail and, more importantly, covers how SAP deployments should be done correctly, especially when high availability and disaster recovery are requirements.
Many organizations rely on SAP applications to support vital business processes. Any disruption of these services translates directly into bottom-line losses. As organizations’ information systems become increasingly integrated and interdependent, the potential impact of failures and outages grows to enormous proportions.
The challenge for IT organizations is to maintain continuous SAP application availability in a complex, interconnected, and heterogeneous application environment. The difficulties are significant:

  • there are many potential points of failure or disruption
  • the interdependencies between components complicate administration
  • the infrastructure itself undergoes constant change

To gain additional competitive advantage, enterprises must now work more closely together and integrate their SAP environment with those of other organizations, such as partners, customers, or suppliers. The availability of these applications is therefore essential.
There are three main availability classes, depending on the degree of availability required:

  • Standard Availability – achievable availability without additional measures
  • High Availability – increased availability after elimination of single points of failure within the local datacenter
  • Disaster Recovery – highest availability, which even overcomes the failure of an entire production site

Symantec helps the organizations that rely on SAP applications with an integrated, out-of-the-box solution for SAP availability. Symantec’s High Availability and Disaster Recovery solutions for SAP enhance both local and global availability for business critical SAP applications.
Local high availability: By clustering critical application components with application-specific monitoring and failover, Symantec’s solutions simplify the management of complex environments. Administrators can manually move services for preventative and proactive maintenance, and the software automatically migrates and restarts applications in case of failures.
Global availability/disaster recovery: By replicating data across geographically dispersed data centers and using global failover capabilities, companies can provide access to essential services in the event of major site disruptions. Using Symantec’s solutions, administrators can migrate applications or an entire data center within minutes, with a single click through a central console. Symantec’s flexible, hardware independent solutions support a variety of cost-effective strategies for leveraging your investment in disaster recovery resources.
Symantec provides High Availability and Disaster Recovery solutions for SAP, utilizing Symantec™ Storage Foundation, powered by Veritas, Symantec™ Replicator Option, Symantec™ Cluster Server, powered by Veritas, and Cluster Server agents that are designed specifically for SAP applications. The result is an out-of-the-box solution that you can quickly deploy to protect critical SAP applications immediately from either planned or unplanned downtime.
Download the full white paper below.
WP-High-Availability-Disaster-Recovery-for-SAP-applications-1114

Fix – vSphere Replication – Cannot connect to the specified site – due to change in default ports

I keep meaning to document this one, so here goes.
Adding a site in VMware vSphere Replication fails with the error: Cannot connect to the specified site, site might not be available on the network or the network configuration may not be correct.
This may happen if you change the default network port for the vCenter Servers from 80 to another port number.
To resolve this issue when you are not using the standard port 80 or port 443, specify the port number in vSphere Replication: Add Site dialog.
For example: If vCenter Server at IP address 192.168.1.10 is accessed over port 8081, enter 192.168.1.10:8081 in the vSphere Replication: Add Site dialog.
Hope this helps you out.
thanks for reading.

Symantec Storage and Availability products and SHELLSHOCK bug impact

Customer ALERT:  Please take the time to read through the following notifications and alert your teams/customers regarding this bug.
 
The Symantec Storage Foundation Products position on the BASH ShellShock bug and the related Technotes are now published externally:
 
http://www.symantec.com/business/support/index?page=content&id=TECH225112
 
The pertinent content explaining the position of IA products is as follows:
 
It is of critical importance for customers to apply the available BASH patches immediately.
Check with the Operating System Vendor to determine if your version of BASH is affected and apply the vendor’s patch as necessary.
Further details are available in the Symantec overview of ShellShock:
http://www.symantec.com/connect/blogs/shellshock-all-you-need-know-about-bash-bug-vulnerability
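As a quick check, the widely published test for the original CVE-2014-6271 issue is shown below; an unpatched bash prints "vulnerable" before the test string, while a patched bash prints only the test string (possibly with a function definition warning):

env x='() { :;}; echo vulnerable' bash -c "echo this is a test"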
 
The Symantec products in the table below may interface with a vulnerable version of BASH on the host operating system. None of these IA products have been proven to be vulnerable.  As a precaution, for the Symantec products in the table below, we recommend that services are stopped and restarted after a patch for BASH has been applied.
 

Product: BASH status
Storage Foundation for Unix/Linux (SF): BASH is not distributed with this product
Storage Foundation and High Availability solutions (SFHA): BASH is not distributed with this product
Storage Foundation Cluster File System (SFCFS): BASH is not distributed with this product
Storage Foundation for Oracle RAC (SFRAC): BASH is not distributed with this product
Storage Foundation for Windows: BASH is not distributed with this product
Volume Manager (VxVM): BASH is not distributed with this product
Volume Replicator (VVR): BASH is not distributed with this product
File Replicator (VFR): BASH is not distributed with this product
Dynamic Multi-Pathing (DMP): BASH is not distributed with this product
Veritas File System (VxFS): BASH is not distributed with this product
Cluster Server (VCS) for Unix/Linux: BASH is not distributed with this product
Cluster Server for Windows (VCSW): BASH is not distributed with this product
ApplicationHA: BASH is not distributed with this product
FileStore (S/W appliance): A vulnerable version of BASH is distributed with FileStore, see technote TECH225136
FileStore N8300 (H/W appliance): A vulnerable version of BASH is distributed with FileStore, see technote TECH225136
Data Insight: BASH is not distributed with this product
Veritas Operations Manager (VOM): BASH is not distributed with this product
CommandCentral Storage (CCS): BASH is not distributed with this product
Veritas Enterprise Administrator (VEA): BASH is not distributed with this product
Symantec Disaster Recovery Orchestrator (DRO): BASH is not distributed with this product

 
FileStore Product:
If you are using the Symantec FileStore product, a patch for BASH is available from Symantec; see tech note TECH225136 <http://www.symantec.com/docs/TECH225136> for more information on how to obtain this patch.

Providing availability of vCenter Server v5.x with Symantec ApplicationHA v6.1

It’s been a while coming, but I’ve finally got some time to write this article on protecting vCenter Server availability. It’s probably also an opportune time, as not so long ago VMware announced the end of availability of vCenter Heartbeat, so many of you are probably looking for ways to protect vCenter Server more than ever, especially given how critical it is to the management and operations of your vSphere environment. This article will highlight the areas that need to be protected and what options you have.
With release after release of vSphere, more functionality goes into vCenter Server and more of the virtualized environment relies on it being available to serve the needs of the administrator. Although vCenter Server can typically reside on a single server, it is made up of many critical parts. If you’ve sat through an install of vCenter Server you will know that it’s broken up into 4 core areas: Single Sign-On (SSO), the Inventory Service, vCenter Server itself, and lastly the vSphere Web Client & Services. SSO has been a core component of vSphere since its introduction in v5.1; it’s there to handle authentication requests and is also a security broker handling requests coming from the various vSphere solutions. Although there were some operational hiccups in v5.1, subsequent versions have become stronger and deployment options have increased; I’ll take a look at those in a minute. The Inventory Service is another key component that has two functions: firstly, it stores the custom tags for the vSphere Web Client, and secondly, it acts as a proxy for the vSphere Web Client, which actually assists in reducing the load on the vCenter Server process (VPXD). Knowing this little tidbit can actually help in deployment scenarios, so if you are breaking up the components onto separate servers then it’s best to keep the Inventory Service close to the vSphere Web Client services. Next there is vCenter Server itself, which is made up of a number of services and is critical to the whole environment. Lastly there is the vCenter Web Server/Services, which provides the administrator with a web UI for management and operations of the entire environment.
Now we’ve gone through the critical services, let’s take a look at deployment and availability options within each group. Ignoring the simple install option of vCenter for the moment, the custom install method for SSO provides the ability to install in 3 types of deployment modes: a single SSO instance, SSO installed in HA mode, and SSO installed for a multi-site environment. With the single deployment it’s just that: SSO is installed onto a system and acts as a single entity for the whole vSphere environment. HA mode provides the ability to add another SSO system to an existing SSO system and provides a failover mechanism in case the primary SSO system fails; typically a load balancer is used in front of the SSO servers for ease of configuration. Lastly, the multi-site option provides local authentication in a multiple-site scenario; be aware though that there is no failover between sites, so if a site fails then local authentication for that site will fail too. I don’t want to focus too much on the different scenarios, as there are plenty of blogs out there which highlight best practices for deploying SSO. What is important is the availability of the services, especially in a single SSO deployment, which, let’s face it, will be used by a large number of SMB and enterprise customers.
When deployed on a single system, SSO consists of 5 key services: the VMware Certificate Services, VMware Directory Services, VMware Identity Management Services, VMware KDC Services and the VMware Secure Token Services. When these services are installed, the default Windows Service Manager recovery configuration for most of them is set to restart the service upon the 1st and 2nd failure. You may think this will be OK for availability, but what if the service keeps failing? What if the service doesn’t restart? What effect will it have on the other key components in the environment, which as we now know are critical to operations? What’s needed is a method to monitor these services and the other components intelligently and remediate any issues that occur within the environment. The other services, such as the Inventory Service, vCenter Server and the Web Client services, do not have any recovery options enabled, so the administrator is pretty much left to manage those independently.
Using a solution like Symantec ApplicationHA can assist in protecting all of the vCenter Server services while still retaining the ability to utilize VMware features like VMware HA and DRS, which is especially useful if vCenter Server has been deployed onto a virtual machine, which I assume it has. Symantec ApplicationHA provides the ability to monitor all of the key components and, in the event it is unable to resolve issues, it can pass control to VMware HA to reset the virtual machine. ApplicationHA supports a number of application agents and also has a vCenter agent which can be used to protect vCenter. There is also a wizard, which can be launched from within the vSphere Web/Desktop Client, that can be used to protect vCenter. The current version of the wizard does not include the SSO configuration, but this can be added after the wizard is run. Symantec are aiming to update their wizard to include SSO, so for the moment we can script the additional services pretty easily with ApplicationHA commands.

Symantec ApplicationHA auto detects the services within the deployment and provides the ability to also monitor the connection between the SQL database and vCenter itself.

The list of available vCenter services is displayed within the configuration.

The dependency of the services is shown by viewing the dependency component view.

Finally the additional SSO services can be added to the configuration by running the script containing the commands below.
haconf -makerw
hatype -modify GenericService RestartLimit 1
hares -add VMWareCertificateService GenericService vCenterServer_SG
hares -modify VMWareCertificateService ServiceName VMWareCertificateService
hares -modify VMWareCertificateService Enabled 1
hares -add VMwareDirectoryService GenericService vCenterServer_SG
hares -modify VMwareDirectoryService ServiceName VMwareDirectoryService
hares -modify VMwareDirectoryService Enabled 1
hares -add VMwareIdentityMgmtService GenericService vCenterServer_SG
hares -modify VMwareIdentityMgmtService ServiceName VMwareIdentityMgmtService
hares -modify VMwareIdentityMgmtService Enabled 1
hares -add VMwareKdcService GenericService vCenterServer_SG
hares -modify VMwareKdcService ServiceName VMwareKdcService
hares -modify VMwareKdcService Enabled 1
hares -add VMwareSTS GenericService vCenterServer_SG
hares -modify VMwareSTS ServiceName VMwareSTS
hares -modify VMwareSTS Enabled 1
hares -add vmwarelogbrowser GenericService vCenterServer_SG
hares -modify vmwarelogbrowser ServiceName vmwarelogbrowser
hares -modify vmwarelogbrowser Enabled 1
hares -link vspherewebclientsvc vpxd
hares -link vimQueryService vctomcat
hares -link vpxd VMwareKdcService
hares -link vpxd VMwareSTS
hares -link vpxd VMWareCertificateService
hares -link VMwareIdentityMgmtService VMwareDirectoryService
hares -link VMwareSTS VMwareIdentityMgmtService
hares -link vmwarelogbrowser vspherewebclientsvc
hares -unlink vimQueryService vpxd
haconf -dump -makero
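
Once the script has been applied, a quick status check from the same command prompt should confirm the new GenericService resources are configured and online within the vCenterServer_SG service group before you move on to fault testing:

hastatus -sum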
 
Here is the final list of all services being monitored by ApplicationHA

And the dependency component view is also updated to include all of the services and the correct dependencies.

Now that the configuration is complete testing for fault scenarios can commence. For more information on ApplicationHA please follow the product link below.
Symantec ApplicationHA
http://www.symantec.com/application-ha