For companies that run their day-to-day business operations on SAP ERP (enterprise resource planning), it’s important for IT teams to make sure they comply with mandated high-availability service-level agreements (SLAs).
As part of the formal SLA process, many companies require their internal IT teams to set specific recovery time objectives (RTO) and recovery point objectives (RPO). This helps ensure the ERP application is protected and that services for end users, customers and vendors can recover as rapidly as possible if an outage occurs.
As your IT team takes on this challenge—whether to implement new high-availability capabilities or to validate your current capabilities—there are six key steps to follow to make sure your SAP ERP system is fully protected. These best practices were compiled for this eWEEK Data Points article by Harry Aujla, EMEA technical director at SIOS Technology.
Data Point No. 1: Eliminate Single Points of Failure
Any high-availability solution needs to eliminate single points of failure, such as cluster nodes that all connect to a single SAN or other shared storage. If your SAP system runs in the cloud, you can also take advantage of geographically separated availability zones and regions. Although a high-availability cluster can be deployed within a single availability zone, the zone itself is then a single point of failure: if the zone becomes unavailable, you can lose access to the entire high-availability cluster and its associated data.
Data Point No. 2: Separate SAP Cluster Nodes Across Cloud Availability Zones
To eliminate single points of failure, separate the SAP cluster nodes across cloud availability zones, for example by deploying Node1 in Zone1, Node2 in Zone2, and a witness or quorum node in Zone3. If necessary, the SAP application can then fail over from one zone to another. You can also address disaster recovery requirements by adding a further node to the high-availability cluster in an additional zone or region.
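As a rough illustration of why the three-zone layout matters, the sketch below checks whether a node placement still holds a majority after any one zone is lost. The majority rule, the function name and the node and zone labels are assumptions made for this example; real cluster software applies its own quorum logic.

```python
from collections import Counter


def survives_zone_loss(placement: dict[str, str]) -> bool:
    """Return True if a strict majority of nodes remains after losing any one zone.

    `placement` maps a node name to the zone it runs in (hypothetical labels).
    A simple majority quorum is assumed here for illustration.
    """
    total = len(placement)
    nodes_per_zone = Counter(placement.values())
    # Check each zone in turn: remove its nodes and see if a majority is left.
    return all(total - lost > total / 2 for lost in nodes_per_zone.values())


# Layout from the text: two cluster nodes plus a witness, one per zone.
spread = {"Node1": "Zone1", "Node2": "Zone2", "Witness": "Zone3"}
# Everything in one zone: the zone itself is a single point of failure.
single_zone = {"Node1": "Zone1", "Node2": "Zone1", "Witness": "Zone1"}

print(survives_zone_loss(spread))
print(survives_zone_loss(single_zone))
```

Under these assumptions, the spread layout passes the check and the single-zone layout does not, because losing the only zone removes every node at once.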
Data Point No. 3: Ensure Allocation Constraints Do Not Impact Performance
Configure a separate high-availability cluster for each protected SAP instance (ASCS/ERS, PAS, AAS) and for any associated databases. This allows maximum performance for both the SAP software and the databases, rather than forcing them to compete for system resources on the same cluster nodes. This approach is especially important when using a memory-intensive database such as SAP HANA.
Data Point No. 4: Upgrade to SAP Enqueue Server Framework Version 2
After a failure event in a configuration using version 1 of the SAP Standalone Enqueue Server Framework, the Central Services instance needs to be started on the cluster node where the Enqueue Replication Server (ERS) instance is running. Upgrading to version 2 of the Standalone Enqueue Server Framework eliminates this issue. You can assign a dedicated virtual IP/hostname to the ERS instance and then direct traffic to the cluster node hosting the ERS instance. Because of this dedicated virtual IP/hostname, the corresponding Central Services instance can then fail over to any cluster node and retrieve its replicated lock table through the network.
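The difference between the two framework versions can be pictured as a difference in where the Central Services instance is allowed to restart. The following is a toy model of that rule only; the function name, arguments and node names are hypothetical and do not correspond to any SAP or cluster-manager API.

```python
def central_services_targets(version: int, healthy_nodes: list[str], ers_node: str) -> list[str]:
    """Return the nodes on which Central Services may restart after a failure.

    Toy model of the behaviour described in the text, not an SAP interface.
    """
    if version == 1:
        # Version 1: Central Services must start on the node running ERS.
        return [ers_node] if ers_node in healthy_nodes else []
    if version == 2:
        # Version 2: any healthy node qualifies; the lock table is fetched
        # over the network via the ERS instance's virtual IP/hostname.
        return list(healthy_nodes)
    raise ValueError(f"unknown framework version: {version}")


# Hypothetical cluster: node1 has failed, ERS is running on node2.
healthy = ["node2", "node3"]
print(central_services_targets(1, healthy, ers_node="node2"))
print(central_services_targets(2, healthy, ers_node="node2"))
```

In this model, version 1 leaves exactly one valid target, while version 2 leaves the choice open to whatever placement policy the cluster applies.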
Data Point No. 5: Set Up Shared File Systems for ASCS and ERS
To set up shared file systems for ASCS and ERS, use an NFS configuration in which the current host for each resource acts as the NFS server for the corresponding SAP instance file system. This makes the file system for each SAP instance accessible via the virtual IP associated with that instance. To protect data integrity and to ensure that failovers occur quickly and with minimal disruption, also verify the consistency of the Enqueue lock table on failovers and switchovers for the ASCS/ERS cluster.
Data Point No. 6: Manage Instances Carefully
Every high-availability solution should carefully manage the instances in the SAP environment so that they are brought online and configured to communicate with one another in a coordinated manner. To accomplish this objective, it’s important to maintain an Enqueue Replication Server instance on a separate cluster node from the Central Services instance. This node will hold a replicated backup copy of the lock table that the Enqueue Server can use to recover its active locks after a failure.
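To make the role of the replicated lock table concrete, here is a minimal in-memory sketch: every lock granted by the enqueue server is copied to a replica on another node, and a replacement server rebuilds its locks from that copy. All class, function, node and lock names are hypothetical, and the real replication protocol is far more involved than a dictionary copy.

```python
from dataclasses import dataclass, field


@dataclass
class ReplicationServer:
    """Stand-in for an ERS instance holding a copy of the lock table."""
    node: str
    table: dict[str, str] = field(default_factory=dict)


@dataclass
class EnqueueServer:
    """Stand-in for the enqueue server inside the Central Services instance."""
    node: str
    replica: ReplicationServer
    locks: dict[str, str] = field(default_factory=dict)

    def acquire(self, obj: str, owner: str) -> bool:
        if obj in self.locks:
            return False  # already locked by someone else
        self.locks[obj] = owner
        self.replica.table[obj] = owner  # replicate every granted lock
        return True


def start_enqueue(node: str, replica: ReplicationServer) -> EnqueueServer:
    """Start the enqueue server, refusing to share a node with its replica."""
    if node == replica.node:
        raise ValueError("ERS must run on a different node than Central Services")
    return EnqueueServer(node=node, replica=replica)


def recover(new_node: str, replica: ReplicationServer) -> EnqueueServer:
    """Rebuild the active locks on a replacement server from the replica."""
    return EnqueueServer(node=new_node, replica=replica, locks=dict(replica.table))


ers = ReplicationServer(node="node2")
primary = start_enqueue("node1", ers)
primary.acquire("ORDER-1", "user_a")
# node1 fails; a replacement server recovers the locks from the replica.
recovered = recover("node3", ers)
print(recovered.locks)
```

If the replica lived on the same node as the enqueue server, a single node failure would take both copies of the lock table with it, which is the situation `start_enqueue` rejects in this sketch.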