PLANNING
The more replicas of a database, the more accessible the data. Creating too many replicas, however, can add unnecessarily to the overhead of maintaining a system and affect performance. As you plan your cluster strategy, try to create a balance between your users' requirements for data availability and the physical ability of each server in your cluster to manage additional workload. More than three replicas of a database may not provide you with significant incremental availability. If users can adequately access a database from one or two servers, do not increase the number of replicas in the cluster.
When users require the constant availability of a specific database, consider placing replicas on every server in the cluster if you have adequate disk space and resources.
In addition, try to distribute the busiest databases to different servers so that no server contains too many busy databases. If the servers in the cluster all have a similar amount of processing power, you can have an equal load on each server, including the processing power reserved for failover. If a server has significantly more or less processing power than the other servers, consider changing the number of databases on the server and the number of databases that can fail over to the server. Also, distribute mail files across a cluster, or set up separate servers or separate clusters for mail.
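The distribution advice above can be sketched as a simple greedy placement. This is only an illustration, not a Domino feature: the `Server` class, the `distribute` function, the server names, and the activity and capacity numbers are all hypothetical. Capacity is assumed to be the relative processing power left on a server after you set aside the share reserved for failover.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    name: str
    capacity: float  # relative processing power left after the failover reserve (assumed unit)
    load: float = 0.0  # sum of activity scores of the databases placed so far
    databases: list = field(default_factory=list)

    def utilization(self) -> float:
        """Load relative to capacity; lower means more headroom."""
        return self.load / self.capacity


def distribute(databases: dict, servers: list) -> list:
    """Place each database on the server that stays least utilized after taking it."""
    # Handle the busiest databases first so they end up on different servers.
    ordered = sorted(databases.items(), key=lambda item: item[1], reverse=True)
    for name, activity in ordered:
        target = min(servers, key=lambda s: (s.load + activity) / s.capacity)
        target.databases.append(name)
        target.load += activity
    return servers


# Hypothetical activity scores (for example, transactions per minute).
activity = {"sales": 40, "support": 30, "mail1": 20, "mail2": 20, "hr": 10}

# Server A is assumed to have twice the processing power of B and C.
placed = distribute(activity, [Server("A", 2.0), Server("B", 1.0), Server("C", 1.0)])

for server in placed:
    print(server.name, server.databases, server.utilization())
```

In this sketch the more powerful server receives more databases, and the two busiest databases land on different servers. A real plan would also weigh disk space and which replicas can fail over to each server.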
Because busy databases in a cluster can create a lot of replication events, it is a good idea to install these replicas on the fastest disk hardware available in the cluster. If possible, place these replicas where other processes are not in contention -- for example, on a partition other than the one that contains the operating system swap file.
To view which databases and replicas already exist in the cluster, open the Cluster Database Directory (CLDBDIR.NSF). It contains a document that stores information about each database and replica in a cluster.
Note: Selective replication formulas work differently in a cluster.
How many replicas to create
There are many factors to consider when deciding how many replicas to create. Some factors suggest creating more replicas, and some suggest creating fewer. The following list describes those factors and how they might affect your cluster traffic and performance.
Database size: Large databases consume a lot of disk space. Depending on your disk capacity, you may want to create fewer replicas of larger databases to preserve disk space.
Number of users: If you have a large number of users, they will probably experience better performance if usage is spread across multiple servers. This requires multiple replicas. If the number of users is small, they probably won't notice a performance improvement from additional replicas.
Transaction rate: If the transaction rate is high, creating multiple replicas may improve performance. To find out the rate of activity for a database, look in the IBM® Notes® log file.
Amount of new data: If you expect a large amount of new data in the database, additional replicas may slow down performance because cluster replication will cause a lot of additional traffic. If you have powerful servers and a lot of bandwidth, this may not create a problem.
Server power: The more powerful the servers and the more disk space they have, the more active replicas you can create without significantly affecting performance.
Network bandwidth: Cluster replication can create a bottleneck on a network that does not have enough bandwidth. Therefore, the greater the bandwidth, the more replicas you can create.
Need for availability: For databases that are mission-critical, you should create multiple replicas. For databases where availability is less important, create fewer replicas or none at all.
Prior to distributing databases in a cluster, it can be helpful to create a table of information about the databases and the cluster hardware. You can use the table to determine how important specific databases are and how adequate your resources are. You can include some or all of the following:
Database name: This identifies each database.
Number of concurrent users: The number of concurrent users helps you determine the need for workload balancing.
The following table uses a subset of the preceding information to determine the number of replicas needed.
Table 1. Sample table of organization-specific database information
This table helps identify which databases require high availability, which databases are busiest, and how much additional disk space you will need in the future. In this example, two databases are very important and are growing rapidly. You should be sure that there are enough replicas of these databases so that they are always available. You should also be sure there is adequate disk space for growth on every server that contains a replica of these databases. One database is of medium importance, not growing as quickly, and not very active. You should provide no more than one replica of this database, unless it would affect your business negatively if the database were not available for a while. One database is not very important and does not require a replica in the cluster.
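A planning table of this kind can also be kept as structured data and turned into a first-pass replica plan. The following is a minimal sketch under stated assumptions: the field names, the importance levels, the 100-user threshold, and every sample value are invented for illustration and are not taken from Table 1. The rule itself only loosely mirrors the guidance in this topic (no cluster replica for unimportant databases, one for medium importance, more for mission-critical ones).

```python
from dataclasses import dataclass


@dataclass
class DatabaseInfo:
    name: str
    size_gb: float
    concurrent_users: int
    importance: str       # "high", "medium", or "low" (assumed scale)
    yearly_growth: float  # 0.5 means the database grows 50% per year


def suggested_cluster_replicas(db: DatabaseInfo) -> int:
    """Heuristic count of replicas to create in addition to the original database."""
    if db.importance == "low":
        return 0  # does not need a replica in the cluster
    if db.importance == "medium":
        return 1  # no more than one replica
    return 2      # mission-critical: three copies in total, the point of diminishing returns


def projected_size_gb(db: DatabaseInfo, years: int = 1) -> float:
    """Size of one copy after the given number of years of compound growth."""
    return db.size_gb * (1 + db.yearly_growth) ** years


def build_plan(databases, busy_user_threshold: int = 100) -> dict:
    """Map each database name to (replicas, needs workload balancing, extra disk in GB)."""
    plan = {}
    for db in databases:
        replicas = suggested_cluster_replicas(db)
        needs_balancing = db.concurrent_users >= busy_user_threshold
        extra_disk = round(replicas * projected_size_gb(db), 2)
        plan[db.name] = (replicas, needs_balancing, extra_disk)
    return plan


# Invented sample rows; replace them with your own measurements.
inventory = [
    DatabaseInfo("Sales tracking", 4.0, 250, "high", 0.5),
    DatabaseInfo("Customer support", 2.0, 150, "high", 0.5),
    DatabaseInfo("HR policies", 1.0, 20, "medium", 0.1),
    DatabaseInfo("Archive", 8.0, 2, "low", 0.0),
]

plan = build_plan(inventory)
for name, (replicas, needs_balancing, extra_disk) in plan.items():
    print(f"{name}: replicas={replicas}, balance={needs_balancing}, extra disk={extra_disk} GB")
```

The extra disk figure is the space the additional replicas would need across the cluster after one year of growth, which helps check that each target server has room before you create the replicas.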