(WO/2003/026244) CLIENT SERVER NETWORKS
- Biblio. Data
- Description
- Claims
- National Phase
- Notices
- Documents
- Note: OCR Text
- Note: Text based on automatic Optical
Character Recognition processes. Please
use the PDF version for legal matters
- Note: Text based on automatic Optical
CLIENT SERVER NETWORKS This invention relates to client server networks and, in particular, to networks in which clients have multiple alternative servers that can handle their requests.
In such client server networks, a problem arises in determining to which of the server nodes a client should send a given request. Known methods of dealing with this problem include load sharing, in which the load is spread across the multiple servers on a round robin basis, and worker standby schemes in which a server is designated as the worker to which requests are usually sent but an alternative server, the standby, handles requests when the worker is unable to. When the worker is again able to handle requests, the requests are again sent to it.
However, neither of these approaches is satisfactory. Round robin load sharing is unsuitable if the capacities of the various servers differ significantly as the approach does not take relative server size into account. Worker-standby approaches are unsuitable if the request volume exceeds the capacity of a single server.
The effective capacity of any given server can vary considerably over time, for example because the server is handling requests from other clients, or for other reasons such as maintenance activity, such as archiving.
The present invention aims to address the problems outlined above and to provide an improved handling of requests to multiple servers.
According to the invention there is provided a
method of distributing requests from a client node to
servers of a server system having a plurality of servers,
the method comprising periodically repeating the steps of:
measuring the activity of each server;
The invention also provides a system for distributing requests from a client node to servers of a server system having a plurality of servers, comprising a distributor at the client node, the distributor including means for measuring the activity of each server, means for assessing the relative loading of the plurality of servers from the measured server activities, and means for adjusting the distribution of requests to individual servers of the plurality of servers in accordance with the assessment of relative loadings; wherein the distribution adjustment means comprises means for adjusting the proportion of requests assigned to each server as a function of measured server activity, the mean activity across the plurality of servers and the existing proportion of requests assigned to the server.
Embodiments of the invention have the advantage that load distribution can be even across servers, taking into account their relative capacities.
Preferably, the client node comprises a load
controller for applying a load control algorithm to the
requests distributed to all servers of the plurality of
servers. This has the advantage that loads can be
Preferably, server activity is measured by counting the number of server requests and responses over a period.
Preferably, this period is an aggregate number of requests.
Preferably, the assessing of relative loading comprises comparing the server requests and responses for each of the servers over the sample period.
Preferably, the request distribution of the server is adjusted according to a function of measured server activity, mean activity across all the servers and the existing proportion of requests assigned to the server.
Preferably, an override facility is provided whereby a request related to a previous request is sent to the same server as the previous request.
Embodiments of the invention will now be described,
by way of example, and with reference to the accompanying
drawings in which:
Figure 1 shows a schematic diagram of a request
distribution system embodying the invention;
Figure 2 is a UML use case diagram for the
distributor of Figure 1;
Figure 3 is a UML class diagram for measurement of
server loadings to assess relative loadings;
Figure 4 is an example sequence for a Distribute use
case;
Figure 5 is an example sequence for a Response use
case;
Figure 6 is an example sequence for a Measure use
case;
Figure 7 is an example sequence for a Re-evaluate use
case;
Referring to Figure 1, the embodiment to be described is based on a request distributor unit 10, within each client node 12. The request distributor receives both local requests 14 generated internally within the client node and external requests 16 received over an external interface.
The distributor distributes server requests among n
servers, here shown as server 1,18,
The architecture illustrated in Figure 1 enables the same load control to be used for external requests and requests originating locally at the client node. Also, the distributor makes the server network look like a single node to the load control algorithm. It can be seen from Figure 1 than the distributor is positioned, logically, between the servers and the load control.
Distributing the load to the servers on a request by
request basis does not work when a request is related to an
earlier request. The distributor includes an override
The UML use cases for the distributor are shown in Figure 2. The local application or remote client 26, time 28 and remote server 30 are shown as being remote from the system. The diagram shows that the distributor distributes 31 server requests in accordance with responses 32 from the servers and some measure 34 based on distribution and response. The distribution is re-evaluated over time 36.
Irrespective of the origin of the server request, that is external or internal to a client node, three requirements for distributing server requests to a set of available servers may be identified as follows: 1. To load the available servers evenly, as far as possible; 2. To avoid high frequency oscillations in server load by not overreacting to past load imbalances, causing new load imbalances; and 3. To allow override requests to insist on a specific server despite the distribution algorithm.
The override request of requirements is invoked when a subsequent request requires context data that would have been created in the server that handled a previous request.
The response to the previous request indicated which server handled it.
The distribution used should preferably be
independent of the particular details of the client-server
The load control algorithm works from measurements of the aggregate traffic to all servers. A load control scheme suitable for a server node terminating request traffic should be equally suitable for a client node having a request distributor. It should be understood that load control is not essential.
The approach outlined above is based on measurements
of relative loadings. It would be possible to measure
traffic related quantities such as request and response
counts and their rate derivatives, response time
distributions (such as mean,
Although these are all universal measurements, they suffer to a greater or less degree from the problem that the threshold separating normal from overload conditions are usually protocol and/or application specific. For example, what constitutes a satisfactory response time for one application is likely to be unacceptable for another. Thus, a design based upon testing measurements against absolute values is unsatisfactory.
In addition to the approach based on measurements of
relative loadings, some protocols include an explicit
Figures 3 to 7 show generalised UML class and sequence diagrams for the distributor. Figure 3 is a UML class diagram showing three entities: the server 38, request 40 and response 42. The server has the attributes of measurement, proportion, the request has the attributes of Timestamp, transaction ID and Override, and the response has the attributes of Timestamp and transaction ID. The server has the operations read and update.
Figure 4 shows the sequence for the Distribute. use case in which the client or local application and Remote Server 42,44 are shown outside the distributor system.
Requests 46 are sent to the distributor and a read operation 48 performed on every server object instance to obtain current proportions, and which includes receiving responses 50 from the server objects. The requests are then distributed 52 to the appropriate remote server 44 depending on the relative loading.
Figure 5 shows the sequence for the response use
case. Here, the remote server 44 sends responses to the
Distributor Control which sends the responses to the client
or local application 42. Figure 6 shows the sequence for
the Measure use case in which updates to measurements 51
are made between Distribution-Control and
Figure 7 shows the sequence for the Re-evaluate use
case. The outside control here is time 58. On expiry of a
timer, which is a periodic event, read measurements are
made by the Distributor from the Server Objects and an
update made. Both receive replies from the server objects
and both are invoked on every instance of server object.
The distribution algorithm must not be disrupted by addition and removal of servers. A distribution by parts scheme is preferred although other algorithms are possible.
Under the preferred scheme, the server's proportions are expressed as a ratio according to capacity. For example, in a three server case where one of the servers has twice the capacity of other two, the ratio would be expressed as 1: 2: 1.
The number of parts assigned to each server is
The distribution algorithm adds these numbers cumulatively
from the first server to the last, recording the server
identity against each interval. Thus, the range
Where the random number generated is equal to one of the interval boundary values, a tie breaker rule is applied.
Where the distributor starts up, it assumes that all servers have the same capacity and assigns the same proportion to each. Rather than applying the value 1, a larger value such as 10 is applied to allow for downward, as well as upward adjustments.
Figure 8 shows a UML activity diagram for the Distribute use case. At 60, Value B is set to zero. Within box 62, a server range is iterated for each server. This involves the steps of reading a proportion P from each server object 64, assigning a range B to (B+P-1) to the server 66 and then resetting the value of B to B=B+P at 68.
In the foregoing description, it has been established that the selection of which server to send a request to should be based on a relative measurement of server loadings and that distribution is achieved using an algorithm based on server capacities. The following description concerns the measurement of server loading, and describes three possible ways of measuring server loadings: response time measurements, outstanding response count; and request and response count. The latter is especially preferred but the former two methods are possible. Other methods are also possible and will occur to those skilled in the art.
Response Time Measurements
The distributor can calculate the response time
distributions of each server. The relative loadings are
then evaluated by direct comparisons of the chosen measure,
for example the mean or
Secondly, a certain minimum number of response time samples
must be collected before their statistical distribution can
be assessed reliably. This may make the sample periods too
long and the distributor too sluggish in its reaction to
Outstanding Responses Counting As an alternative, the distributor maintains a count of the number of requests sent to each server that has not received at least one response. As this is a continuous measurement, periodic evaluation could be based either on fixed intervals or aggregate numbers of request. As a server becomes overloaded it should show an increase in its outstanding responses. TCP load control is based on this principle. The process is illustrated in Figure 13 which shows the counting of requests which have received no response at step 124 and the comparison of these counts over a period of time or aggregate number of requests at step 126. However, there are two potential problems. First, the approach is indirectly affected by any normal differences between server response times. This follows from the basic queuing theory formula which states that: mean concurrency = transactions per second (tps) x mean response time.
Thus, it is again difficult to distinguish distressed servers from naturally slow servers.
Second, the distributor has to distinguish initial responses to requests from subsequent responses in order that the outstanding response count is only decremented when an initial response is sent. This is difficult to achieve.
Request and Response Counting
This option is the most preferred of the three
considered. The distributor records, over a short sample
The absolute numbers of requests or responses may differ between servers due to the differing capacities.
However, provided that none is overloaded the request/response ratio for each should be almost identical.
This relies on each server receiving the same mixture of request types, which is true over a large number of requests but may not be reliable over short periods containing few requests. It is preferred therefore, to define samples by aggregate number of requests sent, rather than by time periods.
There are many candidates for the values that might be compared to decide relative loading. A straight ratio of requests to responses is only one alternative. The following is a non-exhaustive list of some candidate quantities and their theoretical ranges. In the list Rq is the number of theoretical requests sent to a server and Rs the number of responses returned.
1. Rq/Rs: 0
Figures 9 and 10 show the request and response counts of two servers as they are subjected to increasing traffic.
The figures are both graphs of messages per second (mps) against time. The server of Figure 9 has a relatively large capacity and that of Figure 10 a relatively small capacity.
The two servers have equal traffic shares. The figures emphasise how the server's response rate plateaus as the server reaches its capacity and becomes distressed. Figures 9 and 10 are based on the assumption of an average 1.5 responses per request.
In Figure 10, the smaller server starts to plateau at
time
Distribution Adjustment
It was mentioned previously that it'is important not
to overcompensate any adjustment of the server
distribution, so setting up oscillations. Periodically, or
Oscillations may be reduced by one of two ways.
First, the data from the last few sample periods may be
explicitly combined to compute the adjustments to
proportions for the next period. This effectively uses a
sliding window of retained data. Second, the data from all
previous samples may be implicitly combined using a decay
formula. Under this approach, data for second period
adjustment is the average of true second period data and
The embodiment described enables distribution of requests to multiple servers from a client node in a manner which reflects the individual capacities of the servers and based on their relative loadings.
Various modifications to the embodiments described are possible and will occur to those skilled in the art.
For example, other methods of measuring server activity may be possible and other distribution adjustment formulae may be adopted. However, such modifications are within the scope of the invention which is defined by the appended claims.