Paper Reading - CS Model - Some constraints and tradeoffs in the design of network communications

Name: Yuncheng Yao (Peter)

Reference: E. A. Akkoyunlu, K. Ekanadham, and R. V. Huber. 1975. Some constraints and tradeoffs in the design of network communications. In Proceedings of the fifth ACM symposium on Operating systems principles (SOSP ‘75). Association for Computing Machinery, New York, NY, USA, 67–74. DOI:https://doi.org/10.1145/800213.806523


The challenge that the authors address is how to incorporate many desirable but sometimes conflicting features into a client-server model, making the tradeoffs needed to resolve the incompatibilities.

Some reasonable assumptions about the system are made. The clients and servers communicate via messages and ports; the most complex failure model is the timing failure, and failures may be silent; the system is asynchronous in terms of communication and computation; there is no mysterious system buffering as in MPI; failure intervals are long compared to the transaction time. 

The authors also rule out some unreasonable assumptions. It is not realistic to assume that 1) the network is reliable; 2) the topology is connected all the time; or 3) failures can be detected immediately.

The features that the authors want to include in their system are:

  1. Status. Status information must come from the IPCM (the inter-process communication mechanism), because it cannot be provided anywhere else;

  2. Time-out. A process should not block forever just because of a single undelivered message;

  3. Insertion property. This provides maximum abstraction to the users by supplying them with limited communication primitives;

  4. Well-known ports. They expose frequently used services such as HTTP, compiling, or FTP;

  5. Partial transfer, to deal with different buffer sizes.

The authors identify some inevitable incompatibilities among those desirable features and propose solutions that keep all of them by making acceptable compromises.

  1. The conflict between the incomplete connectedness of the topology (e.g. a network partition) and providing complete status. Accurate status messages may be blocked by a partition.

  2. The conflict between time-outs and complete status. Even under the strong assumption that the network is reliable, communication is asynchronous, so a status message may not arrive before the time-out. The client then cannot tell whether the message was delivered and accepted (but is still being processed by the server), was delivered and rejected, or whether the status was sent but has not yet been received.

  3. The conflict between time-outs and the insertion property. As in Conflict 2, with time-outs it is impossible to be sure about the exact outcome of the transaction. This situation could not arise had the two processes been directly connected on a centralized system, so the uncertainty violates the insertion property.

The solution to the first 3 conflicts is to return the same ambiguous status in every situation where the exact status cannot be determined. Whether the message was never delivered, or was delivered but the status was not sent in time, or the status was sent but did not reach the client in time, the client receives the same ambiguous status. The tradeoff is that we cannot provide complete status, but we can always provide some status.
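To make the idea concrete, here is a minimal sketch of how a client-side SEND could collapse all uncertain cases into one ambiguous status. The names `Status`, `send_with_timeout`, and the use of a `queue.Queue` as the status channel are my own assumptions for illustration, not something the paper specifies.

```python
import enum
import queue


class Status(enum.Enum):
    ACCEPTED = "accepted"
    REJECTED = "rejected"
    UNKNOWN = "unknown"  # the single ambiguous status reported on time-out


def send_with_timeout(status_channel: queue.Queue, timeout_s: float) -> Status:
    """Wait for a status report; map every uncertain outcome to UNKNOWN.

    status_channel is a hypothetical stand-in for the path on which the
    IPCM would return status to the client.
    """
    try:
        return status_channel.get(timeout=timeout_s)
    except queue.Empty:
        # The message may be lost, still being processed, or its status may
        # be lost or late. The client cannot distinguish these cases.
        return Status.UNKNOWN
```

A client that gets `Status.UNKNOWN` knows only that the deadline passed, which is exactly the "some status, but not complete status" compromise described above.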

  4. The conflict between the strong insertion property and the varying buffer sizes of different processes. Enforcing a universal buffer size is restrictive, and it violates the insertion property by exposing communication details.

Solution: Allow partial transfers.
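As a rough illustration of partial transfer, the sketch below splits one message into pieces that each fit the receiver's buffer. The function name `partial_transfer` and the `(chunk, is_last)` convention are assumptions made for this example only.

```python
from typing import Iterator, Tuple


def partial_transfer(message: bytes, receiver_buffer_size: int) -> Iterator[Tuple[bytes, bool]]:
    """Yield (chunk, is_last) pairs, each chunk fitting the receiver's buffer."""
    if receiver_buffer_size <= 0:
        raise ValueError("receiver_buffer_size must be positive")
    for start in range(0, len(message), receiver_buffer_size):
        end = start + receiver_buffer_size
        # is_last tells the receiver that the message is now complete.
        yield message[start:end], end >= len(message)
```

With this shape, a sender does not need to know the receiver's buffer size in advance; each RECEIVE simply takes as much as it can hold.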

  5. The conflict among partial transfer, time-outs and the insertion property. A RECEIVE request may time out before a complete message is transferred. If we tell the server process about the data already in the buffer, we expose ugly communication details and violate the insertion property; if we don’t tell the server process about the incomplete message, the message is lost.

Solution: Adopt a weaker insertion property and allow buffer sizes to be exposed when necessary.

  6. The conflict among well-known ports, partial transfer and time-outs. Time-outs are necessary here, otherwise the well-known port may be blocked by a very slow message. But new messages arriving after the original SEND request has timed out raise data consistency issues.

Solution: Ban partial transfer on well-known ports. A longer message for the well-known process must use a separate connection, set up after an initial short message is sent to the well-known port. That connection may go through a layer of buffer processes or through another port.
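The handoff can be sketched as follows. This is an in-memory toy under my own assumptions: the class `WellKnownService`, the limit `MAX_SHORT_MESSAGE`, and the port numbering are all hypothetical and only illustrate the "short request first, long transfer elsewhere" pattern.

```python
import itertools

MAX_SHORT_MESSAGE = 64  # hypothetical size limit for the well-known port


class WellKnownService:
    def __init__(self) -> None:
        self._next_port = itertools.count(1000)  # hypothetical private port ids
        self._transfers = {}  # private port id -> list of received chunks

    def open_transfer(self, short_message: bytes) -> int:
        """Accept one short, complete message and hand back a private port."""
        if len(short_message) > MAX_SHORT_MESSAGE:
            raise ValueError("partial transfer is not allowed on the well-known port")
        port = next(self._next_port)
        self._transfers[port] = []
        return port

    def send_chunk(self, port: int, chunk: bytes) -> None:
        """Partial transfers are only accepted on a private port."""
        self._transfers[port].append(chunk)

    def finish(self, port: int) -> bytes:
        """Close the private port and return the reassembled long message."""
        return b"".join(self._transfers.pop(port))
```

Because the well-known port only ever sees short, complete messages, a slow client can stall its own private port but not the shared one.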

  7. The conflict between processes with many ports and partial transfers. With or without a separate buffer for each port, once a complete message arrives while another is still incomplete, the server process has to buffer the incomplete message internally, which violates the insertion property.

Solution: Add a layer of buffer processes, which send only complete messages to the server.
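A buffer process of this kind might look like the sketch below. I model it as a plain object with a `deliver` callback standing in for the server's RECEIVE; the class name `BufferProcess` and the per-sender keying are assumptions for illustration, not the paper's design.

```python
from collections import defaultdict
from typing import Callable


class BufferProcess:
    def __init__(self, deliver: Callable[[str, bytes], None]) -> None:
        self._deliver = deliver  # hypothetical hook: hands a full message to the server
        self._pending = defaultdict(bytearray)  # sender id -> bytes received so far

    def on_chunk(self, sender_id: str, chunk: bytes, is_last: bool) -> None:
        """Accumulate partial transfers; forward only complete messages."""
        self._pending[sender_id] += chunk
        if is_last:
            self._deliver(sender_id, bytes(self._pending.pop(sender_id)))
```

The server never sees a half-finished message, even when chunks from several clients are interleaved, so it needs no internal buffering of its own.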

The paper also proves that it is impossible for the client and server to know that they are in mutual agreement about the status of the original message from the client to the server. The implication is that it is not necessary to send status information more than once: no matter how many times the status and the ACK of the status are sent, the sender of the last message can never be sure that it was received, so mutual agreement is never certain.

Strengths of the paper:

  1. The insertion property is strict but helpful. By exposing limited and general primitives to the application layer, we can easily insert service layers on the communication path without interfering with the users. We have seen a beautiful layer of buffer processes, which spares the service process the trouble of providing internal buffering for partial transfers. One can further imagine that, when the load is heavy, we may add a layer of load-balancing dispatcher processes, without interfering with the client or the server. This powerful insulation and abstraction allows for maximum forward compatibility.

  2. The use of concurrent programming to increase the throughput. The design of buffer processes allows a service process to use the RECEIVE primitive concurrently. This is a huge improvement in terms of throughput, now that a service process doesn’t have to receive messages one by one. This is concurrency on the communication layer, and with the potential implementation of concurrency within the service process itself, the overall efficiency of this CS model is good.

Weaknesses of the paper:

  1. If the clients keep sending requests when the server is already busy, the service process may well run out of buffer space, and data may be lost. Some traffic control may be desirable to ensure that there is enough capacity on the server side.

  2. Dynamically creating buffer processes is costly. A possible optimization is a dynamic process pool, or even a thread pool, for buffering partial transfers, which reduces the overhead of process creation and puts a cap on server capacity.

  3. The time-out feature is necessary in the CS model, for the reasons given above, but the exact time-out window could potentially be optimized by adding a failure detection layer. By dynamically estimating the expected arrival time, we may be able to improve the completeness and accuracy of the time-out feature.
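One way to estimate the time-out window dynamically, as suggested in the last point, is to borrow the smoothed round-trip estimator used for TCP retransmission timers (RFC 6298). This is my own suggestion and not part of the paper; the class name `AdaptiveTimeout` and the default parameters below are assumptions for a minimal sketch.

```python
class AdaptiveTimeout:
    def __init__(self, initial_timeout_s: float = 1.0,
                 alpha: float = 0.125, beta: float = 0.25, k: float = 4.0) -> None:
        self.alpha = alpha      # weight of a new sample in the smoothed RTT
        self.beta = beta        # weight of a new sample in the RTT variation
        self.k = k              # safety margin, in multiples of the variation
        self.srtt = None        # smoothed round-trip time
        self.rttvar = None      # smoothed round-trip variation
        self.timeout_s = initial_timeout_s

    def observe(self, rtt_s: float) -> float:
        """Fold one measured round-trip time into the estimate."""
        if self.srtt is None:
            self.srtt = rtt_s
            self.rttvar = rtt_s / 2
        else:
            # Update the variation first, using the previous smoothed RTT.
            self.rttvar = (1 - self.beta) * self.rttvar + self.beta * abs(self.srtt - rtt_s)
            self.srtt = (1 - self.alpha) * self.srtt + self.alpha * rtt_s
        self.timeout_s = self.srtt + self.k * self.rttvar
        return self.timeout_s
```

The time-out then tracks observed latency instead of being a fixed constant, although it still cannot remove the ambiguity discussed in Conflicts 2 and 3.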

http://peteryaonyu.github.io/2023/02/17/paper-reading---cs-model---some-constraints-and-tradeoffs-in-the-design-of-network-communications/

Author

Yuncheng Yao

Posted on

2023-02-17

Updated on

2024-01-08