The concepts you should know before creating a real-time application

Based on the official Spring Framework documentation


It’s important to understand that HTTP is used only for the initial handshake, which relies on a mechanism built into HTTP to request a protocol upgrade (or in this case a protocol switch), to which the server can respond with HTTP status 101 (Switching Protocols) if it agrees. Assuming the handshake succeeds, the TCP socket underlying the HTTP upgrade request remains open and both client and server can use it to send messages to each other.

An important challenge to adoption is the lack of support for WebSocket in some browsers. Furthermore, some restrictive proxies may be configured in ways that either preclude the attempt to do an HTTP upgrade or otherwise break the connection after some time because it has remained open for too long. Therefore, to build a WebSocket application today, fallback options are required in order to simulate the WebSocket API where necessary. The Spring Framework provides such transparent fallback options based on the SockJS protocol.

Using WebSocket raises design considerations that are important to recognize early on, especially in contrast to what we know about building web applications today. Today REST is a widely accepted, understood, and supported architecture for building web applications. It relies on having many URLs (nouns), a handful of HTTP methods (verbs), and other principles such as using hypermedia (links), remaining stateless, etc. By contrast, a WebSocket application may use a single URL only for the initial HTTP handshake. All messages thereafter share and flow on the same TCP connection. This points to an entirely different, asynchronous, event-driven, messaging architecture.

WebSocket does imply a messaging architecture but does not mandate the use of any specific messaging protocol. It is a very thin layer over TCP that transforms a stream of bytes into a stream of messages and not much more. It is up to applications to interpret the meaning of a message.
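
To make the "thin layer over TCP" idea concrete, here is a minimal sketch of turning the bytes of one WebSocket frame back into a message. It assumes a single, short frame (payload under 126 bytes) and ignores fragmentation and control frames; the function name decode_frame is made up for this illustration.

```python
from typing import Tuple


def decode_frame(data: bytes) -> Tuple[int, bytes]:
    """Decode one short WebSocket frame into (opcode, payload)."""
    first, second = data[0], data[1]
    opcode = first & 0x0F            # 0x1 = text message, 0x2 = binary message
    masked = bool(second & 0x80)     # frames sent by a client are masked
    length = second & 0x7F           # 7-bit payload length
    if length >= 126:
        raise ValueError("extended payload lengths are not handled in this sketch")
    offset = 2
    mask = b""
    if masked:
        mask = data[offset:offset + 4]   # 4-byte masking key
        offset += 4
    payload = data[offset:offset + length]
    if masked:
        # Unmask by XOR-ing each byte with the key, cycling every 4 bytes.
        payload = bytes(b ^ mask[i % 4] for i, b in enumerate(payload))
    return opcode, payload


# An unmasked text frame carrying "Hello" (sample bytes from RFC 6455).
print(decode_frame(bytes([0x81, 0x05, 0x48, 0x65, 0x6C, 0x6C, 0x6F])))
```

Note that the frame only says "text" or "binary" and how long the payload is. Nothing in it tells a framework where the message should be routed, which is the point the next paragraph makes.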

Unlike HTTP, which is an application-level protocol, in the WebSocket protocol there is simply not enough information in an incoming message for a framework or container to know how to route it or process it. Therefore WebSocket is arguably too low level for anything but a very trivial application.

For this reason the WebSocket RFC defines the use of sub-protocols. During the handshake, the client and server can use the header Sec-WebSocket-Protocol to agree on a sub-protocol, i.e. a higher, application-level protocol to use. The use of a sub-protocol is not required, but even if not used, applications will still need to choose a message format that both the client and server can understand. The Spring Framework provides support for using STOMP.
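
As a rough illustration of that negotiation, the sketch below picks a sub-protocol from the value of a Sec-WebSocket-Protocol request header. The function name choose_subprotocol and the server's supported list are assumptions for this example; v10.stomp, v11.stomp, and v12.stomp are the sub-protocol names registered for STOMP.

```python
from typing import List, Optional


def choose_subprotocol(header_value: str, supported: List[str]) -> Optional[str]:
    """Return the first client-requested sub-protocol the server also supports."""
    # The header is a comma-separated list, in the client's order of preference.
    requested = [p.strip() for p in header_value.split(",") if p.strip()]
    for protocol in requested:
        if protocol in supported:
            return protocol
    return None  # no agreement: the handshake proceeds without a sub-protocol


# The client offers STOMP 1.2 and 1.1; this hypothetical server speaks 1.1 and 1.0.
print(choose_subprotocol("v12.stomp, v11.stomp", ["v11.stomp", "v10.stomp"]))
```

The chosen value is what the server would echo back in its own Sec-WebSocket-Protocol response header.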

A WebSocket client establishes a connection by sending an HTTP request, a step called the handshake. The request includes the Upgrade and Connection headers, as shown below.

GET ws://localhost:3000/sockjs-node HTTP/1.1
Host: localhost:3000
Connection: Upgrade
Pragma: no-cache
Cache-Control: no-cache
User-Agent: Mozilla/5.0 (Macintosh; Intel Mac OS X 10_13_6) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/84.0.4147.89 Safari/537.36
Upgrade: websocket
Origin: http://localhost:3000
Sec-WebSocket-Version: 13
Accept-Encoding: gzip, deflate, br
Accept-Language: ko-KR,ko;q=0.9,en-US;q=0.8,en;q=0.7,ja;q=0.6
Sec-WebSocket-Key: xwGnajy+I6YJ/AW7pTKioA==
Sec-WebSocket-Extensions: permessage-deflate; client_max_window_bits

The server responds with a 101 Switching Protocols status code as shown below, and the TCP connection is kept open after the handshake.

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Accept: 6Ux2cxOp2HhzP9SLCuADGUKiLbU=
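
The Sec-WebSocket-Accept header in the response is derived from the Sec-WebSocket-Key sent in the request. The sketch below follows the rule from RFC 6455: append a fixed GUID to the key, take the SHA-1 hash, and Base64-encode it. The function name accept_value is made up for this example, and the key used to check it is the sample from the RFC rather than the one in the exchange above.

```python
import base64
import hashlib

# Fixed GUID defined by RFC 6455 for the opening handshake.
WS_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def accept_value(sec_websocket_key: str) -> str:
    """Compute the Sec-WebSocket-Accept value for a given Sec-WebSocket-Key."""
    digest = hashlib.sha1((sec_websocket_key + WS_GUID).encode("ascii")).digest()
    return base64.b64encode(digest).decode("ascii")


# Sample key from RFC 6455.
print(accept_value("dGhlIHNhbXBsZSBub25jZQ=="))  # s3pPLMBiTxaQ9kYGzzhZRbK+xOo=
```

A client can run the same calculation on the key it sent and compare the result with the header it received, which confirms that the server actually understood the WebSocket handshake.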

1. Traditional Polling

Traditional polling sends HTTP requests at a fixed interval to check for new information. Because requests are sent continuously whether or not anything has changed, the cost of setting up a connection is paid again and again, and the server carries the load of answering every request.
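
The cost can be seen in a small simulation. This is only a sketch with no real network: each loop iteration stands for one polling interval, and the names poll and run_polling are hypothetical.

```python
def poll(server_state: dict, last_seen: int):
    """One polling request: return the new version if it changed, else None."""
    version = server_state["version"]
    return version if version != last_seen else None


def run_polling(change_ticks: set, total_ticks: int):
    """Poll once per tick and count requests sent versus updates received."""
    state = {"version": 0}
    last_seen = 0
    requests = 0
    updates = 0
    for tick in range(total_ticks):
        if tick in change_ticks:
            state["version"] += 1      # the data changes on the server
        requests += 1                  # the client asks on every tick regardless
        result = poll(state, last_seen)
        if result is not None:
            last_seen = result
            updates += 1
    return requests, updates


# Two changes in ten intervals: ten requests are sent, only two carry news.
print(run_polling({3, 7}, 10))  # (10, 2)
```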

2. Long Polling

Long Polling improves on Traditional Polling. The client sends a request, and the server holds it open, responding only when there is a change and then closing the connection. The client then immediately sends another request and waits for the next change. A request cannot be held open indefinitely: the browser waits about 5 minutes, and an intermediate proxy may close the connection sooner. Long Polling is efficient when changes occur at irregular intervals, but when changes are frequent it behaves no differently from Traditional Polling and the load on the server increases.
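
The hold-then-respond cycle can be sketched without a real HTTP server by letting a blocking queue stand in for the held request. Everything here (long_poll, client_loop, the event name, the timeouts) is an assumption for illustration.

```python
import queue
import threading


def long_poll(events: queue.Queue, timeout: float):
    """Hold the 'request' open until an event arrives or the timeout expires."""
    try:
        return events.get(timeout=timeout)
    except queue.Empty:
        return None  # timed out with no change; the client just asks again


def client_loop(events: queue.Queue, rounds: int, timeout: float = 0.5):
    """Issue one long-poll request after another and collect what arrives."""
    received = []
    for _ in range(rounds):
        message = long_poll(events, timeout)
        if message is not None:
            received.append(message)
    return received


if __name__ == "__main__":
    events = queue.Queue()
    # Simulate the server producing one change shortly after the client starts waiting.
    threading.Timer(0.05, events.put, args=("price-updated",)).start()
    print(client_loop(events, rounds=2, timeout=0.3))
```

The first request returns as soon as the event is published; the second waits out the full timeout and returns nothing, which mirrors a held request being closed by the browser or a proxy.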

3. HTTP Streaming

HTTP Streaming sends an HTTP request in the same way as Long Polling, but the server keeps the connection open even after responding to the client with a change. Instead of establishing and tearing down a connection each time, one connection is maintained and each change is delivered as part of the response. HTTP Streaming can reduce the load on the server compared to Long Polling, but handling several changes at once becomes difficult, because the server must deliver all of the currently updated data so that the client can tell where the next update starts. In addition, HTTP Streaming provides a degree of real-time delivery for messages from the server to the client, but requests from the client to the server still need a new connection. Given these trade-offs in concurrency and server load, Long Polling is said to be used more often than HTTP Streaming.
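
One common way to deliver several updates over a single open response is HTTP/1.1 chunked transfer encoding, where each piece is prefixed with its size in hexadecimal. The sketch below only produces the bytes of such a response body; the function names and the sample updates are assumptions, and the text above does not say which streaming mechanism a given server uses.

```python
def encode_chunk(data: bytes) -> bytes:
    """Frame one piece of data as an HTTP/1.1 chunk: hex size, CRLF, data, CRLF."""
    return f"{len(data):x}\r\n".encode("ascii") + data + b"\r\n"


def stream_updates(updates):
    """Yield each update as its own chunk on the same response, then end the stream."""
    for update in updates:
        yield encode_chunk(update.encode("utf-8"))
    yield b"0\r\n\r\n"  # a zero-length chunk marks the end of the response


for chunk in stream_updates(["a", "hello world!"]):
    print(chunk)
```

The size prefix is what lets the client find the boundary between one update and the next while the connection stays open.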

4. WebSocket

WebSocket was created to solve the problems of HTTP Long Polling and HTTP Streaming described above, and to enable two-way communication between server and client. WebSocket makes it possible to build dynamic services, but sometimes AJAX, HTTP Streaming, and HTTP Long Polling are more effective. For example, if changes are infrequent and the data is small, those techniques can be sufficient. Conversely, if you need real-time delivery and changes are frequent, WebSocket can be a good solution.

Overview of SockJS

The goal of SockJS is to let applications use a WebSocket API but fall back to non-WebSocket alternatives when necessary at runtime, i.e. without the need to change application code.

SockJS is designed for use in browsers. It goes to great lengths to support a wide range of browser versions using a variety of techniques. Transports fall in 3 general categories: WebSocket, HTTP Streaming, and HTTP Long Polling.

The SockJS client begins by sending “GET /info” to obtain basic information from the server. After that it must decide what transport to use. If possible WebSocket is used. If not, in most browsers there is at least one HTTP Streaming option and if not then HTTP (Long) Polling is used.
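
The decision described above can be sketched as a simple fallback chain. This is a simplification and not the actual SockJS client logic: the function choose_transport and its parameters are hypothetical, while the "websocket" flag in the /info response and the transport names xhr-streaming and xhr-polling come from SockJS.

```python
from typing import List


def choose_transport(info: dict, browser_has_websocket: bool,
                     browser_streaming: List[str]) -> str:
    """Pick a transport: WebSocket first, then streaming, then long polling."""
    # The /info response tells the client whether the server allows WebSocket.
    if info.get("websocket") and browser_has_websocket:
        return "websocket"
    # Otherwise use the first streaming transport this browser supports.
    if browser_streaming:
        return browser_streaming[0]
    # Last resort: HTTP (long) polling.
    return "xhr-polling"


print(choose_transport({"websocket": True}, True, ["xhr-streaming"]))
print(choose_transport({"websocket": False}, True, ["xhr-streaming"]))
print(choose_transport({"websocket": False}, False, []))
```

Because the choice happens inside the client library at runtime, the application code that sends and receives messages stays the same whichever transport is selected.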

STOMP Over WebSocket Messaging Architecture

The WebSocket protocol defines two main types of messages — text and binary — but leaves their content undefined. Instead it’s expected that the client and server may agree on using a sub-protocol, i.e. a higher-level protocol that defines the message content. Using a sub-protocol is optional but either way the client and server both need to understand how to interpret messages.

STOMP is a simple text-oriented messaging protocol that was originally created for scripting languages to connect to enterprise message brokers. It is designed to address a subset of commonly used patterns in messaging protocols. STOMP can be used over any reliable 2-way streaming network protocol such as TCP and WebSocket.
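
To show what "text-oriented" means in practice, here is a minimal sketch of building and parsing a STOMP frame: a command line, header lines in key:value form, a blank line, the body, and a terminating NUL character. The helper names and the destination /app/chat are made up for this example, and the sketch skips header escaping and the content-length header.

```python
from typing import Dict, Tuple


def build_frame(command: str, headers: Dict[str, str], body: str = "") -> str:
    """Serialize a STOMP frame: command, headers, blank line, body, NUL."""
    lines = [command] + [f"{key}:{value}" for key, value in headers.items()]
    return "\n".join(lines) + "\n\n" + body + "\x00"


def parse_frame(raw: str) -> Tuple[str, Dict[str, str], str]:
    """Split a STOMP frame back into (command, headers, body)."""
    frame = raw.split("\x00", 1)[0]            # drop the NUL terminator
    head, _, body = frame.partition("\n\n")    # a blank line separates headers from body
    command, *header_lines = head.split("\n")
    headers = dict(line.split(":", 1) for line in header_lines)
    return command, headers, body


frame = build_frame("SEND", {"destination": "/app/chat", "content-type": "text/plain"}, "hello")
print(repr(frame))
print(parse_frame(frame))
```

The destination header is what gives a framework enough information to route the message, which a raw WebSocket message lacks.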

The following summarizes the benefits for an application of using STOMP over WebSockets:

  1. Standard message format
  2. Application-level protocol with support for common messaging patterns
  3. Client-side support, e.g. stomp.js, msgs.js
  4. The ability to interpret, route, and process messages on both the client and server-side
  5. The option to plug in a message broker (RabbitMQ, ActiveMQ, and many others) to broadcast messages

Most importantly the use of STOMP enables the Spring Framework to provide a programming model for application-level use in the same way that Spring MVC provides a programming model based on HTTP.
