
18

Jan 12

WebSockets with RFC 6455



WebSockets allow bi-directional, full-duplex communication between a client and a server. Instead of the client having to issue a request to the server and wait for a response, the client and server can each send data over the line at any time. It works like this: the client initiates a handshake, the server receives that handshake message and replies with its own message, and that's it. Once the handshake is complete, data can be sent at any time. This saves a ton of overhead, because there is no new HTTP request for every message.

Now to build a proof of concept. I started out with Oliver Mezquita Prieto's post, which is written using the hixie-76 protocol. The source contains two projects, a WebSocket server and a WebSocket client. The server is a Windows console application in C# and the client is HTML and JavaScript.

There were two major differences between the hixie-76 protocol and the more current RFC 6455 version that Chrome 16 uses.

  1. Handshake
  2. Data Framing

The Handshake
Chrome 16 sends the server a handshake that looks like this.

[0]: "GET /test HTTP/1.1"
[1]: "Upgrade: websocket"
[2]: "Connection: Upgrade"
[3]: "Host: localhost:8181"
[4]: "Origin: http://localhost:8080"
[5]: "Sec-WebSocket-Key: 3d+7Mq6H6kr1PhIho1cGMA=="
[6]: "Sec-WebSocket-Version: 13"
[7]:
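To make the listing above concrete, here is a rough sketch in Python (not the C# from the original project) of pulling the headers out of the raw request bytes. The function name parse_handshake is hypothetical, and the sketch assumes the whole request arrived in one read.

```python
def parse_handshake(raw: bytes):
    """Split a raw HTTP upgrade request into its request line and headers."""
    # Everything before the first blank line is the header section.
    head = raw.split(b"\r\n\r\n", 1)[0].decode("ascii")
    request_line, *header_lines = head.split("\r\n")
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        # Header names are case-insensitive, so normalise them to lower case.
        headers[name.strip().lower()] = value.strip()
    return request_line, headers


request = (
    b"GET /test HTTP/1.1\r\n"
    b"Upgrade: websocket\r\n"
    b"Connection: Upgrade\r\n"
    b"Host: localhost:8181\r\n"
    b"Origin: http://localhost:8080\r\n"
    b"Sec-WebSocket-Key: 3d+7Mq6H6kr1PhIho1cGMA==\r\n"
    b"Sec-WebSocket-Version: 13\r\n"
    b"\r\n"
)
request_line, headers = parse_handshake(request)
print(headers["sec-websocket-key"])
```

The only header the server really has to act on is Sec-WebSocket-Key; the others are useful for checking that the request is a WebSocket upgrade at all.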

The server needs to take the Sec-WebSocket-Key and transform it for use in its response. The steps are:

  1. Concatenate the "magic" GUID "258EAFA5-E914-47DA-95CA-C5AB0DC85B11" to the given key. This results in "3d+7Mq6H6kr1PhIho1cGMA==258EAFA5-E914-47DA-95CA-C5AB0DC85B11".
  2. Compute the SHA1 hash of that string.
  3. Base64-encode the 20 raw bytes of the hash.

The function is

[gist id=1635057]

The resulting value must be sent to the client in the "Sec-WebSocket-Accept" header.
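For readers who cannot see the gist, the computation can be sketched in a few lines of Python. This is an illustration only, not the C# function from the post; compute_accept and WEBSOCKET_GUID are names I made up for it.

```python
import base64
import hashlib

# The fixed GUID defined by RFC 6455.
WEBSOCKET_GUID = "258EAFA5-E914-47DA-95CA-C5AB0DC85B11"


def compute_accept(sec_websocket_key: str) -> str:
    """Derive the Sec-WebSocket-Accept value from the client's key."""
    # Concatenate the key and the GUID, then hash the result with SHA1.
    digest = hashlib.sha1((sec_websocket_key + WEBSOCKET_GUID).encode("ascii")).digest()
    # The header carries the Base64 encoding of the raw 20-byte digest.
    return base64.b64encode(digest).decode("ascii")


print(compute_accept("3d+7Mq6H6kr1PhIho1cGMA=="))
```

Note that the hash is Base64-encoded from its raw bytes, not from a hex string; mixing those up is an easy way to get a handshake the browser rejects.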

[gist id=1635232]

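The final response looks like this. Each header line must end with \r\n (carriage return + line feed, or ASCII 13 followed by ASCII 10), and the last header must be followed by an additional \r\n (making two in a row) to indicate the end of the response. The WebSocket-Origin and WebSocket-Location lines are carried over from the older protocol; RFC 6455 does not define them, and the status line, Upgrade, Connection and Sec-WebSocket-Accept are the lines that matter.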

[0]: "HTTP/1.1 101 Switching Protocols"
[1]: "Upgrade: websocket"
[2]: "Connection: Upgrade"
[3]: "WebSocket-Origin: http://localhost:8080"
[4]: "WebSocket-Location: ws://localhost:8181/test"
[5]: "Sec-WebSocket-Accept: btX/LN887VqIxyneBXdiC+9z1MA="
[6]:
[7]:
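The line-ending rule is easy to get wrong, so here is a minimal Python sketch of assembling the response bytes. The helper name build_handshake_response is hypothetical, and the sketch leaves out the WebSocket-Origin and WebSocket-Location lines on the assumption that an RFC 6455 client does not need them.

```python
def build_handshake_response(accept: str) -> bytes:
    """Assemble the 101 response for a given Sec-WebSocket-Accept value."""
    lines = [
        "HTTP/1.1 101 Switching Protocols",
        "Upgrade: websocket",
        "Connection: Upgrade",
        "Sec-WebSocket-Accept: " + accept,
    ]
    # Every line ends with CRLF, and one extra CRLF terminates the header block.
    return ("\r\n".join(lines) + "\r\n\r\n").encode("ascii")


response = build_handshake_response("btX/LN887VqIxyneBXdiC+9z1MA=")
print(response)
```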

Data Framing
Once the handshake is complete, the client and server can send messages at any time. Every message, in either direction, must be wrapped in a frame that tells the recipient how to parse the data. The following figure shows the layout of a frame.

      0                   1                   2                   3
      0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1
     +-+-+-+-+-------+-+-------------+-------------------------------+
     |F|R|R|R| opcode|M| Payload len |    Extended payload length    |
     |I|S|S|S|  (4)  |A|     (7)     |             (16/64)           |
     |N|V|V|V|       |S|             |   (if payload len==126/127)   |
     | |1|2|3|       |K|             |                               |
     +-+-+-+-+-------+-+-------------+ - - - - - - - - - - - - - - - +
     |     Extended payload length continued, if payload len == 127  |
     + - - - - - - - - - - - - - - - +-------------------------------+
     |                               |Masking-key, if MASK set to 1  |
     +-------------------------------+-------------------------------+
     | Masking-key (continued)       |          Payload Data         |
     +-------------------------------- - - - - - - - - - - - - - - - +
     :                     Payload Data continued ...                :
     + - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - +
     |                     Payload Data continued ...                |
     +---------------------------------------------------------------+

The important fields to mention here are

  • opcode – indicates the type of data in the message. In our example, 0001 (0x1) indicates text.
  • mask – indicates whether the payload is masked. RFC 6455 requires every frame sent from the client to the server to be masked (1), while frames sent from the server to the client must not be masked (0).
  • Payload len – length of the message if it is under 126. A value of 126 means the real length is in the next 2 bytes, and 127 means it is in the next 8 bytes (the Extended payload length). In our example, we will force the length to be under 126 and not worry about the Extended payload length.
  • Masking-key – four randomly generated bytes, present only when mask is 1, that are XORed with the payload data for security.
  • Payload Data – the message, masked if the mask bit is set.
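To show how those fields sit in the first two bytes of a frame, here is a small Python sketch that unpacks them with bit masks. The function name parse_frame_header and the dictionary keys are my own, chosen for illustration.

```python
def parse_frame_header(first: int, second: int) -> dict:
    """Unpack the fixed two-byte prefix of a WebSocket frame."""
    return {
        "fin": bool(first & 0x80),      # top bit of byte 0: final fragment
        "opcode": first & 0x0F,         # low 4 bits of byte 0: 0x1 is text
        "masked": bool(second & 0x80),  # top bit of byte 1: MASK flag
        "payload_len": second & 0x7F,   # low 7 bits of byte 1
    }


# 0x81 0x85 is the start of a masked, 5-byte text frame from a client.
print(parse_frame_header(0x81, 0x85))
# → {'fin': True, 'opcode': 1, 'masked': True, 'payload_len': 5}
```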

Here is the function for encoding a text message.

[gist id=1635265]
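Since the gist may not be visible, here is a hedged Python sketch of encoding a text frame on the server side. It is not the C# from the gist: it assumes the frame goes from server to client (so it is left unmasked, as RFC 6455 requires), and it handles the extended lengths even though the example in this post stays under 126 bytes. The name encode_text_frame is hypothetical.

```python
import struct


def encode_text_frame(message: str) -> bytes:
    """Wrap a string in a single unmasked text frame (server to client)."""
    payload = message.encode("utf-8")
    header = bytearray([0x81])  # FIN=1, RSV1-3=0, opcode=0x1 (text)
    length = len(payload)
    if length < 126:
        header.append(length)                # length fits in the 7-bit field
    elif length < 65536:
        header.append(126)                   # marker: 16-bit length follows
        header += struct.pack("!H", length)
    else:
        header.append(127)                   # marker: 64-bit length follows
        header += struct.pack("!Q", length)
    return bytes(header) + payload


print(encode_text_frame("Hello").hex())
# → 810548656c6c6f
```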

Messages received from the client are framed the same way and are always masked, so they need to be decoded before use.

[gist id=1635108]
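As a stand-in for the gist, here is a minimal Python sketch of the decoding side. It assumes a single, complete, unfragmented text frame is already in memory, and decode_frame is a made-up name. Unmasking is a byte-by-byte XOR with the 4-byte masking key, cycling through the key.

```python
import struct


def decode_frame(frame: bytes) -> str:
    """Extract the text payload from one complete frame."""
    length = frame[1] & 0x7F
    offset = 2
    if length == 126:
        (length,) = struct.unpack("!H", frame[2:4])   # 16-bit extended length
        offset = 4
    elif length == 127:
        (length,) = struct.unpack("!Q", frame[2:10])  # 64-bit extended length
        offset = 10
    if frame[1] & 0x80:
        # MASK bit set: the next 4 bytes are the masking key.
        mask = frame[offset:offset + 4]
        offset += 4
        data = frame[offset:offset + length]
        payload = bytes(b ^ mask[i % 4] for i, b in enumerate(data))
    else:
        payload = frame[offset:offset + length]
    return payload.decode("utf-8")


# A masked "Hello" frame as a client would send it.
print(decode_frame(bytes.fromhex("818537fa213d7f9f4d5158")))
# → Hello
```

A real server would also need to check the opcode and handle partial reads and fragmented messages, which this sketch skips.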

Resources
http://www.undisciplinedbytes.com/2010/06/html-5-c-web-sockets-server-and-asp-net-client-implementation/
http://www.codeproject.com/KB/HTML/Web-Socket-in-Essence.aspx
http://tools.ietf.org/html/draft-ietf-hybi-thewebsocketprotocol-10#page-15
https://github.com/lemmingzshadow/php-websocket/blob/master/server/lib/WebSocket/Connection.php
http://stackoverflow.com/questions/7040078/how-to-deconstruct-data-frames-in-websockets-hybi-08
