Protocol packets using Data Structures

In this article, we will see the use of direct type-casting in C for converting one structure into another. A network byte stream packet will be cast into desired data-types rather than explicitly reading out the stream byte by byte. Note that this technique works only when protocol byte ordering and that of the system are identical.

While working with firmware and embedded systems, creating and parsing packets is akin to breathing air - inhale, parse packet, exhale, create packet. Packets may be parsed or created for standard as well as proprietary protocols. The most widespread method to do so is what I call "The Programmer Memory and i++ method." In this method, the programmer has to remember which byte sits at which position in the packet, and also what comes after it (thus using their own memory to make things work). They must also parse or fill a packet iteratively with the help of incrementing counters like i++. I personally do not like variable names like i, j, my_i, my_j, etc. Although this is the most widespread method, it is prone to the following problems:

  1. A programmer can get the protocol wrong while parsing or creating packets - memory may fail, a field may be missed, or fields may get jumbled up. The compiler cannot check the validity of what has been done with the byte stream.
  2. Packet creation or parsing of this kind is very difficult to debug.
  3. Yeah - you just do what others tell you to, the conventional way of playing with packets, and you don't think outside the box.
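For reference, here is a minimal sketch of what the "Programmer Memory and i++" style tends to look like. It assumes an 8-byte little-endian frame with a 1-byte command ID, a 2-byte object index, a 1-byte sub index and 4 bytes of data (the layout used later in this article). The type parsed_sdo_s and the function parse_sdo_by_index are hypothetical names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical holder for the parsed fields; not part of the original code. */
typedef struct
{
	uint8_t  command_id;
	uint16_t object_idx;
	uint8_t  object_sub_idx;
	uint32_t object_data;
} parsed_sdo_s;

/* Index-walking parser: every offset and width lives only in the programmer's head. */
parsed_sdo_s parse_sdo_by_index ( const uint8_t frame[8] )
{
	parsed_sdo_s out;
	size_t i = 0;

	out.command_id     = frame[i++];
	out.object_idx     = (uint16_t) frame[i++];                      /* LSB first */
	out.object_idx    |= (uint16_t) ( (uint16_t) frame[i++] << 8 );  /* then MSB  */
	out.object_sub_idx = frame[i++];
	out.object_data    = (uint32_t) frame[i++];
	out.object_data   |= (uint32_t) frame[i++] << 8;
	out.object_data   |= (uint32_t) frame[i++] << 16;
	out.object_data   |= (uint32_t) frame[i++] << 24;

	/* The only safety net is a manual count of the bytes consumed. */
	assert ( 8 == i );
	return out;
}
```

Swapping two of these lines, or forgetting one, still compiles cleanly, which is exactly the weakness listed above.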

Besides, it is cumbersome. I'm lazy at the core. So, why should I have to work my memory when I can make the computer do my bidding? Here I will show you how I use data structures to manipulate packets easily, without worrying that I'll forget members. The following example is taken from creating/parsing CANopen packets, which mostly[1] obey little-endian byte ordering. A large fraction of embedded devices are also little-endian, so this technique benefits from the system's memory ordering matching the protocol's endianness.

To explain this further, I'll take the example of a packet that is sent over the CANopen protocol between the master and the slave. This is called an SDO, and all you need to know about it is the following:

  1. An SDO is transmitted between a Master and a Slave.
  2. An SDO is at most 8 bytes long.
  3. The 1st byte of an SDO, called the command ID, identifies the kind of SDO:
    1. Tx SDO : From Master to Slave
    2. Rx SDO : From Slave to Master
    3. Segmented Rx Tx SDO : Long SDO messages broken into frames of 8 bytes
  4. The next 7 bytes of the SDO contain information/data depending on the type of SDO:
    • Tx SDO/Rx SDO :
      • The first 2 bytes contain the Object Index in Little Endian
      • The third byte contains the Object Sub Index
      • The remaining four bytes contain the Data
    • Segmented Rx Tx SDO :
      • All 7 bytes contain data

This information is pictorially represented in the figure below.

So in order to create a new SDO packet or parse any one that is received, I'll define an SDO data structure. But prior to that, I'll define the size of each of these frames and sub-frames as:

/*----------------------------------------------------------------------------*/
/*!@brief macro to define the number of bytes in a SDO Frame */
#define CAN_OPEN_SDO_NUM_BYTES_IN_FRAME                             ( 8 )

/*----------------------------------------------------------------------------*/
/*!@brief macro to define the number of bytes for command ID in a SDO Frame */
#define CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES                           ( 1 )

/*----------------------------------------------------------------------------*/
/*!@brief macro to define the number of bytes for object index in SDO Message */
#define CAN_OPEN_SDO_NUM_OBJECT_INDEX_BYTES                         ( 2 )

/*----------------------------------------------------------------------------*/
/*!@brief macro to define the number of bytes for sub index in SDO Message */
#define CAN_OPEN_SDO_NUM_OBJECT_SUB_INDEX_BYTES                     ( 1 )

/*----------------------------------------------------------------------------*/
/*!@brief macro to define the maximum number of object data bytes in a SDO Message that carries an object index */
#define CAN_OPEN_SDO_MAX_OBJECT_DATA_BYTES_WITH_OBJECT_INDEX        ( CAN_OPEN_SDO_NUM_BYTES_IN_FRAME - \
                                                                      CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES - \
                                                                      CAN_OPEN_SDO_NUM_OBJECT_INDEX_BYTES - \
                                                                      CAN_OPEN_SDO_NUM_OBJECT_SUB_INDEX_BYTES )

/*----------------------------------------------------------------------------*/
/*!@brief macro to define the maximum number of object data bytes in a segmented Tx Rx SDO Message */
#define CAN_OPEN_SDO_MAX_SEGMENTED_RX_TX_OBJECT_DATA_BYTES          ( CAN_OPEN_SDO_NUM_BYTES_IN_FRAME - \
                                                                      CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES )

I think that the above code segment is relatively straightforward. The reason I defined them as macros is that I do not want to use magic numbers in my data structures.

Magic numbers in programming are unique values with unexplained meaning or multiple occurrences which could (preferably) be replaced with named constants. [2]
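One side benefit of named constants is that their arithmetic can be checked by the compiler. The following is a small sketch, assuming a C11 compiler for _Static_assert. The macros are repeated here (in compact form) only so that the snippet stands alone, and check_sdo_sizes is a hypothetical helper name.

```c
#include <assert.h>

#define CAN_OPEN_SDO_NUM_BYTES_IN_FRAME          ( 8 )
#define CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES        ( 1 )
#define CAN_OPEN_SDO_NUM_OBJECT_INDEX_BYTES      ( 2 )
#define CAN_OPEN_SDO_NUM_OBJECT_SUB_INDEX_BYTES  ( 1 )

#define CAN_OPEN_SDO_MAX_OBJECT_DATA_BYTES_WITH_OBJECT_INDEX \
	( CAN_OPEN_SDO_NUM_BYTES_IN_FRAME - CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES - \
	  CAN_OPEN_SDO_NUM_OBJECT_INDEX_BYTES - CAN_OPEN_SDO_NUM_OBJECT_SUB_INDEX_BYTES )

#define CAN_OPEN_SDO_MAX_SEGMENTED_RX_TX_OBJECT_DATA_BYTES \
	( CAN_OPEN_SDO_NUM_BYTES_IN_FRAME - CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES )

/* Compile-time checks: the build fails if the derived sizes ever drift. */
_Static_assert ( 4 == CAN_OPEN_SDO_MAX_OBJECT_DATA_BYTES_WITH_OBJECT_INDEX,
                 "expedited SDO must leave 4 data bytes" );
_Static_assert ( 7 == CAN_OPEN_SDO_MAX_SEGMENTED_RX_TX_OBJECT_DATA_BYTES,
                 "segmented SDO must leave 7 data bytes" );

/* Runtime counterpart of the same checks, e.g. for use in a unit test. */
void check_sdo_sizes ( void )
{
	assert ( 4 == CAN_OPEN_SDO_MAX_OBJECT_DATA_BYTES_WITH_OBJECT_INDEX );
	assert ( 7 == CAN_OPEN_SDO_MAX_SEGMENTED_RX_TX_OBJECT_DATA_BYTES );
}
```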

Now, I'll define the data structure for the SDO packet:

/*----------------------------------------------------------------------------*/
/*!@brief struct to define the SDO Packet */
typedef struct
{
	union
	{
		/*!> Data field in bytes */
		BYTE data_field[CAN_OPEN_SDO_NUM_BYTES_IN_FRAME];

		struct
		{
			/*!> Command ID */
			BYTE command_id[CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES];

			union
			{
				/* Rest of the bytes */
				BYTE frame_field[CAN_OPEN_SDO_NUM_BYTES_IN_FRAME - CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES];

				/* SDO Tx */
				struct
				{
					union
					{
						/*!> object_idx[0] = LSB; object_idx[1] = MSB */
						BYTE object_idx[CAN_OPEN_SDO_NUM_OBJECT_INDEX_BYTES];

						/* Directly readable object Index of two bytes */
						uint16 object_idx_16;

					}object_idx_u;
					
					/*!> object Sub Index */
					BYTE object_sub_idx[CAN_OPEN_SDO_NUM_OBJECT_SUB_INDEX_BYTES];

					/*!> Object data */
					union
					{
						BYTE object_data[CAN_OPEN_SDO_MAX_OBJECT_DATA_BYTES_WITH_OBJECT_INDEX];
						uint32 object_data_32;
					}object_data_u;
				}tx;

				/* SDO Rx */
				struct
				{
					union
					{
						/*!> object_idx[0] = LSB; object_idx[1] = MSB */
						BYTE object_idx[CAN_OPEN_SDO_NUM_OBJECT_INDEX_BYTES];

						/* Directly readable object Index of two bytes */
						uint16 object_idx_16;
					}object_idx_u;
					
					/*!> object Sub Index */
					BYTE object_sub_idx[CAN_OPEN_SDO_NUM_OBJECT_SUB_INDEX_BYTES];

					/*!> Object data */
					union
					{
						BYTE object_data[CAN_OPEN_SDO_MAX_OBJECT_DATA_BYTES_WITH_OBJECT_INDEX];
						uint32 object_data_32;
					}object_data_u;
				}rx;

				/* Segmented Rx Tx */
				struct
				{
					/*!> Object data */
					BYTE object_data[CAN_OPEN_SDO_MAX_SEGMENTED_RX_TX_OBJECT_DATA_BYTES];
				}segmented_data_rx_tx;
			}msg_u;
		}data_field_s;
	}data_field_u;
}CANopen_sdo_data_field_s;


It is very easy and efficient! With these data structures, I've essentially created a programmatic representation of the figure above and the standard mandated structure in C semantics. Let's see what I did here:

Within the outer-most structure CANopen_sdo_data_field_s, I've created a union named data_field_u that holds two members: a BYTE array data_field of 8 bytes and another structure data_field_s. Both members overlay the same memory, so as long as data_field_s is laid out without padding, the packet is always contiguous and 8 bytes long. So the packet can be accessed either as:

  • An element of the array data_field[]
  • An element of the structure data_field_s

Just keep in mind that since the frame is at most 8 bytes, the definition of data_field_s must not exceed 8 bytes either, or else it is a mistake. Throughout the implementation, I've exploited this union membership. Okay, let's go just one level deeper for more clarity.
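The 8-byte requirement can be enforced at compile time rather than remembered. The sketch below uses a flattened, simplified stand-in for the real structure (packed_sdo_u and unpacked_sdo_u are hypothetical names), assumes BYTE, uint16 and uint32 are plain fixed-width typedefs, and assumes a compiler that supports #pragma pack (GCC, Clang and MSVC do) as well as C11 _Static_assert.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed definitions of the article's integer types. */
typedef uint8_t  BYTE;
typedef uint16_t uint16;
typedef uint32_t uint32;

/* Packed view: no padding may be inserted between members. */
#pragma pack(push, 1)
typedef union
{
	BYTE data_field[8];
	struct
	{
		BYTE   command_id[1];
		uint16 object_idx_16;
		BYTE   object_sub_idx[1];
		uint32 object_data_32;
	} data_field_s;
} packed_sdo_u;
#pragma pack(pop)

/* Same members without packing: the compiler is free to add padding. */
typedef union
{
	BYTE data_field[8];
	struct
	{
		BYTE   command_id[1];
		uint16 object_idx_16;
		BYTE   object_sub_idx[1];
		uint32 object_data_32;
	} data_field_s;
} unpacked_sdo_u;

/* The build fails if the packed view stops matching the 8-byte frame. */
_Static_assert ( 8 == sizeof ( packed_sdo_u ), "packed SDO view must be exactly 8 bytes" );

/* Runtime check that every field sits at its protocol offset. */
void check_packed_offsets ( void )
{
	assert ( 1 == offsetof ( packed_sdo_u, data_field_s.object_idx_16 ) );
	assert ( 3 == offsetof ( packed_sdo_u, data_field_s.object_sub_idx ) );
	assert ( 4 == offsetof ( packed_sdo_u, data_field_s.object_data_32 ) );
}
```

On most 32-bit and 64-bit ABIs, sizeof ( unpacked_sdo_u ) comes out larger than 8 because the 16-bit index is aligned to an even offset, so the same _Static_assert applied to the unpacked version would be expected to fail there.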

The 1st byte of an SDO is called the command ID and tells what kind of SDO it is. So, I've segmented the frame further into 1 byte and 7 bytes. The structure data_field_s represents the packet. The first byte of the packet is always the command ID. Hence, the first definition in the structure is:

BYTE command_id[CAN_OPEN_SDO_NUM_COMMAND_ID_BYTES];

The next seven bytes after the command ID are defined as a union of four elements:

  • frame_field[] : An array of 7 bytes
  • tx : Structure defining the bytes for a Tx SDO Message
  • rx : Structure defining the bytes for a Rx SDO Message
  • segmented_data_rx_tx : Structure defining the bytes for a Segmented Rx Tx Message

In the Rx/Tx segments, I created a union between a 2-element array object_idx and a 16-bit variable object_idx_16. This way I can write the object index directly into the 16-bit variable object_idx_16, and a little-endian machine stores it in memory already split into little-endian bytes. I don't need to do something like this myself to split the object index into two bytes of little-endian data:

object_idx_pkt[0] = object_idx & 0xFF;
object_idx_pkt[1] = ( object_idx >> 8 ) & 0xFF;

What I can directly do now is:

<instance of CANopen_sdo_data_field_s>.data_field_u.data_field_s.msg_u.tx.object_idx_u.object_idx_16 = object_idx;

Similarly, while parsing a received packet, I don't need to do the following to get the object index:

object_idx_derived = object_idx[0] | ( object_idx[1] << 8 );

Instead, I can read it directly from the frame array pointer typecast to our packet structure:

object_idx_derived = <typecasted pointer pointing to the frame array> ->data_field_u.data_field_s.msg_u.tx.object_idx_u.object_idx_16;

Note that the manual shift-and-mask manipulations shown above are still something one will need to do when the byte ordering of the protocol and of the machine are different.
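When it is not known in advance whether the machine matches the protocol, the shift-and-mask form can be wrapped in small helpers and the host ordering probed at runtime. This is a minimal sketch, and the names host_is_little_endian, store_le16 and load_le16 are hypothetical, introduced only for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Returns 1 on a little-endian host, 0 otherwise (runtime probe). */
int host_is_little_endian ( void )
{
	const uint16_t probe = 0x0001;

	/* On a little-endian host the least significant byte is stored first. */
	return 0x01 == *(const uint8_t *) &probe;
}

/* Store a 16-bit value as little-endian bytes, whatever the host ordering. */
void store_le16 ( uint8_t * out, uint16_t value )
{
	assert ( NULL != out );
	out[0] = (uint8_t) ( value & 0xFF );
	out[1] = (uint8_t) ( ( value >> 8 ) & 0xFF );
}

/* Load a 16-bit value from little-endian bytes, whatever the host ordering. */
uint16_t load_le16 ( const uint8_t * in )
{
	assert ( NULL != in );
	return (uint16_t) ( in[0] | ( (uint16_t) in[1] << 8 ) );
}
```

A reasonable pattern is to use the union shortcut only when host_is_little_endian() returns 1 (or when the target is known at build time) and fall back to these helpers otherwise.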

The rest of the structure follows similar principles. Try working it through.

Alright, let's now use our data structure!

Creating a new Tx SDO Packet:

Let's say we need to create a new Tx SDO Packet with:

  1. Command ID as 0x80
  2. Object Index as 0xFD16
  3. Object Sub Index as 0x08
  4. Object Data as 0xFDAB1781

And then print the packet.

Step 1 - Create an object of the Packet Structure CANopen_sdo_data_field_s and initialize it to zero:

CANopen_sdo_data_field_s sdo_tx_packet;

memset ( ( void *) sdo_tx_packet.data_field_u.data_field, 0x0, CAN_OPEN_SDO_NUM_BYTES_IN_FRAME * sizeof ( BYTE ) );

Do note that I memset() the 8 bytes of the array data_field here.

Step 2 - Fill in the data into the packet:

sdo_tx_packet.data_field_u.data_field_s.command_id[0]                          = 0x80;
sdo_tx_packet.data_field_u.data_field_s.msg_u.tx.object_idx_u.object_idx_16    = 0xFD16;
sdo_tx_packet.data_field_u.data_field_s.msg_u.tx.object_sub_idx[0]             = 0x08;
sdo_tx_packet.data_field_u.data_field_s.msg_u.tx.object_data_u.object_data_32  = 0xFDAB1781;

Step 3 - Print the packet in array form:

uint32 packet_index = 0;

for ( packet_index = 0; CAN_OPEN_SDO_NUM_BYTES_IN_FRAME > packet_index; packet_index ++ )
{
	printf( "0x%02X ", sdo_tx_packet.data_field_u.data_field[packet_index] );
}

and the output I get is:

0x80 0x16 0xFD 0x08 0x81 0x17 0xAB 0xFD
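The three steps can be pulled together into one compilable sketch. It uses a trimmed copy of the structure that keeps only the tx branch (sdo_frame_s is a hypothetical name), assumes BYTE, uint16 and uint32 are fixed-width typedefs, and assumes #pragma pack and C11 _Static_assert are available. The helper names build_tx_sdo and print_sdo are also made up for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/* Assumed definitions of the article's integer types. */
typedef uint8_t  BYTE;
typedef uint16_t uint16;
typedef uint32_t uint32;

#pragma pack(push, 1)
typedef struct
{
	union
	{
		BYTE data_field[8];
		struct
		{
			BYTE command_id[1];
			union
			{
				BYTE frame_field[7];
				struct
				{
					union { BYTE object_idx[2];  uint16 object_idx_16;  } object_idx_u;
					BYTE object_sub_idx[1];
					union { BYTE object_data[4]; uint32 object_data_32; } object_data_u;
				} tx;
			} msg_u;
		} data_field_s;
	} data_field_u;
} sdo_frame_s;
#pragma pack(pop)

_Static_assert ( 8 == sizeof ( sdo_frame_s ), "SDO frame must be exactly 8 bytes" );

/* Steps 1 and 2: zero the frame, then fill the fields through the union.
 * The resulting byte order in data_field follows the host's endianness. */
sdo_frame_s build_tx_sdo ( BYTE command_id, uint16 object_idx, BYTE object_sub_idx, uint32 object_data )
{
	sdo_frame_s frame;

	memset ( &frame, 0, sizeof ( frame ) );
	frame.data_field_u.data_field_s.command_id[0]                         = command_id;
	frame.data_field_u.data_field_s.msg_u.tx.object_idx_u.object_idx_16   = object_idx;
	frame.data_field_u.data_field_s.msg_u.tx.object_sub_idx[0]            = object_sub_idx;
	frame.data_field_u.data_field_s.msg_u.tx.object_data_u.object_data_32 = object_data;
	return frame;
}

/* Step 3: print the frame byte by byte as two-digit hex values. */
void print_sdo ( const sdo_frame_s * frame )
{
	size_t packet_index;

	assert ( NULL != frame );
	for ( packet_index = 0; packet_index < sizeof ( frame->data_field_u.data_field ); packet_index++ )
	{
		printf ( "0x%02X ", frame->data_field_u.data_field[packet_index] );
	}
	printf ( "\n" );
}
```

Calling build_tx_sdo ( 0x80, 0xFD16, 0x08, 0xFDAB1781 ) followed by print_sdo() should reproduce the byte sequence shown above on a little-endian host. On a big-endian host the index and data bytes would come out reversed.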

Parsing a Rx SDO Packet:

Now, when I receive a Rx SDO Packet, I can read off data directly from it using my data structures. For example, if I received the same packet that was created above:

0x80 0x16 0xFD 0x08 0x81 0x17 0xAB 0xFD

in an array pointer called BYTE * received_SDO, I can read off the data from it by typecasting it to the data structure:

CANopen_sdo_data_field_s * sdo_rx_packet = ( CANopen_sdo_data_field_s * ) received_SDO;

printf ( "\nCommand ID : 0x%02X", sdo_rx_packet->data_field_u.data_field_s.command_id[0] );
printf ( "\nObject Index : 0x%04X", sdo_rx_packet->data_field_u.data_field_s.msg_u.rx.object_idx_u.object_idx_16 );
printf ( "\nObject Sub Index : 0x%02X", sdo_rx_packet->data_field_u.data_field_s.msg_u.rx.object_sub_idx[0] );
printf ( "\nObject Data : 0x%08lX", ( unsigned long ) sdo_rx_packet->data_field_u.data_field_s.msg_u.rx.object_data_u.object_data_32 );

and the output I get is:

Command ID : 0x80
Object Index : 0xFD16
Object Sub Index : 0x08
Object Data : 0xFDAB1781

Isn't this something! :-)
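One caveat with the pointer cast: if received_SDO is not suitably aligned for the structure, or the compiler applies strict-aliasing optimizations, the cast can misbehave on some targets. A more conservative variant is to copy the bytes into a typed object with memcpy() and read the fields from there. The sketch below assumes a flattened packed view of the rx branch (sdo_rx_view_s and parse_rx_sdo are hypothetical names) and a little-endian host for the multi-byte fields.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed definitions of the article's integer types. */
typedef uint8_t  BYTE;
typedef uint16_t uint16;
typedef uint32_t uint32;

/* Hypothetical flattened view of an Rx SDO, packed to match the wire layout. */
#pragma pack(push, 1)
typedef struct
{
	BYTE   command_id;
	uint16 object_idx_16;
	BYTE   object_sub_idx;
	uint32 object_data_32;
} sdo_rx_view_s;
#pragma pack(pop)

_Static_assert ( 8 == sizeof ( sdo_rx_view_s ), "Rx view must be exactly 8 bytes" );

/* Copy the received bytes into a typed object instead of casting the pointer.
 * The caller must supply at least 8 readable bytes. */
sdo_rx_view_s parse_rx_sdo ( const BYTE * received_SDO )
{
	sdo_rx_view_s view;

	assert ( NULL != received_SDO );
	memcpy ( &view, received_SDO, sizeof ( view ) );
	return view;
}
```

The copy costs 8 bytes per frame, and many compilers are able to reduce such a small fixed-size memcpy() to a couple of register loads.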

Few Points to remember:

  1. In this type of use case, the data structure needs to be byte-aligned, i.e. packed so that the compiler inserts no padding between members!
  2. The union trick where we directly convert data into an array of bytes in little-endian only works when the field is exactly as wide as a native integer type (1, 2, 4 or 8 bytes, i.e. a power of 2). That is the reason I didn't use the trick for segmented_data_rx_tx, as the data size is 7 bytes. Padding is the related pitfall. Think of it this way: a 32-bit machine accesses memory in units of 32 bits, or 4 bytes. Thus, if we were to declare 3 chars and 1 int, the compiler would internally pad an extra byte between the 3rd char and the following integer (unless we've specifically asked it not to). In such cases, if one directly uses the typecast, it is not guaranteed that each field is read from or written to its intended place, particularly across different compilers and machines.
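The 3 chars and 1 int case is easy to inspect with offsetof(). This is a small sketch, assuming a compiler that supports #pragma pack; natural_s, packed_s, padding_before_value and print_layouts are hypothetical names used only here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Default layout: the compiler may pad so that 'value' is naturally aligned. */
typedef struct
{
	char a;
	char b;
	char c;
	int  value;
} natural_s;

/* Packed layout: 'value' starts immediately after the three chars. */
#pragma pack(push, 1)
typedef struct
{
	char a;
	char b;
	char c;
	int  value;
} packed_s;
#pragma pack(pop)

/* Number of padding bytes the compiler inserted before 'value'. */
size_t padding_before_value ( void )
{
	return offsetof ( natural_s, value ) - 3u;
}

/* Print both layouts so the difference is visible on the current target. */
void print_layouts ( void )
{
	assert ( 3 == offsetof ( packed_s, value ) );

	printf ( "natural: offset of value = %zu, size = %zu\n",
	         offsetof ( natural_s, value ), sizeof ( natural_s ) );
	printf ( "packed : offset of value = %zu, size = %zu\n",
	         offsetof ( packed_s, value ), sizeof ( packed_s ) );
}
```

On a typical target with a 4-byte, 4-byte-aligned int, the natural layout is expected to place value at offset 4 (one padding byte), while the packed layout places it at offset 3.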

Another major advantage of this system over creating packets byte-wise is that the compiler is there to assist us in remembering the proper places of fields. So, one will not jumble up the byte sequence.

So, as you see, this can be a very effective technique for creating and parsing protocols. I will leave you with an alternate implementation of the above structure. Do note that this implementation has the overhead of having to declare separate variables for the raw byte stream and for the typecast data structure. In contrast, the above method can be used without an explicit cast, because a union lets the same memory be accessed through any of its members.

/*----------------------------------------------------------------------------*/
/*!@brief TX SDO Frame */
typedef struct tx_frame
{
	BYTE command_id; /*!> Command Identifier */
	BYTE object_index[CAN_OPEN_SDO_NUM_OBJECT_INDEX_BYTES]; /*!> Object Index */
	BYTE object_sub_index; /*!> Object Sub Index */

	uint32 object_data_32; /*!> 4 byte data */
}CANopen_tx_frame;

/*----------------------------------------------------------------------------*/
/*!@brief RX SDO Frame */
typedef struct rx_frame
{
	BYTE command_id; /*!> Command Identifier */
	BYTE object_index[CAN_OPEN_SDO_NUM_OBJECT_INDEX_BYTES]; /*!> Object Index */
	BYTE object_sub_index; /*!> Object Sub Index */

	uint32 object_data_32; /*!> 4 byte data */
}CANopen_rx_frame;

/*----------------------------------------------------------------------------*/
/*!@brief Segmented Rx/Tx SDO Frame */
typedef struct sdo_segment
{
	BYTE command_id; /*!> Command Identifier */
	BYTE data[7]; /*!> Segmented Data */
}CANopen_segment;

/*----------------------------------------------------------------------------*/
/*!@brief Generic SDO Frame */
typedef struct
{
	BYTE command_id; /*!> Command Identifier */
	BYTE frame_data[7]; /* Raw byte stream */
}CANopen_sdo_data_field_s;

P.S. Note that the data type definitions here (BYTE, uint16, uint32) might seem arcane and non-standard. You can read about why they exist in this post.

[1] Stack overflow: Big Endian vs Little Endian Padding issue

[2] Wikipedia - Magic Number (Programming)

[3] Stack Exchange - Has Little Endian won?