Endianness

While working on the IEEE 896 Futurebus standard, which was intended to replace all local bus connections, researchers noted a difference between what were termed type-1 and type-2 buses. These buses multiplex (mux) the first data byte, which has the lowest address, with the least and most significant portions of the address, respectively.

Borrill (1981) observed that the type-2 ordering simplified the interpretation of memory dumps, and proposed that future multiplexed bus standards with multiple-width data transfers implement the type-2 storage format.

In an analogy based on Gulliver's Travels, the associated architectures have been dubbed little-endian and big-endian processors, also known as Sad and Mad, respectively. Big-endian microprocessors place the first byte of a sequential data stream into the most significant part of a register, while little-endian microprocessors put the first byte into the least significant part of a register (James, 1990).
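As a rough illustration of the two placements, the following Python sketch loads the same four-byte stream into an integer both ways. The byte values are arbitrary and chosen only for this example, and `int.from_bytes` merely stands in for a processor's register load.

```python
# The first four bytes of a sequential data stream, in arrival order.
# (Arbitrary example values, not taken from the text.)
stream = bytes([0x12, 0x34, 0x56, 0x78])

# Big-endian: the first byte (0x12) lands in the most significant position.
big = int.from_bytes(stream, byteorder="big")

# Little-endian: the first byte (0x12) lands in the least significant position.
little = int.from_bytes(stream, byteorder="little")

print(f"big-endian register:    0x{big:08X}")     # 0x12345678
print(f"little-endian register: 0x{little:08X}")  # 0x78563412
```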

The issue of endianness is not merely theoretical either, as the CPU must know whether a stored word is big-endian or little-endian. The CPUs used in PCs have tended to be little-endian, while the SPARC, Motorola's 68K, and the PowerPC chips used in Macs have been big-endian, as has the Java Virtual Machine (Bradley, 2007). ARM and MIPS are also often listed as big-endian, although both can be configured for either byte order.

As a result, file formats developed on a particular platform may specify a byte order. For example, a bitmap (.bmp) file specifies little-endian byte order, while JPEG expects big-endian. TIFF image files may be big- or little-endian, but encode in their metadata the form to which they conform (Bradley, 2007).
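To make the TIFF case concrete, here is a minimal Python sketch of how a reader might pick the byte order from the file header. It relies on a detail from the TIFF specification rather than from the text above: the first two bytes are `II` for little-endian or `MM` for big-endian, followed by the 16-bit value 42 in that byte order. The function names are hypothetical and the headers are synthesized for the example.

```python
import struct

def tiff_byte_order(header: bytes) -> str:
    """Return 'little' or 'big' based on the first two bytes of a TIFF header."""
    marker = header[:2]
    if marker == b"II":   # "Intel" ordering
        return "little"
    if marker == b"MM":   # "Motorola" ordering
        return "big"
    raise ValueError("not a TIFF header")

def read_tiff_magic(header: bytes) -> int:
    """Decode the 16-bit magic number using the byte order the file declares."""
    fmt = "<H" if tiff_byte_order(header) == "little" else ">H"
    (magic,) = struct.unpack(fmt, header[2:4])
    return magic

# Synthesized headers (not read from real files) for both byte orders.
little_header = b"II" + struct.pack("<H", 42)
big_header = b"MM" + struct.pack(">H", 42)

print(tiff_byte_order(little_header), read_tiff_magic(little_header))  # little 42
print(tiff_byte_order(big_header), read_tiff_magic(big_header))        # big 42
```

Either header decodes to the same magic number because the reader honors the declared byte order instead of assuming the host's.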

Endianness also has ramifications for networking, as network stacks and protocols must define their byte order, just as the aforementioned image file formats do. Fortunately, the protocol layers of the TCP/IP suite are defined as big-endian.

This raises the question: “What if a little-endian PC is communicating with a big-endian SPARC or PowerPC machine? Won’t that cause a problem?”

The solution is to convert multi-byte values such as the IP address into a common integer format, known as network byte order, before transmission. To see why, consider our little-endian PC and big-endian SPARC without this conversion: the PC would load the IP address 192.0.1.2 as the little-endian integer 0x020100C0 and transmit the bytes in the order 02 01 00 C0. The SPARC would receive 02 01 00 C0, reconstruct the bytes into the big-endian integer 0x020100C0, and misinterpret the address as 2.1.0.192.
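The 192.0.1.2 example can be traced step by step. The Python sketch below simulates the little-endian sender with explicit `struct` format codes, so it behaves the same on any host. It is an illustration of the failure and of the fix, not a model of a real network stack.

```python
import socket
import struct

address = "192.0.1.2"
raw = socket.inet_aton(address)             # the four address bytes: C0 00 01 02

# Without conversion: a little-endian host loads the bytes as a native integer...
as_little = struct.unpack("<I", raw)[0]     # 0x020100C0
# ...and the value is then sent most significant byte first.
wire_bad = struct.pack(">I", as_little)     # 02 01 00 C0
misread = socket.inet_ntoa(wire_bad)        # '2.1.0.192'

# With conversion: the value is treated as network (big-endian) order throughout.
as_network = struct.unpack("!I", raw)[0]    # 0xC0000102
wire_ok = struct.pack("!I", as_network)     # C0 00 01 02
correct = socket.inet_ntoa(wire_ok)         # '192.0.1.2'

print(f"0x{as_little:08X} -> {wire_bad.hex(' ')} -> {misread}")
print(f"0x{as_network:08X} -> {wire_ok.hex(' ')} -> {correct}")
```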

In (Fig. 1) we see the 4-byte integer value 0x01020304 stored on a big-endian system:

[Fig. 1: endianness]
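The same layout can be reproduced in a few lines of Python. This sketch lists the bytes of 0x01020304 from the lowest address upward, first in big-endian order as in the figure and then in little-endian order for comparison. The helper name `memory_layout` and the `+offset` notation are assumptions made for this example.

```python
value = 0x01020304

def memory_layout(value: int, byteorder: str) -> list:
    """List 'offset: byte' strings for a 4-byte integer, lowest address first."""
    data = value.to_bytes(4, byteorder=byteorder)
    return [f"+{offset}: {byte:02X}" for offset, byte in enumerate(data)]

print("big-endian:   ", memory_layout(value, "big"))
# ['+0: 01', '+1: 02', '+2: 03', '+3: 04']
print("little-endian:", memory_layout(value, "little"))
# ['+0: 04', '+1: 03', '+2: 02', '+3: 01']
```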

Thus, to mitigate the endianness issues in networking, conversion routines from the Berkeley sockets API, which are often implemented as preprocessor macros, are used to perform the conversions:

htons(): “host to network short” will reorder the bytes of a 16-bit value from processor order to network order, and htonl(): “host to network long” will reorder the bytes of a 32-bit value from processor order to network order (“Introduction to BSD Sockets”, 2014).

ntohs(): “network to host short” will reorder the bytes of a 16-bit value from network order to processor order, and ntohl(): “network to host long” will reorder the bytes of a 32-bit value from network order to processor order (“Introduction to BSD Sockets”, 2014).
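Python's standard `socket` module exposes the same four routines under the same names, which makes for a short demonstration. In the sketch below the port number 8080 is a hypothetical example value, and the address is 192.0.1.2 written as a 32-bit integer. The converted integers differ between little- and big-endian hosts, so the sketch compares byte layouts rather than printing fixed numbers.

```python
import socket
import struct
import sys

port = 8080          # hypothetical 16-bit example value (0x1F90)
addr = 0xC0000102    # 192.0.1.2 as a 32-bit integer

# Host order -> network order.
net_port = socket.htons(port)
net_addr = socket.htonl(addr)

# Laid out in the host's native byte order ("="), the converted values
# have exactly the byte sequence that network order ("!") prescribes.
print(sys.byteorder)
print(struct.pack("=H", net_port) == struct.pack("!H", port))   # True
print(struct.pack("=I", net_addr) == struct.pack("!I", addr))   # True

# Network order -> host order recovers the original values.
print(socket.ntohs(net_port) == port)   # True
print(socket.ntohl(net_addr) == addr)   # True
```

On a big-endian host all four calls leave the value unchanged, while on a little-endian host they swap the bytes. Either way, code that always calls them stays portable.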
