Byte

A Byte, more precisely an Octet, is a unit of measurement of information storage, consisting of eight bits. In most computer architectures it is the granularity of memory addresses. A byte holds an 8-bit number, one of 256 different values, which may be interpreted as a signed or unsigned number, an ASCII character or machine code. Processors provide byte-wise arithmetical and logical operations. x86 and x86-64 can address the two lower bytes of some 32- or 64-bit registers, for instance AL and AH from EAX or RAX. SIMD instruction sets like MMX, AltiVec and SSE2 provide operations on vectors of eight or sixteen bytes inside appropriate SIMD registers.

The programming languages C and C++ define a byte as an "addressable unit of data storage large enough to hold any member of the basic character set of the execution environment".

=Char=
The C datatype unsigned char covers one byte and has a numerical range of 0 to 255. The primitive byte in Java is signed and ranges from -128 to +127 (Java's char, in contrast, is a 16-bit unsigned type). The same is likely true for signed char in C, though two's complement is not strictly specified. Likewise, right shifts of negative signed values are implementation-defined in C: x86 compilers typically perform an arithmetic shift right, but other processors and their compilers possibly always shift in zeros. Bytes are therefore often type-defined as unsigned char in C:

    typedef unsigned char BYTE;

Mailbox chess programs often use an array of bytes for a dense Board Representation, where each byte contains the piece code or empty-square code of the indexed square. A byte is also sufficient to store usual (0..63) or 0x88 board coordinates. A byte can contain one rank of a bitboard. For pawn-structure issues, filesets are a dense set-wise representation to cover boolean properties for each file.
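The byte uses mentioned above can be sketched in C roughly as follows. This is only an illustration: the piece codes, the square mapping (a1 = 0, h8 = 63) and all function names are assumptions made for this example, not a fixed convention.

```c
#include <assert.h>

typedef unsigned char BYTE;

/* Hypothetical piece codes, chosen only for this sketch. */
enum { EMPTY = 0, WHITE_PAWN = 1, BLACK_PAWN = 9 };

/* Mailbox board: one byte per square, indexed 0..63.
   Assumption: a1 = 0, h1 = 7, ..., h8 = 63. */
static BYTE board[64];

/* Square index 0..63 from file and rank, both 0..7. */
static BYTE square_of(unsigned file, unsigned rank) {
    return (BYTE)(rank * 8 + file);
}

/* 0x88 coordinate: rank in the upper nibble, file in the lower nibble. */
static BYTE square_0x88(unsigned file, unsigned rank) {
    return (BYTE)(rank * 16 + file);
}

/* A 0x88 coordinate is on the board if bits 3 and 7 are both clear. */
static int on_board_0x88(BYTE sq) {
    return (sq & 0x88) == 0;
}

/* One rank (0..7) of a 64-bit bitboard fits into a single byte. */
static BYTE rank_of_bitboard(unsigned long long bb, unsigned rank) {
    return (BYTE)((bb >> (8 * rank)) & 0xFF);
}

/* Small usage example. */
void mailbox_example(void) {
    board[square_of(4, 1)] = WHITE_PAWN;          /* e2 */
    assert(board[12] == WHITE_PAWN);
    assert(board[square_of(4, 6)] == EMPTY);      /* e7 not set */
    assert(on_board_0x88(square_0x88(7, 7)));     /* h8 = 0x77 */
    assert(!on_board_0x88(0x78));                 /* one step right of h8 */
    assert(rank_of_bitboard(0x000000000000FF00ULL, 1) == 0xFF);
}
```

Using unsigned char here avoids surprises when a coordinate such as 0x77 is masked or compared, since no sign extension is involved.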

A byte can be written with two hexadecimal digits, 0x00 to 0xff in C or Java. Take compiler warnings seriously when wider types are assigned to bytes: all upper bits are lost, so the stored value changes whenever the wider value lies outside the valid signed or unsigned byte range.
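The loss of upper bits can be made explicit in code. The following is a minimal sketch, assuming an 8-bit unsigned char (UCHAR_MAX equal to 255); the helper names fits_in_byte and low_byte are hypothetical and introduced only for this example.

```c
#include <assert.h>
#include <limits.h>

typedef unsigned char BYTE;

/* Returns 1 if 'wide' can be stored in an unsigned byte without loss. */
static int fits_in_byte(unsigned int wide) {
    return wide <= UCHAR_MAX;
}

/* Explicit narrowing: keeps only the low eight bits, as the conversion
   to an 8-bit unsigned char would (value modulo 256). */
static BYTE low_byte(unsigned int wide) {
    return (BYTE)(wide & 0xFFu);
}

/* Small usage example. */
void narrowing_example(void) {
    assert(low_byte(0x1234u) == 0x34);   /* upper byte 0x12 is lost */
    assert(low_byte(256u) == 0);         /* 256 wraps around to 0 */
    assert(fits_in_byte(255u));
    assert(!fits_in_byte(256u));
}
```

Writing the mask and cast explicitly documents the intent and typically silences the corresponding compiler warning, while a range check such as fits_in_byte can guard places where truncation would be a bug.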

=SWAR Bytes=
To apply 'add' or 'sub' on vectors of bytes (or any arbitrary structure) SWAR-wise within a 32-bit or 64-bit register, we have to take care that carries and borrows do not propagate from one byte into the next. Thus we apply a mask of all most significant bits (H) and 'add' in two steps: one 'add' with the MSBs cleared, and one add modulo 2, aka 'xor', for the MSBs themselves. For byte-wise math on a vector of four bytes inside a 32-bit register, H is 0x80808080 and L is 0x01010101.

SWAR add z = x + y
    z = ((x &~H) + (y &~H)) ^ ((x ^ y) & H)

SWAR sub z = x - y
    z = ((x | H) - (y &~H)) ^ ((x ^~y) & H)

SWAR average z = (x+y)/2, based on x + y = (x^y) + 2*(x&y)
    z = (x & y) + (((x ^ y) & ~L) >> 1)
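The three SWAR formulas above translate directly into C for four bytes packed into a 32-bit word. This is a sketch under stated assumptions: the function names swar_add, swar_sub and swar_avg and the constant names SWAR_H and SWAR_L are chosen for this example only, and the average rounds down per byte.

```c
#include <assert.h>
#include <stdint.h>

/* Masks for four bytes in a 32-bit word: H = all MSBs, L = all LSBs. */
static const uint32_t SWAR_H = 0x80808080u;
static const uint32_t SWAR_L = 0x01010101u;

/* Byte-wise add: add the low 7 bits of each byte, then xor in the MSBs. */
static uint32_t swar_add(uint32_t x, uint32_t y) {
    return ((x & ~SWAR_H) + (y & ~SWAR_H)) ^ ((x ^ y) & SWAR_H);
}

/* Byte-wise sub: setting the MSBs of x prevents borrows across bytes. */
static uint32_t swar_sub(uint32_t x, uint32_t y) {
    return ((x | SWAR_H) - (y & ~SWAR_H)) ^ ((x ^ ~y) & SWAR_H);
}

/* Byte-wise average, rounding down: (x & y) + ((x ^ y) >> 1) per byte.
   Clearing the LSBs first keeps bits from shifting into the neighbor. */
static uint32_t swar_avg(uint32_t x, uint32_t y) {
    return (x & y) + (((x ^ y) & ~SWAR_L) >> 1);
}

/* Small usage example; each byte wraps around modulo 256 on its own. */
void swar_example(void) {
    /* 01+01=02, FF+01=00, 7F+01=80, 80+80=00 */
    assert(swar_add(0x01FF7F80u, 0x01010180u) == 0x02008000u);
    /* 00-01=FF, 80-7F=01, 05-06=FF, 10-10=00 */
    assert(swar_sub(0x00800510u, 0x017F0610u) == 0xFF01FF00u);
    /* (10+20)/2=18, (FF+FF)/2=FF, (03+05)/2=04, (07+08)/2=07 */
    assert(swar_avg(0x10FF0307u, 0x20FF0508u) == 0x18FF0407u);
}
```

For a 64-bit register holding eight bytes, the same expressions apply with uint64_t and the masks widened to 0x8080808080808080 and 0x0101010101010101.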

=See also=
 * Byte Magazine
 * Nibble
 * Filesets
 * First Rank Attacks
 * Word
 * Double Word
 * Quad Word
 * SWAR

=External Links=
 * Byte from Wikipedia
 * Octet from Wikipedia
