Byte
A '''Byte''', more precisely an '''Octet''', is a unit of information storage consisting of '''eight''' [[Bit|bits]]. In most computer architectures it is the granularity of memory addresses. A byte holds an 8-bit number, that is one of 256 different values, which can be interpreted as a signed or unsigned number, an [https://en.wikipedia.org/wiki/ASCII ASCII] character or machine code. Processors provide byte-wise arithmetical and logical operations. [[x86]] and [[x86-64]] can address the two lower bytes of some 32 or 64 bit general purpose registers, for instance AL and AH from EAX or RAX. [[SIMD and SWAR Techniques|SIMD]] instruction sets like [[MMX]], [[AltiVec]] and [[SSE2]] provide operations on vectors of eight or sixteen bytes inside appropriate SIMD registers.
The programming languages [[C]] and [[Cpp|C++]] define a byte as an "addressable unit of data storage large enough to hold any member of the basic character set of the execution environment".
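Since this definition does not fix the width, a C byte is not necessarily an octet. The actual number of bits is exposed as <code>CHAR_BIT</code> in <code>limits.h</code>, and the standard guarantees it to be at least eight. A minimal sketch, where the helper names <code>bits_per_byte</code> and <code>bits_per_uint</code> are hypothetical and only serve illustration:

```c
#include <limits.h>
#include <stddef.h>

/* Hypothetical helper: number of bits in one C byte.
   The C standard guarantees CHAR_BIT >= 8. */
static int bits_per_byte(void)
{
    return CHAR_BIT;
}

/* Hypothetical helper: storage size of an unsigned int in bits.
   sizeof counts in bytes, so it is scaled by CHAR_BIT. */
static size_t bits_per_uint(void)
{
    return sizeof(unsigned int) * CHAR_BIT;
}
```

On the common platforms where a byte is an octet, <code>bits_per_byte()</code> yields 8.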
=Char=
The [[C]] data type '''unsigned char''' covers one byte and, with eight bits, has a numerical range of 0 to 255. The primitive '''byte''' in [[Java]] is a signed byte and ranges from -128 to +127, while Java's '''char''' is an unsigned 16-bit type. The same range is likely true for '''signed char''' in C, though [https://en.wikipedia.org/wiki/Two%27s_complement two's complement] is not strictly specified by older C standards. Similarly, right shifts of negative signed values are implementation-defined in C: [[x86]] compilers typically perform an arithmetical shift right, but other processors and their compilers may always shift in zeros. Bytes are therefore often type defined as '''unsigned char''' in C:
<pre>
typedef unsigned char BYTE;
</pre>
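The following is a minimal sketch of why the unsigned type is convenient, assuming 8-bit bytes. The helper names <code>byte_shr</code> and <code>byte_to_signed</code> are hypothetical and not part of any library:

```c
typedef unsigned char BYTE;

/* Right shift of an unsigned byte, n in 0..7. The value is never
   negative, so zeros are shifted in from the left on every platform. */
static BYTE byte_shr(BYTE b, unsigned n)
{
    return (BYTE)(b >> n);
}

/* Interpret a byte as a two's complement value in -128..127 using
   plain arithmetic, without relying on implementation-defined
   conversions to signed char. */
static int byte_to_signed(BYTE b)
{
    return b < 128 ? (int)b : (int)b - 256;
}
```

For instance, <code>byte_shr(0x80, 7)</code> gives 1 and <code>byte_to_signed(0xFF)</code> gives -1.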
[[Mailbox]] chess programs often use an [[Array|array]] of bytes for a dense [[Board Representation]], where each byte contains the [[Pieces#PieceCoding|piece- or empty square code]] of its indexed [[Squares|square]]. A byte is also sufficient to store the usual (0..63) or [[0x88]] board coordinates. A byte can contain one rank of a [[Bitboards|bitboard]]. For pawn-structure issues, [[Pawns and Files (Bitboards)#Fileset|filesets]] are a dense set-wise representation to cover boolean properties for each [[Files|file]].
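These uses can be sketched in C as follows. The piece codes, the function names and the square mapping (a1 = 0, rank r of a bitboard in bits 8r to 8r+7) are assumptions made for illustration, not a prescribed encoding:

```c
#include <stdint.h>

typedef unsigned char BYTE;

/* Hypothetical piece codes, for illustration only. */
enum { EMPTY = 0, WHITE_PAWN = 1 };

/* 0x88 coordinate from file and rank (both 0..7): 16 * rank + file. */
static BYTE sq0x88(unsigned file, unsigned rank)
{
    return (BYTE)(16 * rank + file);
}

/* A 0x88 coordinate is on the board if neither bit 3 nor bit 7 is set. */
static int on_board_0x88(unsigned sq)
{
    return (sq & 0x88) == 0;
}

/* Convert a valid 0x88 coordinate to the usual 0..63 index (8 * rank + file). */
static BYTE to_sq64(BYTE sq)
{
    return (BYTE)((sq + (sq & 7)) >> 1);
}

/* Extract one rank of a bitboard as a byte (assumed mapping, see above). */
static BYTE rank_byte(uint64_t bb, unsigned rank)
{
    return (BYTE)((bb >> (8 * rank)) & 0xFF);
}

/* Mailbox board of 64 bytes: clear it and put a white pawn on e2 (index 12). */
static void setup_demo(BYTE board[64])
{
    for (int i = 0; i < 64; ++i)
        board[i] = EMPTY;
    board[12] = WHITE_PAWN;
}
```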
A byte can be written with two hexadecimal digits, 0x00 to 0xff in [[C]] or [[Java]]. Take care, and take compiler warnings seriously, if wider types are assigned to bytes: all upper bits are lost if the wider value is outside the valid signed or unsigned range.
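A small sketch of that loss of upper bits, assuming 8-bit bytes. The helper names <code>fits_in_byte</code> and <code>truncate_to_byte</code> are hypothetical:

```c
typedef unsigned char BYTE;

/* Returns 1 if v can be stored in an unsigned byte without losing bits. */
static int fits_in_byte(int v)
{
    return v >= 0 && v <= 0xFF;
}

/* Explicit narrowing: only the low eight bits survive,
   everything above bit 7 is discarded. */
static BYTE truncate_to_byte(unsigned v)
{
    return (BYTE)(v & 0xFF);
}
```

Here <code>truncate_to_byte(0x1234)</code> keeps only 0x34, which is why a range check such as <code>fits_in_byte</code> before the assignment may be worthwhile.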
=SWAR Bytes=
To apply 'add' or 'sub' on vectors of bytes (or any arbitrary structure) [[SIMD and SWAR Techniques#SWAR|SWAR-wise]] within a 32-bit or 64-bit register, we have to take care that carries and borrows don't cross byte boundaries. Thus we apply a mask of all most significant bits (H) and 'add' in two steps, one 'add' with the MSBs cleared and one add modulo 2, aka 'xor', for the MSBs themselves. For byte-wise math on a vector of four bytes inside a 32-bit register, H is 0x80808080 and L, the mask of all least significant bits, is 0x01010101.
<pre>
SWAR add z = x + y
z = ((x &~H) + (y &~H)) ^ ((x ^ y) & H)
SWAR sub z = x - y
z = ((x | H) - (y &~H)) ^ ((x ^~y) & H)
SWAR average z = (x+y)/2 based on x + y = (x^y) + 2*(x&y)
z = (x & y) + (((x ^ y) & ~L) >> 1)
</pre>
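The three formulas above can be written as C functions on four bytes packed into a 32-bit word. This is a sketch under the stated masks. The function names are hypothetical, and the average rounds down in each byte:

```c
#include <stdint.h>

static const uint32_t H = 0x80808080u; /* most significant bit of each byte */
static const uint32_t L = 0x01010101u; /* least significant bit of each byte */

/* Byte-wise x + y (mod 256 per byte): add the low 7 bits of each byte,
   then fix up the MSBs with xor so no carry leaves its byte. */
static uint32_t swar_add(uint32_t x, uint32_t y)
{
    return ((x & ~H) + (y & ~H)) ^ ((x ^ y) & H);
}

/* Byte-wise x - y (mod 256 per byte): setting the MSBs of x absorbs
   any borrow inside the byte, the xor term restores the true MSBs. */
static uint32_t swar_sub(uint32_t x, uint32_t y)
{
    return ((x | H) - (y & ~H)) ^ ((x ^ ~y) & H);
}

/* Byte-wise floor((x + y) / 2), based on x + y = (x ^ y) + 2 * (x & y).
   Clearing the LSBs before the shift keeps bits from leaking
   into the neighbouring lower byte. */
static uint32_t swar_avg(uint32_t x, uint32_t y)
{
    return (x & y) + (((x ^ y) & ~L) >> 1);
}
```

As a usage example, <code>swar_add(0x01FF7F80, 0x01017F80)</code> yields 0x0200FE00: the bytes 0xFF + 0x01 and 0x80 + 0x80 both wrap to 0x00 without disturbing their neighbours.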
=See also=
* [[Byte Magazine]]
* [[Nibble]]
* [[Pawns and Files (Bitboards)#Fileset|Filesets]]
* [[First Rank Attacks]]
* [[Word]]
* [[Double Word]]
* [[Quad Word]]
* [[SIMD and SWAR Techniques#SWAR|SWAR]]
=External Links=
* [https://en.wikipedia.org/wiki/Byte Byte from Wikipedia]
* [https://en.wikipedia.org/wiki/Octet_%28computing%29 Octet from Wikipedia]
'''[[Data|Up one Level]]'''