CIS3355: Business Data Structures
What is a byte, and why does it contain 8 bits?

At the most basic level, a byte is a collection of eight bits. More commonly, a byte can be seen as a single character. For example, the eight bit collection of binary digits: 0100 0001 is understood by computer programs as the ASCII character ‘A’. There are 128 common characters in the ASCII character set that are described in the same manner. All of these values are shown in the common ASCII table:

Dec  Oct  Hex  ASCII  EBCDIC   BINARY     ASCII  Description
-----------------------------------------------------------------------
0    0    0    NUL    NUL      0000 0000  NULL   NUL null c-@ c-`
1    1    1    SOH    SOH      0000 0001  SOH    GTL c-A c-a start-of-heading
2    2    2    STX    STX      0000 0010  STX    c-B c-b start-of-text
3    3    3    ETX    ETX      0000 0011  ETX    c-C c-c end-of-text
4    4    4    EOT    SEL      0000 0100  EOT    SDC end-of-transmission ..._._
5    5    5    ENQ    HT       0000 0101  ENQ    PPC c-E c-e enquiry
6    6    6    ACK    RNL      0000 0110  ACK    c-F c-f acknowledge
7    7    7    BEL    DEL      0000 0111  BELL   BEL bell c-G c-g \a
8    10   8    BS     GE       0000 1000  BS     GET backspace c-H c-h \b
9    11   9    TAB    SPS      0000 1001  TAB    TCT HT tab c-I c-i \t
10   12   A    LF     RPT      0000 1010  LF     lf linefeed c-J c-j \n
11   13   B    VT     VT       0000 1011  VT     vertical-tab c-K c-k \v
12   14   C    FF     FF       0000 1100  FF     ff formfeed page \f c-L c-l
13   15   D    CR     CR       0000 1101  CR     cr carriage-return c-M c-m \r
14   16   E    SO     SO       0000 1110  SO     c-N c-n shift-out
15   17   F    SI     SI       0000 1111  SI     c-O c-o shift-in
16   20   10   DLE    DLE      0001 0000  DLE    c-P c-p data-link-escape
17   21   11   DC1    DC1      0001 0001  DC1    LLO go XON xon c-Q c-Q
18   22   12   DC2    DC2      0001 0010  DC2    c-R c-r
19   23   13   DC3    DC3      0001 0011  DC3    stop XOFF xoff c-S c-s
20   24   14   DC4    RES/ENP  0001 0100  DC4    DCL c-T c-t
21   25   15   NAK    NL       0001 0101  NAK    PPU negative-acknowledge c-U c-u
22   26   16   SYN    BS       0001 0110  SYN    c-V c-v synchronous-idle
23   27   17   ETB    POC      0001 0111  ETB    end-of-transmission-block c-W c-w
24   30   18   CAN    CAN      0001 1000  CAN    SPE c-X c-x cancel
25   31   19   EM     EM       0001 1001  EM     SPD c-Y c-y end-of-medium
26   32   1A   SUB    UBS      0001 1010  SUB    suspend c-Z c-z substitute
27   33   1B   ESC    CU1      0001 1011  ESC    escape c-[ c-{ m-
28   34   1C   FS     IFS      0001 1100  FS     field-separator c-\ c-|
29   35   1D   GS     IGS      0001 1101  GS     group-separator
30   36   1E   RS     IRS      0001 1110  RS     record-separator c-^ c-~
31   37   1F   US     ITB/IUS  0001 1111  ^DEL   unit-separator US c-_ c-DEL
32   40   20   SPC    DS       0010 0000  SPC    space spc
33   41   21   !      SOS      0010 0001  !      exclamation-point bang wow boing hey
34   42   22   "      FS       0010 0010  "      straight-double-quotation-mark dirk
35   43   23   #      WUS      0010 0011  #      number-sign she sharp crosshatch octothorpe
36   44   24   $      BYP/INP  0010 0100  $      @@ dollar-sign money buck escape
37   45   25   %      LF       0010 0101  %      percent-sign per double-o-seven mod
38   46   26   &      ETB      0010 0110  &      ampersand and address snowman donald-duck
39   47   27   '      ESC      0010 0111  '      apostrophe quote tick prime
40   50   28   (      SA       0010 1000  (      left-parenthesis open sad
41   51   29   )      SFE      0010 1001  )      right-parenthesis close happy
42   52   2A   *      SM/SW    0010 1010  *      asterisk star times wildcard Hale
43   53   2B   +      CSP      0010 1011  +      addition-sign plus and
44   54   2C   ,      MFA      0010 1100  ,      comma __..__
45   55   2D   -      ENQ      0010 1101  -      subtraction-sign minus hyphen negative dash
46   56   2E   .      ACK      0010 1110  .      period dot decimal radix full-stop ._._._
47   57   2F   /      BEL      0010 1111  /      right-slash virgule stroke over
48   60   30   0               0011 0000  0      _____
49   61   31   1               0011 0001  1      .____
50   62   32   2      SYN      0011 0010  2      ..___
51   63   33   3      IR       0011 0011  3      ...__
52   64   34   4      PP       0011 0100  4      ...._
53   65   35   5      TRN      0011 0101  5      .....
54   66   36   6      NBS      0011 0110  6      _....
55   67   37   7      EOT      0011 0111  7      __...
56   70   38   8      SBS      0011 1000  8      ___..
57   71   39   9      IT       0011 1001  9      ____.
58   72   3A   :      RFF      0011 1010  :      colon double-dots ___...
59   73   3B   ;      CU3      0011 1011  ;      semicolon go-on _._._.
60   74   3C   <      DC4      0011 1100  <      less-than bra in west left-chevron
61   75   3D   =      NAK      0011 1101  =      equals quadrathorpe
62   76   3E   >               0011 1110  >      greater-than (bra)ket out east right-chevron
63   77   3F   ?      SUB      0011 1111  ?      UNL question-mark query what ..__..
64   100  40   @      SP       0100 0000  @      at-symbol at-sign strudel whirl snail
65   101  41   A      RSP      0100 0001  A      ._
66   102  42   B               0100 0010  B      _...
67   103  43   C               0100 0011  C      _._.
68   104  44   D               0100 0100  D      _..
69   105  45   E               0100 0101  E      .
70   106  46   F               0100 0110  F      .._.
71   107  47   G               0100 0111  G      __.
72   110  48   H               0100 1000  H      ....
73   111  49   I               0100 1001  I      ..
74   112  4A   J               0100 1010  J      .___
75   113  4B   K      .        0100 1011  K      _._
76   114  4C   L      <        0100 1100  L      ._..
77   115  4D   M      (        0100 1101  M      __
78   116  4E   N      +        0100 1110  N      _.
79   117  4F   O      |        0100 1111  O      ___
80   120  50   P      &        0101 0000  P      .__.
81   121  51   Q               0101 0001  Q      __._
82   122  52   R               0101 0010  R      ._.
83   123  53   S               0101 0011  S      ...
84   124  54   T               0101 0100  T      _
85   125  55   U               0101 0101  U      .._
86   126  56   V               0101 0110  V      ..._
87   127  57   W               0101 0111  W      .__
88   130  58   X               0101 1000  X      _.._
89   131  59   Y               0101 1001  Y      _.__
90   132  5A   Z      !        0101 1010  Z      __..
91   133  5B   [      $        0101 1011  [      left-bracket open-square
92   134  5C   \      *        0101 1100  \      left-slash backslash bash
93   135  5D   ]      )        0101 1101  ]      right-bracket close-square
94   136  5E   ^      ;        0101 1110  ^      hat circumflex caret up-arrow
95   137  5F   _               0101 1111  _      UNT underscore underbar
96   140  60   `      _        0110 0000  `      accent-grave backprime backquote
97   141  61   a      /        0110 0001  a      alpha able
98   142  62   b               0110 0010  b      bravo baker
99   143  63   c               0110 0011  c      charlie
100  144  64   d               0110 0100  d      delta
101  145  65   e               0110 0101  e      echo
102  146  66   f               0110 0110  f      foxtrot fox
103  147  67   g               0110 0111  g      golf
104  150  68   h               0110 1000  h      hotel
105  151  69   i               0110 1001  i      india
106  152  6A   j      |        0110 1010  j      juliett
107  153  6B   k      ,        0110 1011  k      kilo
108  154  6C   l      %        0110 1100  l      lima
109  155  6D   m      _        0110 1101  m      mike
110  156  6E   n      >        0110 1110  n      november
111  157  6F   o      ?        0110 1111  o      oscar
112  160  70   p               0111 0000  p      papa
113  161  71   q               0111 0001  q      quebec
114  162  72   r               0111 0010  r      romeo
115  163  73   s               0111 0011  s      sierra
116  164  74   t               0111 0100  t      tango
117  165  75   u               0111 0101  u      uniform
118  166  76   v               0111 0110  v      victor
119  167  77   w               0111 0111  w      whiskey
120  170  78   x               0111 1000  x      x-ray
121  171  79   y      `        0111 1001  y      yankee
122  172  7A   z      :        0111 1010  z      zulu
123  173  7B   {      #        0111 1011  {      left-brace begin leftit
124  174  7C   |      @        0111 1100  |      logical-or vertical-bar pipe
125  175  7D   }      '        0111 1101  }      right-brace end rightit
126  176  7E   ~      =        0111 1110  ~      similar tilde wave squiggle approx wave
127  177  7F   DEL    "        0111 1111  ^?     DEL rubout delete
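As a quick check on the table, the decimal, octal, hexadecimal and binary columns are all the same number written in different bases. The short Python sketch below illustrates this for a single character; the function name `describe` is made up for this example and is not part of the course material.

```python
def describe(ch: str) -> str:
    """Return one character's code in the bases used by the table's columns."""
    code = ord(ch)                          # decimal code, e.g. 65 for 'A'
    bits = format(code, "08b")              # eight binary digits, zero-padded
    grouped = bits[:4] + " " + bits[4:]     # match the table's "0100 0001" layout
    return f"{code} {code:o} {code:X} {ch} {grouped}"


print(describe("A"))   # 65 101 41 A 0100 0001
```

Every character from 0 to 127 produces a bit pattern whose leftmost digit is 0.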
However, these 128 characters need only seven bits to describe them. As you will notice, all of the binary groups start with ‘0’. Originally, this eighth bit was used as a ‘parity bit’: an extra bit used to double-check data that was transmitted from one part of the machine to another.

Parity works like this: First, the sender and receiver agreed on either even or odd parity. If it was agreed that parity would be even, then the ‘1’s in the seven-bit group were counted. If the count was even, no action was taken and the first, or parity, bit was left at ‘0’. If the count was odd, the first bit was changed to ‘1’ to force the total to be even, thus ‘even’ parity. The receiver counted the ‘1’s in the same way; an odd total meant the data had been corrupted in transit. For more information on parity, check out the following link. http://www.webopedia.com/TERM/p/parity.html

Once the reliability of computers improved, parity checking became less necessary. Rather than being dropped, the parity bit was used to expand the ASCII character set to 256 characters (recalling that 2^8=256).

So there it is. Historically, a byte has eight bits. This is unlikely to change, since nearly all computer architecture is based on this principle. However, it is incorrect to assume that a byte equals one character. As the Unicode standard comes to replace ASCII, a character may be made up of two or more bytes.
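The even-parity scheme described above can be sketched in a few lines of Python. This is only an illustration, assuming the parity bit is the leading (most significant) bit of the byte; the function names `add_even_parity` and `check_even_parity` are made up for this example.

```python
def add_even_parity(seven_bits: int) -> int:
    """Return an 8-bit value whose leading bit makes the count of 1s even."""
    if not 0 <= seven_bits < 128:
        raise ValueError("expected a 7-bit value (0-127)")
    ones = bin(seven_bits).count("1")       # count the 1s in the seven-bit group
    parity_bit = ones % 2                   # 1 only when that count is odd
    return (parity_bit << 7) | seven_bits   # place the parity bit in front


def check_even_parity(byte: int) -> bool:
    """Receiver side: the byte is accepted only if its count of 1s is even."""
    return bin(byte).count("1") % 2 == 0


a = add_even_parity(ord("A"))   # 'A' = 100 0001 has two 1s, so the parity bit stays 0
c = add_even_parity(ord("C"))   # 'C' = 100 0011 has three 1s, so the parity bit becomes 1
print(format(a, "08b"), format(c, "08b"))    # 01000001 11000011
print(check_even_parity(c ^ 0b00000100))     # False: one flipped bit is detected
```

Note that a single parity bit catches any odd number of flipped bits but misses an even number, since two flips leave the count of 1s even.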