# MATHEMATICS

2 Applications of Discrete Mathematics

contradicting the fact that the minimum distance between codewords is 2k +1.

We can now give a useful bound on how many codewords can be in a code consisting of n-tuples that can correct a specified number of errors.

Theorem 5 The Sphere Packing or (Hamming) Bound Suppose that C is a code of bit strings of length n with d(C) = 2k+1. Then |C|, the number of codewords in C, cannot exceed

2n

C(n, 0) + C(n, 1) + · · · + C(n, k).

Proof: There are 2n bit strings of length n. By Lemma 1 the sphere of radius k centered at a codeword x contains

C(n, 0) + C(n, 1) + · · · + C(n, k)

bit strings. Since no bit string can be in two such spheres (by Lemma 2), it follows that the number of bit strings of length n is at least as large as the number of codewords times the number of bit strings in each such sphere. Hence,

2n ≥ |C|[C(n, 0) + C(n, 1) + · · · + C(n, k)]. We obtain the inequality we want by dividing by the second factor on the right- hand side of this inequality (and writing the inequality with the smaller term first).

Example 10 Find an upper bound for the number of codewords in a code C where codewords are bit strings of length seven and the minimum distance between codewords is three.

Solution: The minimum distance between codewords is 3 = 2k + 1, so that k = 1. Hence the sphere packing bound shows that there are no more than

27/[C(7, 0) + C(7, 1)] = 128/8 = 16

codewords in such a code.

The sphere packing bound gives us an upper bound for the number of codewords in a binary code with a given minimum distance where codewords are bit strings of length n. The codes that actually achieve this upper bound,

Chapter 5 Coding Theory 83

that is, that have the most codewords possible, are of special interest because they are the most efficient error correcting codes. Such codes are known as perfect codes.

Example 11 Show that the code consisting of just two codewords 00000 and 11111 is a perfect binary code.

Solution: The minimum distance between codewords in this code is 5. The sphere packing bound states that there are at most

25/[C(5, 0) + C(5, 1) + C(5, 2)] = 32/16 = 2

codewords in a code consisting of 5-tuples with minimum distance 5. Since there are 2 codewords in this code, it is a perfect binary code.

The code in Example 11 is called a trivial perfect code since it only consists of the two codewords, one containing only 0s and the other containing only 1s. As Exercise 8 demonstrates, when n is an odd positive integer there are trivial perfect codes consisting of the two codewords which are bit strings of length n consisting of all 0s and of all 1s. Finding perfect binary codes different from the trivial codes has been one of the most important problems in coding theory. In the next section we will introduce a class of perfect binary codes known as Hamming codes.

Generator Matrices Before describing Hamming codes, we need to generalize the concept of a parity check bit. When we use a parity check bit, we encode a message x1x2 . . . xk as x1x2 . . . xkxk+1 where xk+1 = (x1 + x2 + · · · + xk) mod 2. To generalize this notion, we add more than one check bit. More precisely, we encode a message x1x2 . . . xk as x1x2 . . . xkxk+1 . . . xn, where the last n − k bits xk+1,…,xn, are parity check bits, obtained from the k bits in the message. We will describe how these parity check bits are specified.

Consider a k-bit message x1x2 · · ·xk as a 1× k matrix x. Let G be a k×n matrix that begins with the k × k identity matrix Ik. That is, G = (Ik|A), where A is a k × (n − k) matrix, known as a generator matrix. We encode x as E(x) = xG, where we do arithmetic modulo 2. Coding using a parity check bit and using the triple repetition code are special cases of this technique, as illustrated in Examples 12 and 13.

Example 12 We can represent encoding by adding a parity check bit to a

84 Applications of Discrete Mathematics

three-bit message as E(x) = xG, where

G =

⎛ ⎝ 1 0 0 10 1 0 1

0 0 1 1

⎞ ⎠ .

Note that to obtain G we add a column of 1s to I3, the 3 × 3 identity matrix. That is, G = (I3|A), where

A =

⎛ ⎝ 11

1

⎞ ⎠ .

Example 13 We can represent encoding using the triple repetition code for three-bit messages as E(x) = xG, where

G =

⎛ ⎝ 1 0 0 1 0 0 1 0 00 1 0 0 1 0 0 1 0

0 0 1 0 0 1 0 0 1

⎞ ⎠ .

Note that G is formed by repeating the identity matrix of order three, I3, three times, that is,

G = (I3|I3|I3).

We now consider an example which we will use to develop some important ideas.

Example 14 Suppose that

G =

⎛ ⎝ 1 0 0 1 1 10 1 0 1 1 0

0 0 1 1 0 1

⎞ ⎠ ,

that is, G = (I3|A), where

A =

⎛ ⎝ 1 1 11 1 0

1 0 1

⎞ ⎠ .

What are the codewords in the code generated by this generator matrix?

Chapter 5 Coding Theory 85

Solution: We encode each of the eight three-bit messages x = x1x2x3 as E(x) = xG. This produces the codewords 000000, 001101, 010110, 011011, 100111, 101010, 110001, and 111100. For example, we get the third of these by computing

E(010) = (0 1 0)G = (0 1 0)

⎛ ⎝ 1 0 0 1 1 10 1 0 1 1 0

0 0 1 1 0 1

⎞ ⎠ = (0 1 0 1 1 0).

It is easy to see that we can find the codewords in a binary code generated by a generator matrix G by taking all possible linear combinations of the rows of G (since arithmetic is modulo 2, this means all sums of subsets of the set of rows of G). The reader should verify this for codewords in the code in Example 14.

It is easy to see that the binary codes formed using generator matrices have the property that the sum of any two codewords is again a codeword. That is, they are linear codes. To see this, suppose that y1 and y2 are codewords generated by the generator matrix G. Then there are bit strings x1 and x2 such that E(x1) = y1 and E(x2) = y2, where E(x) = xG. It follows that y1 + y2 is also a codeword since E(x1 + x2) = y1 + y2. (Here we add bit strings by adding their components in the same positions using arithmetic modulo 2.)

We will see that there is an easy way to find the minimum distance of a linear code. Before we see this, we need to make the following definition.

Definition 3 The weight of a codeword x, denoted by w(x), in a binary code is the number of 1s in this codeword.

Example 15 Find the weights of the codewords 00000, 10111, and 11111.

Solution: Counting the number of 1s in each of these codewords we find that w(00000) = 0, w(10111) = 4, and w(11111) = 5.

Lemma 3 Suppose that x and y are codewords in a linear code C. Then d(x,y) = w(x + y).

Proof: The positions with 1s in them in x+y are the positions where x and y differ. Hence d(x,y) = w(x + y).

We also will need the fact that 0, the bit string with all 0s, belongs to a linear code.

86 Applications of Discrete Mathematics

Lemma 4 Suppose that C is a nonempty linear code. Then 0 is a codeword in C.

Proof: Let x be a codeword in C. Since C is linear x+x = 0 belongs to C.

Theorem 6 The minimum distance of a linear code C equals the minimum weight of a nonzero codeword in C.

Proof: Suppose that the minimum distance of C is d. Then there are code- words x and y such that d(x,y) = d. By Lemma 3 it follows that w(x + y) = d. Hence w, the minimum weight of a codeword, is no larger than d.

Conversely, suppose that the codeword x is a nonzero codeword of min- imum weight. Then using Lemma 4 we see that w = w(x) = w(x + 0) = d(x,0) ≥ d. It follows that w = d, establishing the theorem.

Parity Check Matrices Note that in Example 14 the bit string x1x2x3 is encoded as x1x2x3x4x5x6 where x4 = x1 + x2 + x3, x5 = x1 + x2, and x6 = x1 + x3 (here, arithmetic is carried out modulo 2). Because we are doing arithmetic modulo 2, we see that

x1 + x2 + x3 + x4 = 0 x1 + x2 + x5 = 0 x1 + x3 + x6 = 0.

Furthermore, it is easy to see that x1x2x3x4x5x6 is a codeword if and only if it satisfies this system of equations.

We can express this system of equations as

⎛ ⎝ 1 1 1 1 0 01 1 0 0 1 0

1 0 1 0 0 1

⎞ ⎠ ⎛ ⎜⎜⎜⎜⎜⎝

x1 x2 x3 x4 x5 x6

⎞ ⎟⎟⎟⎟⎟⎠ =

⎛ ⎝ 00

0

⎞ ⎠ ,

that is, HE(x)t = 0,

where E(x)t is the transpose of E(x) and H, the parity check matrix, is given by

H =

⎛ ⎝ 1 1 1 1 0 01 1 0 0 1 0

1 0 1 0 0 1

⎞ ⎠ .