# Lecture log

**20/04:** We expanded state diagrams into *trellis diagrams*, which depict all possible paths through a state diagram over time. To decode a received vector, we follow the path whose output is closest to it. This is called *Viterbi decoding*.
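As a concrete illustration, here is a minimal hard-decision Viterbi decoder for the rate-1/2 binary convolutional code with generators \(1+x+x^2\) and \(1+x^2\). This particular code (and the helper names `conv_encode`, `viterbi_decode`) is an illustrative choice, not necessarily the example used in lectures.

```python
# A minimal hard-decision Viterbi decoder for the rate-1/2 binary
# convolutional code with generators g1 = 1+x+x^2 and g2 = 1+x^2
# (an illustrative choice).  States are the 2-bit register contents.

def conv_encode(bits):
    """Encode a bit list, flushing the register with two zeros."""
    s1 = s2 = 0
    out = []
    for b in bits + [0, 0]:
        out += [b ^ s1 ^ s2, b ^ s2]   # outputs for g1 and g2
        s1, s2 = b, s1
    return out

def viterbi_decode(received, nbits):
    """Most likely message of length nbits under the Hamming metric."""
    INF = float("inf")
    cost = {(0, 0): 0}      # best path metric per state
    path = {(0, 0): []}     # decoded bits per state
    for t in range(nbits + 2):                     # +2 flush steps
        r = received[2 * t: 2 * t + 2]
        ncost, npath = {}, {}
        for (s1, s2), c in cost.items():
            for b in ([0, 1] if t < nbits else [0]):   # flush with zeros
                o = [b ^ s1 ^ s2, b ^ s2]
                d = c + (o[0] != r[0]) + (o[1] != r[1])
                ns = (b, s1)
                if d < ncost.get(ns, INF):
                    ncost[ns], npath[ns] = d, path[(s1, s2)] + [b]
        cost, path = ncost, npath
    return path[(0, 0)][:nbits]        # the encoder ends in state (0,0)

msg = [1, 0, 1, 1]
word = conv_encode(msg)
word[3] ^= 1                           # one transmission error
print(viterbi_decode(word, 4))         # recovers [1, 0, 1, 1]
```

Since this code has free distance 5, a single error always leaves the transmitted path as the unique closest one.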

**19/04:** We began with our final topic of the course, namely binary *convolutional codes*. These, unlike block codes, can encode messages of arbitrary length. Encoding is done with the help of the code's generating matrix. We saw that this is just polynomial multiplication and that the process can be mechanised using *shift registers*. Finally, we saw that *state diagrams* can simplify our calculations. How to generalise state diagrams and use them in decoding is the topic of tomorrow's lecture.
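The observation that encoding is just polynomial multiplication, mechanised by shift registers, can be sketched as follows. The rate-1/2 generators \(1+x+x^2\) and \(1+x^2\) are again an illustrative choice; both routines produce the same output streams.

```python
# Convolutional encoding as polynomial multiplication over GF(2):
# the i-th output stream is m(x) * g_i(x).  Generators g1 = 1+x+x^2
# and g2 = 1+x^2 are an illustrative rate-1/2 choice.

def polymul_gf2(a, b):
    """Multiply two GF(2) polynomials given as coefficient lists."""
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        if ai:
            for j, bj in enumerate(b):
                out[i + j] ^= bj
    return out

def shift_register_encode(msg, gens):
    """Encode with a shift register of length max deg(g_i) (same result)."""
    mem_len = max(len(g) for g in gens) - 1
    reg = [0] * mem_len
    streams = [[] for _ in gens]
    for b in msg + [0] * mem_len:              # flush the register
        window = [b] + reg                     # current bit + memory
        for k, g in enumerate(gens):
            streams[k].append(sum(c * w for c, w in zip(g, window)) % 2)
        reg = window[:-1]
    return streams

g1, g2 = [1, 1, 1], [1, 0, 1]                  # 1+x+x^2 and 1+x^2
msg = [1, 0, 1, 1]
print(polymul_gf2(msg, g1), polymul_gf2(msg, g2))   # polynomial view
print(shift_register_encode(msg, [g1, g2]))         # shift-register view
```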

**13/04:** We continued with the third step of the BCH-decoding process, namely finding the roots of the error locator polynomial \(s(x)\). By the construction of \(s(x)\), the inverses of its roots give us the error locations. In the final step we used these to construct the error polynomial \(e(x)\) and thus correct the received word. We spent the rest of the lecture going through another example.

**12/04:** We started discussing how to decode BCH codes. There are four steps in this process and we managed to go through the first two, both in theory and example. Given a received polynomial \(w(x)=c(x)+e(x)\), where \(c(x)\) is a codeword and \(e(x)\) is the *error polynomial*, one first needs to compute the syndromes \(S_i=w(\alpha^i)\) where \(i\) runs through the \(\delta-1\) consecutive elements of the code's defining set, \(\alpha\) is the primitive root of unity associated with the defining set and \(\delta\) is the designed distance. Then, with the use of the Berlekamp-Massey algorithm, the *error locator polynomial* \(s(x)\) is obtained and will help us determine the error positions (i.e. the non-zero terms of \(e(x)\)) in tomorrow's lecture.
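The core of the Berlekamp-Massey step can be sketched in isolation. The sketch below works over GF(2) only, finding the shortest LFSR that generates a binary sequence; actual BCH decoding runs the same algorithm over the extension field \(\mathbb F_{2^m}\) on the syndrome sequence \(S_1, S_2, \ldots\).

```python
# Berlekamp-Massey over GF(2): a sketch of the algorithm's core,
# finding the shortest LFSR generating a given binary sequence.
# (BCH decoding applies the same idea over GF(2^m) to the syndromes.)

def berlekamp_massey_gf2(s):
    """Return (connection polynomial coefficients, LFSR length L)."""
    c = [1] + [0] * len(s)    # current connection polynomial
    b = [1] + [0] * len(s)    # copy saved at the last length change
    L, m = 0, 1
    for n in range(len(s)):
        d = s[n]              # discrepancy at step n
        for i in range(1, L + 1):
            d ^= c[i] & s[n - i]
        if d == 0:
            m += 1
        elif 2 * L <= n:      # length must grow
            t = c[:]
            for i in range(len(b) - m):
                c[i + m] ^= b[i]
            L, b, m = n + 1 - L, t, 1
        else:                 # fix c without growing L
            for i in range(len(b) - m):
                c[i + m] ^= b[i]
            m += 1
    return c[:L + 1], L

s = [0, 0, 1, 1, 1, 0, 1]        # generated by s_n = s_{n-1} + s_{n-3}
print(berlekamp_massey_gf2(s))   # ([1, 1, 0, 1], 3): 1 + x + x^3
```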

**22/03:** We saw a special class of BCH codes over \(\mathbb F_q\) called *Reed-Solomon (RS)* codes. These have length \(q-1\), which equips them with special properties: they are MDS, their designed distance \(\delta\) is their actual minimum distance, their defining set consists only of \(\delta-1\) consecutive elements, and their duals are also RS (although in general, duals of BCH codes are not BCH).

**16/03:** We first saw a bound on the dimension of BCH codes and
then discussed the relationship between Hamming codes and BCH codes:
All binary Hamming codes are BCH codes but not all non-binary ones are
cyclic. We saw an example but left the proof for the next lecture.

**15/03:** We proved the BCH-bound and defined *BCH
codes with designed distance* \(\delta\). These are cyclic
codes constructed such that their defining set has at least \(\delta-1\) consecutive elements,
which by the BCH bound means their minimum distance is at least the
designed distance.

**09/03:** We spent the entire lecture on an example that brought all the pieces of the theory of cyclic codes we have covered so far together.

**08/03:** We defined the *defining set* of a cyclic code \(\mathcal C=\langle g(x) \rangle \subseteq \mathbb F_q^n\) to be the union of the \(q\)-cyclotomic cosets modulo \(n\) that are used in the construction of \(g(x)\). We say that this set has \(s\) *consecutive elements* if it contains \(s\) consecutive integers modulo \(n\). We then saw that the minimum distance of a cyclic code is bounded below by the number of consecutive elements in its defining set plus one (the BCH-bound). We also briefly discussed the defining set of the dual code.
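Both ingredients of the BCH bound are easy to compute directly; the sketch below finds the \(q\)-cyclotomic cosets modulo \(n\) and the longest run of consecutive elements in a defining set (function names are my own).

```python
# q-cyclotomic cosets modulo n, and the consecutive-run count that
# feeds the BCH bound.

def cyclotomic_cosets(q, n):
    """q-cyclotomic cosets modulo n (assumes gcd(q, n) = 1)."""
    seen, cosets = set(), []
    for s in range(n):
        if s in seen:
            continue
        coset, x = [], s
        while x not in seen:          # orbit of s under multiplication by q
            seen.add(x)
            coset.append(x)
            x = (x * q) % n
        cosets.append(sorted(coset))
    return cosets

def longest_consecutive_run(defining_set, n):
    """Longest run of consecutive integers mod n inside the set."""
    best = 0
    for start in range(n):
        length = 0
        while (start + length) % n in defining_set:
            length += 1
            if length == n:
                break
        best = max(best, length)
    return best

print(cyclotomic_cosets(2, 7))        # [[0], [1, 2, 4], [3, 5, 6]]
# Defining set {1,2,4} contains the consecutive pair 1,2, so the
# BCH bound gives minimum distance >= 3 (the [7,4] Hamming code).
print(longest_consecutive_run({1, 2, 4}, 7) + 1)   # 3
```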

**02/03:** We proved that every cyclic code has a unique idempotent that generates it. Then we showed how to find all the idempotents in the factor ring \(\mathbb F_2[x] / \langle x^n-1 \rangle \). Each such idempotent generates a cyclic code, and taking its g.c.d. with \(x^n-1\) gives the generating polynomial of the code.
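For tiny \(n\) the idempotents can even be found by brute force; the sketch below does this in \(\mathbb F_2[x]/\langle x^7-1 \rangle\) and recovers a generating polynomial via the g.c.d., as described above (brute force is only feasible for very small \(n\)).

```python
# Brute-force search for idempotents in F_2[x]/<x^7 - 1>, then
# gcd(e(x), x^7 - 1) to get the generating polynomial of the
# cyclic code each idempotent generates.

n = 7

def mulmod(a, b):
    """Multiply coefficient lists of length n mod x^n - 1 over GF(2)."""
    out = [0] * n
    for i in range(n):
        if a[i]:
            for j in range(n):
                if b[j]:
                    out[(i + j) % n] ^= 1
    return out

def gcd_gf2(a, b):
    """gcd of GF(2) polynomials given as integer bitmasks (bit i = x^i)."""
    while b:
        while a and a.bit_length() >= b.bit_length():
            a ^= b << (a.bit_length() - b.bit_length())
        a, b = b, a
    return a

idempotents = []
for mask in range(2 ** n):
    e = [(mask >> i) & 1 for i in range(n)]
    if mulmod(e, e) == e:            # e(x)^2 = e(x)
        idempotents.append(mask)

print(len(idempotents))   # 8: one per subset of the 3 irreducible factors
# idempotents[2] is x + x^2 + x^4; its gcd with x^7 - 1 is 1 + x + x^3,
# the generating polynomial of the [7,4] Hamming code.
print(bin(gcd_gf2(idempotents[2], (1 << n) | 1)))   # 0b1011
```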

**01/03:** We saw an alternative encoding method in which the message is found at the end of the codeword, and also how to decode and detect errors. After that, we discussed how to factorise \(x^n-1\) over \(\mathbb F_q\) using \(q\)-cyclotomic cosets modulo \(n\). Since this involves primitive \(n^{th}\) roots of unity, it would be nice to have an alternative method for finding generating polynomials of cyclic codes. For that we need *idempotents*, that is, polynomials equal to their own squares: \(e(x)=e^2(x)\).

**24/02:** We started discussing cyclic codes: An \([n,k]\)-code \(\mathcal C\) over \(\mathbb F_q\) is cyclic if for any codeword \(c=(c_0,\ldots,c_{n-1})\), \(c'=(c_{n-1},c_0,\ldots,c_{n-2})\) is also a codeword. By viewing each codeword as a polynomial of degree less than \(n\) (i.e. viewing \(c\) as \(c(x)=c_0+c_1x+\ldots+c_{n-1}x^{n-1}\)) we noted that \(\mathcal C\) is the set of products of a monic polynomial \(g(x)\) of degree \(n-k\) with polynomials in \(\mathbb F_q[x]\) modulo \((x^n-1)\). This \(g(x)\) is unique and is called the *generating polynomial* of \(\mathcal C\).

Encoding of messages, that is vectors of length \(k\) or, equivalently, polynomials of degree less than \(k\), is done via multiplication (\(\text{mod}\ x^n-1\)) by \(g(x)\). A generating matrix for \(\mathcal C\) is easily obtained from \(g(x)\), and a parity check matrix is similarly obtained from the *parity check polynomial* \(h(x)=(x^n-1)/g(x)\). This fact hinted to us that \(\mathcal C^{\perp}\) is also cyclic.
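The encoding step is easy to sketch. Below, \(g(x)=1+x+x^3\) generates a \([7,4]\) binary cyclic code (one standard example), and a cyclic shift of a codeword corresponds to multiplication by \(x\) mod \(x^7-1\), hence is again a multiple of \(g(x)\).

```python
# Encoding a [7,4] binary cyclic code with generating polynomial
# g(x) = 1 + x + x^3: the codeword is m(x) * g(x) mod (x^7 - 1).

n, g = 7, [1, 1, 0, 1]          # g(x) = 1 + x + x^3

def encode(m):
    """Multiply m(x) by g(x) modulo x^n - 1 over GF(2)."""
    c = [0] * n
    for i, mi in enumerate(m):
        if mi:
            for j, gj in enumerate(g):
                c[(i + j) % n] ^= gj
    return c

c = encode([1, 0, 1, 0])        # m(x) = 1 + x^2
print(c)                        # [1, 1, 1, 0, 0, 1, 0]
# A cyclic shift is x*c(x) mod (x^7 - 1): still a multiple of g(x),
# hence again a codeword.
print(c[-1:] + c[:-1])
```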

Our final observation was that since \(g(x)\) divides \(x^n-1\) and is unique, there is one cyclic code of length \(n\) over \(\mathbb F_q\) for each non-trivial monic divisor of \(x^n-1\). Supposing \(x^n-1\) has \(m\) irreducible factors, if \(\gcd(q,n)=1\) then they are all distinct and so there are \(2^m-1\) different possibilities for a generating polynomial.

**23/02:** We showed how to construct finite fields and stated some fundamental results that we will refer to in coming lectures.

**17/02:** After looking at some more bounds, including the
Singleton upper bound which is met by *maximum distance
separable* (MDS) codes, we began introducing finite fields.

**16/02:** We had a look at bounds on the size of
codes. Ideally, a code should have short length, a high number of
codewords and a large minimum distance. Unfortunately these goals are
conflicting, and the bounds we saw give us an idea of the
limitations on the parameters. Codes that have the largest possible
size for their length and minimum distance are called *optimal*.

**10/02:** We deduced the parameters of the binary Reed-Muller
codes and proved a couple of other elementary results (e.g. that \(\mathcal
R(i,m) \subseteq \mathcal R(j,m)\) for \(0\leq i \leq j \leq m\)), using the
first or the second construction method as appropriate.
We ended by showing (without many details or explanations) an
efficient decoding method (encoding is done the usual way).

**09/02:** We gave two descriptions of the
binary Reed-Muller codes \(\mathcal R(r,m)\) which, for \(m \in
\mathbb Z^+\) and \(r \in \mathbb Z\) such that \(0\leq r \leq m\), are
\(\displaystyle [2^m,\sum_{i=0}^r\binom{m}{i},2^{m-r}]\)-codes. Firstly, we discussed
the recursive construction: \(\mathcal R(0,m)\) is the repetition code
of length \(2^m\), \(\mathcal R(m,m)\) is the entire space \(\mathbb F_2^{2^m}\), and
the intermediate codes for \(1 \leq r \leq m-1\) are given by combining
codes, i.e. \(\mathcal
R(r,m)=\mathcal R(r,m-1) \oplus \mathcal R(r-1,m-1)\).
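The recursion translates directly into code. A sketch under one extra assumption: for the \(r=m\) base case it uses the fact that \(\mathcal R(m-1,m)\) is the even-weight code, so appending any odd-weight row spans the whole space.

```python
# Recursive (u | u+v) construction of binary Reed-Muller generator
# matrices, following R(r,m) = R(r,m-1) combined with R(r-1,m-1).

def rm_generator(r, m):
    """Generator matrix of R(r,m) as a list of GF(2) rows."""
    if r == 0:
        return [[1] * (2 ** m)]                    # repetition code
    if r == m:
        # R(m-1,m) is the even-weight code; one odd-weight row
        # extends it to the full space F_2^(2^m).
        return rm_generator(m - 1, m) + [[0] * (2 ** m - 1) + [1]]
    g1 = rm_generator(r, m - 1)                    # R(r, m-1)
    g2 = rm_generator(r - 1, m - 1)                # R(r-1, m-1)
    rows = [u + u for u in g1]                     # (u | u)
    rows += [[0] * 2 ** (m - 1) + v for v in g2]   # (0 | v)
    return rows

G = rm_generator(1, 3)      # R(1,3) is an [8, 4, 4] code
print(len(G), len(G[0]))    # 4 8
```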

The second construction was based on polynomials in \(m\) variables over
\(\mathbb F_2\): For \(0\leq i \leq n-1\), where \(n=2^m\), let \(P_i\) be the binary representation
of \(i\) as a vector in \(\mathbb F_2^m\), and let \(\mathcal B(r,m)\) be the set of
monomials in \(m\) variables of degree at most \(r\) in which each variable
appears at most once.
Then \(\mathcal R(r,m)\) is the code spanned by
\(\mathcal{BR}(r,m)=\{\left ( f(P_0),\ldots,f(P_{n-1})\right )\mid\ f \in \mathcal
B(r,m)\}\). Finally, we showed that the two constructions are
equivalent, since the generating matrix in one case can be obtained from
the one in the other by elementary row operations.
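The monomial description is also a few lines of code. A sketch, where reading the bits of \(i\) least-significant-first is an arbitrary convention for \(P_i\):

```python
# Boolean-monomial description of R(r,m): evaluate every squarefree
# monomial of degree <= r in m variables at all points of F_2^m.

from itertools import combinations

def rm_eval_generator(r, m):
    """Rows (f(P_0), ..., f(P_{n-1})) for the monomials f in B(r,m)."""
    n = 2 ** m
    # P_i: bits of i, least-significant bit first (arbitrary convention)
    points = [[(i >> j) & 1 for j in range(m)] for i in range(n)]
    rows = []
    for deg in range(r + 1):
        for vars_ in combinations(range(m), deg):   # squarefree monomial
            row = [1] * n
            for v in vars_:
                row = [row[i] & points[i][v] for i in range(n)]
            rows.append(row)
    return rows

G = rm_eval_generator(1, 3)
print(len(G), len(G[0]))    # dimension 1 + 3 = 4, length 8
```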

**03/02:** We studied two additional methods for modifying a
code. *Shortening* a code \(\mathcal C\) at the \(i^{th}\) position means
puncturing at the \(i^{th}\) position the subcode of \(\mathcal C\) consisting of
those codewords whose \(i^{th}\) coordinate is zero. The resulting code
is denoted by \(\mathcal C_{\{i\}}\). If we puncture or shorten a code at
several positions, say those in a set \(T\), we denote the resulting codes by \(\mathcal C^T\) and \(\mathcal C_T\) respectively.
Interestingly, we
observed that \(\left (\mathcal C^\perp \right )_T = \left ( \mathcal C^T\right
)^\perp\) and \(\left ( \mathcal C^\perp\right )^T =\left ( \mathcal C_T\right
)^\perp\).
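Both operations are one-liners at the level of codeword sets. A sketch on a toy code (the \([4,2]\) binary code spanned by \(1100\) and \(0011\) is an arbitrary example):

```python
# Puncturing and shortening a code at a set of positions T.

def puncture(code, T):
    """Delete the coordinates in T from every codeword."""
    return {tuple(c for i, c in enumerate(w) if i not in T) for w in code}

def shorten(code, T):
    """Keep codewords that are zero on T, then delete those positions."""
    kept = [w for w in code if all(w[i] == 0 for i in T)]
    return puncture(kept, T)

# Toy example: the [4,2] binary code spanned by 1100 and 0011.
code = {(0, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 1), (1, 1, 1, 1)}
print(sorted(puncture(code, {0})))
print(sorted(shorten(code, {0})))
```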

The final method was *combining* two codes over the same
alphabet. In particular, if we have two codes \(\mathcal C_1\) and \(\mathcal C_2\) over \(\mathbb F_q\) of length \(n\)
and dimensions \(k_1\) and \(k_2\) respectively, then we
can form all words \((u \mid u+v)\) with \(u \in \mathcal C_1\) and \(v \in \mathcal C_2\). Thus, the resulting code \(\mathcal C_3=\mathcal C_1 \oplus \mathcal C_2\) will
have double the length, and each of the \(q^{k_1}\) codewords of the
first code gives rise to \(q^{k_2}\) codewords. We saw that
the generator and parity check matrices of the resulting code are
easily obtained from those of the component codes. Though weak (due to
the short resulting minimum distance) this construction method yields
a very nice class of codes, the *Reed-Muller* codes, which have
a nice recursive structure that allows for easy implementations and
decoding.

**02/02:** We started with the study of the *dual* of
an \([n,k]\)-code \(\mathcal C\), that is the \([n,n-k]\)-code \(\mathcal C^\perp\) whose every codeword has inner product zero
with each and every codeword of \(\mathcal C\). Interestingly, if \(G\) and \(H\) are
generator and parity check matrices for \(\mathcal C\), they switch roles when it
comes to \(\mathcal C^\perp\). It can happen that a code is contained in or even
equal to its dual,
in which case it is called *self-orthogonal* or *self-dual* respectively. We saw that self-dual codes are \([2k,k]\)-codes for some \(k \in \mathbb Z^+\) and that \(\mathcal
H_{3,2}\) is an example.

We then went on to discuss ways of modifying a code to obtain
another one with controllable parameters. *Extending* an \([n,k]\) code
means appending a parity-check digit to each of its codewords. The
result is a linear \([n+1,k]\)-code, denoted by \(\widehat{\mathcal C}\), whose
minimum distance is either \(d(\mathcal C)\) or \(d(\mathcal C)+1\). Its generator matrix
is obtained by appending a column to the generator matrix of \(\mathcal C\), such
that each entry is the parity check digit of its row. We demonstrated
the method by obtaining the extended Hamming
code \(\widehat{\mathcal H}_3\) which is also self-dual.
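The extension step is tiny in code. A sketch, where the generator matrix below is one common standard-form choice for the \([7,4]\) Hamming code (any equivalent matrix works):

```python
# Extending a binary code: append an overall parity-check digit so
# every codeword gets even weight.

def extend(G):
    """Append a parity column to a binary generator matrix."""
    return [row + [sum(row) % 2] for row in G]

# One standard-form generator matrix [I_4 | A] of the [7,4] Hamming code.
G = [[1, 0, 0, 0, 0, 1, 1],
     [0, 1, 0, 0, 1, 0, 1],
     [0, 0, 1, 0, 1, 1, 0],
     [0, 0, 0, 1, 1, 1, 1]]
G_ext = extend(G)
print(G_ext[0])   # every extended row now has even weight
```

Because the parity digit is a linear function of the codeword, extending the generator rows generates exactly the extended \([8,4,4]\) code.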

We concluded the lecture with *puncturing* an \([n,k,d]\)-code
\(\mathcal C\) at, say, the \(i^{th}\) position. By that we mean removing the
\(i^{th}\) coordinate from all of its codewords, and the resulting code
\(\mathcal C^*_i\) has a generating matrix obtained from one of \(\mathcal C\) by
removing the \(i^{th}\) column and any resulting zero or duplicate
rows. It is an \([n-1,k,d]\)-code unless it has a codeword of minimum
weight that is non-zero at the \(i^{th}\) position, in which case it is either
an \([n-1,k,d-1]\)-code (if \(d>1\)) or an \([n-1,k-1]\)-code with \(d(\mathcal C^*_i)\geq 1\) (if \(d=1\)).

**27/01:** First we showed how Hamming codes generalise
to any field \(\mathbb F_q\): \(\mathcal H_{q,r}\) is a
\(\left [\frac{q^r-1}{q-1},\frac{q^r-1}{q-1}-r,3 \right ]\)-code with parity check
matrix whose columns consist of a non-zero representative from each \(1\)-dimensional
subspace of \(\mathbb F_q^r\).
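The parity check matrix description translates directly into code. A sketch restricted to prime \(q\) (so that arithmetic mod \(q\) gives the field), normalising each column so its first non-zero entry is \(1\):

```python
# Parity check matrix of the q-ary Hamming code H_{q,r}: one non-zero
# representative from each 1-dimensional subspace of F_q^r.  Restricted
# to prime q; columns are normalised to leading coefficient 1.

from itertools import product

def hamming_check_matrix(q, r):
    cols = []
    for v in product(range(q), repeat=r):
        # keep v iff non-zero and its first non-zero entry is 1
        if any(v) and v[next(i for i, x in enumerate(v) if x)] == 1:
            cols.append(v)
    return [[c[i] for c in cols] for i in range(r)]   # r rows

H = hamming_check_matrix(3, 2)   # ternary Hamming [4, 2, 3] code
print(H)   # 2 x 4: one column per line of F_3^2
```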

Afterwards, we saw how the *binary symmetric channel* (BSC) models
message transmission and why the nearest neighbour decoding approach
is natural in a sense. Moreover, we discussed Shannon's theorem which
puts some constraints on a code's parameters with respect to the BSC's
capacity.

Finally, we introduced the concept of spheres: A sphere of radius \(r\)
centered at vector \(x \in \mathbb F_q^n\) is the set \(S_r(x)\) containing all
vectors from \(\mathbb F_q^n\) with Hamming distance from \(x\) at most \(r\). By our
previous results, the spheres of radius \(t\) centered at the codewords
of a code \(\mathcal C\) that corrects up to \(t\) errors are pairwise
disjoint, and \(t\) is called the *packing radius* of
\(\mathcal C\). Moreover, if for a code \(\mathcal C\) these spheres cover the entire space
\(\mathbb F_q^n\), then this code attains the *Sphere Packing* (or *Hamming*) bound
and is called a *perfect code*. The Hamming codes are examples
of such codes.
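The Hamming-code case of the sphere packing bound is a one-line arithmetic check: the \(2^k\) spheres of radius \(1\) each contain \(1+n\) vectors and together exactly fill \(\mathbb F_2^n\).

```python
# Checking that binary Hamming codes are perfect: spheres of radius
# t = 1 around the 2^k codewords exactly fill F_2^n.

from math import comb

def sphere_size(n, t, q=2):
    """Number of vectors within Hamming distance t of a fixed centre."""
    return sum(comb(n, i) * (q - 1) ** i for i in range(t + 1))

for r in range(2, 6):
    n = 2 ** r - 1              # Hamming code length
    k = n - r                   # Hamming code dimension
    print(n, k, 2 ** k * sphere_size(n, 1) == 2 ** n)   # always True
```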

**26/01:** We showed how we can avoid creating the
\(q^{n-k}\times q^k\) array of
the coset method using *syndrome decoding*. The *syndrome* of a vector \(y \in \mathbb F_q^n\)
is \(syn(y)=Hy^T\), where \(H\) is a parity check matrix of the code \(\mathcal C
\leq \mathbb F_q^n\). It has the property that two vectors are in the same
coset iff they have the same syndrome. Hence we reduce the array to a
\(q^{n-k}\times 2\) list containing the coset leaders and their
syndromes. We also discussed how to construct this list: a coset
leader candidate whose Hamming weight exceeds the number of errors the
code can correct is put on the list only if its syndrome is not
already there.
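The syndrome list can be built by scanning vectors in order of increasing weight, so that each coset keeps its lowest-weight representative. A sketch for the \([7,4]\) Hamming code, with one illustrative standard-form choice of \(H\):

```python
# Syndrome -> coset leader table for the [7,4] Hamming code, using one
# (illustrative) standard-form parity check matrix H = [A^T | I_3].

from itertools import product

H = [[0, 1, 1, 1, 1, 0, 0],
     [1, 0, 1, 1, 0, 1, 0],
     [1, 1, 0, 1, 0, 0, 1]]

def syndrome(y):
    """syn(y) = H y^T over GF(2)."""
    return tuple(sum(h * v for h, v in zip(row, y)) % 2 for row in H)

# Scan vectors by increasing weight so each coset keeps its
# lowest-weight representative as leader.
table = {}
for weight in range(8):
    for y in product([0, 1], repeat=7):
        if sum(y) == weight and syndrome(y) not in table:
            table[syndrome(y)] = y

print(len(table))                             # 8 cosets (q^(n-k) = 2^3)
print(max(sum(l) for l in table.values()))    # 1: leaders have weight <= 1
```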

Finally, we saw a special class of binary linear codes, called the *Hamming
codes* \(\mathcal H_r\), which for \(r \ge 2\) are \([2^r-1,2^r-r-1,3]\)
codes whose parity check matrices have as columns the binary
representation of the numbers \(1\) to \(2^r-1\) (in any order). They are interesting due
to their property that all the coset leaders (w.r.t. syndrome
decoding) have weight at most
\(1\). As a result, no lists are needed for decoding: One computes the
syndrome of the received vector, finds its position among the columns of
the parity check matrix and decodes by changing the vector's digit in the
corresponding position.
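This table-free decoding procedure is short enough to sketch in full. Here the \(i^{th}\) column of \(H\) is the binary representation of \(i+1\), so the syndrome, read as a number, is directly the error position:

```python
# Decoding a binary Hamming code H_r without any table: the syndrome
# of the received vector, read as a binary number, is the error
# position (column i of H is the binary representation of i+1).

r = 3
n = 2 ** r - 1                       # length 7
H = [[(j + 1) >> b & 1 for j in range(n)] for b in range(r)]

def decode(y):
    s = [sum(h * v for h, v in zip(row, y)) % 2 for row in H]
    pos = sum(s[b] << b for b in range(r))   # syndrome as an integer
    if pos:                                  # non-zero: flip that position
        y = y[:]
        y[pos - 1] ^= 1
    return y

w = [0, 0, 0, 0, 0, 0, 0]            # the zero codeword
w[4] ^= 1                            # error in position 5
print(decode(w))                     # the zero codeword again
```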

**20/01:** We began looking for efficient ways to decode a received vector: according to nearest neighbour decoding, if this vector is not a codeword, we should replace it by the codeword closest to it. Finding this codeword can be hard if the code is large, so we need more efficient techniques.

We looked at the coset method: we made an array of all the elements of \(\mathbb F_q^n\) in which the first row is the code (starting with the zero codeword) and each of the other rows is a coset of the code; the first column contains the vectors of smallest weight (called the *coset leaders*) and every other entry is the sum of the row's coset leader with the codeword at the top of its column. A received vector is then decoded by subtracting from it the coset leader of its row. Writing down this array is clearly inefficient; we will improve on this method in the next lecture.

**19/01:** We first mentioned code equivalence: If we permute the columns of each codeword of a code in the same way, we obtain a code *equivalent* to it. It was easy to see that every code is equivalent to one with generator matrix in standard form.

Next we began discussing the number of errors a code can detect and correct. Errors are detectable as long as they do not transform the original codeword into another codeword. For this reason we defined a metric \(d(x,y)\) on \(\mathbb F_q^n\), called the *Hamming distance*, as the number of positions in which two vectors \(x\) and \(y\) differ. Then, if we denote the smallest of all distances between distinct codewords of a code \(C\) by \(d(C)\) (and call it the *minimum distance*), \(C\) can detect up to \(d(C)-1\) errors and correct up to \(\left \lfloor \frac{d(C)-1}{2} \right \rfloor \) errors.

Calculating \(d(C)\) can be hard if \(C\) is large. As a first shortcut, we saw that for a linear code \(C\), the minimum distance is equal to the *minimum weight* \(wt(C)\), which is the smallest of the weights of the codewords in \(C\) and is easier to compute. The *Hamming weight* \(wt(x)\) of a vector \(x\) is the number of non-zero entries in it.
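The shortcut is easy to confirm computationally on a toy example; the small \([5,2]\) binary code below is a hypothetical choice for illustration:

```python
# For a linear code d(C) = wt(C): compare a brute-force minimum
# distance with the minimum non-zero weight on a small example.

from itertools import combinations, product

G = [[1, 0, 1, 1, 0],      # a small (hypothetical) [5,2] binary code
     [0, 1, 0, 1, 1]]

code = []
for m in product([0, 1], repeat=2):
    code.append(tuple(sum(m[i] * G[i][j] for i in range(2)) % 2
                      for j in range(5)))

min_dist = min(sum(a != b for a, b in zip(x, y))
               for x, y in combinations(code, 2))
min_wt = min(sum(c) for c in code if any(c))
print(min_dist, min_wt)    # equal, as the theory predicts
```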

**13/01:** We started with linear error correcting block codes over \(\mathbb F_q\). We formally defined them as subspaces of \(\mathbb F_q^n\) of \(q^k\) elements (codewords) that are parametrised by the length of their codewords (i.e. \(n\)) and their dimension (i.e. the basis cardinality \(k\)). We also defined two types of matrices related to an \([n,k]\)-linear code \(C\): generator matrices and parity check matrices.

Generator matrices are \(k\times n\) matrices whose rows form a basis of \(C\), and each is "equivalent" to the same unique matrix \(G=[I_k | A]\), which we say is in standard form. They are used for obtaining the codeword \(c\) of length \(n\) corresponding to a message \(m\) of length \(k\) by adding \(n-k\) redundancy digits to it via \(c=mG\).

Parity check matrices are \((n-k) \times n\) matrices with the property that for any codeword \(c\in C, \ Hc^T=0_{n-k \times 1}\) or equivalently \(cH^T=0_{1 \times n-k}\). This provides an easy way to detect errors in received messages, especially since \(H=[-A^T|I_{n-k}]\) is such a matrix. The use of parity check matrices in decoding will be seen at a later stage.
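The passage from \(G=[I_k|A]\) to \(H=[-A^T|I_{n-k}]\) can be verified in a few lines; over \(\mathbb F_2\), \(-A^T=A^T\), and the small \([5,2]\) code below is an arbitrary example:

```python
# From a standard-form generator matrix G = [I_k | A] to the parity
# check matrix H = [-A^T | I_{n-k}]; over GF(2), -A^T = A^T.

k, n = 2, 5
A = [[1, 1, 0],
     [0, 1, 1]]
G = [[int(i == j) for j in range(k)] + A[i] for i in range(k)]
H = [[A[i][j] for i in range(k)] + [int(j == t) for t in range(n - k)]
     for j in range(n - k)]

# Every codeword c = mG satisfies Hc^T = 0; it suffices to check that
# each row of G is orthogonal to each row of H:
print(all(sum(g * h for g, h in zip(grow, hrow)) % 2 == 0
          for grow in G for hrow in H))   # True
```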

In the process we briefly reviewed all relevant concepts from linear algebra as well as finite fields.

**12/01:** We discussed what the course is about, namely reliable communication over noisy channels. In short, when messages are transmitted, they might get altered by errors introduced due to noise in the communication channels. Coding Theory is about techniques that allow the receiver to remove these errors and recover the message originally sent. This is usually done by adding redundancy to each message, sufficient for its reconstruction at the receiver's end.