Lecture log

20/04: We expanded state diagrams into trellis diagrams, which depict all possible paths we can follow in a state diagram. To decode a received vector we follow the path whose output is closest to it. This is called Viterbi decoding.
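As an illustration (not from the lecture), here is a minimal Python sketch of hard-decision Viterbi decoding over the trellis of a rate-\(1/2\) convolutional code; the generator polynomials \(1+x+x^2\) and \(1+x^2\) are an assumed textbook example, the same ones used in the encoding sketch under the 19/04 entry below.

```python
def viterbi(received):
    # received: list of 2-bit output pairs, e.g. [(1, 1), (0, 1), ...]
    states = [(0, 0), (0, 1), (1, 0), (1, 1)]  # (previous bit, bit before that)
    cost = {s: (0 if s == (0, 0) else float('inf')) for s in states}
    path = {s: [] for s in states}             # best input sequence per state
    for r in received:
        new_cost = {s: float('inf') for s in states}
        new_path = {}
        for (p, pp) in states:                 # survivor of the previous step
            for u in (0, 1):                   # hypothesised input bit
                out = ((u + p + pp) % 2, (u + pp) % 2)  # 1+x+x^2 and 1+x^2
                d = cost[(p, pp)] + (out[0] != r[0]) + (out[1] != r[1])
                if d < new_cost[(u, p)]:
                    new_cost[(u, p)], new_path[(u, p)] = d, path[(p, pp)] + [u]
        cost, path = new_cost, new_path
    return path[min(states, key=lambda s: cost[s])]

print(viterbi([(1, 1), (1, 1), (0, 0)]))   # [1, 0, 1]: one flipped bit corrected
```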

19/04: We began with our final topic of the course, namely binary convolutional codes. These, unlike block codes, can encode messages of arbitrary length. Encoding is done with the help of the code's generating matrix. We saw that this is just polynomial multiplication and that the process can be mechanised using shift registers. Finally, we saw that state diagrams can simplify our calculations. How to generalise state diagrams and use them in decoding is the topic of tomorrow's lecture.
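Since encoding is just polynomial multiplication, a few lines of Python suffice to simulate it. The generator polynomials below are an assumption for illustration, not necessarily the lecture's example.

```python
def poly_mult_gf2(a, b):
    # multiply two binary polynomials given as coefficient lists (lowest degree first)
    out = [0] * (len(a) + len(b) - 1)
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            out[i + j] ^= ai & bj
    return out

message = [1, 0, 1, 1]             # message polynomial 1 + x^2 + x^3
g1, g2 = [1, 1, 1], [1, 0, 1]      # assumed generators 1+x+x^2 and 1+x^2
stream1 = poly_mult_gf2(message, g1)
stream2 = poly_mult_gf2(message, g2)
# the transmitted sequence interleaves the two output streams
codeword = [bit for pair in zip(stream1, stream2) for bit in pair]
```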

13/04: We continued with the third step of the BCH-decoding process, namely finding the roots of the error locator polynomial \(s(x)\). By the construction of \(s(x)\), the inverses of its roots give us the error locations. In the final step we used these to construct the error polynomial \(e(x)\) and thus correct the received word. We spent the rest of the lecture going through another example.
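A sketch of the root-finding step, trying every power of \(\alpha\) in turn (a Chien-search-style exhaustive test). The `field` object with its `zero`, `add`, `mult` and `power` members is a hypothetical interface assumed here for illustration.

```python
def error_locations(s_coeffs, field, alpha, n):
    # s_coeffs: coefficients of s(x), lowest degree first, as field elements
    locations = []
    for j in range(n):
        x = field.power(alpha, j)              # candidate root alpha^j
        value = field.zero
        for c in reversed(s_coeffs):           # Horner evaluation of s(x)
            value = field.add(field.mult(value, x), c)
        if value == field.zero:
            locations.append((-j) % n)         # its inverse alpha^(n-j) marks an error
    return locations
```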

12/04: We started discussing how to decode BCH codes. There are four steps in this process and we managed to go through the first two, both in theory and in an example. Given a received polynomial \(w(x)=c(x)+e(x)\), where \(c(x)\) is a codeword and \(e(x)\) is the error polynomial, one first needs to compute the syndromes \(S_i=w(\alpha^i)\), where \(i\) runs through the \(\delta-1\) consecutive elements of the code's defining set, \(\alpha\) is the primitive \(n^{th}\) root of unity associated with the defining set and \(\delta\) is the designed distance. Then, with the use of the Berlekamp-Massey algorithm, the error locator polynomial \(s(x)\) is obtained; it will help us determine the error positions (i.e. the non-zero terms of \(e(x)\)) in tomorrow's lecture.
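A sketch of the syndrome step under the same hypothetical `field` interface as in the sketch above. Since \(c(\alpha^i)=0\) on the defining set, \(S_i=e(\alpha^i)\), and all-zero syndromes mean the received word is a codeword.

```python
def syndromes(w_coeffs, field, alpha, b, delta):
    # S_i = w(alpha^i) for the delta-1 consecutive exponents i = b, ..., b+delta-2
    S = []
    for i in range(b, b + delta - 1):
        x = field.power(alpha, i)
        value = field.zero
        for c in reversed(w_coeffs):           # Horner evaluation of w(x)
            value = field.add(field.mult(value, x), c)
        S.append(value)
    return S                                   # input to Berlekamp-Massey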

22/03: We saw a special class of BCH codes over \(\mathbb F_q\) called Reed-Solomon (RS) codes. These have length \(q-1\), which equips them with special properties: they are MDS, their designed distance \(\delta\) is their actual minimum distance, their defining set consists only of \(\delta-1\) consecutive elements, and their duals are also RS (although in general, duals of BCH codes are not BCH).
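For instance (an illustrative example, not one from the lecture): over \(\mathbb F_8\) the defining set \(\{1,2,3,4\}\) gives a \([7,3]\) RS code with designed distance \(\delta=5\), and indeed its minimum distance is \(5=n-k+1\), confirming the MDS property.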

16/03: We first saw a bound on the dimension of BCH codes and then discussed the relationship between Hamming codes and BCH codes: All binary Hamming codes are BCH codes but not all non-binary ones are cyclic. We saw an example but left the proof for the next lecture.

15/03: We proved the BCH-bound and defined BCH codes with designed distance \(\delta\). These are cyclic codes constructed such that their defining set has at least \(\delta-1\) consecutive elements, which by the BCH bound means their minimum distance is at least the designed distance.

09/03: We spent the entire lecture on an example that brought all the pieces of the theory of cyclic codes we have covered so far together.

08/03: We defined the defining set of a cyclic code \(\mathcal C=\langle g(x) \rangle \subseteq \mathbb F_q^n\) to be the union of the \(q\)-cyclotomic cosets modulo \(n\) that are used in the construction of \(g(x)\). We say that this set has \(s\) consecutive elements if it contains \(s\) consecutive integers modulo \(n\). We then saw that the minimum distance of a cyclic code is bounded below by the number of consecutive elements in its defining set plus one (the BCH-bound). We also briefly discussed the defining set of the dual code.
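As an illustration (not from the lecture), a small Python sketch that reads off the BCH bound from a defining set by finding the longest run of consecutive integers modulo \(n\):

```python
def bch_bound(defining_set, n):
    best = 0
    for start in range(n):
        run = 0
        while (start + run) % n in defining_set:
            run += 1
            if run == n:
                break
        best = max(best, run)
    return best + 1    # the minimum distance is at least this

# Example: the defining set {1,2,4,8} U {3,6,9,12} modulo 15 has the
# 4 consecutive elements 1,2,3,4, so the bound is 5.
print(bch_bound({1, 2, 3, 4, 6, 8, 9, 12}, 15))   # 5
```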

02/03: We proved that every cyclic code has a unique idempotent that generates it. Then we showed how to find all the idempotents in the factor ring \(\mathbb F_2[x] / \langle x^n-1 \rangle \). Each such idempotent generates a cyclic code, and taking its g.c.d. with \(x^n-1\) gives the generating polynomial of the code.
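As an illustration (not from the lecture), a brute-force Python sketch, feasible only for small \(n\), that finds all idempotents in \(\mathbb F_2[x]/\langle x^n-1 \rangle\) by testing \(e(x)=e^2(x)\) directly:

```python
from itertools import product

def poly_mult_mod(a, b, n):
    # multiply two binary polynomials and reduce modulo x^n - 1
    out = [0] * n
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if ai & bj:
                out[(i + j) % n] ^= 1
    return tuple(out)

def idempotents(n):
    return [e for e in product((0, 1), repeat=n)
            if poly_mult_mod(e, e, n) == e]

# x^7 - 1 has 3 irreducible factors over F_2, so there are 2^3 idempotents
print(len(idempotents(7)))   # 8
```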

01/03: We saw an alternative encoding method in which the message is found at the end of the codeword, and also how to decode and detect errors. After that, we discussed how to factorise \(x^n-1\) over \(\mathbb F_q\) using \(q\)-cyclotomic cosets modulo \(n\). Since this involves primitive \(n^{th}\) roots of unity, it would be nice to have an alternative method for finding generating polynomials of cyclic codes. For that we need idempotents, that is polynomials that are equal to their squares: \(e(x)=e^2(x)\).
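As an illustration (not from the lecture), a short Python sketch computing \(q\)-cyclotomic cosets modulo \(n\) by repeatedly multiplying each representative by \(q\):

```python
def cyclotomic_cosets(q, n):
    cosets, seen = [], set()
    for s in range(n):
        if s in seen:
            continue
        coset, x = [], s
        while x not in coset:
            coset.append(x)
            x = (x * q) % n      # the coset of s is {s, sq, sq^2, ...} mod n
        cosets.append(sorted(coset))
        seen.update(coset)
    return cosets

print(cyclotomic_cosets(2, 15))
# [[0], [1, 2, 4, 8], [3, 6, 9, 12], [5, 10], [7, 11, 13, 14]]
```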

24/02: We started discussing cyclic codes: An \([n,k]\)-code \(\mathcal C\) over \(\mathbb F_q\) is cyclic if for any codeword \(c=(c_0,\ldots,c_{n-1})\), \(c'=(c_{n-1},c_0,\ldots,c_{n-2})\) is also a codeword. By viewing each codeword as a polynomial of degree less than \(n\) (i.e. viewing \(c\) as \(c(x)=c_0+c_1x+\ldots+c_{n-1}x^{n-1}\)) we noted that \(\mathcal C\) is the set of products of a monic polynomial \(g(x)\) of degree \(n-k\) with polynomials in \(\mathbb F_q[x]\) modulo \((x^n-1)\). This \(g(x)\) is unique and is called the generating polynomial of \(\mathcal C\).
Encoding of messages, that is vectors of length \(k\) or, equivalently, polynomials of degree less than \(k\), is done via multiplication (\(\text{mod}\ x^n-1\)) by \(g(x)\). A generating matrix for \(\mathcal C\) is easily obtained from \(g(x)\), and a parity check matrix is similarly obtained from the parity check polynomial \(h(x)=(x^n-1)/g(x)\). This fact hinted to us that \(\mathcal C^{\perp}\) is also cyclic.
Our final observation was that since \(g(x)\) divides \(x^n-1\) and is unique, there is one cyclic code of length \(n\) over \(\mathbb F_q\) for each monic divisor of \(x^n-1\). Supposing \(x^n-1\) has \(m\) irreducible factors, if \(\gcd(q,n)=1\) then they are all distinct, so there are \(2^m\) monic divisors and hence \(2^m-1\) different possibilities for a non-trivial generating polynomial.
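As an illustration (not from the lecture), encoding by multiplication with \(g(x)\) modulo \(x^n-1\) in a few lines of Python:

```python
def encode_cyclic(message, g, n):
    # message and g are binary coefficient lists, lowest degree first
    c = [0] * n
    for i, mi in enumerate(message):
        for j, gj in enumerate(g):
            if mi & gj:
                c[(i + j) % n] ^= 1    # reduction mod x^n - 1
    return c

# Example: the [7,4] cyclic code generated by g(x) = 1 + x + x^3;
# the message 1 + x^3 encodes to 1 + x + x^4 + x^6
print(encode_cyclic([1, 0, 0, 1], [1, 1, 0, 1], 7))   # [1, 1, 0, 0, 1, 0, 1]
```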

23/02: We showed how to construct finite fields and stated some fundamental results that we will refer to in coming lectures.

17/02: After looking at some more bounds, including the Singleton upper bound which is met by maximum distance separable (MDS) codes, we began introducing finite fields.
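For reference, the Singleton bound states that every \([n,k,d]\)-code over \(\mathbb F_q\) satisfies \(d \leq n-k+1\), with equality precisely for the MDS codes.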

16/02: We had a look at bounds on the size of codes. Ideally, a code should have short length, a high number of codewords and a large minimum distance. Unfortunately these goals conflict with one another, and the bounds we saw give us an idea of the limitations on the parameters. Codes that have the largest possible size for their length and minimum distance are called optimal.

10/02: We deduced the parameters of the binary Reed-Muller codes and proved a couple of other elementary results (e.g. that \(\mathcal R(i,m) \subseteq \mathcal R(j,m)\) for \(0\leq i \leq j \leq m\)), using the first or the second construction method as appropriate. In the end we showed (without many details or explanations) an efficient decoding method (encoding is done the usual way).

09/02: We gave two descriptions of the binary Reed-Muller codes \(\mathcal R(r,m)\) which, for \(m \in \mathbb Z^+\) and \(r \in \mathbb Z\) such that \(0\leq r \leq m\), are \(\displaystyle [2^m,\sum_{i=0}^r\binom{m}{i},2^{m-r}]\)-codes. Firstly, we discussed the recursive construction: \(\mathcal R(0,m)\) is the repetition code of length \(2^m\), \(\mathcal R(m,m)\) is the entire space \(\mathbb F_2^{2^m}\), and the intermediate codes for \(1 \leq r \leq m-1\) are given by combining codes, i.e. \(\mathcal R(r,m)=\mathcal R(r,m-1) \oplus \mathcal R(r-1,m-1)\).
The second construction was based on polynomials in \(m\) variables over \(\mathbb F_2\): For \(0\leq i \leq n-1\) let \(P_i\) be the binary representation of \(i\) as a vector in \(\mathbb F_2^m\), where \(n=2^m\), and let \(\mathcal B(r,m)\) be the set of monomials in \(m\) variables of degree at most \(r\) in which each variable appears at most once. Then \(\mathcal R(r,m)\) is the code spanned by \(\mathcal{BR}(r,m)=\{\left ( f(P_0),\ldots,f(P_{n-1})\right )|\ f \in \mathcal B(r,m)\}\). Finally, we showed that the two constructions are equivalent since the generating matrix in one case can be obtained from the one in the other by elementary row operations.
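As an illustration (not from the lecture), a Python sketch of the second construction for small parameters: evaluating every squarefree monomial of degree at most \(r\) at all points of \(\mathbb F_2^m\) yields the rows of a generating matrix.

```python
from itertools import combinations, product

def eval_monomial(variables, point):
    value = 1
    for v in variables:
        value &= point[v]
    return value

def reed_muller_generator(r, m):
    points = list(product((0, 1), repeat=m))            # P_0, ..., P_{2^m - 1}
    rows = []
    for deg in range(r + 1):
        for variables in combinations(range(m), deg):   # squarefree monomials
            rows.append([eval_monomial(variables, p) for p in points])
    return rows

# R(1,3): 4 rows of length 8 (the all-ones row for the empty monomial,
# then one row for each of x_1, x_2, x_3), i.e. an [8,4] code
for row in reed_muller_generator(1, 3):
    print(row)
```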

03/02: We studied two additional methods for modifying a code. Shortening a code \(\mathcal C\) at the \(i^{th}\) position means puncturing at the \(i^{th}\) position the subcode of \(\mathcal C\) consisting of those codewords whose \(i^{th}\) coordinate is zero. The resulting code is denoted by \(\mathcal C_{\{i\}}\). If we puncture or shorten a code at several positions, say those in a set \(T\), we denote the resulting codes by \(\mathcal C^T\) and \(\mathcal C_T\) respectively. Interestingly, we observed that \(\left (\mathcal C^\perp \right )_T = \left ( \mathcal C^T\right )^\perp\) and \(\left ( \mathcal C^\perp\right )^T =\left ( \mathcal C_T\right )^\perp\).
The final method was combining two codes over the same alphabet. In particular, if we have two codes \(\mathcal C_1\) and \(\mathcal C_2\) over \(\mathbb F_q\) of length \(n\) and dimensions \(k_1\) and \(k_2\) respectively, then we can append to each codeword of the first code the sum of it and each codeword of the second code. Thus, the resulting code \(\mathcal C_3=\mathcal C_1 \oplus \mathcal C_2\) has double the length, and each of the \(q^{k_1}\) codewords of the first code gives rise to \(q^{k_2}\) codewords, so \(\mathcal C_3\) has dimension \(k_1+k_2\). We saw that the generator and parity check matrices of the resulting code are easily obtained from those of the component codes. Though weak (the resulting minimum distance is only \(\min\{2d(\mathcal C_1), d(\mathcal C_2)\}\)), this construction method yields a very nice class of codes, the Reed-Muller codes, which have a nice recursive structure that allows for easy implementations and decoding.
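As an illustration (not from the lecture), the construction at the level of generator matrices, with \(\mathcal R(1,2)=\mathcal R(1,1)\oplus\mathcal R(0,1)\) as a tiny example:

```python
def combine(G1, G2):
    # generator of C1 (+) C2 = {(u, u+v) : u in C1, v in C2}
    n = len(G1[0])
    top = [row + row for row in G1]         # rows (u | u)
    bottom = [[0] * n + row for row in G2]  # rows (0 | v)
    return top + bottom

G1 = [[1, 1], [0, 1]]   # R(1,1) = F_2^2
G2 = [[1, 1]]           # R(0,1), the repetition code of length 2
for row in combine(G1, G2):
    print(row)          # a generator of the [4,3,2] code R(1,2)
```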

02/02: We started with the study of the dual of an \([n,k]\)-code \(\mathcal C\), that is the \([n,n-k]\)-code \(\mathcal C^\perp\) consisting of all vectors whose inner product with every codeword of \(\mathcal C\) is zero. Interestingly, if \(G\) and \(H\) are generator and parity check matrices for \(\mathcal C\), they switch roles when it comes to \(\mathcal C^\perp\). It can happen that a code is contained in or even equal to its dual, in which case it is called self-orthogonal or self-dual respectively. We saw that self-dual codes are \([2k,k]\)-codes for some \(k \in \mathbb Z^+\) and that the ternary Hamming code \(\mathcal H_{3,2}\) is an example.
We then went on to discussing ways of modifying a code to obtain another one with controllable parameters. Extending an \([n,k]\) code means appending a parity-check digit to each of its codewords. The result is a linear \([n+1,k]\)-code, denoted by \(\widehat{\mathcal C}\), whose minimum distance is either \(d(\mathcal C)\) or \(d(\mathcal C)+1\). Its generator matrix is obtained by appending a column to the generator matrix of \(\mathcal C\), such that each entry is the parity check digit of its row. We demonstrated the method by obtaining the extended Hamming code \(\widehat{\mathcal H}_3\) which is also self-dual.
We concluded the lecture with puncturing an \([n,k,d]\)-code \(\mathcal C\) at, say, the \(i^{th}\) position. By that we mean removing the \(i^{th}\) coordinate from all of its codewords; the resulting code \(\mathcal C^*_i\) has a generating matrix obtained from one of \(\mathcal C\) by removing the \(i^{th}\) column and any resulting zero or duplicate rows. It is an \([n-1,k,d]\)-code unless \(\mathcal C\) has a codeword of minimum weight that is non-zero at the \(i^{th}\) position. In that case, it is either an \([n-1,k,d-1]\)-code (if \(d>1\)) or an \([n-1,k-1]\)-code with \(d(\mathcal C^*_i)\geq 1\) (if \(d=1\)).
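As an illustration (not from the lecture), extending a binary code at the level of its generator matrix, exactly as described above: append to each row the parity digit that makes its row sum even.

```python
def extend_generator(G):
    return [row + [sum(row) % 2] for row in G]

# Example: a generator of the [7,4] Hamming code (rows are shifts of a
# weight-3 vector); the result generates the self-dual [8,4,4] code
G = [[1, 1, 0, 1, 0, 0, 0],
     [0, 1, 1, 0, 1, 0, 0],
     [0, 0, 1, 1, 0, 1, 0],
     [0, 0, 0, 1, 1, 0, 1]]
for row in extend_generator(G):
    print(row)
```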

27/01: First we showed how the Hamming codes can be generalised to any field \(\mathbb F_q\): \(\mathcal H_{q,r}\) is a \(\left [\frac{q^r-1}{q-1},\frac{q^r-1}{q-1}-r,3 \right ]\)-code whose parity check matrix has as columns a non-zero representative from each \(1\)-dimensional subspace of \(\mathbb F_q^r\).
Afterwards, we saw how the binary symmetric channel (BSC) models message transmission and why the nearest neighbour decoding approach is natural in a sense. Moreover, we discussed Shannon's theorem which puts some constraints on a code's parameters with respect to the BSC's capacity.
Finally, we introduced the concept of spheres: A sphere of radius \(r\) centered at a vector \(x \in \mathbb F_q^n\) is the set \(S_r(x)\) containing all vectors from \(\mathbb F_q^n\) with Hamming distance from \(x\) at most \(r\). By our previous results, the spheres of radius \(t\) centered at the codewords of a code \(\mathcal C\) that corrects up to \(t\) errors are pairwise disjoint, and \(t\) is called the packing radius of \(\mathcal C\). Moreover, if for a code \(\mathcal C\) these spheres cover the entire space \(\mathbb F_q^n\), then this code attains the Sphere Packing (or Hamming) bound and is called a perfect code. The Hamming codes are examples of such codes.
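For reference, counting the vectors in these spheres gives the bound explicitly: a \(t\)-error-correcting code \(\mathcal C \subseteq \mathbb F_q^n\) satisfies \(|\mathcal C| \cdot \sum_{i=0}^{t}\binom{n}{i}(q-1)^i \leq q^n\), since \(|S_t(x)|=\sum_{i=0}^{t}\binom{n}{i}(q-1)^i\); perfect codes attain equality.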

26/01: We showed how we can avoid creating the \(q^{n-k}\times q^k\) array of the coset method using syndrome decoding. The syndrome of a vector \(y \in \mathbb F_q^n\) is \(syn(y)=Hy^T\), where \(H\) is a parity check matrix of the code \(\mathcal C \leq \mathbb F_q^n\). It has the property that two vectors are in the same coset iff they have the same syndrome. Hence we reduce the array to a \(q^{n-k}\times 2\) list containing the coset leaders and their syndromes. We also discussed how to construct this list (if a coset leader candidate has Hamming weight greater than the maximum number of errors the code can correct, we put it on the list only if its syndrome is not already there).
Finally, we saw a special class of binary linear codes, called the Hamming codes \(\mathcal H_r\), which for \(r \ge 2\) are \([2^r-1,2^r-r-1,3]\) codes whose parity check matrices have as columns the binary representation of the numbers \(1\) to \(2^r-1\) (in any order). They are interesting due to their property that all the coset leaders (w.r.t. syndrome decoding) have weight at most \(1\). As a result, no lists are needed for decoding: One computes the syndrome of the received vector, finds its position among the columns of the parity check matrix and decodes by changing the vector's digit in the corresponding position.
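As an illustration (not from the lecture), Python decoding with \(\mathcal H_3\), taking the columns of the parity check matrix in increasing binary order so that the syndrome literally spells out the error position:

```python
H = [[0, 0, 0, 1, 1, 1, 1],    # column j reads j in binary, j = 1, ..., 7
     [0, 1, 1, 0, 0, 1, 1],
     [1, 0, 1, 0, 1, 0, 1]]

def decode(y):
    s = [sum(h * yi for h, yi in zip(row, y)) % 2 for row in H]
    pos = 4 * s[0] + 2 * s[1] + s[2]   # syndrome read as a number
    if pos:                            # non-zero syndrome: flip that digit
        y = y[:]
        y[pos - 1] ^= 1
    return y                           # zero syndrome: already a codeword

print(decode([1, 1, 1, 1, 0, 0, 1]))   # [1, 1, 0, 1, 0, 0, 1]: error at position 3
```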

20/01: We began looking for efficient ways to decode a received vector: according to nearest neighbour decoding, if this vector is not a codeword, we should replace it by the codeword at the shortest distance from it. Finding this codeword can be hard if the code is large, so we need more efficient techniques.
We looked at the coset method: we made an array of all the elements of \(\mathbb F_q^n\) in which the first row is the code itself (starting with the zero codeword) and each of the other rows is a coset of the code; the first column contains the vectors of smallest weight in each coset (called the coset leaders), and every other entry is the sum of its row's coset leader and the codeword at the top of its column. A received vector is then decoded by subtracting from it the leader of its coset. Writing down this array is clearly inefficient - we will improve on this method further in the next lecture.
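As an illustration (not from the lecture), a brute-force Python sketch that builds the standard array of a tiny binary code; it is only feasible for very small parameters, which is precisely the inefficiency noted above.

```python
from itertools import product

def standard_array(codewords, n):
    # codewords: list of tuples, the zero codeword first
    remaining = set(product((0, 1), repeat=n)) - set(codewords)
    rows = [list(codewords)]
    while remaining:
        leader = min(remaining, key=sum)   # any vector of smallest weight
        row = [tuple((l + c) % 2 for l, c in zip(leader, cw))
               for cw in codewords]
        rows.append(row)
        remaining -= set(row)
    return rows

# Example: the [3,1] repetition code has four cosets
for row in standard_array([(0, 0, 0), (1, 1, 1)], 3):
    print(row)
```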

19/01: We first mentioned code equivalence: If we permute the coordinates of every codeword of a code in the same way, we obtain a code equivalent to it. It was easy to see that every code is equivalent to one with a generator matrix in standard form.
Next we began discussing the number of errors a code can detect and correct. Errors are detectable as long as they do not transform the original codeword into another codeword. For this reason we defined a metric \(d(x,y)\) on \(\mathbb F_q^n\), called the Hamming distance, as the number of positions in which the vectors \(x\) and \(y\) differ. Then, if we denote the smallest of all distances between distinct codewords of a code \(C\) by \(d(C)\) (and call it the minimum distance), \(C\) can detect up to \(d(C)-1\) errors and correct up to \(\left \lfloor \frac{d(C)-1}{2} \right \rfloor \) errors.
Calculating \(d(C)\) can be hard if \(C\) is large. As a first shortcut, we saw that for a linear code \(C\), the minimum distance is equal to the minimum weight \(wt(C)\), which is the smallest of the weights of the non-zero codewords in \(C\) and is easier to compute. The Hamming weight \(wt(x)\) of a vector \(x\) is the number of non-zero entries in it.
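As an illustration (not from the lecture), computing the minimum weight of a small binary linear code by enumerating its codewords from a generator matrix:

```python
from itertools import product

def minimum_weight(G):
    # G: generator matrix of a binary [n,k]-code, as a list of rows
    best = len(G[0])
    for m in product((0, 1), repeat=len(G)):    # all 2^k messages
        if not any(m):
            continue                            # skip the zero codeword
        c = [sum(mi * gi for mi, gi in zip(m, col)) % 2 for col in zip(*G)]
        best = min(best, sum(c))                # weight = number of ones
    return best

G = [[1, 1, 0, 1, 0, 0, 0],     # a generator of the [7,4] Hamming code
     [0, 1, 1, 0, 1, 0, 0],
     [0, 0, 1, 1, 0, 1, 0],
     [0, 0, 0, 1, 1, 0, 1]]
print(minimum_weight(G))        # 3, so the code corrects 1 error
```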

13/01: We started with linear error correcting block codes over \(\mathbb F_q\). We formally defined them as subspaces of \(\mathbb F_q^n\) with \(q^k\) elements (codewords), parametrised by the length of their codewords (i.e. \(n\)) and their dimension (i.e. the basis cardinality \(k\)). We also defined two types of matrices related to an \([n,k]\)-linear code \(C\): generator matrices and parity check matrices.
Generator matrices are \(k\times n\) matrices whose rows form a basis of \(C\); they are all "equivalent" to the same unique matrix \(G=[I_k | A]\), which we say is in standard form. They are used for obtaining the codeword \(c\) of length \(n\) corresponding to a message \(m\) of length \(k\) by adding \(n-k\) redundancy digits to it via \(c=mG\).
Parity check matrices are \((n-k) \times n\) matrices with the property that for any codeword \(c\in C, \ Hc^T=0_{n-k \times 1}\) or equivalently \(cH^T=0_{1 \times n-k}\). This provides an easy way to detect errors in received messages, especially since \(H=[-A^T|I_{n-k}]\) is such a matrix. The use of parity check matrices in decoding will be seen at a later stage.
In the process we briefly reviewed all relevant concepts from linear algebra as well as finite fields.
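As an illustration (not from the lecture), encoding with a generator matrix in standard form and verifying \(Hc^T=0\) over \(\mathbb F_2\); the \([6,3]\)-code below is made up for the example.

```python
def encode(m, G):
    # c = mG over F_2
    return [sum(mi * gij for mi, gij in zip(m, col)) % 2 for col in zip(*G)]

def syndrome(c, H):
    # H c^T over F_2
    return [sum(hij * cj for hij, cj in zip(row, c)) % 2 for row in H]

G = [[1, 0, 0, 1, 1, 0],    # G = [I_3 | A]
     [0, 1, 0, 0, 1, 1],
     [0, 0, 1, 1, 0, 1]]
H = [[1, 0, 1, 1, 0, 0],    # H = [-A^T | I_3]; over F_2 the sign vanishes
     [1, 1, 0, 0, 1, 0],
     [0, 1, 1, 0, 0, 1]]

c = encode([1, 0, 1], G)
print(c, syndrome(c, H))    # the syndrome is [0, 0, 0] for every codeword
```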

12/01: We discussed what the course is about, namely reliable communication over noisy channels. In short, when messages are transmitted, they might get altered by errors introduced due to noise in the communication channels. Coding Theory is about techniques that allow the receiver to remove these errors and recover the message originally sent. This is usually done by adding redundancy to each message, sufficient for its reconstruction at the receiver's end.

2012-04-25, petrides