The pycodes package provides various extensions to Python which are useful in developing and analyzing error correcting codes and data compression codes (especially low density parity check codes). The code was originally developed by Emin Martinian; questions and comments should be directed to him via emin@alum.mit.edu or emin63@alum.berkeley.edu. For copying, modification, and distribution information, see the LICENSE.
The following modules are provided:
pyLDPC (C extensions for creating, encoding, and decoding LDPC and LDGM codes)
utils (utilities for creating, visualizing, and analyzing low density codes)
tests (regression and performance tests)
For example, to run the small_BEC_perf test from the command line, you can do

$ echo from pycodes.tests.quantization.BEC import small_BEC_perf | python
Similarly, you can run other tests in the directories pycodes/tests/quantization/BEC, pycodes/tests/channel_coding/BSC, etc. by using a command like the one above.
You will need the following installed on your system to use all of the pycodes features: python, a C compiler (to build the extensions as described below), and the GNU Linear Programming Kit (needed only for the LP utilities in FormLP).
The following procedure is required to use the main iterative coding and decoding routines. If you just want to use things in the utils directory, you do not need to compile anything.
Edit the file setup.py to set installation options (e.g., whether to compile in debug or optimized mode). Then, in this directory, execute the command
$ python setup.py install
to compile the required python extensions and install them globally.
After the installation completes, you can use pycodes just like
any other python module. For example, to run the regression tests
you could start python and do
>>> import pycodes, pycodes.tests
>>> pycodes.tests.DoRegressionTests()
If you don't want to globally install pycodes, you can just compile
it in place and use it as a local python module. To do this,
uncomment the lines for [build_ext] in setup.cfg and do
$ python setup.py build
to compile everything. Make sure that the directory containing the pycodes module is named pycodes and not something like pycodes-1.0. To use pycodes, start python in the directory containing the pycodes directory. For example, to run the regression tests you could do
$ python
>>> import pycodes, pycodes.tests
>>> pycodes.tests.DoRegressionTests()
The pyLDPC package provides low density parity check (LDPC) codes and their duals, which are low density generator matrix (LDGM) codes. To create a new LDPC or LDGM code, use the LDPCCode or DualLDPCCode command. The setevidence command sets the channel evidence and specifies the decoding algorithm to use, the decode command runs an iteration of decoding, and the getbeliefs command returns the computed beliefs.
The following sequence of python commands creates the dual of a Hamming code, sets the evidence to correspond to the bits [0,1,?,1,?,?,?], decodes by finding a codeword to match the unerased bits, and prints out the result:
>>> from pycodes.pyLDPC import DualLDPCCode
>>> code = DualLDPCCode(7,4,12,[[0,1,2,4],[1,2,3,5],[2,3,4,6]])
>>> code.setevidence(ev=[1,-1,0,-1,0,0,0],alg='BECQuant')
>>> code.decode()
>>> beliefs = code.getbeliefs()
>>> print beliefs[0:3] # first K beliefs are for hidden vars
[1.0, -1.0, 1.0]
Note that the beliefs are in something like log-likelihood-ratio
format. To map the beliefs into zeros and ones you can do
>>> comp = map(lambda x: x < 0,beliefs[0:3])
>>> print comp
[0, 1, 0]
In order to see what codeword is produced from the compressed
result, you can use the EncodeFromLinkArray utility as follows:
>>> from pycodes.utils.encoders import EncodeFromLinkArray
>>> r = EncodeFromLinkArray(comp,7,[[0,1,2,4],[1,2,3,5],[2,3,4,6]])
>>> print r
[0, 1, 1, 1, 0, 1, 0]
Note that the first, second, and fourth positions of the codeword match the source.
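Continuing the session, you can check those positions directly (plain list indexing, not part of the pycodes API):

>>> print map(lambda i: r[i], [0, 1, 3]) # the unerased source bits were 0, 1, 1
[0, 1, 1]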
The following sequence of python commands creates the dual of a
length 3000 (3,6) Gallager code, sets the evidence to be a random
binary word, quantizes the result using the AccBitFlip algorithm,
and counts the percentage of distorted bits:
>>> from pycodes.pyLDPC import DualLDPCCode
>>> from pycodes.utils.encoders import EncodeFromLinkArray
>>> from pycodes.utils.CodeMaker import make_H_gallager
>>> from pycodes.utils.channels import GetRandomBinaryCodeword
>>> L = make_H_gallager(3000,3,6) # make the code
>>> E = reduce(lambda x,y:x+y,map(lambda z:len(z),L)) # count edges in code
>>> code = DualLDPCCode(3000,1500,E,L)
>>> source = GetRandomBinaryCodeword(3000)
>>> ev = map(lambda b: 1 - 2*b,source) # map bits to log-likelihoods
>>> code.setevidence(ev,alg='AccBitFlip')
>>> for iteration in range(25):
...     code.decode()
...
>>> beliefs = code.getbeliefs()[0:1500]
>>> comp = map(lambda x: x < 0,beliefs) # map log-likelihoods to bits
>>> result = EncodeFromLinkArray(comp,3000,L)
>>> diffs = reduce(lambda x,y:x+y,map(lambda r,s: r!=s,result,source))
>>> print (diffs/3000.0 > 0.14 and diffs/3000.0 < .17)
1
>>> # the above ensures that the number of diffs is reasonable.
The following example uses the LDPCCode class to decode a Hamming codeword with two erasures via sum-product belief propagation:

>>> from pycodes.pyLDPC import LDPCCode
>>> code = LDPCCode(7,4,12,[[0,1,2,4],[1,2,3,5],[2,3,4,6]])
>>> # Set the channel evidence to the all-zeros codeword with 2 erasures
>>> code.setevidence(ev=[1,1,0,0,1,1,1],alg='SumProductBP')
>>> for iteration in range(25):
...     code.decode()
...
>>> beliefs = code.getbeliefs()
>>> codeword = map(lambda x: x > 0.5,beliefs)
>>> print codeword
[0, 0, 0, 0, 0, 0, 0]
The next example decodes a length 3000 (3,6) Gallager code after the all-zeros codeword is sent through a binary symmetric channel with crossover probability 0.05:

>>> from pycodes.pyLDPC import LDPCCode
>>> from pycodes.utils.CodeMaker import make_H_gallager
>>> from pycodes.utils.channels import BSC
>>> L = make_H_gallager(3000,3,6) # make the code
>>> E = reduce(lambda x,y:x+y,map(lambda z:len(z),L)) # count edges in code
>>> code = LDPCCode(3000,1500,E,L)
>>> # Set the channel evidence to the all-zeros codeword through a BSC
>>> ev = BSC([0]*3000,0.05)
>>> code.setevidence(ev,alg='SumProductBP')
>>> for iteration in range(25):
...     code.decode()
...
>>> beliefs = code.getbeliefs()
>>> result = map(lambda x: x > 0.5,beliefs)
>>> print 'num decoding errors = ' + `reduce(lambda a,b:a+b,result)`
num decoding errors = 0
You can easily extend the pyLDPC package by adding additional decoding algorithms for channel coding or quantization. To add a new decoding algorithm for the LDPCCode or DualLDPCCode object, you need to do the following:
1. Implement the required functions for your algorithm and package them into a CodeGraphAlgorithm or DualCodeGraphAlgorithm data structure.
2. Add your algorithm to the CodeGraphAlgorithms or DualCodeGraphAlgorithms data structure in c_src/pyLDPC/CodeGraphAlgorithms.c or c_src/pyLDPC/DualCodeGraphAlgorithms.c.
3. Add the name of your algorithm to the CodeGraphAlgorithmNames or DualCodeGraphAlgorithmNames array in c_src/pyLDPC/CodeGraphAlgorithms.c or c_src/pyLDPC/DualCodeGraphAlgorithms.c.
The CodeGraphAlgorithm and DualCodeGraphAlgorithm data structures defined in c_src/pyLDPC/CodeGraphAlgorithms.h and c_src/pyLDPC/DualCodeGraphAlgorithms.h have a field for the algorithm name, a clientData field for storing data required by the algorithm, and fields for functions to set evidence, get beliefs, initialize, deallocate the algorithm, do an iteration of decoding, and possibly other actions as well. Once you provide these functions and add your algorithm and its name to the appropriate arrays, you can access it through python just like the existing algorithms. For example, if you created a new algorithm called my_decode, you could use it by replacing the code.setevidence command in the examples above with
>>> code.setevidence(ev,alg='my_decode')
The utils package provides various utilities for creating, visualizing, and analyzing low density codes. You can use most of these features even without compiling the C code in the c_src directory.
The CodeMaker package contains functions to create regular and irregular Gallager codes. For example, to create a (3,6) Gallager code of block length 30 and dimension 15, you could do
>>> from pycodes.utils.CodeMaker import make_H_gallager
>>> regL = make_H_gallager(30,3,6)
To create an irregular Gallager code with 4 variables of degree 1,
4 variables of degree 2, 2 variables of degree 3, 3 checks of degree 2,
and 4 checks of degree 3, you could do
>>> from pycodes.utils.CodeMaker import MakeIrregularLDPCCode
>>> iregL = MakeIrregularLDPCCode(10,3,{1:4,2:4,3:2},{2:3,3:4})
To create an irregular Gallager code with degree sequences
lambda(x) = 0.33241 x^2 + 0.24632 x^3 + 0.11014 x^4 + 0.31112 x^6
rho(x) = 0.76611 x^6 + 0.23389 x^7
you could do
>>> from pycodes.utils.CodeMaker import MakeIrregularLDPCCodeFromLambdaRho
>>> iregL = MakeIrregularLDPCCodeFromLambdaRho(30,15,{2:0.33241, 3:.24632, 4:.11014, 6:0.31112},{6:.76611, 7:.23389})
Finally, note that although it is possible to make regular Gallager codes using the irregular code functions, YOU SHOULD NOT DO THAT. The irregular code functions only give you approximately the degree sequence you request due to issues with randomly adding edges and removing redundant edges.
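If you want to see how far a generated code deviates from a requested degree sequence, you can inspect the returned link array directly. The sketch below assumes MakeIrregularLDPCCode returns the same link-array representation as make_H_gallager (one row of variable indices per check):

>>> from pycodes.utils.CodeMaker import MakeIrregularLDPCCode
>>> iregL = MakeIrregularLDPCCode(10,3,{1:4,2:4,3:2},{2:3,3:4})
>>> print map(len, iregL) # realized check degrees; compare to the request {2:3,3:4}

The exact output varies with the random edge placement, which is precisely the deviation the warning above is about.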
This module provides the ECCLPFormer and QuantLPFormer classes to decode a low density parity check (LDPC) error correcting code or quantize via the dual of an LDPC code using a linear programming relaxation. The basic idea is that you instantiate an ECCLPFormer or QuantLPFormer and give it the code parameters. Then you tell it to form an LP and either solve it or print it out.
Before using this module you need to download and install the freely available GNU Linear Programming Kit (GLPK). If you install the glpsol executable from GLPK in a location that is not on the path seen by python, set the default for the LPSolver variable in the __init__ method of the LPFormer base class to point to the glpsol executable.
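If the default is stored as an instance attribute named LPSolver (an assumption worth checking against the LPFormer definition in FormLP.py), you may also be able to redirect a single instance without editing the source:

>>> from FormLP import *
>>> r = ECCLPFormer(4,1,[[0,1],[1,2],[2,3]])
>>> # Hypothetical: point this instance at a glpsol outside the default path.
>>> r.LPSolver = '/opt/glpk/bin/glpsol'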
Some examples of how to use the ECCLPFormer and QuantLPFormer are shown below.
The following is a simple example of how to use the ECCLPFormer class
for a parity check matrix representing the code shown below.
y0    y1    y2    y3
=     =     =     =
   \  |\   /|  /
    \ | \ / | /
     \|  \/ |/
      +   +  +
The only two codewords for this code are 0000 and 1111.
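Since this is a small code, you can confirm that claim by brute force with plain python (a quick sketch, independent of pycodes):

>>> checks = [[0,1],[1,2],[2,3]]
>>> words = map(lambda n: [(n>>3)&1,(n>>2)&1,(n>>1)&1,n&1], range(16))
>>> def isCW(w): # a word is a codeword if every check has even parity
...     return reduce(lambda ok,c: ok and sum(map(lambda i: w[i], c)) % 2 == 0, checks, True)
...
>>> print filter(isCW, words)
[[0, 0, 0, 0], [1, 1, 1, 1]]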
First we create the ECCLPFormer instance, then we use it to solve for the optimal y0,y1,y2,y3 given the received data [1,0,1,1]. This received data corresponds to sending 1111 and getting an error on the second bit. The LP decoder correctly decodes to the answer y0,y1,y2,y3 = 1,1,1,1.
>>> from FormLP import *
>>> r = ECCLPFormer(4,1,[[0,1],[1,2],[2,3]])
>>> r.FormLP([1,0,1,1])
>>> (v,s,o) = r.SolveLP()
>>> print v
[1.0, 1.0, 1.0, 1.0]
Next we do LP decoding for a medium size Gallager code assuming
that the all zeros codeword was transmitted. Feldman, Karger,
and Wainwright argue that analyzing things assuming the all-0
codeword was sent over a binary symmetric channel is valid provided
the LP satisfies certain conditions (see their 2003 CISS paper
for more details). IMPORTANT: the all-0 assumption works for
analyzing things sent over a BSC but *NOT* over an erasure channel.
The following test takes about a minute to run on a Mac G3.
>>> N = 1200
>>> K = 600
>>> numErrors = 90 # error rate of 7.5%
>>> from FormLP import *
>>> from CodeMaker import *
>>> from random import *
>>> regL = make_H_gallager(N,3,6)
>>> origSource = [0]*N
>>> recSource = list(origSource)
>>> i = 0
>>> while (i < numErrors):
...     index = randrange(N)
...     if (0 == recSource[index]):
...         recSource[index] = 1
...         i = i+1
...
>>> r = ECCLPFormer(N,K,regL)
>>> r.FormLP(recSource)
>>> (v,s,o) = r.SolveLP()
>>> errors = map(lambda x,y: int(x) != int(y), origSource,v)
>>> print 'num errors = ', errors.count(1)
num errors = 0
The following is a simple example of how to use the QuantLPFormer class
for a generator matrix representing the code shown below.
y0    y1    y2    y3
+     +     +     +
   \  |\   /|  /
    \ | \ / | /
     \|  \/ |/
     x0   x1 x2
First we create the QuantLPFormer instance, then we use it to solve for the optimal x0,x1,x2 when y0,y1,y2,y3 = [1,0,0,1]. The answer turns out to be x0,x1,x2 = 1,1,1.
>>> from FormLP import *
>>> r = QuantLPFormer(4,3,[[0],[0,1],[1,2],[2]])
>>> r.FormLP([1,0,0,1])
>>> (v,s,o) = r.SolveLP()
>>> print v
[1.0, 1.0, 1.0]
In the following example we take the dual of a (7,4) Hamming code using the built-in function TransposeCodeMatrix and then use that as the generator matrix for quantization. In this example we quantize the sequence 0,*,*,*,*,*,1 where the *'s represent don't cares which can be reconstructed to either 0 or 1.
>>> r = QuantLPFormer(7,3,TransposeCodeMatrix(7,4,[[0,1,2,4],[1,2,3,5],[2,3,4,6]]))
>>> r.FormLP([0,.5,.5,.5,.5,.5,1])
>>> (v,s,o) = r.SolveLP()
>>> print v
[0.0, 0.0, 1.0]
Next we iteratively solve a quantization LP for a medium size code.
This does not seem to work all that well, but none of the other LP
relaxations does much better at quantization either.
>>> N = 300
>>> K = 150
>>> numErase = 180
>>> numIter = 1000
>>> from FormLP import *
>>> from CodeMaker import *
>>> from random import *
>>> regL = make_H_gallager(N,3,6)
>>> source = map(lambda x: round(random()),range(N))
>>> for i in range(numErase):
...     source[randrange(N)] = 0.5
...
>>> r = QuantLPFormer(N,K,TransposeCodeMatrix(N,K,regL))
>>> r.FormLP(source)
>>> (v,s,o) = r.IterSolveLP(numIter,verbose=0)
>>> from encoders import *
>>> recon = EncodeFromLinkArray(map(lambda x: int(x),v),N,regL)
>>> diffs = map(lambda x,y: x != 0.5 and x != y, source,recon)
>>> print 'num flips = ', diffs.count(1)
num flips = 25
>>> if (25 != diffs.count(1)):
...     print 'failure may be due to diffs w/glpsol on different platforms'
...
The following example illustrates what can go wrong with the LP relaxation in doing quantization. Choosing v = [1.0,1.0,0.0] would reconstruct the source perfectly in all the unerased positions (places where the source is not 0.5). But the LP relaxation produces the vector [1.0/3.0, 1.0/3.0, 1.0/3.0]. First, this 'solution' is not even binary, and second, even rounding the bits would not give the right answer.
>>> from FormLP import *
>>> from CodeMaker import *
>>> from random import *
>>> hammingCode = [[0,1,2,4],[1,2,3,5],[2,3,4,6]]
>>> source = [0.5, 0.0, 0.0, 0.5, 1.0, 0.5, 0.5]
>>> r = QuantLPFormer(7,3,TransposeCodeMatrix(7,4,hammingCode))
>>> r.FormErasureLP(source)
>>> (v,s,o) = r.SolveLP()
>>> from encoders import *
>>> recon = EncodeFromLinkArray(map(lambda x: int(x),v),7,hammingCode)
>>> diffs = map(lambda x,y: x != 0.5 and x != y, source,recon)
>>> print 'num flips = ', diffs.count(1)
num flips = 1
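Continuing the session, you can verify the claim above that the integral choice v = [1,1,0] would have matched every unerased position:

>>> recon = EncodeFromLinkArray([1,1,0],7,hammingCode)
>>> diffs = map(lambda x,y: x != 0.5 and x != y, source,recon)
>>> print 'num flips = ', diffs.count(1)
num flips = 0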
The visualize package contains routines to draw a low density parity
check code graph or the dual of a low density parity check code graph.
The main functions to call are:
VisualizeCodeGraph
VisualizeDualCodeGraph
For example,
>>> VisualizeCodeGraph(7,4,[[0,1,2,4],[1,2,3,5],[2,3,4,6]])
will display the graph for a Hamming code, while
>>> VisualizeDualCodeGraph(7,4,[[0,1,2,4],[1,2,3,5],[2,3,4,6]])
will display the graph for its dual.
Copyright 2003 Mitsubishi Electric Research Laboratories All Rights Reserved. Permission to use, copy and modify this software and its documentation without fee for educational, research and non-profit purposes, is hereby granted, provided that the above copyright notice and the following three paragraphs appear in all copies.
To request permission to incorporate this software into commercial products contact: Vice President of Marketing and Business Development; Mitsubishi Electric Research Laboratories (MERL), 201 Broadway, Cambridge, MA 02139 or license@merl.com.
IN NO EVENT SHALL MERL BE LIABLE TO ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR CONSEQUENTIAL DAMAGES, INCLUDING LOST PROFITS, ARISING OUT OF THE USE OF THIS SOFTWARE AND ITS DOCUMENTATION, EVEN IF MERL HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGES.
MERL SPECIFICALLY DISCLAIMS ANY WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" BASIS, AND MERL HAS NO OBLIGATIONS TO PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS OR MODIFICATIONS.