Sparse signal representation using overlapping frames.

Karl Skretting, Stavanger University College.

Most of the content on this page is obsolete now.
An updated page is the dictionary learning page,
where newer sparse approximation (matching pursuit) implementations are available.

Introduction

This page contains Matlab functions for frame design and sparse signal representation. During the work on my thesis a lot of Matlab files were generated. Many of these functions will be helpful for people (students) who want to continue the work on sparse signal representation and frame design, and for people working in closely related areas; some of the files may even be of more general interest. To help you decide whether some of these functions may be useful for your work, there is a brief description just below the list of files.

Many of the functions are rather complicated, as they are closely connected to the rather theoretical work in my thesis. To understand these functions, at least some understanding of the work in my thesis is necessary; the thesis is also included in the file list below.

An overview of the matching pursuit algorithms for vector selection is given in an article presented at the NORSIG-2003 conference in Bergen, Norway. The vector selection part of this document contains the algorithms both as Matlab functions and as an exe-file; using the exe-file, execution is approximately ten times faster.

Files

Matching Pursuit algorithms implemented in Java

In the spring and summer of 2007 I made some Java classes for Matching Pursuit; they are collected in the mp package. The main class is MatchingPursuit, which contains the different variants of the Matching Pursuit algorithm. The dictionary, or frame, should be given as a matrix, represented by an instance of the MPDictionary abstract class. Five different implementations of this abstract class are provided, but I think MPFlexMatrix will be appropriate for most dictionaries, i.e. matrices. The classes are:
MatchingPursuit.class with documentation
MPDictionary.class with documentation
MPSimpleMatrix.class with documentation
MPMatrix.class with documentation
MPSparseMatrix.class with documentation
MPBandMatrix.class with documentation
MPFlexMatrix.class with documentation

I have also made two Java classes that implement several matching pursuit algorithms. This work was finished in 2006, but most of it was done earlier. The classes are:
BlockFrame.class, 15.9 kB, with documentation
OverlappingFrame.class, 13.6 kB, with documentation
I also made a Matlab m-file, test02.m, 10.4 kB, to test these functions in Matlab; this file is in Norwegian. The file is an example of how to use Java classes in Matlab m-files. To make Java classes available to Matlab, see the Matlab documentation on Calling Java from MATLAB; typically you need to edit the Matlab file classpath.txt.
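
A minimal sketch of how the classes may be used from Matlab is given below. The directory name and the constructor calls are assumptions of mine; check the class documentation and test02.m for the actual interface.

  % Make the compiled classes visible to Matlab, either by adding the directory
  % to classpath.txt or dynamically (the directory name is just an example):
  javaaddpath('c:/Java/classes');
  F = randn(16, 40);                    % the dictionary (frame) as a Matlab matrix
  % The lines below only show the assumed pattern; the actual constructor and
  % method names should be taken from the class documentation:
  % dict  = mp.MPFlexMatrix(F);         % wrap the matrix in an MPDictionary subclass
  % mpObj = mp.MatchingPursuit(dict);   % the main Matching Pursuit class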

Description of the Matlab functions

The Matlab functions can be grouped into several categories. For each function a more detailed description is available in its help text.

Help text, information and examples

These functions are meant to help you understand and use the other functions.

Pure help text functions

Information on frames and frame properties

A frame is stored in a mat-file. These functions display information and properties for the frames.
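
As an illustration, a few lines of Matlab like the ones below display some general frame properties; the file name and the assumption that the frame matrix is stored as a variable F are mine, the actual mat-file format is described in the help texts.

  load('myframe.mat');              % hypothetical file, assumed to contain the frame matrix F
  [N, K] = size(F);                 % vector length N and number of frame vectors K
  s = svd(F);                       % singular values of the (synthesis) frame matrix
  fprintf('Frame is %i x %i, frame bounds A = %.4f and B = %.4f.\n', ...
          N, K, s(end)^2, s(1)^2);  % lower and upper frame bounds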

Example functions

These functions give examples of how to use other functions to do a specific task.

Signal representation

The (sparse) signal representations done by these functions are different kinds of signal expansions.

The main functions are

The other functions are

Frame design

The main functions are

Functions for initializing the frame

The other functions are
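
The design method iterates two main steps, described in the thesis and summarized in the abstract at the bottom of this page: (1) sparse coding (vector selection) of a set of training vectors using the current frame, and (2) an update of the frame for the fixed weights. The block-oriented sketch below is only an illustration of this loop; it uses a plain least-squares update of the frame, and the ompSketch function from the vector selection section further down as a stand-in for step (1). The actual design functions also handle overlapping frames and constraints.

  X = randn(16, 4000);                          % training vectors
  F = randn(16, 40);                            % initial frame
  F = F ./ (ones(16,1) * sqrt(sum(F.*F)));      % normalize the frame vectors
  for it = 1:50
      W = zeros(40, 4000);
      for l = 1:4000                            % step (1): vector selection
          W(:, l) = ompSketch(F, X(:, l), 4);
      end
      F = (X * W') / (W * W');                  % step (2): least-squares frame update
      F = F ./ (ones(16,1) * sqrt(sum(F.*F)));  % re-normalize (assumes W*W' is nonsingular)
  end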

Vector selection, Matching Pursuit algorithms

Downloads of the Matlab-files are available at the top of this page.

I recommend that you read the overview of the matching pursuit algorithms for vector selection that is given in the article presented at the NORSIG-2003 conference in Bergen, Norway. The abstract is:

In this paper a new algorithm for vector selection in signal representation problems is proposed; we call it Partial Search (PS). The vector selection problem is described, and one group of algorithms for solving this problem, the Matching Pursuit (MP) algorithms, is reviewed. The proposed algorithm is based on the Order Recursive Matching Pursuit (ORMP) algorithm; it extends ORMP by searching a larger part of the solution space in an effective way. The PS algorithm tests up to a given number of possible solutions and returns the best, while ORMP is a greedy algorithm testing and returning only one possible solution. A detailed description of PS is given. In the end some examples of its performance are given, showing that PS performs very well: the representation error is considerably reduced and the probability of finding the optimal solution is increased compared to ORMP, even when only a small part of the solution space is searched.
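
As a concrete illustration of the greedy selection idea, a minimal OMP-style sketch in Matlab is given below; ORMP additionally accounts for the part of each candidate vector already spanned by the selected vectors, and PS repeats such a search over several candidate choices. This sketch assumes the frame vectors have unit norm and is not one of the functions from this page.

  function w = ompSketch(F, x, S)
  % ompSketch   Minimal OMP sketch: select S frame vectors (columns of F,
  %             assumed to have unit norm) and return the weight vector w.
  [N, K] = size(F);
  w = zeros(K, 1);
  r = x;                                % residual
  idx = [];                             % indices of the selected frame vectors
  for s = 1:S
      [cmax, k] = max(abs(F' * r));     % vector most correlated with the residual
      idx = [idx, k];
      a = F(:, idx) \ x;                % least-squares weights over the selected set
      r = x - F(:, idx) * a;            % new residual, orthogonal to the selected vectors
  end
  w(idx) = a;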

The vector selection functions presented here can be grouped into two categories:

General vector selection methods

These are the most general vector selection functions; most of them are variants of the matching pursuit algorithms. The functions take as input arguments a signal vector, the number of vectors to select and (for some of them) the set of vectors to select from (the frame, or dictionary), and they return the weight vector. Note that for most of these functions the frame (dictionary) must be given as a global variable, F, and sometimes the inner products of the frame vectors with each other must also be given as a global variable, FF. This is done to make the functions faster when used many times with the same frame, but now it is usually even faster to use the compiled program, VSblockC, for this case. The functions are:
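
A sketch of the setup these functions expect is given below; the exact argument lists differ between the functions, and the call shown is only an assumed pattern, so check the help text of each function.

  global F FF
  F  = randn(16, 40);                        % the frame (dictionary)
  F  = F ./ (ones(16,1) * sqrt(sum(F.*F)));  % normalize the frame vectors
  FF = F' * F;                               % inner products of the frame vectors
  x  = randn(16, 1);                         % the signal vector
  S  = 4;                                    % number of frame vectors to select
  % w = VSormp(x, S);                        % assumed call pattern, see "help VSormp"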

Special vector selection functions

These are the functions used for sparse signal representation and for frame design. The input is usually a large set of vectors, and the output is a (sparse) matrix of weights. The functions are:
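
Conceptually they do something like the following sketch, here with the ompSketch function shown above as a stand-in for the actual selection routine:

  X = randn(16, 1000);                  % a (large) set of signal vectors
  W = sparse(40, 1000);                 % the weights are stored in a sparse matrix
  for l = 1:size(X, 2)
      W(:, l) = ompSketch(F, X(:, l), 4);
  end
  R = X - F * W;                        % the error signal (F as in the sketch above)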

Test of the vector selection functions

I have also made two m-files to test and demonstrate the functions above.

Here we present some results from the TestVSblockC function. We have done the test on two systems: a new fast PC with a 2.4 GHz Pentium 4 CPU running Matlab version 6.5, and an old PC with a 120 MHz Pentium CPU running Matlab version 5.0. We note from the times of VSblockC that the fast PC is approximately 37 times faster than the slow PC. The tables are divided vertically into two halves: on the left side are the results using VSblockC (the fast C program), and on the right side are the results using the Matlab function VSblock. The tables are also grouped horizontally, with one group for each of the main vector selection algorithms; the number of variants within each group differs between VSblockC and VSblock.

The test vectors generated for this test are simply random values from the Matlab randn function, and so are the frame vectors. On the fast PC we use L=10000 test vectors, each of length N=16, selecting S=9 frame vectors from a frame of K=40 vectors. On the slow PC we use only one tenth as many test vectors, L=1000.
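
The data generation corresponds to a few lines like these (a sketch of the setup only, not the actual TestVSblockC script; the normalization of the frame vectors is an assumption):

  N = 16; K = 40; S = 9; L = 10000;          % L = 1000 on the slow PC
  F = randn(N, K);                           % random frame vectors
  F = F ./ (ones(N,1) * sqrt(sum(F.*F)));    % normalization (assumed)
  X = randn(N, L);                           % random test vectors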

Results on the fast PC and Matlab 6.5
                  VSblockC (C program)      |   VSblock (Matlab functions)
                  error (1)    time (s)     |   time (s)    error (1)   function
BMP, A1           0.0743       0.84         |
BMP, A2           0.0448       1.19         |   12.33       0.0469      VSmp2
BMP, A3           0.0295       1.83         |
OMP, A4           0.0240       0.81         |   11.61       0.0240      VSomp
                                            |   11.7        0.0188      VSormp
ORMP, A5          0.0189       0.80         |   60.2        0.0188      VSormp2
                                            |   17.5        0.0189      VSfomp2
PS, A6 M=5        0.0097       3.91         |   959         0.0069      VSps (2)
PS, A6 M=20       0.0063       13.6         |   235         0.0065      VSps2 (3)
PS, A6 M=50       0.0051       34.0         |   544         0.0051      VSps2 (4)
PS, A6 M=200      0.0039       126          |
PS, A6 M=500      0.0034       311          |


Results on slow PC and Matlab 5.0
                  VSblockC (C program)      |   VSblock (Matlab functions)
                  error (1)    time (s)     |   time (s)    error (1)   function
BMP, A1           0.0744       2.97         |
BMP, A2           0.0444       3.62         |   54          0.0465      VSmp2
BMP, A3           0.0294       5.77         |
OMP, A4           0.0244       2.69         |   807         0.0244      VSomp
                                            |   818         0.0198      VSormp
ORMP, A5          0.0198       2.69         |   249         0.0198      VSormp2
                                            |   59          0.0198      VSfomp2
PS, A6 M=5        0.0105       14.0         |   3919        0.0072      VSps (2)
PS, A6 M=20       0.0067       49.4         |   13488       0.0068      VSps2 (3)
PS, A6 M=50       0.0053       126          |   31368       0.0054      VSps2 (4)
PS, A6 M=200      0.0041       458          |
PS, A6 M=500      0.0035       1160         |

(1) The error measure is the average, taken over the N*L entries in the error signal, of the squared error (energy). Since the signal samples are random Gaussian distributed values with standard deviation one, the expected average energy of the signal samples is 1.

(2) VSps is the recursive variant of partial search; here we have used the P argument as [5,4], giving 20 different combinations to search. This function is not called from VSblock.

(3) VSps2 is here used via the VSblock function, where M is always 20.

(4) VSps2 with M=50 is run without using VSblock.
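
For reference, the error measure in note (1) can be computed with a couple of lines like these, with F the frame, X the matrix of test vectors and W the matrix of selected weights (as in the sketches above):

  R   = X - F * W;                     % the error signal
  err = sum(sum(R .^ 2)) / (N * L);    % average squared error per entry, compare with 1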


Abstract of my Ph.D. thesis.

Signal expansions using frames may be considered as generalizations of signal representations based on transforms and filter banks. Frames for sparse signal representation may be designed using an iterative method with two main steps: (1) frame vector selection and expansion coefficient determination, using the frame designed in the previous iteration, for the signals in a training set selected to be representative of the signals for which compact representations are desired; (2) update of the frame vectors with the objective of improving the representation of step (1).

In this thesis we solve step (2) of the general frame design problem using the compact notation of linear algebra. This makes the solution both conceptually and computationally simple, especially for the non-block-oriented frames, called overlapping frames for short, which may be viewed as generalizations of critically sampled filter banks. Also, the solution is more general than those presented earlier, facilitating the imposition of constraints, such as symmetry, on the designed frame vectors. We also take a closer look at step (1) of the design method. Some of the available vector selection algorithms are reviewed, and adaptations to some of them are given. These adaptations make the algorithms better suited both for the frame design method and for the sparse signal representation problem, for block-oriented as well as overlapping frames.

The performance of the improved frame design method is demonstrated in extensive experiments. The sparse representation capabilities are illustrated both for one-dimensional and two-dimensional signals, and in both cases the new possibilities in frame design give better results.

Also, a new method for texture classification, denoted the Frame Texture Classification Method (FTCM), is presented. The main idea is that a frame trained to give sparse representations of a certain class of signals is a model for that signal class. The FTCM is applied to nine test images, yielding excellent overall performance; for many of the test images the number of wrongly classified pixels is more than halved compared to state-of-the-art texture classification methods.

Finally, frames are analyzed from a practical viewpoint rather than from a mathematical, theoretical perspective. As a result, some new frame properties are suggested. So far the new insight this has given has been moderate, but we think this approach may be useful in frame analysis in the future.


Visit my home page.

Last update: April 7, 2011. Karl Skretting