# Mathematical Methods and Algorithms for Signal Processing

Mathematical Methods and Algorithms for Signal Processing
Todd K. Moon
Utah State University

Wynn C. Stirling
Brigham Young University

PRENTICE HALL

This book previously included a CD. The CD contents can now be accessed at www.prenhall.com/moon. Thank you.

Contents

I Introduction and Foundations

1 Introduction and Foundations
1.1 What is signal processing?
1.2 Mathematical topics embraced by signal processing
1.3 Mathematical models
1.4 Models for linear systems and signals
1.4.1 Linear discrete-time models
1.4.2 Stochastic MA and AR models
1.4.3 Continuous-time notation
1.4.4 Issues and applications
1.4.5 Identification of the modes
1.4.6 Control of the modes
1.5 filtering
1.5.1 System identification
1.5.2 Inverse system identification
1.5.4 Interference cancellation
1.6 Gaussian random variables and random processes
1.6.1 Conditional Gaussian densities
1.7 Markov and Hidden Markov Models
1.7.1 Markov models
1.7.2 Hidden Markov models
1.8 Some aspects of proofs
1.8.1 Proof by computation: direct proof
1.8.3 Proof by induction
1.9 An application: LFSRs and Massey's algorithm
1.9.1 Issues and applications of LFSRs
1.9.2 Massey's algorithm
1.9.3 Characterization of LFSR length in Massey's algorithm
1.10 Exercises
1.11 References

II Vector Spaces and Linear Algebra

2 Signal Spaces
2.1 Metric spaces
2.1.1 Some topological terms
2.1.2 Sequences, Cauchy sequences, and completeness
2.1.3 Technicalities associated with the $L_p$ and $L_\infty$ spaces
2.2 Vector spaces
2.2.1 Linear combinations of vectors
2.2.2 Linear independence
2.2.3 Basis and dimension
2.2.4 Finite-dimensional vector spaces and matrix notation
2.3 Norms and normed vector spaces
2.3.1 Finite-dimensional normed linear spaces
2.4 Inner products and inner-product spaces
2.4.1 Weak convergence
2.5 Induced norms
2.6 The Cauchy-Schwarz inequality
2.7 Direction of vectors: Orthogonality
2.8 Weighted inner products
2.8.1 Expectation as an inner product
2.9 Hilbert and Banach spaces
2.10 Orthogonal subspaces
2.11 Linear transformations: Range and nullspace
2.12 Inner-sum and direct-sum spaces
2.13 Projections and orthogonal projections
2.13.1 Projection matrices
2.14 The projection theorem
2.15 Orthogonalization of vectors
2.16 Some final technicalities for infinite dimensional spaces
2.17 Exercises
2.18 References

3 Representation and Approximation in Vector Spaces
3.1 The approximation problem in Hilbert space
3.1.1 The Grammian matrix
3.2 The orthogonality principle
3.2.1 Representations in infinite-dimensional space
3.3
3.4 Matrix representations of least-squares problems
3.4.1 Weighted least-squares
3.4.2 Statistical properties of the least-squares estimate
3.5 Minimum error in Hilbert-space approximations

Applications of the orthogonality theorem
3.6 Approximation by continuous polynomials
3.7 Approximation by discrete polynomials
3.8 Linear regression
3.9 Least-squares filtering
3.9.1 Least-squares prediction and AR spectrum estimation
3.10 Minimum mean-square estimation
3.11 Minimum mean-squared error (MMSE) filtering
3.12 Comparison of least squares and minimum mean squares
3.13 Frequency-domain optimal filtering
3.13.1 Brief review of stochastic processes and Laplace transforms
3.13.2 Two-sided Laplace transforms and their decompositions
3.13.3 The Wiener-Hopf equation
3.13.4 Solution to the Wiener-Hopf equation
3.13.5 Examples of Wiener filtering
3.13.6 Mean-square error
3.13.7 Discrete-time Wiener filters
3.14 A dual approximation problem
3.15 Minimum-norm solution of underdetermined equations
3.16 Iterative Reweighted LS (IRLS) for $L_p$ optimization
3.17 Signal transformation and generalized Fourier series
3.18 Sets of complete orthogonal functions
3.18.1 Trigonometric functions
3.18.2 Orthogonal polynomials
3.18.3 Sinc functions
3.18.4 Orthogonal wavelets
3.19 Signals as points: Digital communications
3.19.1 The detection problem
3.19.2 Examples of basis functions used in digital communications
3.19.3 Detection in nonwhite noise
3.20 Exercises
3.21 References

4 Linear Operators and Matrix Inverses
4.1 Linear operators
4.1.1 Linear functionals
4.2 Operator norms
4.2.1 Bounded operators
4.2.2 The Neumann expansion
4.2.3 Matrix norms
4.3
4.3.1 A dual optimization problem
4.4 Geometry of linear equations
4.5 Four fundamental subspaces of a linear operator
4.5.1 The four fundamental subspaces with non-closed range
4.6 Some properties of matrix inverses
4.6.1 Tests for invertibility of matrices
4.7 Some results on matrix rank
4.7.1 Numeric rank
4.8 Another look at least squares
4.9 Pseudoinverses
4.10 Matrix condition number
4.11 Inverse of a small-rank adjustment
4.11.1 An application: the RLS filter
4.11.2 Two RLS applications
4.12 Inverse of a block (partitioned) matrix
4.12.1 Application: Linear models
4.13 Exercises
4.14 References

5 Some Important Matrix Factorizations
5.1 The LU factorization
5.1.1 Computing the determinant using the LU factorization
5.1.2 Computing the LU factorization
5.2 The Cholesky factorization
5.2.1 Algorithms for computing the Cholesky factorization
5.3 Unitary matrices and the QR factorization
5.3.1 Unitary matrices
5.3.2 The QR factorization
5.3.3 QR factorization and least-squares filters
5.3.4 Computing the QR factorization
5.3.5 Householder transformations
5.3.6 Algorithms for Householder transformations
5.3.7 QR factorization using Givens rotations
5.3.8 Algorithms for QR factorization using Givens rotations
5.3.9 Solving least-squares problems using Givens rotations
5.3.10 Givens rotations via CORDIC rotations
5.3.11 Recursive updates to the QR factorization
5.4 Exercises
5.5 References

6 Eigenvalues and Eigenvectors
6.1 Eigenvalues and linear systems
6.2 Linear dependence of eigenvectors
6.3 Diagonalization of a matrix
6.3.1 The Jordan form
6.3.2
6.4 Geometry of invariant subspaces
6.5 Geometry of quadratic forms and the minimax principle
6.6 Extremal quadratic forms subject to linear constraints
6.7 The Gershgorin circle theorem

Application of Eigendecomposition methods
6.8 Karhunen-Loeve low-rank approximations and principal methods
6.8.1 Principal component methods
6.9 Eigenfilters
6.9.1 Eigenfilters for random signals
6.9.2 Eigenfilter for designed spectral response
6.9.3 Constrained eigenfilters
6.10 Signal subspace techniques
6.10.1 The signal model
6.10.2 The noise model
6.10.3 Pisarenko harmonic decomposition
6.10.4 MUSIC
6.11 Generalized eigenvalues
6.11.1 An application: ESPRIT
6.12 Characteristic and minimal polynomials
6.12.1 Matrix polynomials
6.12.2 Minimal polynomials
6.13 Moving the eigenvalues around: Introduction to linear control
6.14 Noiseless constrained channel capacity
6.15 Computation of eigenvalues and eigenvectors
6.15.1 Computing the largest and smallest eigenvalues
6.15.2 Computing the eigenvalues of a symmetric matrix
6.15.3 The QR iteration
6.16 Exercises
6.17 References

7 The Singular Value Decomposition
7.1 Theory of the SVD
7.2 Matrix structure from the SVD
7.3 Pseudoinverses and the SVD
7.4 Numerically sensitive problems
7.5 Rank-reducing approximations: Effective rank
Applications of the SVD
7.6 System identification using the SVD
7.7 Total least-squares problems
7.7.1 Geometric interpretation of the TLS solution
7.8 Partial total least squares
7.9 Rotation of subspaces
7.10 Computation of the SVD
7.11 Exercises
7.12 References

8 Some Special Matrices and Their Applications
8.1 Modal matrices and parameter estimation
8.2 Permutation matrices
8.3 Toeplitz matrices and some applications
8.3.1 Durbin's algorithm
8.3.2 Predictors and lattice filters
8.3.3 Optimal predictors and Toeplitz inverses
8.3.4 Toeplitz equations with a general right-hand side
8.4 Vandermonde matrices
8.5 Circulant matrices
8.5.1 Relations among Vandermonde, circulant, and companion matrices
8.5.2 Asymptotic equivalence of the eigenvalues of Toeplitz and circulant matrices
8.6 Triangular matrices
8.7 Properties preserved in matrix products
8.8 Exercises
8.9 References

9 Kronecker Products and the Vec Operator
9.1 The Kronecker product and Kronecker sum
9.2 Some applications of Kronecker products
9.2.1
9.2.2 DFT computation using Kronecker products
9.3 The vec operator
9.4 Exercises
9.5 References

III Detection, Estimation, and Optimal Filtering

10 Introduction to Detection and Estimation, and Mathematical Notation
10.1 Detection and estimation theory
10.1.1 Game theory and decision theory
10.1.2 Randomization
10.1.3 Special cases
10.2 Some notational conventions
10.2.1 Populations and statistics
10.3 Conditional expectation
10.4 Transformations of random variables
10.5 Sufficient statistics
10.5.1 Examples of sufficient statistics
10.5.2 Complete sufficient statistics
10.6 Exponential families
10.7 Exercises
10.8 References

11 Detection Theory
11.1 Introduction to hypothesis testing
11.2 Neyman-Pearson theory
11.2.1 Simple binary hypothesis testing
11.2.2 The Neyman-Pearson lemma
11.2.3 Application of the Neyman-Pearson lemma
11.2.4 The likelihood ratio and the receiver operating characteristic (ROC)
11.2.5 A Poisson example
11.2.6 Some Gaussian examples
11.2.7 Properties of the ROC
11.3 Neyman-Pearson testing with composite binary hypotheses
11.4 Bayes decision theory
11.4.1 The Bayes principle
11.4.2 The risk function
11.4.3 Bayes risk
11.4.4 Bayes tests of simple binary hypotheses
11.4.5 Posterior distributions
11.4.6 Detection and sufficiency
11.4.7 Summary of binary decision problems
11.5 Some M-ary problems
11.6 Maximum-likelihood detection
11.7 Approximations to detection performance: The union bound
11.8 Invariant Tests
11.8.1 Detection with random (nuisance) parameters
11.9 Detection in continuous time
11.9.1 Some extensions and precautions
11.10 Minimax Bayes decisions
11.10.1 Bayes envelope function
11.10.2 Minimax rules
11.10.3 Minimax Bayes in multiple-decision problems
11.10.4 Determining the least favorable prior
11.10.5 A minimax example and the minimax theorem
11.11 Exercises
11.12 References

12 Estimation Theory
12.1 The maximum-likelihood principle
12.2 ML estimates and sufficiency
12.3 Estimation quality
12.3.1 The score function
12.3.2 The Cramer-Rao lower bound
12.3.3 Efficiency
12.3.4 Asymptotic properties of maximum-likelihood estimators
12.3.5 The multivariate normal case
12.3.6 Minimum-variance unbiased estimators
12.3.7 The linear statistical model
12.4 Applications of ML estimation
12.4.1 ARMA parameter estimation
12.4.2 Signal subspace identification
12.4.3 Phase estimation
12.5 Bayes estimation theory
12.6 Bayes risk
12.6.1 MAP estimates
12.6.2 Summary
12.6.3 Conjugate prior distributions
12.6.4 Connections with minimum mean-squared estimation
12.6.5 Bayes estimation with the Gaussian distribution
12.7 Recursive estimation
12.7.1 An example of non-Gaussian recursive Bayes
12.8 Exercises
12.9 References

13 The Kalman Filter
13.1 The state-space signal model
13.2 Kalman filter I: The Bayes approach
13.3 Kalman filter II: The innovations approach
13.3.1 Innovations for processes with linear observation models
13.3.2 Estimation using the innovations process
13.3.3 Innovations for processes with state-space models
13.3.4 A recursion for $P_{t|t-1}$
13.3.5 The discrete-time Kalman filter
13.3.6 Perspective
13.3.7 Comparison with the RLS adaptive filter algorithm
13.4 Numerical considerations: Square-root filters
13.5 Application in continuous-time systems
13.5.1 Conversion from continuous time to discrete time
13.5.2 A simple kinematic example
13.6 Extensions of Kalman filtering to nonlinear systems
13.7 Smoothing
13.7.1 The Rauch-Tung-Striebel fixed-interval smoother
13.8 Another approach: $H_\infty$ smoothing
13.9 Exercises
13.10 References

IV Iterative and Recursive Methods in Signal Processing

14 Basic Concepts and Methods of Iterative Algorithms
14.1 Definitions and qualitative properties of iterated functions
14.1.1 Basic theorems of iterated functions
14.1.2 Illustration of the basic theorems
14.2 Contraction mappings
14.3 Rates of convergence for iterative algorithms
14.4 Newton's method
14.5 Steepest descent
14.5.1 Comparison and discussion: Other techniques
Some Applications of Basic Iterative Methods
14.6.1 An example LMS application
14.6.2 Convergence of the LMS algorithm
14.7 Neural networks
14.7.1 The backpropagation training algorithm
14.7.2 The nonlinearity function
14.7.3 The forward-backward training algorithm
14.7.5 Neural network code
14.7.6 How many neurons?
14.7.7 Pattern recognition: ML or NN?
14.8 Blind source separation
14.8.1 A bit of information theory
14.8.2 Applications to source separation
14.8.3 Implementation aspects
14.9 Exercises
14.10 References

15 Iteration by Composition of Mappings
15.1 Introduction
15.2 Alternating projections
15.2.1 An application: bandlimited reconstruction
15.3 Composite mappings
15.4 Closed mappings and the global convergence theorem
15.5 The composite mapping algorithm
15.5.1 Bandlimited reconstruction, revisited
15.5.2 An example: Positive sequence determination
15.5.3 Matrix property mappings
15.6 Projection on convex sets
15.7 Exercises
15.8 References

16 Other Iterative Algorithms
16.1 Clustering
16.1.1 An example application: Vector quantization
16.1.2 An example application: Pattern recognition
16.1.3 k-means clustering
16.1.4 Clustering using fuzzy k-means
16.2 Iterative methods for computing inverses of matrices
16.2.1 The Jacobi method
16.2.2 Gauss-Seidel iteration
16.2.3 Successive over-relaxation (SOR)
16.3 Algebraic reconstruction techniques (ART)
16.4 Conjugate-direction methods
16.7 Exercises
16.8 References

17 The EM Algorithm in Signal Processing
17.1 An introductory example
17.2 General statement of the EM algorithm
17.3 Convergence of the EM algorithm
17.3.1 Convergence rate: Some generalizations
Example applications of the EM algorithm
17.4 Introductory example, revisited
17.5 Emission computed tomography (ECT) image reconstruction
17.6 Active noise cancellation (ANC)
17.7 Hidden Markov models
17.7.1 The E- and M-steps
17.7.2 The forward and backward probabilities
17.7.3 Discrete output densities
17.7.4 Gaussian output densities
17.7.5 Normalization
17.7.6 Algorithms for HMMs
17.9 Summary
17.10 Exercises
17.11 References

V Methods of Optimization

18 Theory of Constrained Optimization
18.1 Basic definitions
18.2 Generalization of the chain rule to composite functions
18.3 Definitions for constrained optimization
18.4 Equality constraints: Lagrange multipliers
18.4.1 Examples of equality-constrained optimization
18.5 Second-order conditions
18.6 Interpretation of the Lagrange multipliers
18.7 Complex constraints
18.8 Duality in optimization
18.9 Inequality constraints: Kuhn-Tucker conditions
18.9.1 Second-order conditions for inequality constraints
18.9.2 An extension: Fritz John conditions
18.10 Exercises
18.11 References

19 Shortest-Path Algorithms and Dynamic Programming
19.1 Definitions for graphs
19.2 Dynamic programming
19.3 The Viterbi algorithm
19.4 Code for the Viterbi algorithm
19.4.1 Related algorithms: Dijkstra's and Warshall's
19.4.2 Complexity comparisons of Viterbi and Dijkstra

Applications of path search algorithms
19.5 Maximum-likelihood sequence estimation
19.5.1 The intersymbol interference (ISI) channel
19.5.2 Code-division multiple access
19.5.3 Convolutional decoding
19.6 HMM likelihood analysis and HMM training
19.6.1 Dynamic warping
19.7 Alternatives to shortest-path algorithms
19.8 Exercises
19.9 References

20 Linear Programming
20.1 Introduction to linear programming
20.2 Putting a problem into standard form
20.2.1 Inequality constraints and slack variables
20.2.2 Free variables
20.2.3 Variable-bound constraints
20.2.4 Absolute value in the objective
20.3 Simple examples of linear programming
20.4 Computation of the linear programming solution
20.4.1 Basic variables
20.4.2 Pivoting
20.4.3 Selecting variables on which to pivot
20.4.4 The effect of pivoting on the value of the problem
20.4.5 Summary of the simplex algorithm
20.4.6 Finding the initial basic feasible solution
20.4.7 MATLAB® code for linear programming
20.4.8 Matrix notation for the simplex algorithm
20.5 Dual problems
20.6 Karmarkar's algorithm for LP
20.6.1 Conversion to Karmarkar standard form
20.6.2 Convergence of the algorithm
20.6.3 Summary and extensions

Examples and applications of linear programming
20.7 Linear-phase FIR filter design
20.7.1 Least-absolute-error approximation
20.8 Linear optimal control
20.9 Exercises
20.10 References

A Basic Concepts and Definitions
A.1 Set theory and notation
A.2 Mappings and functions
A.3 Convex functions
A.4 O and o Notation
A.5 Continuity
A.6 Differentiation
A.6.1 Differentiation with a single real variable
A.6.2 Partial derivatives and gradients on $\mathbb{R}^n$
A.6.3 Linear approximation using the gradient
A.6.4 Taylor series
A.7 Basic constrained optimization
A.8 The Hölder and Minkowski inequalities
A.9 Exercises
A.10 References

B Completing the Square
B.1 The scalar case
B.2 The matrix case
B.3 Exercises

C Basic Matrix Concepts
C.1 Notational conventions
C.2 Matrix Identity and Inverse
C.3 Transpose and trace
C.4 Block (partitioned) matrices
C.5 Determinants
C.5.1 Basic properties of determinants
C.5.2 Formulas for the determinant
C.5.3 Determinants and matrix inverses
C.6 Exercises
C.7 References

D Random Processes
D.1 Definitions of means and correlations
D.2 Stationarity
D.3 Power spectral-density functions
D.4 Linear systems with stochastic inputs
D.4.1 Continuous-time signals and systems
D.4.2 Discrete-time signals and systems
D.5 References

E
E.1 Derivatives of vectors and scalars with respect to a real vector
E.2 Derivatives of real-valued functions of real matrices
E.3 Derivatives of matrices with respect to scalars, and vice versa
E.4 The transformation principle
E.5 Derivatives of products of matrices
E.6 Derivatives of powers of a matrix
E.7 Derivatives involving the trace
E.8 Modifications for derivatives of complex vectors and matrices
E.9 Exercises
E.10 References

F Conditional Expectations of Multinomial and Poisson r.v.s
F.1 Multinomial distributions
F.2 Poisson random variables
F.3 Exercises

Bibliography

Index
