blob: 50923e84aad7cac4fbb1ea3a5788d2d0d68e95b5 [file] [log] [blame]
= Digital Signal Processing
// Licensed to the Apache Software Foundation (ASF) under one
// or more contributor license agreements. See the NOTICE file
// distributed with this work for additional information
// regarding copyright ownership. The ASF licenses this file
// to you under the Apache License, Version 2.0 (the
// "License"); you may not use this file except in compliance
// with the License. You may obtain a copy of the License at
//
// http://www.apache.org/licenses/LICENSE-2.0
//
// Unless required by applicable law or agreed to in writing,
// software distributed under the License is distributed on an
// "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY
// KIND, either express or implied. See the License for the
// specific language governing permissions and limitations
// under the License.
This section of the user guide explores functions that are commonly used in the field of
Digital Signal Processing (DSP).
== Convolution
The `conv` function calculates the convolution of two vectors. The convolution is calculated by *reversing*
the second vector and sliding it across the first vector. The dot product of the two vectors
is calculated at each point as the second vector is slid across the first vector.
The dot products are collected in a third vector which is the convolution of the two vectors.
=== Moving Average Function
Before looking at an example of convolution it's useful to review the `movingAvg` function. The moving average
function computes a moving average by sliding a window across a vector and computing
the average of the window at each shift. If that sounds similar to convolution, that's because the `movingAvg`
function involves a sliding window approach similar to convolution.
Below is an example of a moving average with a window size of 5. Notice that the original vector has 13 elements
but the result of the moving average has only 9 elements. This is because the `movingAvg` function
only begins generating results when it has a full window. The `ltrim` function is used to trim the
first four elements from the original `y` array to line up with the moving average.
image::images/math-expressions/conv1.png[]
=== Convolutional Smoothing
The moving average can also be computed using convolution. In the example
below the `conv` function is used to compute the moving average of the first array
by applying the second array as a filter.
Looking at the result, we see that the convolution produced an array with 17 values instead of the 9 values created by the
moving average. That is because the `conv` function pads zeros
to the front and back of the first vector so that the window size is always full.
image::images/math-expressions/conv2.png[]
We achieve the same result as the `movingAvg` function by trimming the first and last 4 values of
the convolution result using the `ltrim` and `rtrim` functions.
The example below plots both the trimmed convolution and the moving average on the same plot. Notice that
they perfectly overlap.
image::images/math-expressions/conv3.png[]
This demonstrates how convolution can be used to smooth a signal by sliding a filter across the signal and
computing the dot product at each point. The smoothing effect is caused by the design of the filter.
In the example, the filter length is 5 and each value in the filter is .2. This filter calculates a
simple moving average with a window size of 5.
The formula for computing a simple moving average using convolution is to make the filter length the window
size and make the values of the filter all the same and sum to 1. A moving average with a window size of 4
can be computed by changing the filter to a length of 4 with each value being .25.
==== Changing the Weights
The filter, which is sometimes called the *kernel*, can be viewed as a vector of weights. In the initial
example all values in the filter have the same weight (.2). The weights in the filter can be changed to
produce different smoothing effects. This is demonstrated in the example below.
In this example the filter increases in weight from .1 to .3. This places more weight towards the front
of the filter. Notice that the filter is reversed with the `rev` function before the `conv` function applies it.
This is done because convolution will reverse
the filter. In this case we reverse it ahead of time and when convolution reverses it back, it is the same
as the original filter.
The plot shows the effect of the different weights in the filter. The dark blue line is the initial array.
The light blue line is the convolution and the orange line is the moving average. Notice that the convolution
responds quicker to the movements in the underlying array. This is because more weight has been placed
at the front of the filter.
image::images/math-expressions/conv4.png[]
== Cross-Correlation
Cross-correlation is used to determine the delay between two signals. This is accomplished by sliding one signal across another
and calculating the dot product at each shift. The dot products are collected into a vector which represents the correlation
at each shift. The highest dot product in the cross-correlation vector is the point where the two signals are most closely correlated.
The sliding dot product used in convolution can also be used to represent cross-correlation between two vectors. The only
difference in the formula when representing correlation is that the second vector is *not reversed*.
Notice in the example below that the second vector is reversed by the `rev` function before it is operated on by the `conv` function.
The `conv` function reverses the second vector so it will be flipped back to its original order to perform the correlation calculation
rather than the convolution calculation.
Notice in the result the highest value is 217. This is the point where the two vectors have the highest correlation.
image::images/math-expressions/crosscorr.png[]
== Find Delay
It is fairly simple to compute the delay from the cross-correlation result, but a convenience function called `finddelay` can
be used to find the delay directly. Under the covers `finddelay` uses convolutional math to compute the cross-correlation vector
and then computes the delay between the two signals.
Below is an example of the `finddelay` function. Notice that the `finddelay` function reports a 3 period delay between the first
and second signal.
image::images/math-expressions/delay.png[]
== Oscillate (Sine Wave)
The `oscillate` function generates a periodic oscillating signal which can be used to model and study sine waves.
The `oscillate` function takes three parameters: `amplitude`, `angular frequency`, and `phase` and returns a vector containing the y-axis points of a sine wave.
The y-axis points were generated from an x-axis sequence of 0-127.
Below is an example of the `oscillate` function called with an amplitude of
1, and angular frequency of .28 and phase of 1.57.
image::images/math-expressions/sinewave.png[]
=== Sine Wave Interpolation & Extrapolation
The `oscillate` function returns a function which can be used by the `predict` function to interpolate or extrapolate a sine wave.
The example below extrapolates the sine wave to an x-axis sequence of 0-256.
image::images/math-expressions/sinewave256.png[]
== Autocorrelation
Autocorrelation measures the degree to which a signal is correlated with itself. Autocorrelation is used to determine
if a vector contains a signal or is purely random.
A few examples, with plots, will help to understand the concepts.
The first example simply revisits the example above of an extrapolated sine wave. The result of this
is plotted in the image below. Notice that there is a structure to the plot that is clearly not random.
image::images/math-expressions/sinewave256.png[]
In the next example the `sample` function is used to draw 256 samples from a `uniformDistribution` to create a
vector of random data. The result of this is plotted in the image below. Notice that there is no clear structure to the
data and the data appears to be random.
image::images/math-expressions/noise.png[]
In the next example the random noise is added to the sine wave using the `ebeAdd` function.
The result of this is plotted in the image below. Notice that the sine wave has been hidden
somewhat within the noise. Its difficult to say for sure if there is structure. As plots
becomes more dense it can become harder to see a pattern hidden within noise.
image::images/math-expressions/hidden-signal.png[]
In the next examples autocorrelation is performed with each of the vectors shown above to see what the
autocorrelation plots look like.
In the example below the `conv` function is used to autocorrelate the first vector which is the sine wave.
Notice that the `conv` function is simply correlating the sine wave with itself.
The plot has a very distinct structure to it. As the sine wave is slid across a copy of itself the correlation
moves up and down in increasing intensity until it reaches a peak. This peak is directly in the center and is the
the point where the sine waves are directly lined up. Following the peak the correlation moves up and down in decreasing
intensity as the sine wave slides farther away from being directly lined up.
This is the autocorrelation plot of a pure signal.
image::images/math-expressions/signal-autocorrelation.png[]
In the example below autocorrelation is performed with the vector of pure noise. Notice that the autocorrelation
plot has a very different plot then the sine wave. In this plot there is long period of low intensity correlation that appears
to be random. Then in the center a peak of high intensity correlation where the vectors are directly lined up.
This is followed by another long period of low intensity correlation.
This is the autocorrelation plot of pure noise.
image::images/math-expressions/noise-autocorrelation.png[]
In the example below autocorrelation is performed on the vector with the sine wave hidden within the noise.
Notice that this plot shows very clear signs of structure which is similar to autocorrelation plot of the
pure signal. The correlation is less intense due to noise but the shape of the correlation plot suggests
strongly that there is an underlying signal hidden within the noise.
image::images/math-expressions/hidden-signal-autocorrelation.png[]
== Discrete Fourier Transform
The convolution-based functions described above are operating on signals in the time domain. In the time
domain the x-axis is time and the y-axis is the quantity of some value at a specific point in time.
The discrete Fourier Transform translates a time domain signal into the frequency domain.
In the frequency domain the x-axis is frequency, and y-axis is the accumulated power at a specific frequency.
The basic principle is that every time domain signal is composed of one or more signals (sine waves)
at different frequencies. The discrete Fourier transform decomposes a time domain signal into its component
frequencies and measures the power at each frequency.
The discrete Fourier transform has many important uses. In the example below, the discrete Fourier transform is used
to determine if a signal has structure or if it is purely random.
=== Complex Result
The `fft` function performs the discrete Fourier Transform on a vector of *real* data. The result
of the `fft` function is returned as *complex* numbers. A complex number has two parts, *real* and *imaginary*.
The *real* part of the result describes the magnitude of the signal at different frequencies.
The *imaginary* part of the result describes the *phase*. The examples below deal only with the *real*
part of the result.
The `fft` function returns a `matrix` with two rows. The first row in the matrix is the *real*
part of the complex result. The second row in the matrix is the *imaginary* part of the complex result.
The `rowAt` function can be used to access the rows so they can be processed as vectors.
=== Fast Fourier Transform Examples
In the first example the `fft` function is called on the sine wave used in the autocorrelation example.
The results of the `fft` function is a matrix. The `rowAt` function is used to return the first row of
the matrix which is a vector containing the real values of the `fft` response.
The plot of the real values of the `fft` response is shown below. Notice there are two
peaks on opposite sides of the plot. The plot is actually showing a mirrored response. The right side
of the plot is an exact mirror of the left side. This is expected when the `fft` is run on real rather than
complex data.
Also notice that the `fft` has accumulated significant power in a single peak. This is the power associated with
the specific frequency of the sine wave. The vast majority of frequencies in the plot have close to 0 power
associated with them. This `fft` shows a clear signal with very low levels of noise.
image::images/math-expressions/signal-fft.png[]
In the second example the `fft` function is called on a vector of random data similar to one used in the
autocorrelation example. The plot of the real values of the `fft` response is shown below.
Notice that in is this response there is no clear peak. Instead all frequencies have accumulated a random level of
power. This `fft` shows no clear sign of signal and appears to be noise.
image::images/math-expressions/noise-fft.png[]
In the third example the `fft` function is called on the same signal hidden within noise that was used for
the autocorrelation example. The plot of the real values of the `fft` response is shown below.
Notice that there are two clear mirrored peaks, at the same locations as the `fft` of the pure signal. But
there is also now considerable noise on the frequencies. The `fft` has found the signal and but also
shows that there is considerable noise along with the signal.
image::images/math-expressions/hidden-signal-fft.png[]