Essentials of Data Quality: Optimizing your range, sampling and filtering settings
All scientists know the value of high-quality, consistent data. It is the first variable you want to eliminate, leaving you to concentrate on interpreting a result rather than worrying about a lack of resolution or introduced artifacts. While it can seem tricky to work out the optimal settings for your signal, it is well worth getting them right early on: understanding whether your signal needs conditioning as you record is far more efficient than trying to manipulate your data retrospectively, when some information may already be lost. Here, we’ll discuss the essentials of optimizing your sampling rate, amplification and filtering settings to get the best data quality for waveform signals.
We’ll focus on biological signals recorded via a data acquisition unit (DAQ). In this case, you will be using a transducer to collect an analog signal, which is converted to a digital signal by your DAQ unit.
The resulting digital signal is sampled (or recorded) at regular intervals by your analysis software which, in turn, will store and display the data on your computer.
The regular interval at which the software ‘asks’ the DAQ unit for the voltage of the signal is known as the sampling rate. For example, if your sampling rate is 10 Hz, your software will record a data value 10 times every second. Each sampled point is recorded by your analysis software and contributes to the resulting curve. As you will see, the spacing between the points affects how the curve is represented when they are joined.
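To make this concrete, here is a minimal NumPy sketch of sampling at 10 Hz for one second (the 1 Hz test wave is illustrative, not from the webinar):

```python
import numpy as np

sampling_rate = 10                         # samples per second (Hz)
duration = 1.0                             # seconds of recording
n_samples = int(duration * sampling_rate)
t = np.arange(n_samples) / sampling_rate   # one sample every 0.1 s
signal = np.sin(2 * np.pi * 1.0 * t)       # hypothetical 1 Hz test wave

print(len(signal))   # 10 values recorded in one second
```

Joining those 10 points is all the software can do to reconstruct the curve, which is why their spacing matters.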
If sampling rate is too low:
- Information will be lost.
- The original signal will not be represented correctly.
If sampling rate is too high:
- Signal may contain excessive noise or artifacts.
- For example, if the normal frequency of the signal you are measuring (such as an EGG) is around 3 Hz, but you sample at 10,000 Hz, you will contaminate your signal with unwanted components, such as abdominal muscle activity, that the high rate will pick up.
- Excess data increases processing time and produces very large disk files.
Two rules to help you choose your sampling rate
Appropriate sampling rates depend on the signal to be measured. While often your equipment will come with suggested sampling rates for different signals, it’s a good idea to check them rather than following the suggestions blindly. Here are two rules to keep in mind when you are choosing your sampling rate.
1. The Nyquist frequency
The minimum rate at which digital sampling can accurately record an analog signal is known as the Nyquist Frequency, which is double the highest expected signal frequency.
Nyquist frequency = 2 x highest expected frequency
For example, a human ECG has components that can reach up to 50 Hz as their highest expected frequency, so the minimum sampling rate should be 100 Hz. To increase the detail recorded, or the smoothness of the curve, you can go much higher - to 200 or even 400 Hz.
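The cost of ignoring this rule can be shown with a short NumPy sketch: a 50 Hz component sampled at only 60 Hz (below its Nyquist rate of 100 Hz) produces exactly the same samples as a phase-inverted 10 Hz wave, so the two signals can no longer be told apart (the frequencies and sample count here are illustrative):

```python
import numpy as np

fs_low = 60                    # Hz - below the 100 Hz minimum for a 50 Hz signal
n = np.arange(64)              # sample indices
t = n / fs_low                 # sample times at the too-low rate

# The 50 Hz wave "aliases": its samples match an inverted 10 Hz wave.
s_50hz = np.sin(2 * np.pi * 50 * t)
s_10hz = np.sin(2 * np.pi * 10 * t)
print(np.allclose(s_50hz, -s_10hz))   # True - the original frequency is lost
```

No amount of post-processing can recover the true 50 Hz signal from those samples, which is why the sampling rate must be set correctly at recording time.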
2. Magic Number 20
As a more general rule, a sampling rate that gives you around 20 data points across each peak of your waveform signal will give you a smooth curve with adequate detail. The webinar will show you how to visualize your waveform as a series of individual points to check this.
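The rule of thumb reduces to a one-line calculation. This hypothetical helper (the function name and default are mine, not from the webinar) multiplies the fastest expected component by 20:

```python
def suggested_sampling_rate(highest_freq_hz, points_per_cycle=20):
    """Rule-of-thumb rate giving ~20 data points across each cycle
    of the fastest expected signal component (hypothetical helper)."""
    return highest_freq_hz * points_per_cycle

# For an ECG with components up to 50 Hz:
print(suggested_sampling_rate(50))   # 1000 Hz
```

Note that this comfortably exceeds the Nyquist minimum of 100 Hz for the same signal - the extra points buy smoothness, not just correctness.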
Amplification and range
Also known as range, amplification is the selection of values between which your hardware will look for information, and it has a direct effect on resolution. When you reduce the range to fit your signal, you will get better resolution of your signal. However, if you reduce your range too much, you can cut off peaks and troughs and lose important details of your input signal.
What is resolution? The resolution of a recorded signal refers to the number of discrete y-axis values available to build a digital representation of that signal. It is often easier to think of resolution in practical terms. We use 16-bit resolution in this example to show how amplification affects the final resolution of your curve.
At 16-bit resolution, and at any amplification setting, the y-axis is divided into 2 to the power of 16 segments (65,536). At an amplification setting of 10 V, the total range is 20 V (from +10 V to -10 V). This gives you a resolution of 20 V / 65,536 ≈ 305 µV, so each segment of the y-axis is about 305 µV high. This can cause problems if you have a very small signal, 1 mV for example: if your range is too high, your curve can look blocky and pixelated. If you reduce your range to 1 V, your resolution becomes about 30.5 µV. As the segments become smaller, the data points can sit closer together and more detail can be captured.
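The arithmetic above can be packaged as a small calculator (the function is a sketch of mine; it assumes a bipolar input, so the total span is twice the range setting):

```python
def resolution_volts(amplification_range_v, bits=16):
    """Height of one ADC step: total span (+range to -range)
    divided by 2**bits levels. Assumes a bipolar input."""
    return 2 * amplification_range_v / 2**bits

print(resolution_volts(10))   # ~3.05e-04 V, i.e. about 305 uV per step
print(resolution_volts(1))    # ~3.05e-05 V, i.e. about 30.5 uV per step
```

Running it for a 1 mV signal at the 10 V setting shows only a handful of distinct levels are available, which is exactly the "blocky" curve described above.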
Filters
Finally, we’ll discuss the value of filters in gaining quality data. There are two main categories of filter. Analog, or hardware, filters filter the continuous, incoming signal before it is digitized. Digital, or software, filters filter the data after it has been converted to a digital signal and stored in its raw form.
Analog filters:
- Can be useful if you already know you will need to remove noise of a known frequency from your signal. With an analog filter you can remove the noise before the signal is amplified by your bioamplifier, so you end up with a better signal-to-noise ratio.
- Alter the raw data in real time, so the filtering cannot be changed after the recording is complete. Because you cannot remove the filter from your data, if it is not set correctly you will need to record your data again.
Digital filters:
- Are applied over the raw data without altering it. They can be used in real time or after recording has finished, and can be adjusted at any stage without the need to record your data again.
- Are not affected by the real-world limitations of analog filters.
The most common filters you will see in analysis software are low pass and high pass filters.
A low pass filter allows the low frequency signals to be recorded while removing the high frequencies. This is useful if your signal is noisy and you want to smooth it out.
A high pass filter allows only the high frequencies through and eliminates the low frequencies. This is useful if your signal has a drifting baseline and you need to focus on the higher frequency component.
Some DAQ units come with a mains filter which will compare sampled data to a template of known mains noise such as interference from equipment around the lab - for example, a refrigerator, or fluorescent lights. The filter then subtracts the mains noise from the recorded data leaving your data intact and uncomplicated by the interfering frequencies.
A notch filter can be set to remove one particular frequency from your data. For example, if you have a centrifuge that produces 80 Hz noise you can set the filter to remove that band.
A band-pass filter removes all frequencies except the band that you select. For example, if you want to isolate the alpha activity from the beta wave in an EEG, you would set your band-pass filter to 8-12 Hz.
A band-stop filter is the opposite to a band-pass filter and removes the selected range of frequencies.
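As a sketch of how such digital filters might be applied in analysis code (assuming NumPy and SciPy are available; the test signal, cutoffs and Q factor below are illustrative choices of mine, not settings from the webinar):

```python
import numpy as np
from scipy import signal

fs = 1000                        # sampling rate, Hz
t = np.arange(0, 2, 1 / fs)      # 2 s of data
# Hypothetical EEG-like trace: 10 Hz "alpha" plus 50 Hz mains-like noise.
x = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 50 * t)

# Band-pass filter keeping only the 8-12 Hz band (alpha activity).
sos = signal.butter(4, [8, 12], btype="bandpass", fs=fs, output="sos")
alpha = signal.sosfiltfilt(sos, x)

# Notch filter removing just the 50 Hz component, leaving the rest intact.
b, a = signal.iirnotch(50, Q=30, fs=fs)
cleaned = signal.filtfilt(b, a, x)
```

Because these are digital filters, `x` itself is untouched: the filter settings can be changed and reapplied at any time without re-recording, which is exactly the advantage described above.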
As shown here, the three main factors to consider for optimizing your data quality are sampling rate, range and filtering. Each can be applied to give clarity to your data and with appropriate use they can provide good resolution, reduce noise in your signal and give you the confidence to focus on the insights your data holds. For more information you can watch the full webinar and see detailed instructions on creating and changing the sampling rate, range and filter settings in LabChart analysis software.