resurfemg.postprocessing.features module

Copyright 2022 Netherlands eScience Center and University of Twente. Licensed under the Apache License, version 2.0. See LICENSE for details.

This file contains functions to extract features from preprocessed EMG arrays.

resurfemg.postprocessing.features.area_under_curve(array, start_index, end_index, end_curve=70, smooth_algorithm='none')

Compute the area under the curve of a single breath. This algorithm should be applied to breaths longer than 60 samples. The mid_savgol option assumes a parabolic fit. It is recommended to test a smoothing algorithm first, apply it, and then run area_under_curve with smooth_algorithm set to 'none'. To cut off the curve before it reaches its minimum, set end_curve to a nonzero value: the percentage (0 to 100) of the peak value at which summing stops after the peak.

Parameters:
  • array (np.array) – an array e.g. single lead EMG recording

  • start_index (int) – which index number the breath starts on

  • end_index (int) – which index number the breath ends on

  • end_curve (float) – percentage of peak value to stop summing at

  • smooth_algorithm (str) – algorithm for smoothing

Returns:

area; area under the curve

Return type:

float
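The cutoff rule can be illustrated with a minimal NumPy sketch. It is not the library's implementation: the helper name area_until_cutoff is hypothetical, no smoothing is applied, and the reading of end_curve (stop at the first sample after the peak that falls below that percentage of the peak) is an assumption. The toy signal is shorter than the recommended 60 samples to keep the arithmetic easy to follow.

```python
import numpy as np

def area_until_cutoff(array, start_index, end_index, end_curve=70):
    """Sum a breath from its start up to a cutoff after the peak.

    Hypothetical helper: the cutoff rule is an assumed reading of end_curve.
    """
    breath = np.asarray(array[start_index:end_index], dtype=float)
    peak_pos = int(np.argmax(breath))
    threshold = breath[peak_pos] * end_curve / 100.0
    # Positions (relative to the peak) where the signal is below the threshold.
    below = np.nonzero(breath[peak_pos:] < threshold)[0]
    # Stop at the first such sample, or use the whole breath if there is none.
    stop = peak_pos + below[0] if below.size else len(breath)
    return float(np.sum(breath[:stop]))

signal = np.array([0, 1, 2, 4, 8, 10, 8, 6, 4, 2, 1, 0], dtype=float)
area_cut = area_until_cutoff(signal, 0, len(signal), end_curve=70)
area_full = area_until_cutoff(signal, 0, len(signal), end_curve=0)
```

With end_curve=70 the sum stops once the signal drops below 7 (70% of the peak of 10); with end_curve=0 nothing is cut off and the whole breath is summed.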

resurfemg.postprocessing.features.calc_closed_sampent(t_vecs, n, tolerance)
resurfemg.postprocessing.features.calc_open_sampent(t_vecs, n, tolerance)
resurfemg.postprocessing.features.entropical(sig)

This function computes an entropy-like measure of a signal array. The input is sig, the signal, and the output is an entropy-like number; applied over successive slices, it produces a series of entropy measurements. The function can be used inside a generator to read over slices. Note that it is not a true entropy and works best with very small numbers.

Parameters:

sig (ndarray) – array containing the signal

Returns:

entropy-like number computed using math.log with base 2

Return type:

float

resurfemg.postprocessing.features.entropy_maker(array, method='sample_entropy', base=None)

This function takes an array and calculates either a time-series-specific entropy (the sample entropy, as in nolds) or a more general Shannon entropy (as calculated in scipy). It calls the entropy functions in this module.

resurfemg.postprocessing.features.entropy_scipy(sli, base=None)

This function wraps scipy.stats.entropy (a Shannon entropy) for use in the resurfemg library. It can be used in a slice iterator as a drop-in substitute for hf.entropical, but unlike that function it is a true entropy.

Parameters:

sli (ndarray) – array

Returns:

entropy_count

Return type:

float
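The slice-iterator usage can be sketched without scipy. The example below is an illustration under assumptions, not the library code: shannon_entropy and slices are hypothetical names, and turning a slice into a distribution by counting its unique values is an assumed choice. The formula itself (normalise to probabilities, then take the negative sum of p * log p) is the standard Shannon entropy.

```python
import numpy as np

def shannon_entropy(sli, base=None):
    """Shannon entropy of the value counts in a slice (natural log by default)."""
    _, counts = np.unique(np.asarray(sli), return_counts=True)
    probs = counts / counts.sum()
    ent = -np.sum(probs * np.log(probs))
    if base is not None:
        ent /= np.log(base)  # change of logarithm base
    return float(ent)

def slices(array, size):
    """Yield consecutive, non-overlapping slices of length size."""
    for start in range(0, len(array) - size + 1, size):
        yield array[start:start + size]

signal = np.array([1, 1, 1, 1, 1, 2, 1, 2])
entropies = [shannon_entropy(s, base=2) for s in slices(signal, 4)]
```

The first slice holds a single repeated value and has zero entropy; the second alternates between two equally frequent values and has an entropy of one bit.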

resurfemg.postprocessing.features.find_peak_in_breath(array, start_index, end_index, smooth_algorithm='none')

This algorithm locates the peak of a breath. The array is assumed to contain absolute values of an electrophysiological signal. The mid_savgol option assumes a parabolic fit. The convy option uses a convolution to smooth each value with its neighbours, as in the function running_smoother() in the same module. It is recommended to test a smoothing algorithm first, apply it, and then run the find peak algorithm.

Parameters:
  • array (np.array) – an array e.g. single lead EMG recording

  • start_index (int) – which index number the breath starts on

  • end_index (int) – which index number the breath ends on

  • smooth_algorithm (str) – algorithm for smoothing (none or ‘mid-savgol’ or ‘convy’)

Returns:

index of max point, value at max point, smoothed value

Return type:

tuple
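A convolution-based peak search along the lines of the convy option might look like the sketch below. It is an assumption-laden illustration: peak_in_breath and its window parameter are hypothetical, a plain moving average stands in for the library's smoother, and returning the index relative to the full array is an assumed convention.

```python
import numpy as np

def peak_in_breath(array, start_index, end_index, window=3):
    """Locate the peak of a breath on a moving-average smoothed copy."""
    breath = np.abs(np.asarray(array[start_index:end_index], dtype=float))
    kernel = np.ones(window) / window
    # mode="same" keeps the smoothed array aligned with the breath samples.
    smoothed = np.convolve(breath, kernel, mode="same")
    pos = int(np.argmax(smoothed))
    # Index in the full array, raw value at the peak, smoothed value there.
    return start_index + pos, float(breath[pos]), float(smoothed[pos])

signal = np.array([0.0, 1.0, 3.0, 9.0, 3.0, 1.0, 0.0, 0.0])
peak_index, peak_value, smooth_value = peak_in_breath(signal, 1, 7)
```

Searching on the smoothed copy makes the result less sensitive to single-sample spikes, while the raw value at that position is still reported.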

resurfemg.postprocessing.features.pseudo_slope(array, start_index, end_index, smoothing=True)

This function estimates the shape/slope of the take-off angle of the respiratory surface EMG signal. It returns mV divided by samples (in absolute values), not a true slope, so the number depends on the sampling rate and pre-processing. It is therefore recommended to compare values only within the same single sample run.

Parameters:
  • array (np.array) – an array e.g. single lead EMG recording

  • start_index (int) – which index number the breath starts on

  • end_index (int) – which index number the breath ends on

  • smoothing (bool) – whether to smooth the signal before the calculation

Returns:

pseudoslope

Return type:

float
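The dependence on sampling rate can be made concrete with a small sketch. The helper rise_per_sample is hypothetical and skips smoothing; defining the value as the rise from breath start to peak divided by the number of samples to the peak is an assumption about what "mV divided by samples" means here.

```python
import numpy as np

def rise_per_sample(array, start_index, end_index):
    """Rise from breath start to peak, divided by samples to the peak."""
    breath = np.abs(np.asarray(array[start_index:end_index], dtype=float))
    peak_pos = int(np.argmax(breath))
    if peak_pos == 0:
        return 0.0  # peak at the first sample: no rise to measure
    return float((breath[peak_pos] - breath[0]) / peak_pos)

# The same triangular breath, sampled at two different rates.
coarse = np.array([0.0, 1.0, 2.0, 3.0, 2.0, 1.0, 0.0])
fine = np.array([0.0, 0.5, 1.0, 1.5, 2.0, 2.5, 3.0,
                 2.5, 2.0, 1.5, 1.0, 0.5, 0.0])

slope_coarse = rise_per_sample(coarse, 0, len(coarse))
slope_fine = rise_per_sample(fine, 0, len(fine))
```

Doubling the sampling rate halves the value even though the underlying breath is identical, which is why comparisons across recordings with different settings are not meaningful.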

resurfemg.postprocessing.features.rowwise_chebyshev(x, y)
resurfemg.postprocessing.features.sampen(data, emb_dim=2, tolerance=None, dist=<function rowwise_chebyshev>, closed=False)

The following code is adapted from openly licensed code written by Christopher Schölzel in his package nolds (NOnLinear measures for Dynamical Systems). It computes the sample entropy of time sequence data.

Returns the sample entropy of the data (negative logarithm of ratio between similar template vectors of length emb_dim + 1 and emb_dim).

  • [c_m, c_m1] (list of two floats) – count of similar template vectors of length emb_dim (c_m) and of length emb_dim + 1 (c_m1)

  • [dists_m, dists_m1] (list of two float lists) – the distances between template vectors for m (dists_m) and for m + 1 (dists_m1)

Reference: [se_1] J. S. Richman and J. R. Moorman, “Physiological time-series analysis using approximate entropy and sample entropy,” American Journal of Physiology-Heart and Circulatory Physiology, vol. 278, no. 6, pp. H2039–H2049, 2000.

Kwargs:
  • emb_dim (int) – the embedding dimension (length of vectors to compare)

  • tolerance (float) – distance threshold for two template vectors to be considered equal (default: 0.2 * std(data) at emb_dim = 2, corrected for dimension effect for other values of emb_dim)

  • dist (function (2d-array, 1d-array) -> 1d-array) – distance function used to calculate the distance between template vectors. Sampen is defined using rowwise_chebyshev. Use something else only if you are sure that you need it.

  • closed (boolean) – if True, will check for vector pairs whose distance is in the closed interval [0, r] (less than or equal to r); otherwise the open interval [0, r) (less than r) will be used

Parameters:
  • data (array) – array-like

  • emb_dim (int) – the embedded dimension

  • tolerance (float) – distance threshold for two template vectors

  • dist (function) – function to calculate distance

Returns:

saen

Return type:

float
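The definition above (negative logarithm of the ratio of matching template pairs of length emb_dim + 1 to those of length emb_dim, using a row-wise Chebyshev distance) can be sketched in plain NumPy. This is a simplified illustration, not the adapted nolds code: the function names are hypothetical, the default tolerance is fixed at 0.2 * std(data) without the dimension correction, and the case of zero matches is not guarded.

```python
import numpy as np

def chebyshev_rows(templates, vector):
    """Chebyshev (max-abs) distance from each row of templates to vector."""
    return np.max(np.abs(templates - vector), axis=1)

def count_matches(data, length, n_templates, tolerance, closed):
    """Count template pairs (i < j) of the given length within tolerance."""
    templates = np.array([data[i:i + length] for i in range(n_templates)])
    count = 0
    for i in range(n_templates - 1):
        dists = chebyshev_rows(templates[i + 1:], templates[i])
        count += np.sum(dists <= tolerance) if closed else np.sum(dists < tolerance)
    return int(count)

def sample_entropy(data, emb_dim=2, tolerance=None, closed=False):
    """Simplified sample entropy; assumes at least one match at each length."""
    data = np.asarray(data, dtype=float)
    if tolerance is None:
        tolerance = 0.2 * np.std(data)  # simplified default (assumption)
    n_templates = len(data) - emb_dim  # same template count for both lengths
    c_m = count_matches(data, emb_dim, n_templates, tolerance, closed)
    c_m1 = count_matches(data, emb_dim + 1, n_templates, tolerance, closed)
    return float(-np.log(c_m1 / c_m))

regular = np.tile([0.0, 1.0], 20)                  # perfectly periodic
noisy = np.random.default_rng(0).normal(size=200)  # white noise

regular_value = sample_entropy(regular)
noisy_value = sample_entropy(noisy)
```

A perfectly periodic signal is fully predictable, so every match of length emb_dim extends to length emb_dim + 1 and the sample entropy is zero; white noise gives a clearly larger value.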

resurfemg.postprocessing.features.sampen_optimized(data, tolerance=None, closed=False)

The following code is adapted from openly licensed code written by Christopher Schölzel in his package nolds (NOnLinear measures for Dynamical Systems). It computes the sample entropy of time sequence data. Here emb_dim has been set to 1 and is not a parameter.

Returns the sample entropy of the data (negative logarithm of ratio between similar template vectors of length emb_dim + 1 and emb_dim).

  • [c_m, c_m1] (list of two floats) – count of similar template vectors of length emb_dim (c_m) and of length emb_dim + 1 (c_m1)

  • [dists_m, dists_m1] (list of two float lists) – the distances between template vectors for m (dists_m) and for m + 1 (dists_m1)

Reference: [se_1] J. S. Richman and J. R. Moorman, “Physiological time-series analysis using approximate entropy and sample entropy,” American Journal of Physiology-Heart and Circulatory Physiology, vol. 278, no. 6, pp. H2039–H2049, 2000.

Kwargs are pre-set and not available. For more extensive options, use the sampen function.

Parameters:
  • data (array) – array-like

  • tolerance (float) – distance threshold for two template vectors

  • closed (bool) – if True, use the closed interval [0, r] for matching distances; otherwise the open interval [0, r)

Returns:

saen

Return type:

float

resurfemg.postprocessing.features.simple_area_under_curve(array, start_index, end_index)

This function is a thin wrapper over np.sum. It exists because it is not apparent to some clinically oriented users that the area under the curve is the sum of all the sample values.

Parameters:
  • array (np.array) – an array e.g. single lead EMG recording

  • start_index (int) – which index number the breath starts on

  • end_index (int) – which index number the breath ends on

Returns:

area; area under the curve

Return type:

float
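As a quick illustration of the idea, the area of a breath is the sum of its samples. The snippet below is a sketch with made-up values, and treating end_index as exclusive (a half-open slice) is an assumption about the indexing convention.

```python
import numpy as np

signal = np.array([0.0, 1.0, 3.0, 2.0, 0.5, 0.0])
start_index, end_index = 1, 5

# Sum of the samples from start_index up to, but not including, end_index.
area = float(np.sum(signal[start_index:end_index]))
```

The result is expressed in signal units times samples.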

resurfemg.postprocessing.features.snr_pseudo(src_signal, peaks, baseline=array([], dtype=float64))

Approximate the signal-to-noise ratio (SNR) of the signal based on the peak height relative to the baseline.

Parameters:
  • src_signal (ndarray) – Signal to evaluate

  • peaks – list of individual peak indices

  • baseline (ndarray) – baseline signal against which peak height is compared

Returns:

snr_peaks, the SNR per peak

Return type:

ndarray
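One way to read "peak height relative to the baseline" is a per-peak ratio, sketched below. This is an assumption, not the library's formula: snr_per_peak is a hypothetical helper, the baseline is taken to be an array of the same length as the signal, and the default empty-baseline case is not handled.

```python
import numpy as np

def snr_per_peak(src_signal, peaks, baseline):
    """Ratio of the signal to the baseline at each peak index."""
    src_signal = np.asarray(src_signal, dtype=float)
    baseline = np.asarray(baseline, dtype=float)
    peaks = np.asarray(peaks, dtype=int)
    return src_signal[peaks] / baseline[peaks]

signal = np.array([1.0, 2.0, 8.0, 2.0, 1.0, 3.0, 12.0, 3.0])
baseline = np.full_like(signal, 2.0)  # constant baseline for illustration
snr_peaks = snr_per_peak(signal, [2, 6], baseline)
```

Each entry of snr_peaks corresponds to one peak index, so peaks with a poor ratio can be inspected or excluded individually.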

resurfemg.postprocessing.features.times_under_curve(array, start_index, end_index)

This function calculates the length of time to peak in both an absolute and a relative sense.

Parameters:
  • array (np.array) – an array e.g. single lead EMG recording

  • start_index (int) – which index number the breath starts on

  • end_index (int) – which index number the breath ends on

Returns:

times; a tuple of absolute and relative times

Return type:

tuple
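A possible reading of absolute and relative time to peak is sketched below. The helper time_to_peak is hypothetical; measuring the absolute time in samples from the breath start, and the relative time as that count divided by the breath length, are both assumptions.

```python
import numpy as np

def time_to_peak(array, start_index, end_index):
    """Samples from breath start to peak, and that count over breath length."""
    breath = np.asarray(array[start_index:end_index], dtype=float)
    absolute = int(np.argmax(breath))
    relative = absolute / len(breath)
    return absolute, relative

signal = np.array([0.0, 1.0, 2.0, 4.0, 3.0, 2.0, 1.0, 0.0])
absolute_time, relative_time = time_to_peak(signal, 0, len(signal))
```

Dividing the absolute value by the sampling rate would convert it from samples to seconds.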

resurfemg.postprocessing.features.variability_maker(array, segment_size, method='variance', fill_method='avg')

Calculate the variability of segments of an array according to a specific method, then interpolate the values back to the original length of the array.

Parameters:
  • array (ndarray) – the input array

  • segment_size (int) – length over which variability is calculated

  • method (str) – method for calculation i.e. variance or standard deviation

  • fill_method (str) – method to fill missing values at the end of the result array: ‘avg’ fills with the average of the last values, ‘zeros’ fills with zeros, and ‘resample’ resamples (does not fill) and stretches the array to the full ‘correct’ length of the original signal

Returns:

variability_values; array showing variability over segments

Return type:

ndarray
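The segment-then-expand idea can be sketched as follows. This is a simplified stand-in, not the library code: segment_variability is a hypothetical name, each segment's value is repeated across the segment instead of being interpolated, ‘avg’ is taken to mean the mean of all segment values, and ‘resample’ is not covered.

```python
import numpy as np

def segment_variability(array, segment_size, method="variance", fill_method="avg"):
    """Per-segment variability, expanded back to the input length."""
    array = np.asarray(array, dtype=float)
    n_segments = len(array) // segment_size
    # Drop the incomplete tail, then view the rest as rows of one segment each.
    segments = array[:n_segments * segment_size].reshape(n_segments, segment_size)
    stat = np.var if method == "variance" else np.std
    values = stat(segments, axis=1)
    # Step-hold expansion: repeat each segment value segment_size times.
    expanded = np.repeat(values, segment_size)
    missing = len(array) - len(expanded)
    fill = values.mean() if fill_method == "avg" else 0.0
    return np.concatenate([expanded, np.full(missing, fill)])

data = np.array([0, 2, 0, 2, 1, 1, 1, 1, 5, 5], dtype=float)
filled_avg = segment_variability(data, 4)
filled_zeros = segment_variability(data, 4, fill_method="zeros")
```

The last two samples do not form a full segment, so they are the values that fill_method has to supply.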