2.4.6 Segment Transforms

Segment transforms perform manipulations on an entire segment of data. They modify the data in place, so be aware that your input data will be changed when you use one.

Offset Factor

Adds a constant offset to each of the input columns. Used together with Scale Factor, this can split multiple channels into distinct, non-overlapping bands of data.

Parameters
  • input_columns – List of column names

  • offset_factor – int; the amount added to each value in input_columns

Returns

DataFrame
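
No example is given for this transform, so here is a minimal standalone sketch of the same idea using pandas. The `offset_columns` helper is hypothetical, not part of the client API; it simply adds a constant to the named columns, as the parameters above describe.

```python
import pandas as pd

# Hypothetical helper mirroring what Offset Factor does to a DataFrame.
def offset_columns(data, input_columns, offset_factor):
    data = data.copy()  # the pipeline transform itself works in place
    data[input_columns] = data[input_columns] + offset_factor
    return data

df = pd.DataFrame({"accelx": [377, 357], "accely": [569, 594]})
shifted = offset_columns(df, ["accely"], 5000)
print(shifted["accely"].tolist())  # [5569, 5594]
```

Combined with Scale Factor, an offset like this moves a channel into its own value band so that channels plotted together do not overlap.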

Pre-emphasis Filter

Performs a pre-emphasis filter on the input column and modifies the sensor stream in place. This is a first-order FIR filter that computes a weighted combination of each sample with the previous sample.

Parameters
  • input_column (str) – sensor stream to apply the pre-emphasis filter to

  • alpha (float) – pre-emphasis factor (weight given to the previous sample)

  • prior (int) – the value of the previous sample, default is 0

Returns

input data after having been passed through a pre-emphasis filter

Examples

>>> client.pipeline.reset()
>>> df = client.datasets.load_activity_raw_toy()
>>> print(df)
    out:
           Subject     Class  Rep  accelx  accely  accelz
        0      s01  Crawling    1     377     569    4019
        1      s01  Crawling    1     357     594    4051
        2      s01  Crawling    1     333     638    4049
        3      s01  Crawling    1     340     678    4053
        4      s01  Crawling    1     372     708    4051
        5      s01  Crawling    1     410     733    4028
        6      s01  Crawling    1     450     733    3988
        7      s01  Crawling    1     492     696    3947
        8      s01  Crawling    1     518     677    3943
        9      s01  Crawling    1     528     695    3988
        10     s01  Crawling    1      -1    2558    4609
        11     s01   Running    1     -44   -3971     843
        12     s01   Running    1     -47   -3982     836
        13     s01   Running    1     -43   -3973     832
        14     s01   Running    1     -40   -3973     834
        15     s01   Running    1     -48   -3978     844
        16     s01   Running    1     -52   -3993     842
        17     s01   Running    1     -64   -3984     821
        18     s01   Running    1     -64   -3966     813
        19     s01   Running    1     -66   -3971     826
        20     s01   Running    1     -62   -3988     827
        21     s01   Running    1     -57   -3984     843
>>> client.pipeline.set_input_data('test_data', df, force=True)
>>> client.pipeline.add_transform("Pre-emphasis Filter",
                   params={"input_column": 'accelx',
                        "alpha": 0.97,
                        "prior": 2})
>>> results, stats = client.pipeline.execute()
>>> print(results)
    out:
               Class  Rep Subject  accelx  accely  accelz
        0   Crawling    1     s01     187     569    4019
        1   Crawling    1     s01      -5     594    4051
        2   Crawling    1     s01      -7     638    4049
        3   Crawling    1     s01       8     678    4053
        4   Crawling    1     s01      21     708    4051
        5   Crawling    1     s01      24     733    4028
        6   Crawling    1     s01      26     733    3988
        7   Crawling    1     s01      27     696    3947
        8   Crawling    1     s01      20     677    3943
        9   Crawling    1     s01      12     695    3988
        10  Crawling    1     s01    -257    2558    4609
        11   Running    1     s01     -23   -3971     843
        12   Running    1     s01      -3   -3982     836
        13   Running    1     s01       1   -3973     832
        14   Running    1     s01       0   -3973     834
        15   Running    1     s01      -5   -3978     844
        16   Running    1     s01      -3   -3993     842
        17   Running    1     s01      -7   -3984     821
        18   Running    1     s01      -1   -3966     813
        19   Running    1     s01      -2   -3971     826
        20   Running    1     s01       1   -3988     827
        21   Running    1     s01       1   -3984     843
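
The numbers above are consistent with the difference equation y[n] = floor((x[n] − alpha·x[n−1]) / 2), applied per group with `prior` seeding the first previous sample; the divide-by-two and floor appear to keep the result in integer range, but that detail is inferred from the example output rather than documented, so treat this pure-Python sketch as illustrative only, not the library's implementation.

```python
import math

def pre_emphasis(samples, alpha=0.97, prior=0):
    # First-order FIR: weight the previous sample by alpha and subtract it,
    # then halve and floor (inferred from the example output above).
    out, prev = [], prior
    for x in samples:
        out.append(math.floor((x - alpha * prev) / 2))
        prev = x
    return out

crawling_accelx = [377, 357, 333, 340, 372, 410, 450, 492, 518, 528, -1]
print(pre_emphasis(crawling_accelx, alpha=0.97, prior=2))
# [187, -5, -7, 8, 21, 24, 26, 27, 20, 12, -257]
```

Note that the pipeline applies the filter per group, which is why the first Running sample also starts from `prior` rather than from the last Crawling sample.
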
Scale Factor

Scales the input columns by either a defined scalar, or by the standard deviation or median of the data.

Parameters
  • input_columns – list of column names

  • scale_factor – float; number by which input_columns are divided.

Returns

DataFrame

Examples

>>> client.pipeline.reset()
>>> df = client.datasets.load_activity_raw_toy()
>>> print(df)
    out:
           Subject     Class  Rep  accelx  accely  accelz
        0      s01  Crawling    1     377     569    4019
        1      s01  Crawling    1     357     594    4051
        2      s01  Crawling    1     333     638    4049
        3      s01  Crawling    1     340     678    4053
        4      s01  Crawling    1     372     708    4051
        5      s01  Crawling    1     410     733    4028
        6      s01  Crawling    1     450     733    3988
        7      s01  Crawling    1     492     696    3947
        8      s01  Crawling    1     518     677    3943
        9      s01  Crawling    1     528     695    3988
        10     s01  Crawling    1      -1    2558    4609
        11     s01   Running    1     -44   -3971     843
        12     s01   Running    1     -47   -3982     836
        13     s01   Running    1     -43   -3973     832
        14     s01   Running    1     -40   -3973     834
        15     s01   Running    1     -48   -3978     844
        16     s01   Running    1     -52   -3993     842
        17     s01   Running    1     -64   -3984     821
        18     s01   Running    1     -64   -3966     813
        19     s01   Running    1     -66   -3971     826
        20     s01   Running    1     -62   -3988     827
        21     s01   Running    1     -57   -3984     843
>>> client.pipeline.set_input_data('test_data', df, force=True)
>>> client.pipeline.add_transform('Scale Factor',
                    params={'scale_factor':4096.,
                    'input_columns':['accely']})
>>> results, stats = client.pipeline.execute()
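
The example above stops before showing its output. Since `scale_factor` divides each input column, the effect can be sketched standalone with plain pandas (this is an illustration, not the client API):

```python
import pandas as pd

df = pd.DataFrame({"accely": [569, 594, 638]})
# Scale Factor with a fixed scalar divides each input column by scale_factor.
df["accely"] = df["accely"] / 4096.0
print([round(v, 4) for v in df["accely"]])  # [0.1389, 0.145, 0.1558]
```

Dividing by 4096 maps raw accelerometer counts into roughly unit range, which is useful before combining channels or applying an Offset Factor to band them.
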
Strip

Removes each signal’s mean or minimum from its values, while leaving the specified passthrough columns unmodified. The entire signal is shifted so that it is centered on its mean (type = ‘mean’) or so that its minimum becomes zero (type = ‘min’).

Parameters
  • input_columns – The list of column names to use.

  • type – Possible values are ‘mean’ or ‘min’.

Returns

A DataFrame. If type = ‘mean’, the mean of each input column is calculated and subtracted from every value in that column; if type = ‘min’, the column minimum is subtracted instead.

Example

>>> client.pipeline.reset()
>>> df = client.datasets.load_activity_raw_toy()
>>> df
    out:
           Subject     Class  Rep  accelx  accely  accelz
        0      s01  Crawling    1     377     569    4019
        1      s01  Crawling    1     357     594    4051
        2      s01  Crawling    1     333     638    4049
        3      s01  Crawling    1     340     678    4053
        4      s01  Crawling    1     372     708    4051
        5      s01  Crawling    1     410     733    4028
        6      s01  Crawling    1     450     733    3988
        7      s01  Crawling    1     492     696    3947
        8      s01  Crawling    1     518     677    3943
        9      s01  Crawling    1     528     695    3988
        10     s01  Crawling    1      -1    2558    4609
        11     s01   Running    1     -44   -3971     843
        12     s01   Running    1     -47   -3982     836
        13     s01   Running    1     -43   -3973     832
        14     s01   Running    1     -40   -3973     834
        15     s01   Running    1     -48   -3978     844
        16     s01   Running    1     -52   -3993     842
        17     s01   Running    1     -64   -3984     821
        18     s01   Running    1     -64   -3966     813
        19     s01   Running    1     -66   -3971     826
        20     s01   Running    1     -62   -3988     827
        21     s01   Running    1     -57   -3984     843
>>> client.pipeline.set_input_data('test_data', df, force=True,
                    data_columns=['accelx', 'accely', 'accelz'],
                    group_columns=['Subject', 'Class', 'Rep'],
                    label_column='Class')
>>> client.pipeline.add_transform("Strip",
                   params={"input_columns": ['accelx'],
                           "type": 'min' })
>>> results, stats = client.pipeline.execute()
>>> print(results)
    out:
               Class  Rep Subject  accelx  accely  accelz
        0   Crawling    1     s01   378.0     569    4019
        1   Crawling    1     s01   358.0     594    4051
        2   Crawling    1     s01   334.0     638    4049
        3   Crawling    1     s01   341.0     678    4053
        4   Crawling    1     s01   373.0     708    4051
        5   Crawling    1     s01   411.0     733    4028
        6   Crawling    1     s01   451.0     733    3988
        7   Crawling    1     s01   493.0     696    3947
        8   Crawling    1     s01   519.0     677    3943
        9   Crawling    1     s01   529.0     695    3988
        10  Crawling    1     s01     0.0    2558    4609
        11   Running    1     s01    22.0   -3971     843
        12   Running    1     s01    19.0   -3982     836
        13   Running    1     s01    23.0   -3973     832
        14   Running    1     s01    26.0   -3973     834
        15   Running    1     s01    18.0   -3978     844
        16   Running    1     s01    14.0   -3993     842
        17   Running    1     s01     2.0   -3984     821
        18   Running    1     s01     2.0   -3966     813
        19   Running    1     s01     0.0   -3971     826
        20   Running    1     s01     4.0   -3988     827
        21   Running    1     s01     9.0   -3984     843
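
The min-strip above can be reproduced with a plain pandas groupby, sketched here on a few of the rows as a standalone snippet (not the client API):

```python
import pandas as pd

df = pd.DataFrame({
    "Class": ["Crawling"] * 3 + ["Running"] * 3,
    "accelx": [377, 357, -1, -44, -66, -57],
})
# type='min': subtract each group's minimum so every group's floor becomes 0.
df["accelx"] = df.groupby("Class")["accelx"].transform(lambda s: s - s.min())
print(df["accelx"].tolist())  # [378, 358, 0, 22, 0, 9]
```

Because grouping is done per (Subject, Class, Rep) in the pipeline example, each segment is shifted independently: Crawling is shifted by its own minimum (-1) and Running by its own (-66).
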
Vertical AutoScale Segment

Scales the segment’s amplitude to MAX_INT_16, or as close as possible without overflowing. Scaling is applied only to input_columns; other sensor columns are left unchanged.

Parameters
  • input_columns – The list of column names to use.

  • group_columns – list of columns on which to group. Each group is scaled independently, one at a time.

Returns

A DataFrame in which the input_columns of each segment have been vertically scaled.
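
No example is given for this transform. A plausible reading of the description is that each segment is multiplied by the largest factor that keeps its peak magnitude at or below MAX_INT_16 (32767). This pure-Python sketch is an assumption about the behavior, not the library's implementation:

```python
MAX_INT16 = 32767

def autoscale_segment(values):
    # Scale so the sample with the largest magnitude lands at (or just under)
    # MAX_INT16, using integer arithmetic to stay within int16 range.
    peak = max(abs(v) for v in values)
    if peak == 0:
        return list(values)
    return [v * MAX_INT16 // peak for v in values]

scaled = autoscale_segment([377, -1, 528])
print(scaled)  # the peak sample, 528, maps to 32767
```

Each group named by group_columns would be passed through this scaling separately, so a quiet segment is amplified just as much as a loud one.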