.. _finding_peaks:


Finding peaks in a vector
=========================

.. [in preparation]

In the following example we try to find peaks in some artificially
generated data.

First we make a test time series data set for 4 antennas::

    >>> data = hArray(float, [4,512], name='Random series with peaks')

and fill it with random data that have arbitrary offsets::

    >>> data.random(-1024,1024)
    >>> data[...] += Vector([-128.,256., 385.,-50.])

Then we put some peaks at location 2-3, 32, and 64-67 in each of the 4
data sets::

    >>> for i in range(4):
    ...     data[i,[2,3,32,64,65,67],...] = Vector([4096.,5097,-4096,4096,5099,3096])

Now, we reverse-engineer and try finding all 5 sigma peaks::

    >>> nsigma = 5

First make a scratch array that will contain the locations of the
peaks.  A location is actually a 'slice' in the array, i.e. given by
its beginning and ending position (plus one). The length of the return
array must be pre-allocated and should be long enough to contain all
peaks (at maximum as long as the input array)::

    >>> datapeaks = hArray(int, [4,5,2], name="Location of peaks")

Now, retrieve the mean and RMS of the array to set the thresholds
above and below which one considers a peak to be significant::

    >>> datamean = data[...].mean()
    >>> datathreshold2 = data[...].stddev(datamean)
    >>> datathreshold2 *= nsigma
    >>> datathreshold1 = datathreshold2*(-1)
    >>> datathreshold1 += datamean
    >>> datathreshold2 += datamean

Finally, we determine the input parameters for the search algorithm::

    >>> maxgap = Vector(int, len(datamean), fill=10)

The gap vector tells the algorithm how many samples can be between two
values that are above threshold, so that the two peaks are considered
as one::

    >>> minlength = Vector(int, len(datamean), fill=1)

A minimum length can be specified to exclude peaks that consists of
only a single or a few values (no relevant here, so set to 1, i.e. all
peaks are relevant). Then call :func:`hFindSequenceOutside` (or
:func:`hFindSequenceBetween`, :func:`hFindSequenceGreatererThan`,
:func:`hFindSequenceLessEqual` ...)::

    >>> npeaks = datapeaks[...].findsequenceoutside(data[...], datathreshold1, datathreshold2, maxgap, minlength)

The return value is the number of peaks found (in each row of the data set)::

    >>> npeaks
    Vector(int, 4, fill=[3,3,3,3])

And the slices are actually contained in the return vector for each antenna::

    >>> datapeaks.mprint()
    [2,4,32,33,64,66,0,0,0,0]
    [2,4,32,33,64,66,0,0,0,0]
    [2,4,32,33,64,66,0,0,0,0]
    [2,4,32,33,64,66,0,0,0,0]