A short reminder on numpy with exercises
CNRS
IMAG
Paul-Valéry Montpellier 3 University
A short application of numpy, just a refresher
Exercises:
np.float64(19.898744534830083)
np.float64(19.89874453483009)
Exercise:
Use np.testing with the appropriate assert function to compare the two results.
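As an illustration, the two values printed above differ only in their last digits, which is typical of the same quantity computed in two different orders. A minimal sketch, assuming these two floats are the results to compare (the names a, b and exact_match are hypothetical):

```python
import numpy as np

# Hypothetical names for the two results printed above.
a = np.float64(19.898744534830083)
b = np.float64(19.89874453483009)

# Exact equality fails because the last digits differ.
exact_match = bool(a == b)
print(exact_match)

# assert_allclose compares up to a relative tolerance (rtol=1e-07 by default)
# and raises an AssertionError only if the values are too far apart.
np.testing.assert_allclose(a, b)
```

For floating-point results, a tolerance-based assertion such as np.testing.assert_allclose is usually a safer choice than an exact comparison.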
array([[0, 1, 2, 3, 4],
       [5, 6, 7, 8, 9]])
Exercise:
array([ 5,  7,  9, 11, 13])
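The output above is consistent with summing the previous 2 x 5 array along its first axis. A minimal sketch under that assumption (the construction with np.arange(10).reshape(2, 5) and the names arr and col_sums are guesses, not taken from the original):

```python
import numpy as np

# Assumed construction of the 2 x 5 array shown earlier.
arr = np.arange(10).reshape(2, 5)

# Sum along axis 0: each column is collapsed into a single value.
col_sums = arr.sum(axis=0)
print(col_sums)
```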
A frequent task in image processing is to find a precise pattern in a given image. We will restrict ourselves to the one-dimensional case (a list of positive integers), and we will first try to implement this algorithm.
We want a function that takes two one-dimensional numpy arrays as arguments: the first contains the data, and the second the sequence we want to find in the data. The function returns the list of indices in the data array at which a subsequence identical to the searched sequence starts.
For the example used in the rest of this section, we want to get array([3, 7]).
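As a point of reference, the expected behaviour can be sketched with a plain Python loop before turning to broadcasting. The function name find_sequence_naive and the arrays data and sequence are hypothetical; the values were chosen only so that they reproduce the outputs shown in this section.

```python
import numpy as np

def find_sequence_naive(data, sequence):
    """Return the start indices of every occurrence of sequence in data."""
    seq_size = len(sequence)
    matches = []
    # Slide a window of length seq_size over data and compare it to sequence.
    for start in range(len(data) - seq_size + 1):
        if np.array_equal(data[start:start + seq_size], sequence):
            matches.append(start)
    return np.array(matches, dtype=int)

# Hypothetical example values (not taken from the original text).
data = np.array([1, 3, 7, 9, 5, 3, 6, 9, 5, 5, 2])
sequence = np.array([9, 5])

naive_result = find_sequence_naive(data, sequence)
print(naive_result)
```

Such a loop is easy to check by hand, which makes it a convenient baseline for testing a vectorized version.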
First, use np.arange to create an increasing array of indices with the same size as the search sequence. Call seq_size this size and seq_ind the resulting array.
array([0, 1])
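This step could look as follows, assuming a search sequence of length 2 (the values in sequence are hypothetical):

```python
import numpy as np

# Hypothetical search sequence of length 2.
sequence = np.array([9, 5])

seq_size = sequence.size       # number of elements to match
seq_ind = np.arange(seq_size)  # offsets inside one candidate window
print(seq_ind)
```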
Call data_size the size of the input data.
We now want an array of increasing indices from 0 to data_size-seq_size, reshaped into a column vector with numpy's reshape. Call this column vector data_ind (dimension (data_size-seq_size+1, 1)).
array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [6],
       [7],
       [8],
       [9]])
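A possible way to build this column vector, assuming input data of size 11 and a sequence of size 2 (the values in data are hypothetical):

```python
import numpy as np

# Hypothetical input data of size 11.
data = np.array([1, 3, 7, 9, 5, 3, 6, 9, 5, 5, 2])
seq_size = 2

data_size = data.size
# One candidate start index per row; -1 lets numpy infer the number of rows.
data_ind = np.arange(data_size - seq_size + 1).reshape(-1, 1)
print(data_ind.shape)
```

Passing -1 to reshape avoids repeating the expression data_size-seq_size+1; writing the row count explicitly would work just as well.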
We will then use numpy's broadcasting rules to create an array of dimension (data_size-seq_size+1, 2) that contains all the runs of adjacent indices we want to look up in the data.
What very simple operation on data_ind and seq_ind produces the array below?
array([[ 0,  1],
       [ 1,  2],
       [ 2,  3],
       [ 3,  4],
       [ 4,  5],
       [ 5,  6],
       [ 6,  7],
       [ 7,  8],
       [ 8,  9],
       [ 9, 10]])
The shapes (10,1) and (2,) are compatible: data_ind has a unit dimension on the right, so broadcasting first “stretches” it along that dimension to match the length of seq_ind, and then broadcasts the addition of seq_ind over the ten rows. The first element of seq_ind is added to the first element of a row, and the second element of seq_ind to the second element of the same row. The operation is thus repeated on every row.
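The broadcasting described above can be checked with a short sketch (the name window_ind is hypothetical):

```python
import numpy as np

seq_ind = np.arange(2)                   # shape (2,)
data_ind = np.arange(10).reshape(-1, 1)  # shape (10, 1)

# Broadcasting (10, 1) + (2,) yields shape (10, 2):
# row i contains the indices [i, i + 1].
window_ind = data_ind + seq_ind
print(window_ind.shape)
```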
Using the result of the previous question as indices into the data array, find the entries that match the sequence with a simple comparison operator. Explain why the result has the same dimension (shape) as data_ind + seq_ind and not as data.
array([[False, False],
       [False, False],
       [False, False],
       [ True,  True],
       [False, False],
       [False, False],
       [False, False],
       [ True,  True],
       [False,  True],
       [False, False]])
The array data_ind + seq_ind is used to index data: with integer-array indexing, the result takes the shape of the index array, and each entry is the element of data found at the corresponding index. Numpy then broadcasts the == operator against the sequence and returns a boolean for each pair of elements (as before, the first element of each row is compared with the first element of the sequence to match, and so on).
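A minimal sketch of the indexing and comparison step. The arrays data and sequence are hypothetical values chosen to be consistent with the boolean output shown above, and the names window_ind, windows and matches are illustrative:

```python
import numpy as np

# Hypothetical data and sequence.
data = np.array([1, 3, 7, 9, 5, 3, 6, 9, 5, 5, 2])
sequence = np.array([9, 5])

window_ind = np.arange(10).reshape(-1, 1) + np.arange(2)  # shape (10, 2)

# Integer-array indexing: the result has the shape of window_ind, not of data.
windows = data[window_ind]

# (10, 2) == (2,) broadcasts the sequence over every row.
matches = windows == sequence
print(matches.shape)
```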
We now look for the rows with a perfect match, i.e. rows containing only True. Use the np.all function for this.
array([False, False, False,  True, False, False, False,  True, False,
       False])
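The row-wise reduction might be written as follows, again on hypothetical data (the name row_match is illustrative). Note the axis=1 argument: without it, np.all reduces over the whole array and returns a single boolean.

```python
import numpy as np

# Hypothetical data and sequence.
data = np.array([1, 3, 7, 9, 5, 3, 6, 9, 5, 5, 2])
sequence = np.array([9, 5])

window_ind = np.arange(10).reshape(-1, 1) + np.arange(2)
matches = data[window_ind] == sequence

# Keep a row only if every element of the window matches.
row_match = np.all(matches, axis=1)
print(row_match)
```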
Finally, extract the indices where there is a match with np.nonzero.
array([3, 7])
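Putting the steps together, one possible version of the whole function is sketched below. The function name find_sequence and the example arrays are hypothetical. Since np.nonzero returns a tuple with one array per dimension, the sketch takes its first element.

```python
import numpy as np

def find_sequence(data, sequence):
    """Return the start indices of every occurrence of sequence in data."""
    seq_ind = np.arange(sequence.size)
    data_ind = np.arange(data.size - sequence.size + 1).reshape(-1, 1)
    # Compare every window of data with the sequence through broadcasting.
    matches = data[data_ind + seq_ind] == sequence
    # np.nonzero returns a tuple of index arrays; the input here is 1-D.
    return np.nonzero(np.all(matches, axis=1))[0]

# Hypothetical example values.
data = np.array([1, 3, 7, 9, 5, 3, 6, 9, 5, 5, 2])
sequence = np.array([9, 5])

result = find_sequence(data, sequence)
print(result)
```

This version trades memory for speed: it materializes an intermediate array of shape (data_size-seq_size+1, seq_size), which is fine for short sequences but may be costly for long ones.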
Numpy Workout. Advanced Programming and Parallel Computing, Master 2 MIASHS