Velocimetry#
Estimating surface velocity is one of the core methods of pyorc. Typically, surface velocity is derived from a set of frames by analyzing frame-to-frame movements in time. This can be done with several methods; pyorc currently implements one of these, called “Particle Image Velocimetry”.
Particle Image Velocimetry#
Particle Image Velocimetry (PIV) uses cross-correlation methods to estimate the most likely displacement of an observed pattern from frame to frame. Therefore, there is no interdependency over more than 2 frames. To observe patterns, the total area of interest is subdivided into small, overlapping blocks of pixels, also called “interrogation windows”. Currently, the overlap is fixed in pyorc at 50% of each block. Say that you have reprojected your data to a 0.02 m resolution and you select an interrogation window of 20 pixels; each interrogation window will then be 0.02 * 20 = 0.4 m in size in both x- and y-direction. The next window will share 10 pixels with its neighbouring window, and hence a velocity will be resolved for each 0.2 x 0.2 meters.
If you do not provide any video and frames section in your recipe, then pyorc will simply perform velocimetry on black-and-white frames, reprojected following the camera configuration as-is.
The only direct requirement for calculating velocimetry is that you have:
- opened a video with pyorc.Video (e.g. into an object called video) using a video file and a camera configuration (see Camera configurations);
- retrieved frames from it using video.get_frames (e.g. into a DataArray called frames);
- orthorectified the frames using frames.project (e.g. into frames_proj).
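In the API these three steps look roughly as follows. This is a minimal sketch: the video file name and the cam_config object are placeholders, and the exact keyword arguments may differ per pyorc version.
import pyorc

# open a video together with a camera configuration; cam_config is a placeholder
# for a camera configuration object prepared as described in Camera configurations
video = pyorc.Video("movie.mp4", camera_config=cam_config, start_frame=0, end_frame=125)
# retrieve the frames into a DataArray
frames = video.get_frames()
# orthorectify the frames following the camera configuration
frames_proj = frames.project()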
We implemented several engines to compute PIV. The default is openpiv, which uses the OpenPIV library to perform computations. The alternatives are numba and numpy, which use the numba and numpy libraries for the computations, as provided in the underlying FF-PIV library. numba is by far the fastest option, and will likely become the default in the future.
Naturally, before orthorectification you may decide to apply one or several preprocessing methods, as described in Frames, to improve the visibility of tracers. We highly recommend this in many cases, as shown in the snippets in Frames.
Similar to the reprojection step described in the frames section, not supplying any details at all will already lead to derivation of velocimetry results. In this case, the information in the camera configuration will be used to inform the window size of the velocimetry. If you wish to override this, you can supply an extra section in the recipe as follows:
velocimetry:
  get_piv:
    window_size: 64
    engine: numba
Here, engine: numba ensures that the much faster numba implementation is used. window_size: 64 overrides any window size provided in the camera configuration and sets it to 64 pixels. When using the engine parameter with numba or numpy, you can also provide memory safety margins by manually adjusting the chunk size with the chunksize parameter. If you notice that a memory warning is given and chunksize is set to 5, try manually setting it to e.g. 3 or 2. The default should be reasonably safe, so we hope you never have to touch this parameter at all :-) unless you process very, very large videos with very large objectives.
Note
It may seem a little superfluous to have a section called velocimetry, then a subsection get_piv, and then the flags used in get_piv; however, we wish to keep the option open to add other velocimetry methods here. Perhaps in the near future we may offer a method get_ptv to use Particle Tracking Velocimetry instead of Particle Image Velocimetry. This method follows particles instead of using cross-correlation.
Getting surface velocity from the orthoprojected set of frames stored in object frames_proj is as easy as calling
piv = frames_proj.get_piv()
The size of the interrogation window can already be set in the camera configuration, as shown in Camera configurations.
However, you may also provide a different window size on the fly using the keyword window_size:
# ignore the window size in the camera config and set this to 64 pixels
piv = frames_proj.get_piv(window_size=64)
Use the engine parameter to select a much faster computational engine. With engine="numba" a very fast numba-based computation will be used. The computation will be chunked into several batches based on available memory. If you find that your computations crash, it is likely due to lack of memory. In this case you can override the automatically computed chunk size by setting chunksize to a smaller or user-defined amount. You may also set memory_factor to a higher amount than the default (2). memory_factor decides on the fraction of the memory reserved for one entire chunk of computation; e.g. by setting it to 4, only 1/4th of the available memory is assumed to be available. In practice, for large problems, more temporary memory storage is needed within the subprocessing.
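Putting these keywords together, a call may look as follows. This is a sketch: the values are illustrative only and should be tuned to your problem size and hardware.
# use the fast numba engine, a manually reduced chunk size and a more
# conservative memory factor (all values illustrative)
piv = frames_proj.get_piv(
    window_size=64,
    engine="numba",
    chunksize=3,
    memory_factor=4,
)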
Interrogating and storing PIV results#
The results of the velocimetry processing will contain grids for each frame (minus one, because frame pairs are needed). These results can be stored so that they can be interrogated later by other software, or used for plotting.
If you add the subsection write to the section velocimetry, then results will be written to disk automatically. The results will then be stored in the output folder (passed in the command) under the name convention <prefix>piv.nc, where prefix is supplied on the command line using the argument -p or --prefix. If you do not supply this, then the results will simply be stored in piv.nc.
The object piv is a normal xarray.Dataset object. Therefore, you can use any xarray functionality to interrogate the data. An important piece of functionality that you may require, for instance, is to reduce the data to a time-averaged mean or to one or more quantiles such as a median. This is important for instance when you want to plot results in a spatial view:
# derive the mean over time
piv_mean = piv.mean(dim="time")
# derive the median
piv_median = piv.median(dim="time")
If you apply such reducers, xarray can no longer guarantee that the metadata attributes of your data variables remain valid. Therefore, you normally lose metadata that is important, for instance, to reproject data onto the camera perspective. We therefore highly recommend applying such reducers with keep_attrs=True to prevent important attributes from getting lost in the process.
# derive the mean over time, while keeping the attributes in place
piv_mean = piv.mean(dim="time", keep_attrs=True)
# derive the median, while keeping the attributes in place
piv_median = piv.median(dim="time", keep_attrs=True)
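Quantiles other than the median work the same way. The snippet below is a sketch using plain xarray functionality; the chosen quantiles are illustrative.
# derive the 5th, 50th and 95th percentiles over time, keeping the attributes in place
piv_quantiles = piv.quantile([0.05, 0.5, 0.95], dim="time", keep_attrs=True)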
Storing your piv results, either with the time dimension in place or after applying reducers, can also be done. We recommend using the NetCDF standard as the data model; pyorc also follows the Climate and Forecast conventions.
# store results in a file, this will take a while
piv.to_netcdf("piv_results.nc")
PIV computations take the longest of all processing steps. It is therefore normal that you will need to wait a while before data is returned. This may take a few seconds for small problems, but for large areas of interest or large amounts of time steps (or a slow machine with little memory) it can also take half an hour or longer. With the new (>=0.7.0) numba and numpy engines, you can keep track of progress with a progress bar which is automatically displayed during processing.
If you wish to load your results into memory after having stored them in a previous session, you can simply use xarray functionality to do so.
import xarray as xr
piv = xr.open_dataset("piv_results.nc")
piv
Masking spurious velocities#
In many cases, you may find that velocities are not accurately resolved, either consistently in a given location or region in the area of interest, or in specific time steps for given frame-to-frame results. Nevertheless, the get_piv method will return results in those cases, even though these may be incorrect or very inaccurate. Causes for such spurious or poorly estimated velocities may be:
- very little visible patterns available to trace: this can cause many moments in time in which no velocities are observed. If only sometimes a traceable pattern passes by, longer integration time may be needed, e.g. 30 or 60 seconds. With low flow velocities (typically 0.5 m/s or lower) longer integration times are often needed to capture enough valid velocities.
- poor light conditions, e.g. too dark: causes patterns to be difficult to distinguish. The pre-processing method for edge detection described here is useful in this case to strengthen gradients in the patterns before estimating velocities.
- very strong visibility of the bottom: causes patterns on the surface to be more difficult to distinguish from non-moving bottom patterns. In part this can be resolved with the normalization method, also described here.
- wind: you may find very nice velocity vectors which show a very different process than what you are looking for. Especially when the wind waves are oriented in the same direction as the flow, this is very difficult to resolve.
- poor quality footage: water is typically a relatively uniform and relatively dark surface. If your footage has a low bitrate (e.g. 1080p with 2 Mbps), then the compression algorithm used will usually decide that the water surface contains very little interesting information to store. This results in strong loss of visibility of patterns and hence poor results, usually resulting in underestimation of velocities.
Note
Cheap IP cameras are notorious candidates for poor quality videos and underestimation of velocities and river discharge. If you use an IP camera, then look for one that can record in 1080p at a bit rate in the order of 20 Mbps.
To separate valid velocities from spurious ones, we have developed a number of methods, referred to as “masking” methods, to remove spurious velocities. As there are many masking methods available, we refer to the API description for how to apply each specific masking method, also for the command-line interface. Here we provide a general description of how to apply masks and what to be aware of.
Note
To understand in detail how a mask works, please read the individual masking methods in the masks section in the API description. In the command-line recipe, you may supply a mask within a mask group using its name as defined in the masks. The API description shows the mask name as a method under a class. For instance, the minmax mask is referred to as pyorc.Velocimetry.mask.minmax. In the .yml file containing your recipe, you must simply insert minmax, i.e. the last part of the method name. The arguments defined under Parameters in the description can be supplied in the .yml recipe as key-value pairs, as further exemplified in the sections below.
Usually one will use a set of masks, either organized in combination or in cascade (and both at the same time is also possible!), to improve the results. Using a combination or a cascade can lead to quite different results.
Independent masks#
With this approach, you first assemble a set of masks by analyzing your raw results several times independently. Only after having derived the masks do you apply them to your data in one go. Below we show a small example of how that works, first as a recipe section and then through the API.
mask:
  # we make one mask group, that combines a number of masks, and applies them in one go
  combined_mask:
    # get a mask to remove values that are based on a too low correlation
    corr:
      tolerance: 0.3
    # get a mask to remove velocities that are lower or higher than a user-defined threshold (default 0.1 and 5 m s-1)
    minmax:
    # get a mask for outliers, that deviate a lot from the mean, measured in standard deviations in time
    outliers:
    # count per grid cell how many valid (i.e. non-masked) values we have; only when this is above 50% do we trust
    # the results
    count:
      tolerance: 0.5
# get a mask to remove values that are based on a too low correlation
mask_corr = piv.velocimetry.mask.corr(tolerance=0.3)
# get a mask to remove velocities that are lower or higher than a user-defined threshold (default 0.1 and 5 m s-1)
mask_minmax = piv.velocimetry.mask.minmax()
# get a mask for outliers, that deviate a lot from the mean, measured in standard deviations in time
mask_outliers = piv.velocimetry.mask.outliers()
# count per grid cell how many valid (i.e. non-masked) values we have; only when this is above 50% do we trust
# the results
mask_count = piv.velocimetry.mask.count(tolerance=0.5)
# now apply the resulting masks
piv_masked = piv.velocimetry.mask([
    mask_corr,
    mask_minmax,
    mask_outliers,
    mask_count
])
In this example, the order in which we derive the masks does not matter. This is because we only apply the masks to the data at the very end. Following this approach, the last mask method count will not do anything, because it is basically derived from the raw results, which do not contain any masked-out values yet. Hence, in many cases it may make sense to first apply a set of masks, for instance those that work on individual values rather than using a full analysis in time or a neighbourhood analysis of neighbouring grid cells, and only after that apply other masks that use counts of valid values, check how well neighbouring values match the value under consideration, or compute standard deviations or variance in time to evaluate how valid a velocity may be.
Conditional masking by cascading masks#
Therefore, we recommend considering cascades of masks, so that already-applied masks influence the result of later-applied masks for which an analysis of the values through time is essential. For instance, the mask method outliers checks for each grid cell what the mean and standard deviation of velocities through time is, and then assesses which velocity values are above or below a certain amount of standard deviations. If this mask method is applied before any of the masks that work on individual values, then the outliers that may have been removed with those masks will influence the results of this mask, making it less effective. Cascading can be done by first applying one mask or a group of masks, and then applying another single mask or group of masks to the result.
In your recipe you can supply several mask groups. Below we show how that works. Each mask group has a unique name that you can decide upon yourself.
mask:
  # we make one mask group, that combines a number of masks, and applies them in one go
  mask_with_independent_vals:
    # get a mask to remove values that are based on a too low correlation
    corr:
      tolerance: 0.3
    # get a mask to remove velocities that are lower or higher than a user-defined threshold (default 0.1 and 5 m s-1)
    minmax:
  # another mask group is defined below, which will be applied after imposing the masks from mask group
  # mask_with_independent_vals
  mask_outliers:
    # directly apply a mask for outliers, that deviate a lot from the mean, measured in standard deviations in time
    outliers:
  # then after applying mask_outliers, many values may have been removed. Now we can effectively also
  # count remaining valid values per grid cell and decide if we are satisfied with this or not.
  mask_count:
    # count per grid cell how many valid (i.e. non-masked) values we have; only when this is above
    # 50% do we trust the results
    count:
      tolerance: 0.5
Within the API, you may either derive a few masks, apply them, and then derive more masks using the results of the first masking, or simply use the inplace=True flag, which immediately overwrites the velocity vectors with missings where the mask indicates so. Below we show how that works.
# directly apply a mask to remove values that are based on a too low correlation
piv.velocimetry.mask.corr(tolerance=0.3, inplace=True)
# directly apply a mask to remove velocities that are lower or higher than a user-defined threshold (default 0.1 and 5 m s-1)
piv.velocimetry.mask.minmax(inplace=True)
# directly apply a mask for outliers, that deviate a lot from the mean, measured in standard deviations in time
piv.velocimetry.mask.outliers(inplace=True)
# directly apply another mask that removes grid cells entirely when their variance is deemed too high to be trustworthy
piv.velocimetry.mask.variance(inplace=True)
# count per grid cell how many valid (i.e. non-masked) values we have; only when this is above 50% do we trust
# the results
piv.velocimetry.mask.count(tolerance=0.5, inplace=True)
In this case, as masks are already applied before count is called, count will have an effect!
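If you prefer not to overwrite piv in place, the same cascade can be built by applying a first set of masks and deriving the later masks from the already-masked result. The sketch below only uses the accessor pattern shown above.
# first derive and apply the masks that work on individual values
mask_corr = piv.velocimetry.mask.corr(tolerance=0.3)
mask_minmax = piv.velocimetry.mask.minmax()
piv_step1 = piv.velocimetry.mask([mask_corr, mask_minmax])
# then derive the time-dependent masks from the already-masked result
mask_outliers = piv_step1.velocimetry.mask.outliers()
mask_count = piv_step1.velocimetry.mask.count(tolerance=0.5)
piv_masked = piv_step1.velocimetry.mask([mask_outliers, mask_count])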
In our experience, cascading of masks leads to much better results than independently combined masks. Some masks may also be applied on time-averaged data.
Some specific masks#
A few masks are worthwhile to mention specifically, as they may lead to unexpected results if you don’t know how they work.
angle: this mask removes velocities that do not follow an expected flow direction. The default for this is left-to-right oriented flow in the orthorectified x, y grid, with a tolerance of 0.5 * pi (i.e. 90 degrees). This means that if flow in your x, y grid is oriented from bottom to top, or right-to-left, then almost all your velocities will be removed and your filtered result will be empty. In streams with a very clear dominant flow direction, however, this filter is very useful. To ensure your flow follows a left-to-right direction, the selection of corner points in your camera configuration is important. If you select these in the right order, the orientation will be correct. The right order is:
- upstream left bank
- downstream left bank
- downstream right bank
- upstream right bank
The angle masking method can then be applied as follows, with the example showing expected flow in right-to-left direction and a wider tolerance of 0.5 * pi:
mask:
  # give a unique name to the mask group
  mask_angle:
    # then define the name of the mask method and supply arguments if required
    angle:
      expected_angle: -1.57
      angle_tolerance: 1.57
# add inplace=True if you want to apply the mask directly
mask_angle = piv.velocimetry.mask.angle(
    expected_angle=-1.57,
    angle_tolerance=1.57
)
window_median: this mask can only be applied on time-reduced results and analyses values of neighbours (instead of time series) in a certain window defined by parameter wdw. wdw=1 means that a window of one cell left/right/above/below is analysed, resulting in a 3x3 window. If the velocity in the cell under consideration is very different from the mean of its surrounding cells (defined by tolerance and measured as a velocity relative to the mean), the value is removed. Windows can also be defined with specific strides in x and y direction. See spatial masks.
window_nan: this mask can only be applied on time-reduced results and analyses values of neighbours (instead of time series) in a certain window defined by parameter wdw. If there are too many missings in the window, then the value under consideration is also removed. This is meant to remove isolated values. Also described in spatial masks.
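As these two masks only work on time-reduced results, a typical use first reduces over time. The snippet below is a sketch: the tolerance value is illustrative, and we assume the same accessor pattern as for the other masks.
# reduce over time first, keeping attributes so that the velocimetry accessor keeps working
piv_mean = piv_masked.mean(dim="time", keep_attrs=True)
# remove cells that deviate too much from their 3x3 neighbourhood (tolerance illustrative)
mask_median = piv_mean.velocimetry.mask.window_median(wdw=1, tolerance=0.5)
# remove isolated values surrounded by too many missings
mask_nan = piv_mean.velocimetry.mask.window_nan(wdw=1)
# apply both masks on the time-reduced result
piv_clean = piv_mean.velocimetry.mask([mask_median, mask_nan])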