Evaluation of task robustness
We have a CLI tool to evaluate the robustness of the Reconstruction & Quantification (R&Q) tasks.
Indeed, some tasks like `Colmap` are known to be stochastic, notably the feature detection and matching steps, which use non-deterministic algorithms.
Hereafter we give a design overview, some examples of task evaluation, and the reference API.
Design overview
To evaluate the robustness of a task, we take a simple and empirical approach: repeat the task and analyze the variability of the results.
To do so we follow these steps:
1. copy the selected scan dataset to a temporary database (and clean it from previous R&Q results if necessary)
2. run the R&Q pipeline up to the upstream task of the task to evaluate, if any upstream task exists
3. replicate (copy) this result to an evaluation database as many times as requested (by the `-n` option, defaults to `30`)
4. run the task to evaluate on each replicated scan dataset
5. compare the directories and files of the task to evaluate pair by pair
6. use the comparison metrics for the task to evaluate, as defined in `robustness_evaluation.json`
Please note that:
- at step 3, a replicate id is appended to the replicated scan dataset name
- directory comparisons are done at the scale of the files generated by the selected task
- we use metrics to get a quantitative comparison on the output of the task (a sketch of such a pairwise comparison is given after this list)
- it is possible to create fully independent repetitions by running the whole R&Q pipeline on each scan dataset using the `--full-pipe` option (or the shorter `-f`)
- in order to use the ML-based R&Q pipeline, you will have to:
    - create an output directory
    - use the `--models` argument to copy the CNN trained models
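To make the pairwise comparison step more concrete, here is a minimal, hypothetical sketch of how replicate outputs could be compared pair by pair with a scalar metric; the `replicates` values and the `metric` function are purely illustrative and not part of the actual tool, whose metrics are defined in `robustness_evaluation.json`.

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical scalar output per replicated scan dataset (e.g. a measurement
# extracted from the files produced by the evaluated task).
replicates = {"eval_dataset_1": 12.3, "eval_dataset_2": 12.1, "eval_dataset_3": 12.6}

def metric(a, b):
    """Illustrative comparison metric: absolute difference between two outputs."""
    return abs(a - b)

# Compare the replicates pair by pair, then summarize the variability.
pairwise = {(i, j): metric(replicates[i], replicates[j])
            for i, j in combinations(replicates, 2)}
print(f"mean pairwise difference: {mean(pairwise.values()):.3f}")
print(f"std of pairwise differences: {stdev(pairwise.values()):.3f}")
```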
Evaluate the robustness of the Structure from Motion algorithms
To evaluate the robustness of the Structure from Motion algorithms, one option is to independently reproduce the same estimation of the camera intrinsic and extrinsic parameters on several datasets.
Camera extrinsics can be compared using their Euclidean distance to the expected CNC poses and to the median poses obtained from Colmap. Camera intrinsics can be compared, for a given camera model, using simple statistics like the deviation from the mean value of each parameter.
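As an illustration, a minimal sketch of these two comparisons could look as follows; the pose and intrinsics arrays are random placeholders standing in for actual Colmap outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
estimated_poses = rng.random((30, 72, 3))   # placeholder: 30 replicates x 72 camera positions (XYZ)
expected_cnc_poses = rng.random((72, 3))    # placeholder: camera poses recorded by the CNC

# Extrinsics: Euclidean distance of each estimated position to the expected CNC pose...
dist_to_cnc = np.linalg.norm(estimated_poses - expected_cnc_poses, axis=-1)
# ...and to the median pose obtained from Colmap over the replicates.
median_poses = np.median(estimated_poses, axis=0)
dist_to_median = np.linalg.norm(estimated_poses - median_poses, axis=-1)
print(f"mean distance to CNC poses: {dist_to_cnc.mean():.4f}")
print(f"mean distance to median Colmap poses: {dist_to_median.mean():.4f}")

# Intrinsics: deviation from the mean value of each parameter, for a given camera model
# (here a placeholder [f, cx, cy] per replicate).
intrinsics = rng.random((30, 3))
deviation_from_mean = intrinsics - intrinsics.mean(axis=0)
print("max absolute deviation from the mean:", np.abs(deviation_from_mean).max(axis=0))
print("per-parameter standard deviation:", intrinsics.std(axis=0))
```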
Command line
To evaluate the robustness of the Structure from Motion algorithms defined in the `Colmap` task, proceed as follows:
```bash
robustness_evaluation Colmap /Data/ROMI/eval_dataset_* \
--config configs/geom_pipe_real.toml --clean --suffix exhaustive_matcher -n 50
```
Explanations
Let's break it down:
- we use a list of scan datasets matching the UNIX glob expression `eval_dataset_*`, accessible under `/Data/ROMI/`
- we use the `geom_pipe_real.toml` from the `configs` directory of the `plant-3d-vision` Python module
- we request a `Clean` task to be performed prior to the robustness evaluation with the `--clean` option
- we append `exhaustive_matcher` as a suffix to the evaluation database name with `--suffix exhaustive_matcher`
- we request 50 repetitions of the `Colmap` task with `-n 50`
Note
You may specify a list of scan datasets to use instead of a UNIX glob expression.
It should be provided as a space-separated list of directory paths, e.g. `robustness_evaluation Colmap /Data/ROMI/eval_dataset_2 /Data/ROMI/eval_dataset_3`.
What to expect
The evaluation database, that is the database containing the replicated & processed scan datasets, will be located under `/Data/ROMI` and should be named something like `YYYY.MM.DD_HH.MM_Eval_Colmap_exhaustive_matcher`.
Within that directory you should also find the outputs of the evaluation methods.
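The name is built from the `--date-fmt`, `--no-date` and `--suffix` options described in the reference API below. A minimal sketch of how such a name could be composed, assuming a simple `<date>_Eval_<task>_<suffix>` concatenation scheme, is:

```python
from datetime import datetime

def eval_db_name(task, suffix="", date_fmt="%Y.%m.%d_%H.%M", no_date=False):
    """Compose an evaluation database name, assuming a '<date>_Eval_<task>_<suffix>' scheme."""
    parts = [] if no_date else [datetime.now().strftime(date_fmt)]
    parts.append(f"Eval_{task}")
    if suffix:
        parts.append(suffix)
    return "_".join(parts)

print(eval_db_name("Colmap", "exhaustive_matcher"))
# e.g. '2024.05.17_14.32_Eval_Colmap_exhaustive_matcher'
```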
Evaluate the robustness of the geometry-based reconstruction pipeline
To evaluate the robustness of the geometry-based reconstruction pipeline, one option is to independently run the same workflow multiple times and check whether we get a consistent point cloud.
The distance between point clouds can be evaluated using a chamfer distance.
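For reference, a minimal chamfer distance between two point clouds could be computed as in the sketch below; this is a generic NumPy/SciPy illustration, not necessarily the tool's actual implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(pcd_a, pcd_b):
    """Symmetric chamfer distance between two (N, 3) and (M, 3) point clouds."""
    # Mean distance from each point of A to its nearest neighbour in B, and vice versa.
    dist_a_to_b, _ = cKDTree(pcd_b).query(pcd_a)
    dist_b_to_a, _ = cKDTree(pcd_a).query(pcd_b)
    return dist_a_to_b.mean() + dist_b_to_a.mean()

# Toy example with two random point clouds standing in for two replicates:
rng = np.random.default_rng(0)
pcd_1, pcd_2 = rng.random((1000, 3)), rng.random((1200, 3))
print(f"chamfer distance: {chamfer_distance(pcd_1, pcd_2):.4f}")
```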
Command line
To evaluate the robustness of the geometry-based reconstruction pipeline, defined by using the `geom_pipe_real.toml` configuration up to the `PointCloud` task, proceed as follows:
```bash
robustness_evaluation PointCloud /Data/ROMI/eval_dataset/ \
--config configs/geom_pipe_real.toml --full-pipe --clean --suffix independent_reconstruction -n 50
```
Explanations
Let's break it down:
- we use a scan dataset named `eval_dataset` accessible under `/Data/ROMI/`
- we use the `geom_pipe_real.toml` from the `configs` directory of the `plant-3d-vision` Python module
- we run the geometry-based reconstruction pipeline independently on each replicate with the `--full-pipe` option
- we request a `Clean` task to be performed prior to the robustness evaluation with the `--clean` option
- we append `independent_reconstruction` as a suffix to the evaluation database name with `--suffix independent_reconstruction`
- we request 50 repetitions of the `PointCloud` task with `-n 50`
What to expect
The evaluation database, that is the database containing the replicated & processed scan datasets, will be located under `/Data/ROMI` and should be named something like `YYYY.MM.DD_HH.MM_Eval_PointCloud_independent_reconstruction`.
Within that directory you should also find the outputs of the evaluation methods.
Reference API
Please note that this might not be 100% accurate as it is a copy/paste from the terminal and the API might evolve.
To be absolutely sure of the API, use the `robustness_evaluation -h` command.
```text
usage: robustness_evaluation [-h] [--config CONFIG] [-n N_REPLICATES] [-c] [-f] [-np]
[--suffix SUFFIX] [--eval-db EVAL_DB] [--date-fmt DATE_FMT] [--no-date]
[--models MODELS]
[--log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}]
task [dataset_path ...]
positional arguments:
task Task to evaluate, should be in: AnglesAndInternodes, ClusteredMesh, Colmap,
CurveSkeleton, ExtrinsicCalibration, IntrinsicCalibration, Masks,
OrganSegmentation, PointCloud, Segmentation2D, Segmentation2d,
SegmentedPointCloud, TreeGraph, TriangleMesh, Undistorted, Voxels
dataset_path Path to scan dataset to use for task robustness evaluation.
optional arguments:
-h, --help show this help message and exit
--config CONFIG Path to the pipeline TOML configuration file.
Evaluation options:
-n N_REPLICATES, --n_replicates N_REPLICATES
Number of replicates to create for task robustness evaluation. Defaults to `30`.
-c, --clean Run a Clean task on scan dataset prior to duplication.
-f, --full-pipe Use this to run the whole pipeline independently for each replicate. Else the task to evaluate is run on clones of the results from the upstream task, if any.
-np, --no-pipeline Do not run the pipeline, only compare tasks outputs. Use with `--eval-db` to rerun this code on an existing test evaluation database!
Database options:
--suffix SUFFIX Suffix to append to the evaluation database directory to create.
--eval-db EVAL_DB Existing evaluation database location to use. Use with `-np` to rerun this code on an existing test evaluation database!
--date-fmt DATE_FMT Datetime format to use as prefix for the name of the evaluation database directory to create. Defaults to `"%Y.%m.%d_%H.%M"`.
--no-date Do not add the datetime as prefix to the name of the evaluation database directory to create.
Other options:
--models MODELS Models database location to use with ML pipeline.
--log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
Set message logging level. Defaults to `INFO`.
```