Evaluation of task robustness
We have a CLI tool to evaluate the robustness of the Reconstruction & Quantification (R&Q) tasks.
Indeed, some tasks like `Colmap` are known to be stochastic, notably the feature detection and matching steps, which use non-deterministic algorithms.
Hereafter we give a design overview, some examples of task evaluation, and the reference API.
Design overview
To evaluate the robustness of a task, we take a simple and empirical approach: repeat the task and analyze the variability of the results.
To do so we follow these steps:
1. copy the selected scan dataset to a temporary database (and clean it from previous R&Q results if necessary)
2. run the R&Q pipeline up to the upstream task of the task to evaluate, if any upstream task exists
3. replicate (copy) this result to an evaluation database as many times as requested (by the `-n` option, defaults to `30`)
4. run the task to evaluate on each replicated scan dataset
5. compare the directories and files of the task to evaluate pair by pair
6. use the comparison metrics for the task to evaluate, as defined in `robustness_evaluation.json`
Please note that:
- at step 3, a replicate id is appended to the replicated scan dataset name
- directory comparisons are done at the scale of the files generated by the selected task
- we use metrics to get a quantitative comparison on the output of the task (a sketch of such a pairwise comparison is given after this list)
- it is possible to create fully independent repetitions by running the whole R&Q pipeline on each scan dataset using the `--full-pipe` option (or the shorter `-f`)
- in order to use the ML-based R&Q pipeline, you will have to:
    - create an output directory
    - use the `--models` argument to copy the CNN trained models
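To make the pairwise comparison step more concrete, here is a minimal, hypothetical sketch of how replicate outputs could be compared pair by pair with a scalar metric; the `replicates` values and the `metric` function are purely illustrative and not part of the actual tool, whose metrics are defined in `robustness_evaluation.json`.

```python
from itertools import combinations
from statistics import mean, stdev

# Hypothetical scalar output per replicated scan dataset (e.g. a measurement
# extracted from the files produced by the evaluated task).
replicates = {"eval_dataset_1": 12.3, "eval_dataset_2": 12.1, "eval_dataset_3": 12.6}

def metric(a, b):
    """Illustrative comparison metric: absolute difference between two outputs."""
    return abs(a - b)

# Compare the replicates pair by pair, then summarize the variability.
pairwise = {(i, j): metric(replicates[i], replicates[j])
            for i, j in combinations(replicates, 2)}
print(f"mean pairwise difference: {mean(pairwise.values()):.3f}")
print(f"std of pairwise differences: {stdev(pairwise.values()):.3f}")
```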
Evaluate the robustness of the Structure from Motion algorithms
To evaluate the robustness of the Structure from Motion algorithms, one option is to independently reproduce the same estimation of the camera intrinsic and extrinsic parameters on several datasets.
Camera extrinsics can be compared using their Euclidean distance to the expected CNC poses and to the median poses obtained from Colmap. Camera intrinsics can be compared, for a given camera model, using simple statistics like the deviation from the mean value of each parameter.
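As an illustration, a minimal sketch of these two comparisons could look as follows; the pose and intrinsics arrays are random placeholders standing in for actual Colmap outputs.

```python
import numpy as np

rng = np.random.default_rng(0)
estimated_poses = rng.random((30, 72, 3))   # placeholder: 30 replicates x 72 camera positions (XYZ)
expected_cnc_poses = rng.random((72, 3))    # placeholder: camera poses recorded by the CNC

# Extrinsics: Euclidean distance of each estimated position to the expected CNC pose...
dist_to_cnc = np.linalg.norm(estimated_poses - expected_cnc_poses, axis=-1)
# ...and to the median pose obtained from Colmap over the replicates.
median_poses = np.median(estimated_poses, axis=0)
dist_to_median = np.linalg.norm(estimated_poses - median_poses, axis=-1)
print(f"mean distance to CNC poses: {dist_to_cnc.mean():.4f}")
print(f"mean distance to median Colmap poses: {dist_to_median.mean():.4f}")

# Intrinsics: deviation from the mean value of each parameter, for a given camera model
# (here a placeholder [f, cx, cy] per replicate).
intrinsics = rng.random((30, 3))
deviation_from_mean = intrinsics - intrinsics.mean(axis=0)
print("max absolute deviation from the mean:", np.abs(deviation_from_mean).max(axis=0))
print("per-parameter standard deviation:", intrinsics.std(axis=0))
```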
Command line
To evaluate the robustness of the Structure from Motion algorithms defined in the `Colmap` task, proceed as follows:
```bash
robustness_evaluation Colmap /Data/ROMI/eval_dataset_* \
--config configs/geom_pipe_real.toml --clean --suffix exhaustive_matcher -n 50
```
Explanations
Let's break it down:
- we use a list of scan datasets matching the UNIX glob expression `eval_dataset_*`, accessible under `/Data/ROMI/`
- we use the `geom_pipe_real.toml` from the `configs` directory of the `plant-3d-vision` Python module
- we request a `Clean` task to be performed prior to the robustness evaluation with the `--clean` option
- we append `exhaustive_matcher` as a suffix to the evaluation database name with `--suffix exhaustive_matcher`
- we request 50 repetitions of the `Colmap` task with `-n 50`
Note
You may specify a list of scan datasets to use instead of a UNIX glob expression.
It should be provided as a space-separated list of directory paths, e.g. `robustness_evaluation Colmap /Data/ROMI/eval_dataset_2 /Data/ROMI/eval_dataset_3`.
What to expect
The evaluation database, that is the database containing the replicated & processed scan datasets, will be located under `/Data/ROMI` and should be named something like `YYYY.MM.DD_HH.MM_Eval_Colmap_exhaustive_matcher`.
Within that directory you should also find the outputs of the evaluation methods.
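The name is built from the `--date-fmt`, `--no-date` and `--suffix` options described in the reference API below. A minimal sketch of how such a name could be composed, assuming a simple `<date>_Eval_<task>_<suffix>` concatenation scheme, is:

```python
from datetime import datetime

def eval_db_name(task, suffix="", date_fmt="%Y.%m.%d_%H.%M", no_date=False):
    """Compose an evaluation database name, assuming a '<date>_Eval_<task>_<suffix>' scheme."""
    parts = [] if no_date else [datetime.now().strftime(date_fmt)]
    parts.append(f"Eval_{task}")
    if suffix:
        parts.append(suffix)
    return "_".join(parts)

print(eval_db_name("Colmap", "exhaustive_matcher"))
# e.g. '2024.05.17_14.32_Eval_Colmap_exhaustive_matcher'
```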
Evaluate the robustness of the geometry-based reconstruction pipeline
To evaluate the robustness of the geometry-based reconstruction pipeline, one option is to independently run the same workflow multiple times and check whether we get a consistent point cloud.
The distance between point clouds can be evaluated using a chamfer distance.
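For reference, a minimal chamfer distance between two point clouds could be computed as in the sketch below; this is a generic NumPy/SciPy illustration, not necessarily the tool's actual implementation.

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(pcd_a, pcd_b):
    """Symmetric chamfer distance between two (N, 3) and (M, 3) point clouds."""
    # Mean distance from each point of A to its nearest neighbour in B, and vice versa.
    dist_a_to_b, _ = cKDTree(pcd_b).query(pcd_a)
    dist_b_to_a, _ = cKDTree(pcd_a).query(pcd_b)
    return dist_a_to_b.mean() + dist_b_to_a.mean()

# Toy example with two random point clouds standing in for two replicates:
rng = np.random.default_rng(0)
pcd_1, pcd_2 = rng.random((1000, 3)), rng.random((1200, 3))
print(f"chamfer distance: {chamfer_distance(pcd_1, pcd_2):.4f}")
```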
Command line
To evaluate the robustness of the geometry-based reconstruction pipeline, defined by using the `geom_pipe_real.toml` configuration up to the `PointCloud` task, proceed as follows:
```bash
robustness_evaluation PointCloud /Data/ROMI/eval_dataset/ \
--config configs/geom_pipe_real.toml --full-pipe --clean --suffix independent_reconstruction -n 50
```
Explanations
Let's break it down:
- we use a scan dataset named `eval_dataset` accessible under `/Data/ROMI/`
- we use the `geom_pipe_real.toml` from the `configs` directory of the `plant-3d-vision` Python module
- we run the geometry-based reconstruction pipeline independently on each replicate with the `--full-pipe` option
- we request a `Clean` task to be performed prior to the robustness evaluation with the `--clean` option
- we append `independent_reconstruction` as a suffix to the evaluation database name with `--suffix independent_reconstruction`
- we request 50 repetitions of the `PointCloud` task with `-n 50`
What to expect
The evaluation database, that is the database containing the replicated & processed scan datasets, will be located under `/Data/ROMI` and should be named something like `YYYY.MM.DD_HH.MM_Eval_PointCloud_independent_reconstruction`.
Within that directory you should also find the outputs of the evaluation methods.
Reference API
Please note that this might not be 100% accurate as it is a copy/paste from the terminal and the API might evolve.
To be absolutely sure of the API, use the `robustness_evaluation -h` command.
```text
usage: robustness_evaluation [-h] [--config CONFIG] [-n N_REPLICATES] [-c] [-f] [-np]
[--suffix SUFFIX] [--eval-db EVAL_DB] [--date-fmt DATE_FMT] [--no-date]
[--models MODELS]
[--log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}]
task [dataset_path ...]
positional arguments:
task Task to evaluate, should be in: AnglesAndInternodes, ClusteredMesh, Colmap,
CurveSkeleton, ExtrinsicCalibration, IntrinsicCalibration, Masks,
OrganSegmentation, PointCloud, Segmentation2D, Segmentation2d,
SegmentedPointCloud, TreeGraph, TriangleMesh, Undistorted, Voxels
dataset_path Path to scan dataset to use for task robustness evaluation.
optional arguments:
-h, --help show this help message and exit
--config CONFIG Path to the pipeline TOML configuration file.
Evaluation options:
-n N_REPLICATES, --n_replicates N_REPLICATES
Number of replicates to create for task robustness evaluation. Defaults to `30`.
-c, --clean Run a Clean task on scan dataset prior to duplication.
-f, --full-pipe Use this to run the whole pipeline independently for each replicate. Else the task to evaluate is run on clones of the results from the upstream task, if any.
-np, --no-pipeline Do not run the pipeline, only compare tasks outputs. Use with `--eval-db` to rerun this code on an existing test evaluation database!
Database options:
--suffix SUFFIX Suffix to append to the evaluation database directory to create.
--eval-db EVAL_DB Existing evaluation database location to use. Use with `-np` to rerun this code on an existing test evaluation database!
--date-fmt DATE_FMT Datetime format to use as prefix for the name of the evaluation database directory to create. Defaults to `"%Y.%m.%d_%H.%M"`.
--no-date Do not add the datetime as prefix to the name of the evaluation database directory to create.
Other options:
--models MODELS Models database location to use with ML pipeline.
--log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
Set message logging level. Defaults to `INFO`.
```