
Evaluating the repeatability of a reconstruction and quantification pipeline

Objective

The reconstruction and quantification pipeline converts a series of RGB images (the output of the plant-imager) into a 3D object, here a plant, with the ultimate goal of extracting quantitative phenotypic information.

Multiple pipelines can be created, as they are composed of a sequence of tasks, each with a specific function. Some of the algorithms used in these tasks are stochastic, so their output may vary even when given the same input. Since this can affect the subsequent quantification, it is important to identify the sources of variability and to quantify them.

Mainly, two things can be evaluated:

* using a dedicated metric (e.g. the chamfer distance on point clouds), quantify the differences in the outputs of a repeated task;
* quantify the final repercussion on the extracted phenotypic traits.
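As an illustration of the first point, the chamfer distance averages nearest-neighbour distances between two point clouds in both directions. The sketch below is only a minimal illustration using numpy and scipy; it is not the pipeline's own implementation, and definitions of the chamfer distance vary (mean vs. sum, plain vs. squared distances).

```python
import numpy as np
from scipy.spatial import cKDTree

def chamfer_distance(cloud_a, cloud_b):
    """Symmetric chamfer distance between two (N, 3) point clouds.

    Mean nearest-neighbour distance from A to B, averaged with the
    mean nearest-neighbour distance from B to A.
    """
    a_to_b, _ = cKDTree(cloud_b).query(cloud_a)  # closest point of B for each point of A
    b_to_a, _ = cKDTree(cloud_a).query(cloud_b)  # closest point of A for each point of B
    return 0.5 * (a_to_b.mean() + b_to_a.mean())

# Toy example standing in for two PointCloud replicates:
rng = np.random.default_rng(0)
cloud_a = rng.random((1000, 3))
cloud_b = cloud_a + rng.normal(scale=1e-3, size=cloud_a.shape)
print(chamfer_distance(cloud_a, cloud_b))
```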

Prerequisite

CLI overview

The robustness_evaluation script has been developed to quantify variability in the reconstruction and quantification pipeline. It may be used to test the variability of a specific task or of the full reconstruction (and quantification) pipeline.

In essence, it compares the outputs of a task given the same input (the previous task's output or the acquisition output, depending on the mode) over a configurable number of replicates.
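Conceptually, the comparison stage applies a metric to every pair of replicated outputs. The following Python sketch only illustrates that idea; the actual script also handles database access, file-by-file directory comparisons and task-specific metrics, and the names used here are purely illustrative.

```python
from itertools import combinations

def pairwise_comparison(replicate_outputs, metric):
    """Apply `metric` to every unordered pair of replicate outputs.

    `replicate_outputs` maps a replicate id (e.g. 'my_scan_0') to the task
    output loaded for that replicate; `metric` is any callable returning a
    scalar, such as a chamfer distance for point clouds.
    """
    return {
        f"{id_a} vs {id_b}": metric(out_a, out_b)
        for (id_a, out_a), (id_b, out_b) in combinations(replicate_outputs.items(), 2)
    }

# Hypothetical usage with three replicates and a trivial scalar "metric":
outputs = {"my_scan_0": 1.00, "my_scan_1": 1.02, "my_scan_2": 0.99}
print(pairwise_comparison(outputs, metric=lambda a, b: abs(a - b)))
```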

robustness_evaluation -h
usage: robustness_evaluation [-h] [-n REPLICATE_NUMBER] [-s SUFFIX] [-f] [-np] [-db TEST_DATABASE] [--models MODELS]
                             [--log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}]
                             scan task config_file

Robustness evaluation of the Reconstruction & Quantification pipelines.

Evaluating the repeatability of a Reconstruction & Quantification (R&Q) pipeline is made as follows:
 1. duplicate the selected scan dataset in a temporary folder (and clean it from previous R&Q if necessary)
 2. run the R&Q pipeline up to the previous task of the selected task to evaluate, if any
 3. copy/replicate this result to a new database (append a replicate id to the dataset name)
 4. run the task to evaluate for each replicated dataset
 5. compare the directories of the task to evaluate pair by pair
 6. apply the comparison metrics for the task to evaluate, as defined in `robustness_evaluation.json` 

Please note that:
 - Directory comparisons are done at the scale of the files generated by the selected task.
 - We use metrics to get a quantitative comparison on the output of the task.
 - It is possible to create fully independent repetitions by running the whole R&Q pipeline using `-f`.
 - In order to use the ML-based R&Q pipeline, you will have to:
   1. create an output directory
   2. use the `--models` argument to copy the CNN trained models

positional arguments:
  scan                  Scan dataset to use for repeatability evaluation.
  task                  Task to test, should be in: AnglesAndInternodes, ClusteredMesh, Colmap, CurveSkeleton,
                        ExtrinsicCalibration, IntrinsicCalibration, Masks, OrganSegmentation, PointCloud,
                        Segmentation2D, Segmentation2d, SegmentedPointCloud, TreeGraph, TriangleMesh, Undistorted,
                        Voxels
  config_file           Path to the pipeline TOML configuration file.

optional arguments:
  -h, --help            show this help message and exit
  -n REPLICATE_NUMBER, --replicate_number REPLICATE_NUMBER
                        Number of replicate to use for repeatability evaluation. Defaults to `30`.
  -s SUFFIX, --suffix SUFFIX
                        Suffix to append to the created database folder.
  -f, --full_pipe       Run the whole Reconstruction & Quantification pipeline on each replicate independently.
  -np, --no_pipeline    Do not run the pipeline, only compare tasks outputs.
  -db TEST_DATABASE, --test_database TEST_DATABASE
                        test database location to use. Use at your own risks!
  --models MODELS       Models database location to use with ML pipeline.
  --log-level {CRITICAL,FATAL,ERROR,WARN,WARNING,INFO,DEBUG,NOTSET}
                        Set message logging level. Defaults to `INFO`.

Detailed explanations here: https://docs.romi-project.eu/plant_imager/developer/pipeline_repeatability/
The metrics used are the same as the ones for an evaluation against a ground-truth

Step-by-step tutorial

1. Test a single task

Example with the TriangleMesh task, whose goal is to compute a mesh from a point cloud:

robustness_evaluation /path/db/my_scan TriangleMesh plant-3d-vision/config/pipeline.toml -n 10

To summarize, the pipeline.toml configuration defines the following order of tasks:
ImagesFilesetExists -> Colmap -> Undistorted -> Masks -> Voxels -> PointCloud -> TriangleMesh

The call to robustness_evaluation, as previously defined, should result in the following folder structure:

path/
├── 20210628123840_eval_TriangleMesh/
│   ├── my_scan_0/
│   ├── my_scan_1/
│   ├── my_scan_2/
│   ├── my_scan_3/
│   ├── my_scan_4/
│   ├── my_scan_5/
│   ├── my_scan_6/
│   ├── my_scan_7/
│   ├── my_scan_8/
│   ├── my_scan_9/
│   ├── filebyfile_comparison.json
│   ├── romidb
│   └── TriangleMesh_comparison.json
└── db/
    ├── my_scan/
    └── romidb

The scan datasets my_scan_* are identical up to the PointCloud task, since they are copies of the same temporary folder. The TriangleMesh task is then run separately on each of them. Quantitative results, using the appropriate metric(s), are stored in the TriangleMesh_comparison.json file.
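The exact layout of TriangleMesh_comparison.json is not described here, so the snippet below makes no assumption about its keys: it simply pools every numeric value found in the file to give a quick feel for the spread of the pairwise metric values. Treat it as a rough sanity check, not an analysis recipe.

```python
import json
from statistics import mean, stdev

def collect_numbers(node, found=None):
    """Recursively gather every numeric value from a nested JSON structure."""
    if found is None:
        found = []
    if isinstance(node, bool):
        pass  # booleans are ints in Python, skip them
    elif isinstance(node, (int, float)):
        found.append(float(node))
    elif isinstance(node, dict):
        for value in node.values():
            collect_numbers(value, found)
    elif isinstance(node, list):
        for value in node:
            collect_numbers(value, found)
    return found

with open("20210628123840_eval_TriangleMesh/TriangleMesh_comparison.json") as f:
    values = collect_numbers(json.load(f))

print(f"{len(values)} values, mean={mean(values):.4f}, sd={stdev(values):.4f}")
```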

2. Independent tests

If the goal is to evaluate the impact of stochasticity accumulated through the whole pipeline on the output of the TriangleMesh task, you should perform independent tests (running the whole pipeline for each replicate) by using the -f flag, as shown below; a sketch comparing the two evaluation runs follows the resulting folder tree:

robustness_evaluation /path/db/my_scan TriangleMesh plant-3d-vision/config/pipeline.toml -n 10 -f

This will yield a similar folder structure:

path/
├── 20210628123840_eval_TriangleMesh/
│   ├── my_scan_0/
│   ├── my_scan_1/
│   ├── my_scan_2/
│   ├── my_scan_3/
│   ├── my_scan_4/
│   ├── my_scan_5/
│   ├── my_scan_6/
│   ├── my_scan_7/
│   ├── my_scan_8/
│   ├── my_scan_9/
│   ├── filebyfile_comparison.json
│   ├── romidb
│   └── TriangleMesh_comparison.json
└── db/
    ├── my_scan/
    └── romidb
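Comparing the spread of the metric values from the single-task evaluation and from the -f (full pipeline) evaluation gives an idea of how much of the final variability originates upstream of TriangleMesh. The sketch below reuses the same pooling trick as above; the second folder name and the JSON layout are assumptions made for the example, not documented paths.

```python
import json
from statistics import mean, pstdev

def pooled_values(path):
    """Pool every numeric value found in a comparison JSON file."""
    def walk(node):
        if isinstance(node, bool):
            return
        if isinstance(node, (int, float)):
            yield float(node)
        elif isinstance(node, dict):
            for value in node.values():
                yield from walk(value)
        elif isinstance(node, list):
            for value in node:
                yield from walk(value)
    with open(path) as f:
        return list(walk(json.load(f)))

# Hypothetical evaluation folders: one single-task run, one run with `-f`:
runs = {
    "single task": "20210628123840_eval_TriangleMesh/TriangleMesh_comparison.json",
    "full pipeline": "20210629101500_eval_TriangleMesh/TriangleMesh_comparison.json",
}
for label, path in runs.items():
    values = pooled_values(path)
    print(f"{label}: n={len(values)}, mean={mean(values):.4f}, sd={pstdev(values):.4f}")
```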

Note

To run tests on an existing database, the -db parameter can be used, but be careful with it: use it at your own risk!