- 06 Feb 2024
Cogniac Box Detection Best Practices
Objective
This document details the best practices for providing feedback (labeling) for box detection applications in the Cogniac system and explains why following these practices matters.
Overview
The Cogniac “Box Detection” (detection with bounding boxes) application type is extremely powerful and capable of assessing very high-resolution images for small or subtle objects, defects, or conditions and producing bounding box predictions around the areas of interest. As with all Deep Learning tasks, the resulting application accuracy depends on the labeled consensus data used for model training and evaluation.
Accuracy: Influencing Factors
The major factors that influence the resulting application accuracy include:
● Number of labeled consensus images available for training, including number of positive examples and number of negative examples
More images are better -- up to a point. Typically 20 to 200 positive and negative images are required to achieve a good baseline level of performance. Depending on the sources of variation discussed below, 200 to 2000 additional training images may be required in some cases to maximize application performance.
● Size of the object, condition, or defect of interest, especially with respect to the size of the images
Finding small objects/defects/conditions in large images is much harder than finding big things in small images. Small things may require more training images, especially if the image background has a lot of variation.
● The natural variation of the object, condition, or defect of interest.
Objects/defects/conditions with larger natural variation require more training data to resolve than things with less natural variation.
● The natural variation of the image background.
Images with substantial background variation may require more training images to reduce false detections.
● Image variation, including differences in object size, angle, color, and lighting.
The more variation there is in the imaging conditions, the more training images will be required to achieve a given level of accuracy.
● The visual distinctness of the objects/defects/conditions of interest versus the background
Objects that have natural and distinct edge boundaries are easier to detect and box than more amorphous defects or conditions.
● The inherent difficulty of the visual task, including the need to consider visual context when deciding on bounding box placement.
Cogniac box detection applications can learn to consider the visual context when making bounding box predictions, but this is a more difficult visual task and may require more training data.
● Availability of hard negative data.
The Cogniac system mines hard negative data (wrong detections made by models and corrected by users) to make better models. Exclusive use of pre-labeled data doesn’t allow for hard negative data mining. The most effective feedback strategy is iterative: provide some feedback, wait for a better model, and then provide more feedback.
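To make the object-size factor above concrete, the following minimal sketch computes what fraction of an image a bounding box covers. The function name `relative_box_area` and the `(x_min, y_min, x_max, y_max)` pixel format are assumptions chosen for illustration; they are not part of the Cogniac system.

```python
def relative_box_area(box, image_width, image_height):
    """Return the fraction of the image area covered by a box.

    box is (x_min, y_min, x_max, y_max) in pixels (an assumed format).
    """
    x_min, y_min, x_max, y_max = box
    box_area = max(0, x_max - x_min) * max(0, y_max - y_min)
    return box_area / (image_width * image_height)


# A 40x30 pixel defect in a 4000x3000 image covers 0.01% of the image.
small = relative_box_area((100, 100, 140, 130), 4000, 3000)

# The same 40x30 pixel defect in a 400x300 image covers 1% of the image.
large = relative_box_area((100, 100, 140, 130), 400, 300)

print(f"{small:.4%} vs {large:.4%}")
```

A very small ratio suggests the harder "small thing in a large image" case, which may call for more training images.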
Best Practice for Bounding Boxes
Why it Matters
In addition to the sources of object and image variation detailed above, the user can accidentally introduce additional variation in the bounding box labeling if they do not follow the best practices outlined in this document. In particular, variations in the size of the user-supplied bounding boxes relative to the object, defect, or condition can also impact the application's accuracy! This collateral variation can increase the amount of training data required to achieve a given level of accuracy.
Best Practices
Recommended Cogniac best practices to minimize this source of collateral variation are as follows:
2.1. Size of Boxes
For objects that have distinct external edge boundaries, place the bounding boxes so that all of the object's edges are within the bounding box. A single bounding box should encompass the entire item. Leave a few pixels of ‘buffer’ between the bounding box and the closest outside edges of the object. The bounding boxes should fit reasonably ‘tight’ around the object, almost touching the object in at least four points. In some cases, the bounding box may end up covering other objects. This is okay. As long as the bounding box covers the entirety of the object of interest with as little extra space around the edges as possible, the label follows the best practices.
Figure 1. This bounding box around a pair of scissors follows the labeling best practice because it fully encompasses the pair of scissors while fitting tightly around the edges
Figure 2. Do not use multiple boxes to represent a single instance of an object. This is a common mistake users make in an effort to minimize the amount of background that is captured in a label. It introduces confusing variability to the applications.
Figure 3. Boxes that are too small (right) or too large (left) do not follow the best practices for box labels.
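The box-size guidance above can be sketched in code. This is an illustrative example only: it assumes the object's outline is available as a list of `(x, y)` pixel points, and the helper names `tight_box` and `excess_margin` and the default `buffer_px=3` (one reading of "a few pixels") are hypothetical choices, not Cogniac functions or settings.

```python
def tight_box(points, image_width, image_height, buffer_px=3):
    """Smallest axis-aligned box around the points, plus a small buffer.

    The result is clipped to the image and returned as
    (x_min, y_min, x_max, y_max) in pixels.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    x_min = max(0, min(xs) - buffer_px)
    y_min = max(0, min(ys) - buffer_px)
    x_max = min(image_width - 1, max(xs) + buffer_px)
    y_max = min(image_height - 1, max(ys) + buffer_px)
    return (x_min, y_min, x_max, y_max)


def excess_margin(label_box, object_box):
    """Gap in pixels on each side (left, top, right, bottom).

    Large positive values mean the label is too loose; negative values
    mean the label cuts into the object.
    """
    lx0, ly0, lx1, ly1 = label_box
    ox0, oy0, ox1, oy1 = object_box
    return (ox0 - lx0, oy0 - ly0, lx1 - ox1, ly1 - oy1)


outline = [(120, 80), (310, 95), (200, 240), (150, 60)]
label = tight_box(outline, 640, 480)            # (117, 57, 313, 243)
good = excess_margin(label, (120, 60, 310, 240))  # (3, 3, 3, 3)
too_small = excess_margin((130, 60, 310, 240), (120, 60, 310, 240))  # (-10, 0, 0, 0)
```

A label that follows the guidance has small, non-negative margins on all four sides.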
2.2. Multiple Objects of the Same Type
In some cases, there may be multiple objects of the same type in a single image. When labeling multiple objects in an image, follow the above guidance and ensure that each distinct object gets its own bounding box.
Figure 4. This image contains four clamps. Each clamp should receive its own bounding box, which contains the entire object within the box. It is okay for the boxes to overlap, as seen.
Figure 5. Do not use a single large box to bound multiple objects of the same type.
2.3. “Angled” Objects
A common question arises around objects that are angled in an image. “Do I still draw a box around this, even though there is all this extra space?” The answer is “Yes!”
Figure 6. Even though these boxes look very different, these are both valid labels for this level. If an item is angled in the image, continue following the best practice of drawing a complete box encompassing the entire object.
Figure 7. Do not use multiple boxes to “adjust” for angled items. Never use multiple boxes to represent a single instance of an object to minimize the amount of background captured in a label. It introduces confusing variability to the applications.
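To illustrate why a correct box around an angled object contains extra background, the sketch below computes the axis-aligned bounding box of a rotated rectangle. It is a geometric illustration under assumed inputs (a long, thin rectangular object with a known center, size, and rotation); the function names are hypothetical.

```python
import math


def rotated_rect_box(cx, cy, width, height, angle_deg):
    """Axis-aligned box (x_min, y_min, x_max, y_max) that encloses a
    width x height rectangle centered at (cx, cy) and rotated by angle_deg."""
    theta = math.radians(angle_deg)
    half_w, half_h = width / 2, height / 2
    corners = []
    for dx, dy in ((-half_w, -half_h), (half_w, -half_h),
                   (half_w, half_h), (-half_w, half_h)):
        # Rotate each corner offset about the center.
        x = cx + dx * math.cos(theta) - dy * math.sin(theta)
        y = cy + dx * math.sin(theta) + dy * math.cos(theta)
        corners.append((x, y))
    xs = [x for x, _ in corners]
    ys = [y for _, y in corners]
    return (min(xs), min(ys), max(xs), max(ys))


def box_area(box):
    x_min, y_min, x_max, y_max = box
    return (x_max - x_min) * (y_max - y_min)


flat = rotated_rect_box(200, 200, 300, 20, 0)     # box area 6000
tilted = rotated_rect_box(200, 200, 300, 20, 45)  # box area about 51200
print(box_area(flat), round(box_area(tilted)))
```

The tilted box is much larger than the flat one even though the object is the same size. Both are valid labels; the extra background in the tilted case is expected and should not be avoided by splitting the object across multiple boxes.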
2.4. Partial Objects
Labeling items that are not fully contained in the image is generally not recommended. There may be some instances where you have received specific instructions from the Cogniac team regarding partial object labeling. If you have not received specific instructions to label partial objects, please refrain from doing so.
Figure 8. The pair of scissors is not fully visible in the image. Therefore, it should not be labeled.
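As a rough aid for reviewing existing labels, the sketch below flags boxes that touch the image border, which can indicate that the labeled object is cut off by the edge of the image. This is only a heuristic under assumed conventions (pixel boxes as `(x_min, y_min, x_max, y_max)`, hypothetical function name and `margin_px` parameter); a fully visible object can also sit close to the border, so flagged labels still need a visual check.

```python
def touches_border(box, image_width, image_height, margin_px=0):
    """Return True if the box reaches within margin_px of any image edge."""
    x_min, y_min, x_max, y_max = box
    return (
        x_min <= margin_px
        or y_min <= margin_px
        or x_max >= image_width - 1 - margin_px
        or y_max >= image_height - 1 - margin_px
    )


labels = [
    (0, 50, 120, 200),     # reaches the left edge: possibly a partial object
    (30, 50, 120, 200),    # fully inside the image
    (500, 300, 639, 400),  # reaches the right edge of a 640-pixel-wide image
]
flags = [touches_border(b, 640, 480) for b in labels]
print(flags)
```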