Optimizing Target Detection and Tracking Stability

Vuforia recognizes and tracks targets by analyzing the contrast-based features of the target that are visible to the camera. You can improve the performance of a target by improving the visibility of these features through adjustments to the target's design, its rendering and scale, and how it's printed.

You can also improve detection and tracking performance by controlling the focus mode of the device camera and designing your app's user experience to obtain the best image of the target.

Review these tips and practices to understand how to affect these variables and achieve the best possible performance for your AR apps.
 

Target star rating

Image Targets are detected based on natural features that are extracted from the target image and then compared at run time with features in the live camera image. The star rating of a target ranges between 1 and 5 stars. Targets with a low rating (1 or 2 stars) can usually be detected and tracked well, but for best results you should aim for targets with 4 or 5 stars. To create a trackable that is accurately detected, use images that are:

Attribute | Example
Rich in detail | Street scene, group of people, collages and mixtures of items, or sport scenes
Good contrast | Has both bright and dark regions, is well lit, and is not dull in brightness or color
No repetitive patterns | Avoid images such as a grassy field, the front of a modern house with identical windows, and other regular grids and patterns
 

Camera focus modes

If the target is not well focused in the camera view, the camera image can be blurry and the target details hard to detect. As a consequence, detection and tracking performance can suffer.
Use the appropriate Camera Focus Mode to ensure the best camera focus conditions.
  • Try the continuous autofocus mode (FOCUS_MODE_CONTINUOUS_AUTO) because it lets your device automatically adjust the focus as the view changes.
  • Not all devices support the continuous autofocus mode, so consider the other focus modes available in the Vuforia API.  
  • Many of the Vuforia samples show how to use the FOCUS_MODE_TRIGGER_AUTO option to trigger a one-shot autofocus adjustment when the user touches the screen. Check the sample code to discover how to do this.
For a complete description of camera focus modes, see Camera Focus Modes.

Lighting conditions

The lighting conditions in your test environment can significantly affect target detection and tracking.
  • Make sure that there is enough light in your room or operating environment so that the scene details and target features are clearly visible in the camera view.
  • Consider that Vuforia works best in indoor environments, where the lighting conditions are usually more stable and easy to control.
  • If your use case requires operating in dark environments, consider enabling the device flash torch (if your device has one) with the setFlashTorchMode() method of the Vuforia API:
CameraDevice.getInstance().setFlashTorchMode( true );
or, in Unity:
CameraDevice.Instance.SetFlashTorchMode( true );
 

Target size

  • For tabletop, near-field, product shelf and similar scenarios, a physical printed image target should be at least 5 inches or 12 cm in width and of reasonable height for a good AR experience.
  • The recommended size varies based on the actual target rating and the distance to the physical image target.
  • Consider increasing the size of your targets if they will be viewed from farther away.
  • You can estimate the minimum width of your target by dividing the camera-to-target distance by about 10. For instance, a 20 cm wide target would be detectable up to a distance of about 2 meters (20 cm × 10). This is only a rough indication; the actual working distance-to-size ratio varies with lighting conditions, camera focus, and target rating.
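The rule of thumb in the last bullet can be written as a small helper. This is a minimal sketch, not part of the Vuforia SDK: the function names and the default ratio of 10 are illustrative, and the result is only as reliable as the rough rule itself.

```python
def min_target_width_cm(distance_cm: float, ratio: float = 10.0) -> float:
    """Rough minimum target width for a given camera-to-target distance."""
    if distance_cm <= 0 or ratio <= 0:
        raise ValueError("distance_cm and ratio must be positive")
    return distance_cm / ratio


def max_distance_cm(target_width_cm: float, ratio: float = 10.0) -> float:
    """Rough maximum working distance for a target of the given width."""
    if target_width_cm <= 0 or ratio <= 0:
        raise ValueError("target_width_cm and ratio must be positive")
    return target_width_cm * ratio


print(min_target_width_cm(200))  # 2 m viewing distance -> 20.0
print(max_distance_cm(20))       # 20 cm wide target -> 200.0 (about 2 m)
```

For poorly lit scenes or low-rated targets, a smaller ratio than 10 is probably a safer planning value.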

Printed target - flatness

The quality of tracking with the Vuforia SDK can degrade significantly when the printed targets are not flat. When designing physical printouts, game boards, or play pieces, try to ensure that the targets do not bend or coil up and are not creased or wrinkled. A simple trick is to print on thick paper, for example, 200-220 g/m². A more elegant solution is to have the printout mounted on foam core board 1/8 in. to 3/16 in. (3 to 5 mm) thick.

Printed target - glossiness

Printouts from modern laser printers might also be glossy. Under ambient lighting conditions a glossy surface is not a problem. But under certain angles some light sources, such as a lamp, window, or the sun, can create a glossy reflection that covers up large parts of the original texture of the printout. The reflection can create issues with tracking and detection, since this problem is very similar to partially occluding the target.
 

Viewing angle

Target features are harder to detect, and tracking can be less stable, when you look at the target from a very steep angle or when the target appears very oblique to the camera. When defining your use scenarios, keep in mind that a target facing the camera, with its normal well aligned with the camera viewing direction, has a better chance of being detected and tracked.

 


Attributes of an Ideal Image Target

Image Targets possessing the following attributes will enable the best detection and tracking performance from the Vuforia SDK.

Attribute | Example
Rich in detail | Street scene, group of people, collages and mixtures of items, and sport scenes
Good contrast | Bright and dark regions, and well lit
No repetitive patterns | Avoid images such as a grassy field, the façade of a modern house with identical windows, and a checkerboard
Format | Must be an 8- or 24-bit PNG or JPG; less than 2 MB in size; JPGs must be RGB or greyscale (no CMYK)
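The format constraints in the table can be checked before uploading. The sketch below inspects only PNG files, using the standard PNG header layout. Several points are assumptions rather than documented Vuforia behaviour: "8-bit" is read as 8-bit greyscale, "24-bit" as 8 bits per channel RGB, and "2 MB" as 2 MiB. The function name check_png is hypothetical, and palette-based PNGs are not considered.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
MAX_BYTES = 2 * 1024 * 1024  # assumption: "2 MB" read as 2 MiB


def check_png(data: bytes):
    """Return a list of problems found in PNG file bytes (empty if none found)."""
    problems = []
    if len(data) >= MAX_BYTES:
        problems.append("file is not smaller than 2 MB")
    if len(data) < 26 or data[:8] != PNG_SIGNATURE or data[12:16] != b"IHDR":
        problems.append("not a PNG file with a leading IHDR chunk")
        return problems
    # IHDR data: width (4 bytes), height (4 bytes), bit depth (1), color type (1).
    width, height, bit_depth, color_type = struct.unpack(">IIBB", data[16:26])
    # Assumed mapping: (8, 0) is 8-bit greyscale, (8, 2) is 24-bit RGB.
    if (bit_depth, color_type) not in {(8, 0), (8, 2)}:
        problems.append(
            f"unsupported pixel format: depth {bit_depth}, color type {color_type}"
        )
    return problems


# Hand-built headers for illustration: 512x512 RGB and 512x512 RGBA.
rgb_header = (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
              + struct.pack(">IIBBBBB", 512, 512, 8, 2, 0, 0, 0))
rgba_header = (PNG_SIGNATURE + struct.pack(">I", 13) + b"IHDR"
               + struct.pack(">IIBBBBB", 512, 512, 8, 6, 0, 0, 0))

print(check_png(rgb_header))   # []
print(check_png(rgba_header))  # ['unsupported pixel format: depth 8, color type 6']
```

In practice the bytes would come from reading the image file in binary mode; a JPG check (RGB or greyscale, no CMYK) would need a separate parser or an imaging library.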


Examples

Figure A – Image target with coordinate axes for explanation.

This image is fed into the online Target Manager to create the target database.


Figure B – Image showing the natural features that the Vuforia SDK uses to detect the image target.


 


Natural Features and Image Ratings

An augmentable rating defines how well an image can be detected and tracked using the Vuforia SDK. This rating is displayed in the Target Manager and returned for each uploaded target via the web API.

The augmentable rating can range from 0 to 5 for any given image. The higher the augmentable rating of an image target, the better it can be detected and tracked. A rating of 0 indicates that a target is not tracked at all by the AR system, whereas a rating of 5 indicates that an image is easily tracked by the AR system.

Features

A feature is a sharp, spiked, chiseled detail in the image, such as the ones present in textured objects. The image analyzer represents features as small yellow crosses. Increase the number of these details in your image, and verify that the details create a non-repeating pattern.

A square contains four features, one for each of its corners.

A circle contains no features, as it contains no sharp or chiseled detail.

This object contains only two features, one for each sharp corner.
Note: According to the definition of a feature, soft corners and organic edges are not marked as features.

 

Table columns: Uploaded Image, Analyzed Image, Star Rating. Rows: an image with a small number of features, and an image with a high number of features.

The augmentable rating in the Target Manager hints at the problem, and is accompanied by messages such as:

  • Not enough features. More visual details are required to increase the total number of features.
  • Poor feature distribution. Features are present in some areas of this image but not in others. Features need to be distributed uniformly across the image.
  • Poor local contrast. The objects in this image need sharper edges or clearly defined shapes in order to provide better local contrast.

Local contrast

Good or bad local contrast is often difficult to detect with your eye. Improve the contrast of the image in general, or choose an image with details that are more “edged.” Organic shapes, round details, blurred, or highly-compressed images often do not provide enough richness in detail to be detected and tracked properly.

 

Table columns: Uploaded Image, Detail, Analyzed Image, Star Rating. Rows: original image, enhanced local contrast, and strong local contrast enhancement.

This artwork shows a more practical example of how to improve the local contrast of a target. We use an image with two layers: a few multi-colored leaves in the foreground and a textured surface in the background. The layers exist only in our graphic editor; when uploading to the Target Manager we always use a flattened image, for example in PNG format. The uploaded image is 512x512 pixels in size, a little bigger than the recommended minimum of 320 pixels.

At first sight, the original image might seem to have enough detail to function as a target. Unfortunately, uploading it to the Target Manager yields a very low rating of only one star, which results in poor tracking performance. Successive improvements raise the target to a five-star rating, yielding superior detection and tracking performance.

Table columns: Variation, Image, Applied Operation, Star Rating (out of 5). Only the variation numbers and the applied operations are reproduced here.

Variation | Applied Operation
1 | Original image intended to be used as a target. This image yields poor quality, because few features with good contrast can be found.
2 | Changing the background of variation 1 to a more contrasting one (in this case, a lighter background) improves the rating, since more contrasting features can be found in the image. Still, a rating of 2 is unsatisfactory.
3 | Increase the contrast of the foreground features from variation 2. Here we increased the contrast of the foreground layer and also reduced its brightness. This gives an average result and robustness, but we can do more.
4 | Further strengthen the features by applying a local contrast enhancement operation to variation 3. For details on this operation, see Local Contrast Enhancement. Note that to yield the expected tracking result, the printed target must be sharp, and focus must be set correctly in the application at runtime.
5 | Another option to increase the local contrast of variation 2 is to further increase the foreground/background balance. Here we use a white background. This operation is not always feasible, since it changes your original design, but you might consider it when creating or recommending an initial version.
6 | To improve further, combine effects. Here we took variation 3, with a foreground that is already enhanced, and replaced the background with white. The total contrast yielded superb performance.
7 | A different combination is to use the image shown in variation 5 and apply the local contrast enhancement operation as suggested in variation 4. The combined effect is also a five-star target.

Feature distribution

The more balanced the distribution of the features in the image, the better the image can be detected and tracked. Verify that the yellow crosses are well-distributed across the entire image. Consider cropping the image to remove any areas without features.

 

Table columns: Uploaded Image, Analyzed Image, Star Rating. Rows: an image with features unevenly distributed throughout the target, and a cropped image with better feature distribution.

 

Table columns: Uploaded Image, Analyzed Image, Star Rating. The rating messages for the example image read:

  • Poor feature distribution. Features are present in some areas of this image but not in others. Features need to be distributed uniformly across the image.
  • Poor local contrast. The objects in this image need sharper edges or clearly defined shapes to provide better local contrast.
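The balance check described in this section can be approximated numerically if you have feature coordinates available. The sketch below is only a heuristic and is not how the Target Manager computes its rating: the function grid_coverage, the 4x4 grid, and the sample coordinates are all hypothetical, and the feature points would have to come from your own detector or from reading them off the analyzed image.

```python
def grid_coverage(points, width, height, rows=4, cols=4):
    """Return the fraction of grid cells containing at least one feature point."""
    if width <= 0 or height <= 0 or rows <= 0 or cols <= 0:
        raise ValueError("width, height, rows and cols must be positive")
    occupied = set()
    for x, y in points:
        if not (0 <= x < width and 0 <= y < height):
            continue  # ignore points that fall outside the image
        col = min(int(x * cols / width), cols - 1)
        row = min(int(y * rows / height), rows - 1)
        occupied.add((row, col))
    return len(occupied) / (rows * cols)


# Hypothetical feature positions in a 320x320 image.
clustered = [(10, 10), (20, 30), (50, 60), (70, 20)]  # all in one corner
spread = [(40 + 80 * c, 40 + 80 * r) for r in range(4) for c in range(4)]

print(grid_coverage(clustered, 320, 320))  # 0.0625
print(grid_coverage(spread, 320, 320))     # 1.0
```

A low coverage value suggests cropping away the empty regions, as recommended above.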

Avoid organic shapes

Typically, organic shapes with soft or round details, or with blurred or highly compressed areas, do not provide enough detail to be detected and tracked properly, if at all. They suffer from a low feature count.

Table columns: Uploaded Image, Analyzed Image, Star Rating. The rating message for the example image reads:

There are no features in this image because it lacks visual elements with sharp edges and high contrast. The AR camera will fail to detect and track images that display these or similar characteristics.

 

Avoid repetitive patterns

Although some images contain enough features and good contrast, repetitive patterns hinder detection performance. For best results, choose an image without repeated motifs (even if rotated and scaled) or strong rotational symmetry. A checkerboard is an example of a repeated pattern that cannot be detected, since the 2x2 pairs of black and white squares look exactly the same and cannot be distinguished by the detector.

 

Table columns: Uploaded Image, Analyzed Image, Star Rating. The rating messages for the example image read:

  • This image is not suitable for detection and tracking. You should consider an alternative image or significantly modify this one.
  • Although this image may contain enough features and good contrast, repetitive patterns hinder detection performance. For best results, choose an image without repeated motifs (even if rotated and scaled) or strong rotational symmetry.

 


Local Contrast Enhancement

Improve the quality of the features by applying a local contrast enhancement operation on the image.

 Original image

Image with enhanced local contrast

This operation modifies the image by increasing the local contrast across edges and around corners. For the operation to yield the expected result, the printed target must be sharp and the camera focus must be set correctly in the application at runtime; otherwise, camera blur can diminish its effect. In addition, the Target Manager scales the image down to 320 pixels on the longer side, which can destroy the effect of this operation, so it is important to scale down the image before applying it.

The procedure to apply this operation is fairly simple. In our example we use Photoshop, but any other graphic editor tool can be used:

  1. Load the image. This bottom layer is the ‘original layer’.
  2. Scale the image to a width of 320 pixels to match the Target Manager image resolution.
  3. Duplicate the layer twice.
  4. Blur the top layer with, for example, Gaussian Blur, using the same radius as in the equivalent unsharp mask discussed below. This layer is the ‘blurred layer’.
  5. Select the middle copy of the original (the second layer from the bottom). This layer is the ‘final layer’.
  6. Choose Image→Apply Image..., select Layer = ‘blurred layer’, Blending = Subtract, Offset = 128, and click OK.
  7. Choose Image→Apply Image..., select Layer = ‘original layer’, Blending = Add, Offset = -128, and click OK.
  8. Hide the ‘blurred layer’ on top. Your result is in the ‘final layer’.

The output is the image with enhanced contrast, identical to the result of an unsharp mask. The unsharp mask comes from the analog print age, when a difference image between the original and a blurred copy was used to add detail contrast. The 'amount' setting of the mask is 100% in this case; this is the weight applied to the difference.

The following formula is applied to the pixels of each channel. The order in which the bracketed term is evaluated matters:

output_image = image + ( image - blurred_image + 128 ) - 128
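The formula can be tried on a single row of channel values. This is a minimal sketch under stated assumptions: channels are 8-bit and every intermediate result is clamped to 0..255 (which is presumably why the evaluation order matters), and a simple box blur stands in for the Gaussian Blur used in the steps above. The helper names are illustrative.

```python
def clamp8(value):
    """Clamp a number to the 8-bit channel range 0..255."""
    return max(0, min(255, int(round(value))))


def box_blur_row(row, radius=1):
    """Blur one row of channel values with a box filter (edge values replicated)."""
    n = len(row)
    blurred = []
    for i in range(n):
        window = [row[min(max(j, 0), n - 1)]
                  for j in range(i - radius, i + radius + 1)]
        blurred.append(clamp8(sum(window) / len(window)))
    return blurred


def enhance_row(row, blurred):
    """Apply output = image + (image - blurred + 128) - 128, clamping each step."""
    out = []
    for pixel, soft in zip(row, blurred):
        difference = clamp8(pixel - soft + 128)       # Subtract step, offset 128
        out.append(clamp8(pixel + difference - 128))  # Add step, offset -128
    return out


row = [50, 50, 50, 200, 200, 200]  # one edge in a single channel
blurred = box_blur_row(row, radius=1)
print(blurred)                     # [50, 50, 100, 150, 200, 200]
print(enhance_row(row, blurred))   # [50, 50, 0, 250, 200, 200]
```

The step across the edge grows from 150 to 250 gray levels, while flat regions are left unchanged.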

 


How To Evaluate a Target Image in Grayscale

Vuforia uses the grayscale version of your target image to identify features that can be used for recognition and tracking. You can use the grayscale histogram of your image to evaluate its suitability as a target image. Grayscale histograms can be generated using an image editing application, such as GIMP or Photoshop.

If the image has low overall contrast and the histogram of the image is narrow and spiky, it is not likely to be a good target image. These factors indicate that the image does not present many usable features. However, if the histogram is wide and flat, this is a good first indication that the image contains a good distribution of useful features. Note, though, that this is not true in all cases, as demonstrated by the image in the last row of this table.
 

Table columns: Uploaded Image in Grayscale, Histogram, Star Rating. The table has four rows of example images.
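The "narrow versus wide" reading of a histogram can be turned into a rough number. The sketch below assumes the pixels have already been converted to 8-bit gray levels. The spread measure, the 1% tail cut-off, and the function names are illustrative choices, not something Vuforia defines, and a wide histogram remains only a first indication, as noted above.

```python
def gray_histogram(pixels):
    """Count how often each 8-bit gray level (0..255) occurs."""
    hist = [0] * 256
    for value in pixels:
        hist[value] += 1
    return hist


def histogram_spread(hist, tail=0.01):
    """Width of the gray range left after trimming `tail` of the pixels per side."""
    total = sum(hist)
    if total == 0:
        raise ValueError("histogram is empty")
    cutoff = total * tail
    running, low = 0, 0
    for level, count in enumerate(hist):
        running += count
        if running > cutoff:
            low = level
            break
    running, high = 0, 255
    for level in range(255, -1, -1):
        running += hist[level]
        if running > cutoff:
            high = level
            break
    return high - low


narrow = [120, 125, 130] * 100  # low-contrast image: three mid-gray levels
wide = list(range(256)) * 2     # every gray level used equally often

print(histogram_spread(gray_histogram(narrow)))  # 10
print(histogram_spread(gray_histogram(wide)))    # 251
```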

 


How To Use the Feature Exclusion Buffer

A feature-exclusion buffer runs along the inside edge of an uploaded image. This buffer area is about 8% wide, and no features are picked up inside it, even if features exist within that zone. See the first row of the following table, where the area shaded in red does not contain any detected features, even though visible features are present in this zone.

Table columns: Uploaded Image, Analyzed Image (with red marking). Rows: original Image Target, and Image Target with border.
 

You can avoid this feature-exclusion buffer situation by adding a white 8% border around the image before generating the target in the Target Manager, as shown in the lower row of the table above. Keep in mind, however, that the features near the image edge are helpful only when it can be guaranteed that, at run time, the target lies on a surface of uniform color that does not itself have features.
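The border size can be computed before editing the image. This sketch assumes that the roughly 8% buffer is measured per side as a fraction of the uploaded (padded) image, which the text above does not state precisely. Under that assumption, the canvas has to grow by more than 8% of the original so that the whole original image lands outside the buffer. The function name padded_size is hypothetical.

```python
import math


def padded_size(width, height, buffer_fraction=0.08):
    """Return (new_width, new_height, pad_x, pad_y) for a white border
    that is at least `buffer_fraction` of the padded image on every side."""
    if width <= 0 or height <= 0:
        raise ValueError("width and height must be positive")
    if not 0 <= buffer_fraction < 0.5:
        raise ValueError("buffer_fraction must be in [0, 0.5)")
    scale = 1.0 / (1.0 - 2.0 * buffer_fraction)
    pad_x = math.ceil((width * scale - width) / 2)
    pad_y = math.ceil((height * scale - height) / 2)
    return width + 2 * pad_x, height + 2 * pad_y, pad_x, pad_y


print(padded_size(320, 240))  # (382, 286, 31, 23)
```

If the buffer is instead measured against the original image, a border of 8% of the original width and height would already be enough; the calculation above is the more conservative of the two readings.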

 


How To Create Non-Rectangular Image Targets

You can use non-rectangular 2D shapes as targets by placing the image of the shape on a white background. This ensures that only the features of the shape are used for the Image Target.

Steps:

  1. Position the shape on a white background in an image editor.
  2. Render the composite image as a JPG or PNG image file.
  3. Upload this file to the Target Manager to create a new target.
 
 
Table columns: Cutout Design, Uploaded Image, Generated Features.

 


How To Optimize the Physical Properties of Image Targets

Use the following recommendations to get the best performance from physical target images. Image Targets should be rigid rather than flexible, with a matte rather than glossy finish. Small targets are good for user manipulation. Be creative when choosing targets; make them appropriate, relevant, and fun.

A hard material such as card stock, plastic, or paper fixed to a non-flexible surface is better than a simple printed piece of paper. The reason is that the flexibility of the printed piece of paper can make it difficult for the object to stay in focus. However, paper targets are easily reproducible and widely available, so do not completely discount them as valid targets.

Be aware that even if you provide detailed instructions for the printing size and the paper quality, most users will revert to the default of their printer, which is usually A4 or US Letter. To avoid these printing problems, provide targets in the form of books, marketing materials, packaging, or posters.

Size

For tabletop, near-field, product shelf, and similar scenarios, a physical printed image target should be at least 5 inches, or 12 cm, in width and of reasonable height for a good AR experience. The recommended size varies based on the actual target rating and the distance to the physical image target.

Flatness

The quality of tracking with the Vuforia SDK can degrade significantly when the printed targets are not flat. When designing physical printouts, game boards, or play pieces, ensure that the targets do not bend or coil up and are not creased or wrinkled. A simple trick is to print on thick paper, for example, 200-220 g/m². A more elegant solution is to have the printout mounted on foam core board 1/8 in. to 3/16 in. (3 to 5 mm) thick.

Product packages tend to be good targets, since they are manufactured from cardboard or other thick material. Magazine pages lie flat and can work well, but daily newspapers are printed on thinner paper and must be carefully tested before being used.


Surface Finish

Printouts from modern laser printers can be very glossy. Under ambient lighting conditions, the glossy surface is not a problem. But under certain angles, light sources (a lamp, window, or sun, for example) can create a glossy reflection that covers up potentially large parts of the original texture of the printout. A glossy reflection can create tracking and detection issues, similar to partially occluding the target. See the following figure as an example of a glossy reflection that can cause problems.

Multi-target considerations

Since multi-targets are spatially combined image targets, they must fulfill the physical properties of image targets. In addition, it is important to guarantee that the shape of the multi-target remains intact. Thicker paper and sharp edges on the fold are recommended; sides should not bulge.