This robot identifies elements in an image using Google Cloud Vision. It supports all of Cloud Vision's features and includes ready-made branches for three feature types: Web Entities, Text Detection, and Face Detection.

The robot performs the following steps:

  1. Retrieve an image from a URI,
  2. Call the Google API,
  3. Parse the response.
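The three steps above can be sketched outside Kofax RPA as a plain HTTP call. This is only an illustration, not the robot's actual implementation: the function names `build_annotate_request` and `annotate` are hypothetical, and the sketch assumes the image is passed to Cloud Vision by reference (`imageUri`) rather than downloaded first. The request body shape and the feature type names (`WEB_DETECTION`, `TEXT_DETECTION`, `FACE_DETECTION`) follow the public Cloud Vision REST API.

```python
import json
import urllib.request


def build_annotate_request(image_uri: str, feature_type: str) -> dict:
    """Build the JSON body for a Cloud Vision images:annotate call."""
    return {
        "requests": [
            {
                # Assumption: the image is referenced by URI, not uploaded.
                "image": {"source": {"imageUri": image_uri}},
                "features": [{"type": feature_type}],
            }
        ]
    }


def annotate(url: str, image_uri: str, feature_type: str) -> dict:
    """POST the request to `url` and return the first response entry."""
    body = json.dumps(build_annotate_request(image_uri, feature_type)).encode("utf-8")
    request = urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        payload = json.load(response)
    # One request was sent, so exactly one entry is expected back.
    return payload["responses"][0]
```

Here `url` is the configured endpoint with the API key appended as a query parameter; `annotate` needs network access and a valid key to run.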

The response is saved to the development database. If a combination of image and feature type has already been processed, the API is not called again.
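A minimal sketch of reusing a stored response per image and feature type, assuming the intent is to avoid a repeat API call for a combination that was already processed. The class name `AnnotationCache` is hypothetical, and an in-memory dictionary stands in for the development database.

```python
from typing import Callable, Dict, Tuple


class AnnotationCache:
    """Store one response per (image URI, feature type) pair."""

    def __init__(self, call_api: Callable[[str, str], dict]) -> None:
        self._call_api = call_api
        # Assumption: a dict replaces the development database here.
        self._store: Dict[Tuple[str, str], dict] = {}

    def get(self, image_uri: str, feature_type: str) -> dict:
        key = (image_uri, feature_type)
        if key not in self._store:
            # First time this combination is seen: call the API once.
            self._store[key] = self._call_api(image_uri, feature_type)
        return self._store[key]
```

The same image requested with a different feature type counts as a new combination and triggers a new call.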


Add computer vision to Kofax RPA. Here are some common use cases:

  • Detect faces on a scanned ID card,
  • Identify an object in an image provided by a customer,
  • Perform optical character recognition on images on the fly.




The robot takes two inputs:

  • An image, referenced by URL
  • The feature type requested (e.g. face detection)

The following two properties need to be configured before using this robot:

  • Google Cloud Vision's endpoint
  • Your Google API Key
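As an illustration of how these two properties fit together, the sketch below appends the API key to the endpoint as the `key` query parameter, which is how the Cloud Vision REST API accepts API keys. The helper name `build_request_url` and the constant `DEFAULT_ENDPOINT` are hypothetical; the robot's own property names may differ.

```python
from urllib.parse import urlencode

# Public REST endpoint for image annotation (v1).
DEFAULT_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_request_url(endpoint: str, api_key: str) -> str:
    """Combine the configured endpoint and API key into a request URL."""
    if not api_key:
        raise ValueError("A Google API key must be configured")
    # urlencode escapes any characters that are unsafe in a query string.
    return f"{endpoint}?{urlencode({'key': api_key})}"
```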



The output depends on the feature used:

  • Text recognition outputs a collection of words with their coordinates
  • Web entities each have a description and a score
  • In every case, the robot stores Google's response as JSON in the development database
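To make the output shapes concrete, here is a hedged sketch of reading words and web entities from a single Cloud Vision response entry. The field names (`textAnnotations`, `boundingPoly`, `vertices`, `webDetection`, `webEntities`) come from the Cloud Vision REST API; the helper names `extract_words` and `extract_web_entities` are hypothetical and do not describe how the robot itself parses the JSON.

```python
def extract_words(response: dict) -> list:
    """Return each detected word with its bounding-box corner coordinates."""
    annotations = response.get("textAnnotations", [])
    words = []
    # The first annotation holds the full text; the rest are single words.
    for annotation in annotations[1:]:
        vertices = annotation.get("boundingPoly", {}).get("vertices", [])
        words.append(
            {
                "text": annotation.get("description", ""),
                # The API may omit a coordinate whose value is 0.
                "vertices": [(v.get("x", 0), v.get("y", 0)) for v in vertices],
            }
        )
    return words


def extract_web_entities(response: dict) -> list:
    """Return the description and score of each web entity."""
    entities = response.get("webDetection", {}).get("webEntities", [])
    return [
        {"description": e.get("description", ""), "score": e.get("score", 0.0)}
        for e in entities
    ]
```

Both helpers return an empty list when the corresponding feature was not requested, so they can be applied to any stored response.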


Required Software / Applications

Kofax RPA

Language Availability