Annotate Product Images


Annotate product images using BigQuery and Cloud Vision

Create a Cloud Storage Bucket and Upload Images

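Create a Cloud Storage bucket to store the product images. The bucket lives in the US multi-region and is named velano-collectives-n1y3 in this walkthrough, where n1y3 is a randomly generated suffix. The following Terraform code creates the bucket; it takes the suffix from a random_string.velano_collectives_suffix resource: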

resource "google_storage_bucket" "velano_collectives" {
  name     = "velano-collectives-${random_string.velano_collectives_suffix.result}"
  location = "US"
}
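
The bucket name above references random_string.velano_collectives_suffix, which this guide does not show. A minimal sketch of such a resource follows, assuming the hashicorp/random provider is available. The length and character settings are assumptions, chosen to produce a suffix shaped like n1y3 and to keep the bucket name lowercase:

```hcl
# Hypothetical definition of the suffix used in the bucket name above.
# Requires the hashicorp/random provider.
resource "random_string" "velano_collectives_suffix" {
  length  = 4     # assumption: matches the 4-character suffix "n1y3"
  upper   = false # bucket names must not contain uppercase letters
  special = false # avoid punctuation that is invalid in bucket names
}
```

The generated suffix differs per deployment, so substitute the actual bucket name wherever velano-collectives-n1y3 appears in the commands that follow.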

After creating the bucket, upload your product images to it. You can use the gsutil command-line tool to copy images from your local machine to the Cloud Storage bucket:

gsutil cp velano-collectives/data/images/*.png gs://velano-collectives-n1y3/images/

Create a Dataset

Create a BigQuery dataset named products in the US multi-region, the same location as the bucket and the connections used in this guide:

create schema if not exists products
options(
  location="us"
)

Create an Object Table

Create a BigQuery Cloud resource connection that the object table uses to read the images from the Cloud Storage bucket, and grant roles to the connection's service account:

resource "google_bigquery_connection" "gcs" {
  connection_id = "gcs"
  location      = "US"
  cloud_resource {}
}

module "bq_conn__gcs__project_iam" {
  source  = "terraform-google-modules/iam/google//modules/projects_iam"
  version = "~> 8.0"

  projects = [module.project.project_id]
  mode     = "additive"

  bindings = {
    "roles/storage.objectViewer" = [
      "serviceAccount:${google_bigquery_connection.gcs.cloud_resource[0].service_account_id}"
    ]
    "roles/serviceusage.serviceUsageConsumer" = [
      "serviceAccount:${google_bigquery_connection.gcs.cloud_resource[0].service_account_id}"
    ]
    "roles/documentai.viewer" = [
      "serviceAccount:${google_bigquery_connection.gcs.cloud_resource[0].service_account_id}"
    ]
  }
}

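The connection's service account must have the following roles:

  • roles/storage.objectViewer
  • roles/serviceusage.serviceUsageConsumer

The module above also grants roles/documentai.viewer. That role relates to Document AI and is not used by the image annotation steps in this guide.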

Create an object table in BigQuery to reference the images stored in the Cloud Storage bucket:

create external table `velano-collective-ac8f.products.images`
with connection `velano-collective-ac8f.us.gcs`
options(
  object_metadata = 'SIMPLE',
  uris = ['gs://velano-collectives-n1y3/images/*.png']
)

Create a Remote Model

Create a BigQuery connection to the Cloud Vision API:

resource "google_bigquery_connection" "cloud_vision" {
  connection_id = "cloud-vision"
  location      = "US"
  cloud_resource {}
}
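
The remote model in the next step calls Cloud Vision through this connection's service account, while the IAM module shown earlier only binds roles to the gcs connection. The cloud-vision connection's service account therefore likely needs roles/serviceusage.serviceUsageConsumer as well. A sketch of that grant, assuming the same module.project.project_id reference used above and the Google provider's google_project_iam_member resource; the resource name is hypothetical:

```hcl
# Hypothetical binding: lets the cloud-vision connection's service account
# consume enabled services (such as the Cloud Vision API) in the project.
resource "google_project_iam_member" "bq_conn_cloud_vision_service_usage" {
  project = module.project.project_id
  role    = "roles/serviceusage.serviceUsageConsumer"
  member  = "serviceAccount:${google_bigquery_connection.cloud_vision.cloud_resource[0].service_account_id}"
}
```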

After creating the connection, you can create a remote model that uses the Cloud Vision API to annotate images:

create or replace model `velano-collective-ac8f.products.vision`
remote with connection `velano-collective-ac8f.us.cloud-vision`
options(
  remote_service_type = 'CLOUD_AI_VISION_V1'
)

Annotate Images

With the object table and remote model in place, call the ML.ANNOTATE_IMAGE function to annotate the images. Its vision_features argument selects which Cloud Vision features to run on each image; the following query requests LABEL_DETECTION:

select
  *
from ml.annotate_image(
  model `velano-collective-ac8f.products.vision`,
  table `velano-collective-ac8f.products.images`,
  struct(['LABEL_DETECTION'] as vision_features)
)

References