BigQuery

The BigQuery integration synchronizes metadata from your BigQuery data warehouse into the data lineage graph.

Web App

Fields

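| Field       | Value                                                  | Example         |
|-------------|--------------------------------------------------------|-----------------|
| Name        | Name for connection                                    | Google BigQuery |
| Namespace   | Namespace for the connection, see namespaces           | default         |
| project     | GCP project id                                         | grai-demo       |
| dataset     | BigQuery Dataset Id                                    | jaffle_shop     |
| credentials | JSON credentials for service account, see Credentials  |                 |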

Credentials

  1. Create a service account: https://cloud.google.com/iam/docs/creating-managing-service-accounts

  2. Add the following permissions to your service account:

     • BigQuery Data Viewer
     • BigQuery Job User

  3. Generate JSON credentials for your service account: https://developers.google.com/workspace/guides/create-credentials#service-account

  4. Copy and paste the JSON into the credentials field.
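Before pasting the key, it can help to confirm that the text is valid JSON and looks like a service account key. The following is a minimal sketch using only the standard library; the helper name `check_service_account_json` is hypothetical and not part of Grai, and the key list covers only a few fields that Google service account key files normally contain.

```python
import json

# A few keys found in a standard Google service account key file.
REQUIRED_KEYS = ("type", "project_id", "private_key", "client_email")


def check_service_account_json(raw: str) -> list[str]:
    """Return a list of problems found in the key text; an empty list means it looks usable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError as exc:
        return [f"not valid JSON: {exc.msg}"]
    if not isinstance(data, dict):
        return ["top-level JSON value must be an object"]
    problems = [f"missing key: {key}" for key in REQUIRED_KEYS if key not in data]
    # A present but different "type" usually means the wrong kind of credential was downloaded.
    if data.get("type") not in (None, "service_account"):
        problems.append(f"unexpected type: {data['type']!r}")
    return problems


# Placeholder values only; a real key file contains more fields.
sample = json.dumps(
    {
        "type": "service_account",
        "project_id": "grai-demo",
        "private_key": "<private key>",
        "client_email": "<service account email>",
    }
)
print(check_service_account_json(sample))  # []
```

This checks only the shape of the file. It does not verify that the key is active or that the service account has the two BigQuery roles listed above.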

Python Library

Installation

Install the Grai BigQuery package with pip:

pip install grai-source-bigquery

This installs the Grai BigQuery integration, which is now ready to run in Python.

Connecting & Syncing

The integration ships with the client library, but you will need a Python terminal or Jupyter Notebook to run a few commands that establish a connection and begin querying the server.

Spin up your favorite Python terminal, then:

import os
from grai_client.endpoints.v1.client import ClientV1  # client class used to connect below
from grai_source_bigquery.base import update_server

For now we will use the default user credentials, though you are free to create a new user or API keys from the server admin interface at http://localhost:8000/admin.

client = ClientV1("localhost", "8000", username="null@grai.io", password="super_secret")

Now we can update the server with metadata from any BigQuery source. To do so, pass your credentials and a namespace to the update_server function. The namespace uniquely identifies nodes; when used consistently, it lets you add to the same node from any source.
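To illustrate why a consistent namespace matters, here is a small sketch of nodes keyed by namespace and name. It is an in-memory stand-in written for this example, not Grai's actual data model; the function `upsert_node`, the source labels, and the metadata values are all made up.

```python
from collections import defaultdict

# Hypothetical stand-in for the lineage graph: nodes keyed by (namespace, name),
# each holding the metadata contributed by every source that reported it.
graph: dict[tuple[str, str], dict[str, dict]] = defaultdict(dict)


def upsert_node(namespace: str, name: str, source: str, metadata: dict) -> None:
    """Attach one source's metadata to the node identified by (namespace, name)."""
    graph[(namespace, name)][source] = metadata


# Two sources using the same namespace contribute to the same node.
upsert_node("default", "jaffle_shop.customers", "bigquery", {"columns": 7})
upsert_node("default", "jaffle_shop.customers", "dbt", {"tests": 3})
# A different namespace creates a separate node, even though the name matches.
upsert_node("staging", "jaffle_shop.customers", "bigquery", {"columns": 7})

print(len(graph))  # 2
print(sorted(graph[("default", "jaffle_shop.customers")]))  # ['bigquery', 'dbt']
```

In this sketch, reusing "default" across sources merges their metadata onto one node, while a mismatched namespace silently produces a second node for the same table.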

Using example variables, update the server with your metadata by running:

update_server(client, project="[your_project]", dataset="[your_dataset]", credentials="[your_credentials]", namespace="[your_grai_namespace]")
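Rather than hard-coding these values, one option is to collect them from environment variables and a key file. The sketch below is one way to do that, under two assumptions: the environment variable names (GRAI_BQ_PROJECT and so on) are hypothetical and chosen for this example, and the credentials argument is assumed to accept the raw JSON text of the service account key.

```python
import os
from pathlib import Path


def build_sync_kwargs(env=os.environ) -> dict:
    """Collect update_server keyword arguments from environment variables.

    The variable names used here are hypothetical, not defined by Grai.
    """
    credentials_path = Path(env["GRAI_BQ_CREDENTIALS_FILE"])
    return {
        "project": env["GRAI_BQ_PROJECT"],
        "dataset": env["GRAI_BQ_DATASET"],
        # Assumption: credentials is passed as the raw JSON text of the key file.
        "credentials": credentials_path.read_text(encoding="utf-8"),
        # Fall back to the "default" namespace when none is set.
        "namespace": env.get("GRAI_NAMESPACE", "default"),
    }


# With a client created as shown earlier:
# update_server(client, **build_sync_kwargs())
```

Keeping the key in a file outside the notebook avoids pasting secrets into code, and a missing variable fails early with a KeyError instead of sending an incomplete request.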