BigQuery
The BigQuery integration synchronizes metadata from your BigQuery data warehouse into the data lineage graph.
Web App

Fields
Field | Value | Example |
---|---|---|
Name | Name for connection | Google BigQuery |
Namespace | Namespace for the connection, see namespaces | default |
project | GCP project id | grai-demo |
dataset | BigQuery Dataset Id | jaffle_shop |
credentials | JSON credentials for service account, see Credentials | |
Credentials
- Create a service account: https://cloud.google.com/iam/docs/creating-managing-service-accounts
- Grant the following roles to your service account:
  - BigQuery Data Viewer
  - BigQuery Job User
- Generate JSON credentials for your service account: https://developers.google.com/workspace/guides/create-credentials#service-account
- Copy and paste the JSON into the credentials field.
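Before pasting the key, it can help to confirm that the JSON is complete. The following is a minimal sketch, not part of Grai: the helper name `check_service_account_json` is hypothetical, and the check only looks for a few fields that a standard Google service account key file contains.

```python
import json

# A few fields found in a standard Google service account key file.
REQUIRED_KEYS = {"type", "project_id", "private_key", "client_email"}


def check_service_account_json(raw: str) -> dict:
    """Parse a service account key and raise ValueError if it looks incomplete.

    Hypothetical helper for illustration; it is not part of any Grai package.
    """
    try:
        info = json.loads(raw)
    except json.JSONDecodeError as exc:
        raise ValueError(f"credentials are not valid JSON: {exc}") from exc
    if not isinstance(info, dict):
        raise ValueError("credentials must be a JSON object")
    missing = REQUIRED_KEYS - set(info)
    if missing:
        raise ValueError(f"credentials are missing keys: {sorted(missing)}")
    if info["type"] != "service_account":
        raise ValueError(f"expected type 'service_account', got {info['type']!r}")
    return info
```

Calling it on the contents of the downloaded key file, for example `check_service_account_json(open("key.json").read())`, returns the parsed dictionary or raises a `ValueError` that names the problem. The file name `key.json` is a placeholder.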
Python Library
Installation
Install the Grai BigQuery package with pip:
pip install grai-source-bigquery
This installs the Grai BigQuery integration, which is now ready to run in Python.
Connecting & Syncing
The integration ships with the client library, but you will need a Python terminal or Jupyter Notebook to run a few commands that establish a connection and begin querying the server.
Spin up your favorite Python terminal, then run:
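import os
# ClientV1 is used below; this import path assumes the grai-client package layout
from grai_client.endpoints.v1.client import ClientV1
from grai_source_bigquery.base import update_server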
For now we will use the default user credentials, though you are free to create a new user or API keys from the server admin interface at http://localhost:8000/admin.
client = ClientV1("localhost","8000", username="null@grai.io", password="super_secret")
Now we can update the server with data from any BigQuery source. To do so, pass your credentials and a namespace to the update_server function. The namespace uniquely identifies nodes, and when used consistently it lets you add to the same node from any source.
Using example variables, update the server with your metadata by running:
update_server(client, project="[your_project]", dataset="[your_dataset]", credentials="[your_credentials]", namespace="[your_grai_namespace]")
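Rather than hard-coding these values, you may prefer to read them from the environment. The sketch below is one possible approach, not part of Grai: the helper `sync_kwargs_from_env` and the environment variable names are hypothetical, and the fallback namespace of `default` is an assumption taken from the example in the Fields table. Only the keyword names `project`, `dataset`, `credentials` and `namespace` come from the call above.

```python
import os


def sync_kwargs_from_env(env=os.environ) -> dict:
    """Collect update_server keyword arguments from environment variables.

    The variable names below are hypothetical; rename them to suit your setup.
    """
    kwargs = {
        "project": env.get("GRAI_BIGQUERY_PROJECT"),
        "dataset": env.get("GRAI_BIGQUERY_DATASET"),
        "credentials": env.get("GRAI_BIGQUERY_CREDENTIALS"),
        # Assumed fallback: the "default" namespace from the Fields table example.
        "namespace": env.get("GRAI_NAMESPACE", "default"),
    }
    # Fail early with a clear message instead of passing empty values along.
    missing = [name for name, value in kwargs.items() if not value]
    if missing:
        raise KeyError(f"missing settings: {missing}")
    return kwargs
```

With the variables set, the sync call would then become `update_server(client, **sync_kwargs_from_env())`, which keeps the service account key out of your source files.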