Skip to content

Redshift/BigQuery Linked Audiences GA + Materialized view requirements [DOC-1094] #7554

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 13 commits into from
Apr 10, 2025
53 changes: 29 additions & 24 deletions src/unify/data-graph/index.md
Original file line number Diff line number Diff line change
Expand Up @@ -13,12 +13,16 @@ The Data Graph acts as a semantic layer that allows businesses to define relatio

## Prerequisites

> info "Why you need both materialized and unmaterialized tables"
> Segment recommends using materialized views for Profiles Sync to optimize performance and reduce query costs with Linked Audiences. However, due to schema inference requirements, you still need to select the matching **unmaterialized tables** as well. Segment relies on the unmaterialized tables during setup, even if they’re not used when queries run.

To use the Data Graph, you'll need the following:

- A supported data warehouse with the appropriate Data Graph permissions
- Workspace Owner or Unify Read-only/Admin and Entities Admin permissions
- For Linked Audiences, set up [Profiles Sync](/docs/unify/profiles-sync/) in a Unify space with ready-to-use [data models and tables](/docs/unify/profiles-sync/tables/) in your warehouse. When setting up selective sync, Segment recommends the following settings:
- Under **Profile materialized tables**, select all the tables (`user_identifier`, `user_traits`, `profile_merges`) for faster and more cost-efficient Linked Audiences computations in your data warehouse.
- **Make sure to include the unmaterialized tables as well**. Segment needs them during setup to understand your schema.
- Under **Track event tables**, select **Sync all Track Call Tables** to enable filtering on event history for Linked Audiences conditions.

> info ""
Expand All @@ -29,7 +33,7 @@ To use the Data Graph, you'll need the following:
> Data Graph, Reverse ETL, and Profiles Sync require different warehouse permissions.

To get started with the Data Graph, set up the required permissions in your warehouse. Segment supports the following:
- Linked Audiences: [BigQuery](/docs/unify/data-graph/setup-guides/BigQuery-setup/), [Databricks](/docs/unify/data-graph/setup-guides/databricks-setup/), and [Snowflake](/docs/unify/data-graph/setup-guides/snowflake-setup/)
- Linked Audiences: [BigQuery](/docs/unify/data-graph/setup-guides/BigQuery-setup/), [Databricks](/docs/unify/data-graph/setup-guides/databricks-setup/), [Redshift](/docs/unify/data-graph/setup-guides/redshift-setup/), and [Snowflake](/docs/unify/data-graph/setup-guides/snowflake-setup/)
- Linked Events: [BigQuery](/docs/unify/data-graph/setup-guides/BigQuery-setup/), [Databricks](/docs/unify/data-graph/setup-guides/databricks-setup/), [Redshift](/docs/unify/data-graph/setup-guides/redshift-setup/), and [Snowflake](/docs/unify/data-graph/setup-guides/snowflake-setup/)

To track the data sent to Segment on previous syncs, Segment uses [Reverse ETL](/docs/connections/reverse-etl/) infrastructure to store diffs in tables within a dedicated schema called `_segment_reverse_etl` in your data warehouse. You can choose which database or project in your warehouse this data lives in.
Expand Down Expand Up @@ -107,8 +111,8 @@ data_graph {
primary_key = "SUB_ID"
}

# Define the profile entity, which corresponds to Segment Profiles tables synced via Profiles Sync
# Recommend setting up Profiles Sync materialized views to optimize warehouse compute costs
# Define the profile entity, which corresponds to Segment Profiles tables synced with Profiles Sync
# Use materialized views in Profiles Sync to reduce query costs and speed things up
profile {
profile_folder = "PRODUCTION.SEGMENT"
type = "segment:materialized"
Expand All @@ -118,22 +122,22 @@ data_graph {
relationship "user-accounts" {
name = "Premium Accounts"
related_entity = "account-entity"
# Join the profile entity with an identifier (e.g. email) on the related entity table
# Join the profile entity with an identifier (like email) on the related entity table
# Option to replace with the trait block below to join with a profile trait on the entity table instead
external_id {
type = "email"
join_key = "EMAIL_ID"
}

# Define 1:many relationship between accounts and carts
# e.g. an account can be associated with many carts
# for example, an account can be associated with many carts
relationship "user-carts" {
name = "Shopping Carts"
related_entity = "cart-entity"
join_on = "account-entity.ID = cart-entity.ACCOUNT_ID"

# Define many:many relationship between carts and products
# e.g. there can be multiple carts, and each cart can be associated with multiple products
# for example, there can be multiple carts, and each cart can be associated with multiple products
relationship "products" {
name = "Purchased Products"
related_entity = "product-entity"
Expand All @@ -157,7 +161,7 @@ data_graph {
}

# Define 1:many relationship between households and subscriptions
# e.g. a household can be associated with multiple subscriptions
# for example, a household can be associated with multiple subscriptions
relationship "user-subscriptions" {
name = "Subscriptions"
related_entity = "subscription-entity"
Expand Down Expand Up @@ -203,10 +207,10 @@ data_graph {

Next, define the profile. This is a special class of entity that represents Segment Profiles, which corresponds to the Profiles Sync tables and models. For Linked Audiences, this allows marketers to filter on profile traits, event history, etc. There can only be one profile for a Data Graph.

| Parameters | Definition |
| ----------- | --------------------------------------------------------------------- |
| `profile_folder` | Define the fully qualified path of the folder or schema location for the profile tables. |
| `type` | Identify the materialization method of the profile tables defined in your Profiles Sync configuration under [Selective Sync settings](/docs/unify/profiles-sync/profiles-sync-setup/#step-3-set-up-selective-sync): `segment:unmaterialized` or `segment:materialized`.|
| Parameters | Definition |
| ---------------- | --------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `profile_folder` | Define the fully qualified path of the folder or schema location for the profile tables. |
| `type` | Use `segment:materialized` to sync materialized views with Profiles Sync. Segment recommends this configuration for all Linked Audiences and Data Graph setups. If you can't sync materialized views, [reach out to Segment support](https://segment.com/help/contact/){:target="_blank"} for help. |

**Example:**

Expand Down Expand Up @@ -238,23 +242,24 @@ This is the first level of relationships and a unique type of relationship betwe

| Parameters | Definition |
| ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `relationship` | An immutable slug for the relationship, and will be treated as a delete if you make changes. The slug must be in all lowercase, and supports dashes or underscores (e.g. `user-account` or `user_account`) |
| `relationship` | An immutable slug for the relationship, and will be treated as a delete if you make changes. The slug must be in all lowercase, and supports dashes or underscores (like `user-account` or `user_account`) |
| `name` | A label displayed throughout your Segment space for Linked Events, Linked Audiences, etc. This name can be modified at any time |
| `related_entity` | References your already defined entity |

To define a profile-to-entity relationship, reference your entity table and depending on your table columns, choose to join on one of the following:

**Option 1 (Most common) - Join on an external ID:** Use the `external_id` block to join the profile entity with an entity table using external IDs from your [Unify ID resolution](/docs/unify/identity-resolution/externalids/) settings. Typically these identifiers are `user_id`, `email`, or `phone` depending on the column in the entity table that you want to join with.
- `type`: Represents the [external ID type](/docs/unify/identity-resolution/externalids/#default-externalids) (`email`, `phone`, `user_id`) in your id-res settings. Depending on if you are using materialized or unmaterialized profiles, these correspond to different columns in your Profiles Sync warehouse tables:
- [Materialized](/docs/unify/profiles-sync/tables/#the-user_identifiers-table) (Recommended): This corresponds to the `type` column in your Profiles Sync `user_identifiers` table.
- [Unmaterialized](/docs/unify/profiles-sync/tables/#the-external_id_mapping_updates-table): This corresponds to the `external_id_type` column in your Profiles Sync `external_id_mapping_updates` table.
- `join_key`: This is the column on the entity table that you are matching to the external identifier.
**Option 1 (Most common) - Join on an external ID:** Use the `external_id` block to join the profile entity with an entity table using external IDs from your [Unify ID resolution](/docs/unify/identity-resolution/externalids/) settings. Typically these identifiers are `user_id`, `email`, or `phone` depending on the structure of your entity table.
- `type`: Represents the [external ID type](/docs/unify/identity-resolution/externalids/#default-externalids) (`email`, `phone`, `user_id`) in your ID resolution settings.
- This maps to the `type` column in the `user_identifiers` table when using materialized views.
- `join_key`: The column on the entity table that matches the external ID.

> note ""
> Segment recommends using materialized views with Profiles Sync. However, Segment may still reference unmaterialized tables during setup for schema detection.

**Option 2 - Join on a profile trait:** Use the `trait` block to join the profile entity with an entity table using [Profile Traits](/docs/unify/#enrich-profiles-with-traits).
- `name`: Represents a trait name in your Unify profiles. Depending on if you are using materialized or unmaterialized profiles, these correspond to different columns in your Profiles Sync warehouse tables:
- [Materialized](/docs/unify/profiles-sync/tables/#the-profile_traits-table) (Recommended): The trait name corresponds to a unique value of the `name` column in your Profiles Sync `user_traits` table.
- [Unmaterialized](/docs/unify/profiles-sync/tables/#the-profile_traits_updates-table): This corresponds to a column in the Profile Sync `profile_trait_updates` table.
- `join_key`: This is the column on the entity table that you are matching to the trait.
- `name`: Represents a trait name in your Unify profiles.
- This maps to the `name` column in the `user_traits` table when using materialized views.
- `join_key`: The column on the entity table that you're matching to the trait.

**Example:**
```python
Expand All @@ -277,7 +282,7 @@ data_graph {
name = "Premium Accounts"
related_entity = "account-entity"

# Option 1: Join the profile entity with an identifier (e.g. email) on the related entity table
# Option 1: Join the profile entity with an identifier (like email) on the related entity table
external_id {
type = "email"
join_key = "EMAIL_ID"
Expand All @@ -298,7 +303,7 @@ For 1:many relationships, define the join on between the two entity tables using

| Parameters | Definition |
| ---------------- | -------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `relationship` | An immutable slug for the relationship, and will be treated as a delete if you make changes. The slug must be in all lowercase, and supports dashes or underscores (e.g. `user-account` or `user_account`) |
| `relationship` | An immutable slug for the relationship, and will be treated as a delete if you make changes. The slug must be in all lowercase, and supports dashes or underscores (like `user-account` or `user_account`) |
| `name` | A label displayed throughout your Segment space for Linked Events, Linked Audiences, and so on. This name can be modified at any time |
| `related_entity` | References your already defined entity |
| `join_on` | Defines relationship between the two entity tables `[lefty entity slug].[column name] = [right entity slug].[column name]`. Note that since you’re referencing the entity slug for the join on, you do not need to define the full table reference |
Expand Down Expand Up @@ -343,7 +348,7 @@ For many:many relationships, define the join on between the two entity tables wi

| Parameters | Definition |
| ---------------- | ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `relationship` | An immutable slug for the relationship, and will be treated as a delete if you make changes. The slug must be in all lowercase, and supports dashes or underscores (e.g. `user-account` or `user_account`) |
| `relationship` | An immutable slug for the relationship, and will be treated as a delete if you make changes. The slug must be in all lowercase, and supports dashes or underscores (like `user-account` or `user_account`) |
| `name` | A label displayed throughout your Segment space for Linked Events, Linked Audiences, and so on. This name can be modified at any time |
| `related_entity` | References your already defined entity |

Expand Down
5 changes: 2 additions & 3 deletions src/unify/data-graph/setup-guides/BigQuery-setup.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,12 +6,11 @@ redirect_from:
- '/unify/linked-profiles/setup-guides/BigQuery-setup'
---

> info ""
> BigQuery for Data Graph is in beta and Segment is actively working on this feature. Some functionality may change before it becomes generally available. This feature is governed by Segment’s [First Access and Beta Preview Terms](https://www.twilio.com/en-us/legal/tos){:target="_blank"}.
> warning ""
> Data Graph, Reverse ETL, and Profiles Sync require different warehouse permissions.

Set up your BigQuery data warehouse to Segment for the [Data Graph](/docs/unify/data-graph/data-graph/).


## Step 1: Roles and permissions
> warning ""
> You need to be an account admin to set up the Segment BigQuery connector as well as write permissions for the `__segment_reverse_etl` dataset.
Expand Down
3 changes: 3 additions & 0 deletions src/unify/data-graph/setup-guides/databricks-setup.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,6 +5,9 @@ redirect_from:
- '/unify/linked-profiles/setup-guides/databricks-setup'
---

> warning ""
> Data Graph, Reverse ETL, and Profiles Sync require different warehouse permissions.

On this page, you'll learn how to connect your Databricks data warehouse to Segment for the [Data Graph](/docs/unify/data-graph/data-graph/).

## Databricks credentials
Expand Down
6 changes: 2 additions & 4 deletions src/unify/data-graph/setup-guides/redshift-setup.md
Original file line number Diff line number Diff line change
Expand Up @@ -6,15 +6,13 @@ redirect_from:
- '/unify/linked-profiles/setup-guides/redshift-setup'
---

> info ""
> Redshift for Data Graph is in beta and Segment is actively working on this feature. Some functionality may change before it becomes generally available. This feature is governed by Twilio Segment’s [First Access and Beta Preview Terms](https://www.twilio.com/en-us/legal/tos){:target="_blank"}.
> warning ""
> Data Graph, Reverse ETL, and Profiles Sync require different warehouse permissions.

Set up your Redshift data warehouse to Segment for the [Data Graph](/docs/unify/data-graph/).

## Prerequisite

To use Linked Audiences with Redshift, the Data Graph only supports [materialized views](/docs/unify/profiles-sync/tables/#tables-segment-materializes).

If you're setting up Profiles Sync for the first time in the Unify space, go through the setup flow for Selective sync. If Profiles Sync is already set up for your Unify space, follow these steps to configure Profiles Sync for your Unify space:

1. Navigate to **Unify > Profile Sync**.
Expand Down
2 changes: 1 addition & 1 deletion src/unify/data-graph/setup-guides/snowflake-setup.md
Original file line number Diff line number Diff line change
Expand Up @@ -5,7 +5,7 @@ redirect_from:
- '/unify/linked-profiles/setup-guides/snowflake-setup'
---
> warning ""
> Data Graph, Reverse ETL, Profiles Sync require different warehouse permissions.
> Data Graph, Reverse ETL, and Profiles Sync require different warehouse permissions.

On this page, you'll learn how to connect your Snowflake data warehouse to Segment for the [Data Graph](/docs/unify/data-graph/data-graph/).

Expand Down
Loading