Welcome to the Lytics Community!

Lytics Community

Search for answers here or
2

Duplicate Column Names: Use Projected Column Names

Overview

Note: Lytics is actively working toward a solution for the error described here.

When building a Cloud Connect audience scenarios may arise where the same data type exists in multiple fields in an audience query. A known error could occur if the name of the selected primary key is ambiguous in the result set.

For example, given that the foreign source database is BigQuery and has two tables with the following schema:
 

users

email

account_id

 

customers

email

accountid


A CloudConnect user could create a segment with the following SQL definition:

SELECT users.email, customers.email
FROM `users` a
LEFT OUTER JOIN `customers` b
ON a.account_id = b.accountid


This SELECT statement would produce the following schema for the result set:
 

Result Set Schema

email

email_1

 

You can see that BigQuery automatically generates a projected name for one of the columns, since both cannot be named "email".

The user is presented with this result set schema and is prompted to choose a column as the primary key. Assume the user selects "email" as the primary key. The user then saves the segment.

When a sync is executed for a segment, an ORDER BY clause is appended to the SQL query and references the primary key selected by the user. In this example, the query would look as follows if the user selects "email" as the primary key:
 

SELECT users.email, customers.email
FROM `users` a
LEFT OUTER JOIN `customers` b
ON a.account_id = b.accountid
ORDER BY email

 

This query, when run against BigQuery during the sync, will throw an error because "email" in the ORDER BY clause is ambiguous. It does not know whether a.email or b.email should be email. 

Evidence of this error can be seen by using the edge events API and searching for the segment id.

Temporary Workaround


If two or more columns are selected from different tables and those columns have the same name, make sure that all of those columns are projected in the SQL. For example, the query above could be fixed by altering it to the following:
 

SELECT users.email AS email, customers.email AS customer_email
FROM `users` a
LEFT OUTER JOIN `customers` b
ON a.account_id = b.accountid
Reply Oldest first
  • Oldest first
  • Newest first
  • Active threads
  • Popular