Streaming Data Transform in Data Cloud
- February 10, 2026
- Posted by: SFDCGYM
- Category: Salesforce Data Cloud
Unlocking the Power of Streaming Data Transform in Data Cloud
Streaming Data Transform is an essential tool for professionals working with Salesforce Data Cloud. It allows Data Cloud developers, architects, and engineers to clean, enhance, and analyze streaming data as it flows into the data infrastructure. Whether you’re building data pipelines, optimizing real-time processing, or curating custom datasets, Streaming Data Transform delivers the accuracy and flexibility that real-time work demands.
This blog will provide a deep-dive into Streaming Data Transform, explain its core capabilities, and break down SQL operations that help unlock the power of real-time transformation. We’ll also explore use cases with examples to help you build smarter and faster data pipelines.
What Is Streaming Data Transform?
At its core, Streaming Data Transform processes incoming event-driven data in real-time. It’s an on-the-fly data transformation tool for structured and semi-structured data, leveraging SQL to curate data for immediate use in downstream applications.
The platform simplifies creating transformations using a GUI (Graphical User Interface) and offers rich SQL syntax capabilities. It ensures that raw data transforms into actionable insights as it enters the Data Cloud. Think of it as both your clean-up crew and a data magician, merging datasets, creating dynamic lists, and optimizing performance without the need for batch processing or manual intervention.
Key Rule for Using DLOs in Streaming Data Transform
A crucial limitation to keep in mind when working with Streaming Data Transform is that you cannot use more than two Data Lake Objects (DLOs) at a time in the Builder. If you need to merge more than two DLOs, you’ll need to write the SQL manually, making sure all of the sources map into a single primary target DLO.
Example:
If you want to merge customer data and purchase history stored across three different sources, write SQL to combine them under one logical DLO (see the sketch after the tips below). This ensures streamlined integration and reduces redundancy in your data pipeline.
SQL tip for combining DLOs effectively:
- Match primary keys across datasets.
- Define target relationships clearly.
- Avoid nested operations that complicate transformations.
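As a hedged illustration of the manual approach, a three-way merge might look like the sketch below. The DLO names (customers__dll, orders__dll, order_items__dll) and their key fields are hypothetical, chosen only to show the join pattern:
-- Hypothetical DLOs joined on shared keys; every field maps into one target DLO.
SELECT
customers__dll.customer_id__c as customer_id__c,
orders__dll.order_id__c as order_id__c,
order_items__dll.item_name__c as item_name__c
FROM customers__dll
JOIN orders__dll ON orders__dll.customer_id__c = customers__dll.customer_id__c
JOIN order_items__dll ON order_items__dll.order_id__c = orders__dll.order_id__c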
Core Operations in Streaming Data Transform
Streaming Data Transform allows for several operations that simplify and optimize your data workflow. Here’s what you can achieve using SQL-powered transformations:
- Clean Data – Perform operations like removing null values or trimming unnecessary characters.
- Format Data – Standardize date fields, capitalize names, or change values into readable text.
- Merge Data – Combine multiple datasets using UNION or join conditions.
- Analyze in Real Time – Employ conditional logic and aggregate functions to extract key insights immediately.
Example of Streaming Data Query Structure
When working with streaming data, the query structure typically includes a SELECT clause, a Data Lake Object (DLO) as the source, and WHERE conditions to filter the incoming data. Here’s a basic example of a streaming data query:
SELECT event_time, user_id, action_type
FROM event_stream
WHERE action_type = 'purchase' AND event_time >= CURRENT_TIMESTAMP - INTERVAL '1 HOUR';
Explanation of Key Components:
- Fields (SELECT Clause): The fields event_time, user_id, and action_type are specified in the SELECT clause to extract specific columns from the incoming data stream. These fields represent the most relevant pieces of information for the use case, ensuring the query returns meaningful results.
- Data Lake Object (DLO): The FROM clause defines the streaming source, event_stream, which represents the dataset being processed. This ensures that data is continuously fetched from the live stream, enabling real-time transformations and analysis.
- WHERE Condition: The WHERE clause filters the data to include only records where action_type equals 'purchase' and limits the time range to events occurring in the last hour. This filtering narrows the scope of analysis, reduces noise, and focuses on actionable insights.
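The same structure extends to aggregation. As a minimal sketch, reusing the event_stream source from the example above and assuming your environment allows aggregate functions in this context, you could count each user’s purchases over the last hour:
-- Count purchase events per user over the trailing hour.
SELECT
user_id,
COUNT(*) AS purchase_count
FROM event_stream
WHERE action_type = 'purchase' AND event_time >= CURRENT_TIMESTAMP - INTERVAL '1 HOUR'
GROUP BY user_id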
SQL Structure for Streaming Data Transform
The SQL foundation for Streaming Data Transform is built around fundamental functions and operations. Below, we explore examples of powerful SQL features and their output.
1. JSON_TABLE Function
Imagine you have a dataset where JSON data is stored in a single column. Here’s a simplified version of what that might look like:
Dataset: customers_with_orders__dll
| order_id__c | json_data__c |
|---|---|
| 1 | {"customer": "John Doe", "items": [{"name": "Widget", "price": 25.00}]} |
| 2 | {"customer": "Jane Smith", "items": [{"name": "Gadget", "price": 15.00}]} |
After Using JSON_TABLE
By using JSON_TABLE, you can extract and structure this JSON data into separate columns, making it easier to query and analyze.
Transformed Dataset
| order_id__c | customer__c | item_name__c | item_price__c |
|---|---|---|---|
| 1 | John Doe | Widget | 25.00 |
| 2 | Jane Smith | Gadget | 15.00 |
SQL Query Using JSON_TABLE
SELECT
customers_with_orders__dll.order_id__c as order_id__c,
customers.customer as customer__c,
items.item_name as item_name__c,
items.item_price as item_price__c
FROM customers_with_orders__dll,
JSON_TABLE(
customers_with_orders__dll.json_data__c, '$' COLUMNS (
customer TEXT PATH '$.customer'
)
) customers,
JSON_TABLE(
customers_with_orders__dll.json_data__c, '$.items[*]' COLUMNS (
item_name TEXT PATH '$.name',
item_price NUMBER PATH '$.price'
)
) items
This query extracts the customer, item_name, and item_price from the JSON data, creating a more structured and query-friendly dataset.
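Note that the comma between the source table and each JSON_TABLE() call acts as a lateral join: every element matched by the '$.items[*]' path produces its own output row. So if order 1’s items array held two entries, the transformed dataset would contain two rows for order_id__c = 1, one per item.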
2. Concatenate Fields
Let’s look at an example of how you can use the CONCAT() function to concatenate fields in a dataset.
Example Dataset
Imagine you have a dataset of customer information with separate fields for first and last names:
| customer_id | first_name | last_name |
|---|---|---|
| 1 | John | Doe |
| 2 | Jane | Smith |
Using CONCAT() to Combine Fields
You can use the CONCAT() function to combine the first_name and last_name fields into a single full_name field.
SQL Query
SELECT
customers__dll.customer_id__c as customer_id__c,
CONCAT(first_name, ' ', last_name) AS full_name__c
FROM customers__dll
Resulting Dataset
| customer_id__c | full_name__c |
|---|---|
| 1 | John Doe |
| 2 | Jane Smith |
In this example, the CONCAT() function is used to merge the first_name and last_name fields, with a space in between, to create a new full_name field. This makes it easier to display or use the full name in reports or applications.
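One caveat worth planning for: in many SQL dialects, CONCAT() propagates NULL, so a missing last_name would null out the entire full name. A hedged variant guarding against that with COALESCE() (covered in the next section):
SELECT
customers__dll.customer_id__c as customer_id__c,
-- COALESCE substitutes an empty string so a NULL name part
-- doesn't turn the whole concatenation into NULL.
CONCAT(COALESCE(first_name, ''), ' ', COALESCE(last_name, '')) AS full_name__c
FROM customers__dll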
3. Handle Null Values
The COALESCE() function is used to return the first non-null value from a list of arguments. It’s particularly useful for handling missing data.
Example Dataset
Let’s say you have a dataset of customer contact information where some customers have a phone number, others have an email, and some have both (the email values below are illustrative):
| customer_id | phone_number | email |
|---|---|---|
| 1 | 123-456-7890 | john@example.com |
| 2 | NULL | jane@example.com |
| 3 | 987-654-3210 | NULL |
Using COALESCE() to Select the First Non-Null Contact Method
You can use the COALESCE() function to create a new field that selects the first available contact method, either phone or email.
SQL Query
SELECT
customers__dll.customer_id as customer_id__c,
COALESCE(phone_number, email) AS primary_contact__c
FROM customers__dll
Resulting Dataset
| customer_id__c | primary_contact__c |
|---|---|
| 1 | 123-456-7890 |
| 2 | jane@example.com |
| 3 | 987-654-3210 |
In this example, the COALESCE() function checks each row and returns the phone_number if it’s available; otherwise, it returns the email. This way, you ensure that the primary_contact__c field always has a value, even if one of the contact methods is missing.
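COALESCE() also accepts more than two arguments, which makes a literal fallback easy to add. A small sketch building on the same dataset:
SELECT
customers__dll.customer_id as customer_id__c,
-- Try phone first, then email; if both are NULL, fall back to a fixed label.
COALESCE(phone_number, email, 'No Contact Info') AS primary_contact__c
FROM customers__dll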
4. Change Case
Let’s explore how to use the LOWER() and UPPER() functions to change the case of text in a dataset.
Example Dataset
Suppose you have a dataset with product names that are inconsistently cased:
| product_id | product_name |
|---|---|
| 1 | Widget |
| 2 | gadget |
| 3 | THINGAMABOB |
Using LOWER() and UPPER() to Standardize Case
You can use the LOWER() function to convert all product names to lowercase, or the UPPER() function to convert them to uppercase, depending on your needs.
SQL Query to Convert to Lowercase
SELECT
products__dll.product_id as product_id__c,
LOWER(product_name) AS product_name_lower__c
FROM products__dll
Resulting Dataset (Lowercase)
| product_id__c | product_name_lower__c |
|---|---|
| 1 | widget |
| 2 | gadget |
| 3 | thingamabob |
SQL Query to Convert to Uppercase
SELECT
products__dll.product_id as product_id__c,
UPPER(product_name) AS product_name_upper__c
FROM products__dll
Resulting Dataset (Uppercase)
| product_id__c | product_name_upper__c |
|---|---|
| 1 | WIDGET |
| 2 | GADGET |
| 3 | THINGAMABOB |
In these examples, the LOWER() function is used to ensure all product names are in lowercase, while the UPPER() function converts them to uppercase. This can be particularly useful for standardizing data for display or comparison purposes.
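These functions are just as handy in WHERE clauses. A brief sketch using the same products__dll dataset to make a comparison case-insensitive:
SELECT
products__dll.product_id as product_id__c,
products__dll.product_name as product_name__c
FROM products__dll
-- Lowering the column first lets 'Widget', 'gadget', and 'THINGAMABOB'-style
-- casing variants all be compared uniformly.
WHERE LOWER(product_name) = 'widget'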
5. Convert Data Types
The TO_DATETIME() function is used to convert a date string into a datetime format, which is essential for performing date-based operations and comparisons.
Example Dataset
Consider a dataset where dates are stored as strings:
| event_id | event_date_string |
|---|---|
| 1 | 2025-01-01 |
| 2 | 2025-02-15 |
| 3 | 2025-03-30 |
Using TO_DATETIME() to Convert Strings to Datetime
You can use the TO_DATETIME() function to convert these string representations into a datetime format.
SQL Query
SELECT
events__dll.event_id as event_id__c,
TO_DATETIME(event_date_string) AS event_date__c
FROM events__dll
Resulting Dataset
| event_id__c | event_date__c |
|---|---|
| 1 | 2025-01-01 00:00:00 |
| 2 | 2025-02-15 00:00:00 |
| 3 | 2025-03-30 00:00:00 |
In this example, the TO_DATETIME() function is used to convert the event_date_string field into a datetime format, allowing for more complex date manipulations and queries.
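Once converted, the value can drive date filters directly. A hedged sketch, assuming TO_DATETIME() also accepts a literal date string the way it accepts the column here; the cutoff date is illustrative:
SELECT
events__dll.event_id as event_id__c,
TO_DATETIME(event_date_string) AS event_date__c
FROM events__dll
-- Keep only events on or after the cutoff date.
WHERE TO_DATETIME(event_date_string) >= TO_DATETIME('2025-02-01')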
6. Handle Logical Conditions
When working with null values in data, functions like ISNULL(), IFNULL(), NULLIF(), and IS NOT NULL play a crucial role in handling missing or undefined data effectively. Let’s explore how these functions work with examples for each.
ISNULL()
The ISNULL() function checks whether an expression is NULL and returns a boolean value.
Example Dataset
| customer_id | phone_number |
|---|---|
| 1 | 123-456-7890 |
| 2 | NULL |
| 3 | 987-654-3210 |
SQL Query Using ISNULL()
SELECT
customers__dll.customer_id as customer_id__c,
ISNULL(phone_number) AS is_phone_null__c
FROM customers__dll
Resulting Dataset
| customer_id__c | is_phone_null__c |
|---|---|
| 1 | 0 |
| 2 | 1 |
| 3 | 0 |
IFNULL()
The IFNULL() function returns a specified value if the expression is NULL; otherwise, it returns the expression itself. Reusing the customer dataset from the ISNULL() example:
SQL Query Using IFNULL()
SELECT
customers__dll.customer_id as customer_id__c,
IFNULL(phone_number, 'No Phone') AS phone_number__c
FROM customers__dll
Resulting Dataset
| customer_id__c | phone_number__c |
|---|---|
| 1 | 123-456-7890 |
| 2 | No Phone |
| 3 | 987-654-3210 |
NULLIF()
The NULLIF() function returns NULL if two expressions are equal; otherwise, it returns the first expression.
Example Dataset
| product_id | current_price | previous_price |
|---|---|---|
| 1 | 100 | 100 |
| 2 | 150 | 120 |
| 3 | 200 | 200 |
SQL Query Using NULLIF()
SELECT
products__dll.product_id as product_id__c,
NULLIF(current_price, previous_price) AS price_change__c
FROM products__dll
Resulting Dataset
| product_id__c | price_change__c |
|---|---|
| 1 | NULL |
| 2 | 150 |
| 3 | NULL |
IS NOT NULL
The IS NOT NULL condition checks whether an expression is not NULL.
SQL Query Using IS NOT NULL
SELECT
customers__dll.customer_id as customer_id__c,
customers__dll.phone_number as phone_number__c
FROM customers__dll
WHERE phone_number IS NOT NULL
Resulting Dataset
| customer_id__c | phone_number__c |
|---|---|
| 1 | 123-456-7890 |
| 3 | 987-654-3210 |
These functions are useful for handling NULL values in different scenarios, allowing you to manage missing or undefined data effectively.
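They also combine well. A classic pattern is guarding a division against zero: NULLIF() turns the zero into NULL so the division yields NULL instead of erroring, and IFNULL() then supplies a default. The orders__dll fields below (total_amount__c, item_count__c) are hypothetical:
SELECT
orders__dll.order_id__c as order_id__c,
-- NULLIF makes a zero count NULL, avoiding divide-by-zero;
-- IFNULL then replaces the resulting NULL with 0.
IFNULL(total_amount__c / NULLIF(item_count__c, 0), 0) AS avg_item_price__c
FROM orders__dll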
7. CASE WHEN
Suppose you have a dataset of student scores. You can use the CASE WHEN statement to assign a letter grade based on each score.
SQL Query
SELECT
student_id,
score,
CASE
WHEN score >= 90 THEN 'A'
WHEN score >= 80 THEN 'B'
WHEN score >= 70 THEN 'C'
WHEN score >= 60 THEN 'D'
ELSE 'F'
END AS grade
FROM students
Resulting Dataset
| student_id | score | grade |
|---|---|---|
| 1 | 85 | B |
| 2 | 72 | C |
| 3 | 90 | A |
| 4 | 65 | D |
In this example, the CASE WHEN statement evaluates each student’s score and assigns a grade based on the specified conditions. If none of the conditions are met, it defaults to assigning an ‘F’. This approach is useful for categorizing or transforming data based on specific criteria.
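The same pattern works for segmentation in customer data. A short sketch, where the total_spend__c field and the tier thresholds are purely illustrative:
SELECT
customers__dll.customer_id__c as customer_id__c,
-- Conditions are evaluated top to bottom; the first match wins.
CASE
WHEN total_spend__c >= 1000 THEN 'Gold'
WHEN total_spend__c >= 500 THEN 'Silver'
ELSE 'Bronze'
END AS loyalty_tier__c
FROM customers__dll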
8. Perform Union Operations
Let’s visualize the data transformation process by showing the structure and content of the tables before and after applying the UNION operation.
Before UNION: Source Table
Table: Contacts_with_multiple_phone_numbers__dll
| contactid__c | mobilenumber__c | worknumber__c | homenumber__c |
|---|---|---|---|
| 1 | 123-456-7890 | 234-567-8901 | 345-678-9012 |
| 2 | 456-789-0123 | 567-890-1234 | 678-901-2345 |
| 3 | 789-012-3456 | 890-123-4567 | 901-234-5678 |
After UNION: Target Table
Table: PhoneNumbers
| ID__c | ContactID__c | Phone_Number__c | Phone_Number_Type__c |
|---|---|---|---|
| Mobile_1 | 1 | 123-456-7890 | Mobile |
| Work_1 | 1 | 234-567-8901 | Work |
| Home_1 | 1 | 345-678-9012 | Home |
| Mobile_2 | 2 | 456-789-0123 | Mobile |
| Work_2 | 2 | 567-890-1234 | Work |
| Home_2 | 2 | 678-901-2345 | Home |
| Mobile_3 | 3 | 789-012-3456 | Mobile |
| Work_3 | 3 | 890-123-4567 | Work |
| Home_3 | 3 | 901-234-5678 | Home |
Explanation
Before UNION: The source table Contacts_with_multiple_phone_numbers__dll contains multiple phone numbers in each row, categorized by type (mobile, work, home) for each contact.
After UNION: The target table PhoneNumbers is the result of the UNION operation, where each phone number is extracted into its own row. The ID__c column uniquely identifies each phone number entry by combining the type and contact ID. The Phone_Number_Type__c column indicates the type of phone number. This transformation normalizes the data, making it easier to manage and query.
Let’s break down the data transformation process using the UNION operator to normalize phone numbers from the Contacts_with_multiple_phone_numbers__dll table into the PhoneNumbers table. Each SELECT statement extracts a different type of phone number: mobile, work, and home.
SQL Query
SELECT
CONCAT('Mobile_', Contacts_with_multiple_phone_numbers__dll.contactid__c) as ID__c,
Contacts_with_multiple_phone_numbers__dll.contactid__c as ContactID__c,
TRIM(Contacts_with_multiple_phone_numbers__dll.mobilenumber__c) as Phone_Number__c,
'Mobile' as Phone_Number_Type__c
FROM Contacts_with_multiple_phone_numbers__dll
WHERE Contacts_with_multiple_phone_numbers__dll.mobilenumber__c <> ''
UNION
SELECT
CONCAT('Work_', Contacts_with_multiple_phone_numbers__dll.contactid__c) as ID__c,
Contacts_with_multiple_phone_numbers__dll.contactid__c as ContactID__c,
TRIM(Contacts_with_multiple_phone_numbers__dll.worknumber__c) as Phone_Number__c,
'Work' as Phone_Number_Type__c
FROM Contacts_with_multiple_phone_numbers__dll
WHERE Contacts_with_multiple_phone_numbers__dll.worknumber__c <> ''
UNION
SELECT
CONCAT('Home_', Contacts_with_multiple_phone_numbers__dll.contactid__c) as ID__c,
Contacts_with_multiple_phone_numbers__dll.contactid__c as ContactID__c,
TRIM(Contacts_with_multiple_phone_numbers__dll.homenumber__c) as Phone_Number__c,
'Home' as Phone_Number_Type__c
FROM Contacts_with_multiple_phone_numbers__dll
WHERE Contacts_with_multiple_phone_numbers__dll.homenumber__c <> ''
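A note on the operator choice: UNION removes duplicate rows across the combined SELECTs, which adds a deduplication step. Since the ID__c prefixes (Mobile_, Work_, Home_) already guarantee that the three branches can never produce identical rows, UNION ALL would return the same result while skipping that step; either works here.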
Practical Tips for Leveraging Streaming Data Transform
- Clean and Prep Early: Use functions like TRIM(), ISNULL(), and TO_DATETIME() to standardize input data as it arrives.
- Optimize Efficiency: Merge similar datasets with UNION for better scalability.
- Document Complex Logic: Label custom transformations clearly for easy debugging.
- Collaborate Actively: Share SQL queries within teams to build faster, actionable workflows.
Summary Of This Article
Here’s a quick breakdown of the blog highlighting the key insights and practical tips for leveraging Streaming Data Transform in Salesforce Data Cloud. 💡
🌟 Introduction
- What is it? Streaming Data Transform enables real-time cleaning, enhancing, and analysis of data streams. It’s ideal for building smarter, real-time data pipelines.
- Why it matters: It offers unmatched accuracy, flexibility, and the ability to transform raw streaming data into actionable insights as it flows into your system.
🔑 Core Features and Benefits
- Clean Data Effortlessly: Remove null values, standardize data, and handle missing information effectively.
- Get Insights in Real-Time: Apply advanced SQL operations like conditional logic and aggregation for immediate analysis.
- Streamlined Transformation: Combine datasets, normalize structures, and optimize scalability using SQL-powered operations.
✏️ SQL Examples and Use Cases
- Extract JSON Data: Use JSON_TABLE() to parse complex JSON into structured datasets for easier querying.
- Handle Null Values: Techniques like ISNULL() and COALESCE() ensure complete datasets without gaps.
- Standardize Data: Use LOWER() or UPPER() to ensure text consistency and TO_DATETIME() to convert dates properly.
- Union Operations: Merge datasets with multiple phone numbers into normalized tables for efficient management.
⚡ Practical Tips to Maximize Streaming Data Transform
- 🧼 Clean Early: Apply functions like TRIM(), ISNULL(), and TO_DATETIME() as data arrives.
- 🔄 Merge for Scalability: Use UNION to consolidate datasets and enhance workflow efficiency.
- 📄 Document Everything: Clearly label custom transformations for debugging and team collaboration.
- 🤝 Share Knowledge: Actively collaborate by sharing SQL queries within your team to fast-track processes.
🎯 Example SQL Queries
- Assign Grades: Use the CASE WHEN statement to categorize data, like converting scores into letter grades.
- Concatenate Fields: Merge first and last names in customer datasets with the CONCAT() function for full name creation.
- Real-Time Filters: Extract key insights with queries filtering live streams, like purchases in the last hour.
🏆 Why Use Streaming Data Transform?
- It’s a game-changer for Salesforce Data Cloud users, allowing seamless, real-time data transformation.
- Helps simplify complex workflows, making data easier to analyze and use in downstream applications.
👉 Start using Streaming Data Transform today to clean, combine, and harness your data with precision and speed! 💻📊