Starter SQL Queries
If you use SQL in your BI tool, the following queries will help you to explore your Fusion warehouse and accelerate your time to analysis. In general, the queries are written in the order of least to most complexity, and include explanations and sample outputs where possible.
Nota bene: Fusion is a MySQL database, so be sure you use syntax for this database. Other databases such as Oracle and Postgres may use other built-in functions, dates and time formats, and queries which differ slightly from MySQL.
I. Understanding your Schema
The first thing you'll want to do after you've connected your Fusion warehouse to a BI tool is understand your schema, or data model. Many BI tools list your tables in a panel or grid view. For example:
However, the same view may also be returned with the following command:
This query will return all three types of tables — raw, fused, and links — in your Fusion warehouse:
SHOW TABLES; +-----------------------+ | Tables_in_e4bab23d80487eb8 | +-----------------------+ | fused_contact | | fused_activity | | fused_activity_contributors | | fused_campaign | | fused_campaign_contributors | | fused_opportunity | | fused_opportunity_contributors | | fused_opportunitycontactrole | | fused_order | | fused_order_contributors | ... | hubspot1_company | | hubspot1_deal | | hubspot1_event | ... | salesforce1_callcenter | | salesforce1_campaign | ... | links_activity_order | | links_activity_ticket | | links_campaign_company | | links_campaign_contact | | links_campaign_list | | links_campaign_opportunity | | links_campaign_order | ... +-----------------------+
Explore Your Tables
Now to understand what fields exist within a given table — raw, fused, or links — use a SELECT * statement.
SELECT * FROM fused_campaign; +-----------------------+ | name | | type | | campaign | +-----------------------+
Note that the fused tables (e.g. fused_activity, fused_company, fused_contact) always follow the naming convention fused_[object].
If you want return the actual data types, DESCRIBE is also a good command to use. For example:
Fusion also includes the raw records in your systems, using the [system][#]_[object] naming convention. To return the fields for a HubSpot instance's owner table, for example, you would write the following query:
SELECT * FROM hubspot1_owner;
In some BI tools, this query will return the fields and data types (e.g. dimensions and measures). For example:
As we'll see below, links tables are especially useful for joining tables together. Alone, they won't return anything except the ID's for a given combination of objects. This is because links tables form relationships between your SaaS applications' different objects. Every links table thus returns three fields:
SELECT * FROM links_company_opportunity; +-----------------------+ | links_company_opportunity_id | | opportunity_id | | company_id | +-----------------------+
Return Field Values
Expanding on the links table above, we can explore what's in a links table by returning the field name values themselves.
SELECT links_company_opportunity_id, company_id, opportunity_id FROM links_company_opportunity;
This will return the links_company_opportunity_id field value (a concatenation of the two object ID's in question, company and opportunity), company_id, and opportunity_id values.
As you can see, the links tables return values very similar to primary keys. But links tables are much more powerful, as they can join tables and form many-to-many relationships very quickly and with very little code.
To return values with more direct meaning, query the fields for the raw and/or fused tables. For example:
SELECT company_name, phone, industry, description FROM fused_company;
These fused tables are the special sauce that powers your analysis. They gather every major object type and organize into common groups like contacts and opportunities.
Now that we know how our Fusion warehouse is structured, we can begin analyzing our customer data. Below are just a few places to start, including lifecycle analysis, cohort analysis, account-based roll-ups, lead scoring, paid search, and more.
The fused_contact table is a good table to conduct a lifecycle analysis. Start with a simple grouping of picklist fields (e.g. country, state, status, lead_score, lifecyclestage, hubspot_score). Then get a COUNT for the lifecycle_stage field. For example:
You can make the output look nicer by defining clearer names to variables and ordering in a certain manner with the GROUP BY and ORDER BY commands. For example:
Measuring Opportunities by Deal Stage
Depending on the system you use, you might be talking about “Opportunities” or “Deals” or some other term, but all of these words indicate the same thing: a prospect who is actively engaged in the sales process. For simplicity, Fusion buckets these deal stages into the fused_opportunity table. Querying the fused_opportunity table by deal stage will let you see where a prospect or group of prospects are in your sales process and how likely they are to close as a customer based on that stage.
Some fields available include:
For example, the following query returns the deal stage, closed date, amount of each deal, and who closed the deal from the fused opportunity table for a certain fiscal quarter.
Thanks to the links tables, running joins in Fusion are much simpler and shorter to execute. One use case would be to join the contacts and opportunities table by using the links_contact_opportunity table, then grouping the results by Lead Score, Deal Stage, and Date a deal closed or was lost.
SELECT fused_contact.hubspot_score AS `Score`, fused_opportunity.deal_stage AS `Stage`, fused_opportunity.close_date AS `Date` FROM fused_opportunity JOIN links_contact_opportunity ON fused_contact.contact_id = links_contact_opportunity.contact_id WHERE fused_opportunity.amount >= '10000' GROUP BY `Score`, `Stage`, `Date`;
Taking our analysis one step further, we can join an additional table to the fused opportunities and contacts tables. One use case for this would be whether paid search leads result in prospects who found your product. NOTE: this assumes you have a product object and not just a product field on the opportunity.
SELECT fused_contact.created_at AS `Date`, fused_opportunity.stage AS `Stage`, fused_opportunity.amount AS `MRR` FROM fused_opportunity JOIN links_contact_opportunity ON fused_contact.contact_id = links_contact_opportunity.contact_id links_contact_product ON fused_product.product_id = links_contact_product.product_id WHERE fused_contact.leadsource = 'Paid Search' AND fused_contact.created_at BETWEEN '2017-01-01' AND '2017-03-31' AND fused_product.product = 'A' GROUP BY `Date`, `Stage`, `MRR`;
Cohort Analysis & Account-Based Roll-Ups
Rolling up retention by account (also called cohort analysis) is a key metric to monitor the growth of your product lines and company. The major benefit being you can calculate rolling retention against any cohort window with any behavioral condition.
To execute this type of analysis, you'll want to SELECT fused_company, if not the following tables as well:
The fields you return will obviously vary depending on your data, but they may include fields like:
These fields will help you to calculate the percentage of returning users at a regular interval, typical weekly or monthly, grouped by their sign-up week/month, also known as cohort. For this reason you can start your SQL statement with any fields that show the progression of a start date to cancelled date, whether they are active or inactive, how much recurring revenue they have produced, and their current status.
SELECT active_customer AS `Active Customer`, product_A_start_date AS `Product A Start Date`, product_B_start_date AS `Product B Start Date`, product_A_status AS `Product A Status`, product_B_status AS `Product B Status`, product_A_cancelled_date AS `Product A Cancelled Date`, product_B_cancelled_date AS `Product B Cancelled Date`, active_customer AS `Active Customer`, product_A_mrr AS `Product A MRR`, product_B_mrr AS `Product B MRR`, product_A_status AS `Product A Status`, product_A_status AS `Product B Status`, total_mrr AS `Total MRR`, annual_revenue AS `Annual Revenue` FROM fused_company;
Fields of this ilk represent the building blocks for your cohort analysis.
If you are accessing other tables, like fused_opportunity and fused_contact, make sure you first SELECT these, along with their proper fields.
SELECT fused_contact.created_at AS `Date`, fused_opportunity.stage AS `Stage`, fused_opportunity.amount AS `MRR`
Then use the links tables to complete your join.
JOIN links_company_opportunity ON fused_company.external_company_id = links_company_opportunity.external_company_id links_contact_product ON fused_product.product_id = links_contact_product.product_id
And towards the end of your cohort analysis, you will likely want to GROUP BY date, stage, MRR, and any other fields you find relevant:
GROUP BY `Date`, `Stage`, `MRR`, `Annual Revenue`, `Product A`, `Product B`;