Build Your Own LinkedIn Analytics Part 7: Dashboard Design for Insights and Impact
Published on Medium on 24 November 2025
In the previous article, we modeled our LinkedIn analytics data in the gold layer. Now it’s finally time to use that data to build a dashboard for insights and impact, and to ensure that it doesn’t end up gathering dust.
TL;DR
- We explore the different approaches to reporting on business data.
- We design and build our example dashboard.
I. Why a Dashboard?
Before we begin, it’s worth asking: why a dashboard?
There are several ways that data analytics can be delivered:
- Ad-hoc analysis: A business user may ask the data analyst for a one-off insight, or the data analyst wants to explore the data in a particular fashion. If that is the case, the data analyst will gather and process all the data themselves to arrive at a conclusion, usually in the form of an Excel sheet with embedded charts, a PDF document, or a PowerPoint presentation.
- Reports: If the business use case starts to recur with some regularity, the data analyst may need to formalize the data extraction and cleansing step by working with the data engineer to build appropriate data pipelines. The data analyst may still perform the last step manually and present the report in a manner similar to that of an ad-hoc analysis, or the data engineer may step in to generate such a report, in which case the business case needs to be well thought out.
- Dashboards: If the business user wants to explore the data themselves, or wants the data at their fingertips quickly, a dashboard connected to the data becomes the way to go. The business case definitely needs to be well thought out in this case.
Unfortunately, many reports and dashboards are built in haste, or simply because they can be built; either way, there isn’t a compelling business use case behind them. This is how one ends up with reports that no one reads and dashboards that no one uses, and how one then ends up with duplicated work as the business users build their own analytics.
In this case, we are the data engineer, data analyst and business user all in one. For the purposes of learning but also for our own self-service, a dashboard is eminently suitable.
Before we continue, it’s worth noting that Databricks has an ‘Open in a dashboard’ option whenever you have a table or view loaded in your catalog. Clicking the dropdown shows the following options:

The first option lets us create a Databricks dashboard that’s directly connected to the selected table or view, but what are these other options?
- Genie: This is Databricks’ GenAI-powered space for asking questions of your data. You’re presented with a ChatGPT-like chat interface, and from there you can pose questions about your data just as you would with any AI assistant. Genie spaces can be customized with a curated set of tables and views as well as sample queries and system instructions. However, for this project I wanted to show all the key metrics in one glance, which is the forte of a traditional dashboard, whereas Genie is designed more for conversational, ad-hoc exploration. GenAI-powered data exploration is likely the future of BI, so we will revisit Genie at the end of this blog series; meanwhile you can check out Databricks’ official documentation for more information.
- Tableau: This is one of the two major business intelligence (BI) tools available. Tableau kickstarted the modern (pre-GenAI) business intelligence experience, and today can be found as the default visualization tool in major companies. However, it is a very expensive piece of software. Tableau Cloud does have a free tier, but all your data has to be public, which is not what we want for this project.
- PowerBI: This is Microsoft’s answer to Tableau, and it has become the other major BI tool in the market. Anyone familiar with Excel will feel quite at home with PowerBI, and integrating it with existing Microsoft subscriptions or setups (e.g. Active Directory — I mean, Microsoft Entra) is straightforward. To top it off, PowerBI workspaces are now part of the larger Microsoft Fabric ecosystem of data infrastructure, so in theory you could do this entire stack within the Microsoft environment. That said though, there is no free tier for Microsoft Fabric (trial periods don’t count), which is why it wasn’t chosen for this project.
If your organization already has Tableau or PowerBI/Fabric dashboards, Databricks makes it easy to integrate with these workflows. For our purposes, we will stick with Databricks’ native dashboard capabilities.
Third-party LinkedIn analytics products will already have their own, opinionated versions of dashboards. As noted in the first post, these dashboards are overkill for our purposes, especially since they are designed to appeal to the wide range of users for their products. Building a customized dashboard like what we are doing results in a product tailored to our specific needs, in exchange for needing to build and maintain the solution.
II. Expected Outcome
So then let’s refresh what our expected outcome is. My stated goal at the start of the series was to better understand my LinkedIn statistics, but what does that mean exactly?
When we chose our data source, we already did some analysis of what useful data was available, and by the modelling stage we had further refined what our data is able to do. To recap:
- We have daily records for impressions and engagements on a per post level.
- We have daily records for follower changes on a profile level.
- We have metadata relating to each post’s content and creation timestamp / date.
Based on this, we can look at the following:
a. Historical time-series analysis for impressions and engagements
There are many ways we can approach this; below is a non-exhaustive list of possibilities.
- Analysis of trends by weekday. With enough data, this can be expanded to analysis of trends by the month or quarter.
- Analysis of post performance based on days after posting.
- Identification and analysis of outliers.
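To make the first two bullets concrete, here is a minimal Python sketch of weekday averaging and simple outlier flagging. The sample numbers are hypothetical, purely for illustration; the real values would come from the gold-layer fact tables.

```python
from collections import defaultdict
from datetime import date
from statistics import mean, pstdev

# Hypothetical daily profile-level totals: (analytics_date, impressions)
daily = [
    (date(2025, 11, 3), 120),   # Mon
    (date(2025, 11, 4), 95),    # Tue
    (date(2025, 11, 5), 110),   # Wed
    (date(2025, 11, 10), 130),  # Mon
    (date(2025, 11, 11), 480),  # Tue (a spike)
    (date(2025, 11, 12), 105),  # Wed
]

# Trend by weekday: average impressions per weekday (0 = Monday)
by_weekday = defaultdict(list)
for d, imp in daily:
    by_weekday[d.weekday()].append(imp)
weekday_avg = {wd: mean(vals) for wd, vals in by_weekday.items()}

# Naive outlier rule: days more than 2 standard deviations above the mean
values = [imp for _, imp in daily]
mu, sigma = mean(values), pstdev(values)
outliers = [(d, imp) for d, imp in daily if imp > mu + 2 * sigma]
```

With this data, `weekday_avg` picks up the Monday and Wednesday baselines, and only the Tuesday spike is flagged as an outlier; any real analysis would of course tune the rule and the window.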
b. Historical time-series analysis for follower count
We can do analysis of trends as well as of outliers, similar to that for impressions and engagements, but we can also explore the relationship between changes in impressions, engagements, number of posts, etc. and the change in follower count.
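One way to quantify such a relationship is a correlation coefficient. Here is a sketch with a hand-rolled Pearson correlation over hypothetical daily series (the numbers are invented; real series would come from the gold tables):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Hypothetical daily series: profile-level impressions vs net follower change
impressions     = [100, 150, 90, 300, 120, 250]
follower_change = [  1,   2,  0,   5,   1,   4]

r = pearson(impressions, follower_change)  # close to +1 for this toy data
```

A value of `r` near +1 would suggest that high-impression days coincide with follower gains; with real data you would also want to check lagged relationships, since follower growth may trail reach by a day or two.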
c. Contextual enrichment from post metadata
With the post metadata, we can do the following and more:
- See the context of the currently selected post on a dashboard, e.g. title and content.
- Segment posts by posting time (binned by the hour or by broader categories such as morning/afternoon/evening/night), type of post, or some other relevant post metadata.
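For instance, binning by broad time-of-day categories can be as simple as the following Python sketch. The boundary hours are my own assumption for illustration, not something dictated by the data:

```python
def time_of_day(hour: int) -> str:
    """Bin a posting hour (0-23) into a broad time-of-day category.
    Boundaries are an arbitrary choice: morning 5-11, afternoon 12-16,
    evening 17-20, night otherwise."""
    if 5 <= hour < 12:
        return "morning"
    if 12 <= hour < 17:
        return "afternoon"
    if 17 <= hour < 21:
        return "evening"
    return "night"
```

The same binning could equally be expressed as a `CASE WHEN` in the dataset SQL, which is where it would normally live for a dashboard.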
There are many other ways to approach this analysis, so much so that it could be an entire blog series by itself. We’ll save that for the future, but feel free to give your own suggestions of what metrics can be considered!
Now that we’ve clarified our analytical objectives, it’s time to build a dashboard that brings these ideas into focus.
III. Initial Setup
In the Databricks menu on the left side, click on the ‘Dashboards’ option in the ‘SQL’ section.

Straight away, you’ll see that Databricks has linked two sample dashboards for you to explore at your leisure. We’ll ignore them and click on the ‘Create dashboard’ button on the top right corner.

Welcome to the fresh Databricks dashboard experience. What I like about this initial experience is how all the different options are laid out for you with helpful tooltips and indicative arrows.

Let’s first rename the dashboard by clicking on the text on the top left corner and replacing it with our own title. In this case I’ve renamed it to ‘LinkedIn Statistics’.

Then we will need to pull in data. In our case the relevant options are ‘Create from SQL’ and ‘Add data source’; ‘Upload file’ will load a compatible file format into a Databricks table and then use that as a data source, which would be useful for ad-hoc exploration or for adding external static reference tables that act as supplementary fact tables. (We discussed fact tables in our earlier post on modelling the data.)

‘Add data source’ lets you choose whichever table or view you wish, but ultimately what it actually does is generate a ‘SELECT *’ statement on that table or view.

Here is the list of data sources that I’ve defined:

We have all 4 fact and dimension tables from the gold layer, but the other data source needs some additional explanation:
- enriched_daily_posts is a join of fct_daily_post_statistics with dim_date using ‘analytics_date’ as the join key. This enriches our post data with the date-related metadata and is a classic use of a dimension table to enrich a fact table.
Here’s the SQL code for enriched_daily_posts for reference:
WITH avg_stats_by_weekday AS (
    SELECT
        AVG(impressions) AS avg_impressions_by_weekday,
        AVG(engagements) AS avg_engagements_by_weekday,
        weekday
    FROM gold.linkedin.fct_daily_post_statistics daily_stats
    LEFT JOIN gold.linkedin.dim_date dim_date
        ON daily_stats.analytics_date = dim_date.date
    WHERE post_id = 'total'
    GROUP BY weekday
)
SELECT
    post_id,
    post_publish_date,
    post_publish_timestamp,
    impressions,
    engagements,
    avg_impressions_by_weekday,
    avg_engagements_by_weekday,
    CAST(analytics_date AS DATE) AS analytics_date,
    day AS analytics_day,
    month AS analytics_month,
    year AS analytics_year,
    week AS analytics_week,
    CONCAT(
        dim_date.weekday, ' ', dim_date.weekday_str
    ) AS analytics_weekday,
    is_weekend AS analytics_is_weekend,
    days_since_post,
    COALESCE(change_in_impressions, 0) AS change_in_impressions,
    COALESCE(change_in_engagements, 0) AS change_in_engagements,
    title,
    content
FROM gold.linkedin.fct_daily_post_statistics daily_stats
LEFT JOIN gold.linkedin.dim_date dim_date
    ON daily_stats.analytics_date = dim_date.date
LEFT JOIN avg_stats_by_weekday
    ON dim_date.weekday = avg_stats_by_weekday.weekday

Three additional transformations are happening here:
- analytics_date and post_publish_date are being cast as DATE to prevent the vestigial timestamps from being displayed on the dashboard. This is again a sign that the gold or even the silver source table may need some refactoring, but needing to cast a timestamp to a date is honestly a minor inconvenience in the real world.
- avg_engagements_by_weekday and avg_impressions_by_weekday are calculated via a CTE. Again, these are values that could also be calculated in the gold or silver layer; in this case, the potential amount of data to be processed as the range of analytic dates grows makes this a priority to refactor into a more efficient pipeline after the POC.
- The numerical value of weekday is being prepended to the name of the weekday, for instance ‘0 Mon’ or ‘1 Tue’. This is being done to preserve the sorting of the weekdays in the dashboard graphs while still being informative about exactly what weekday is being referred to. This type of massaging is fairly typical when preparing fields in datasets for display in dashboards.
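The same massaging trick can be sketched outside SQL. In Python, prepending the numeric weekday gives labels whose lexicographic sort matches the chronological order (abbreviations hardcoded here for illustration):

```python
# 0 = Monday, matching the dim_date weekday convention described above
WEEKDAY_ABBR = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]

def weekday_label(weekday: int) -> str:
    """Prepend the numeric weekday so that sorting the labels as strings
    preserves Monday-first ordering, e.g. '0 Mon', '1 Tue', ..."""
    return f"{weekday} {WEEKDAY_ABBR[weekday]}"

labels = [weekday_label(wd) for wd in range(7)]
# sorted(labels) == labels: string sort now equals weekday sort
```

Without the numeric prefix, a dashboard sorting labels alphabetically would put ‘Fri’ before ‘Mon’, which is exactly what this trick avoids.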
Now that our data sources are prepared, it’s time to decide what our dashboard should be showing, and how it should show it.
Ideally the design of the dashboard comes about from deep discussions with the business stakeholder and engagement with the underlying data, and would usually involve multiple iterations and evolve along with the needs of the business case. Even if you are both the developer and the user, the dashboard is still likely to evolve over time as you encounter limitations, discover new use cases and also as the data evolves.
IV. Creating the Pages
Within each page, there are three main types of panels that can be created:
- Visualization: these are visuals associated with underlying datasets, ranging from graphs to tables.
- Text box: these hold static text, usually to elaborate on certain parts of a dashboard page or the dashboard page as a whole.
- Filter (field/parameter): these act to dynamically filter one or more datasets associated with the page. This concept is similar to slicers in Microsoft Excel and in Tableau.
All three options are listed from left to right in the toolbar shown at the bottom of the dashboard page.

Mixing these panel types strategically creates dashboards that are both visually engaging and deeply informative.
Every panel (or widget) will have its own settings when selected, visible on the right. Below are two examples for a filter panel and a visualization panel respectively:


Dashboard design (and business intelligence in general) is a whole discipline by itself, best covered in a dedicated series. Meanwhile, Databricks has extensive documentation on dashboards that includes best practices and advanced functionalities; you can also find best practices in the PowerBI and Tableau documentation.
For now, let’s focus on our specific use case as a practical example.
a. Summary page
The summary page is where all the important information should be front and centre. This is the page that the business user first sees, including in any email alerts sent out when the dashboard updates. So only relevant (ideally actionable) information should be on this page, even more so than for the other pages.
Below is the visual and the configurations used for the summary page.


Some things to note:
- I chose Combo (line and bar) to visually separate impressions and engagement numbers while still keeping them in the same visualization.
- The Value and Target options for Counter are meant to show what the value currently is versus its target value, displayed in a smaller font. I instead use them to show maximum and minimum values for the same metric. As long as I am consistent with this usage, and coupled with the descriptive title, there should be little chance of confusion.
- I used a gradient colour to distinguish the secondary metric for the ranked daily posts so that the numbers that make up each day follow a visual progression from light to dark.
- The focus is on impression numbers first and foremost. This is because engagement numbers are currently too low to be of much use, and I am more concerned about the overall reach of my posts, and by extension my profile.
b. By post page
The by post page is for deep diving into the details of a particular post, for instance to suss out what makes the post perform better than another post. Below is the visual and the configurations used:


Some things to note:
- This is an example of a filter (Days since post) that only filters part of the page, depending on which datasets the filter applies to and which dataset each visualization draws from. The summary page has this too, but the end result is less obvious there.
- I use the Table visualization to show the metadata for a specific post; I put content in its own panel so that it doesn’t clutter the other metadata.
- Setting the filter for days_since_post to be between 0 and 0 (i.e. to be equal to 0) is a proxy for showing only results for the first day of the post. In this case it’s used to select a single post for the purposes of extracting the post metadata.
c. By day page
The by day page, similar to the by post page, is for deep diving into the statistics for a particular day, including what was posted on that day and the individual daily post statistics. Here’s what the visual and configurations look like:


Some things to note:
- Here we see an alternative way to show impressions and engagements side by side: a bar graph with the group layout (rather than the stack layout). This is ideal for categorical visualizations, as opposed to the time-series visualizations on the prior pages.
- The chart for Posts on this day is deliberately small, as I do not intend to post multiple times in a day (at most twice).
d. Historical page
The historical page is for a bird’s-eye view of everything that has gone on in the LinkedIn profile. Here’s the visual and associated configurations:


Things to note:
- Historical post statistics is the only visualization that shows data reaching back more than a year. This serves as a visual illustration of the change in impressions and engagements ever since I started my LinkedIn content strategy.
- days_since_post is limited to 3 by default; this captures the bulk of the historical impressions for the great majority of posts.
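As a quick illustration of why a small cutoff can capture most impressions, here is a sketch with hypothetical per-day impression changes for a single post (the real distribution would come from the change_in_impressions field):

```python
# Hypothetical change_in_impressions keyed by days_since_post for one post;
# most reach typically arrives in the first few days after publishing
change_by_day = {0: 400, 1: 250, 2: 120, 3: 60, 4: 20, 5: 10, 6: 5}

total = sum(change_by_day.values())
cumulative = 0
share_by_cutoff = {}
for day in sorted(change_by_day):
    cumulative += change_by_day[day]
    share_by_cutoff[day] = cumulative / total
# share_by_cutoff[3] is the fraction of impressions captured when the
# dashboard limits days_since_post to 3
```

With this (invented) decay curve, a cutoff of 3 already captures over 90% of the post’s impressions, which is the intuition behind the default filter value.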
V. What’s Next?
We have now successfully constructed a data product from end to end, except none of this is scheduled. Automated orchestration and robust scheduling are crucial for true enterprise analytics, and that is the subject of our next post. Stay tuned!