A semantic model is a layer that provides meaning to data (semantic) by using objects, calculations, and logic (model) to represent a real-world process, like sales or pricing. A semantic model is a core component of Microsoft Fabric and Power BI; it’s one of the most important item types you manage in an end-to-end Fabric or Power BI solution as this is the primary way your end users interact with the data. It’s therefore important for you to know how to effectively design, build, and deploy good semantic models efficiently to meet business data needs.
In this series, we provide tips for how to effectively build a semantic model. In the previous article, we described key considerations to gather requirements for your semantic model.
In this article, we describe how you can connect and transform data when you build your model in Tabular Editor or Power BI Desktop.
There are many equally valid ways to build a semantic model that fits your data needs. This series presents one approach, sharing tips and good practices from our experience.
Before you start to build a semantic model, you should first evaluate whether the data is ready for modelling. Your objective here is to identify significant issues or challenges in the data that could cause problems and cost time once you proceed with the actual modelling.
To do this assessment, you should connect to and explore the data in a query tool, notebook, or simply opening and browsing the file (i.e. if it’s an Excel file or CSV). Depending on what you see, you may need to spend time transforming your data so you can make the best model from it.
The following points are examples of warning signs that your data might not yet be ready to begin building your semantic model:
Forecast table.
Sales table where 'Sales'[Customer Key] is null.Sales table has a 'Sales'[Net Value] of 3 million USD.Orders table has a 'Orders'[Delivery Date] before the 'Orders'[Order Creation Date], or the 'Orders'[Order Creation Date] is null.Sales table you need to do this.If you start to build your model in spite of data issues like those mentioned above, you risk lost time and resources due to issues, changes, and assumptions. Poor quality data will result in a poor quality model that delivers little value.
If your data isn’t ready, you can still create a model wireframe to define the expected or desired data structure. You can also begin with a prototype or proof-of-concept with preliminary data that’s not yet fully complete or in the ideal shape. While this can improve agility and give you early feedback, you should take steps to avoid this temporary solution being perceived as a reliable source for business decision-making. Prototypes and proof-of-concepts shouldn’t be deployed directly into production, as they typically haven’t reached acceptable quality or maturity thresholds.
It’s common that your data isn’t in the shape you need for your semantic model. Rarely – if ever – do you proceed without adjusting, or even altering the model design. Once you’ve identified any potential data issues, you should take proactive, follow-up actions.
After exploring the data, you should describe any issues you see with examples. Include the query you used to identify the issue, and the unique identifier from an example transaction. This helps you and others investigate, fix, and validate the issue. If feasible, you should also indicate why the issue is a problem, and what the potential impact is for the model and the overall solution.
If you aren't the person who maintains the data source, then discuss these issues with this team or individual. For instance, if a separate team is maintaining the data warehouse and you're building the semantic model, they need to know this information, so that they can investigate data quality issues and make adjustments.
In some organizations, the teams who deliver data from back-end systems (like data warehouses and lakehouses) are different than the teams who create semantic models and reports from that data. If this is the case, make sure that the people delivering data understand what’s needed by Power BI for a good semantic model. Otherwise, assumptions will lead to a lot of wasted time and resources trying to get the data in the right shape to report it out. Typically, this results with data in back-end systems that “isn’t what we need”, resulting in extracts on SharePoint, and lengthy Power Query transformations that make models hard to optimize and maintain. This is particularly important when performing migrations to Fabric and Power BI.
You may need to (re-)align with business stakeholders. This is necessary mainly for data provided or maintained by the business, like a budget file. Meet with the business to discuss the issues, but ensure you have concrete examples and questions prepared. Don’t get too technical; just ensure that they can give you the clarity you need to proceed.
This is important to manage stakeholder expectations. Most business stakeholders don’t understand what data is available or how it’s shaped, or even the effort required to go from raw data to an elegant report.
It’s possible that you may need to revisit the requirements or alter the solution design based upon your findings. Ideally, big issues are found during the requirements gathering process, but this isn’t always feasible.
If you’ve identified issues with the quality of the data, this should be prioritized. Do what you can to resolve these issues, but be aware that some perceived data quality issues can actually reflect abnormalities in the underlying business process. Raise these issues to the appropriate people to understand if issues that you perceive as data quality are due to business process flaws or exceptions.
Here are some examples of issues you might perceive as being a data quality problem, but actually reflect a business process exception or flaw.
Price Changes table, the 'Price Change'[Expiry Date] could be 01/01/2099.
Orders table, the 'Orders'[Creation Date] is after the 'Orders'[Requested Delivery Date].
Sales table, the 'Sales'[Billing Date] is always the end of the month, even though the business want to see sales by workday in the month.
'Sales'[Delivery Date], instead.Data quality is a broad (and contentious) topic. In general, you should ensure that you’ve dealt with any detectable data quality issues before you proceed, otherwise you may make suboptimal transformations and produce a suboptimal model.
Uncovering, understanding, and discussing data anomalies with business stakeholders can be an opportunity to improve not only the data, but also the underlying business processes. This can greatly increase the value and impact of your work.
If the issues don't pertain to data quality, you or someone else who maintains the upstream data source will need to transform the data so that it fits your needs. We discuss data transformation in detail in the next article of this series.
Once you’ve gathered the requirements for your semantic model, you should ensure that your data is model-ready. To do this, you explore the data with an appropriate tool and identify, document, and communicate issues that you find. Generally, your objective is to resolve data quality issues, and prepare to transform your data in the next step.
In the next article of this series, we’ll share how you can transform your data for a semantic model. One way to do this is by using Power Query in the semantic model itself, and we’ll explain how you can best view and edit Power Query from within Tabular Editor.
Take your semantic models further with Tabular Editor.
Give Tabular Editor a spin