Data model types, examples, and tips for Power BI (Part 1)

Written by Kurt Buhler | Oct 9, 2025 7:00:00 AM

Key takeaways

Decide storage mode early: How you store your data in Power BI is a crucial decision to make early in a project.
The wrong choice is costly: Getting it wrong can have serious consequences, and it's a common mistake in teams and projects.
Know your options: Storage modes include Import, DirectQuery, Direct Lake, and Composite models, plus special configurations like dual and hybrid tables.
Import is the default: Import should be your preferred storage mode; use the others only when your scenario calls for it.
Validate before you commit: Test the function, performance, and cost consequences of a storage mode with a proof-of-concept before deciding.

This summary is produced by the author, and not by AI.

Decide between storage modes

In Power BI, you need to make a semantic model (also called a data model) to support analyses with reports or dashboards, pivot tables, and AI. One of the first decisions that you make when you connect to data in Power BI is how you will store it. By default, Power BI will import that data in memory. This means that Power BI loads a copy of the data that you query from report visuals (which generate DAX queries behind-the-scenes). This import storage mode is the simplest and most convenient way to connect to data, but it isn't the only option. Choosing between storage mode types is one of the earliest and most important decisions that you might make in a Power BI project, and it can have significant effects on your model design or functionality.

There are several different storage modes for Power BI, which are summarized broadly and in terms of capabilities in the Microsoft documentation. Since these storage modes are determined by table, you can also combine them in a single model in some scenarios, called a composite model. We discuss composite models later in this article.

The following diagram gives you a brief overview of the different storage mode types:

In this article, we describe each of these storage modes in brief and provide some tips or examples on how to best choose between and use them in a Power BI project. In short:

Import is the preferred default suitable for most scenarios where you load data in memory.
DirectQuery is when you query data in a supported source system directly, and it’s generally considered slower and more expensive. However, you have support for near-real time data if you need it, so it can be suitable in certain situations if you can’t use Direct Lake. DirectQuery is also useful to combine data from an existing semantic model with other data in a composite model.
Direct Lake is only available in Microsoft Fabric. Column data is loaded quickly and on-demand from OneLake into memory, rather than loading all table columns in a model refresh, like with import. Direct Lake is thus a technical optimization of import with comparable query performance. It’s suitable when you have higher data volumes or other challenges to refresh performance, and the process maturity to set up and manage Delta tables in OneLake. You should generally use Direct Lake over OneLake rather than Direct Lake over SQL Endpoint, but note that as of October 2025, Direct Lake over OneLake is still in public preview. Also, Direct Lake is preferential over DirectQuery for most scenarios. Direct Lake is more technically complex and nuanced than the other storage modes.
Composite models are when you combine multiple storage modes in the same model. They are useful in specific situations to solve specific problems, but add complexity in your model design, development, and distribution for a variety of reasons.
Special table configurations like dual or hybrid tables are useful only in specific situations when you need them in composite models or incremental refresh, respectively. Hybrid tables aren’t supported in Pro workspaces.

It is important that you consider the storage mode carefully; do not make the decision about what storage mode to use frivolously, or it will cause you a lot of pain (and potentially a lot of money) down the road. The following sections go into some more detail.

Import

Import is the most common and simplest storage mode, preferred for the majority of scenarios. It’s supported by all data sources and file types. With an import storage mode, you get the best performance, because data is stored in memory. However, when you want to update your data model to include the latest data, then you need to perform a refresh. For larger data volumes or complexity, this refresh can become a tax for some teams, where refreshes become long or put strain on source systems. However, in many cases, this can be avoided by following recommended practices, such as:

Data reduction, which means that you only take the data you need and filter or remove the data that you do not need.
Setting up incremental refresh to only get new or changed rows in a table, rather than trying to refresh the entire table at once.
Power Query optimization, like ensuring that queries fold to the source (so that the data source does heavy lifting, and not Power BI).
Proper management of processes, like avoiding refreshing multiple models at once on the same data source.
Proper management of resources, like configuration of data gateways that you need to communicate with data in certain data sources or scenarios.

Import storage mode is appropriate in scenarios like the following:

You are creating a standard Power BI data model that doesn’t realistically have any specific requirements for your data source connections, data volumes, freshness, or complexity. This is the case for most Power BI data models.
You are conducting a limited analysis of local files (like Excel or .csv files) in Power BI Desktop. These files could be on your local computer, but if you want to set up a scheduled refresh to automatically get the latest data, you’d need to put those files on a cloud location, such as SharePoint or OneDrive, or use an on-premises data gateway with access to those files.
You have a requirement or desire to create calculated columns or calculated tables, which are generally not supported in DirectQuery or Direct Lake (described later).
You are new to or early in your Power BI implementation, and don’t have specific requirements that mandate the use of DirectQuery or Direct Lake (described later), which have additional complexity, limitations, and considerations for you to manage.

In general, Import should be considered the default and preferred way to connect to and store data in Power BI. Other approaches should be considered when you have an explicit reason to use them, and a benefit (over cost) for doing so.

DirectQuery

DirectQuery is an alternate storage mode where you query data directly in a supported data source (such as an Azure SQL database, Snowflake, or Databricks). With import, visuals query in-memory data with DAX. With DirectQuery, there’s an additional step where that DAX is translated into SQL and sent to the source system.

This means that when a user first views a report that’s connected to a semantic model in DirectQuery mode, they see relatively recent data as of the last page refresh or interaction. In contrast, if the semantic model were in import, then the data would only reflect the situation as of the latest semantic model refresh. As such, DirectQuery might be appropriate for some situations where you need near-real time reporting, or higher demands for data freshness.

However, since the data isn't stored in memory, it means that queries usually take longer to complete when you use DirectQuery. That means that your reports take longer to show data, and you have a higher indirect cost of optimizing data models and calculations. You have to consider special configuration and optimizations that pertain only to DirectQuery models, some of which are exclusive to PPU, Premium, or Fabric capacities, like automatic aggregations and user-defined aggregations. Furthermore, querying your data source system can have a direct cost, so higher usage of reports can lead to higher bills. An example of this is would be higher costs from a consumption-based data warehouse such as Snowflake or Databricks.

Direct Query might be appropriate in scenarios like the following:

You have an explicit, real need for near real-time data or a high demand for data freshness which you cannot meet using import storage mode, and you will not address with the Real-Time Intelligence workload in Microsoft Fabric.
You want to reuse the data security defined in your data source for your semantic model queries, and there’s an explicit, measurable benefit to doing so (beyond dogma and preference). Remember that you can simply redefine those security rules as row-level and object-level security in the semantic model, too, which is quite straightforward.
You intend to use DirectQuery for certain tables in a composite model (described later in this article).

In general, Direct Query is considered an expensive and poorly-performing approach that is rarely needed over import. If you have higher demands for data freshness, then you might want to just consider more frequent refreshes of your data model. If you have Microsoft Fabric, then you should consider Direct Lake storage mode as an alternative to Direct Query, which provides comparable data freshness, but superior performance.

Direct Lake

If you have Microsoft Fabric, then you can configure a semantic model to use Direct Lake storage mode. This is a more complex and technical storage mode to explain; we will only cover the basics, here. Direct Lake only works with data that you’ve stored in tables (or in table shortcuts) in OneLake, the storage layer for your data in Fabric. Specifically, if you have tables in a Fabric lakehouse, data warehouse, SQL database, or mirrored database, then you can create a Direct Lake semantic model that can query these tables directly without loading them into the semantic model.

What makes Direct Lake unique is that you experience better performance than DirectQuery, while also supporting demands for higher data freshness. Refreshes with Direct Lake involves a metadata operation called framing which updates the point-in-time reference the Delta log. Then, column data is only loaded into memory (or transcoded) as it’s needed. Again, we’re simplifying the process for this overview article; the key benefit is that you don’t deal with semantic model refresh times. Instead, all data transformation for tables happens upstream with tools like notebooks or pipelines that can do these things more efficiently.

A key difference between import, DirectQuery, and Direct Lake is that Direct Lake is only available on workspaces that use a Fabric capacity license mode. If you don’t have Microsoft Fabric, then you can’t use Direct Lake.

NOTE

You will notice in the Microsoft Documentation that they refer to multiple different types of Direct Lake:

Direct Lake over OneLake: This is the “new” version of Direct Lake that supports additional capabilities and options, including composite models. You create a Direct Lake over OneLake model from Power BI Desktop and Tabular Editor, and not in the Fabric Portal. This is the preferred option if you plan to use Direct Lake for a semantic model in Fabric.
Direct Lake over SQL Endpoint: This is the “old” version of Direct Lake that had additional modelling restrictions and behavior compared to Direct Lake over OneLake. It exhibits a special behavior where it “falls back” to DirectQuery (rather than Direct Lake) in certain circumstances. As of October 2025, you create a Direct Lake over SQL Endpoint model whenever you make it in the Fabric Portal from the lakehouse view. You should only use this if you have an explicit reason for DirectQuery fallback. Otherwise, you should always use Direct Lake over OneLake.
Direct Lake default semantic models: These are semantic models that would be automatically created whenever you made a new lakehouse item in a workspace. They exhibited specific behavior and were inconvenient to work with. They are now decommissioned.

Direct Lake might be appropriate in scenarios like the following:

You are dealing with data that has high volumes and complexity, and are stored in Fabric or available in Fabric via OneLake shortcuts.
You are in an enterprise environment distributing semantic models and reports with a wide or broad scope.
You have data with many different columns that are necessary, but infrequently accessed by downstream user or AI queries.
You are using DirectQuery in a Fabric environment. In general, Direct Lake provides many advantages over DirectQuery and is a direct improvement (pun intended) that might help you obtain better performance and reduce cost.

A key consideration for Direct Lake is the heightened complexity when compared to import or even DirectQuery storage modes (compounded by the existence of two different types of Direct Lake). With Direct Lake, you need to consider advanced technical concepts like the partitioning and management of Delta Tables in OneLake, which must be managed programmatically using a notebook. Query performance on Direct Lake semantic models involves many specific considerations that don’t apply to other storage modes. Furthermore, adding columns or tables requires changes to the underlying delta tables, which is more complex for the average Power BI user to do. As such, Direct Lake might be inappropriate for non-technical teams (like business teams that perform decentralized analytics) or teams early in their Power BI implementation journey that lack the technical skills to manage a Direct Lake model (and therefore the underlying Delta tables) effectively.

For instance, you should not switch to DirectQuery or Direct Lake just because you are having difficulties with the refresh of your import semantic model. First, you should try to diagnose and optimize this refresh, and understand why it’s slow. For many models, slow refreshes can be made faster with a few optimization tweaks, which is a lot less time, effort, and cost than migrating to a different (and more complex) storage mode.

However, over time, it is likely that Microsoft will introduce tools and user interface changes that streamline or reduce complexity for the average Power BI user, making Direct Lake more accessible and easier to use.

Composite (mixing tables with different storage modes)

Storage modes are determined at the level of table partitions. So, it’s possible that you might want to configure some tables to be imported, while other tables from different data sources are queried directly. For instance, you could have a model that has Sales data as well as Product and Customer dimensions imported, but then has an InventorySnapshots table from a different database that uses Direct Query storage mode. A model that contains data from different data sources and which have different storage modes is called a composite model.

NOTE

There is some confusion about the term “composite model”. Many people refer to composite model only as one that only combines Import and DirectQuery tables, or that only combines tables from a published Power BI model with another data source. “Composite” as a term simply just means that there are multiple storage modes in the same data model. Both examples are composite models, but a composite model could also combine Import and Direct Lake tables, DirectQuery and Direct Lake tables… irrespective of what the data source is (given certain source- or scenario-specific constraints). Composite models might be suitable in scenarios like the following:

You have data that needs to be infrequently accessed, such as older, historical transactions in a separate fact table or decommissioned products from the catalogue in a separate dimension table. Having these tables available in Direct Query can support niche, ad hoc cases when people need this data, but they pay a performance tax to obtain and use it. The benefit is that the data isn't stored in memory, which can save space and improve the experience of querying the more relevant, up-to-date information… and make it cheaper to query (since it’s imported into memory).
You want to extend or enhance an existing Power BI data model with additional data tables. An example might be a local sales team extending the regional Sales model by adding their own forecasts or customer mapping from Excel files. In this scenario, they query the existing Power BI model using Direct Query, but can import additional data from Excel files or other sources.

In general, composite models are a specific solution to specific problems. Using a composite model adds complexity and creates additional considerations for model design, functionality, and performance that you wouldn’t otherwise have. Always try to use the simplest solution feasible for your scenario. For instance, you might not want to use a composite model in the scenarios like the following:

Choosing or changing a storage mode for dogmatic or frivolous reasons, or to overcome certain limitations or pain points of your existing model design. An example could be storing tables like Customer, Product, or Date (typically conformed dimension tables) that you store in one published model, and then combine in other models, just to avoid duplicating data by importing it in different semantic models. Duplicating the data is definitely going to be less of a problem than issues you’ll have with this model design.
Recreating a large fact table that uses Import storage mode such that it instead uses DirectQuery storage mode, because it is taking a long time to refresh. You first want to try to optimize any transformations, reduce what you’re importing, or implement incremental refresh before considering a different storage mode to solve a refresh problem. Changing the storage mode should be one of the last options, not the first.

If you do plan to use composite models, prepare that you will need to spend extra time and effort to educate yourself about the consequences for how you build, manage, and distribute your model. Regarding performance, SQLBI also has a specific white paper and other free training material about this topic.

The following two sections describe some configurations that are specific to tables, and not data models as a whole.

Dual partition storage mode

When you change a DirectQuery table to import in a composite model, you can end up with limited relationships. Limited relationships negatively affect model performance, because joins can’t propagate to the source system. To elaborate, detailed data must be transferred from the DirectQuery source to the VertiPaq engine, then the join is done in memory on data without an index (which is slow). To avoid this, Power BI lets you configure the storage mode to dual for the partitions of certain related tables.

Note that Power BI Desktop exposes this (and storage modes in general) at the table level. Technically, however, storage modes are properties of partitions. Except for hybrid tables, all partitions in a table should share the same storage mode.

A scenario when this happens could be a composite model that has fact tables with mixed storage modes (like a Sales table that is imported, and a HistoricalSales table that is DirectQuery). In this scenario, you configure dual storage mode for your dimensions. In a nutshell, dual storage mode means that the table will use import, Direct Lake, or DirectQuery depending on the query.

In summary, you just should be aware of dual table storage mode when you’ll use composite models. Depending on your model design, they might be necessary.

Hybrid table configuration

Earlier, we discussed the concept of incremental refresh in an import model. With incremental refresh, you can configure a table to only refresh new or changing rows. This can speed up refresh times quite dramatically. How this works is the table is broken up into “chunks” called partitions (usually based on a date field) and only certain partitions (usually the most recent) are refreshed.

However, it is also possible to configure a table such that some partitions are imported, while the most recent one is configured to use DirectQuery. This hybrid table configuration supports scenarios where you have demands for higher data freshness, but that only pertains to the most recent period (like today or this week), while other periods are refreshed (like this month) and the other historical data are static; archived. In this way, we mitigate performance problems inherent to DirectQuery by only querying a very small amount of data that is relevant in a particular freshness window.

Realistically, you should only consider hybrid tables over traditional incremental refresh if this data freshness requirement is very real and explicit. For the vast majority of scenarios, this isn't necessary; standard incremental refresh will be sufficient.

A final note is that hybrid tables are a premium feature. This means that you cannot use hybrid tables in semantic models published to a Pro workspace. You can only use them if the workspace has Premium-Per User, Premium capacity, or Fabric capacity license modes.

For further reading

Data model types, examples, and tips for Power BI (Part 2) (Tabular Editor). The follow-up article covers model designs, including star schema, snowflake, and one big table, and when each structure suits the storage mode you chose here.
Composite models in Power BI and Fabric (Tabular Editor). Explores the use cases and considerations for mixing storage modes within a single model, extending the composite mode overview in this article.
Fabric lake-based storage overview (Microsoft). Microsoft's official overview of the lake-based storage mode, including how it differs from import and source-query modes and its current limitations.
Storage mode in Power BI Desktop (Microsoft). Covers table storage modes, dual storage mode propagation, and the model-level implications of mixing modes.

In conclusion

Choosing a storage mode is a very important decision that you make early-on in a Power BI project. Making the wrong decision can have disastrous consequences, and it isn't uncommon to see teams and projects that chose wrongly because of assumptions or misunderstandings in how these storage modes work and what their benefits are. Realistically, import storage mode is the preferred option for the vast majority of scenarios. In some situations, you might want (or need) to use DirectQuery, Direct Lake, or composite models, but doing so means that you must spend some extra time and effort to adequately manage the potential complexity or limitations that might come with these choices.

Before you decide, it is worthwhile to conduct a small test yourself and identify what the consequences are in terms of function, performance, and cost.

Take your semantic models further with Tabular Editor.

Give Tabular Editor a spin

Plagiarism-freeScanned on June 25, 2026 Written by human, not AIScanned on June 25, 2026

View full post