r/MicrosoftFabric

50 GB worth of Excel files, how to load?

Hi,

I got a task where I receive hundreds of Excel files, each 700-800 MB in size. I cannot influence what I get, so I am stuck with these files.

Things I've tried so far, starting with 6 files (4.5 GB total):

- Notebook (Python) - One file takes 30 min; with all 6 files it times out.

- Copy job - I get a message that the file is too big for it :(

- Dataflow - All 6 files take 24 min, so to prevent a timeout I will probably need to build a few of them and then orchestrate them in a pipeline.

Any suggestions on how to deal with this monster? Anything I am missing here? For now I am trying to put them in one table in a lakehouse for further Dataflow processing.
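
One way to sketch the "build a few and orchestrate" idea: split the file list into batches that each stay under a size budget, then run one notebook or Dataflow per batch from the pipeline. This is only a minimal illustration; `plan_batches`, the file names, and the 1,600 MB budget are made-up assumptions, not Fabric APIs.

```python
def plan_batches(file_sizes_mb, max_batch_mb):
    """Greedily group files so each batch stays at or under max_batch_mb.

    file_sizes_mb: dict mapping file name -> size in MB (hypothetical input).
    A single file larger than the budget still gets its own batch.
    """
    batches = []
    current = []
    current_mb = 0.0
    # Largest files first, so the big ones are placed before the small ones.
    ordered = sorted(file_sizes_mb.items(), key=lambda kv: kv[1], reverse=True)
    for name, size in ordered:
        # Close the current batch if adding this file would exceed the budget.
        if current and current_mb + size > max_batch_mb:
            batches.append(current)
            current, current_mb = [], 0.0
        current.append(name)
        current_mb += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    sizes = {"a.xlsx": 800, "b.xlsx": 750, "c.xlsx": 700, "d.xlsx": 700}
    for i, batch in enumerate(plan_batches(sizes, max_batch_mb=1600), start=1):
        print(f"batch {i}: {batch}")
```

Each batch could then be passed as a parameter to one pipeline activity, so no single run has to process all the files.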

reddit.com
u/seacess — 11 hours ago

Bizarre Permission Issue: Visuals blank for App Viewers unless granted "Write" access (Import Model)

Hey everyone, I’m running into a wall with a bizarre permission issue in the Power BI Service and could use some architecture-level insights.

The Setup:

  • A single semantic model (100% Import) sitting in a workspace.
  • Multiple thin reports connected to this single model.
  • These reports are bundled into a Power BI App for consumers.

The Problem: Workspace Admins can see everything perfectly. However, App consumers (Viewers) are getting completely blank visuals or zeros on some of the reports. They aren’t getting the gray "X" (Cannot load visual) error—the DAX is successfully evaluating, but it’s returning an empty table context.

The Twist (and what makes no sense): If I go to the dataset permissions and explicitly grant the test Viewer Read, Write, and Build access, the visuals instantly populate. If I strip it back to just Read and Build, the visuals instantly go blank again.

What I have already investigated and ruled out:

  1. It’s not a Persistent Filter: Having the Viewer click "Reset to default" in the App does nothing.
  2. It’s not Direct Lake / DirectQuery: The model is 100% Import, so there are no underlying SSO database permission failures.
  3. No DAX Identity Functions: USERPRINCIPALNAME() or USERNAME() are not being used in any of the core measures to spoof security.
  4. Row-Level Security (RLS) is cleared: I found a "ghost" RLS role in the metadata via Tabular Editor that was filtering a Budget User table. I have completely deleted this role from the model and republished. The issue persists.
  5. Object-Level Security (OLS): Checked in Tabular Editor; OLS is set to default/none for the tables involved.

The Final Clue: Even though they share the exact same underlying semantic model, some reports in the App work perfectly for Viewers, while others return blanks.

Since the VertiPaq engine inherently forces anyone with "Write" access to bypass security roles, it feels like the Service is still enforcing a phantom security rule or interpreting a specific relationship/DAX query as an authoring action.

Has anyone seen the Power BI Service enforce a "Write" permission requirement for basic consumers on an Import model? Could a complex bi-directional cross-filter or a specific DAX aggregation pattern (like TREATAS) trigger a Build/Write requirement in the Service?

u/dv0812 — 15 hours ago

Open-source Fabric Spark Operations Skill — looking for feedback

Hey everyone,

We're on the Fabric Data Engineering team. We built the **Fabric Spark Operations Skill** — an open-source diagnostic tool that lets you troubleshoot Spark workloads in plain English from GitHub Copilot, VS Code, Claude, or your preferred AI assistant. It queries the Fabric APIs and returns a severity-ranked report with root cause analysis, performance insights, and fix recommendations.

It handles common scenarios such as workspace-level Spark job analysis, failed notebook diagnosis, pipeline failure tracing, historical run patterns, and more.

Setup takes a few minutes — install the skill, run `az login`, and you're ready to go.

**Repo:** skills-for-fabric/skills/spark-operations-cli at main · microsoft/skills-for-fabric

We'd love feedback from the community - try it out and let us know what works, what's missing, or what you'd want to see next. Comments here or GitHub issues both welcome!

u/JennyAce01 — 24 hours ago

May 18th Outage reasoning?

What was the cause of the outage the other day? There were rumors of an Iranian hacker group claiming to have carried out a DDoS attack, but I haven't seen anything from Microsoft. We have folks in our org asking for the reasoning, and I have to fill out RCA forms related to this.

Overhauled Fabric Delta Lake Docs

Folks - for those that enjoy reading technical documentation like I do, quick FYI that the Fabric Delta Lake docs have been completely overhauled. Biggest docs PR I've ever submitted... 18 new docs pages and refinement of what we already had to remove a few inconsistencies and provide a more cohesive docs story.

Want to know....

Just a sampling of what's new... would love to hear of any revelations or golden nuggets people find here.

u/mwc360 — 1 day ago

Fabric Git integration + feature branching.. am I missing something?

I’m running into a pain point with Fabric’s Git integration and I can’t tell if I’m missing something obvious or if this is just how it works.

Scenario:

  • Dev workspace linked to dev branch
  • I create a feature branch + separate workspace (using the branch-out approach)
  • Build out a new ingestion process:
    • Notebook
    • Lakehouse + supporting items
    • Pipeline (runs notebook, alerts, etc.)

So far, so good.

The problem comes when merging back. After merging the feature branch into dev and syncing the dev workspace…

The pipeline’s Run Notebook activity still points to:

  • the feature workspace ID
  • the feature notebook ID

So now the pipeline in dev is effectively wired to the wrong workspace/items.

Workarounds I can see:

  • Parameterise reference IDs - workspace id is very doable, but notebook references are messy (Fabric API calls, web activities, etc.)
  • Manually fix IDs post-merge - means another branch + PR just to fix references
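
For the "fix IDs post-merge" workaround, the rewrite itself could be scripted rather than done by hand. A rough sketch, assuming the pipeline definition is available as JSON and that the feature-to-dev ID mapping is known up front; the `notebookId`/`workspaceId` layout and the placeholder IDs below are illustrative assumptions, not a confirmed schema.

```python
import json


def remap_ids(node, id_map):
    """Return a copy of a JSON-like structure with mapped string values swapped.

    id_map: dict of {feature_id: dev_id}; strings not in the map are kept as-is.
    """
    if isinstance(node, dict):
        return {key: remap_ids(value, id_map) for key, value in node.items()}
    if isinstance(node, list):
        return [remap_ids(value, id_map) for value in node]
    if isinstance(node, str):
        return id_map.get(node, node)
    return node


if __name__ == "__main__":
    # Hypothetical pipeline fragment; real definitions will have more fields.
    pipeline = {
        "properties": {
            "activities": [
                {
                    "name": "Run Notebook",
                    "typeProperties": {
                        "notebookId": "feature-notebook-id",
                        "workspaceId": "feature-workspace-id",
                    },
                }
            ]
        }
    }
    id_map = {
        "feature-notebook-id": "dev-notebook-id",
        "feature-workspace-id": "dev-workspace-id",
    }
    print(json.dumps(remap_ids(pipeline, id_map), indent=2))
```

This still needs something to run it (a pre-merge script or a deployment step), so it reduces the manual work rather than removing the underlying gap.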

My confusion is that Fabric seems to be pushing Git integration, workspace feature branching and UI-driven development, but in reality, these don’t seem to work well together when it comes to item references.

So:

  • How are people handling this in practice?
  • Is there a clean pattern I’m missing?
  • Or is this just a gap in the current model?

Would really appreciate hearing what’s worked (or hasn’t) for others.

u/GrimFunko — 1 day ago

Delta Table Maintenance in Microsoft Fabric - A 2026 Practitioner's Guide

I've been pulling together the current state of Delta table maintenance guidance in Fabric and wrote up everything I wish had been in one place when I started. The short version: the official docs have improved a lot recently, but they're spread across 6-7 different Microsoft Learn pages, a couple of blog posts from Microsoft engineers, and some release note archives. Most practitioners haven't read all of it.

The article covers:

- V-Order and Optimize Write defaults – including why the V-Order default page and the comparison table contradict each other (and which one to trust)

- Auto-Compaction, Adaptive Target File Size, Fast Optimize, and File Level Compaction Target - what they do, what's off by default, and why they belong in your utility notebook.

- Do you still need a scheduled OPTIMIZE job? - yes, but probably not the way you're running it now.

- Liquid Clustering vs partitioning - why Fabric's own docs make no mention of partitioning as a recommended strategy.

- Deletion vectors and their Direct Lake impact - the cold-start overhead that accumulates quietly if you're not running OPTIMIZE before your Power BI refresh.

- A reference section mapping each topic to the specific doc page it came from.

I also flag the March 2026 preview features (Lakehouse Maintenance Activity in Pipelines, SQL Endpoint Refresh Activity) that are relevant here but untested.

Full article here (https://bradcoles-dev.github.io/blog/fabric-delta-table-maintenance.html).

Happy to answer questions or hear pushback from anyone running these settings in production.

u/bradcoles-dev — 1 day ago

Cannot create new environments in Fabric

We get this error when trying to create a new environment in a workspace.

Tons of compute available on an F16.

Unable to complete the action because your organization's Fabric compute capacity has exceeded its limits. Try again later.

Is this a bug?

There are two ways we can create an environment.

Add Items. When we do that, we only get a weird blank square that doesn't go away:

https://preview.redd.it/stjs5b5pdc2h1.png?width=1046&format=png&auto=webp&s=d14643ce9d0803965bfd043e6ed80c9a508199b9

When we add environment through the Notebook 'environments' drop down, we can enter an environment name, but we get this error:

Unable to complete the action because your organization's Fabric compute capacity has exceeded its limits. Try again later.

Tons and tons of compute available.

Is this a bug that others can reproduce? Something wrong with our environment? A setting I'm overlooking?

u/Personal-Quote5226 — 1 day ago

Data source error: Expression.Error: The key didn't match any rows in the table + OneLake Security

Hi All,

I am receiving this error when trying to refresh a model in the service. Locally it refreshes just fine. Connected to the table (Table1) in my lakehouse via the SQL Endpoint.

The configuration:

Entra group: GroupA

OneLake Security Role: RoleA

Table Name: Table1

The OneLake security role RoleA has been given access to Table1, and the member of RoleA is GroupA.

Members of GroupA can see the table in the UI, both in the lakehouse and in the SQL Endpoint, and can query it fine. The lakehouse is configured to use OneLake security (user identity mode).

I have confirmed that the gateway cloud connection to the lakehouse has GroupA added as a user, and I configured that connection in the semantic model settings, but we still receive this error.

Did I miss something in my configuration?

u/kieranimal — 1 day ago

Using dbt item cross workspaces

Hi, I want to use dbt job item as my ELT engine in Fabric, but I'm facing the following:

I need to read data from a bronze workspace lakehouse and write the data into a silver workspace warehouse. As I understand it now, there is no way to connect the dbt job item to a lakehouse that is in a different workspace. So I've come up with having an additional Lakehouse in the silver workspace that acts just as a shortcut hub for the objects that I want to read with dbt.

However, this seems like an obvious antipattern, as it creates unnecessary objects across my architecture. Is this really the way to do it now, or is there something I'm missing?

Having the entire medallion in a single workspace unfortunately isn't an option.

Is the cross-workspace connectivity on the roadmap for dbt job item? I need to make a decision for the data transformation engine soon and this might kill my dbt argument.

Thanks a lot!

u/General-Special8320 — 1 day ago

Direct Lake on top of Databricks Mirrored tables?

I'm working on some PoCs for my client related to their Databricks catalogs. I have some mirrored Databricks tables in Fabric and I have created shortcuts to them in my Fabric lakehouse. However, when I right-click on the lakehouse and choose "New Semantic Model", the shortcuts to the mirrored Databricks tables do not show up in the list of tables I can add to the new semantic model. I am aware that a Databricks mirrored table is metadata only (not a "real" Fabric delta table), so I'm guessing that we would need to "materialize" (e.g. CTAS) the data in our Fabric lakehouse in order to use it for a Direct Lake semantic model. Does anyone have experience with this?

u/PeterDanielsCO — 1 day ago

Hit Capacity limit out of nowhere?

Hey all,
First of all, I apologize for the horrible picture. Out of nowhere yesterday morning, our CU % tripled to over the limit and never went back down. From what I can tell, nothing has really changed on our end (as far as legitimately using much more compute). Things that were previously using 3.5% of capacity are all of a sudden using close to 30% at any given time. Has this ever happened to anyone else? Any advice on troubleshooting? We temporarily upped our capacity from an F16 to an F32 to get things up and running again.

u/ShockDuh — 2 days ago

Question around Metadata Sync in SQL Endpoint Shortcuts

Hi guys, Data Engineer using Microsoft Fabric here.

I have a pretty specific issue, and I believe I know what is causing it.

So we have a Medallion Architecture setup and a Pipeline that runs from Bronze -> Silver -> Gold. All of the data runs with stored procedures sequentially and we use Lakehouse and Warehouse shortcuts, so after a Lakehouse/Warehouse is updated via the stored proc then it is used in another layer via shortcut.

I have noticed that the Metadata Sync sometimes gets delayed by 10-20 minutes, which is too long: the way our pipeline works, the sync needs to finish within about 5 minutes.

Is there a way to trigger this Metadata Sync on demand? I tried doing it with a Notebook API call and it didn't work.
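
For anyone comparing notes on the API route, here is a sketch of what such a request could look like. It assumes the Fabric REST API exposes a `refreshMetadata` operation on the SQL endpoint at the path shown; that path, the empty request body, and the `build_refresh_request` helper are my assumptions and should be checked against the current REST reference (the operation may also require a preview flag or a different payload). The code only builds the request and does not send it.

```python
import json
import urllib.request

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_refresh_request(workspace_id, sql_endpoint_id, token):
    """Build (but do not send) a POST asking the SQL endpoint to sync metadata.

    The URL shape is an assumption; verify it against the Fabric REST docs.
    """
    url = (
        f"{FABRIC_API}/workspaces/{workspace_id}"
        f"/sqlEndpoints/{sql_endpoint_id}/refreshMetadata"
    )
    return urllib.request.Request(
        url,
        data=json.dumps({}).encode("utf-8"),  # assumed empty JSON body
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


if __name__ == "__main__":
    req = build_refresh_request("<workspace-id>", "<sql-endpoint-id>", "<token>")
    print(req.get_method(), req.full_url)
```

Sending it would be a matter of `urllib.request.urlopen(req)` with a valid Entra token. Note that the ID here is the SQL endpoint's item ID, which is not necessarily the same as the Lakehouse ID.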

Thanks in advance!

u/kec15 — 1 day ago

Prevent Workspace Users from Accessing underlying Lakehouse schemas, files and data

We need some users to have permissions at the workspace level. Even the very lowest permission (Viewer) allows the user to view Lakehouse data via the SQL Endpoint of all Lakehouses in the workspace while Contributor and higher give you ReadAll to the Lakehouse, and access to schema/files/data.

We need users to have the ability to view the workspaces (navigate it, see items/notebooks/pipelines/etc); this is for administration, auditing, etc. However, these same users aren't business users and we don't want them to see any data whatsoever.

Is there a recommended approach to achieving this?

Update: The SQL analytics endpoint can be restricted by applying SQL permissions to the underlying schema or tables (e.g., DENY).
https://learn.microsoft.com/en-us/fabric/data-warehouse/sql-granular-permissions
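
As a rough illustration of the DENY approach, the statements could be generated per schema and then run against the SQL endpoint. The principal name and schema names below are hypothetical, and `deny_select_statements` is just a helper for this sketch; whether schema-level DENY covers everything you need should be verified against the linked page.

```python
def _quote(identifier):
    """Bracket-quote a T-SQL identifier, escaping any closing brackets."""
    return "[" + identifier.replace("]", "]]") + "]"


def deny_select_statements(principal, schemas):
    """Return one DENY SELECT statement per schema for the given principal."""
    return [
        f"DENY SELECT ON SCHEMA::{_quote(schema)} TO {_quote(principal)};"
        for schema in schemas
    ]


if __name__ == "__main__":
    # Hypothetical auditing group and schema list.
    for stmt in deny_select_statements("auditors@contoso.com", ["dbo", "sales"]):
        print(stmt)
```

This only addresses the SQL endpoint path; access to files through OneLake for Contributor and higher roles is a separate question.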

u/Personal-Quote5226 — 2 days ago

NEW: Fabric Jumpstart – Discover what’s possible with Microsoft Fabric

Ok r/MicrosoftFabric - finally, here it is. It's either the worst or best kept secret since we did a soft launch @ FabCon: Fabric Jumpstart

Read the blog: Fabric Jumpstart – Discover what’s possible with M... - Microsoft Fabric Community

Please share any feedback, raise an issue on the GitHub page for any issues, and for all of you out there that have cool accelerators, demos, etc., would love to see you contribute and help the entire Fabric community!

Want to quickly experience, learn, or demo Microsoft Fabric... solved.

youtube.com
u/mwc360 — 2 days ago

Power Query or Fabric or something else?

I need some insight into this case I have. I need to build a Power BI report that's connected to 3 different systems, all via REST APIs. Some of the tables can only be fetched by looping a call: for example, getting all customers is one call, but getting all transactions for all customers needs one call per customer, etc.

Should I do this in Power Query, set up data ingestion in Fabric, or do something else?
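
To make the "one call per customer" pattern concrete, here is a minimal sketch of the loop as it might look in a notebook. The actual HTTP call is passed in as a function so that paging, auth, and retries can be handled there; `fetch_all_transactions` and the record shapes are assumptions for illustration only.

```python
def fetch_all_transactions(customer_ids, fetch_for_customer):
    """Call fetch_for_customer once per customer and flatten the results.

    fetch_for_customer: callable taking a customer id and returning an
    iterable of transaction dicts (hypothetical; wraps the real REST call).
    Returns (rows, failed_ids) so failed customers can be retried later.
    """
    rows = []
    failed_ids = []
    for customer_id in customer_ids:
        try:
            # Materialize first so a mid-stream failure adds no partial rows.
            transactions = list(fetch_for_customer(customer_id))
        except Exception:
            failed_ids.append(customer_id)
            continue
        for tx in transactions:
            rows.append({"customer_id": customer_id, **tx})
    return rows, failed_ids


if __name__ == "__main__":
    # Stand-in for the real API: customer 2 has no transactions.
    fake_api = {1: [{"amount": 10}, {"amount": 25}], 2: []}
    rows, failed = fetch_all_transactions([1, 2], lambda cid: fake_api[cid])
    print(rows, failed)
```

Collecting failed IDs separately keeps one bad customer from aborting the whole load, which matters more as the number of calls grows.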

u/Inevitable-Oil-5079 — 2 days ago

Has anyone looked into Fabric Plan yet? Sharing what I found after spending time with it

I've been digging into Fabric Plan over the past few weeks and couldn't find much discussion about it here so figured I'd share what I found. 

Quick context on what it is. It's a planning and reporting product built on Microsoft Fabric, put together by Lumel and Microsoft. It went into preview at a Microsoft conference in March. 

It has three main parts. PowerTable is basically a spreadsheet-style interface where you can edit data directly and it writes back to the database. Planning sheet is where you do budgets and forecasts; you can build driver-based models and run scenarios. Intelligence sheet is the reporting piece where you build the actual dashboards and financial reports.

What I found interesting is that these three pieces share the same data environment on Fabric. If someone updates a number in the Planning sheet, the Intelligence sheet picks it up without you having to export or move anything. There's also something called InfoBridge which works like a transformation layer between the sheets, similar concept to Power Query but for planning data. 

I've spent most of my time in Intelligence sheet. The matrix component handles financial statement layouts pretty well, there are chart options that go beyond the standard Fabric visuals, and the filtering is more purpose built for finance use cases than regular slicers. It can also connect to existing semantic models on Fabric, not just planning data. 

It's still early and I've hit some rough edges but the idea of having planning and reporting in the same place without moving data around caught my attention. 

Got to try it after attending their introduction session a few weeks back. Happy to chat if anyone else is looking into this. 

u/Same-Bison8478 — 3 days ago

Surrogate keys or not?

So I'm translating a SQL Server solution into Fabric. I had set up my SCD2 in SQL Server with surrogate keys, but I'm completely unclear whether they are needed in Fabric. Copilot is telling me to use the natural business key and hash the columns for comparisons, throwing away the surrogate key concept at the silver level. Is this a generally accepted stance or am I missing something?
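
For context, the "hash the columns for comparisons" suggestion usually means something like the sketch below: compute one hash over the tracked attributes and compare it with the stored hash to decide whether a new SCD2 version is needed. The column names, the separator, and the choice of SHA-256 are illustrative assumptions, not a Fabric requirement.

```python
import hashlib

_SEP = "\x1f"   # unit separator, unlikely to appear in real values
_NULL = "\x00"  # marker so None and empty string hash differently


def row_hash(row, tracked_columns):
    """Hash the tracked columns of a row (a dict) in a fixed column order."""
    parts = []
    for column in tracked_columns:
        value = row.get(column)
        parts.append(_NULL if value is None else str(value))
    return hashlib.sha256(_SEP.join(parts).encode("utf-8")).hexdigest()


def has_changed(incoming_row, stored_hash, tracked_columns):
    """True if the incoming row differs from the stored version."""
    return row_hash(incoming_row, tracked_columns) != stored_hash


if __name__ == "__main__":
    tracked = ["name", "city"]
    old = {"customer_no": "C001", "name": "Ada", "city": "Oslo"}
    new = {"customer_no": "C001", "name": "Ada", "city": "Bergen"}
    print(has_changed(new, row_hash(old, tracked), tracked))
```

Note that hashing only answers "did this row change?". It does not by itself replace a surrogate key, since an SCD2 dimension still needs some way for facts to point at a specific version of a row.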

u/First_Newspaper_612 — 2 days ago

Post that shows how to perform CI/CD for Real-Time Intelligence and how you can simplify deployments to other Fabric workspaces

This post covers one way you can simplify CI/CD for Real-Time Intelligence in Microsoft Fabric with the fab deploy command in Azure DevOps.

I often get asked if CI/CD for Real-Time Intelligence works, since it is not a topic that gets discussed often. So I decided to show CI/CD working for Real-Time Intelligence and how you can simplify deployments to other Fabric workspaces with the fab deploy command. Double value for post content.

I also show the new Fabric automation tools Azure DevOps extension in action, because I wanted an efficient way to issue two fab deploy commands, for reasons that I cover in the post.

This post is also accompanied by a sample Git repo that you can clone/copy/download and customize how you see fit.

https://chantifiedlens.com/2026/05/19/simplify-ci-cd-for-real-time-intelligence/

u/ChantifiedLens — 3 days ago