
Securing dataverse data via synapse link - data access and reporting

Posted on by

Hi All.

 

I am looking to put together a solution for secure data access for Dynamics 365 F&O. The setup is such that F&O data, together with sensitive data, will be exported to a data lake via Synapse Link. Once the data is in the lake, other parts of the business need to query it, but some of it is sensitive information that should only be seen by specific individuals. The difficulty here is that data scientists also require access to the data: they need to build models, but they do not need to see the sensitive data.

From my investigation, when Synapse Link moves the data over to a data lake, there are concerns that users can access this information directly from the lake and even export it, including the sensitive data. Secondly, when a linked service is created from a Synapse workspace, it gives access to all of the data, since access is via the system principal, which must have the Storage Blob Data Reader role assignment; this means end users could reach restricted data via queries against the lake database.

I have the following questions.

 

  1. When Synapse Link is used to move data from Dataverse to a data lake, what steps can be taken to ensure that the data is locked down?
  2. How can one segregate the data so that one group of users sees only certain data, while other groups can see all of it?
  3. Power BI will be used for reports. This poses a challenge because data scientists create the models and business users consume them; the data scientists should not see the sensitive data.
  4. How can the data lake be locked down so that only operational access is possible, with no backend user access?
  5. Disaster recovery: GZRS can be used for the storage account, but how can Synapse itself be protected from disaster or downtime?
  6. Is it possible to create a new SQL pool and access the Dataverse lake via this new SQL pool?
  7. Cost optimisation considerations, especially for Synapse.
 
A few thoughts.
 
  • Will dynamic data masking help here, considering that it doesn't work on the serverless pool?
  • Another idea is specific views that limit the data, but that would be an issue because we can't have two views, one showing PII data and one that doesn't, for the same Power BI model.
 

 

  • Suggested answer
    Sahan Hasitha
    on at
    Securing dataverse data via synapse link - data access and reporting
    Hi,
     
     When exporting Dynamics 365 Finance & Operations (F&O) or Dataverse data into a data lake using Synapse Link, secure access must be carefully designed because sensitive data is mixed with operational information. The main challenge is that once the data lands in Azure Data Lake Storage Gen2, users with storage access could potentially bypass controls and view or export confidential information. To prevent this, direct access to the data lake should be locked down by disabling shared keys, blocking public access, and only allowing private endpoints with Azure AD RBAC, so that only Synapse or other managed identities can query it.

     Access segregation should then be applied at the Synapse SQL layer using row-level and column-level security, ensuring that different groups of users can see only the data they are authorized for. For example, business users may require sensitive fields, but data scientists should work with masked or obfuscated versions of those fields so they can build models without viewing confidential details. Power BI can connect to Synapse views with security policies, avoiding multiple datasets and keeping governance centralized. Since dynamic data masking does not work on serverless pools and is relatively weak, the stronger approach is to create secure views that apply masking or exclude restricted columns.

     To minimize risk, human access to the data lake should be completely removed, and operational access should only be via service principals or managed identities. For disaster recovery, the lake can use geo-zone redundant storage (GZRS), while Synapse metadata, security policies, and views should be backed up in source control and redeployable via infrastructure-as-code. Dedicated SQL pools may be used for performance and more advanced security, but serverless pools are more cost-efficient; a hybrid approach balances cost and security. With this setup, sensitive data is locked down, data scientists can still work effectively, Power BI delivers governed insights, and the organization gains a scalable, secure, and compliant analytics environment.
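     As a minimal sketch of the secure-view idea (the table, column, and role names below are hypothetical, not taken from this thread), the sensitive column can be hashed or excluded in the view used by data scientists, while business users query the full view:

     ```sql
     -- Hypothetical base table dbo.Employee with a sensitive Salary column.
     CREATE ROLE business_users;
     CREATE ROLE data_scientists;
     GO
     -- Full view for business users.
     CREATE VIEW dbo.vw_Employee_Full AS
     SELECT EmployeeId, Department, HireDate, Salary
     FROM dbo.Employee;
     GO
     -- Masked view for data scientists: the salary is hashed (or it could simply be
     -- omitted) so models can be built without exposing the real value.
     CREATE VIEW dbo.vw_Employee_Masked AS
     SELECT EmployeeId, Department, HireDate,
            CONVERT(varchar(64), HASHBYTES('SHA2_256', CONVERT(varchar(30), Salary)), 2) AS SalaryHash
     FROM dbo.Employee;
     GO
     -- Each group gets only its own view; nobody queries the base table directly.
     GRANT SELECT ON OBJECT::dbo.vw_Employee_Full   TO business_users;
     GRANT SELECT ON OBJECT::dbo.vw_Employee_Masked TO data_scientists;
     DENY  SELECT ON OBJECT::dbo.Employee           TO data_scientists;
     ```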
     
  • Suggested answer
    DAnny3211
    on at
    Securing dataverse data via synapse link - data access and reporting
    Hi,
    You've raised a complex and important set of questions around securing Dynamics 365 Finance & Operations data exported via Synapse Link to a Data Lake. Here's a structured response addressing each point:
    ---
    ### **1. Securing Data Exported via Synapse Link**
    When using **Azure Synapse Link for Dataverse** to export F&O data to a Data Lake, the data is stored in **Parquet Delta format**, which supports efficient querying and incremental updates [1](https://learn.microsoft.com/en-us/power-apps/maker/data-platform/azure-synapse-link-select-fno-data). To lock down access:
    - Use **Azure RBAC** and **ACLs** to restrict access at the storage level [2](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control-model).
    - Avoid assigning broad roles like `Storage Blob Data Reader` to users unless absolutely necessary.
    - Prefer **Microsoft Entra security groups** for managing access to folders and files.
    ---
    ### **2. Segregating Data Access by User Groups**
    To ensure that only specific users see sensitive data:
    - Implement **Row-Level Security (RLS)** and **Column-Level Security (CLS)** in Synapse Dedicated SQL Pools [3](https://techcommunity.microsoft.com/blog/educatordeveloperblog/ways-to-control-who-can-access-data-in-azure-synapse-dw/4290877).
    - Use **custom views** in Serverless SQL Pools to mask or exclude sensitive columns.
    - Assign **UNMASK** permissions only to privileged roles if using Dedicated SQL Pools.
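    As a hedged sketch of the RLS part (Dedicated SQL Pool syntax; the tables, schema, and column names are placeholders, not from your environment), a predicate function plus a security policy filters rows per signed-in user:

    ```sql
    -- Hypothetical mapping table: which user may see which region.
    CREATE TABLE dbo.UserRegion (UserName sysname NOT NULL, Region varchar(20) NOT NULL);
    GO
    CREATE SCHEMA sec;
    GO
    -- Inline predicate: returns a row only when the caller is mapped to the row's region.
    CREATE FUNCTION sec.fn_region_filter (@Region AS varchar(20))
    RETURNS TABLE
    WITH SCHEMABINDING
    AS
    RETURN
        SELECT 1 AS allowed
        FROM dbo.UserRegion AS ur
        WHERE ur.Region = @Region
          AND ur.UserName = USER_NAME();
    GO
    -- Attach the predicate to the fact table (dbo.SalesOrder is assumed to have a Region column).
    CREATE SECURITY POLICY sec.SalesOrderFilter
        ADD FILTER PREDICATE sec.fn_region_filter(Region) ON dbo.SalesOrder
        WITH (STATE = ON);
    ```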
    ---
    ### **3. Power BI Model Security**
    Power BI supports multiple layers of security [4](https://community.fabric.microsoft.com/t5/Desktop/Safeguarding-Sensitive-Data-Exploring-Power-BI-s-Security/td-p/3724467):
    - Use **RLS** to restrict data visibility based on user roles.
    - Avoid exposing sensitive columns in the model if data scientists don’t need them.
    - Consider **creating separate datasets** for modeling and reporting, with masked or filtered views for data scientists.
    ---
    ### **4. Locking Down Data Lake for Operational Access Only**
    To prevent backend access:
    - Use **ACLs** to restrict access to specific folders and files [2](https://learn.microsoft.com/en-us/azure/storage/blobs/data-lake-storage-access-control-model).
    - Disable **Shared Key** and **SAS token** access; enforce **Microsoft Entra ID authentication**.
    - Use **Private Endpoints** and **Firewall Rules** to limit network-level access.
    ---
    ### **5. Disaster Recovery for Synapse**
    Azure Synapse supports **zone redundancy**, and the underlying storage can use **geo-zone-redundant storage (GZRS)**[5](https://learn.microsoft.com/en-us/azure/architecture/example-scenario/analytics/pipelines-disaster-recovery):
    - Use **CI/CD pipelines** with Git integration to replicate Synapse artifacts to a secondary region.
    - Consider **user-managed recovery** strategies for Dedicated SQL Pools and Spark Pools.
    - Document **RTO/RPO objectives** and perform regular DR drills.
    ---
    ### **6. Accessing Dataverse Lake via New SQL Pool**
    Yes, you can query Dataverse-exported lake data using **Serverless SQL Pools**[6](https://learn.microsoft.com/en-us/azure/synapse-analytics/metadata/database):
    - Lake databases created via Spark or Synapse Link are automatically available in Serverless SQL Pools.
    - You can define **custom views and procedures** on top of these lake tables.
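    For example (a sketch only: the storage URL and folder layout are placeholders and depend on how your Synapse Link profile exports the table), a serverless SQL database can hold views over the exported files, and Power BI or other tools then connect to those views rather than to the raw lake:

    ```sql
    -- Run against the serverless SQL endpoint; adjust the path (and format) to match
    -- what your Synapse Link profile actually writes for this table.
    CREATE DATABASE DataverseReporting;
    GO
    USE DataverseReporting;
    GO
    CREATE VIEW dbo.vw_CustTable AS
    SELECT *
    FROM OPENROWSET(
            BULK 'https://<storageaccount>.dfs.core.windows.net/<container>/deltalake/custtable/',
            FORMAT = 'DELTA'
         ) AS rows;
    ```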
    ---
    ### **7. Cost Optimization for Synapse**
    Key strategies include[7](https://learn.microsoft.com/en-us/azure/synapse-analytics/plan-manage-costs):
    - Use **Serverless SQL Pools** for ad-hoc queries (charged per TB scanned).
    - **Pause Dedicated SQL Pools** during off-hours.
    - Enable **auto-pause** and **autoscale** for Spark Pools.
    - Monitor usage via **Azure Cost Management** and set **budgets and alerts**.
    ---
    ### **8. Dynamic Data Masking Limitations**
    Dynamic Data Masking (DDM) is **not supported** in Serverless SQL Pools[8](https://learn.microsoft.com/en-us/answers/questions/2120635/dynamic-data-masking-on-synapse-serverless-sql-dat). Alternatives:
    - Use **custom views** to mask data.
    - Combine with **RLS** to restrict access based on user roles.
    - For full DDM support, consider using **Dedicated SQL Pools**.
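    If the sensitive tables are copied into a Dedicated SQL Pool, applying DDM is roughly as follows (dbo.Employee and the role name are hypothetical):

    ```sql
    -- Masks apply to everyone except principals granted UNMASK; the stored data is unchanged.
    ALTER TABLE dbo.Employee
    ALTER COLUMN Salary ADD MASKED WITH (FUNCTION = 'default()');

    ALTER TABLE dbo.Employee
    ALTER COLUMN Email ADD MASKED WITH (FUNCTION = 'email()');

    -- Only the business-facing role sees real values.
    GRANT UNMASK TO business_users;
    ```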
    ---
    Please verify if this guidance helps address your architecture and security concerns.
    Thanks and best regards,  
    Daniele  
    *Note: This response was prepared with support from Copilot to ensure clarity and completeness.*
  • Suggested answer
    rezaniroomand
    on at
    Securing dataverse data via synapse link - data access and reporting
    It's a bit hard to answer your questions without knowing the full list of requirements (and sometimes challenging them). I'm going to make several assumptions, but hopefully this will put you on the right path.
     
    1. You should not give direct access to the data exported to the Datalake. Consider it as the data sink for the Dataverse environment to dump its data. It's like your raw data storage. For your scenario it's better to bring data to a relational database (e.g. Synapse SQL pool) and start designing the security model from that level up.
     
    2. It depends on where those users are, what tools they use to access the data, what their purpose is, and how you can group them. For example, for Power BI you could create datasets and give your report designers/developers access to those datasets; they don't even need access to the DB. It also depends on how large your teams are: if one person is responsible for both the DB and the reports, it might not make sense to layer the security too much.
    If you have applications that also need access to the data, you can create accounts (Microsoft Entra) and give them the necessary access to the tables in your DB. For example, you can create DB roles and assign those roles to these accounts depending on what they need (see the T-SQL sketch at the end of this list).
     
    3. The answer to this question is similar to the above. Depending on which level they need access at, you need to use the features and tools available in that level. Direct access to the lake should never be an option; try to manage the security at the highest level your users are at. For example, if they are Power BI users, manage your security with the means available in that layer. If they have to access data at the DB, then your tools will be different. In fact, in the DB layer, Power BI as a whole can be one of the users of this layer :)
     
    4. Like I mentioned before, you won't even give access to anyone in that layer. No SAS tokens, nothing. When I'm talking about no access, I mean the root folder where Dataverse is dumping its data. Personally, I would dedicate the whole storage account to this and other related operations. That keeps managing it simpler, and simpler usually translates to less risk of security exposure.
     
    5. This one is a wide topic. You have to again think about layers. Power Platform has its own implementation of BCP/DRP; start exploring from Business continuity and disaster recovery for Dynamics 365 SaaS apps - Power Platform | Microsoft Learn. In the Datalake layer, it's like any storage account, and the DB is like any SQL DB; you will find lots of documentation about them on MS Learn.
     
    6. I think I already answered this. You are on the right path there ;)

    7. This is where you might need to challenge the requirements. Think about what the value of this whole platform is for the business and how it is going to save them costs. Usually, sticking to the out-of-the-box features of Power Platform reduces costs; you are already paying for the licenses, so you should try to capitalize on them as much as possible. If the business cannot compromise on the requirements, or the added value is high, you will move towards more and more layers that give you more control and the best performance. That last point, performance, is key: one of the main benefits of exporting data to the Datalake, bringing it into the RDBMS, and letting Power BI and other apps and services access the data from the DB is performance. It also reduces the pressure on Dataverse, which also translates to more capacity and performance.
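    Regarding point 2, a rough sketch of mapping Microsoft Entra principals to database users and roles (the group, app, and schema names are made up for illustration) could look like this:

    ```sql
    -- Map an Entra security group and a service principal to database users.
    CREATE USER [sg-report-builders] FROM EXTERNAL PROVIDER;
    CREATE USER [svc-ingestion-app]  FROM EXTERNAL PROVIDER;

    -- Bundle permissions in a role and grant only what each principal needs.
    CREATE ROLE report_readers;
    ALTER ROLE report_readers ADD MEMBER [sg-report-builders];

    GRANT SELECT ON SCHEMA::reporting TO report_readers;
    GRANT SELECT, INSERT ON SCHEMA::staging TO [svc-ingestion-app];
    ```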
     
  • CU12092249-0
    on at
    Securing dataverse data via synapse link - data access and reporting
    Thank you very much for the response. Whilst I understand that using the serverless SQL pool is the most cost-effective way to query the data, I do however have the following questions.
    It's not clear how the row-level and column-level security will work. If data scientists need to create reports/models for the business users to use, I don't see how one can exclude the data for some people and enable it for others; this is why dynamic data masking might work. If one has a report that totals employee salary, how can you make it available to the business users but not to the developer, if the developer is the one creating the report for the business users to use?

    If a dedicated SQL pool is created, can one create views on it pointing to the data lake database and have data masking applied, or does the data have to be copied into the dedicated SQL pool to have dynamic data masking in place? If the data has to be copied across, that poses a problem: pipelines will need to be created, which adds extra lag to a near-real-time reporting system and introduces extra complexity. Any solutions here?

    Concerning the response here, “Access segregation should then be applied at the Synapse SQL layer using row-level and column-level security, ensuring that different groups of users can see only the data they are authorized for. For example, business users may require sensitive fields, but data scientists should work with masked or obfuscated versions of those fields so they can build models without viewing confidential details”, please explain how this will work, or share a simple example for a single table/view?

    Thanks in advance.

     
