Hi All.
I am looking to put together a solution for secure data access for Dynamics F&O. In this setup, F&O data, including sensitive data, will be exported to a data lake via Synapse Link. Other parts of the business need to query the data in the lake, but part of it is sensitive and should only be seen by specific individuals. The difficulty is that data scientists also require access: they need the data to create models, but they do not need to see the sensitive data.
From my investigation, there are two concerns once Synapse Link has moved the data to the data lake. First, users can access the data directly in the data lake and even export it, and some of it could be sensitive. Second, when a linked service is created from a Synapse workspace, it gives access to the entire data set, because access is via the system principal, which is required to have the Storage Blob Data Reader role assignment. This means that end users could reach restricted data through queries against the lake database.
I have the following questions.
- When Synapse Link is used to move data from Dataverse to a data lake, what steps can be taken to ensure that the data is locked down?
- How can one segregate the data so that one group of users can see only certain data, while other groups of users can see all of it?
- Power BI will be used for reports. This poses a challenge because data scientists create the models and business users consume them, yet the data scientists should not see the sensitive data.
- How can the data lake be locked down so that only operational access is possible, with no backend access for users?
- Disaster recovery: GZRS can be used for the storage account, but how can Synapse be protected from disaster or downtime?
- Is it possible to create a new SQL pool and access the Dataverse lake via this new SQL pool?
- What cost optimisation considerations apply, especially for Synapse?
- Will dynamic data masking help here, considering that it doesn't work on the serverless pool?
- Another idea is specific views that limit the data. That would be an issue because we can't have two views, one showing PIA data and one that doesn't, for the same Power BI model.
