web
You’re offline. This is a read only version of the page.
close
Skip to main content

Announcements

No record found.

News and Announcements icon
Community site session details

Community site session details

Session Id :

Power platform Dataflows – have I missed “Incremental refresh” option somehow?

ashlega Profile Picture ashlega 34,477

This month seems to be all about data integration/data migration for me, so, even though I’m still using good old SSIS in many cases, I wanted to look at the Dataflows a little closer.

After all, they don’t require any additional deployment/licensing – we just need to go to the maker portal and start building them:

image

But, it seems, there is something missing.  I’ll get to it closer to the end of this post – right now let’s just try something first.

We will need XrmToolBox and Record Counter tool (I used the one with 5 stars below):

image

There is a special “ETL Test Entity” I just created – it has a name field and and a “Test Id” text field (which is also an alternative key). Right now there are 0 records in this entity:

image

Then, there is a SQL table which is pre-populated with 10000 records:

drop table ETLTest
go
CREATE TABLE ETLTest (
Id uniqueidentifier,
FullName nvarchar(250),
ModifiedOn DATETIME2(7)
)
go
INSERT INTO ETLTest(Id, FullName,  ModifiedOn) VALUES(newid(), ‘Test’, getutcdate())
GO 10000

 

image

 

There is a dataflow which is loading data from the SQL table above to the ETL Test Entity in Dataverse:

image

image

 

And, below, I’ll give it a try:

image

 

10000 records moved in in just under 2 minutes – that was pretty fast:

image

 

image

This is a good result so far, since it could be very suitable for all sorts of data integrations. Besides, on a related note, it seems Dataflows can handle API throttling limits intelligently, though I did not take it far enough to test it this time:

https://powerusers.microsoft.com/t5/Power-Apps-Ideas/Remove-Service-Protection-API-throttling-limits-for-Dataflows/idi-p/484654

However, what if we do another refresh?

image

This is where I was hoping that, somewhat miraculously, the data flow would know that none of the records have changed, so none of them would have to be updated in Dataverse.

Instead, I can see all those records coming in again:

image

3 minutes later, they have all been updated:

image

image

There are still only 10000 records, so none were added (pretty sure that’s because my entity had an alternative key):

image

Still, I was hoping none of them would even be updated in Dataverse, since I had not changed any of them in the source, right?

And this is where I am finding this particular scenario (using Standard dataflows to load data to Dataverse) somewhat unclear. It seems there is supposed to be incremental refresh option according to the documentation page below:

https://docs.microsoft.com/en-us/power-query/dataflows/incremental-refresh

image

I can see this functionality for analytical data flows (those which would be dumping data to the data lake, for instance):

image

But I don’t see it for the Standard v1/v2 dataflows:

image

Which means using dataflows to load large data tables into Dataverse might be a waste of API calls for the entitlement limits.

Did I miss that feature somehow? Let me know if I did.

Otherwise, it might actually be more practical to use Azure Data Factory (or, possibly, something else such as SSIS) in the scenario above. Which is a little unfortunate since, compared to those other options, setting up a dataflow takes almost no effort at all.


This was originally posted here.

Comments

*This post is locked for comments