On-Premises Data Gateway connector deletes imported rows in CI

Posted on by Microsoft Employee

We planned to use the On-Premises Data Gateway connector to import historical data for orders/order lines into CI (more than 10 million rows, on-premises SQL Server) and upload new orders/order lines later. I have read about the limit for the on-premises gateway connector of max 10 GB per table ( docs.microsoft.com/.../service-gateway-onprem-tshoot ). I decided to do the ingestion in several steps:
1. Import historical data in batches (import 5 million records -> clear the SQL table, import the next 5 million records -> clear the SQL table, and so on; a sketch follows this list)
2. Clear the SQL table
3. Import new data on a daily basis
4. Periodically clear the SQL table of imported records
5. Go back to step 3
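
A minimal sketch of the chunked loading in step 1, assuming a staging table that the gateway connector reads from and a numeric order_id key; all names here are hypothetical placeholders, not the real schema:

    -- Copy one 5M-row chunk into a staging table for the gateway to ingest.
    -- dbo.orders_staging, dbo.orders and order_id are placeholder names.
    DECLARE @from BIGINT = 0, @chunk BIGINT = 5000000;
    INSERT INTO dbo.orders_staging
    SELECT *
    FROM dbo.orders
    WHERE order_id > @from AND order_id <= @from + @chunk;
    -- After CI confirms the import: TRUNCATE TABLE dbo.orders_staging;
    -- then advance @from by @chunk and repeat until all rows are loaded.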

But when I delete rows in the SQL table (after a successful import into CI), the records are deleted in CI as well!
My questions are:
1. Is it possible to prevent deletion of imported data in CI? Can the On-Premises Data Gateway connector be configured in an "append-only" mode?
2. Is there another way of loading data into CI from an on-premises SQL Server?

  • Suggested answer
    John Wiese Profile Picture
    John Wiese on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    Just to clarify: the Incremental Refresh functionality is on by default if you have a new instance or connect to your own Dataverse, as that enables the Power Platform dataflow connectors, which support incremental refresh. You can force using those connectors instead of the Power BI connectors with the flag added to your URL as mentioned below.

  • Community Member Profile Picture
    Community Member Microsoft Employee on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    Hi Bill! MS advised us to use Incremental Refresh. It is in preview now. To enable it, add the parameter &PPDFIncrementalRefresh=true to the CI URL. I have not tested this mode yet, but it looks promising.
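
    For illustration, the resulting URL would look something like the line below; the host and path are hypothetical, and only the &PPDFIncrementalRefresh=true parameter comes from this thread:

        https://home.ci.ai.dynamics.com/...?...&PPDFIncrementalRefresh=true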

  • Suggested answer
    Bill Ryan Profile Picture
    Bill Ryan 8 on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    If you can script out the table, I'll gladly do a quick computation on it and give you a rough estimate. Actual table size includes keys and indexes, but we can get a very good estimate. I have a 35 million row table that's fairly wide, along with an 85 million row table, and together they're well under 10 GB, so I suspect you're going to be fine, but we can know instead of guess. If you know the schema, tbh, I'm happy to even use a data generator and pump in 100 million rows just to be sure (a sketch of one follows).
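
    A minimal sketch of such a data generator, assuming a hypothetical test table dbo.orders_test with placeholder columns; scale TOP (...) up or run it in batches to reach the target row count:

        -- Generate 1M synthetic rows from a cross join of system views,
        -- then measure the result. All object and column names are placeholders.
        ;WITH n AS (
            SELECT TOP (1000000) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) AS i
            FROM sys.all_columns AS a CROSS JOIN sys.all_columns AS b
        )
        INSERT INTO dbo.orders_test (order_id, order_date, amount)
        SELECT i, DATEADD(DAY, CAST(i % 365 AS INT), '2020-01-01'), (i % 1000) * 1.0
        FROM n;

        EXEC sp_spaceused 'dbo.orders_test';  -- extrapolate to the full volume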

  • Community Member Profile Picture
    Community Member Microsoft Employee on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    Bill, thank you. I'll try to estimate the size of the real tables, though I haven't got the 'real' data in them yet - only an approximate model and the expected number of rows.

  • Suggested answer
    Bill Ryan Profile Picture
    Bill Ryan 8 on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    Ivan, I think you might be surprised at the table size. Before going a lot further, run this inside the db against your order and order-line tables: exec sp_spaceused 'orders'. This will tell you the actual size (a fuller sketch follows this post).

    As far as using delimited files, I'd personally suggest .parquet or .avro, but yes, you can do it with one big file, or one current file plus historical ones, and reference the whole directory if you mount it as a CDM folder (the second option when you create a data source). I think the guidance is to keep them around 2 GB, but that's just a guide. In my experience, the same ~10 GB CSV file will shrink to around 3 GB if you use Parquet instead. We've had 30 GB files in CSV that CI was able to process, so yes, it can handle them. And if you mount, you can just reference the whole directory and drop a new file each day to keep loading the incremental portions, which will greatly shorten the process.

    That said, before you go too far down any road, check those table sizes, because I think they're going to be a lot smaller than you might think, and it's easy enough to check for sure. If they're, say, 2 GB or smaller at the current volume, that gives you 5x the volume before you really have to be concerned, and that's a lot of runway.
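
    The size check above, spelled out for both tables from the thread (schema and table names assumed); sp_spaceused's standard output columns are rows, reserved, data, index_size, and unused:

        -- 'data' is the on-disk size of the rows themselves; 'index_size'
        -- is extra. If compression is enabled, the uncompressed size that
        -- the 10 GB gateway limit refers to will be larger than 'data'.
        EXEC sp_spaceused 'dbo.orders';
        EXEC sp_spaceused 'dbo.order_lines';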

  • Community Member Profile Picture
    Community Member Microsoft Employee on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    We will have 12 million order records and 35 million order lines, and these amounts will grow on a daily basis. So I do not think that 35 million order lines will fit into 10 GB :(

    In the case of Azure Blob Storage - did you mean to export all rows from a single table into a single CSV file, for example, and put it into an Azure Blob Storage container? Will it be OK to have a single CSV file (more than 10 GB in size) uploaded every day to Azure Blob Storage? Will CI be able to handle it successfully?

  • Community Member Profile Picture
    Community Member Microsoft Employee on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    Hi! Thank you for the response! Yes, I'm aware of the possibility of importing data from Azure SQL, but the customer currently has all infrastructure on-premises. Besides the fact that CI is expensive enough on its own, they would also need to pay for Azure SQL (just for transferring data to CI) :) That's why I'm looking for a less expensive solution for importing data into CI.

  • RE: On-Premises Data Gateway connector deletes imported rows in CI

    Pardon what might be a dumb question, but instead of using the on-prem SQL DB, why not use Azure SQL and then use a Dataverse dataflow to bring in the data you need to D365? Azure SQL is massively scalable, already lives in Azure (which is also where D365 lives, of course), and it doesn't seem that MS "taxes" dataflows against your D365 storage quota. And you don't have to worry about data gateway limits.

    You also don't then have to deal with the comparatively "flat" data structure that ADL is optimized for (as it is obviously intended for analysis/reporting, etc.).

  • Suggested answer
    Bill Ryan Profile Picture
    Bill Ryan 8 on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    Hi there Ivan:

    So a few things. 

    1 - To look at the size of each row, you can look at the DataType of each column to get an idea of the size of the data you're moving; a sample procedure can give you row sizes and an idea of overall table size (a query sketch follows this post). Regarding compression: if the rows are compressed, they'll need to be uncompressed to be ingested, but unless you've explicitly enabled compression, I don't think that's an issue.

    2 - Understood, so performance will be a concern, but transfer time is generally a lot less than the time spent conflating your contacts and then associating activities with the unified profiles. I can't say it's always true, but if you have a lot of activities, chances are that activity unification in the refresh process will be the time killer. Usually ingestion is pretty fast (a few minutes).

    3 - OK, so you can use CSV, Excel, or Parquet. On data ingestion, the middle option is to mount a CDM folder instead of using the Power Query interface, but that requires a little more work. So if you just did a full dump of your table to CSV/Parquet/JSON (pick a file format) and wrote it to either Azure Blob Storage or a Data Lake, you can connect to it just like a database. Other than setting up the connection initially, you can't really tell the difference. So in the case of your screenshot, pick a file format, say Parquet or CSV, then specify the container it's in and the key; it'll authenticate, and after that everything is exactly the same.

    You can use OneDrive or Box or something different, but Azure Blob Storage or ADL are a lot more robust.

    I think your table size is well below 10 GB, but run those queries and we can figure it out.
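
    A minimal sketch of the per-row size estimate from point 1 above, using the declared column widths in sys.columns; the table name is the thread's example, and variable-width (max) columns need a separate estimate:

        -- Sum the declared byte widths of all columns to approximate row size.
        -- max_length = -1 means varchar(max)/nvarchar(max)/varbinary(max),
        -- which this rough estimate skips.
        SELECT SUM(CASE WHEN c.max_length = -1 THEN 0 ELSE c.max_length END)
               AS approx_bytes_per_row
        FROM sys.columns AS c
        WHERE c.object_id = OBJECT_ID('dbo.order_lines');

        -- e.g. 120 bytes/row * 35,000,000 rows ~ 4.2 GB of raw data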

  • Community Member Profile Picture
    Community Member Microsoft Employee on at
    RE: On-Premises Data Gateway connector deletes imported rows in CI

    Thank you Bill,

    1. I do not know the size or number of rows yet. For now I'm trying to stay ahead of it (not hit the 10 GB limit). Also, I'm worried about the statement "The exact limitation is 10 GB of uncompressed data per table". What does "uncompressed" mean, and how can it be mapped to the size of a SQL Server table?

    2. I suspect import performance will degrade, since every day we will have more and more data in the SQL Server table.

    3. Could you please clarify a bit about files in Azure Data Lake or Blob? Do you mean CSV files? I can hardly imagine importing several GB of data from CSV files :) Also, we will have record updates (order status, etc.), so I would have to update data in already uploaded files. These are the options I have for data import (see the screenshot below). Did you mean Azure blobs?

    [screenshot: available data import options in CI]
