Hi,
Data ingestion did not complete and showed the following error message under "System > Data Preparation > Entity":
Usually, corrupted records are caused by data type issues (e.g. datetime, numerical, etc.). However, in my case, only two data types are ingested (string and number).
Looking into "Entities > User > Entity", I went through all fields that have the number (and also the string!) data type, but none of them shows any errors (only missing records). An example:
What other reasons could prevent CI from processing these records, so that it labels them as "corrupted" data?
Thank you,
Thanks Scott,
1. Is that your total data set (~7M)?
correct
2. Has this ever ingested cleanly?
I eventually changed all data types to "String", which works for now because I don't currently need what the other data types offer. It would still be great to understand the underlying problem.
3. What changed between runs?
I think the number of corrupted/error records correlated with the number of records in fields with non-string data types. But it was hard to pinpoint exactly which fields are problematic.
4. Can you create a test data source with a small number (~100) records, to test?
Yes, this is a good idea. Will do when I need to use the data types (e.g. birthday for measures).
5. Did your schema in your input data source change since you originally ingested this data?
Yes, it changed several times. See my answers to Q2 and Q3.
Thanks. When I have the time, I'll carry out your suggestion from Q4.
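For anyone who wants to try the small test from Q4 before re-ingesting, here is a minimal sketch that takes the first ~100 rows of an exported CSV and counts, per column, the values that do not parse as numbers. The column names and sample data are made up, and Python's `float()` is only a stand-in for whatever parsing CI really applies, so treat the result as a hint rather than a verdict.

```python
import csv
import io
from itertools import islice


def count_non_numeric(rows, numeric_columns):
    """Count, per column, the non-empty values that float() cannot parse."""
    bad = {col: 0 for col in numeric_columns}
    for row in rows:
        for col in numeric_columns:
            value = (row.get(col) or "").strip()
            if not value:
                continue  # empty values are missing, not malformed
            try:
                float(value)
            except ValueError:
                bad[col] += 1
    return bad


# Hypothetical sample data; in practice, open the exported source file instead.
sample_csv = "id,age,score\n1,34,7.5\n2,n/a,8\n3,,9.1\n4,41,abc\n"
reader = csv.DictReader(io.StringIO(sample_csv))
first_rows = list(islice(reader, 100))  # keep the test set small

result = count_non_numeric(first_rows, ["age", "score"])
print(result)
```

Columns with a non-zero count would be the first candidates to check in the corrupted entity.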
Hi rdj,
Which connector are you using? And how is your date field formatted?
In my experience, formatting dates as yyyy/MM/dd HH:mm:ss solved my corrupt records when using the CDM connector.
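If the dates have to be rewritten before ingestion, something along these lines could do it. This is only a sketch: the list of input formats is an assumption about what the source data might contain, and `%Y/%m/%d %H:%M:%S` is the Python spelling of yyyy/MM/dd HH:mm:ss.

```python
from datetime import datetime

# Python equivalent of the yyyy/MM/dd HH:mm:ss pattern mentioned above.
TARGET_FORMAT = "%Y/%m/%d %H:%M:%S"

# Assumed input formats; adjust to whatever the source system really emits.
KNOWN_INPUT_FORMATS = ["%Y-%m-%dT%H:%M:%S", "%d.%m.%Y %H:%M", "%Y-%m-%d"]


def normalize_datetime(text):
    """Return text reformatted to TARGET_FORMAT, or None if no format matches."""
    text = text.strip()
    for fmt in KNOWN_INPUT_FORMATS:
        try:
            return datetime.strptime(text, fmt).strftime(TARGET_FORMAT)
        except ValueError:
            continue  # try the next candidate format
    return None


print(normalize_datetime("2021-03-09T14:05:00"))
print(normalize_datetime("not a date"))
```

Values that come back as `None` match none of the listed formats and are worth inspecting by hand.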
Gotcha.
It looks like it's reporting 100% of records are corrupt (2971761 out of 6971761 corrupt records)
Is that your total data set (~7M)?
Has this ever ingested cleanly?
What changed between runs?
Can you create a test data source with a small number (~100) records, to test?
Did your schema in your input data source change since you originally ingested this data?
Hi Scott,
Thanks for your reply.
I did. In fact I've been fixing previous issues by looking into the corrupt entities (e.g. replacing \n in the data that CI interprets as a new line - you see it in the entity name "no newline").
But this time, the data in the last corrupt entity looks correct.
Anywhere else I could check?
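For reference, the kind of newline cleanup described above can be sketched as follows. It assumes the export is a properly quoted CSV, so embedded line breaks sit inside quoted fields; the function name and sample data are made up for illustration.

```python
import csv
import io


def strip_embedded_newlines(src, dst):
    """Copy CSV rows from src to dst, replacing line breaks inside fields with spaces."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        # splitlines() handles \n, \r\n and \r; the pieces are rejoined with a space
        writer.writerow([" ".join(field.splitlines()) for field in row])


# Hypothetical sample: the second record has a line break inside a quoted field.
source = io.StringIO('id,comment\n1,"first line\nsecond line"\n2,plain\n', newline="")
target = io.StringIO()
strip_embedded_newlines(source, target)
cleaned = target.getvalue()
print(cleaned)
```

With real files, open both with `newline=""` as the `csv` module documentation recommends, so that line endings inside quoted fields are passed through to the reader unchanged.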
Hi RDJ,
Did you click Entities and look for "KDB_merged_comma_with_datetime_no_newline_nocol1_corrupted"?
This should have an indication of what went wrong in your imported data.