Access to match percentage


Hello all,

We created a unified profile in Customer Insights from different sources. To improve the number of deduplicated profiles, we would like to display the match percentage in the profile, so that a data steward can review it and correct the data in the source. If this value cannot be displayed, we would alternatively like to export a list of profiles whose match percentage falls between two values.

We did not find a way to do that in the profile, and we did not find this percentage in Power BI either.

Any ideas?

Thanks

  • Scott Stabbert
    RE: Access to match percentage

    Hi Vincent.
    Are you looking for something like this?

    Source Table1

    ID   Firstname  Lastname  Email         Phone       Lastupdated
    1    Bob        Smith     bob@home.com  4255551212  1/1/2000

    Source Table2

    ID   Firstname  Lastname  Email         Phone       Lastupdated
    100  Robert     Smith     bob@hme.com   4255551212  1/1/2000

    You have a rule that matches records in Table1 to Table2:

    Rule: Match on Firstname (exact), Lastname (exact), Email (fuzzy, .85), Phone (exact)

    The records match because:

    • First name: name normalization is configured for first and last names, and our advanced normalization matches Bob == Robert and scores it as an exact match (1).
    • Last name: exact match.
    • Email: bob@hme.com and bob@home.com would score about .917, which meets the threshold of .85 configured in this example.
    • Phone: exact match.

    After unification, you can check the Tables page for the ConflationMatchPairs table, which will have some of this info.
    If it doesn't have what you are looking for, give me an example of how you would like the data to look. There is also an overall score that takes all the criteria into account; here it would be (1 + 1 + .917 + 1) / 4 ≈ .979.
    Is that what you're looking for?
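The overall score described above can be sketched as a plain average of the per-field scores. This is only an illustration of the arithmetic in this reply; the function name and the equal weighting of fields are assumptions, not a description of how Customer Insights computes the value internally.

```python
def overall_score(field_scores):
    """Unweighted mean of per-field match scores (assumed equal weighting)."""
    field_scores = list(field_scores)
    if not field_scores:
        raise ValueError("at least one field score is required")
    return sum(field_scores) / len(field_scores)


# Per-field scores from the Bob/Robert example in this thread.
example_scores = {"Firstname": 1.0, "Lastname": 1.0, "Email": 0.917, "Phone": 1.0}

print(round(overall_score(example_scores.values()), 3))  # 0.979
```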

  • Vincent G
    RE: Access to match percentage

    Hello

    Thanks Scott

    Yes, I'm looking for this kind of info: the match percentage for Customer Insights profiles.

    The main objective is to create profiles with correct matches; CI provides this through the rules we implement.

    The secondary objective is to give a data steward the CI profiles whose match score falls between two percentages, so they can correct the data at the source. The next time the data is ingested into CI, the match percentage will hopefully be better.

    That's why I'm trying to access an overall match percentage for each profile. In your example it would be (1 + 1 + 0.917 + 1) / 4 ≈ 0.98, or about 98%.

    For profiles with a lower percentage, CI cannot correct the data, but a person could. To be more efficient, we would like to hand over the best percentages first, for example all profiles between 70% and 90%.

    Example:

    ID   Name    Match percentage
    002  arthur  0.89
    003  marc    0.78
    004  john    0.75

    We can see a sample of those percentages in CI (first 100 rows). We wanted this percentage in the profile but did not find a way to show it. We then connected to CI from Power BI and did not find it there either.

    If you have any ideas, I'll check them!

    Vincent

  • Scott Stabbert
    RE: Access to match percentage

    Hi Vincent,

    The scoring data is available in several tables that are generated when you unify data.

    1. For deduplication, we create a table for each of your source inputs, named Deduplication_<DataSource_TableName>. You can find these in CI on the Tables page. Each table has a Rule column and a Score column, which tell you which rule was used to match the records and what the score was for each matched row.

    2. Similarly, a single table called ConflationMatchPairs is created to provide details on cross-table matches. It has the Rule and Score columns too.

    A score of 1 is an exact match. Scores of .99 and less are fuzzy matches.
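Once one of these tables has been downloaded or exported as CSV, picking out the rows in a given score band for a data steward could look like the sketch below. Only the Rule and Score column names come from this thread; the CustomerId column, the function name, and the sample rows are made up for illustration, so adjust them to the actual export.

```python
import csv
import io


def rows_in_score_band(rows, low, high, score_column="Score"):
    """Return rows whose score lies in [low, high], best score first."""
    selected = [r for r in rows if low <= float(r[score_column]) <= high]
    return sorted(selected, key=lambda r: float(r[score_column]), reverse=True)


# Made-up sample standing in for an exported table; with a real file, use
# csv.DictReader(open(path, newline="")) instead of the in-memory string.
sample_csv = """CustomerId,Rule,Score
002,Rule1,0.89
003,Rule1,0.78
004,Rule2,0.75
005,Rule1,1
006,Rule2,0.62
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))

# Profiles between 70% and 90%, as in the request earlier in the thread.
for row in rows_in_score_band(rows, 0.70, 0.90):
    print(row["CustomerId"], row["Rule"], row["Score"])
```

Exact matches (score 1) and weak matches below the lower bound are left out, so the steward only sees the band that is worth reviewing.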

    It is not possible to show the scores for rows that were not matched, as that would require computing the score for every possible row combination. For just 100 rows, there would be 4,950 distinct pairs to score (5,050 if each row is also compared with itself).
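The growth in candidate pairs is easy to check. The sketch below counts unordered pairs of distinct rows, n(n-1)/2, and also the count that includes each row paired with itself, n(n+1)/2; the function names are illustrative only.

```python
def distinct_pairs(n):
    """Number of unordered pairs of two different rows out of n."""
    return n * (n - 1) // 2


def pairs_including_self(n):
    """Same count, but also pairing each row with itself."""
    return n * (n + 1) // 2


for n in (100, 10_000, 1_000_000):
    print(n, distinct_pairs(n), pairs_including_self(n))
```

The count grows roughly with the square of the row count, which is why scoring every unmatched combination is not practical at realistic table sizes.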

  • Badge Profile Picture
    30 on at
    RE: Access to match percentage

    You could try creating a customer attribute measure that joins the customer profile to the system deduplication and conflation entities. These should include the match percentage.

  • Suggested answer
    Scott Stabbert
    RE: Access to match percentage

    Hi Vincent,

    Customer Insights writes an entity to Dataverse for each table you deduplicate. The tables are named Deduplication_{datasourcename}_{tablename}. You can see these in Customer Insights on the Tables page, where you can download the first 100K rows. The entities can also be exported from the Dataverse data lake and loaded into the environment of your choice for analysis and processing.

    When rows are matched by a deduplication rule, the Rule column will have the name of the rule used to make the match and the Score column will have a number between 0 and 1, where 1 is an exact match. A typical fuzzy match score would look like .8956.

    The score is determined by a number of factors, but the primary one is 'Edit Distance'. This is the number of single-character edits needed to turn string A into string B. For example, robertjones@hotmail.com and robrtjones@hotmaik.com have an edit distance of 2, since the latter has a missing 'e' and a 'k' instead of an 'l'.

    The score is computed using (Longest string length - Edit Distance) / Longest string length

    Here the longer string has 23 characters, so (23 - 2) / 23 = 21/23 ≈ .91
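The formula above can be tried out with a small sketch. It assumes the standard Levenshtein distance (insertions, deletions, and substitutions each cost 1) and ignores the other factors the product takes into account, so treat it as an approximation of the idea and not as the scoring Customer Insights actually runs. The function names are illustrative.

```python
def edit_distance(a, b):
    """Levenshtein distance between a and b, computed row by row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            substitution_cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,                      # delete char_a
                current[j - 1] + 1,                   # insert char_b
                previous[j - 1] + substitution_cost,  # substitute or keep
            ))
        previous = current
    return previous[-1]


def similarity_score(a, b):
    """(longest string length - edit distance) / longest string length."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty strings are treated as identical
    return (longest - edit_distance(a, b)) / longest


pairs = [
    ("robertjones@hotmail.com", "robrtjones@hotmaik.com"),
    ("bob@home.com", "bob@hme.com"),
]
for left, right in pairs:
    print(left, right, edit_distance(left, right),
          round(similarity_score(left, right), 3))
```

For the two address pairs used in this thread, the sketch gives scores of about .91 and .917, matching the figures quoted above.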
