
deduplication fails to match "%street" and "%str"

Posted by Paulien Lb

Hi,

I'm trying to deduplicate contacts from D365 CE.

Although firstname, lastname, address1_line2, address1_postcode and address1_city are similar, contacts won't match if address1_line1 is "Main Str." for one and "Main Street" for the other.

Please note that Main Street is just an example; in reality the address1_line1 field will have all kinds of street names of varying lengths.

I have switched on all normalization options and played with precision, but with the varying field length this still doesn't meet the need. Am I overlooking something?

So I was considering alias mapping, but does it allow wildcards to cover all values ending with "%str." and "%street"?

If so: how would that work exactly, which wildcards can I use?

Thanks

Paulien

All responses (3)

Answers (0)

Paulien Lb 125 on at

RE: deduplication fails to match "%street" and "%str"

Thanks Ryan and Scott!

We're ingesting German data, where Str. is the common abbreviation :-)

Transforming at the source is currently not an option, but luckily we're ingesting with PQE, so I could do it there.

Paulien
Suggested answer

Scott Stabbert on at

RE: deduplication fails to match "%street" and "%str"

Hi Paulien,

As Bill correctly called out, Str. is not a common abbreviation for street.

The Normalize option "Type (Phone, Name, Address, Organization)" does allow Customer Insights to match common abbreviations without considering punctuation or case, so

Street = ST = ST. = St = St.

However, we do not match on Str.

You also tried to use Precision to match Street and Str.

This doesn't work because we don't do fuzzy matching on strings 1-3 characters in length, so Precision has no impact on "STR".

I would suggest a data transform of Str. > St. either in the source data or as data is imported into CI.
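
To make the suggested Str. > St. transform concrete, here is a minimal sketch in Python rather than in any Customer Insights or Power Query syntax. The function name `normalize_street_suffix` and the regular expression are illustrative assumptions; the same logic would need to be re-expressed as a transform step in whatever tool performs the import.

```python
import re

# Assumption: only a standalone trailing token "Str" or "Str." is rewritten.
# The lookbehind requires whitespace before it, so "Street" and joined forms
# such as "Hauptstr." are left alone.
_TRAILING_STR = re.compile(r"(?<=\s)str\.?\s*$", re.IGNORECASE)


def normalize_street_suffix(address_line):
    """Rewrite a trailing 'Str.' token to 'St.'; return other values unchanged."""
    if address_line is None:
        return None
    return _TRAILING_STR.sub("St.", address_line.strip())


if __name__ == "__main__":
    for value in ["Main Str.", "Main str", "Main Street", "Hauptstr."]:
        print(value, "->", normalize_street_suffix(value))
```

Joined forms where the abbreviation is attached to the street name are deliberately not touched here; they would need their own rule once the data has been profiled.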
Suggested answer

Bill Ryan 8 on at

RE: deduplication fails to match "%street" and "%str"

I think Str. specifically is not a standard abbreviation, so it might be the problem. If you're using Power Query, it's easy enough to quickly profile the data (using split) and see what variations appear at the end. One replace of str. with st would probably do it. With distance algorithms being used, there's likely to be too much variety unless the abbreviations are somewhat standardized. AFAIK, there's no elegant way of handling all possible cases; in general you clean/transform around the most common ones and try to get the data clean in pre-processing. Another option is cloning the column, then splitting it and matching only on the first token: if it's 123 Main Street, clone the column, split on space, and match on firstname, lastname, first token of the address, postal code and city.
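
The profiling and first-token ideas above could look roughly like the following sketch. It is written in Python for illustration only, not in Power Query; the helper names `last_token_counts` and `first_token` and the sample addresses are made up for this example.

```python
from collections import Counter


def last_token_counts(address_lines):
    """Count the final whitespace-separated token of each address (lowercased)."""
    counts = Counter()
    for line in address_lines:
        tokens = (line or "").split()
        if tokens:
            counts[tokens[-1].lower()] += 1
    return counts


def first_token(address_line):
    """Return the first whitespace-separated token (lowercased), or None if empty."""
    tokens = (address_line or "").split()
    return tokens[0].lower() if tokens else None


if __name__ == "__main__":
    # Hypothetical sample values, not taken from real data.
    sample = ["Main Str.", "Main Street", "Park Str.", "Oak St"]

    # Profiling: which endings occur, and how often?
    for ending, count in last_token_counts(sample).most_common():
        print(ending, count)

    # Match key: both spellings reduce to the same first token.
    print(first_token("Main Str."), first_token("Main Street"))
```

Matching on the first token alone is coarse: for an address like 123 Main Street it is the house number, so it only works in combination with the other fields in the rule (name, postal code, city).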

As a heads-up, a similar problem I've run into is with nulls. Say that in your example address1 is null on both records and every other field is the same. With that set of merge rules they won't be merged, ostensibly because null never equals null. I've run into this a few times on both dedupe and merge. Because you generally don't want a replacement value to be visible on the card, clone the field, replace nulls in the clone with "NA" or 0, and match on the clone; you don't need to include the clone in the card and can keep showing the original (null) one. It's not the most elegant approach, but it works.
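
As a rough illustration of the clone-and-replace workaround for nulls, the sketch below adds a separate match column and leaves the original field untouched. It is plain Python, not Customer Insights or Power Query code; the function name `add_match_column`, the `_match` suffix and the sample records are assumptions made for the example.

```python
def add_match_column(records, field, placeholder="NA"):
    """Return copies of the records with an extra '<field>_match' column.

    The clone holds the placeholder wherever the original value is None,
    so two records that are both null in that field compare as equal.
    """
    result = []
    for record in records:
        clone = dict(record)  # do not mutate the caller's data
        value = record.get(field)
        clone[field + "_match"] = placeholder if value is None else value
        result.append(clone)
    return result


if __name__ == "__main__":
    # Hypothetical contacts that differ in nothing and are both null in address1_line1.
    contacts = [
        {"firstname": "Anna", "lastname": "Beispiel", "address1_line1": None},
        {"firstname": "Anna", "lastname": "Beispiel", "address1_line1": None},
    ]
    for row in add_match_column(contacts, "address1_line1"):
        print(row)
```

The match rule would then use the cloned column, while the original field, still null, is the one shown on the card.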
