I believe Smart matching would work.
Smart matching relies entirely on similarity between emails. The subject and the recipient list (from, to, cc and bcc) are the two components that are considered when checking for similarity.
When an email is sent from CRM, there are two sets of hashes generated for it and stored in the database.
- Subject hashes:
To generate subject hashes, the subject of the email (which may include the CRM token if its usage is enabled in system settings) is first checked for noise words such as RE: and FW:. The noise words are stripped from the subject, and the remaining text is tokenized. All non-empty tokens (words) are then hashed to generate the subject hashes.
- Recipient hashes:
To generate the recipient hashes, the recipient list (from, to, cc, bcc) is analyzed for unique email addresses. For each unique email address, an address hash is generated.
Next, when an incoming email arrives and is tracked in CRM, the same method is used to create its subject and recipient hashes.
To find the correlation between the incoming email and the outgoing email the stored subject and recipient hashes are searched for matching values. Two emails are correlated if they have the same count of subject hashes and at least two matching recipient hashes.
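Here is a rough Python sketch of the steps above, just to make the flow concrete. It is not the actual CRM code: SHA-1, lowercasing, whitespace tokenization and the names `EmailHashes`, `hash_email` and `is_correlated` are all my own assumptions. The noise filter is the default expression quoted later in this post. Requiring the subject hashes themselves to be equal (not only their count) is my reading of "searched for matching values".

```python
from __future__ import annotations

import hashlib
import re
from dataclasses import dataclass

# Default noise filter quoted later in this post (HashFilterKeywords).
NOISE = re.compile(r"^[\s]*([\w]+\s?:[\s]*)+")


def _hash(value: str) -> str:
    # Assumption: SHA-1 over lowercased text stands in for CRM's real hash.
    return hashlib.sha1(value.lower().encode("utf-8")).hexdigest()


@dataclass(frozen=True)
class EmailHashes:
    subject: tuple[str, ...]
    recipients: frozenset[str]


def hash_email(subject: str, addresses: list[str]) -> EmailHashes:
    # Strip leading noise such as "RE:" / "FW:", then tokenize on whitespace.
    tokens = NOISE.sub("", subject).split()
    # One hash per unique address across from, to, cc and bcc.
    unique = {a.strip().lower() for a in addresses if a.strip()}
    return EmailHashes(
        subject=tuple(_hash(t) for t in tokens),
        recipients=frozenset(_hash(a) for a in unique),
    )


def is_correlated(outgoing: EmailHashes, incoming: EmailHashes) -> bool:
    # Default rule from the post: same subject hash count and
    # at least two matching recipient hashes.
    same_subject = (
        len(outgoing.subject) == len(incoming.subject)
        and set(outgoing.subject) == set(incoming.subject)
    )
    shared = outgoing.recipients & incoming.recipients
    return same_subject and len(shared) >= 2


sent = hash_email("Quote for order", ["sales@contoso.com", "amy@example.com"])
reply = hash_email("RE: Quote for order", ["amy@example.com", "sales@contoso.com"])
other = hash_email("RE: Quote for order", ["bob@example.com", "sales@contoso.com"])

print(is_correlated(sent, reply))  # True
print(is_correlated(sent, other))  # False
```

The second comparison fails because only one address hash is shared, which is below the default minimum of two.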
How can smart matching be configured?
One size never fits all, so the correlation constraint described above, which is the default out-of-the-box CRM behavior, can be configured to suit individual needs.
There are four registry keys that let you adjust the smart matching behavior. These keys must be added only under the CRM server registry hive (HKLM\Software\Microsoft\MSCRM).
1) HashFilterKeywords
- Description: This is a regular expression that is used to cancel out the noise in the subject line. All matching instances of the regular expression present in the subject line are replaced with empty strings before generating the subject hashes.
- Default value: ^[\s]*([\w]+\s?:[\s]*)+
In other words, by default we ignore any word (or run of such words) at the start of the subject line that is followed by a ":". For example:
| # | Subject | Ignored words |
| 1 | Test | None |
| 2 | RE: Test | RE: |
| 3 | FW: RE: Test | FW: RE: |
Note: By default, leading phrases such as “Out of office:” are not ignored, because the first word of the phrase is not directly followed by a “:”. To ignore this phrase, update the regular expression in the registry to “^[\s]*([\w]+\s?:[\s]*)+|Out of office:”. Do not include the double quotes shown around the string in this example; the registry value should contain only the regular expression you want to use for ignoring words in the subject line.
2) HashMaxCount
- Description: This is the maximum number of hashes that will be generated for any subject or recipient list. For example, with the default value, if the subject contains more than 20 words after noise cancellation, only the first 20 words are considered.
- Default value: 20
3) HashDeltaSubjectCount
- Description: This is the maximum delta allowed between subject hash counts of the emails to be correlated.
- Default value: 0
4) HashMinAddressCount
- Description: This is the minimum hash count matches required on the recipients list for the emails to be correlated.
- Default value: 2
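To show how the four settings interact, here is a small Python sketch. It is only an illustration under my own assumptions: the `SmartMatchSettings` class, its field names, and the `subject_tokens` and `thresholds_met` helpers are hypothetical, and Python's `re` module stands in for the regular expression engine CRM actually uses. The threshold check follows the descriptions above literally: it compares hash counts, not the hashes themselves.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class SmartMatchSettings:
    # Defaults as listed in this post; the field names are illustrative.
    hash_filter_keywords: str = r"^[\s]*([\w]+\s?:[\s]*)+"  # HashFilterKeywords
    hash_max_count: int = 20                                # HashMaxCount
    hash_delta_subject_count: int = 0                       # HashDeltaSubjectCount
    hash_min_address_count: int = 2                         # HashMinAddressCount


def subject_tokens(subject: str, settings: SmartMatchSettings) -> list[str]:
    # Remove every match of the filter, then keep at most hash_max_count words.
    cleaned = re.sub(settings.hash_filter_keywords, "", subject)
    return cleaned.split()[: settings.hash_max_count]


def thresholds_met(
    subject_count_a: int,
    subject_count_b: int,
    matching_addresses: int,
    settings: SmartMatchSettings,
) -> bool:
    # Subject hash counts may differ by at most the configured delta, and
    # enough recipient hashes must match.
    delta = abs(subject_count_a - subject_count_b)
    return (
        delta <= settings.hash_delta_subject_count
        and matching_addresses >= settings.hash_min_address_count
    )


defaults = SmartMatchSettings()
relaxed = SmartMatchSettings(
    hash_filter_keywords=defaults.hash_filter_keywords + "|Out of office:",
    hash_delta_subject_count=1,
    hash_min_address_count=1,
)

print(subject_tokens("Out of office: Test", defaults))  # ['Out', 'of', 'office:', 'Test']
print(subject_tokens("Out of office: Test", relaxed))   # ['Test']
print(thresholds_met(3, 4, 1, defaults))  # False
print(thresholds_met(3, 4, 1, relaxed))   # True
```

With the relaxed settings, a reply whose subject has one extra word and that shares only one address with the original would still pass the thresholds, so loosening these values makes false matches more likely.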
Limitations:
The email hashes are generated when the emails are sent out. If you change HashFilterKeywords or HashMaxCount via the registry, only new outgoing and incoming emails are affected. The existing email hashes are not recalculated, and CRM does not provide any out-of-the-box functionality to recalculate them.
Smart matching also currently has no time limit on how old the correlated email can be. We plan to address this in CRM 5.0, along with other improvements to smart matching.
I hope this helps!
Ruskin | User of Microsoft Teams and MS Dynamics 365