How to Determine the Language of an Email

Background

Our Customer Support (CS) agents serve customers globally and need to support various languages in email communication.

To translate non-English emails into English, we use a third-party translation API.

However, the API takes the original language code as an input parameter, so we must identify the email's language first.

Power Automate

To accomplish this, we plan to use the 'Detect the language being used in text' action from Power Automate's AI Builder module.

4428.pastedimage1679657562137v2.png

However, we've encountered a few challenges while implementing this solution.

Challenges faced:

1. The 'Detect the language being used in text' action limits the input text to 5,120 characters.

3058.pastedimage1679657606403v3.png

2. The email's Description field in the Dataverse table is not plain text; the email content is mixed with many HTML tags.

    This markup increases the length of the Description field and may cause errors in language identification.

Testing

With Power Automate alone, the flow first produced an error and then failed to identify the correct language.

1. Create an Email record written in German

2475.pastedimage1679711689615v1.png

In the Dataverse table, the email body is stored as HTML.

Picture1.png

2. Create an instant cloud flow in Power Automate to test it.

7774.pastedimage1679712455861v5.png

3. Running the flow produces the error shown below.

1488.pastedimage1679712390350v4.png

4. Pass only the first 5,120 characters of the Description field as the input parameter.

3021.pastedimage1679712789740v6.png

5. Test it again.

The flow ran successfully but did not return the correct result (German).

pastedimage1679721137825v3.png
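One plausible explanation, consistent with challenge 2 above, is that the first 5,120 characters of the HTML field consist largely of markup rather than German text. The following is a minimal Python sketch for checking this outside Power Automate. The helper name `visible_text_ratio` is hypothetical, and the sketch is only a diagnostic aid, not part of the flow.

```python
from html.parser import HTMLParser

MAX_CHARS = 5120  # input limit of the detect-language action, as stated in this post


class _TextCounter(HTMLParser):
    """Counts characters that are visible text rather than markup."""

    def __init__(self):
        super().__init__()
        self.text_chars = 0
        self._skip_depth = 0  # > 0 while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in ("style", "script"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("style", "script") and self._skip_depth > 0:
            self._skip_depth -= 1

    def handle_data(self, data):
        if self._skip_depth == 0:
            self.text_chars += len(data.strip())


def visible_text_ratio(html: str, limit: int = MAX_CHARS) -> float:
    """Return the share of the first `limit` characters that is visible text."""
    head = html[:limit]
    if not head:
        return 0.0
    counter = _TextCounter()
    counter.feed(head)
    counter.close()
    return counter.text_chars / len(head)
```

A low ratio for a given email body would suggest that truncating the raw HTML leaves the detector with little actual text to work with.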

HTML to Text Action

To address this problem, we planned to extract the plain text of the email from the HTML-formatted field.

Initially, we considered using regular expressions to extract the email content, but found no way to execute a regular expression in Power Automate.

Thankfully, we came across an out-of-the-box action called 'HTML to Text' that performs this conversion.

5621.pastedimage1679713719426v8.png

The updated flow is shown below.

6518.pastedimage1679713979013v9.png

Unfortunately, this did not give us the desired outcome, because the image elements could not be removed.

6327.pastedimage1679714300745v10.png

Custom Action

Therefore, we have to develop a Custom Action that extracts the plain text from the HTML, and then pass that text to the Detect Language action.

The process is illustrated in the diagram below.

0042.pastedimage1679716796652v11.png
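The extraction logic of such a Custom Action could look roughly like the sketch below. This is an illustration in Python under stated assumptions, not the actual implementation: a real Dataverse Custom Action would likely be written as a plug-in, and the names `html_to_detection_input`, `SKIPPED_TAGS`, and `BLOCK_TAGS` are hypothetical. The sketch drops all tags (including image elements and their attributes), ignores style and script content, decodes HTML entities, and truncates the result to the 5,120-character limit.

```python
from html.parser import HTMLParser

MAX_CHARS = 5120  # input limit of the detect-language action, as stated in this post
SKIPPED_TAGS = {"script", "style", "title"}  # their content is not email text
BLOCK_TAGS = {"p", "div", "br", "li", "tr", "td", "table"}  # treated as word breaks


class _PlainTextExtractor(HTMLParser):
    """Collects visible text and discards tags, attributes, and images."""

    def __init__(self):
        super().__init__()  # convert_charrefs=True decodes entities such as &uuml;
        self._parts = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in SKIPPED_TAGS:
            self._skip_depth += 1
        elif tag in BLOCK_TAGS:
            self._parts.append(" ")
        # Any other tag, e.g. <img>, contributes nothing to the output.

    def handle_endtag(self, tag):
        if tag in SKIPPED_TAGS and self._skip_depth > 0:
            self._skip_depth -= 1
        elif tag in BLOCK_TAGS:
            self._parts.append(" ")

    def handle_data(self, data):
        if self._skip_depth == 0:
            self._parts.append(data)

    def text(self) -> str:
        # Collapse runs of whitespace into single spaces.
        return " ".join("".join(self._parts).split())


def html_to_detection_input(html: str, limit: int = MAX_CHARS) -> str:
    """Return plain text from `html`, truncated to `limit` characters."""
    extractor = _PlainTextExtractor()
    extractor.feed(html)
    extractor.close()
    return extractor.text()[:limit]
```

Truncating after extraction, rather than before, means the 5,120 characters sent to the Detect Language action are all email text instead of markup.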

Conclusion

We have not found a better solution for detecting an email's language; any suggestions are highly appreciated.
