Do you detect the language of messages?

The language of a spam message has become important to customers who train heuristic spam filters. Abusix therefore classifies the content of the message body using a common language detection library.

Description

Our challenge with language filtering is to deliver enough spam in each language feed to make it valuable to you, while balancing two risks: language tagging that is too strict causes false negatives, and tagging that is too loose lets in false positives.

To identify the language, we (1) normalize the text in the message body and (2) require a minimum amount of clean text in the message before making a language tag decision.

As a result, Abusix does not assign a language to emails with too little content or with too many special symbols.
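To illustrate the idea, here is a minimal Python sketch of a "normalize, then require enough clean text" gate. It is not Abusix's implementation: the function names, the URL-stripping step, and both thresholds (`MIN_CLEAN_CHARS`, `MAX_SYMBOL_RATIO`) are assumptions chosen only for the example.

```python
import re
import unicodedata

# Hypothetical thresholds, picked for illustration only.
MIN_CLEAN_CHARS = 20     # minimum number of letters needed to attempt a tag
MAX_SYMBOL_RATIO = 0.3   # maximum share of non-letter characters allowed


def normalize(body: str) -> str:
    """Unicode-normalize the body, drop URLs, and collapse whitespace."""
    text = unicodedata.normalize("NFKC", body)
    text = re.sub(r"https?://\S+", " ", text)
    return re.sub(r"\s+", " ", text).strip()


def is_taggable(body: str) -> bool:
    """Return True if the body has enough clean text for a language decision."""
    text = normalize(body)
    visible = [c for c in text if not c.isspace()]
    if not visible:
        return False
    letters = sum(c.isalpha() for c in visible)
    # Everything that is not a letter (digits, punctuation, symbols)
    # counts toward the symbol ratio in this sketch.
    symbol_ratio = 1 - letters / len(visible)
    return letters >= MIN_CLEAN_CHARS and symbol_ratio <= MAX_SYMBOL_RATIO
```

In this sketch, a message that fails the gate would simply carry no language tag, which mirrors the behavior described above. Raising or lowering the two thresholds corresponds to tightening or loosening the filter.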

So, if what we do isn't 100% perfect for you and you want to tighten or loosen the filter in some manner, please let us know and we will try to make adjustments accordingly.

If you have questions, please contact our support.

JSON Field / Filter

Our JSON contains a Language field, which may also be used as a filter.
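As a rough illustration of filtering on that field client-side, the Python sketch below keeps only records whose language code is in a wanted set. The key name `language`, the newline-delimited JSON layout, and the helper name `filter_by_language` are all assumptions made for the example; check the actual feed schema before relying on them.

```python
import json

# Assumed key name; the real field name in the feed may differ.
LANGUAGE_KEY = "language"


def filter_by_language(lines, wanted):
    """Yield parsed records whose language code is in `wanted`.

    `lines` is any iterable of JSON strings, one record per line (assumed).
    Records without a language tag are skipped.
    """
    wanted = {code.lower() for code in wanted}
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        code = record.get(LANGUAGE_KEY)
        if isinstance(code, str) and code.lower() in wanted:
            yield record


# Example with made-up records: keep German and French messages only.
feed = [
    '{"id": 1, "language": "de"}',
    '{"id": 2, "language": "en"}',
    '{"id": 3}',
]
german_or_french = list(filter_by_language(feed, {"de", "fr"}))
```

Records with no language tag, such as messages with too little clean text, are dropped by this filter rather than treated as a match.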

Languages

We detect and filter on the following languages:

  • Albanian (sq)
  • Arabic (ar)
  • Armenian (hy)
  • Azerbaijani (az)
  • Belarusian (be)
  • Bengali (bn)
  • Norwegian Bokmål (nb)
  • Bosnian (bs)
  • Bulgarian (bg)
  • Catalan (ca)
  • Chinese (zh)
  • Croatian (hr)
  • Czech (cs)
  • Danish (da)
  • Dutch (nl)
  • English (en)
  • Esperanto (eo)
  • Estonian (et)
  • Finnish (fi)
  • French (fr)
  • Ganda (lg)
  • Georgian (ka)
  • German (de)
  • Greek (el)
  • Gujarati (gu)
  • Hebrew (he)
  • Hindi (hi)
  • Hungarian (hu)
  • Icelandic (is)
  • Indonesian (id)
  • Italian (it)
  • Japanese (ja)
  • Kazakh (kk)
  • Korean (ko)
  • Latvian (lv)
  • Lithuanian (lt)
  • Macedonian (mk)
  • Malay (ms)
  • Marathi (mr)
  • Mongolian (mn)
  • Norwegian Nynorsk (nn)
  • Persian (fa)
  • Polish (pl)
  • Portuguese (pt)
  • Punjabi (pa)
  • Romanian (ro)
  • Russian (ru)
  • Serbian (sr)
  • Shona (sn)
  • Slovak (sk)
  • Slovene (sl)
  • Sotho (st)
  • Spanish (es)
  • Swahili (sw)
  • Swedish (sv)
  • Tamil (ta)
  • Telugu (te)
  • Thai (th)
  • Tsonga (ts)
  • Tswana (tn)
  • Turkish (tr)
  • Ukrainian (uk)
  • Urdu (ur)
  • Vietnamese (vi)
  • Xhosa (xh)
  • Yoruba (yo)
  • Zulu (zu)
