The Jaro-Winkler Distance

The Jaro-Winkler distance is a string metric that measures the difference between two sequences as an edit distance. It is a variant of the Jaro distance metric, proposed by William E. Winkler in 1990, that gives more favorable scores to strings sharing a common prefix. It is often used in record linkage, data deduplication, and string matching.
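
For reference, the commonly cited definition can be written as below. The symbols are notation chosen here for illustration and do not come from the text above: m is the number of matching characters (equal characters no farther apart than half the longer string's length, rounded down, minus one), t is half the number of matched characters that appear in a different order, ℓ is the length of the common prefix (capped at 4), and p is a scaling factor, commonly taken as 0.1.

```latex
\[
sim_j =
\begin{cases}
0 & \text{if } m = 0,\\[4pt]
\dfrac{1}{3}\left(\dfrac{m}{|s_1|} + \dfrac{m}{|s_2|} + \dfrac{m - t}{m}\right) & \text{otherwise,}
\end{cases}
\]
\[
sim_w = sim_j + \ell\, p\, (1 - sim_j), \qquad d_w = 1 - sim_w .
\]
```

When the two strings share no prefix, ℓ = 0 and the Jaro-Winkler value reduces to the Jaro value.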

Areas of application

  • Record Linkage
  • Data Deduplication
  • String Matching
  • Natural Language Processing
  • Information Retrieval
  • Data Mining
  • Machine Learning
  • Artificial Intelligence

Example

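For instance, suppose we want to measure the similarity between the words ‘car’ and ‘auto’. The two words share no common prefix, so the Jaro-Winkler distance equals the Jaro distance for this pair. For a pair that does begin with the same letters, such as ‘car’ and ‘cart’, the Jaro-Winkler distance is lower than the Jaro distance, because it takes into account whether the matching letters occur at the start of the word.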
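
Comparisons like this can be checked numerically. The following is a minimal Python sketch of the commonly cited definition, not a reference implementation. The function names, the scaling factor p = 0.1, and the prefix cap of 4 are assumptions made for illustration, and the pair ‘car’/‘cart’ is added only to show a case with a shared prefix.

```python
def jaro_similarity(s1: str, s2: str) -> float:
    """Jaro similarity in [0, 1]; 1.0 means the strings are identical."""
    if s1 == s2:
        return 1.0
    len1, len2 = len(s1), len(s2)
    if len1 == 0 or len2 == 0:
        return 0.0
    # Two characters match only if they are equal and at most `window` apart.
    window = max(max(len1, len2) // 2 - 1, 0)
    matched1 = [False] * len1
    matched2 = [False] * len2
    matches = 0
    for i, ch in enumerate(s1):
        lo = max(0, i - window)
        hi = min(len2, i + window + 1)
        for j in range(lo, hi):
            if not matched2[j] and s2[j] == ch:
                matched1[i] = matched2[j] = True
                matches += 1
                break
    if matches == 0:
        return 0.0
    # Transpositions: matched characters that line up out of order, halved.
    seq1 = [c for c, m in zip(s1, matched1) if m]
    seq2 = [c for c, m in zip(s2, matched2) if m]
    transpositions = sum(a != b for a, b in zip(seq1, seq2)) // 2
    return (matches / len1 + matches / len2
            + (matches - transpositions) / matches) / 3


def jaro_winkler_similarity(s1: str, s2: str,
                            p: float = 0.1, max_prefix: int = 4) -> float:
    """Jaro similarity boosted by the shared prefix (p and max_prefix assumed)."""
    sim = jaro_similarity(s1, s2)
    prefix = 0
    for a, b in zip(s1, s2):
        if a != b or prefix == max_prefix:
            break
        prefix += 1
    return sim + prefix * p * (1 - sim)


def jaro_distance(s1: str, s2: str) -> float:
    return 1.0 - jaro_similarity(s1, s2)


def jaro_winkler_distance(s1: str, s2: str) -> float:
    return 1.0 - jaro_winkler_similarity(s1, s2)


if __name__ == "__main__":
    for a, b in [("car", "auto"), ("car", "cart")]:
        print(a, b,
              round(jaro_distance(a, b), 3),
              round(jaro_winkler_distance(a, b), 3))
    # car auto 0.472 0.472   (no shared prefix: both distances agree)
    # car cart 0.083 0.058   (shared prefix: Jaro-Winkler is lower)
```

The distance is taken here as one minus the similarity, so lower values mean more similar strings. Some libraries report the similarity directly under the name "Jaro-Winkler", so it is worth checking which convention a given tool uses.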