Featured in PHP Weekly
Implement a "sounds like" search in PHP

Difficulty: 25 / 50

In this tutorial I will show you how to use two of PHP's lesser-known functions, metaphone($str) and levenshtein($str1, $str2), to implement a "sounds like" search. Used together, these two functions do a good job of matching phrases that contain spelling mistakes or simply sound similar.

If you are only here to get the code, you can just download the class directly from the GitHub repository I have created for it.

The metaphone() function is based on the algorithm that Lawrence Philips originally published in 1990 in Computer Language magazine, and it is used to estimate how an English word sounds.

For example, the metaphone() of the word "javascript" returns the string: JFSKRPT.
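
To get a feel for these keys, it helps to print a few of them. The sketch below uses only the built-in metaphone() function; the word list is just an illustration. Note that "fox" and "fax" end up with the same key, because vowels after the first letter are dropped.

```php
<?php
// Words chosen for illustration only.
$words = array('javascript', 'fox', 'fax', 'cat');

$keys = array();
foreach ($words as $word) {
    // metaphone() returns an uppercase phonetic key for the word
    $keys[$word] = metaphone($word);
}

foreach ($keys as $word => $key) {
    echo $word . ' => ' . $key . PHP_EOL;
}
// javascript => JFSKRPT
// fox => FKS
// fax => FKS
// cat => KT
```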

The levenshtein() function, on the other hand, returns the minimal number of single-character insertions, deletions, or replacements needed to turn one string into another. As an example, the string "javacrript" needs two edits to match the string "javascript" (insert the missing "s" and remove the extra "r"), so levenshtein('javacrript', 'javascript') returns int(2).
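
A few calls make the behaviour easier to see. This is a minimal sketch using only the built-in levenshtein() function; the variable names are mine and the strings are illustrative.

```php
<?php
// One missing "s" plus one extra "r": two edits.
$typo = levenshtein('javacrript', 'javascript');

// Identical strings have a distance of 0.
$same = levenshtein('javascript', 'javascript');

// A single replacement ("o" -> "a") counts as one edit.
$oneOff = levenshtein('fox', 'fax');

var_dump($typo, $same, $oneOff); // int(2), int(0), int(1)
```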

The "SoundsLike" class I am sharing today works by converting the input and each candidate phrase to its metaphone() form, word by word, and then applying levenshtein() to the input and each candidate. The candidate with the smallest distance is the closest matching string.

  
    <?php 
      class SoundsLike
      {

        private $searchAgainst = array();
        private $input;

        /**
        *@param $searchAgainst - an array of strings to match against $input
        *@param $input - the string for which the class finds the closest match in $searchAgainst
        */
        public function __construct($searchAgainst, $input)
        {
          $this->searchAgainst = $searchAgainst;
          $this->input = $input;
        }

        /**
        *@param $phrase - string
        *@return an array of metaphones for each word in a string
        */
        private function getMetaPhone($phrase)
        {
          $metaphones = array();
          $words = str_word_count($phrase, 1);
          foreach ($words as $word) {
            $metaphones[] = metaphone($word);
          }
          return $metaphones;
        }

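        /**
        *@return the closest matching string found in $this->searchAgainst when compared to $this->input, or null if $this->searchAgainst is empty
        */
        public function findBestMatch()
        {
          $closest = null; // stays null if there is nothing to search against
          $foundbestmatch = -1;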

          //get the metaphone equivalent for the input phrase
          $tempInput = implode(' ', $this->getMetaPhone($this->input));

          foreach ($this->searchAgainst as $phrase)
          {
            //get the metaphone equivalent for each phrase we're searching against
            $tempSearchAgainst = implode(' ', $this->getMetaPhone($phrase));
            $similarity = levenshtein($tempInput, $tempSearchAgainst);

            if ($similarity == 0) // we found an exact match
            {
              $closest = $phrase;
              $foundbestmatch = 0;
              break;
            }

            if ($similarity <= $foundbestmatch || $foundbestmatch < 0)
            {
              $closest  = $phrase;
              $foundbestmatch = $similarity;
            }
          }

          return $closest;
        }

      }
    ?>
  

To give the class a test run, use the code below, which matches an input string against an array of variations of the same phrase.

  
    <?php
    $input = "The quick brown fox jumped over the lazy dog";
    // The Metaphone will be: "0 KK BRN FKS JMPT OFR 0 LS TK"
    $searchAgainst = array("The quick brown cat jumped over the lazy dog", "Thors hammer jumped over the lazy dog", "The quick brown fax jumped over the lazy dog");
    // Metaphones will be: "0 KK BRN KT JMPT OFR 0 LS TK", "0RS HMR JMPT OFR 0 LS TK", "0 KK BRN FKS JMPT OFR 0 LS TK"

    $SoundsLike = new SoundsLike($searchAgainst, $input);
    echo $SoundsLike->findBestMatch(); // prints "The quick brown fax jumped over the lazy dog"
    ?>
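
To see why the "fax" phrase is picked, it can help to print the distance for every candidate. The snippet below is a standalone sketch that repeats the two steps the class performs. The phraseKey() helper is hypothetical and is not part of the SoundsLike class.

```php
<?php
// Hypothetical helper (not part of SoundsLike): builds the space-separated
// metaphone key for a phrase, mirroring what findBestMatch() does internally.
function phraseKey($phrase)
{
    return implode(' ', array_map('metaphone', str_word_count($phrase, 1)));
}

$input = "The quick brown fox jumped over the lazy dog";
$candidates = array(
    "The quick brown cat jumped over the lazy dog",
    "Thors hammer jumped over the lazy dog",
    "The quick brown fax jumped over the lazy dog",
);

$distances = array();
foreach ($candidates as $candidate) {
    // A lower distance means the phrase sounds closer to the input.
    $distances[$candidate] = levenshtein(phraseKey($input), phraseKey($candidate));
}

asort($distances); // smallest distance first, keys preserved
print_r($distances);
```

The "fax" phrase comes out with a distance of 0 because "fox" and "fax" share the key FKS, which is why the class treats it as an exact match and stops searching.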
  