How to look for similar expressions (similarities) in a list of strings in PHP
Source code:
$list = [
    'a' => 'Ligula, potenti elementum aenean incididunt velit ullamco leo etiam. ',
    'b' => 'Sollicitudin habitasse fugiat ante aptent, vitae facilisis varius netus id porta.',
    'c' => 'Vitae vivamus posuere ad commodo et cubilia mattis et quisque.',
    'd' => 'Euismod donec primis convallis laborum diam ultrices dolor ut suscipit ad incididunt facilisis, tristique mattis velit',
    'demo1' => 'Clas conubia molestie elite ultriciesus laoret neque veniam, fringilus risus.',
    'demo2' => 'Class conubia molestie elit ultricies laoreet neque veniam, fringilla risus.',
    'demo3' => 'Molestie elit ultricies laoreet',
    'x' => 'Massa himenaeos diam.',
];
$phrase = "Fringilla risus, class conubia molestie elit ultricies laoreet neque veniam. ";
/**
 * Look for similar expressions (similarities) in a list of strings.
 * $needle   : string; the phrase you are looking for
 * $haystack : the array to search in; every element must be a string, otherwise the function may raise a warning or error!
 * $maxhit   : set by the function (passed by reference) to the score of a perfect match; compare it with the returned scores to judge how close each match is
 * $minhit   : set by the function (passed by reference); scores below $minhit can be ignored, because the chance of similarity is very small
 * ########
 * The result is an array of scores with the same keys as $haystack, sorted in descending order.
 * The scores are not linear: each character adds the length of the current matching run, so long runs weigh far more than short ones.
 * The higher the score, the better the similarity.
 */
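function search_similarity(string $needle, array $haystack, &$maxhit = 0, &$minhit = 0) {
    $len = strlen($needle);
    // Score of a perfect match: 1 + 2 + ... + $len
    $maxhit = $len * ($len + 1) / 2;
    // Same formula applied to a quarter of the needle length
    $minhit = round(($len / 4) * (($len / 4) + 1) / 2);
    $hit_list = [];
    foreach ($haystack as $key => $val) {
        $hits = 0;
        $s = ''; // current run of needle characters that occurs in $val
        for ($i = 0; $i < $len; $i++) {
            $s .= $needle[$i];
            // strpos() is case-sensitive; drop the run once it no longer occurs in $val
            if (strpos($val, $s) === false) {
                $s = '';
            }
            $hits += strlen($s); // each step adds the length of the current run
        }
        $hit_list[$key] = $hits;
    }
    arsort($hit_list); // highest score first, keys preserved
    return $hit_list;
}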
$similarity = search_similarity($phrase, $list, $maxhit, $minhit);
echo '<pre>';
echo 'LOOK for: '.PHP_EOL.$phrase;
echo PHP_EOL.PHP_EOL;
echo 'SEARCH in LIST: ';
print_r($list);
echo PHP_EOL;
echo 'SEARCH RESULT: ';
print_r($similarity);
echo PHP_EOL;
echo 'maxhit='.$maxhit.'; minhit='.$minhit;
Result example:
LOOK for:
Fringilla risus, class conubia molestie elit ultricies laoreet neque veniam.
SEARCH in LIST: Array
(
[a] => Ligula, potenti elementum aenean incididunt velit ullamco leo etiam.
[b] => Sollicitudin habitasse fugiat ante aptent, vitae facilisis varius netus id porta.
[c] => Vitae vivamus posuere ad commodo et cubilia mattis et quisque.
[d] => Euismod donec primis convallis laborum diam ultrices dolor ut suscipit ad incididunt facilisis, tristique mattis velit
[demo1] => Clas conubia molestie elite ultriciesus laoret neque veniam, fringilus risus.
[demo2] => Class conubia molestie elit ultricies laoreet neque veniam, fringilla risus.
[demo3] => Molestie elit ultricies laoreet
[x] => Massa himenaeos diam.
)
SEARCH RESULT: Array
(
[demo2] => 1705
[demo3] => 493
[demo1] => 459
[d] => 99
[a] => 77
[c] => 60
[b] => 56
[x] => 43
)
maxhit=3003; minhit=195
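The raw scores are easier to read once they are related to $maxhit and cut off at $minhit, as the function comment suggests. The following is only a minimal sketch of that idea: rank_similarity() is a hypothetical helper that is not part of the code above, and the percentage scale and one-decimal rounding are assumptions made for illustration. The input values are copied from the result example.

```php
<?php
// Hypothetical helper (an assumption, not part of the original code):
// drops scores below $minhit and expresses the rest as a percentage of $maxhit.
function rank_similarity(array $hits, float $maxhit, float $minhit): array
{
    if ($maxhit <= 0) {
        return []; // empty needle: nothing to compare against
    }
    $ranked = [];
    foreach ($hits as $key => $score) {
        if ($score < $minhit) {
            continue; // too few hits to count as a plausible match
        }
        $ranked[$key] = round($score / $maxhit * 100, 1);
    }
    arsort($ranked); // best match first, keys preserved
    return $ranked;
}

// Scores, $maxhit and $minhit copied from the result example above.
$similarity = [
    'demo2' => 1705, 'demo3' => 493, 'demo1' => 459, 'd' => 99,
    'a' => 77, 'c' => 60, 'b' => 56, 'x' => 43,
];
$ranked = rank_similarity($similarity, 3003, 195);
print_r($ranked);
```

With these numbers only demo2, demo3 and demo1 remain, at roughly 56.8, 16.4 and 15.3 percent of $maxhit. Since the scores are not linear, the percentage is best read as a ranking aid rather than as a true measure of how much of the text matches.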