Below are two PHP functions that return every combination of consecutive words in a phrase. The algorithm can be translated into any other language; I originally wrote it in Delphi/Pascal:
// Returns the combinations grouped by length: index 0 holds the single words,
// index 1 the two-word combinations, and so on.
function natural_combi_words($phrase){
    $arw = array();
    $words = explode(' ', trim($phrase));
    $nw = count($words);
    for ($i = 0; $i < $nw; $i++){
        $k = '';
        $cnt = 0;
        for ($j = $i; $j < $nw; $j++){
            $k .= $words[$j].' ';
            $arw[$cnt][] = trim($k);
            $cnt++;
        }
    }
    return $arw;
}

// Returns the same combinations merged into one flat array, shortest first.
function natural_combi_words2($phrase){
    $combi_source = natural_combi_words($phrase);
    $combi_words = $combi_source[0];
    $nw = count($combi_source);
    for ($i = 1; $i < $nw; $i++){
        $combi_words = array_merge($combi_words, $combi_source[$i]);
    }
    return $combi_words;
}

$phrase = "I have a dream that one day";
print_r(natural_combi_words($phrase));
print_r(natural_combi_words2($phrase));
If the phrase is "I have a dream that one day", the function natural_combi_words2() returns a flat array:
Array
(
    [0] => I
    [1] => have
    [2] => a
    [3] => dream
    [4] => that
    [5] => one
    [6] => day
    [7] => I have
    [8] => have a
    [9] => a dream
    [10] => dream that
    [11] => that one
    [12] => one day
    [13] => I have a
    [14] => have a dream
    [15] => a dream that
    [16] => dream that one
    [17] => that one day
    [18] => I have a dream
    [19] => have a dream that
    [20] => a dream that one
    [21] => dream that one day
    [22] => I have a dream that
    [23] => have a dream that one
    [24] => a dream that one day
    [25] => I have a dream that one
    [26] => have a dream that one day
    [27] => I have a dream that one day
)
and the function natural_combi_words() returns a multi-dimensional array, with one group per combination length, like this:
Array
(
    [0] => Array
        (
            [0] => I
            [1] => have
            [2] => a
            [3] => dream
            [4] => that
            [5] => one
            [6] => day
        )

    [1] => Array
        (
            [0] => I have
            [1] => have a
            [2] => a dream
            [3] => dream that
            [4] => that one
            [5] => one day
        )

    [2] => Array
        (
            [0] => I have a
            [1] => have a dream
            [2] => a dream that
            [3] => dream that one
            [4] => that one day
        )

    [3] => Array
        (
            [0] => I have a dream
            [1] => have a dream that
            [2] => a dream that one
            [3] => dream that one day
        )

    [4] => Array
        (
            [0] => I have a dream that
            [1] => have a dream that one
            [2] => a dream that one day
        )

    [5] => Array
        (
            [0] => I have a dream that one
            [1] => have a dream that one day
        )

    [6] => Array
        (
            [0] => I have a dream that one day
        )

)
I used this solution back in 2003 in a video player for movies. It looked for a movie's subtitle in the subtitles folder, even when the name of the movie was not 100% identical to the name of the subtitle. The program always chose the right subtitle if there was one!
Example:
video file:
Laughing.out.Loud.2009.1080p.Blu-ray.REMUX.AVC.DTS-HD.MA.5.1-playBD.avi
the subtitle chosen from thousands of other subtitles in the same folder:
LOL.(Laughing.out.Loud).2009.BDRip.XviD.HORiZON-ArtSubs.ENG-RO.srt
In practice, you use this algorithm to calculate a rating for every scanned subtitle and choose the one with the highest value. You can also keep the 2-3 subtitles with the highest (valid) ratings instead of just one.
Of course, the rating must be calculated from the multi-dimensional array, group by group, with the very small ones eliminated, and so on. The final solution is laborious and looks like A.I.
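To make the rating idea more concrete, here is a minimal sketch of how such a score could be computed. It is not the original 2003 program: the helper names (tokenize_name, word_runs, match_rating, rank_subtitles), the way file names are split into words, and the rule that a shared combination of k words is worth k*k points are all my assumptions for illustration. The $minLen parameter is one possible reading of "eliminating the very small ones".

```php
<?php
// Hypothetical helper: split a file name into lowercase alphanumeric words.
function tokenize_name(string $name): array {
    $tokens = preg_split('/[^a-z0-9]+/', strtolower($name), -1, PREG_SPLIT_NO_EMPTY);
    return $tokens === false ? [] : $tokens;
}

// Same grouping as natural_combi_words(): index k holds the runs of k+1 words.
function word_runs(array $words): array {
    $groups = [];
    $n = count($words);
    for ($i = 0; $i < $n; $i++) {
        $run = '';
        for ($j = $i; $j < $n; $j++) {
            $run = ltrim($run . ' ' . $words[$j]);
            $groups[$j - $i][] = $run;
        }
    }
    return $groups;
}

// Assumed scoring rule: every distinct run of k words found in both names
// adds k*k points, so longer shared runs count for much more.
// Runs shorter than $minLen words are ignored.
function match_rating(string $video, string $subtitle, int $minLen = 1): int {
    $a = word_runs(tokenize_name($video));
    $b = word_runs(tokenize_name($subtitle));
    $score = 0;
    foreach ($a as $idx => $runs) {
        $len = $idx + 1;
        if ($len < $minLen || !isset($b[$idx])) {
            continue;
        }
        $shared = array_intersect(array_unique($runs), $b[$idx]);
        $score += count($shared) * $len * $len;
    }
    return $score;
}

// Rate every candidate and keep the $top best, highest rating first.
function rank_subtitles(string $video, array $subtitles, int $top = 3): array {
    $ratings = [];
    foreach ($subtitles as $s) {
        $ratings[$s] = match_rating($video, $s);
    }
    arsort($ratings);
    return array_slice($ratings, 0, $top, true);
}

$video = 'Laughing.out.Loud.2009.1080p.Blu-ray.REMUX.AVC.DTS-HD.MA.5.1-playBD.avi';
$candidates = array(
    'LOL.(Laughing.out.Loud).2009.BDRip.XviD.HORiZON-ArtSubs.ENG-RO.srt',
    'Out.Loud.2011.srt', // made-up file name, only here for comparison
);
print_r(rank_subtitles($video, $candidates));
```

With this rule, the subtitle from the example above ranks first because it shares the whole run "laughing out loud 2009" with the video name. A real program would still need a minimum rating below which no subtitle is accepted, and that threshold is not covered here.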