Hello!
I need PHP code that parses a multidimensional array variable to sample and collect keywords and keyword phrases. Please find the complete project description below. [The space here is too small, I think.]
## Deliverables
Let me start with an example of what the content of the variable to be parsed may look like:
Array ( [title] => Array ( [0] => Alice DSL ab 14,90€/M* [1] => o2 DSL Flatrate [2] => Kabel Internet Telefon TV [3] => DSL Flatrate [4] => Festnetz+DSL trotz Schufa [5] => DSL Preisvergleich [6] => Hochwertige SDSL Leitung [7] => 1 Woche kostenlos testen [8] => SEO Tools XOVI [9] => PC Leistung Verbessern [10] => "Richtig Fett Abnehmen" [11] => Tagesgeld-Vergleich [12] => Adwords Optimieren [13] => Arcor Internet Flatrate [14] => Suchmaschinen-Marketing [15] => Flatrate Vergleich [16] => 14 Tage Diät mit Erfolg [17] => Kostenlose Übungen [18] => Unitymedia DSL [19] => für nur 0,03 C/Min Surfen ) [text] => Array ( [0] => Superflexibel ohne feste Laufzeit:
Verbinden, ohne sich zu binden! [1] => Jetzt zu o2 wechseln und
49 € Anschlusspreis sparen! [2] => Internet, Telefon & TV Tarife der
deutschen Kabel-Anbieter gibts hier [3] => Internet und Telefon 19,99€/M*.
Neu: mit kostenloser Notebook-Flat! [4] => Festnetz+DSL ab 14,90 EUR/Monat
Kein bestehender Anschluß nötig. [5] => Aktuelle DSL Angebote im Vergleich:
Preise, Leistungen, Aktionen, ... [6] => High-End Lösung für Unternehmen
Nur 99€/Monat* Hier informieren! [7] => Weil Weight Watchers funktioniert
Jetzt gesund & effektiv abnehmen! [8] => Alle Tools nur 99 € zzgl. MwSt. mtl
Preis-Leistungs-Sieger lt Suchradar [9] => Fehler bereinigen & entfernen für
einen schnelleren PC. Free Download [10] => Ohne Diät + Mit Fett - unglaublich
Hier sofort entdecken und staunen! [11] => Tagesgeld-Konten mit Top-Zinsen
im aktuellsten Online-Vergleich! [12] => Wie Sie Ihren AdWords-Umsatz
ver-10-fachen. Hier erfahren! [13] => Arcor Internet Flatrate Tarife.
Infos & Bestellung Arcor DSL hier! [14] => Großartig gefunden werden dank
unserer Suchmaschinen-Maßnahmen [15] => Großer Flatrate Vergleich von DSL-,
Telefon-, Handy-, Doppel-Flat etc. [16] => Fettverbrennung anregen & abnehmen
Ihre Traumfigur in wenigen Wochen! [17] => um die Konzentration Ihres Kindes
zu steigern. Einfach anmelden! [18] => Internet mit 32.000 kBit/s, Telefon
und HDTV bis 31.10. + 6 Mon. gratis [19] => Internet ohne Einwahlgebühr
oder Mindestumsatz ) [link] => Array ( [0] => [login to view URL] [1] => [login to view URL] [2] => [login to view URL] [3] => [login to view URL] [4] => [login to view URL] [5] => [login to view URL] [6] => [login to view URL] [7] => [login to view URL] [8] => [login to view URL] [9] => [login to view URL] [10] => [login to view URL] [11] => [login to view URL] [12] => [login to view URL] [13] => [login to view URL] [14] => [login to view URL] [15] => [login to view URL] [16] => [login to view URL] [17] => [login to view URL] [18] => [login to view URL] [19] => [login to view URL] ) )
Yeah, that looks terrible, I know. ;-) To make the structure easier to understand, here is the content as it looks when I display it with <PRE> tags:
Array
(
[title] => Array
(
[0] => Alice DSL ab 14,90€/M*
[1] => o2 DSL Flatrate
[2] => PC Leistung Verbessern
[3] => DSL Flatrate
[4] => Ihr PC wird langsamer?
[5] => SEO Tools XOVI
[6] => 1 Woche kostenlos testen
[7] => 4,45% Tagesgeld-Zinsen
[8] => AdWords Erfolg 2010
[9] => DSL Preisvergleich
[10] => Arcor Internet Flatrate
[11] => "Richtig Fett Abnehmen"
[12] => Flatrate Vergleich
[13] => Günstige Flatrate Tarife
[14] => Arcor DSL & Telefon
[15] => PC Leistung Optimieren
[16] => Unitymedia DSL
[17] => Flatrate Vergleich
[18] => 14 Tage Diät mit Erfolg
[19] => Abnehm Globuli
)
[text] => Array
(
[0] => Superflexibel ohne feste Laufzeit:Verbinden, ohne sich zu binden!
[1] => Jetzt zu o2 wechseln und49 € Anschlusspreis sparen!
[2] => Fehler bereinigen & entfernen füreinen schnelleren PC. Free Download
[3] => Internet und Telefon 19,99€/M*.Neu: mit kostenloser Notebook-Flat!
[4] => Optimiert Ihren PC in 2 MinutenDeutsche Version, 100% Kostenlos!
[5] => Alle Tools nur 99 € zzgl. MwSt. mtlPreis-Leistungs-Sieger lt Suchradar
[6] => Weil Weight Watchers funktioniertJetzt gesund & effektiv abnehmen!
[7] => Tagesgeld-Konten mit Top-Zinsenim aktuellsten Online-Vergleich!
[8] => Wie Sie Ihren AdWords-Umsatzver-10-fachen. Hier erfahren!
[9] => Aktuelle DSL Angebote im Vergleich:Preise, Leistungen, Aktionen, ...
[10] => Arcor Internet Doppelflat [login to view URL] Beratung & Auftrag hier!
[11] => Ohne Diät + Mit Fett - unglaublichHier sofort entdecken und staunen!
[12] => Großer Flatrate Vergleich von DSL-,Telefon-, Handy-, Doppel-Flat etc.
[13] => Flatrate günstig - Tarife & AngebotVergleich, Beratung & Bestellung!
[14] => 120 € Aktionsprämie plus 50 €Startguthaben. Nur diesen Monat!
[15] => Fehler bereinigen für einenschnelleren PC. Gratis Download!
[16] => 3play - dreifach faszinierend: DSL,Telefon & DTV inkl. 6 Gratismonate*
[17] => Flatrate bereits ab 19,90 Euro imMonat - zum Flatrate Vergleich
[18] => Fettverbrennung anregen & abnehmenIhre Traumfigur in wenigen Wochen!
[19] => Hochwertiges KomplexmittelStoffwechsel, Fettverbrennung
)
[link] => Array
(
[0] => [login to view URL]
[1] => [login to view URL]
[2] => [login to view URL]
[3] => [login to view URL]
[4] => [login to view URL]
[5] => [login to view URL]
[6] => [login to view URL]
[7] => [login to view URL]
[8] => [login to view URL]
[9] => [login to view URL]
[10] => [login to view URL]
[11] => [login to view URL]
[12] => [login to view URL]
[13] => [login to view URL]
[14] => [login to view URL]
[15] => [login to view URL]
[16] => [login to view URL]
[17] => [login to view URL]
[18] => [login to view URL]
[19] => [login to view URL]
)
)
That's much better, isn't it? ;-)
As you can see, there are three parallel arrays with 20 entries each. The number of entries varies: it may be fewer than 20 but never more than 20.
I am only interested in the first array ([title]). Just ignore arrays 2 and 3.
The first step is to delete from array 1 all words that are listed in a txt file. On the webserver, the file is named filterlist.txt.
To build the code, just use a txt file containing the following words:
aber
als
am
an
auch
auf
aus
bei
bin
bis
bist
da
my
In addition, all digits from 0 to 9 have to be deleted ["1304", for example, has to be deleted completely].
The special characters ,.?;- also have to be deleted.
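The filter step described above could be sketched roughly like this. It is only a minimal sketch under a few assumptions that the description does not settle: the helper names loadFilterWords() and cleanText() are made up, digits and special characters are stripped before the word check (so that "friend," becomes "friend"), filter words are matched case-insensitively, and the mbstring extension is available. Whether the titles are cleaned one by one or joined into one text first is also left open here.

```php
<?php
// Sketch only: loadFilterWords() and cleanText() are hypothetical helper names.

// Read filterlist.txt (one word per line) into a lower-cased lookup table.
function loadFilterWords(string $path): array
{
    $lines = file($path, FILE_IGNORE_NEW_LINES | FILE_SKIP_EMPTY_LINES);
    if ($lines === false) {
        return []; // file missing or unreadable: nothing to filter
    }
    $lower = array_map(function (string $word): string {
        return mb_strtolower(trim($word));
    }, $lines);
    return array_flip($lower); // word => index, so isset() can be used for lookups
}

// Remove digits, the characters , . ? ; - and all filter words from one string.
// Returns the remaining words in their original order.
function cleanText(string $text, array $filterWords): array
{
    // Assumption: characters are deleted (not replaced by a space) and this
    // happens before the word check.
    $stripped = preg_replace('/[0-9,.?;-]/', '', $text);
    $tokens = preg_split('/[ \t\r\n]+/', $stripped, -1, PREG_SPLIT_NO_EMPTY);

    $kept = [];
    foreach ($tokens as $token) {
        // Assumption: the comparison with the filter list ignores case.
        if (!isset($filterWords[mb_strtolower($token)])) {
            $kept[] = $token;
        }
    }
    return $kept;
}

// Inline list standing in for loadFilterWords('filterlist.txt').
$filter = array_flip(['aber', 'als', 'am', 'my']);
$words = cleanText('hello my friend, where are you my friend, where?', $filter);
echo implode(' ', $words), "\n"; // hello friend where are you friend where
```

Note that deleting the hyphen turns a title like "Tagesgeld-Vergleich" into the single word "TagesgeldVergleich". If the hyphen should split the word instead, it would have to be replaced by a space.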
Now the text has to be parsed twice. The first time, every single word must be checked. If it is new, it must be added to an array. If it is already in the array, its counter has to be incremented.
For example: we have the text "hello my friend, where are you my friend, where?" So we first delete everything listed in the filterlist, plus the special characters, and receive: "hello friend where are you friend where"
The array your code builds should finally contain:
$single[0] = "friend;2";
$single[1] = "where;2";
$single[2] = "hello;1";
$single[3] = "are;1";
$single[4] = "you;1";
So the words "friend" and "where" each appear twice in the text; every other word appears only once. Please sort the array by count in descending order, starting with the highest number (here: 2).
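The single-word count could look roughly like the following sketch. The function name countSingles() is made up for illustration, and the rule that words with the same count keep the order of their first appearance is an assumption read off the example above, not something the description states.

```php
<?php
// Sketch only: countSingles() is a hypothetical helper name.
// Takes the already cleaned words and returns strings of the form "word;count".
function countSingles(array $words): array
{
    $counts = []; // word => occurrences, in order of first appearance
    foreach ($words as $word) {
        $counts[$word] = ($counts[$word] ?? 0) + 1;
    }

    // Remember each word's position so ties can be broken explicitly.
    $rows = [];
    $position = 0;
    foreach ($counts as $word => $count) {
        $rows[] = [$word, $count, $position++];
    }

    // Highest count first; equal counts keep first-appearance order (assumption).
    usort($rows, function (array $a, array $b): int {
        return ($b[1] <=> $a[1]) ?: ($a[2] <=> $b[2]);
    });

    return array_map(function (array $row): string {
        return $row[0] . ';' . $row[1];
    }, $rows);
}

$single = countSingles(['hello', 'friend', 'where', 'are', 'you', 'friend', 'where']);
print_r($single); // friend;2, where;2, hello;1, are;1, you;1
```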
Okay, now to the second parsing step. The text is the same (after deleting the words and characters from the filterlist), but now do not check single words; always check two consecutive words together.
So all available word pairs from our example are:
hello friend
friend where
where are
are you
you friend
friend where
As you can see, only one word pair appears twice: "friend where". So the array for this second parsing step is:
$double[0] = "friend where;2";
$double[1] = "hello friend;1";
$double[2] = "where are;1";
$double[3] = "are you;1";
$double[4] = "you friend;1";
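The word-pair step could be sketched in the same way. Again, countPairs() is a made-up name, and the tie order (first appearance) is an assumption taken from the example. The description also does not say whether a pair may span two neighbouring titles; this sketch simply pairs up whatever word list it is given.

```php
<?php
// Sketch only: countPairs() is a hypothetical helper name.
// Takes the already cleaned words and returns strings of the form "word1 word2;count".
function countPairs(array $words): array
{
    $words = array_values($words); // make sure indexes run 0..n-1
    $total = count($words);

    $counts = []; // pair => occurrences, in order of first appearance
    for ($i = 0; $i + 1 < $total; $i++) {
        $pair = $words[$i] . ' ' . $words[$i + 1];
        $counts[$pair] = ($counts[$pair] ?? 0) + 1;
    }

    // Remember each pair's position so ties can be broken explicitly.
    $rows = [];
    $position = 0;
    foreach ($counts as $pair => $count) {
        $rows[] = [$pair, $count, $position++];
    }

    // Highest count first; equal counts keep first-appearance order (assumption).
    usort($rows, function (array $a, array $b): int {
        return ($b[1] <=> $a[1]) ?: ($a[2] <=> $b[2]);
    });

    return array_map(function (array $row): string {
        return $row[0] . ';' . $row[1];
    }, $rows);
}

$double = countPairs(['hello', 'friend', 'where', 'are', 'you', 'friend', 'where']);
print_r($double); // friend where;2, hello friend;1, where are;1, are you;1, you friend;1
```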
That's it!