Unix Commands Help: Grep & Diff


I need help grepping files, diffing them & extracting data. Ubuntu Linux.

I am running:

grep '^[a-zA-Z0-9-]\+ AA .*'[url removed, login to view]|sed 's/AA .*//'|uniq >[url removed, login to view]

I end up with a clean file called file2.txt.

I have to download this file daily & grep it. So tomorrow I will run the above grep command again, & I need to diff today's & yesterday's file2.txt. I would like an output of any differences between the two files.

Difference # 1

Make a new text file of domains from yesterday's text file that are missing from today's.

Difference # 2

Make a new text file of new domains that appear in today's list that weren't in yesterday's list.
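One possible way to produce both difference files is `sort` plus `comm`, which stream the data and so should cope with lists derived from multi-gigabyte inputs. This is only a sketch: the function name `domain_diff` and all file names are assumptions, and it reads Difference # 1 as "in yesterday's list but not in today's".

```shell
#!/bin/sh
# Sketch only: domain_diff and its file arguments are hypothetical names.
# Usage: domain_diff YESTERDAY TODAY REMOVED_OUT ADDED_OUT
domain_diff() {
    yesterday=$1; today=$2; removed=$3; added=$4
    tmp_dir=$(mktemp -d) || return 1

    # comm needs sorted input. LC_ALL=C gives a fast byte-wise order, and
    # sort -u also drops non-adjacent duplicates that uniq alone would keep.
    LC_ALL=C sort -u "$yesterday" > "$tmp_dir/y.sorted"
    LC_ALL=C sort -u "$today" > "$tmp_dir/t.sorted"

    # -23 keeps lines found only in the first file (gone since yesterday).
    LC_ALL=C comm -23 "$tmp_dir/y.sorted" "$tmp_dir/t.sorted" > "$removed"
    # -13 keeps lines found only in the second file (new today).
    LC_ALL=C comm -13 "$tmp_dir/y.sorted" "$tmp_dir/t.sorted" > "$added"

    rm -rf "$tmp_dir"
}
```

For example, `domain_diff file2_yesterday.txt file2.txt removed.txt added.txt`, where `file2_yesterday.txt` would be a copy of the previous day's file2.txt kept before the new download.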

For the differences found, I need to go back to the original file & extract the data to the right of the delimiter AA.
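A single `awk` pass over the original file is one way to do this extraction without rescanning it once per domain. The sketch below assumes lines shaped like `<domain> AA <data>` with single spaces, as the grep pattern above suggests, and assumes the list of changed domains is small enough to hold in memory. The function name `extract_aa` and the tab-separated output are hypothetical choices.

```shell
#!/bin/sh
# Sketch only: extract_aa is a hypothetical helper name.
# Usage: extract_aa DOMAIN_LIST ORIGINAL_FILE
# Prints "<domain><TAB><data right of AA>" for every listed domain.
extract_aa() {
    # An empty list means nothing to extract (and would confuse NR == FNR).
    [ -s "$1" ] || return 0
    awk '
        # First file: remember each domain, trimming trailing blanks
        # such as the space left behind by sed s/AA .*//.
        NR == FNR { sub(/[ \t]+$/, ""); if ($0 != "") want[$0] = 1; next }
        # Second file: keep only "<domain> AA ..." lines for wanted domains.
        $2 == "AA" && ($1 in want) {
            data = $0
            sub(/^[^ ]+ AA /, "", data)   # drop the "<domain> AA " prefix
            print $1 "\t" data
        }
    ' "$1" "$2"
}
```

For example, `extract_aa added.txt original.txt > added.tsv`, where both input names are placeholders for the difference list and the downloaded file.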

I then need to populate a MySQL database with the daily differences & provide a search feature.

Here are the 3 fields I will need to search - [url removed, login to view]

I have written PHP regex scripts to import the data into MySQL, but they are too slow. The files vary in size from 500MB to 1GB, & one file is 6GB. grep, however, handles my command above on these massive files.
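Since grep already copes with these files, one option worth trying is to keep the parsing in the shell and hand MySQL a ready-made tab-separated file through `LOAD DATA LOCAL INFILE`, which is generally much faster than inserting row by row from a script. The sketch below only prints the statement. The helper name `print_load_sql`, the table name, and the two columns `domain` and `data` are placeholders, not the three search fields mentioned above.

```shell
#!/bin/sh
# Sketch only: print_load_sql, the table and the column names are hypothetical.
# Usage: print_load_sql TSV_FILE TABLE
# Prints a LOAD DATA statement for a file with one "domain<TAB>data" row per line.
print_load_sql() {
    # In an unquoted here-document, \t and \n stay literal, so MySQL
    # receives them as its own escape sequences.
    cat <<EOF
LOAD DATA LOCAL INFILE '$1'
INTO TABLE $2
FIELDS TERMINATED BY '\t'
LINES TERMINATED BY '\n'
(domain, data);
EOF
}

# Possible use (assumes local_infile is enabled on both client and server,
# and that the database "mydb" and the table already exist):
#   print_load_sql /path/to/added.tsv domain_changes | mysql --local-infile=1 mydb
```

Printing the statement rather than running it keeps the sketch safe to try, and the output can be reviewed before piping it to the `mysql` client.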

Skills: Anything Goes, MySQL

About the Employer:
(11 reviews)

Project ID: #1971945

Awarded to:


I'm sure we can get this sorted out ^.^

$50 USD in 1 day
(70 reviews)