Completed

Parse and clean up email database

I have a database of my personal email that I would like a script to clean up. For the purpose of this project, the email is stored in a MySQL database with the following fields: 1) ID (unique) 2) Subject 3) Sender Name 4) To 5) CC 6) Content 7) Subject Prefix 8) Normalized Subject

I can send you a sample of the database once you win the project. It stores all the emails that I received or sent in the past few years.

There are two things I need your script to do:

1) Clean up the content. Like any regular email system, mine frequently quotes the original email in a reply. Example:

o - these can be neglected.

Don

> -----Original Message-----

>From: Clinton, Hilary

>Sent: Monday, May 07, 2001 11:35 AM

>To: Trump, Donald

>Subject: RE: Simulation model structure

>

>Oh, and one question is whether I should add Low Pass Filter

>before the carrier-frequency up-conversion and after carrier-

>frequency down-conversion?

>

>Hilary

What I want your script to do is remove all of the quoted previous email (which could be nested) and leave only the reply. For the above example, create a new field in the database named Original Content and store only this in that field:

o - these can be neglected.

Don
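A minimal sketch of this cleanup step in Python, assuming quoted material is marked either by a leading ">" or by an "-----Original Message-----" separator as in the sample above. The function name strip_quoted is hypothetical, and other mail clients may quote differently, so the patterns would need checking against the real data.

```python
import re

# Assumed separator, modelled on the sample email; optional leading ">".
ORIGINAL_MESSAGE = re.compile(
    r"^\s*>?\s*-+\s*Original Message\s*-+\s*$", re.IGNORECASE
)


def strip_quoted(content: str) -> str:
    """Return only the reply text, dropping quoted earlier messages."""
    kept = []
    for line in content.splitlines():
        if ORIGINAL_MESSAGE.match(line):
            # Everything from the separator onward is treated as quoted.
            break
        if line.lstrip().startswith(">"):
            # Quoted line at any nesting depth (">", ">>", "> >", ...).
            continue
        kept.append(line.rstrip())
    # Trim the blank lines left behind at the start and end.
    return "\n".join(kept).strip()
```

Stopping at the separator means any reply text written below or between quoted blocks is lost, so that choice is worth revisiting if inline replies are common in the data.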

2) Find the real conversation pair

The second part of the project is a little more complex: I would like the script to find the real conversation pair for each email.

To explain this, I will say a little more about the fields in the mail database. It contains the fields Sender Name, To, and CC.

Sender Name follows the standard LastName, FirstName, as in the example "Clinton, Hilary".

The recipient name follows the same standard, but an email can be addressed to multiple people, separated by ";". For example, To can be "Clinton, Hilary; Obama, Barack". A recipient could also be a mailing group, for example POTUS, which may include everyone. Example: "POTUS; Clinton, Hilary; Obama, Barack". CC holds the recipients who are copied and follows the same format as To.
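The To and CC format described above could be parsed roughly as follows. This is a sketch that assumes any entry without a comma is a mailing group; the Recipient type and parse_recipients function are hypothetical names.

```python
from typing import List, NamedTuple, Optional


class Recipient(NamedTuple):
    raw: str              # entry as it appears in the field
    last: Optional[str]   # None for mailing groups
    first: Optional[str]  # None for mailing groups
    is_group: bool


def parse_recipients(field: Optional[str]) -> List[Recipient]:
    """Split a ';'-separated To or CC field into people and groups."""
    entries = []
    for raw in (field or "").split(";"):
        raw = raw.strip()
        if not raw:
            continue
        if "," in raw:
            # "LastName, FirstName" form: split on the first comma only.
            last, first = (part.strip() for part in raw.split(",", 1))
            entries.append(Recipient(raw, last, first, False))
        else:
            # Assumption: no comma means a mailing group such as POTUS.
            entries.append(Recipient(raw, None, None, True))
    return entries
```

A person entered without the comma would be misread as a group under this assumption, so a lookup against known names may be needed on real data.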

Since it is all my personal email, I am obviously either the sender or one of the recipients (though in the latter case I may be included only through a mailing list).

So there are a few different cases:

a) Explicit pair

If either the sender or the recipient (with no CC) is my name, and the other party is a single LastName, FirstName, then it is clearly a communication between the two parties. Store them in Real Sender and Real Recipient fields that you create in the MySQL database.
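One way to read the explicit-pair rule, sketched in Python. MY_NAME is a placeholder for the mailbox owner's name, and explicit_pair and split_names are hypothetical helpers. The sketch assumes exactly one To entry and an empty CC, and treats an entry without a comma as a mailing group.

```python
from typing import List, Optional, Tuple

MY_NAME = "Trump, Donald"  # placeholder: the mailbox owner's name (assumption)


def split_names(field: Optional[str]) -> List[str]:
    """Split a ';'-separated field into trimmed, non-empty entries."""
    return [p.strip() for p in (field or "").split(";") if p.strip()]


def explicit_pair(
    sender: str, to: Optional[str], cc: Optional[str], me: str = MY_NAME
) -> Optional[Tuple[str, str]]:
    """Return (real_sender, real_recipient) for a one-to-one email, else None."""
    recipients = split_names(to)
    if split_names(cc) or len(recipients) != 1:
        return None  # CC present, or zero / several recipients
    sender, recipient = sender.strip(), recipients[0]
    if "," not in sender or "," not in recipient:
        return None  # one side is a mailing group, not a person
    if me not in (sender, recipient) or sender == recipient:
        return None  # owner not involved, or a note to self
    return sender, recipient
```

The returned tuple would then be written to the Real Sender and Real Recipient fields.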

b) Implicit Pair Type 1

If the cleaned-up content contains the FIRSTNAME of any recipient AT THE BEGINNING OF A SENTENCE, then it is likely a communication between the sender and that person, even though others are included or CC'ed. For example, an email to "Clinton, Hilary; Obama, Barack" starts with

"Barack: "

Then the Real Recipient is likely Barack Obama.

Note: in rare cases, an email could have several FIRSTNAMEs mentioned at the beginning of a sentence. We can discuss what to do later.
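A rough sketch of the first-name check in Python. It assumes "beginning of a sentence" means the start of the text, the start of a line, or the position right after ".", "!" or "?" plus whitespace, and that matching is case-sensitive. The names addressed_recipients and first_names are hypothetical. All matches are returned, so the rare multi-name case stays visible for later discussion.

```python
import re
from typing import Dict, List, Optional


def first_names(field: Optional[str]) -> Dict[str, str]:
    """Map FirstName -> 'LastName, FirstName' for each person in a field."""
    names = {}
    for entry in (field or "").split(";"):
        if "," in entry:  # skip mailing groups, which have no comma
            last, first = (p.strip() for p in entry.split(",", 1))
            if first:
                names[first] = f"{last}, {first}"
    return names


def addressed_recipients(
    cleaned: str, to: Optional[str], cc: Optional[str] = ""
) -> List[str]:
    """Return recipients whose first name opens a sentence in the cleaned text."""
    candidates = {**first_names(to), **first_names(cc)}
    hits = []
    for first, full in candidates.items():
        # Start of text or line, or just after sentence-ending punctuation.
        pattern = r"(?:^|[.!?]\s+)" + re.escape(first) + r"\b"
        if re.search(pattern, cleaned, flags=re.MULTILINE):
            hits.append(full)
    return hits
```

If exactly one name comes back, it is the likely Real Recipient. Two recipients sharing a first name would collide in this sketch.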

c) Implicit Pair Type 2

In the second type of implicit pair, the name is not mentioned but is implied by the email structure. I have already split the Subject of each email into Subject Prefix and Normalized Subject. When you reply to an email, you create a Subject of the form "Re: Original Subject"; the "Re:" is stored in Subject Prefix and the original subject is stored in Normalized Subject. You can then infer the recipient when the mail has the same Normalized Subject: the Real Recipient is likely the sender of the first quoted original email, which in the example below is Hilary Clinton.

o - these can be neglected.

Don

> -----Original Message-----

>From: Clinton, Hilary

>Sent: Monday, May 07, 2001 11:35 AM

>To: Trump, Donald

>Subject: RE: Simulation model structure
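A sketch of this inference in Python, working on the raw (uncleaned) Content. It assumes the quoted header looks like the "From:" line in the example, with or without leading ">" marks, and that a reply has a Subject Prefix starting with "Re". The names first_quoted_sender and implicit_pair_type2 are hypothetical. Cross-checking the Normalized Subject against other rows in the database is left out.

```python
import re
from typing import Optional, Tuple

# "From:" header line, optionally prefixed by one or more ">" quote marks.
QUOTED_FROM = re.compile(
    r"^\s*(?:>\s*)*From:\s*(.+?)\s*$", re.IGNORECASE | re.MULTILINE
)


def first_quoted_sender(raw_content: str) -> Optional[str]:
    """Return the sender named in the first quoted 'From:' header, if any."""
    match = QUOTED_FROM.search(raw_content)
    return match.group(1) if match else None


def implicit_pair_type2(
    sender: str, subject_prefix: Optional[str], raw_content: str
) -> Optional[Tuple[str, str]]:
    """Return (real_sender, real_recipient) for a reply, else None."""
    if not (subject_prefix or "").strip().lower().startswith("re"):
        return None  # not a reply, so nothing can be inferred here
    quoted = first_quoted_sender(raw_content)
    if quoted is None or quoted == sender.strip():
        return None
    return sender.strip(), quoted
```

Because this needs the quoted header, it has to run before (or alongside) the cleanup in part 1, not on the Original Content field.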

Lastly, I use the politicians' names for illustration purposes only. You are not going to find Hilary's leaked emails if you win this project. :-)

Skills: Engineering, MySQL, Project Management, Python, Software Testing


About the Employer:
(15 reviews) Irvine, United States

Project ID: #12079562

Awarded to:

Snake2k

"What makes you the best candidate for this project?" I make me the best candidate for this project lol It seems like fun project. From what I can see it is a simple filtering and processing project. It shouldn't be t Lagi

$95 USD in 3 days
(3 reviews)
1.8

14 freelancers are bidding an average of $153 for this job

zoloogg

Professional scrapper here. ;-) I've managed to scrape countless sites in past. It required very specific skill in data (text) manipulation. ;-) I'm quite sure I could handle it ^_^

$145 USD in 5 days
(17 reviews)
4.6
$133 USD in 3 days
(17 reviews)
4.6
anson418

My name is Anson and I am a programmer of about 10 years now from San Francisco. I currently work professionally using Python to do data scraping, processing and formatting. I also have experience designing and worki More

$100 USD in 3 days
(4 reviews)
2.5
zkutch

Hello. More 20 years programming experience. I need more details to set real time and price. Regards. More

$100 USD in 3 days
(6 reviews)
3.7
johns1986

Hello, My name is John, I am a Professional Programmer since 2008, I have many programs developed and successfully finished in my past employer, I am expert in IT Industry both Hardware and Software Programming, I w More

$200 USD in 3 days
(1 review)
1.2
adobe24816

Dear honorable Client, I am a freelancer and one of the web developers whom you can trust here and have completed ten projects successfully with your blessing. and my account is new but I am not a new developer I a More

$277 USD in 10 days
(0 reviews)
0.0
theo2120

I'm managing corporate emails for more than 5 yrs.

$111 USD in 3 days
(0 reviews)
0.0
pradeepb4u18

Hello As your requirement is clear to me however thanks for giving us before. I have 5+ years of software QA/testing experience. I'm new as a freelancer however I have worked on 100+projects like this and tested a More

$555 USD in 7 days
(0 reviews)
0.0
keithbelcher3

Hello, I recently worked for a nonprofit organization where my primary responsibility was to clean up their Salesforce database. This sounds like an interesting problem and I would definitely be able to do an excellent More

$100 USD in 3 days
(0 reviews)
0.0
cybertarun

Dear Sir, Since you have a custom requirement, time is not an issue. Can do it in Python or Perl as you want. To start with Can you please send the sample database. thankyou

$166 USD in 10 days
(0 reviews)
0.0
PoonamRaskar16

A proposal has not yet been provided

$133 USD in 3 days
(0 reviews)
0.0
Zeeshanrahim38

Respected Employer, I am ACCA Qualified and BSC Hon's from Oxford Brookes University, UK with practical working experience in UAE as Accounts Manager. I am available to do online tasks that you need as I have experie More

$88 USD in 3 days
(0 reviews)
0.0
sleandro

Good Morning. I founded a new company dedicated to Email Database Building & Email Verification. Here is a brief list of our key service: >>>> VERIFIED EMAIL ADDRESSES - 0 Bounces - 8 Cents Per Record <<<< >> More

$55 USD in 10 days
(0 reviews)
0.0
Marino4ka

Hello, I have 6 years background in QA. I have been testing web applications for the health care insurance services and also have experience with automation testing of front-end and SQL verification of ETL back-end pro More

$88 USD in 10 days
(0 reviews)
0.0