We need a bot that will scrape the blog directory at Technorati ([url removed, login to view]) for a few pieces of information.
1. Blog Name
2. Blog URL
3. Blog Authority
It can do this by selecting one of the subtopics of the directory (the first example is here: [url removed, login to view]), then scraping the 3 pieces of data for each of the 10 blogs listed on page 1.
For example, with the first example above, it would scrape:
[url removed, login to view]
It should then move on to page 2 and scrape the same data, then page 3, etc.
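The page-by-page walk described above could be sketched roughly as follows. This is only an illustration: `scrape_all_pages`, `fake_fetch_page`, the `max_pages` cap, and the "stop at the first empty page" rule are all assumptions, and the real fetcher would have to download and parse the actual directory pages, whose markup is not described here.

```python
from typing import Callable, Dict, Iterator, List

BlogRow = Dict[str, str]


def scrape_all_pages(
    fetch_page: Callable[[int], List[BlogRow]],
    max_pages: int = 1000,
) -> Iterator[BlogRow]:
    """Yield blog rows page by page, stopping at the first empty page.

    fetch_page is a hypothetical callable that returns the rows found on
    one directory page (an empty list when the page has no blogs).
    max_pages is an assumed safety cap so the loop always terminates.
    """
    for page_number in range(1, max_pages + 1):
        rows = fetch_page(page_number)
        if not rows:
            break
        yield from rows


def fake_fetch_page(page_number: int) -> List[BlogRow]:
    """Stand-in for a real fetcher: three pages of ten made-up blogs."""
    if page_number > 3:
        return []
    return [
        {
            "Blog Name": f"Blog {page_number}-{i}",
            "URL": f"http://example.com/{page_number}/{i}",
            "Authority": str(100 * page_number + i),
        }
        for i in range(1, 11)
    ]


all_rows = list(scrape_all_pages(fake_fetch_page))
print(len(all_rows), "rows collected")
```

Passing the fetcher in as a parameter keeps the paging logic testable without touching the network; a real run would swap `fake_fetch_page` for a function that requests and parses each page.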
Data should be stored in an Excel spreadsheet or a similar format, with three columns: Blog Name, URL, and Authority.
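One way to produce such a file is a CSV, which Excel opens directly. The sketch below uses Python's standard `csv` module; the function name `write_blogs_csv` and the sample rows are made up for illustration, and CSV is an assumed stand-in for "Excel or similar".

```python
import csv
import io
from typing import Dict, Iterable, TextIO

# The three columns requested for the output document.
COLUMNS = ["Blog Name", "URL", "Authority"]


def write_blogs_csv(rows: Iterable[Dict[str, str]], out: TextIO) -> int:
    """Write a header plus one line per blog; return the number of blogs written."""
    writer = csv.DictWriter(out, fieldnames=COLUMNS)
    writer.writeheader()
    count = 0
    for row in rows:
        writer.writerow(row)
        count += 1
    return count


# Made-up sample data; the second name checks that commas are quoted.
sample_rows = [
    {"Blog Name": "Example Blog", "URL": "http://example.com/", "Authority": "123"},
    {"Blog Name": "Blog, with comma", "URL": "http://example.org/", "Authority": "45"},
]

buffer = io.StringIO()
written = write_blogs_csv(sample_rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead of an in-memory buffer, open the file with `open("blogs.csv", "w", newline="", encoding="utf-8")` and pass it as `out`; the `newline=""` argument is what the `csv` module documentation recommends for file objects.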