Related Keywords Scraper for PPV

by imrat on July 28, 2010

I love to make myself redundant. That’s why I always look to automate before I outsource.

On my journey to Ditch my Dayjob, I am now making some good progress. Adsense, SEO, PPC and PoF revenue streams are in place.

So What’s Next…

…PPV (Pay Per View), also known as CPV (Cost Per View).

In PPC it’s all about finding the keywords that convert. On PoF it’s about the demographics. In PPV, it seems, URLs and keywords are key.

So I am creating my own URL scraper.

“Not another scraper!!”

I know that many already exist, but most of them are limited.

They are limited to just scraping Google, Bing and Yahoo for the main keyword.

I want to go beyond that and build something that finds the unique, profitable URLs that no one else is targeting.

Phase 1

As a first component of this new scraper, I want to be able to give a theme keyword and get back a list of related terms. I have built it in Yahoo Pipes and Dapper, and it currently gets related keywords from Google, Bing, and Search.com.

Example: Make Money

Below are the first 20ish related terms for the keyword “make money”:

  • ways to make money
  • money making ideas
  • make money online
  • make extra money
  • make money fast
  • project payday
  • jobs
  • work from home
  • adsense
  • home business
  • free money
  • easy money
  • save money
  • win money
  • i need money
  • make your own website
  • make up games
  • make your own myspace layout
  • make millions
  • make me a supermodel
  • Google Checkout
  • Ways to Make Money
  • Make Money Free
  • Make Money Fast

Here is the Yahoo Pipe for the scraper.
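
Because the terms come from several sources, the same keyword can show up more than once with different capitalisation (compare “ways to make money” and “Ways to Make Money” in the list above). Here is a minimal Python sketch of how the merged list could be de-duplicated. It is not how the Pipe itself works; the function name merge_keywords and the split of sample terms between engines are assumptions made for illustration.

```python
def merge_keywords(*sources):
    """Merge keyword lists, dropping case-insensitive duplicates.

    Keywords are lowercased and extra whitespace is collapsed;
    the first-seen order is preserved.
    """
    seen = set()
    merged = []
    for source in sources:
        for keyword in source:
            normalized = " ".join(keyword.lower().split())
            if normalized and normalized not in seen:
                seen.add(normalized)
                merged.append(normalized)
    return merged


# Sample terms from the list above. Which engine returned which
# term is an assumption made for this example.
google_terms = ["ways to make money", "make money online", "make money fast"]
bing_terms = ["Ways to Make Money", "Make Money Free", "Make Money Fast"]

print(merge_keywords(google_terms, bing_terms))
```

Normalising to lowercase before comparing is the simplest choice; it would not catch near-duplicates such as singular and plural forms.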

Phase 2: Get URLs

The next phase will be to use the related keywords and grab a bunch of URLs for these. It will do the expected (Google, Bing, Yahoo) but I also want it to do the unexpected ;)
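
For the “expected” part, the first step is turning each related keyword into a search query per engine. Below is a rough Python sketch of just that step. The URL patterns and query parameter names are assumptions based on each engine's public search page and may change; fetching and parsing the result pages is deliberately left out, and the name build_search_urls is hypothetical.

```python
from urllib.parse import urlencode

# Assumed search URL patterns: (base URL, query parameter name).
SEARCH_ENDPOINTS = {
    "google": ("https://www.google.com/search", "q"),
    "bing": ("https://www.bing.com/search", "q"),
    "yahoo": ("https://search.yahoo.com/search", "p"),
}


def build_search_urls(keyword):
    """Return a dict mapping engine name to a search URL for the keyword."""
    return {
        name: f"{base}?{urlencode({param: keyword})}"
        for name, (base, param) in SEARCH_ENDPOINTS.items()
    }


for engine, url in build_search_urls("make money online").items():
    print(engine, url)
```

Keeping the endpoints in one dictionary means another source can be added later without touching the function.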

Phase 3: Turn it into an online tool

Following phase 2, no doubt you want access to it. So I will build a quick web front end that will spit out a nice table of URLs and a downloadable CSV file.
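
As a rough idea of the CSV side, here is a small Python sketch that turns keyword/URL pairs into CSV text using the standard csv module. The column names, the function name urls_to_csv and the example.com rows are placeholders chosen for illustration, not output from the real tool.

```python
import csv
import io


def urls_to_csv(rows):
    """Serialise (keyword, url) pairs to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["keyword", "url"])
    writer.writerows(rows)
    return buffer.getvalue()


# Placeholder rows; a real run would use the URLs collected in phase 2.
rows = [
    ("make money online", "http://example.com/a"),
    ("work from home", "http://example.com/b"),
]

print(urls_to_csv(rows))
```

Building the CSV in memory makes it easy to either save it to a file or hand it to a web front end as a download.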

What would you like to see me implement in this tool?

  • Weezy

    yeah man that would be nice

  • http://imrat.com imrat

    thanks for checking the blog Weezy. Anything specific that you would like to see in the scraper tool?

  • http://twitter.com/muhacus muhacus

    How about page scraping? I mean scraping a list of URLs from a page.

  • http://twitter.com/muhacus muhacus

    How about page scraping, or scraping a list of URLs from a page? Or even scraping a list of book titles and authors from a page.

  • http://imrat.com imrat

    Do you mean, for example, scraping all page URLs from amazon.com that are related to a particular keyword? I.e. something like the URLs returned by the following query for a get-out-of-debt offer: site:amazon.com “get out of debt”

    That’s a good idea and fairly easy to do ;) Thanks for the suggestion.

  • http://www.oliviervasquez.com Olivier

    Hey, how do you regurgitate (for lack of better word) urls, after you’ve created a url mix with pipes? Would a pgm like feedparser do it? Could you explain? Thx :-)

  • http://imrat.com imrat

    Hey Olivier – Not sure I am clear on what your question is, or what a “pgm” is. What data are you working with and what are you trying to get as an output?

  • http://www.oliviervasquez.com Olivier

    Sorry, I wasn’t clear…

    When you create an RSS mix to scrape
    links for CPV purposes. How do you go
    about exporting all the urls in a CSV format
    to use on a site like traffic vance or
    something like that?

    You create Daps with dapper – and different
    pipes with yahoo; but do you manually collect
    the links?

    - Or do you use a program (pgm) or software
    that grabs the urls and exports them in an easy
    to use csv format?

  • http://imrat.com imrat

    Ah, I understand now. I use Yahoo Pipes to export the URLs to CSV, but there are a couple of specific (poorly documented) steps to get this to work. I’ll try to do a post tonight to explain it.
