Thursday, April 29, 2010

Parsing Atom Feeds with Python

One of the things I have really come to like about Python is how easy it is to prototype some code to try out an idea. I wanted to parse some blog feeds so I could experiment with a few semantic analysis algorithms.

I just imported feedparser and let 'er rip.
Here is a simple example that prints the titles of the entries in this blog's Atom feed.

import feedparser

def parseit(feedurl):
    # Fetch and parse the feed at the given URL.
    data = feedparser.parse(feedurl)

    # Feed-level metadata such as the title lives under data['feed'].
    print('Title => ' + data['feed']['title'])

    print('There are %d entries.' % len(data['entries']))

    # Each entry carries its own title.
    for entry in data['entries']:
        print(entry.title)

if __name__ == "__main__":

    print('Running')
    parseit('http://dataoracle.blogspot.com/feeds/posts/default')
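
For trying this offline, or on a machine without feedparser installed, the same title listing can be roughly sketched with the standard library's xml.etree.ElementTree. This is a different technique from the feedparser code above, not a replacement for it: it only handles well-formed Atom, and the sample feed, the ATOM_NS constant and the entry_titles function are all made up for illustration.

```python
import xml.etree.ElementTree as ET

# Atom elements live in this XML namespace.
ATOM_NS = {'atom': 'http://www.w3.org/2005/Atom'}

# Made-up sample feed, included only so the sketch runs without a network.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <title>Example Blog</title>
  <entry><title>First post</title></entry>
  <entry><title>Second post</title></entry>
</feed>"""

def entry_titles(atom_xml):
    """Return (feed_title, list_of_entry_titles) for an Atom document string."""
    root = ET.fromstring(atom_xml)
    # The feed's own title is a direct child of the root <feed> element.
    feed_title = root.findtext('atom:title', default='', namespaces=ATOM_NS)
    # Each <entry> child carries its own <title>.
    titles = [entry.findtext('atom:title', default='', namespaces=ATOM_NS)
              for entry in root.findall('atom:entry', ATOM_NS)]
    return feed_title, titles

if __name__ == "__main__":
    feed_title, titles = entry_titles(SAMPLE_FEED)
    print('Title => ' + feed_title)
    print('There are %d entries.' % len(titles))
    for title in titles:
        print(title)
```

For real-world feeds feedparser is still the easier route, since it copes with RSS, Atom and a lot of sloppy markup that a strict XML parser will reject.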
