new project, need to pull text from a webpage,

Study group dedicated to learning how to code in the Python language.

Moderators: snarkout, Patrick, dann

riddlebox
Posts: 86
Joined: Mon Jul 03, 2006 2:09 pm

new project, need to pull text from a webpage,

Post by riddlebox » Sun Mar 22, 2009 7:30 am

I want to create an app that pulls text from a couple of websites. The text is a list of baseball players and their rankings for fantasy baseball. I would then like to add an extra number, my own "rating," depending on the player or the website, such as -1 if the player is injured or +1 if the ranking comes from cbssportsline.com, and have the app build a new list of positions and player rankings. That way I can select pitchers during a draft and it gives me the top 10 guys still available.
I am sure I can do most of it, but I do not know how to pull the text from the sites I want to use. Any ideas?
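
For reference, the rating-and-top-10 part described above could be sketched roughly like this in Python. Everything here is an assumption made for illustration: the dictionary keys, the player names, the choice to subtract the personal rating from the site rank (so a +1 moves a player up one place), and the helper names `adjusted_rank` and `top_available`.

```python
# Hypothetical adjustments: +1 for a trusted site, -1 for an injured player.
SITE_BONUS = {"cbssportsline.com": 1}
INJURY_PENALTY = -1

def adjusted_rank(player):
    """Return the site rank minus the personal rating (lower is better)."""
    rating = SITE_BONUS.get(player["site"], 0)
    if player["injured"]:
        rating += INJURY_PENALTY
    return player["rank"] - rating

def top_available(players, position, drafted=(), n=10):
    """Return the best n undrafted players at one position."""
    taken = set(drafted)
    pool = [p for p in players
            if p["position"] == position and p["name"] not in taken]
    return sorted(pool, key=adjusted_rank)[:n]

# Made-up sample data, not real rankings.
SAMPLE_PLAYERS = [
    {"name": "Pitcher A", "position": "P", "rank": 1,
     "site": "example.com", "injured": True},
    {"name": "Pitcher B", "position": "P", "rank": 2,
     "site": "cbssportsline.com", "injured": False},
    {"name": "Catcher C", "position": "C", "rank": 1,
     "site": "example.com", "injured": False},
]
```

During a draft you would call something like `top_available(SAMPLE_PLAYERS, "P", drafted=names_already_taken)` after each pick, where `names_already_taken` is whatever list you keep of drafted players.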

hellonorman
Posts: 267
Joined: Wed Dec 28, 2005 11:08 pm

Re: new project, need to pull text from a webpage,

Post by hellonorman » Sun Mar 22, 2009 12:56 pm

I can't tell from your question whether you're stuck on how to retrieve a web page or how to parse a text file.

As for parsing the text file, you will have to examine its structure to figure out how to extract the data you are interested in. Once you can extract the pitcher data, you could put it in an array or a hashtable (does Python have hashtables?). Or you could create a class for pitchers that has a rank element.
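
On the hashtable question: Python's built-in `dict` is a hash table, so both ideas fit in a few lines. This is only a sketch. It assumes the extracted text has already been reduced to lines of the form `rank,name,position`, which is a made-up layout, and the `Player` class and `parse_rankings` function are hypothetical names.

```python
from dataclasses import dataclass

@dataclass
class Player:
    name: str
    position: str
    rank: int

def parse_rankings(text):
    """Build a dict of Player objects keyed by name from 'rank,name,position' lines."""
    players = {}
    for line in text.splitlines():
        parts = [part.strip() for part in line.split(",")]
        # Skip blank lines, headers, and anything that does not start with a number.
        if len(parts) != 3 or not parts[0].isdigit():
            continue
        rank, name, position = parts
        players[name] = Player(name=name, position=position, rank=int(rank))
    return players
```

A real page will need its own parsing rules, but the dict-of-objects result should stay the same whatever the input looks like.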

Getting the text from the website should be pretty well documented. Once you have the pitcher data, there should also be plenty of documentation on working with collections of data. As for finding the pitcher data in the text... that would depend on examining a specific file.
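
For the retrieval half, a minimal sketch using only the standard library of current Python 3 (`urllib.request`) might look like the following. The function name `fetch_page`, the User-Agent string, and the 10-second timeout are placeholders chosen for illustration, and real sites may need cookies, logins, or JavaScript that this does not handle.

```python
from urllib.request import Request, urlopen

def fetch_page(url, timeout=10):
    """Download one URL and return its body decoded as text."""
    # Some sites reject requests with no User-Agent; this value is a placeholder.
    request = Request(url, headers={"User-Agent": "fantasy-draft-helper/0.1"})
    with urlopen(request, timeout=timeout) as response:
        # Use the charset the server declares, falling back to UTF-8.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")

# Usage (the URL is a placeholder):
# html = fetch_page("http://www.example.com/rankings.html")
```

Check each site's terms of use before pulling its pages automatically.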
"It's not a lie, if you really believe it"
--George Costanza

eddie
Posts: 974
Joined: Wed Sep 05, 2007 10:46 pm
Location: here

Re: new project, need to pull text from a webpage,

Post by eddie » Wed Mar 25, 2009 11:29 pm

A.
wget file(s)
html2text file(s)
simple script (via python or whatever) to extract what you need.
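
If you would rather stay inside Python for the html2text step in option A, the standard library's `html.parser` can do a rough version of it. This is a sketch, not a replacement for the html2text tool: it just collects the visible text chunks and ignores `<script>` and `<style>` contents. `TextExtractor` and `html_to_lines` are names invented for this example.

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Collect visible text, skipping the contents of script and style tags."""

    def __init__(self):
        super().__init__()
        self._skip_depth = 0
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if text and not self._skip_depth:
            self.chunks.append(text)

def html_to_lines(html):
    """Return the non-empty text chunks of an HTML document, in page order."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return parser.chunks
```

Each table cell comes back as its own chunk, so a rankings table usually turns into a flat list that the "simple script" can then regroup into rows.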

B.
You can grep a web page directly: fetch the URL and pipe the output through grep.

Never forget grep, awk, cut, and sed.....
http://www.linuxconfig.org/Fgrep
http://ubuntuforums.org/showthread.php?p=6708426
http://ubuntuforums.org/showthread.php?t=906804
look at bashpodder....
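
Since this is the Python group, here is a rough Python take on the grep and cut idea from option B. It assumes the page text is already in a string and that fields are separated by whitespace, which may not match the real pages. `grep_lines` and `cut_fields` are hypothetical helper names.

```python
import re

def grep_lines(text, pattern, flags=re.IGNORECASE):
    """Return the lines of text that match a regular expression, like grep."""
    regex = re.compile(pattern, flags)
    return [line for line in text.splitlines() if regex.search(line)]

def cut_fields(line, indexes, sep=None):
    """Return selected fields of a line, like cut (split on whitespace by default)."""
    fields = line.split(sep)
    return [fields[i] for i in indexes]
```

For example, `grep_lines(text, r"\b(SP|RP)$")` would keep only lines that end in a pitcher position code, assuming the position is the last field on each line.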

brian_X7
Posts: 3
Joined: Sun Mar 29, 2009 9:33 am

Re: new project, need to pull text from a webpage,

Post by brian_X7 » Mon Mar 30, 2009 9:45 pm

You might also want to look at an O'Reilly book called Baseball Hacks. The book uses Perl for most of the examples, but they are pretty simple and shouldn't be too difficult to convert to Python.

jstgtpaid
Posts: 83
Joined: Mon Nov 10, 2008 2:14 pm

Re: new project, need to pull text from a webpage,

Post by jstgtpaid » Mon Aug 24, 2009 11:19 am

This sounds interesting... Did you finish this project? What method did you end up using?
