I want to create an app that pulls text from a couple of websites; the text is a list of baseball players and their fantasy baseball rankings. I would then like to specify an extra number, my "rating", that depends on the website or the player, such as -1 if the player is injured or +1 if the ranking comes from cbssportsline.com, and have the app build a new list of players ranked by position. That way I could select pitchers during a draft and have it give me the top 10 guys still available.
I am sure I can do most of it, but I do not know how to pull the text from the sites I want to use. Any ideas?
new project, need to pull text from a webpage,
hellonorman
- Posts: 267
- Joined: Wed Dec 28, 2005 11:08 pm
Re: new project, need to pull text from a webpage,
I can't tell from your question whether you're stuck on how to retrieve a web page or on how to parse the text once you have it.
As for parsing the text, you will have to examine its structure to figure out how to extract the data you are interested in. Once you can extract the pitcher data, you could put it in an array or a hashtable (does Python have hashtables?). Or you could create a class for pitchers that has a rank element.
Getting the text from the website should be pretty well documented. Once you have the pitcher data, there should also be plenty of documentation on working with collections of data. As for finding the pitcher data in the text... that depends on examining a specific file.
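For what it's worth, here is a rough sketch of the "class with a rank element plus a hashtable" idea in Python, where the built-in dict plays the hashtable role. Everything specific in it is an assumption made up for illustration: the Ranking class, the SOURCE_RATING and INJURY_RATING values, the placeholder player names, and the choice to subtract the rating from the published rank and average across sites. The original post does not say how the rating should be combined with the rank, so adjust that to taste.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Ranking:
    name: str
    position: str          # e.g. "P" for pitcher
    rank: int              # rank published by the site, 1 = best
    source: str            # site the rank came from
    injured: bool = False


# Hypothetical personal ratings; a positive value moves a player up.
SOURCE_RATING = {"cbssportsline.com": 1}
INJURY_RATING = -1


def adjusted_rank(r):
    """Lower is better, so the rating is subtracted from the published rank."""
    rating = SOURCE_RATING.get(r.source, 0)
    if r.injured:
        rating += INJURY_RATING
    return r.rank - rating


def top_available(rankings, position, drafted=(), n=10):
    """Average each player's adjusted ranks; return the best n undrafted names."""
    by_player = defaultdict(list)      # a dict is Python's hashtable
    for r in rankings:
        if r.position == position and r.name not in drafted:
            by_player[r.name].append(adjusted_rank(r))
    averaged = {name: sum(v) / len(v) for name, v in by_player.items()}
    return sorted(averaged, key=averaged.get)[:n]


# Placeholder data, not real rankings.
sample = [
    Ranking("Pitcher A", "P", 1, "example.com", injured=True),
    Ranking("Pitcher A", "P", 2, "cbssportsline.com", injured=True),
    Ranking("Pitcher B", "P", 3, "example.com"),
    Ranking("Catcher C", "C", 1, "example.com"),
]
print(top_available(sample, "P"))
print(top_available(sample, "P", drafted={"Pitcher A"}))
```

During a draft you would add each picked name to the drafted set and call top_available again for whatever position you are looking at.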
"It's not a lie, if you really believe it"
--George Costanza
Re: new project, need to pull text from a webpage,
A.
wget file(s)
html2text file(s)
simple script (via python or whatever) to extract what you need.
B.
You can grep a website directly from its URL by piping the downloaded page into grep.
Never forget grep, awk, cut, and sed.
http://www.linuxconfig.org/Fgrep
http://ubuntuforums.org/showthread.php?p=6708426
http://ubuntuforums.org/showthread.php?t=906804
Also take a look at bashpodder.
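If you would rather do option A entirely in Python, something like the sketch below covers the same three steps (download, strip the HTML, extract what you need) using only the standard library. The line layout matched by the LINE pattern ("12. Some Name, P") and the URL in the usage comment are assumptions for illustration; a real rankings page will need its own pattern once you have looked at its text.

```python
import re
from html.parser import HTMLParser
from urllib.request import urlopen


class TextExtractor(HTMLParser):
    """Collect the visible text of a page, skipping script and style blocks."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.chunks.append(data.strip())


def fetch(url):
    """Download a page and decode it (the wget step)."""
    with urlopen(url) as response:
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def html_to_text(html):
    """Reduce HTML to one text chunk per line (the html2text step)."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return "\n".join(parser.chunks)


# Assumed layout of a ranking line: "12. Some Name, P"
LINE = re.compile(r"^(\d+)\.\s+(.+?),\s+([A-Z0-9]+)$")


def extract_rankings(text):
    """Return (rank, name, position) tuples for lines that match LINE."""
    rows = []
    for line in text.splitlines():
        m = LINE.match(line)
        if m:
            rows.append((int(m.group(1)), m.group(2), m.group(3)))
    return rows


# Placeholder page, so the sketch runs without network access.
sample_html = (
    "<html><head><style>li {color: red}</style></head><body>"
    "<h1>Rankings</h1><ol><li>1. Pitcher A, P</li>"
    "<li>2. Catcher C, C</li></ol></body></html>"
)
print(extract_rankings(html_to_text(sample_html)))

# With a real page (hypothetical URL):
# rows = extract_rankings(html_to_text(fetch("https://example.com/rankings")))
```

Check each site's terms of use before scraping it, and expect to rewrite the pattern whenever a site changes its page layout.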
Re: new project, need to pull text from a webpage,
You might also want to look at an O'Reilly book called Baseball Hacks. The book uses Perl for most of the examples, but they are pretty simple and shouldn't be too difficult to convert to Python.
Re: new project, need to pull text from a webpage,
This sounds interesting... Did you finish this project? What method did you end up using?