Web Scrape Using BeautifulSoup Brings Different Content
If you visit http://www.imdb.com/title/tt2375692/episodes?season=1, you will see that the publish date of season 1, episode 1 is 25 Jan. 2014. This is the code I am using to s
Solution 1:
I get 25 Jan. 2014 when I scrape the date using BeautifulSoup. First, find the link to the first episode (titled "I."), then get the episode block by taking the parent of the link's parent, then find the date by class inside that block:
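from urllib.request import urlopen  # Python 3 replacement for urllib2
from bs4 import BeautifulSoup

url = "http://www.imdb.com/title/tt2375692/episodes?season=1"
# Name the parser explicitly so the result does not depend on what is installed
soup = BeautifulSoup(urlopen(url), "html.parser")
# The link titled "I." is the first episode; its grandparent is the episode block
episode1 = soup.find('a', {'title': 'I.'}).parent.parent
print(episode1.find('div', {'class': 'airdate'}).text.strip())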
prints:
25 Jan. 2014
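The same lookup can be tried offline with a slightly more defensive variant. This is only a sketch: instead of hard-coding `.parent.parent`, it walks up the link's ancestors until one of them contains an `airdate` div, and it returns `None` when nothing matches. The `SAMPLE_HTML` markup and the `find_airdate` helper are made up for illustration and are not taken from the real page, whose structure may differ.

```python
from bs4 import BeautifulSoup

# Hypothetical markup, loosely modelled on an episode list; the real page may differ.
SAMPLE_HTML = """
<div class="list_item">
  <div class="image"><a href="/title/ep1/" title="I."><img alt="I."></a></div>
  <div class="info">
    <div class="airdate">
      25 Jan. 2014
    </div>
    <strong><a href="/title/ep1/" title="I.">I.</a></strong>
  </div>
</div>
"""


def find_airdate(html, episode_title):
    """Return the air date for the episode whose link has this title, or None."""
    soup = BeautifulSoup(html, "html.parser")
    link = soup.find("a", {"title": episode_title})
    if link is None:
        return None
    # Walk up through the ancestors until one of them contains an airdate div.
    for ancestor in link.parents:
        airdate = ancestor.find("div", {"class": "airdate"})
        if airdate is not None:
            # Collapse newlines and runs of spaces into single spaces.
            return " ".join(airdate.get_text().split())
    return None


print(find_airdate(SAMPLE_HTML, "I."))
```

Walking the ancestors makes the lookup less sensitive to how deeply the link is nested, at the cost of possibly matching a date from a wider block than intended if the markup is laid out differently.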