Python Scraping JavaScript using Selenium and Beautiful Soup


I’m trying to scrape a JavaScript-enabled page using Beautiful Soup and Selenium.
I have the following code so far. It still doesn’t seem to pick up the JavaScript-generated content (it returns an empty result). In this case I’m trying to scrape the Facebook comments at the bottom of the page. (Inspect Element shows their class as postText.)

Thanks for the help!

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
import BeautifulSoup
browser = webdriver.Firefox()
browser.get('http://techcrunch.com/2012/05/15/facebook-lightbox/')
html_source = browser.page_source
browser.quit()
soup = BeautifulSoup.BeautifulSoup(html_source)
comments = soup("div", {"class":"postText"})
print comments

There are some mistakes in your code; they are fixed below (the import should come from bs4, and BeautifulSoup should be given an explicit parser). However, the class “postText” does not appear in the page source that this code retrieves, so it must be defined elsewhere.
I tested my revised version of your code, and it works on multiple websites.

from selenium import webdriver
from selenium.common.exceptions import NoSuchElementException
from selenium.webdriver.common.keys import Keys
from bs4 import BeautifulSoup
# Load the page in a real browser so that its JavaScript runs
browser = webdriver.Firefox()
browser.get('http://techcrunch.com/2012/05/15/facebook-lightbox/')
html_source = browser.page_source
browser.quit()
# Parse the rendered HTML with the parser named explicitly
soup = BeautifulSoup(html_source, 'html.parser')
# class "postText" is not defined in the source code
comments = soup.find_all('div', {'class': 'postText'})
print(comments)
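If the comments are simply added to the page a moment after it loads, reading page_source straight after get() may be too early. Below is a minimal sketch of one way to wait for them, written for Python 3. The names wait_for_html and scrape_divs are my own (hypothetical), the 10 second timeout is arbitrary, and the substring check is deliberately crude. I have not verified that this is what happens on the TechCrunch page, so treat it as a starting point rather than a confirmed fix.

```python
import time


def wait_for_html(get_html, is_ready, timeout=10.0, interval=0.5):
    """Call get_html() repeatedly until is_ready(html) is true.

    Returns the first HTML string that satisfies is_ready, or raises
    TimeoutError once `timeout` seconds have passed.
    """
    deadline = time.monotonic() + timeout
    while True:
        html = get_html()
        if is_ready(html):
            return html
        if time.monotonic() >= deadline:
            raise TimeoutError("content did not appear within %s seconds" % timeout)
        time.sleep(interval)


def scrape_divs(url, class_name="postText", timeout=10.0):
    """Open url in Firefox, wait for class_name to show up, return matching divs."""
    # Imported here so wait_for_html stays usable without Selenium or bs4.
    from bs4 import BeautifulSoup
    from selenium import webdriver

    browser = webdriver.Firefox()
    try:
        browser.get(url)
        html = wait_for_html(
            lambda: browser.page_source,             # re-read the live DOM each time
            lambda source: class_name in source,     # crude check: class name appears anywhere
            timeout=timeout,
        )
    finally:
        browser.quit()  # always close the browser, even on timeout
    soup = BeautifulSoup(html, "html.parser")
    return soup.find_all("div", {"class": class_name})
```

You would call it as scrape_divs('http://techcrunch.com/2012/05/15/facebook-lightbox/'). Selenium also ships its own WebDriverWait helper in selenium.webdriver.support.ui for this kind of waiting. If the call times out, the comments may be rendered inside an iframe (again, an assumption on my part). In that case the top-level page_source will never contain them, and you would need browser.switch_to.frame(...) before reading the source.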