My HTML is not loading properly /u/TheEyebal Python Education

I am building a web scraper for the first time and following a tutorial. When I ran my code, it printed [], so I used prettify() and noticed that the HTML was not fully loaded. I read on Stack Overflow that “if the webpage you’re scraping relies on JavaScript to load content, requests and BeautifulSoup will only capture the HTML that was loaded initially and not the dynamically injected content.”

I guess dynamically loaded content is the problem.

I honestly do not understand what that means. Is there a way I can pull up all the content for the web page that I am scraping?

Here is my code with prettify().

    from bs4 import BeautifulSoup
    import requests

    url = "https://www.monster.com/jobs/search?q=python&where=Arlington%2C+TX&page=1&so=m.h.s"
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36'}

    # .text allows me to show the html text of the url
    html_text = requests.get(url, headers=headers).text
    soup = BeautifulSoup(html_text, 'lxml')
    jobs = soup.find_all("div")
    print(soup.prettify())

Here is my code when printing jobs.

    from bs4 import BeautifulSoup
    import requests

    url = "https://www.monster.com/jobs/search?q=python&where=Arlington%2C+TX&page=1&so=m.h.s"
    headers = {'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/96.0.4664.93 Safari/537.36'}

    # .text allows me to show the html text of the url
    html_text = requests.get(url, headers=headers).text
    soup = BeautifulSoup(html_text, 'lxml')
    jobs = soup.find_all("div")
    print(jobs)

I removed the class_ argument from find_all for now, but even with it the result was still [].
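
To make the Stack Overflow quote concrete, here is a minimal sketch of what "dynamically injected content" appears to mean. It does not touch the real site: INITIAL_HTML, the "results" id, and the "job-card" class are all made up for illustration, and the built-in "html.parser" is used instead of 'lxml' so nothing extra needs to be installed. The point is that BeautifulSoup only parses the text it is given and never runs the script, so the element the script would create in a browser is simply not there.

```python
from bs4 import BeautifulSoup

# Hypothetical page source, as a server might send it BEFORE any JavaScript runs.
# The ids, class names, and job title are invented for this example.
INITIAL_HTML = """
<html>
  <body>
    <div id="results"></div>
    <script>
      // A browser would run this and add a job card to #results.
      var card = document.createElement("article");
      card.className = "job-card";
      card.textContent = "Python Developer";
      document.getElementById("results").appendChild(card);
    </script>
  </body>
</html>
"""

# BeautifulSoup parses the markup as text; it does not execute the <script>.
soup = BeautifulSoup(INITIAL_HTML, "html.parser")

# The job card only exists after the script runs, so nothing is found here.
job_cards = soup.find_all("article", class_="job-card")

# The container itself is in the initial HTML, but it is still empty.
results_div = soup.find("div", id="results")

print(job_cards)
print(results_div)
```

If the real page behaves like this sketch, searching for the job elements comes back empty for the same reason: they would only exist after a browser has executed the page's JavaScript.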

submitted by /u/TheEyebal


