This blog covers web scraping: what it is, how it works, why it is useful, and how to do it step by step.
Web scraping is an automated method for obtaining large amounts of data from websites. Most of this data is unstructured HTML, which is then converted into structured data, for example rows in a spreadsheet.
Web scraping is useful when you need a large amount of information from a web page and copying it by hand is impractical. A scraper extracts only the parts of the page that you want.
For example, if you want only the mobile phone names and prices from an e-commerce page, you can scrape just those fields.
The process has four steps: find the URL you want to scrape, inspect the web page, locate the data you want to extract, and finally convert that data into a spreadsheet.
For these steps we use three libraries: Selenium, Beautiful Soup, and pandas. Selenium with ChromeDriver loads the web page, Beautiful Soup extracts the data from it, and pandas handles the data manipulation.
First, install the Selenium, Beautiful Soup, and pandas libraries:
pip install selenium beautifulsoup4 pandas
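Set Up ChromeDriver:
Download the ChromeDriver release that matches your installed version of Chrome.
Unzip the file, place the chromedriver executable in a folder of your choice, and add that folder to the PATH environment variable.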
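A quick way to confirm the PATH step worked is to ask Python whether it can locate the driver. This is only a minimal sketch: the helper name find_chromedriver is made up for illustration, and it assumes the executable is called chromedriver (on Windows, shutil.which also matches chromedriver.exe).

```python
import shutil


def find_chromedriver(executable_name="chromedriver"):
    """Return the full path of the driver if it is on PATH, otherwise None.

    `find_chromedriver` is a hypothetical helper, not part of Selenium.
    """
    return shutil.which(executable_name)


if __name__ == "__main__":
    path = find_chromedriver()
    if path:
        print(f"ChromeDriver found at: {path}")
    else:
        print("ChromeDriver is not on PATH; recheck the environment variable.")
```

If the script reports that the driver is missing, reopen the terminal after editing PATH, since a running shell usually keeps the old value.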
Extract Information:
We want to extract mobile phone information (name, price, and specifications) from a Flipkart web page.
Create empty lists for the names, prices, and specifications.
Pass the URL of the web page to the driver so that we can get the source code of that page.
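Loading the page with Selenium could look roughly like the sketch below. The function name fetch_page_source and its optional driver parameter are assumptions made for this example, and the code expects ChromeDriver to be reachable on PATH as described in the setup step.

```python
def fetch_page_source(url, driver=None):
    """Load `url` in Chrome and return the page's HTML as a string.

    `fetch_page_source` and its `driver` parameter are hypothetical names
    used for illustration. Pass an existing driver to reuse a browser
    session; otherwise a new Chrome window is opened and closed here.
    """
    own_driver = driver is None
    if own_driver:
        # Imported here so the function can be defined even on a machine
        # where Selenium is not installed yet.
        from selenium import webdriver

        driver = webdriver.Chrome()  # expects ChromeDriver on PATH
    try:
        driver.get(url)            # navigate to the page
        return driver.page_source  # HTML after the browser has loaded it
    finally:
        if own_driver:
            driver.quit()          # close only the browser we opened
```

Calling fetch_page_source(url) with the address of the Flipkart listing you want to scrape returns the HTML that the next step parses.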
Scrape the data using the Beautiful Soup library
We get the source code of the page and then create a BeautifulSoup object so that we can perform operations on it.
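Creating the BeautifulSoup object might look like the following sketch. The HTML string here is invented and stands in for the source code returned by the driver, so the example runs without opening a browser.

```python
from bs4 import BeautifulSoup

# Stand-in for the page source from the driver; this markup is made up
# for illustration and is not Flipkart's real HTML.
page_source = """
<html>
  <head><title>Mobile Phones</title></head>
  <body>
    <div class="product"><span class="name">Phone A</span></div>
    <div class="product"><span class="name">Phone B</span></div>
  </body>
</html>
"""

# "html.parser" ships with Python, so no extra parser has to be installed.
soup = BeautifulSoup(page_source, "html.parser")

# Once the soup exists, the document can be queried like a tree.
page_title = soup.title.string
product_count = len(soup.find_all("div", class_="product"))
print(page_title, product_count)
```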
Now we fetch the names, prices, and specifications of the mobile phones.
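A sketch of this step is shown below. The tag and class names (product, name, price, specs) are assumptions made up for the example; on the real page you would replace them with whatever you find when inspecting it in the browser, and those values can change whenever the site is redesigned.

```python
from bs4 import BeautifulSoup

# Invented sample markup; the class names are hypothetical placeholders.
SAMPLE_HTML = """
<div class="product">
  <div class="name">Phone A</div>
  <div class="price">Rs. 10,999</div>
  <ul class="specs"><li>4 GB RAM</li><li>64 GB storage</li></ul>
</div>
<div class="product">
  <div class="name">Phone B</div>
  <div class="price">Rs. 15,499</div>
  <ul class="specs"><li>6 GB RAM</li><li>128 GB storage</li></ul>
</div>
"""


def extract_phones(html):
    """Return three parallel lists: names, prices, and specifications."""
    soup = BeautifulSoup(html, "html.parser")
    names, prices, specs = [], [], []
    for card in soup.find_all("div", class_="product"):
        name = card.find("div", class_="name")
        price = card.find("div", class_="price")
        spec_list = card.find("ul", class_="specs")
        # Skip incomplete cards so the three lists stay the same length.
        if name is None or price is None or spec_list is None:
            continue
        names.append(name.get_text(strip=True))
        prices.append(price.get_text(strip=True))
        items = [li.get_text(strip=True) for li in spec_list.find_all("li")]
        specs.append(", ".join(items))
    return names, prices, specs


names, prices, specs = extract_phones(SAMPLE_HTML)
print(names, prices, specs)
```

Looping over one product card at a time, instead of collecting all names and all prices separately, keeps each phone's fields together even when a card is missing a value.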
Convert the data into a spreadsheet using pandas
After getting the data, we convert it into a spreadsheet using the pandas library.
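The conversion could be sketched as follows. The list contents and column names are made-up sample values standing in for the scraped data, and phones.csv is a hypothetical file name.

```python
import pandas as pd

# Sample values standing in for the scraped lists (invented for illustration).
names = ["Phone A", "Phone B"]
prices = ["Rs. 10,999", "Rs. 15,499"]
specs = ["4 GB RAM, 64 GB storage", "6 GB RAM, 128 GB storage"]

# Each list becomes one column; all lists must have the same length.
df = pd.DataFrame({"Name": names, "Price": prices, "Specifications": specs})

# With no path, to_csv returns the CSV text; pass a file name such as
# "phones.csv" (hypothetical) to write a file that a spreadsheet can open.
csv_text = df.to_csv(index=False)
print(csv_text)
```

Setting index=False leaves out the row numbers that pandas would otherwise write as an extra first column.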
We can fetch data from a web page using a web scraping library such as Beautiful Soup or Scrapy. Once the data is in a pandas DataFrame, we can apply any pandas function to it.