Web Scraping Bing with Python (Step-by-Step Tutorial)

Bing is a capable search engine. It is not as strong as Google overall, but it beats Google in specific areas such as image search; I prefer Yandex or Bing when searching for images. Search engines are generally scraped for market-trend analysis, sentiment analysis, SEO, keyword tracking, and similar work. In this post, we are going to scrape search results from Bing. Once we have scraped the first page, we will add pagination so that we can scrape all the pages Bing returns for a keyword.

By scraping Bing, you can analyze the data and prepare a better SEO strategy to rank your own website. We are going to use Python for this tutorial, and I am assuming that you have already installed Python on your machine.

 

Why Scrape Bing using Python?

Python is a simple language that is flexible and easy to understand, even if you are a beginner. The Python community is very large, which helps when you run into an error while coding. It also has many libraries for web scraping.

Forums such as StackOverflow and GitHub already have answers to many of the errors you might face while scraping Bing search results.

You can do countless things with Python. I have also written a tutorial on web scraping with Python in which I cover all the libraries we can use.

Let’s Start Scraping

I have divided this part into two sections. In the first section, we are going to scrape the first page, and then in the next section, we will scale our code to scrape all the pages by adding page numbers. In the end, you will have a script that can scrape complete Bing search results for any keyword. That is exciting, right? Let’s begin!

First Part

To begin with, we will create a folder and install all the libraries we might need during the course of this tutorial. Also, our target URL will be this Bing page.

For now, we will install two libraries:

  1. Requests will help us to make an HTTP connection with Bing.
  2. BeautifulSoup will help us to create an HTML tree for smooth data extraction.

>> mkdir bing
>> pip install requests
>> pip install beautifulsoup4

Inside this folder, you can create a Python file where we will write our code.

We are going to extract four fields from each search result:

  1. Title
  2. Link
  3. Description
  4. Position

import requests
from bs4 import BeautifulSoup

l=[]
o={}

target_url="https://www.bing.com/search?q=sydney&rdr=1"
headers={"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"}

resp=requests.get(target_url,headers=headers)

Here we have imported the libraries we just installed and then made an HTTP GET request to the target URL. Now, we are going to use BS4 to create a tree for data extraction.
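Before parsing, it can help to confirm that the request actually succeeded. The sketch below wraps the same GET request in a small helper. The name fetch_html and the 10-second timeout are my own assumptions for illustration; they are not part of the script above.

```python
import requests

HEADERS = {
    "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 "
                  "(KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"
}

def fetch_html(url, timeout=10):
    """Return the response body, raising if the server answers with an error status.

    fetch_html and the timeout value are hypothetical choices for this sketch.
    """
    resp = requests.get(url, headers=HEADERS, timeout=timeout)
    resp.raise_for_status()  # raises requests.HTTPError on 4xx/5xx responses
    return resp.text
```

You could then call fetch_html("https://www.bing.com/search?q=sydney&rdr=1") and pass the returned string to BeautifulSoup.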

This can also be done through Xpath but for now, we are using BS4.

soup = BeautifulSoup(resp.text, 'html.parser')

completeData = soup.find_all("li",{"class":"b_algo"})

The soup variable holds the complete HTML tree, from which we will extract our data of interest. The completeData variable stores all the result elements that we are going to scrape.

You can find this class name by inspecting a search result with your browser's developer tools.
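To see what find_all returns without hitting Bing, here is a small sketch that runs the same selector against a hand-written HTML snippet. The snippet is a simplified stand-in I made up for illustration; the real page contains far more markup.

```python
from bs4 import BeautifulSoup

# Hand-written, simplified stand-in for Bing's markup (an assumption for illustration).
sample_html = """
<ol>
  <li class="b_algo">
    <h2><a href="https://example.com/one">First result</a></h2>
    <div class="b_caption"><p>First description.</p></div>
  </li>
  <li class="b_algo">
    <h2><a href="https://example.com/two">Second result</a></h2>
    <div class="b_caption"><p>Second description.</p></div>
  </li>
  <li class="b_other">An unrelated block</li>
</ol>
"""

soup = BeautifulSoup(sample_html, "html.parser")

# Only the <li> elements carrying the b_algo class are returned.
results = soup.find_all("li", {"class": "b_algo"})

first_result = results[0]
title = first_result.find("a").text        # text of the first <a> tag
link = first_result.find("a").get("href")  # href attribute of the same tag
print(len(results), title, link)
```

The unrelated li element is skipped because its class does not match.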

Let’s find out the location of each of these elements and extract them.

Scraping Title from Bing

The title is stored under the a tag inside each li element with class b_algo. The completeData variable will be used as the source of the data.

o["Title"]=completeData[i].find("a").text

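Scraping URLs from Bing

The URL is stored in the href attribute of the same a tag that holds the title.

Scraping Description from Bing

The description is stored under the div tag with class b_caption.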

o["Description"]=completeData[i].find("div",{"class":"b_caption"}).text
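One thing to watch for: find returns None when nothing matches, and calling .text on None raises an AttributeError. I am assuming here that some result blocks may lack a caption. The helper below is a hypothetical name of my own, sketched to show one way to guard against that.

```python
from bs4 import BeautifulSoup

def text_or_none(parent, name, attrs=None):
    """Return the stripped text of the first matching child, or None if absent.

    text_or_none is a hypothetical helper for this sketch.
    """
    tag = parent.find(name, attrs or {})
    return tag.get_text(strip=True) if tag is not None else None

# Simplified sample: this result has a title but no b_caption block.
html = '<li class="b_algo"><h2><a href="https://example.com">Only a title</a></h2></li>'
item = BeautifulSoup(html, "html.parser").find("li", {"class": "b_algo"})

title = text_or_none(item, "a")
description = text_or_none(item, "div", {"class": "b_caption"})  # None, no crash
```

With this guard, a result without a description is stored with None instead of stopping the whole script.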

Let’s combine all this in a for loop and store all the data in the l array.

for i in range(0, len(completeData)):
    o["Title"]=completeData[i].find("a").text
    o["link"]=completeData[i].find("a").get("href")
    o["Description"]=completeData[i].find("div",{"class":"b_caption"}).text
    o["Position"]=i+1
    l.append(o)
    o={}

print(l)

We have managed to scrape the first page. Now, let’s focus on scaling this code so that we can scrape all the pages for any given keyword.

Second Part

When you click on page two, you will see that the URL changes: a new query parameter is automatically added to it.

I page URL — https://www.bing.com/search?q=sydney&rdr=1&first=1

II page URL — https://www.bing.com/search?q=sydney&rdr=1&first=11

III page URL — https://www.bing.com/search?q=sydney&rdr=1&first=21

This indicates that the value of the first parameter increases by 10 whenever you move to the next page. This observation lets us change the URL within a loop. We will use a for loop that increases the value by 10 every time it runs.

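The pattern in the three URLs above can be written as a small formula. This is a minimal sketch; first_for_page is a hypothetical helper name, and the page size of 10 is taken from the URLs shown above.

```python
def first_for_page(page, results_per_page=10):
    """Map a 1-based page number to the value of the `first` query parameter."""
    if page < 1:
        raise ValueError("page numbers start at 1")
    return (page - 1) * results_per_page + 1

# Offsets for pages 1, 2 and 3
offsets = [first_for_page(p) for p in range(1, 4)]
print(offsets)
```

For pages 1 to 3 this gives 1, 11 and 21, matching the URLs listed above.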
for page_start in range(0,100,10):
    target_url="https://www.bing.com/search?q=sydney&rdr=1&first={}".format(page_start+1)
    
    print(target_url)
    
    resp=requests.get(target_url,headers=headers)
    
    soup = BeautifulSoup(resp.text, 'html.parser')
    
    completeData = soup.find_all("li",{"class":"b_algo"})
   
    for i in range(0, len(completeData)):
        o["Title"]=completeData[i].find("a").text
        o["link"]=completeData[i].find("a").get("href")
        o["Description"]=completeData[i].find("div",{"class":"b_caption"}).text
        o["Position"]=i+1
        l.append(o)
        o={}

print(l)

Here we are changing the target_url value by changing the value of the first parameter as we talked about earlier. This will provide us with a new URL every time the loop runs and for this tutorial, we are restricting the total pages to ten only.

Just like this, you can get data for any keyword by just changing the URL.
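If the keyword contains spaces or special characters, it needs to be URL-encoded before it goes into the query string. Here is a minimal sketch using the standard library; build_bing_url is a hypothetical helper name, and the q, rdr and first parameters are the ones used in the URLs above.

```python
from urllib.parse import urlencode

def build_bing_url(keyword, first=1):
    """Build a Bing search URL for any keyword, encoding it safely."""
    params = {"q": keyword, "rdr": 1, "first": first}
    return "https://www.bing.com/search?" + urlencode(params)

print(build_bing_url("sydney", 11))
print(build_bing_url("new york"))
```

urlencode turns the space in "new york" into a plus sign, so the resulting URL stays valid.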

Complete Code

The complete code for the second section will more or less look like this.

import requests
from bs4 import BeautifulSoup

l=[]
o={}
headers={"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"}

# The "first" parameter takes the values 1, 11, 21, ... 91
for page_start in range(0,100,10):
    target_url="https://www.bing.com/search?q=sydney&rdr=1&first={}".format(page_start+1)

    print(target_url)

    resp=requests.get(target_url,headers=headers)

    soup = BeautifulSoup(resp.text, 'html.parser')

    completeData = soup.find_all("li",{"class":"b_algo"})

    for i in range(0, len(completeData)):
        o["Title"]=completeData[i].find("a").text
        o["link"]=completeData[i].find("a").get("href")
        o["Description"]=completeData[i].find("div",{"class":"b_caption"}).text
        o["Position"]=i+1  # position within the current page
        l.append(o)
        o={}

print(l)

How can you scrape Bing search results without getting blocked?

Bing has a sophisticated IP and bot detection system. If you want to scrape Bing at scale, scraping it the way we did above will not work.

You will need rotating proxies, rotating headers, and so on. Scrapingdog can help you collect data from Bing without getting blocked. You can leave the headache of proxies and headless browsers to Scrapingdog.
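To give an idea of what rotating headers means in practice, here is a minimal sketch that cycles through a few User-Agent strings. The first string is the one used earlier in this tutorial; the other two are example values I picked for illustration, and next_headers is a hypothetical helper name. Header rotation on its own is unlikely to be enough at scale, and proxy rotation is not shown here.

```python
from itertools import cycle

# Example User-Agent values; only the first one appears earlier in this tutorial.
USER_AGENTS = [
    "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
    "Mozilla/5.0 (Macintosh; Intel Mac OS X 10_15_7) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
    "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36",
]

_ua_pool = cycle(USER_AGENTS)  # endless round-robin iterator over the list

def next_headers():
    """Return a headers dict with the next User-Agent in the rotation."""
    return {"User-Agent": next(_ua_pool)}
```

Each request would then pass headers=next_headers() instead of a fixed headers dictionary.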

Let’s see how you can scrape Bing using Scrapingdog's free pack, which includes 1000 free API calls.

Once you sign up, you will get an API key on the dashboard. You can use the same code as above, but in place of the target_url use the Scrapingdog API.

import requests
from bs4 import BeautifulSoup

l=[]
o={}
headers={"User-Agent":"Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/107.0.0.0 Safari/537.36"}

for page_start in range(0,100,10):
    # The "&" characters of the Bing URL are encoded as %26 so they stay inside the url parameter
    target_url="https://api.scrapingdog.com/scrape?api_key=YOUR-API-KEY&dynamic=false&url=https://www.bing.com/search?q=sydney%26rdr=1%26first={}".format(page_start+1)

    print(target_url)

    resp=requests.get(target_url,headers=headers)

    soup = BeautifulSoup(resp.text, 'html.parser')

    completeData = soup.find_all("li",{"class":"b_algo"})

    for i in range(0, len(completeData)):
        o["Title"]=completeData[i].find("a").text
        o["link"]=completeData[i].find("a").get("href")
        o["Description"]=completeData[i].find("div",{"class":"b_caption"}).text
        o["Position"]=i+1  # position within the current page
        l.append(o)
        o={}

print(l)

In the above code, just replace “YOUR-API-KEY” with your own key. This gives you a data pipeline that can help you build tools like:
  • Rank Tracker
  • Backlink Analysis
  • News prediction
  • Market prediction
  • Image detection
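
Coming back to the API URL in the code above: the Bing URL is passed as a parameter, so its own & characters must be encoded or they will be read as parameters of the API call. Here is a minimal sketch that lets the standard library do the encoding. The endpoint and the api_key, dynamic and url parameters are the ones used above; build_api_url is a hypothetical helper name, and I am assuming the API accepts a fully percent-encoded url value, so check the Scrapingdog documentation before relying on it.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://api.scrapingdog.com/scrape"

def build_api_url(api_key, target_url):
    """Wrap a target URL in an API call, percent-encoding it as one parameter."""
    params = {"api_key": api_key, "dynamic": "false", "url": target_url}
    return API_ENDPOINT + "?" + urlencode(params)

api_url = build_api_url(
    "YOUR-API-KEY",
    "https://www.bing.com/search?q=sydney&rdr=1&first=1",
)
print(api_url)
```

This avoids writing %26 by hand and keeps working when the keyword itself contains special characters.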

Conclusion

In this tutorial, you learned to scrape the Bing search engine. You can make some changes, like calculating the number of pages Bing serves for the keyword provided and then adjusting the for loop accordingly. You can even customize this code to scrape images from Bing.
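
One simple way to adjust the loop is to stop as soon as a page comes back with no results, instead of always requesting ten pages. The sketch below assumes that an empty page means there is nothing more to fetch, which may not always hold for Bing. scrape_until_empty is a hypothetical name, and fetch_html stands for any function that takes the first offset and returns the page HTML.

```python
from bs4 import BeautifulSoup

def scrape_until_empty(fetch_html, max_pages=10):
    """Collect results page by page, stopping at the first page with no results.

    fetch_html is any callable that takes the `first` offset and returns HTML.
    """
    rows = []
    for page in range(max_pages):
        first = page * 10 + 1  # 1, 11, 21, ...
        soup = BeautifulSoup(fetch_html(first), "html.parser")
        items = soup.find_all("li", {"class": "b_algo"})
        if not items:
            break  # assumption: an empty page means no more results
        for item in items:
            link = item.find("a")
            if link is None:
                continue  # skip result blocks without a link
            rows.append({
                "Title": link.get_text(strip=True),
                "link": link.get("href"),
                "Position": len(rows) + 1,  # running rank across all pages
            })
    return rows
```

Because the fetching function is passed in, you can plug in a direct request or a scraping API call without touching the parsing logic.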

Of course, you are advised to use a scraping API for any search engine, not just Bing. Once your IP is blocked, your whole pipeline stops, and recovering from that can be difficult.

I hope you like this tutorial. Please feel free to ask us any scraping-related questions; we will respond to as many as possible.

Frequently Asked Questions

Does Bing provide an official API?

Yes, Bing does provide an official API. However, it has limitations and is costlier than third-party APIs, which are available at a very economical price and are easy to use.

Can you scrape Bing search results without getting blocked?

Yes, you can scrape Bing search results without getting blocked by using a web scraping API like Scrapingdog.

Is it legal to scrape Bing?

Yes, as long as you are doing it for ethical purposes it is legal. In web scraping, a general rule of thumb is that you can scrape any publicly available data from any source, including search engines like Bing.

My name is Manthan Koolwal and I am the founder of scrapingdog.com. I love creating scraper and seamless data pipelines.
Manthan Koolwal
