Web scraping with C#: tutorial and C# data extraction sample for eBay | MyDataProvider

How to scrape data from a website in C#

Find out how to create a C# web scraper. Data extraction sample for eBay: tutorial and video demo.



About Web Scraping with C#

The demand for data extraction keeps growing. Many websites expose their data through a web API or web services. When a website does not provide or allow such access, web scraping is used to get at the data instead. Web scraping lets developers simulate and automate human browsing behavior to extract content, files, images and other information from web applications for a specific task.

Video Demo

Now I am going to scrape a website with C#. I will scrape eBay, a very well-known website. The process consists of several steps, which are described in detail below.

 

 

Let's develop a web scraper with C# for eBay

 

•Open Visual Studio. Add a new project and choose the C# console application template. Name the project EbayScrapper, as shown in the figure.

•Next, go to the eBay website: https://www.ebay.com
•On the eBay website, open the advanced search and search for "xbox one". The matching items will be listed.

•Copy the URL of the results page and paste it into the main program, because this is the page we will scrape later in this article.

 

Required References

 

•Next, install the following NuGet packages by right-clicking References in the project tree:
1) HttpClient
2) HtmlAgilityPack
These two packages are all that is needed to scrape this website.

 

 

•The following code downloads the HTML source of the page, which we can then compare against the source shown in the browser.

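// Requires: using System; using System.Net.Http; using System.Threading.Tasks;
static void Main(string[] args)
{
    // Wait for the download to finish, then keep the console window open.
    GetHtmlAsync().GetAwaiter().GetResult();
    Console.ReadLine();
}

static async Task GetHtmlAsync()
{
    var url = "https://www.ebay.com/sch/i.html?_nkw=xbox+one&_in_kw=1&_ex_kw=&_sacat=0&LH_Complete=1&_udlo=&_udhi=&_samilow=&_samihi=&_sadis=15&_stpos=&_sargn=-1%26saslc%3D1&_salic=1&_sop=12&_dmd=1&_ipg=50";
    var httpclient = new HttpClient();
    // GetStringAsync returns a Task<string>; await it to get the page source.
    var html = await httpclient.GetStringAsync(url);
    Console.WriteLine(html);
}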

The output of this code is as follows:

 

•Copy the HTML source into Notepad so that we can look up the element ids and class names we will need later.

•Next, we parse the data from eBay in our application. For this we load the page source into an HtmlAgilityPack HtmlDocument and write code that grabs the list items. The ids used in this code are taken from the page source.

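// Requires: using System.Linq; using HtmlAgilityPack;
// Load the downloaded page source (the html string from GetHtmlAsync).
var htmlDocument = new HtmlDocument();
htmlDocument.LoadHtml(html);

// The <ul id="ListViewInner"> element holds the search results.
var ProductsHtml = htmlDocument.DocumentNode.Descendants("ul")
    .Where(node => node.GetAttributeValue("id", "")
        .Equals("ListViewInner")).ToList();
// Each result is an <li> whose id contains "item".
var ProductListItems = ProductsHtml[0].Descendants("li")
    .Where(node => node.GetAttributeValue("id", "")
        .Contains("item")).ToList();
Console.WriteLine(ProductListItems.Count());
Console.WriteLine();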
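The selection logic above can be tried without any network access. The following is a minimal sketch, not part of the original sample: the class name ListItemDemo, the method GetProductListItems and the SampleHtml markup are hypothetical and hand-written to imitate the structure described in this tutorial (a ul with id "ListViewInner" whose result items have an id containing "item"). The live eBay markup may differ.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using HtmlAgilityPack;

public static class ListItemDemo
{
    // Hypothetical markup imitating the structure described in the tutorial.
    public const string SampleHtml = @"
<ul id=""ListViewInner"">
  <li id=""item1"" listingid=""111""><h3 class=""lvtitle""><a href=""https://example.com/1"">Xbox One 500GB</a></h3></li>
  <li id=""item2"" listingid=""222""><h3 class=""lvtitle""><a href=""https://example.com/2"">Xbox One S</a></h3></li>
  <li class=""ad"">Sponsored</li>
</ul>";

    // Returns the <li> result nodes, or an empty list when the results <ul> is missing.
    public static List<HtmlNode> GetProductListItems(string html)
    {
        var htmlDocument = new HtmlDocument();
        htmlDocument.LoadHtml(html);

        var productList = htmlDocument.DocumentNode.Descendants("ul")
            .FirstOrDefault(node => node.GetAttributeValue("id", "") == "ListViewInner");
        if (productList == null)
        {
            return new List<HtmlNode>();
        }

        return productList.Descendants("li")
            .Where(node => node.GetAttributeValue("id", "").Contains("item"))
            .ToList();
    }

    public static void Main()
    {
        var items = GetProductListItems(SampleHtml);
        // The third <li> has no id, so it is filtered out.
        Console.WriteLine(items.Count);
    }
}
```

Using FirstOrDefault with a null check instead of indexing ProductsHtml[0] avoids an exception when the page does not contain the expected list, for example after a layout change.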

•Run the program with a breakpoint set after this code; the debugger shows that the count is 50. Each list item contains all the information for one product, and the text visualizer shows it as follows:

•The next steps pull the details of every item from the website into the console application. To do this, I extend the existing code.
•First, I add a foreach loop that walks through the items in the same order as they appear on the website.
•Inside the loop, I start with the listing id, because every ad on the website has a unique id. The code for this is as follows:

foreach (var ProductListItem in ProductListItems)
{
    // ID
    Console.WriteLine(ProductListItem.GetAttributeValue("listingid", ""));

    // The remaining snippets in this article also go inside this loop.
}

•The next thing to pull out is the product name. Like the id code, the product name code must be placed inside the foreach loop. It selects the element by the class attribute used in the page source:

    // ProductName (?. avoids an exception when the element is missing)
    Console.WriteLine(ProductListItem.Descendants("h3")
        .Where(node => node.GetAttributeValue("class", "")
            .Equals("lvtitle")).FirstOrDefault()?.InnerText.Trim('\r', '\n', '\t'));

•The next thing to pull out is the price of the product. The price code also selects the element by the class attribute used in the page source:

    // Price
    Console.WriteLine(ProductListItem.Descendants("li")
        .Where(node => node.GetAttributeValue("class", "")
            .Equals("lvprice prc")).FirstOrDefault()?.InnerText.Trim('\r', '\n', '\t'));

•All of these selections are based on the tags and attributes found in the page source.
These are the details of the product. The next thing to pull out is the listing type, with the following code:

    // ListingType
    Console.WriteLine(ProductListItem.Descendants("li")
        .Where(node => node.GetAttributeValue("class", "")
            .Equals("lvformat")).FirstOrDefault()?.InnerText.Trim('\r', '\n', '\t'));

•The last thing to pull out is the link to the product as given on the website. Its code is as follows:

    // URL (href of the first link inside the item)
    Console.WriteLine(ProductListItem.Descendants("a").FirstOrDefault()?.GetAttributeValue("href", ""));

The overall code of this web scraping process is given below:
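The snippets from the previous steps can be assembled into a single program roughly as follows. This is a sketch rather than the author's exact listing: the names EbayScraperSketch, Listing, TextByClass and ParseListings are hypothetical, and the element ids and class names are the ones quoted in this tutorial, which the live eBay pages may no longer use. It assumes C# 7.1 or later for the async Main method.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Net.Http;
using System.Threading.Tasks;
using HtmlAgilityPack;

// Hypothetical container for the five fields extracted in the tutorial.
public sealed class Listing
{
    public string Id { get; set; }
    public string Name { get; set; }
    public string Price { get; set; }
    public string ListingType { get; set; }
    public string Url { get; set; }
}

public static class EbayScraperSketch
{
    // Trimmed inner text of the first <tag> under parent whose class attribute
    // equals className, or an empty string when no such element exists.
    public static string TextByClass(HtmlNode parent, string tag, string className)
    {
        var node = parent.Descendants(tag)
            .FirstOrDefault(n => n.GetAttributeValue("class", "") == className);
        return node == null ? "" : node.InnerText.Trim();
    }

    // Parses the result list out of a page source string.
    public static List<Listing> ParseListings(string html)
    {
        var htmlDocument = new HtmlDocument();
        htmlDocument.LoadHtml(html);

        var productList = htmlDocument.DocumentNode.Descendants("ul")
            .FirstOrDefault(n => n.GetAttributeValue("id", "") == "ListViewInner");
        if (productList == null)
        {
            return new List<Listing>();
        }

        return productList.Descendants("li")
            .Where(n => n.GetAttributeValue("id", "").Contains("item"))
            .Select(item => new Listing
            {
                Id = item.GetAttributeValue("listingid", ""),
                Name = TextByClass(item, "h3", "lvtitle"),
                Price = TextByClass(item, "li", "lvprice prc"),
                ListingType = TextByClass(item, "li", "lvformat"),
                Url = item.Descendants("a").FirstOrDefault()?.GetAttributeValue("href", "") ?? ""
            })
            .ToList();
    }

    public static async Task Main()
    {
        var url = "https://www.ebay.com/sch/i.html?_nkw=xbox+one&_in_kw=1&_ex_kw=&_sacat=0&LH_Complete=1&_udlo=&_udhi=&_samilow=&_samihi=&_sadis=15&_stpos=&_sargn=-1%26saslc%3D1&_salic=1&_sop=12&_dmd=1&_ipg=50";

        using (var httpClient = new HttpClient())
        {
            var html = await httpClient.GetStringAsync(url);
            foreach (var listing in ParseListings(html))
            {
                Console.WriteLine($"{listing.Id} | {listing.Name} | {listing.Price} | {listing.ListingType} | {listing.Url}");
            }
        }
    }
}
```

Keeping the parsing in ParseListings, separate from the download, means the extraction can be checked against a saved copy of the page source, such as the one pasted into Notepad earlier.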

 

Output

 

 

Sample Product

Sample product on the eBay website.

 


C# Web Scraping Libraries
Fizzler
HtmlAgilityPack
AngleSharp
CsQuery