
Cheerio for Node.js field extraction: exporting product price, title, and SKU


In the previous articles we learned how to create a simple web scraper using Node.js and axios.

How to create a web scraper using nodejs and axios


The axios module for Node.js lets us load a page's HTML source. Now we need to extract the required fields from that HTML, and we will do that with the Cheerio module for Node.js.

Cheerio module for web scraping via NodeJS - extract fields from a product page

Let's add this function to the script from the earlier article, which loads the HTML page source for each product URL:

How to scrape product urls from text file using NodeJS

Let's write the extraction code for this test page:

Product Sample Page for Web Scraping

Here is the part of the HTML that holds the SKU, price, and currency. The title is read from the page's h1 heading, which is outside this fragment.
[code lang="html"]
<div class="nv-content-wrap entry-content">
<table width="200px">
<tbody>
<tr>
<td>Sku:
</td>
<td>
<div class="sku">
testSku
</div>
</td>
<td>Price:
</td>
<td>
<div class="price-value">
123
</div>
</td>
<td>
Currency:
</td>
<td>
<div class="price-currency">
USD
</div>
</td>
</tr>
</tbody>
</table>

</div>
[/code]

Here is the code for field extraction:

[code lang="js"]
const cheerio = require('cheerio');

// Parse the page source and print the product fields.
async function extractData(html)
{
    const $ = cheerio.load(html);

    // .text() returns '' when nothing matches, so .trim() never throws
    // (.html() returns null for an empty selection).
    const sku = $('.sku').first().text().trim();
    const title = $('h1').first().text().trim();
    const price = $('.price-value').first().text().trim();
    const currency = $('.price-currency').first().text().trim();

    console.log('sku:' + sku);
    console.log('title:' + title);
    console.log('price:' + price);
    console.log('currency:' + currency);
}

[/code]


We hope this sample helps you extract the fields you need from source sites and HTML pages.