Simple Web Scraping using Node.js, cheerio, and request

Introduction

Web scraping, also called web data extraction, is a technique for extracting large amounts of data from websites. The extracted data is usually stored on a local computer in one of several file formats. It might then be consumed by an application, or analyzed and studied for competitive purposes. I got into web scraping while working on an application that needed to fetch all my blog posts from both of my WordPress sites. Scraping met that requirement, and it showed me that the same approach can pull data from almost any website. This is a very brief article to get you hands-on with web scraping. By the end of this tutorial, you should be able to scrape data from a website of your choice.

Here, I will scrape the home pages of both my WordPress sites and extract metadata such as the post title and post URL.
