Let’s talk about web scraping services. Imagine that you are going to use one. Sooner or later you have to make a decision and select a vendor. That is really hard to do the first time, especially for a web scraping service. Why?
- There is no way to test it, because you do not know how the software works.
- You see only prices.
- You end up basing your decision on the design of the vendor’s site.
- Only by writing to the vendor’s sales team can you judge whether they are a good fit for you.
OK. But the first thing that really influences your decision is the price. In a typical case, the price of a web scraping service depends on the number of queries the vendor processes per month or year. Of course, it is wrong to compare services by price alone. But we have to do it, because price is very important to the customer.
So let’s look at vendors’ prices.
- 1 How to find web scraping service?
- 2 How to understand what is a web scraping service?
- 3 webscraper.io
- 4 Import.io
- 5 scrapinghub.com
- 6 dexi.io (old Cloudscrape)
- 7 parsehub.com
- 8 apifier.com
- 9 data-miner.io
- 10 mozenda.com
- 11 grepsr.com
- 12 80legs.com
- 13 Let’s combine all prices & offers into one table:
How to find web scraping service?
It is very easy. Go to Google or Bing, type “web scraping service”, and be ready to get 10-20 different results. In this article I describe, from the price point of view, 8 web scraping services from the top of Google’s results.
How to understand what is a web scraping service?
There are three ways to do web scraping:
- Web scraping services.
- Web scraping tools.
- Programming your own scraper using libraries.
Web scraping services are SaaS solutions with the following core features:
- Monthly payments
- A web UI, since it is SaaS (sometimes a desktop UI as well)
The main difference from web scraping tools is that web scraping services offer the following benefits:
- Rich tools for project setup
- Hosting of code & application
- Web access to results
webscraper.io
webscraper.io is first in our list because it is first in Google. That is very strange, because the service has a very ugly UI.
Webscraper.io offers the following price table:
- 100,000 pages – $50.
- 250,000 pages – $90.
- 500,000 pages – $125.
- 1,000,000 pages – $175.
- 2,000,000 pages – $250.
The price looks very attractive, but you have to understand that for this price you get only the execution of your scraper configuration on webscraper.io hosting. It means that you have to:
- Set up the Chrome extension.
- Read the documentation and understand how to work with it.
- Configure the scraper, debug it, and test it.
- Contact webscraper.io and run the scraper.
So in this scenario you pay just for hardware, that is, for script execution. All development and support are on your side.
Import.io
The interesting startup import.io has a monetization scheme very similar to webscraper.io’s. You pay just for the execution of your scraper in the cloud and for support. The first difference from webscraper.io is a free plan that lets the user run up to 10,000 queries per month. But you have to know that webscraper.io lets you run the scraper on your own hardware (a notebook or a server), so this difference is not so important. There is no way to test import.io on real data, whereas webscraper.io allows the customer to do that. The next tariff is $249 for 50,000 queries per month, and the top tariff is $399 for 100,000 queries per month.
So if we compare webscraper.io and import.io, we see that import.io is roughly 8 to 10 times more expensive per query (for example, $399 versus $50 for 100,000 queries). That may seem strange unless you understand that import.io offers you a better experience.
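A quick way to check this kind of comparison is to convert every tier to a price per 1,000 pages. The sketch below uses only the prices quoted above and assumes that one webscraper.io page is comparable to one import.io query, which is a simplification rather than something either vendor states.

```python
# Tiers as quoted in this article: (included pages or queries, price in USD).
# Assumption: a webscraper.io "page" and an import.io "query" are comparable units.
WEBSCRAPER_IO = [(100_000, 50), (250_000, 90), (500_000, 125),
                 (1_000_000, 175), (2_000_000, 250)]
IMPORT_IO = [(50_000, 249), (100_000, 399)]


def price_per_thousand(volume: int, price: float) -> float:
    """Return the price in USD of 1,000 pages or queries for one tier."""
    return price / volume * 1000


if __name__ == "__main__":
    for name, tiers in (("webscraper.io", WEBSCRAPER_IO), ("import.io", IMPORT_IO)):
        for volume, price in tiers:
            unit = price_per_thousand(volume, price)
            print(f"{name}: {volume:>9,} for ${price} = ${unit:.2f} per 1,000")

    # Compare each import.io tier with the entry-level webscraper.io tier.
    base = price_per_thousand(*WEBSCRAPER_IO[0])
    for volume, price in IMPORT_IO:
        ratio = price_per_thousand(volume, price) / base
        print(f"import.io {volume:,} tier: {ratio:.1f}x the webscraper.io entry price")
```

With these numbers the two import.io tiers come out at roughly 10 and 8 times the per-page price of the cheapest webscraper.io tier.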
Anyway, when using import.io you have to:
- Study the documentation.
- Test the software.
- Ask support about various how-tos and bugs.
- Develop using import.io.
scrapinghub.com
This is a very interesting vendor because it uses open-source software that it maintains itself.
The pricing is not clear to a general audience, and it looks like the average scrapinghub.com customer is a developer. They also sell a number of monthly requests. There are 4 main tariffs for web scraping:
- C10. $25/mo 150K requests.
- C50. $100/mo 1M requests.
- C100. $250/mo 3M requests.
- C200. $500/mo 9M requests.
Let’s talk about the monetization scheme. They sell just hosting, so the scheme is the same as for import.io and webscraper.io.
dexi.io (old Cloudscrape)
Dexi.io does not show prices on the front page. There is no chance to get a price without registering by email. OK. Once registration is done, you can check the prices. Inside the application you can see that the first price is $119:
- the cost of 1 worker is $119.00
- the cost of 2 workers is $238.
- the cost of 3 workers is $357.
- the cost of 4 workers is $476.
I found an interesting feature inside dexi: they let a user use dexi’s proxies. I think these are not private proxies; dexi just purchased them and lets its customers use them. I found it very helpful.
parsehub.com
If you want to use the parsehub.com web scraping service, you have to download the application for Windows, Mac, or Linux. It is very similar to import.io, but if we look at the monetization policy, we see no restrictions on queries per month; the price depends on the number of concurrent requests and the number of projects in one account.
It has to be said that parsehub.com has a free account with the following limits:
- 5 public projects.
- 5 pages/minute.
- 200 pages per run.
- ip rotation: no.
The first tariff costs $99 and allows 10,000 pages per run with a maximum download speed of 20 pages/minute.
The second tariff costs $499, with unlimited pages and a maximum download speed of 120 pages/minute.
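The speed limits matter as much as the page limits, because they set a floor on how long a run takes. Here is a minimal sketch of that arithmetic, using the limits quoted above and assuming the pages/minute figure is a hard cap that the scraper reaches for the whole run; the function names are mine, not part of any parsehub.com API.

```python
import math

# Limits as quoted in this article: plan -> (max pages per run, pages per minute).
# None means the plan has no page limit per run.
PARSEHUB_PLANS = {
    "free": (200, 5),
    "$99": (10_000, 20),
    "$499": (None, 120),
}


def min_run_minutes(pages: int, pages_per_minute: int) -> int:
    """Return the shortest possible run time in whole minutes at a capped speed."""
    return math.ceil(pages / pages_per_minute)


def describe(plan: str, pages: int) -> str:
    """Say whether a run of `pages` pages fits the plan and how long it takes at best."""
    max_pages, speed = PARSEHUB_PLANS[plan]
    if max_pages is not None and pages > max_pages:
        return f"{plan}: {pages:,} pages exceed the {max_pages:,} pages-per-run limit"
    return f"{plan}: {pages:,} pages need at least {min_run_minutes(pages, speed)} minutes"


if __name__ == "__main__":
    for plan in PARSEHUB_PLANS:
        print(describe(plan, 200))
        print(describe(plan, 10_000))
```

For example, a full 10,000-page run on the $99 tariff cannot finish in less than 500 minutes, which is more than 8 hours.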
apifier.com
The free tariff gives the user the following:
- Up to 40k pages/month.
- Full functionality.
- IP rotation.
- 7 days of historic data.
The business tariff costs $129/mo. It allows the user to download 2M pages with 10 parallel crawlers.
But it is not the minimal tariff. The minimum cost is $19/mo; with that tariff you get 80k pages and 3 parallel crawlers. The maximum tariff, at $999, allows the user to extract data from 50M pages with 50 parallel crawlers. As you can see, the situation is the same as for the other web scraping services: you have to learn how to develop using apifier.com commands, read the documentation, debug the code, test the project, and adjust it.
It is nice that apifier has a free plan, so it is possible to test it. It looks like the business and enterprise plans are intended to sell support, because 40K requests on the free plan is a big deal and lets many customers use the service for free. But be aware that if you want to ask them to build a crawler for you, you have to pay.
data-miner.io
When you visit the front page of this web scraping service, you see this cool message: “Data Miner is Free for 98% of our users!” It is really interesting, because a message like that makes you want to test the software or message the vendor. OK, let’s check.
The Professional tariff costs $19.99/mo; the main limit is 500 scraped pages/month.
The Business tariff costs $49; the limit is 1,000 scraped pages/month.
The Enterprise tariff costs $99; the limit ranges from 5,000 to unlimited pages/month.
I have to say that the Professional tariff has a strange limit: “Scraped URLs Limited on Business Domains”. What does it mean? It is not clear. The main feature of data-miner is its big set of “recipes”, so you can pick the ones you like. You can make your own recipes public, and they become available to other data-miner users. It looks very nice.
mozenda.com
The Professional tariff costs from $99 to $199 for 5,000 to 25,000 page credits/mo. The Enterprise tariff starts from $3,500 with the following limits:
- 1,000,000 page credits.
- 20 GB storage.
- Up to 100 Agents.
Mozenda’s architecture is similar to that of import.io and parsehub.com. You have to download the desktop software, configure a web scraper (an Agent) on the desktop, save the project, then upload it and run the scraping task on Mozenda’s servers. In the plan details you can find various rules for how credits are calculated. Mozenda looks like the most enterprise-ready solution. My opinion is based on the prices 🙂
grepsr.com
grepsr.com offers two main plans. The first, the “starter plan”, costs $129 + $50 per 50K extra records. It looks like they created this plan for customers who do not want to use the service every month but want to get data from a website. The second plan’s price is “$99 /PER SITE PLUS $50/MO PER 50K RECORDS”. It has to be said that grepsr.com is not a web scraping service, because customers cannot set up a project themselves. Grepsr delivers clean data from the web source to a private area on its website. The offer looks very interesting because the user gets cleaned data and does not have to worry about development.
80legs.com
80legs offers the following plans:
- Free Plan – Up to 10,000 URLs per crawl + 1 crawl at a time
- Intro Plan ($29/month) – Up to 100,000 URLs per crawl + 2 crawls at a time
- Plus Plan ($99/month) – Up to 1,000,000 URLs per crawl + 3 crawls at a time
- Premium Plan ($299/month) – Up to 10,000,000 URLs per crawl + 5 crawls at a time
It has to be said that 80legs.com is more of a “web crawling” service. The use cases they show describe web crawling tasks: they crawl the whole web and extract information. There are no cases about web scraping tasks. One more thing about 80legs: you need to program in JavaScript to run their crawler, so it does not look so attractive if you do not have developer skills.
Let’s combine all prices & offers into one table:
It would be nice to compare service prices on a real case. The task is the following: imagine that you need to extract data from an ecommerce website, 5,000 items in the first case and 25,000-30,000 items in the second. In the first case, debugging plus the actual run will use 10,000 raw requests. In the second case, debugging plus the actual run will use 30,000 raw requests. The following table collects the calculated prices for each web scraping service:
Price table is sorted by “min.price,$” column.
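The same kind of calculation can be sketched in a few lines of code. The example below covers only the four vendors whose plans are quoted above as a plain request or page quota, and it assumes that the whole job (10,000 or 30,000 raw requests) runs within one billing period and that every quota is per month. It is a rough cross-check built from the prices listed in this article, and the helper names are mine.

```python
from typing import List, Optional, Tuple

# Tiers as quoted in this article: (included requests or pages, price in USD).
# Assumption: all quotas are treated as monthly and the job fits in one month.
TIERS = {
    "webscraper.io": [(100_000, 50), (250_000, 90), (500_000, 125),
                      (1_000_000, 175), (2_000_000, 250)],
    "import.io": [(10_000, 0), (50_000, 249), (100_000, 399)],
    "scrapinghub.com": [(150_000, 25), (1_000_000, 100),
                        (3_000_000, 250), (9_000_000, 500)],
    "apifier.com": [(40_000, 0), (80_000, 19), (2_000_000, 129), (50_000_000, 999)],
}


def cheapest_price(tiers: List[Tuple[int, float]], requests: int) -> Optional[float]:
    """Return the lowest price among tiers whose quota covers `requests`, or None."""
    fitting = [price for quota, price in tiers if quota >= requests]
    return min(fitting) if fitting else None


def compare(requests: int) -> List[Tuple[str, Optional[float]]]:
    """Return (vendor, price) pairs sorted by price; vendors with no fitting tier go last."""
    prices = {vendor: cheapest_price(tiers, requests) for vendor, tiers in TIERS.items()}
    return sorted(prices.items(), key=lambda item: (item[1] is None, item[1] or 0))


if __name__ == "__main__":
    for requests in (10_000, 30_000):
        print(f"{requests:,} raw requests:")
        for vendor, price in compare(requests):
            print(f"  {vendor}: {'no fitting tier' if price is None else f'${price}'}")
```

A quota-only comparison like this ignores concurrency limits, support, and development effort, so it shows only the hosting part of the bill.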
What is missed in this article?
I compared web scraping services by price, but that is wrong, and I explained why at the beginning of the article. Remember that I selected the services from Google’s results for the keyword “web scraping service”. So not all of the services I described are web scraping services in the usual sense. Each service solves the tasks it is best suited to. You need to test each service, and only when it solves your task the way you need should you start paying for full support and management. Comparing services by price alone is the wrong approach. The right way is to compare service, supported features, and price together. But that takes a lot of time.
What do you have to understand about web scraping services?
- No web scraping service solves your data extraction problem for you.
- Web scraping services give you only “hosting” and tools with a UI.
- You have to create the project inside the service, debug it, manage it, and support it.
- You need developer skills and practice if you want to use one.
What extra cost to expect?
- Payments for project creation (development).
- Payments for support & maintenance.
- Your time.
Data-miner offers the lowest price. Why? Because data-miner positions itself as a simple service that lets users without development skills extract data from a web page into an Excel sheet.
Import.io has the highest “minimum” price of all the services. Why? It looks like good PR and second place in search rankings for the keyword “web scraping” allow them to do it. Compared with the other services, import.io does not have significant benefits.
Scrapinghub.com offers the largest number of requests for the least money. Why? Because scrapinghub.com provides value for developers only, and developers are its main users and customers. You cannot use scrapinghub.com unless you understand development and know a specific programming language (Python) and a specific library (Scrapy).
Every web scraping service can create a custom web scraper for your needs. The cost of a custom scraper depends on many factors, so you have to contact each vendor and request a quote. My recommendation here is to write a technical specification before making the request.