Grepsr is one of the more intriguing startups I’ve come across in a while. It is a data extraction and web crawling service which you can use if you need to automate data collection from places on the web. Initially I thought this was simply a data scraping tool, but upon closer inspection ‘service’ does seem to be a more apt description as there is some human involvement from the Grepsr team as well.
I spoke to Amit Chowdhury, who explained that Grepsr is just a two-man operation so far. The service, which launched back in October, aims to give users a simple interface to tell Grepsr which data on a webpage they want. This is done with a sort of screen-capture tool, where you can highlight or ‘snap’ the data that you want on a given webpage or document. There are also browser plugins that users can use to highlight target data as they browse websites (see picture). Once Grepsr has your requirements, along with any comments you might want to send, the team then creates an extractor. Amit explains:
There is some manual intervention in this process, but we shield the users from this. It differs from [other solutions] in such a way that our focus is mostly on non-technical customers or companies who do not know how to go about data extraction… Grepsr hides the details, and just delivers the data neatly, organized and in a very streamlined fashion.
Amit estimates that the process is about 80 percent automated and 20 percent manual, the latter being the time needed to write an extractor for each project, though he points out that their “stable backend system makes it very easy and quick to write extractors.” Once an extractor is prepared, it can then be scheduled to collect data or update at certain intervals.
Grepsr appears to offer a number of options for data delivery, such as integration with Dropbox, FTP, and Google Docs; or, if you’re a developer, they’ll provide data feeds that an application could consume. From what I can see by browsing a sample project, the results look very good.
Such a tailor-made data collection service doesn’t come free, as you might expect, with costs ranging from $99 to $129 per project depending on the chosen pricing plan. But if you’re someone who lacks the technical skills to extract data on your own [1], then it could be well worth the price and more.
Grepsr is currently based in Australia, but the founders are in the process of moving the startup to Nepal, where they are originally from, explaining:
We liked the challenge of bringing up a tech-startup from a virtually unknown country and hope to put Nepal in the map somehow – even if it’s in a very small way.
Currently Grepsr is self-financed, and has had about 15 individual clients in its first two months – ranging from real estate agents to researchers to even lottery players – with some recurring projects. That’s not a whole lot of business, but Grepsr says that their costs are minimal: mostly their time, cloud computing expenses, and coffee! They hope enterprises can take advantage of their service in the future.
But given the manual attention given to each project, can a two-man operation like this scale? Amit doesn’t foresee any problem in scaling, saying that their backend platform does not require super-talented programmers; an average programmer could get by, because the backend hides all the details of extraction.
Given the importance of ‘big data’ these days in business, and in news as well, I really hope that Grepsr does well. Services like this, which make working with data more accessible to non-programmers, are in my opinion very much needed. And for anyone wondering about the unusual name, an explanation is below:
Our brand/product name Grepsr comes from a tech-word “grep” which roughly means “find and extract”. Check out http://t.co/Q3YIuMY for more.
-
[1] There are a number of solutions already available for data scraping, though most require some prerequisite technical knowledge. I usually favor Google Spreadsheet’s import functions, or even Dapper.net.