80legs Users Increase Web Crawler Rates With Release of New API

Datafiniti
Knowledge from Data: The Datafiniti Blog
Jul 2, 2018

Every web crawl is different from the next. Web scraping can eat up time, which is why 80legs aims to take on the legwork for its customers. Today, users will notice an uptick in the reliability and speed of their web crawls, thanks to a rework of the 80legs API back-end.

“We were encountering limits to achievement with our existing architecture,” said Shawn Rushefsky, software engineer. “It has been the bottleneck of our data-ingestion pipeline.”

Over five months, Rushefsky and fellow software engineer Moe Jangda rebuilt the API from the ground up to be more extensible, maintainable, and scalable. Together, the pair researched technology options for fixing stuck crawls and extending the long-term functionality of 80legs.

“We want our crawling engine to crawl as fast as the internet allows,” said Jangda. “To achieve this, we’re setting up our tech stack to dynamically scale based on load.”

Going forward, the existing 80legs architecture will be replaced by RabbitMQ, an open source message broker. With a new codebase, the team can focus its efforts on raising throughput and overcoming the limitations of horizontal scaling. In the next few months, the team will deploy additional components to gain better visibility into the crawling platform and reduce the turnaround time for fixing bugs and delivering features.
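To make "dynamically scale based on load" concrete, here is a minimal sketch of one common approach: choosing a worker count from the number of messages waiting in a queue. This is an illustration only, not the 80legs implementation. The function name, the per-worker capacity, and the worker limits are all assumptions made for the example.

```python
import math


def desired_workers(queue_depth, per_worker_capacity, min_workers=1, max_workers=50):
    """Return how many workers to run for a given backlog.

    All names and default limits are hypothetical; they are not taken
    from the 80legs codebase.
    """
    if per_worker_capacity <= 0:
        raise ValueError("per_worker_capacity must be positive")
    if queue_depth < 0:
        raise ValueError("queue_depth cannot be negative")

    # Enough workers to drain the backlog, rounded up to a whole worker.
    needed = math.ceil(queue_depth / per_worker_capacity)

    # Clamp to the allowed range so the pool never hits zero or grows unbounded.
    return max(min_workers, min(max_workers, needed))


if __name__ == "__main__":
    for depth in (0, 250, 1_000_000):
        print(depth, "queued ->", desired_workers(depth, per_worker_capacity=100), "workers")
```

In a real deployment, the queue depth would come from the message broker's own monitoring, and the clamp keeps a burst of work from launching more workers than the infrastructure can support.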

Are you a current 80legs user? Here is what these changes mean for you:

Faster crawl speeds – New technology will increase crawl speeds across the board.

Fewer stuck crawls – A more maintainable API will eliminate a category of bugs that often cause stuck crawls.

Improved customer support – Admin tools will enable our team to better diagnose and resolve performance problems.

Enhanced Postman collections – Custom collections will be available and include your documentation, user information, and access to endpoints.

Support for query parameters in API calls – Filter capabilities will allow you to narrow down extensive API results to retrieve the exact information you need.

Useful error messages – Messages will provide actionable information regarding the error you received.
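As a rough illustration of how query-parameter filtering typically works from the client side, the sketch below builds a filtered request URL. The base URL and the parameter names (`status`, `limit`, `name`) are placeholders invented for this example, not documented 80legs endpoints or filters. Consult the official documentation for the real ones.

```python
from urllib.parse import urlencode


def build_filtered_url(base_url, filters):
    """Append filters to base_url as a query string.

    Filters whose value is None are skipped, so callers can pass
    optional filters without special-casing them.
    """
    params = {key: value for key, value in filters.items() if value is not None}
    if not params:
        return base_url
    # urlencode percent-escapes values, so spaces and symbols are safe.
    return f"{base_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical endpoint and filter names, for illustration only.
    url = build_filtered_url(
        "https://api.example.com/v2/crawls",
        {"status": "COMPLETED", "limit": 10, "name": None},
    )
    print(url)
```

The resulting URL can be passed to any HTTP client or pasted into a Postman request, which makes it easy to try filters before wiring them into an integration.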

Any current integrations with the 80legs API should remain functional without any changes, but users may want to take advantage of some of the new functionality.

“We hope this will be a big quality improvement for everyone who interacts with 80legs,” said Rushefsky.

Detailed documentation and resources to guide developers and non-developers in interacting with the unique portal are available here.

If you have any questions, you can contact your customer success representative or email our team at support@datafiniti.co.

We provide instant access to business, people, product, and property data sourced from thousands of websites.