How can I improve robot performance?

A single robot execution runs from start to finish in a single processing thread, so it clicks one button at a time and visits one page at a time.

To increase efficiency, you can split your robot into two: one that collects all the URLs of the pages to visit, and one that takes those URLs as input. This allows the second robot to visit pages concurrently (up to your account's concurrency limit), which can significantly reduce total execution time.

Be aware that concurrency should be increased with care and respect for the target site, so that you do not interrupt its services or place excessive load on it. For smaller sites, stay within a maximum of 10 concurrent robots. For sites that handle larger amounts of traffic, you can probably go a bit higher. Always read the terms and policies of the site you're scraping to make sure you comply with them.
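To illustrate the two-robot pattern and the concurrency cap described above, here is a minimal Python sketch. It is not the platform's API: `collect_urls`, `visit_page`, `run_pipeline` and the `example.com` URLs are hypothetical stand-ins, and the default limit of 5 is only an assumed value that you would tune to the target site.

```python
from concurrent.futures import ThreadPoolExecutor


def collect_urls():
    """Hypothetical stand-in for the first robot: returns the URLs to visit."""
    return [f"https://example.com/item/{i}" for i in range(1, 6)]


def visit_page(url):
    """Hypothetical stand-in for the second robot: handles a single URL."""
    # A real robot would load the page and extract data here.
    return {"url": url, "status": "visited"}


def run_pipeline(max_concurrency=5):
    """Visit every collected URL, with at most max_concurrency in flight."""
    urls = collect_urls()
    # max_workers caps how many pages are processed at the same time.
    with ThreadPoolExecutor(max_workers=max_concurrency) as pool:
        # pool.map returns results in the same order as the input URLs.
        return list(pool.map(visit_page, urls))


if __name__ == "__main__":
    for result in run_pipeline():
        print(result["url"], result["status"])
```

Lowering `max_concurrency` is the simplest way to reduce the load you place on a small site.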

How do I disable images, stylesheets and JavaScript?

You cannot globally disable images, stylesheets or JavaScript with a single click, but you can block specific network requests, which will speed up load time.

How to block or ignore network requests

Blocking network requests for certain unneeded elements can improve robot performance. When using the robot editor, click the Network tab to get an overview of the network traffic involved in a page request. Click the URL icon to mark scripts and elements for blocking or ignoring. You can block or ignore specific URLs, file types, or entire domains.
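If it helps to reason about which requests are worth blocking, the matching logic can be sketched in a few lines of Python. This is only an illustration of filtering by file type and domain, not how the robot editor implements it; `should_block`, `BLOCKED_EXTENSIONS` and `BLOCKED_DOMAINS` are hypothetical names, and the listed values are example choices.

```python
from urllib.parse import urlparse

# Example rules only (assumptions): file types and domains you might block.
BLOCKED_EXTENSIONS = (".png", ".jpg", ".gif", ".css", ".woff2")
BLOCKED_DOMAINS = {"google-analytics.com"}


def should_block(url):
    """Return True if the URL matches a blocked domain or file type."""
    parsed = urlparse(url)
    host = (parsed.hostname or "").lower()
    path = parsed.path.lower()  # the path excludes any query string

    # Match the domain itself or any of its subdomains.
    domain_blocked = any(
        host == domain or host.endswith("." + domain)
        for domain in BLOCKED_DOMAINS
    )
    # str.endswith accepts a tuple of suffixes.
    return domain_blocked or path.endswith(BLOCKED_EXTENSIONS)


if __name__ == "__main__":
    for url in (
        "https://www.google-analytics.com/analytics.js",
        "https://example.com/static/logo.png?v=2",
        "https://example.com/products",
    ):
        print(url, should_block(url))
```

Blocking by domain is the broadest rule, so it is worth checking that the page still renders the data you need after adding one.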

By default, we block Google Analytics and other tracking scripts, because we don't want to skew the analytics data of any website we scrape.
