
【Smart Mode】How to configure the scraping task

2018-10-15 19:29:38

Abstract: This tutorial shows you how to configure a scraping task.

Smart Mode automatically identifies the data on a web page. Once recognition is complete, you need to configure the scraping task: click “Settings” in the lower right corner to open the task settings box.

The settings are divided into running settings and anti-blocking settings, as shown in the following figure:

1. Running Settings
(1) When encountering data that has already been scraped
Scrape again: scrape all the data, regardless of whether it has been scraped before. “Scrape again” is selected by default.
Skip and continue: when data that has already been scraped is encountered, skip it and continue scraping new data.
Stop scraping: when data that has already been scraped is encountered, stop scraping and end the task.
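
The three options above can be read as a policy applied to every record that has been seen before. The Python sketch below is only an illustration, not the tool's actual implementation: the names `DuplicatePolicy` and `scrape`, and the idea of recognising already-scraped data by a key, are assumptions made for the example.

```python
from enum import Enum


class DuplicatePolicy(Enum):
    SCRAPE_AGAIN = "scrape_again"            # default: keep every record
    SKIP_AND_CONTINUE = "skip_and_continue"  # drop known records, keep going
    STOP_SCRAPING = "stop_scraping"          # end the task at the first known record


def scrape(records, seen_keys, policy=DuplicatePolicy.SCRAPE_AGAIN):
    """Return the records that would be kept under the given policy.

    records   -- iterable of (key, data) pairs in scraping order
    seen_keys -- set of keys scraped in earlier runs (hypothetical store)
    """
    kept = []
    for key, data in records:
        if key in seen_keys:
            if policy is DuplicatePolicy.SKIP_AND_CONTINUE:
                continue  # ignore this record and move on to the next one
            if policy is DuplicatePolicy.STOP_SCRAPING:
                break     # end the whole task here
        kept.append((key, data))
    return kept
```

With `seen_keys = {"b"}` and records keyed `a`, `b`, `c`, the default policy keeps all three, "skip and continue" keeps `a` and `c`, and "stop scraping" keeps only `a`.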

(2) Request waiting time
Some pages are slow to open, which can affect extraction. Setting a waiting time gives such pages time to load and can improve the quality of the extracted data. The default waiting time is 1 second, and users can change it as needed.
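
As a rough illustration of what a request waiting time does, the Python sketch below pauses for a fixed delay after each page is opened, before the page is handed on for extraction. The names `load_pages`, `open_page` and `wait_seconds` are hypothetical, and the 1-second default simply mirrors the default described above.

```python
import time


def load_pages(urls, open_page, wait_seconds=1.0, sleep=time.sleep):
    """Open each URL, then wait before treating the page as ready.

    open_page -- caller-supplied function that starts loading a page and
                 returns a handle to it (an assumption for this sketch)
    sleep     -- injectable so the delay can be replaced in tests
    """
    pages = []
    for url in urls:
        page = open_page(url)
        sleep(wait_seconds)  # give slow pages time to finish loading
        pages.append(page)
    return pages
```

A longer wait makes each page more likely to be fully loaded, at the cost of a slower task overall.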

(3) Block Images
In general, blocking images improves scraping speed. However, do not use this function if the web page being scraped requires a verification code: the verification code would not be displayed, and the data could not be scraped.

(4) Block Ads
This feature can improve scraping speed, but because ads are identified by an algorithm, it may also block content that is not an advertisement. Use this function with caution.

2. Anti-blocking Settings
Some websites use blocking measures that prevent data from being scraped properly. In that case, the following anti-blocking functions can be enabled to improve the scraping results.
(1) Switch browser regularly
Switching the browser version at regular intervals helps avoid being blocked. You can freely choose the switching cycle.

(2) Clear cookies regularly
Clearing cookies at regular intervals helps avoid being blocked. You can freely choose the clearing cycle.
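
To illustrate both anti-blocking options, the following Python sketch switches to the next browser identity and empties a cookie store each time a chosen interval has passed. It is a simplified model, not a description of how the tool works internally: the class `RotatingSession`, the placeholder browser labels, and the intervals counted in requests are all assumptions (the tool's cycle may well be time-based).

```python
from itertools import cycle


class RotatingSession:
    """Toy session that rotates its browser label and clears cookies periodically."""

    def __init__(self, browsers, switch_every, clear_cookies_every):
        self._browsers = cycle(browsers)   # endless round-robin over the labels
        self.browser = next(self._browsers)
        self.switch_every = switch_every
        self.clear_cookies_every = clear_cookies_every
        self.cookies = {}
        self.requests_made = 0

    def before_request(self):
        """Apply any rotation that is due, then count the request."""
        n = self.requests_made
        if n and n % self.switch_every == 0:
            self.browser = next(self._browsers)  # switch browser identity
        if n and n % self.clear_cookies_every == 0:
            self.cookies.clear()                 # start with a clean cookie store
        self.requests_made += 1
```

Calling `before_request()` ahead of every page request keeps the two cycles independent, so the browser can be switched every 2 requests while cookies are cleared every 3, for example.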