
【Flowchart Mode】How to collect web data in reverse order

2023-04-21 09:23:35

Abstract: When collecting data, it is often necessary to collect in reverse order (from the last page to the first). This article shows how to use ScrapeStorm's flowchart mode to collect web page data in reverse order.

When collecting data, it is often necessary to collect in reverse order (collecting data from the last page to the first). This article will show you how to use ScrapeStorm’s flowchart mode to collect web page data in reverse order.

Case 1: After the list page is turned, the link changes, and a link to the last page exists.

Processing method 1: Use the last page link of the list page as the collection link.

When the link to the last page of the list is directly available, copy it and use it as the collection link for a new task.

1. Go to the last page in the browser and copy its link.

2. Create a flowchart mode task.

3. After flowchart mode detects the list, the software asks whether to detect the next page button. Following the operation prompt, manually click the “Previous” button so that the task pages backward.

4. Start the task to begin collecting in reverse order.
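
The steps above amount to starting at the last page and following the “Previous” link until none is left. A minimal sketch of that loop, outside ScrapeStorm, might look like the following. The `collect_in_reverse` function, the `Page` shape, and the `example.com` URLs are all assumptions made for illustration; the site is simulated in memory so the sketch runs without network access.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A page is modelled as (rows on the page, URL of the previous page or None).
Page = Tuple[List[str], Optional[str]]


def collect_in_reverse(last_page_url: str, fetch: Callable[[str], Page]) -> List[str]:
    """Start at the last page and follow "Previous" links until none is left."""
    rows: List[str] = []
    seen = set()
    url: Optional[str] = last_page_url
    while url is not None and url not in seen:
        seen.add(url)  # guard against a "Previous" link that loops back
        page_rows, previous_url = fetch(url)
        rows.extend(page_rows)
        url = previous_url
    return rows


# Hypothetical three-page site held in memory (not a real website).
FAKE_SITE: Dict[str, Page] = {
    "https://example.com/list?page=1": (["a1", "a2"], None),
    "https://example.com/list?page=2": (["b1", "b2"], "https://example.com/list?page=1"),
    "https://example.com/list?page=3": (["c1"], "https://example.com/list?page=2"),
}

result = collect_in_reverse("https://example.com/list?page=3", FAKE_SITE.__getitem__)
print(result)
```

Note that pages are visited last to first, while the rows inside each page keep their on-page order.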

Processing method 2: Set reverse page numbers in batches

When the link changes as pages are turned but the page has no “Previous Page” button for paging backward, reverse-order collection can be achieved by setting the page numbers.

1. Copy the link to the second page.

Generally, the link of the first page may differ in format from the links of the second and third pages, so the pattern cannot be worked out from the first page alone. It is therefore recommended to copy the link of the second page to create the task.


2. Use the function of generating URLs in batches to generate links.

In the URL generator, set “Start” to the last page, “End” to the first page, and “Step” to “decrease”.

For more details, please refer to the tutorial: How to use URLs Generator

3. Once the URLs have been generated in batches, there is no need to set a page-turning button. Select “No, extract only the current webpage” in the operation prompt. If the page needs to be scrolled to display more data, it is recommended to set it to “Scroll to Load”.

4. Start the task to begin collecting in reverse order.
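
The batch generation in step 2 is essentially a countdown over page numbers substituted into the copied link. The sketch below illustrates the idea in plain Python. It is not ScrapeStorm's URL generator; the `generate_reverse_urls` function, the `{page}` placeholder, and the `example.com` link are assumptions. It also assumes the same link pattern holds for every page, so if the first page uses a different link, that link would need to be added separately.

```python
from typing import List


def generate_reverse_urls(template: str, last_page: int,
                          first_page: int = 1, step: int = 1) -> List[str]:
    """Build page URLs from last_page down to first_page, decreasing by step."""
    if step < 1:
        raise ValueError("step must be a positive integer")
    if last_page < first_page:
        raise ValueError("last_page must not be smaller than first_page")
    # range() with a negative step counts down; the stop value is exclusive.
    return [template.format(page=n) for n in range(last_page, first_page - 1, -step)]


# Hypothetical link pattern copied from the second page of a list.
urls = generate_reverse_urls("https://example.com/list?page={page}", last_page=5)
for url in urls:
    print(url)
```

Because every page already has its own link, no page-turning logic is needed afterwards, which matches step 3.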

Case 2: After the list page is turned, the link remains unchanged, and there is no link to the last page.

Processing method 1: There is a button to jump to the last page on the web page.

When the link does not change as pages are turned and the link to the last page cannot be obtained directly, click the last-page button on the web page to jump to the last page, then collect in reverse order from there.

1. Create a flowchart mode task.

2. Add a “Click” component to turn the page to the last page.

3. After the list is detected, the software asks whether to detect the next page button. Following the operation prompt, manually click the “Previous” button so that the task pages backward.

4. Start the task to begin collecting in reverse order.

Processing method 2: There is a page number input box on the web page

When the link does not change as pages are turned and the link to the last page cannot be obtained directly, enter the number of the last page in the page number input box to jump there, then collect in reverse order.

1. Create a flowchart mode task.

2. Add an “Input” component and a “Click” component to jump to the last page.

3. After the list is detected, the software asks whether to detect the next page button. Following the operation prompt, manually click the “Previous” button so that the task pages backward.

4. Start the task to begin collecting in reverse order.
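
Both processing methods in Case 2 follow the same pattern: since the link never changes, the task first performs an action on the page to reach the last page (a click on a last-page button, or a page number typed into an input box), then repeatedly clicks “Previous”. The sketch below models this with an in-memory pager. The `FakePager` class and its method names are hypothetical stand-ins for a real page and do not correspond to ScrapeStorm components.

```python
from typing import List


class FakePager:
    """In-memory stand-in for a list whose link stays the same when paging."""

    def __init__(self, pages: List[List[str]]) -> None:
        self._pages = pages
        self.current = 1  # 1-based page number, as typed into an input box

    def go_to(self, page_number: int) -> None:
        """Simulates entering a page number and clicking the jump button."""
        if not 1 <= page_number <= len(self._pages):
            raise ValueError("page number out of range")
        self.current = page_number

    def go_to_last(self) -> None:
        """Simulates clicking a last-page button."""
        self.go_to(len(self._pages))

    def has_previous(self) -> bool:
        return self.current > 1

    def click_previous(self) -> None:
        if not self.has_previous():
            raise RuntimeError("already on the first page")
        self.current -= 1

    def rows(self) -> List[str]:
        return list(self._pages[self.current - 1])


def collect_from_last_page(pager: FakePager) -> List[str]:
    """Jump to the last page, then page backward with "Previous"."""
    pager.go_to_last()
    collected = pager.rows()
    while pager.has_previous():
        pager.click_previous()
        collected.extend(pager.rows())
    return collected


pager = FakePager([["a"], ["b", "c"], ["d"]])
print(collect_from_last_page(pager))
```

The only difference between the two methods is how the jump to the last page is triggered; the backward paging loop is identical.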
