【Flowchart Mode】Basic operational procedures
Abstract: This tutorial demonstrates the basic operational procedures of Flowchart Mode.
1. Enter the correct URL
Copy the URL of the page you want to scrape from your browser, open Flowchart Mode in ScrapeStorm, and paste the URL to create a new scraping task.
Click here to learn more about how to enter the correct URL.
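Before pasting, it can help to check that the copied text is a complete web address. The following is a minimal sketch, assuming that "correct" here means an http or https URL with a host name. The function name looks_like_web_url is hypothetical, and this is not how ScrapeStorm itself validates input.

```python
from urllib.parse import urlparse


def looks_like_web_url(text: str) -> bool:
    """Return True if text parses as an http(s) URL that has a host part."""
    parsed = urlparse(text.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.netloc)


# A full address passes; one copied without its scheme does not.
print(looks_like_web_url("https://example.com/list?page=1"))
print(looks_like_web_url("example.com/list"))
```

A URL that fails this check is usually missing the leading http:// or https:// part.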
2. Scrape web pages that require login
During data scraping, you will sometimes encounter web pages whose content is only visible after logging in. In that case, use the pre-login function to log in to the web page first, and then scrape the data as usual.
Click here to learn more about how to log in to the web page.
3. How to use the components
The ScrapeStorm team has encapsulated the complex coding work behind scraping rules into visual components. In Flowchart Mode, components are divided into behavior components and flow components. Components are the basic building blocks of a flowchart scraping task.
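One way to picture a flowchart task is as a small tree of components. The sketch below is an illustration only: the classes Kind and Component, the component names "Loop" and "Open Page", and the reading that flow components control execution order while behavior components act on the page are all assumptions, not ScrapeStorm's internal model. Only "Extract Data" is a component named in this tutorial.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Kind(Enum):
    BEHAVIOR = "behavior"  # assumed: performs an action on the page
    FLOW = "flow"          # assumed: controls how other components run


@dataclass
class Component:
    name: str
    kind: Kind
    children: List["Component"] = field(default_factory=list)


def outline(component: Component, depth: int = 0) -> List[str]:
    """Return one indented line per component, parents before children."""
    lines = ["  " * depth + f"{component.name} [{component.kind.value}]"]
    for child in component.children:
        lines.extend(outline(child, depth + 1))
    return lines


# Hypothetical task: a flow component wrapping two behavior components.
task = Component("Loop", Kind.FLOW, [
    Component("Open Page", Kind.BEHAVIOR),
    Component("Extract Data", Kind.BEHAVIOR),
])

print("\n".join(outline(task)))
```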
4. Set the scraping task component
The user can set up the scraping task components either by clicking elements on the page with system assistance or by dragging components manually. Different scraping tasks require different components.
Click here for more application scenarios for scraping tasks.
5. Set the extraction field
After setting up the scraping task components, the user can set the fields to extract in the Extract Data component.
Click here to learn more about setting up the extracted fields.
6. Configure the scraping task
After the extraction fields are set, the user can configure the scraping task, either by keeping the system default settings or by customizing them.
Click here to learn more about how to configure the scraping task.
7. Scheduled job
The scheduled job function is an advanced setting of the scraping task. It starts and stops the scraping task at fixed times within the period set by the user.
If you have set up a scheduled job, keep the software running at all times; it must not be closed.
Click here for more information on scheduled jobs.
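The start and stop behavior can be thought of as a time-window check. The sketch below assumes a single daily window with a start and a stop time; the function name should_run is hypothetical and the real scheduler's options may differ.

```python
from datetime import datetime, time


def should_run(now: datetime, start: time, stop: time) -> bool:
    """Return True if now falls inside the daily window [start, stop)."""
    current = now.time()
    if start <= stop:
        return start <= current < stop
    # The window crosses midnight, e.g. 22:00 to 06:00.
    return current >= start or current < stop


# Inside a 09:00 to 17:00 window at 09:30.
print(should_run(datetime(2024, 1, 1, 9, 30), time(9, 0), time(17, 0)))
```

A check like this only works while the program doing the checking is running, which is why the software has to stay open.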
8. Sync to the database
The sync to the database function is an advanced setting of the scraping task. It automatically publishes the scraped results to the database while the task is still running, so there is no need to wait until the task ends to export the data.
Combined with the scheduled job function, syncing to the database can save time and improve work efficiency.
Click here to learn more about syncing to the database.
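The idea of writing each record as it is scraped, rather than exporting everything at the end, can be sketched with Python's built-in sqlite3 module. The table name results, the fields title and price, and the generator scrape_rows are placeholders for illustration; ScrapeStorm's own database settings are configured in the application, not through code like this.

```python
import sqlite3


def scrape_rows():
    """Stand-in for a running scraping task; yields one record at a time."""
    yield {"title": "Item A", "price": "10"}
    yield {"title": "Item B", "price": "12"}


def sync_rows(conn, rows):
    """Insert and commit each row as it arrives; return the number synced."""
    conn.execute("CREATE TABLE IF NOT EXISTS results (title TEXT, price TEXT)")
    count = 0
    for row in rows:
        conn.execute(
            "INSERT INTO results (title, price) VALUES (:title, :price)", row
        )
        conn.commit()  # the record is stored before the next one is scraped
        count += 1
    return count


conn = sqlite3.connect(":memory:")  # in-memory database, for the example only
synced = sync_rows(conn, scrape_rows())
print(synced)
```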
9. Download images
To save images from a web page to your local machine, use the download image feature.
Click here to learn more about how to download images to your local machine.
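Saving an image locally involves mapping its URL to a file path. The sketch below shows only that mapping and downloads nothing. The function name local_image_path, the fallback name "image", and the folder "downloads" are assumptions for illustration, not ScrapeStorm's naming rules.

```python
import posixpath
from pathlib import Path
from urllib.parse import urlparse


def local_image_path(image_url: str, folder: str) -> Path:
    """Build a local file path from the last segment of the image URL's path."""
    name = posixpath.basename(urlparse(image_url).path) or "image"
    return Path(folder) / name


# The query string is ignored; only the path's last segment is used.
print(local_image_path("https://example.com/img/cat.jpg?size=large", "downloads"))
```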
10. View the extraction results and export data
After the task is set up and run, the user can view the extraction results and export the data.
Click here for more ways to view the results of the extraction and export the data.
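For readers who process exported data further, the shape of a typical CSV export can be sketched with Python's csv module. The field names title and price and the function export_csv are illustrative assumptions; the export formats ScrapeStorm offers are chosen in the application itself.

```python
import csv
import io


def export_csv(rows, fieldnames):
    """Serialize a list of dict records to CSV text with a header row."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


rows = [{"title": "Item A", "price": "10"}, {"title": "Item B", "price": "12"}]
print(export_csv(rows, ["title", "price"]))
```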