
【Smart Mode】How to set up Page Type | Web Scraping Tool | ScrapeStorm

2019-12-12 20:00:38

Abstract: This article shows you how to set the Page Type in Smart Mode.

In Smart Mode, the default Page Type is List Page.

If the URL you enter is a Detail Page, the identified page type will therefore be incorrect. Even if the URL you enter is a List Page, identification can still fail for other reasons, such as slow page loading.

When the Page Type is incorrect, we need to set it manually.

For an introduction to Detail Pages and List Pages, please refer to the following tutorials:

What is a Detail Page? How to scrape it?

What is a List Page? How to scrape it?

The settings menu for Page Type is shown below:

If it is a Detail Page, you can choose “Detail Page” directly.

If it is a List Page, you can click “Auto Detect” and the software will try to identify the list again.
Each element in the list is outlined with a green border on the page, and each field within a list element is outlined with a red border.
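
To make the relationship between list elements and fields concrete, here is a minimal Python sketch using only the standard library. It is not ScrapeStorm's code. The sample markup, the `LIST_XPATH` and `FIELD_XPATHS` values, and the `extract_rows` helper are all hypothetical and only illustrate the idea: one XPath matches every list element (the green borders), and each field XPath is evaluated relative to a single list element (the red borders).

```python
import xml.etree.ElementTree as ET

# Hypothetical, well-formed sample markup standing in for a List Page.
PAGE = """
<html><body>
  <ul id="products">
    <li><a href="/item/1">Red Kettle</a><span class="price">19.90</span></li>
    <li><a href="/item/2">Blue Teapot</a><span class="price">24.50</span></li>
  </ul>
</body></html>
"""

# Assumed XPaths for this sample only.
LIST_XPATH = ".//ul[@id='products']/li"  # one match per list element
FIELD_XPATHS = {                         # evaluated relative to each element
    "title": "./a",
    "price": "./span[@class='price']",
}


def extract_rows(markup, list_xpath, field_xpaths):
    """Return one dict per list element, with one key per field."""
    root = ET.fromstring(markup)
    rows = []
    for element in root.findall(list_xpath):
        row = {}
        for name, xpath in field_xpaths.items():
            node = element.find(xpath)
            # A missing field becomes None instead of raising an error.
            row[name] = node.text if node is not None else None
        rows.append(row)
    return rows


rows = extract_rows(PAGE, LIST_XPATH, FIELD_XPATHS)
for row in rows:
    print(row)
```

If the list XPath matches the wrong container, every row is wrong, which is why correcting the list identification comes before adjusting individual fields.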

If the result of “Auto Detect” does not meet your requirements, you can modify it with “Select in Page” or “Edit XPath”.

The operation steps of “Select in Page” are as follows:
Step 1: Click on the “Select in Page” option
Step 2: Click on the first element of the first line of the list
Step 3: Click on the first element of the second line of the list

P.S. In the figure above, we have made two changes to the list. The first is to change the recognition result to the list on the left, and the second is to change the recognition result to the list on the right.
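
One way to see why two clicks are enough for “Select in Page” is that two sample elements from neighboring rows usually differ in a single indexed step of their path. The Python sketch below illustrates that idea only. How ScrapeStorm actually derives the list is not documented here, so the `generalize` function and the sample paths are hypothetical assumptions.

```python
import re


def generalize(first_path, second_path):
    """Derive (list_xpath, field_xpath) from two absolute sample paths.

    Assumption: the two clicked elements sit at the same depth and their
    paths differ only in the positional index of one step, e.g. li[1] vs li[2].
    """
    first = first_path.strip("/").split("/")
    second = second_path.strip("/").split("/")
    if len(first) != len(second):
        raise ValueError("the two paths must have the same depth")

    differing = [i for i, (a, b) in enumerate(zip(first, second)) if a != b]
    if len(differing) != 1:
        raise ValueError("the paths must differ in exactly one step")

    i = differing[0]
    # Drop the positional index so the step matches every row.
    tag_a = re.sub(r"\[\d+\]$", "", first[i])
    tag_b = re.sub(r"\[\d+\]$", "", second[i])
    if tag_a != tag_b:
        raise ValueError("the differing step must use the same tag")

    list_xpath = "/" + "/".join(first[:i] + [tag_a])
    # Whatever follows the row step is the field path relative to a row.
    field_xpath = "./" + "/".join(first[i + 1:]) if first[i + 1:] else "."
    return list_xpath, field_xpath


list_xpath, field_xpath = generalize(
    "/html/body/div[2]/ul/li[1]/a",  # first element of the first row
    "/html/body/div[2]/ul/li[2]/a",  # first element of the second row
)
print(list_xpath)   # /html/body/div[2]/ul/li
print(field_xpath)  # ./a
```

Clicking two elements from the same row, or from two different lists, would not produce a single differing step, which is why the steps ask for the first element of the first and second rows.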

The settings for editing XPath are as follows:
