How to Scroll a Webpage from Top to Bottom

Scraping page:

https://www.pulte.com/homes/florida/orlando  

Manual Page Scrolling:

To interact with content that loads dynamically after the initial page load, you can manually scroll the page and then use the "Parse All Content" button. This action will refresh the content loaded into the scraper, ensuring that newly appeared elements are captured for further processing.

  • Dynamic Content: Content that loads after the initial page load is often referred to as "dynamic content."

  • Manual Scrolling: By manually scrolling the page, you trigger the browser to load additional content.

  • "Parse All Content" Button: Clicking this button refreshes the scraper's view of the page, incorporating the newly loaded content.

The steps below explain how to scroll to the end of a webpage automatically using the "scrollExpandingContent" event in Sequentum Cloud.

Using the scrollExpandingContent Event:

  1. Add the "Action" Command: Locate the command after which you want to perform the scroll. Under the "Actions" section, add an "Action" command.

  2. Set the XPath: Set the XPath of the node that scrolls. Click on the "Selection" tab and enter the XPath "/html", because on this page the whole document scrolls rather than a particular section of it. Then click on the "Save Selection" button to save the XPath.

  3. Configure the Action: Open the "Options" tab located in the top right corner. Navigate to the "Action" section.

  4. Customize Scripting: Uncheck the "Use default action script" option. Replace the existing script with the following:

element.scrollExpandingContent(200, 500, 600);

For detailed information on the "scrollExpandingContent" event, please refer to the Sequentum Cloud Support documentation: [https://cloudsupport.sequentum.com/scfaq/how-to-use-specific-features-and-commands-of-se-in]

  5. Adjust Browser Timeouts: If the page loads content dynamically while scrolling, navigate to the "Browser" options. Increase the "Discover activity timeout" as needed so that the scrolling operation has enough time to complete.

  6. Save and Test: Click on "Save" to save your changes. Click the "Play" button to test the scrolling.
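
The general idea behind scrolling expanding content can be sketched in plain JavaScript. This is only an illustration of the scroll-until-nothing-new-loads technique, not Sequentum Cloud's implementation of "scrollExpandingContent". The names makeFakePage and scrollToEnd, and all of their parameters, are hypothetical and were invented for this sketch. The fake page stands in for a real browser so the example runs in Node.js; a real page would also need a pause after each step to give new content time to load.

```javascript
// Hypothetical stand-in for a page that lazy-loads more content at the bottom.
// None of these names come from Sequentum Cloud; they exist only for this sketch.
function makeFakePage({ viewport, initialHeight, batchHeight, batches }) {
  let height = initialHeight;
  let scrollTop = 0;
  let remaining = batches;
  return {
    getHeight: () => height,
    getScrollTop: () => scrollTop,
    scrollBy(px) {
      // The scroll position cannot go past the last full viewport.
      scrollTop = Math.min(scrollTop + px, height - viewport);
      // Reaching the bottom loads one more batch, if any are left.
      if (scrollTop >= height - viewport && remaining > 0) {
        height += batchHeight;
        remaining -= 1;
      }
    },
  };
}

// Scroll in fixed steps until the page is at the bottom and its height
// did not grow during the last step, or until maxSteps is reached.
function scrollToEnd(page, viewport, stepPx, maxSteps) {
  let steps = 0;
  while (steps < maxSteps) {
    const before = page.getHeight();
    page.scrollBy(stepPx);
    steps += 1;
    const atBottom = page.getScrollTop() >= page.getHeight() - viewport;
    const grew = page.getHeight() > before;
    if (atBottom && !grew) break;
  }
  return { steps, finalHeight: page.getHeight() };
}

// Usage: a 1000px page that loads two extra 800px batches.
const page = makeFakePage({ viewport: 600, initialHeight: 1000, batchHeight: 800, batches: 2 });
const result = scrollToEnd(page, 600, 200, 100);
console.log(result); // { steps: 10, finalHeight: 2600 }
```

The maxSteps cap is a safety net for pages that never stop loading. Increasing the "Discover activity timeout" in the agent serves a related purpose, since it gives each scroll enough time to bring in new content before the agent moves on.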

Sample Agent Text:

  1. Open a New Agent.

  2. Copy the given Sample Agent text.

Agent:

Proxies:

Proxy Pool: Sequentum Data Center

Commands: URL

Dynamic Load:

Timeouts:

Discover Activity: 4

Input: https://www.pulte.com/homes/florida/orlando

Commands: Action

Name: Click To Show All Action

Select: //div[@class='ProductSummary__btn-show-more-container']/div/strong

Wait:

Search All Frames: Yes

Wait For Change: true

Commands: Action

Name: scroll

Action Script: element.scrollExpandingContent(200,500,600)

Select: /html

Wait:

Search All Frames: Yes

Wait For Change: true

Dynamic Load: Wait For All Content

Timeouts:

Discover Activity: 3

Discover First Activity: 1

Commands: Page Area List

Name: Section List

Select: //div[@class='ProductSummary__communityContainer']/div

Commands:

  • Content: name

Extract: //a[@class='experience-modal-button']

  • Content: address

Extract: //p[@class='ProductSummary__address']

Export:

Commands: CSV

  3. Click on the "Text" tab.

  4. Press "Ctrl + A" to select all content, then press "Ctrl + V" to replace it with the previously copied Sample Agent text.

  5. Click on the "Save Configuration" button.

  6. The sample agent is now available on your system to run or debug.

This will create a sample agent with the scrolling functionality pre-configured, allowing you to test and adapt it for your specific needs.

Additionally, to simplify the process of scrolling to the end of the page, we are currently working on implementing a user-friendly option within the UI. This will eliminate the need for manual scripting and provide a more intuitive solution.
