Recursive download configuration

Site Snapshot can recursively follow HTML links.

The SST can identify and follow HTML links such as “a/href” and “img/src” from your Primary Site URL(s) and include the associated content (for example, pages, images, and stylesheets) in the snapshot. Choose one of the following recursion options:

Download configuration options

  • Only the URLs listed: Only the URLs specified as your Primary Site URLs are included. The snapshot contains just those specific files, with no associated images, stylesheets, or other linked content.
  • Recursive up to specified levels: Use the associated fields to define how many levels of the link tree the snapshot should follow. For example, suppose your Primary URL links to a secondary page that links to a tertiary page, and that tertiary page links to three more pages. Entering a Level of 2 includes the Primary, secondary, and tertiary pages; the three additional pages linked from the tertiary page are not included. Additional settings are available when this option is selected:
    • Convert Links: With a specific depth level established, this option rewrites links on downloaded pages so that they point to the corresponding page within the failover site generated by the snapshot. This maintains link integrity between all downloaded pages in the link tree. Without it, clicking a link on a downloaded page attempts to reach the target page on the origin site rather than in the failover site, which can result in an error if your origin site is unavailable.
    • Page Requisites: Select this option to include all associated files in the snapshot (for example, images and stylesheets).
    • Accepted Domain List: Specify the domains that recursion is allowed to follow. For multiple domains, enter a comma-separated list.
  • Recursive all levels: With this option selected, the snapshot will recurse all levels and include all files.
    • Accepted Domain List: Specify the domains that recursion is allowed to follow. For multiple domains, enter a comma-separated list.
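
The way the level setting and the Accepted Domain List interact can be illustrated with a small sketch. This is not the SST implementation; it is a minimal breadth-first walk over a hypothetical in-memory link graph (the LINKS table, the crawl function, and the example.com URLs are all assumptions made for illustration). It also assumes that an accepted domain must match a link's host name exactly, which may differ from how the SST matches domains.

```python
from collections import deque
from urllib.parse import urlparse

# Hypothetical link graph: page URL -> URLs that page links to.
LINKS = {
    "https://www.example.com/": ["https://www.example.com/a"],
    "https://www.example.com/a": [
        "https://www.example.com/b",
        "https://cdn.example.net/x",
    ],
    "https://www.example.com/b": [
        "https://www.example.com/c1",
        "https://www.example.com/c2",
        "https://www.example.com/c3",
    ],
}


def crawl(primary_urls, max_level=None, accepted_domains=None):
    """Return the set of URLs a level-limited walk would include.

    max_level=0 behaves like "only the URLs listed";
    max_level=None behaves like "all levels".
    accepted_domains=None means no domain filtering (an assumption).
    """
    seen = set(primary_urls)
    queue = deque((url, 0) for url in primary_urls)
    while queue:
        url, level = queue.popleft()
        # Pages already at the level limit are kept but not expanded.
        if max_level is not None and level >= max_level:
            continue
        for target in LINKS.get(url, []):
            host = urlparse(target).hostname
            # Skip links whose host is not on the accept list.
            if accepted_domains is not None and host not in accepted_domains:
                continue
            if target not in seen:
                seen.add(target)
                queue.append((target, level + 1))
    return seen


included = crawl(
    ["https://www.example.com/"],
    max_level=2,
    accepted_domains={"www.example.com"},
)
print(sorted(included))
```

With a level of 2 and www.example.com as the only accepted domain, the walk keeps the primary page, /a, and /b. The three pages linked from /b fall beyond the level limit, and the cdn.example.net link is skipped by the domain filter.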

Recursion limits, input files, and cookies

The SST cannot recursively find and download links embedded in JavaScript, such as pop-ups or image links, and it cannot follow links generated by pull-down menus or mouse-overs. In addition, the SST does not download pages that are generated in response to a user interaction, such as filling out a form field. As a workaround, you can target these pages directly by adding each one as its own Primary Site URL when generating a snapshot.

Note: For a list of the tags and attributes followed during recursion, see the description of the sst command option, -r --recursive.
Note: Recursion does not parse JavaScript. If certain pages on your site are only accessible via a JavaScript-based link (or similar means), include these pages individually as Primary Site URLs.
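
To see why JavaScript-based links are missed, consider a rough sketch of tag-and-attribute link discovery using Python's standard html.parser module. This is not the SST's parser; the FOLLOWED set covers only the two pairs named in this section (the full list is in the sst -r documentation), and LinkCollector, find_links, and the sample PAGE markup are hypothetical names and data chosen for illustration.

```python
from html.parser import HTMLParser

# Assumed subset of followed tag/attribute pairs, taken from the
# examples in this section; the real list is longer.
FOLLOWED = {("a", "href"), ("img", "src")}


class LinkCollector(HTMLParser):
    """Collect attribute values for the followed tag/attribute pairs."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if (tag, name) in FOLLOWED and value:
                self.links.append(value)


def find_links(html):
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links


# Sample page: two plain HTML links and two JavaScript-only targets.
PAGE = """
<a href="/about.html">About</a>
<img src="/logo.png">
<button onclick="window.open('/popup.html')">Open</button>
<script>var next = "/generated.html";</script>
"""

print(find_links(PAGE))  # ['/about.html', '/logo.png']
```

The targets /popup.html and /generated.html exist only inside script text or an event-handler attribute, so a parser that looks at tag/attribute pairs never sees them as links. Listing such pages as their own Primary Site URLs makes them part of the snapshot regardless.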