Example - From other sites
URL for third party content
When you want to show content from another site you own or from a third party, the Content Source to use will be URL. You may have already seen this in use in Example - CSV Data, where a CSV file is pulled straight from an external source.
But such content doesn't always arrive nicely in a CSV file, so let's look at some other types of external content.
If it's web content you are looking at, there is a good chance it comes as HTML. After pulling the content using the URL content source, we have options for working with that HTML. You could just use the Pass Through content transform, but that could get messy, as it would also pass through all the page headers and boilerplate from the source.
In practice, you need to be a little more selective about which Content Transform you apply. The main options are:
- Selector - extract a section of the pulled (HTML) content
- Multi Selector - extract a multi-dimensional array using selectors within selectors, nested as deep as you want to go.
- Table From HTML - Multi Selector is complicated. If all you want is a table from the source, this can be a quick and easy solution.
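To make the Table From HTML idea concrete, here is a minimal Python sketch of pulling the first table out of a page as rows of cell text. It is not the add-on's implementation; the class name, function name and sample markup are made up for illustration, and nested tables are not handled.

```python
from html.parser import HTMLParser


class TableExtractor(HTMLParser):
    """Collect the cell text of the first <table> as a list of rows."""

    def __init__(self):
        super().__init__()
        self.rows = []           # finished rows, each a list of cell strings
        self._in_table = False   # True while inside the first table
        self._done = False       # True once the first table has closed
        self._row = None         # row currently being built
        self._cell = None        # text fragments of the current cell

    def handle_starttag(self, tag, attrs):
        if self._done:
            return               # ignore everything after the first table
        if tag == "table":
            self._in_table = True
        elif self._in_table and tag == "tr":
            self._row = []
        elif self._row is not None and tag in ("td", "th"):
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None
        elif tag == "table" and self._in_table:
            self._in_table = False
            self._done = True


def table_from_html(html):
    """Return the first table in the HTML as a list of rows of cell text."""
    parser = TableExtractor()
    parser.feed(html)
    parser.close()
    return parser.rows


# Made-up sample page: boilerplate, the table we want, and a second table.
SAMPLE = """
<html><body><header>Site banner</header>
<table>
  <tr><th>Name</th><th>Role</th></tr>
  <tr><td>Alice</td><td>Editor</td></tr>
</table>
<table><tr><td>ignored</td></tr></table>
</body></html>
"""

rows = table_from_html(SAMPLE)
print(rows)
```

The banner text and the second table are dropped, leaving a plain 2-dimensional array that a table display can consume.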
Here is some content pulled from the concrete5 documentation at https://documentation.concrete5.org/developers/toc.
This isn't particularly exciting content, but it serves to show that content can be pulled into this site and transformed in various ways.
You may want to open that page in another tab, with your browser's developer console open, to help follow what is being pulled.
To provide a basic overview, the Selector transform has been used. Settings:
- css selector: main>.container
- remove classes: container ccm-block-page-list-page-entry ccm-block-page-list-title
The CSS selector main>.container gets us to the first container of meaningful content. By specifying a direct child with ">", it gets past the banner above and selects the first section of meaningful content.
That would be usable as it is, but included here it would be spread well down the page, so some of the page list formatting classes have been added to the removed classes to make the content more basic and compact.
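The effect of these two settings can be sketched in Python. This is only an illustration, not the Selector transform's own code: the sample markup is a made-up, well-formed stand-in for the pulled page (it assumes the banner's container is nested one level deeper inside main), and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for the pulled page, not the real markup.
PAGE = """
<html><body>
  <main>
    <div class="banner"><div class="container">Banner</div></div>
    <div class="container">
      <h1>Installation</h1>
      <div class="ccm-block-page-list-page-entry">
        <div class="ccm-block-page-list-title"><a href="/a">System Requirements</a></div>
      </div>
    </div>
    <div class="container"><h1>Second section</h1></div>
  </main>
</body></html>
""".strip()

# The "remove classes" setting from the example.
REMOVE = {"container", "ccm-block-page-list-page-entry", "ccm-block-page-list-title"}


def has_class(element, name):
    return name in element.get("class", "").split()


def select_first_child(root, parent_tag, class_name):
    """Rough equivalent of the CSS selector 'parent_tag > .class_name', first match only."""
    for parent in root.iter(parent_tag):
        for child in parent:              # direct children only, like ">"
            if has_class(child, class_name):
                return child
    return None


def remove_classes(element, names):
    """Strip the named classes from the element and all of its descendants."""
    for node in element.iter():
        kept = [c for c in node.get("class", "").split() if c not in names]
        if kept:
            node.set("class", " ".join(kept))
        elif "class" in node.attrib:
            del node.attrib["class"]


root = ET.fromstring(PAGE)
section = select_first_child(root, "main", "container")
remove_classes(section, REMOVE)
snippet = ET.tostring(section, encoding="unicode")
print(snippet)
```

Because only direct children of main are considered, the banner's nested container is never matched, and the snippet that comes back carries none of the page list classes.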
So that you can easily see where the pulled content starts and ends, a concrete5 block design has been used to add an outline.
Installation & Introduction
To get at the data in a more structured way, we can use the Multi Selector transform. You may have already seen this in action in Example - RSS Feed. With Multi Selector, each level of selector lists the items within the level above, building a multi-dimensional array of data.
A selector list at the inner level enables both headings and links to be retrieved. The transformed data is an array of HTML (or XML) snippets.
- Container css selector - main
- Level 1 - .container
- Level 2 - h1,.ccm-block-page-list-title a
There is potentially a lot of data, and there is also the banner at the top of the page to remove. The transform is therefore set to slice 1,2, which skips the banner section and keeps just the next 2 sections.
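As a rough Python sketch of how these levels and the slice combine, consider the following. It is an illustration under stated assumptions, not the transform's own code: the markup is a made-up stand-in, "slice 1,2" is read as offset 1, length 2, and the sketch keeps only the text of each match rather than the HTML snippets the real transform returns.

```python
import xml.etree.ElementTree as ET

# Made-up stand-in for the container selected by "main".
PAGE = """<main>
  <div class="banner"><div class="container"><h1>Banner</h1></div></div>
  <div class="container">
    <h1>Section A</h1>
    <div class="ccm-block-page-list-title"><a href="/a1">Page A1</a></div>
    <div class="ccm-block-page-list-title"><a href="/a2">Page A2</a></div>
  </div>
  <div class="container">
    <h1>Section B</h1>
    <div class="ccm-block-page-list-title"><a href="/b1">Page B1</a></div>
  </div>
  <div class="container"><h1>Section C</h1></div>
</main>"""


def has_class(element, name):
    return name in element.get("class", "").split()


def level_two(node, inside_title=False, found=None):
    """Text of 'h1' and '.ccm-block-page-list-title a' matches, in document order."""
    found = [] if found is None else found
    for child in node:
        if child.tag == "h1" or (child.tag == "a" and inside_title):
            found.append((child.text or "").strip())
        # Links only count once we are below a title wrapper.
        below_title = inside_title or has_class(child, "ccm-block-page-list-title")
        level_two(child, below_title, found)
    return found


main = ET.fromstring(PAGE)

# Level 1: every .container below main, in document order (banner first).
sections = [el for el in main.iter() if el is not main and has_class(el, "container")]

# Slice 1,2: skip the banner (offset 1) and keep the next 2 sections.
offset, length = 1, 2
data = [level_two(section) for section in sections[offset:offset + length]]
print(data)
```

Each inner list starts with the section heading followed by its link titles, which is the shape the display examples below rely on.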
This data has only 2 dimensions, so it could go into a Table content display or, more flexibly, into a Multi Level List content display.
The Multi Level List settings used below are:
- Level 1 - Paragraphs + Capitalised First Child
- Level 2 - Unordered List, Unstyled + None (no heading). Filter 1 (first row).
Installation & Introduction
- System Requirements
- Installing concrete5
- Version History
- Versioning Numbering Guide
- Glossary of Terms
- Front-End Content – Pages, Areas, Blocks, Themes & Stacks
- Users, Groups & Authentication
- Permissions & Workflow
- Directory Structure
Building A Concrete5 Website
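The rendering above can be approximated with a short Python sketch. It assumes that "Capitalised First Child" means the first item of each row becomes a heading with each word capitalised, and that "Filter 1" drops that first item from the list beneath it; both readings are assumptions, and the function names and sample data are made up.

```python
def capitalise_words(text):
    """Upper-case the first letter of each word, leaving the rest untouched."""
    return " ".join(word[:1].upper() + word[1:] for word in text.split(" "))


def render_multi_level(data):
    """Render a 2-dimensional array as heading lines followed by '- ' list items."""
    lines = []
    for row in data:
        if not row:
            continue
        heading, *items = row            # first child is the heading, the rest are list items
        lines.append(capitalise_words(heading))
        lines.extend(f"- {item}" for item in items)
    return lines


# Made-up data in the shape produced by a two-level Multi Selector.
data = [
    ["first section", "Page one", "Page two"],
    ["second section", "Page three"],
]
lines = render_multi_level(data)
print("\n".join(lines))
```

Swapping the renderer while keeping the same array is the point of separating the content transform from the content display.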
The Multi Level List content display has many options. Here is the same data with the display set for a Table, Headings Left at Level 1.
|Installation & Introduction|
|Building A Concrete5 Website|