Universal Content Importer

Universal Content Importer (UCI) is a dashboard utility that uses Universal Content Puller (UCP) to pull content and files in bulk from any site. Corresponding pages are created within the ConcreteCMS site, with the pulled content added to blocks on those pages.

Where blocks cannot be fully added, UCI provides a comprehensive notification and review system to help you add the remainder manually.

Once installed, you will have:

  • UCI Dashboard pages - dashboard pages for configuring and running the import pipeline.
  • UCP content sources, transforms and displays - additional sources, transforms and displays for Universal Content Puller to facilitate the import and review process.
  • Notification block - a block added at the top of any page where fully automated creation of content was not possible or where assumptions have been made and require review.
  • Notifications attribute - a user attribute that any site editor can configure to limit the notifications they need to see.
  • Placeholder block - a block inserted into the content of a page where further action is required to add a block for pulled content. For example, where a form is encountered.
  • Review Navigation block - place this block in a global header and/or footer to navigate forwards/backwards between pulled pages or to open the original source in a separate browser tab.

Universal Content Importer uses DOM selectors to home in on the section(s) of the source site to import. Thus it requires a source site that has some regularity in the way its DOM is organised. This is a reasonably dependable assumption for sites that were generated by a CMS or other site generator, but may become less dependable for sites of hand-written HTML.
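To illustrate the idea of homing in on a region of a page by selector, here is a minimal Python sketch using only the standard library. It is not UCI's implementation (UCI runs inside ConcreteCMS); the class name RegionExtractor, the function extract_region and the single-class matching rule are all assumptions made for this example.

```python
from html.parser import HTMLParser

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


class RegionExtractor(HTMLParser):
    """Collect text inside the first element carrying a given class (hypothetical helper)."""

    def __init__(self, css_class):
        super().__init__(convert_charrefs=True)
        self.css_class = css_class
        self.depth = 0       # nesting depth inside the matched element
        self.done = False    # set once the matched element has closed
        self.chunks = []

    def handle_starttag(self, tag, attrs):
        if self.done or tag in VOID_TAGS:
            return
        if self.depth:
            self.depth += 1
            return
        classes = (dict(attrs).get("class") or "").split()
        if self.css_class in classes:
            self.depth = 1

    def handle_endtag(self, tag):
        if self.done or not self.depth or tag in VOID_TAGS:
            return
        self.depth -= 1
        if self.depth == 0:
            self.done = True

    def handle_data(self, data):
        if self.depth and not self.done:
            self.chunks.append(data)


def extract_region(html, css_class):
    """Return the non-empty text fragments found inside the selected region."""
    parser = RegionExtractor(css_class)
    parser.feed(html)
    parser.close()
    return [chunk.strip() for chunk in parser.chunks if chunk.strip()]
```

A sketch like this only works when every source page marks its main content the same way, which is the regularity the paragraph above depends on. Pages that do not contain the chosen class simply yield an empty list.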

Where blocks cannot be automatically generated, UCI places a Notification block summarising what review and completion actions are required at the top of the imported area. UCI then inserts Placeholder blocks into the page where blocks need to be added manually.
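The relationship between the Notification block and Placeholder blocks can be sketched roughly as follows. This is an illustrative Python model only, not UCI code: the Block type, the plan_blocks function, the fragment kinds and the SUPPORTED_KINDS set are all hypothetical.

```python
from dataclasses import dataclass

# Assumption: the fragment kinds this sketch can convert automatically.
SUPPORTED_KINDS = {"content", "image"}


@dataclass
class Block:
    block_type: str  # "content", "image", "placeholder" or "notification"
    body: str


def plan_blocks(fragments):
    """Turn (kind, body) fragments into an ordered list of blocks for one page."""
    blocks = []
    todo = []
    for kind, body in fragments:
        if kind in SUPPORTED_KINDS:
            blocks.append(Block(kind, body))
        else:
            # Mark the spot where an editor must add a block by hand.
            blocks.append(Block("placeholder", f"Add a {kind} block here"))
            todo.append(kind)
    if todo:
        # One summary at the top of the imported area lists what needs review.
        summary = "Manual review needed for: " + ", ".join(todo)
        blocks.insert(0, Block("notification", summary))
    return blocks
```

In this model a page whose fragments are all supported gets no notification at all, so an editor only needs to visit pages that carry one.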

UCI will not import a source site fully automatically, but if it can do 90% of the work for you and point you to where additional work is required, reviewing and fixing the last 10% can still save many days or even weeks of editing pages when converting a large site to ConcreteCMS.

During the review process, the UCI blocks can be manually removed from individual pages as they are completed. Alternatively, they can be left in place and either hidden from visitors using Advanced Permissions or removed in bulk by uninstalling the blocks or UCI in the dashboard.

Once the import process is completed and fully reviewed, Universal Content Importer can be safely uninstalled and is in no way required for continued operation of the site.

Please respect copyright ©

Just because you can import content in bulk from another site does not mean you can legally do so. Please respect the copyright of others. Only import content where:

  • You own the site.
  • You are importing the content on behalf of the site owner.
  • You have permission to do so from the owner of the content.

Extension

Universal Content Importer is built on top of Universal Content Puller and designed for extension using similar mechanisms. Content sources, transforms and displays follow a pluggable and extensible architecture for easy integration of further sources, transforms and displays from within UCP or provided by third-party packages or your own application-specific plugins.
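As a rough picture of what a pluggable source, transform and display architecture looks like, here is a small Python sketch. It does not reproduce UCP's actual plugin API, which is PHP and not shown on this page. The registries, the register decorator, the pull function and the plugin handles are hypothetical, loosely named after the Direct Text, NL2BR and Plain plugins listed in the package details.

```python
from typing import Callable, Dict, List

# Hypothetical registries keyed by plugin handle.
SOURCES: Dict[str, Callable[[], str]] = {}
TRANSFORMS: Dict[str, Callable[[str], str]] = {}
DISPLAYS: Dict[str, Callable[[str], str]] = {}


def register(registry, handle):
    """Decorator that adds a plugin function to a registry under a handle."""
    def decorator(func):
        registry[handle] = func
        return func
    return decorator


@register(SOURCES, "direct_text")
def direct_text() -> str:
    return "hello\nworld"


@register(TRANSFORMS, "nl2br")
def nl2br(text: str) -> str:
    return text.replace("\n", "<br>\n")


@register(DISPLAYS, "plain")
def plain(text: str) -> str:
    return text


def pull(source: str, transforms: List[str], display: str) -> str:
    """Run one source through a chain of transforms, then a display."""
    content = SOURCES[source]()
    for handle in transforms:
        content = TRANSFORMS[handle](content)
    return DISPLAYS[display](content)
```

Under this model a third-party package extends the system by registering one more function under a new handle. Nothing in pull needs to change, which is the property the paragraph above describes.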

How to get Universal Content Importer

Universal Content Importer is only available direct from JohntheFish. Please contact me to discuss.

Universal Content Importer

jl_universal_content_importer - v0.4.1

A general purpose content importer for batch importing pages and images from any site. Internally uses Universal Content Puller, so installing Universal Content Puller is a prerequisite.

Attribute Keys
  • UCI Notifications
Attribute Sets
  • Universal Content Importer
Attribute Types
  • UCI Notifications User Site Type
Block Types Single Pages
  • Universal Content Importer /dashboard/blocks/universal_content_puller/universal_content_importer A general purpose content importer for batch importing pages and images from any site. Internally uses Universal Content Puller, so installing Universal Content Puller is a prerequisite.
  • Universal Content Importer /dashboard/pages/universal_content_importer A general purpose content importer for batch importing pages and images from any site. Internally uses Universal Content Puller, so installing Universal Content Puller is a prerequisite.
  • Importer Settings /dashboard/pages/universal_content_importer/settings Settings for each stage of the importer process.
  • Import Remote Sitemap /dashboard/pages/universal_content_importer/settings/import_remote_sitemap Import a remote sitemap into the UCI workspace to pull content from.
  • Grab Remote Pages /dashboard/pages/universal_content_importer/settings/grab_remote_pages Add HTML content to the UCI workspace.
  • Metadata /dashboard/pages/universal_content_importer/settings/metadata Optionally add further page data not provided by Import Remote Sitemap.
  • Create Pages /dashboard/pages/universal_content_importer/settings/create_pages Create or update local concrete5 pages based on remote URLs. Blocks are added later.
  • List Images /dashboard/pages/universal_content_importer/settings/list_images Extract a list of images required for URLs in the UCI Workspace.
  • Import Images /dashboard/pages/universal_content_importer/settings/import_images Import images to the concrete5 File Manager based on image URLs listed in the UCI Workspace.
  • List Documents /dashboard/pages/universal_content_importer/settings/list_documents Extract a list of documents linked by URLs in the UCI Workspace.
  • Import Documents /dashboard/pages/universal_content_importer/settings/import_documents Import documents to the concrete5 File Manager based on document URLs listed in the UCI Workspace.
  • Extract Blocks from Pages /dashboard/pages/universal_content_importer/settings/extract_blocks_from_pages HTML grabbed from remote pages is parsed to extract Content and other blocks.
  • Add Blocks to Pages /dashboard/pages/universal_content_importer/settings/add_blocks_to_pages Extracted blocks are added to local pages.
Content Source Plugins
  • UCI Bulk Document
  • UCI Bulk Image
  • UCI Bulk URL
Content Transform Plugins
  • Block Extractor
  • Document File Extractor
  • Image Extractor

Universal Content Puller

jl_universal_content_puller - v9.3.19 - resources v9.3.0

Pull content from many sources and display it in many ways.

ConcreteCMS Marketplace v9.3.19

Block Types Single Pages
  • Universal Content Puller /dashboard/blocks/universal_content_puller Pull content from many sources and display it in many ways.
  • Plugins /dashboard/blocks/universal_content_puller/plugins Plugins for the Universal Content Puller block
  • Global Settings /dashboard/blocks/universal_content_puller/global_settings Edit global settings and defaults for the Universal Content Puller block.
Content Source Plugins
  • Calendar Event List
  • Child Area
  • Direct Table
  • Direct Text
  • Express List
  • File
  • File Folder List
  • Fileset List
  • Global Area
  • Google Sheet
  • None
  • Page Area
  • Page List
  • Parent Area
  • Stack
  • URL
  • URL With Form
  • User List
Content Transform Plugins
  • Array Hacker
  • Cache With Transform
  • Convert Encoding
  • First Row to Keys
  • HTML Repair
  • Key Filter
  • Key Mapper
  • Key Picker
  • Key Regex
  • List Selector
  • Markdown
  • Multi Selector
  • NL2BR
  • Pass Through
  • Pipeline
  • Remove Duplicate Values
  • Selector
  • Table From CSV
  • Table From HTML
  • Table From JSON
  • Table From Text Lines
  • Table Sorter
  • Value Filter
  • Value Replace
Content Display Plugins
  • DataPicker
  • JavaScript Data
  • Limited Text
  • ListPicker
  • Multi Level List
  • Paragraphs With Heading
  • Plain
  • Serialize
  • Serialize Paginate
  • Table

Universal Content Puller XX Sources

jl_universal_content_puller_xx_sources - v9.1.2.1

Sources extension for Universal Content Puller. The sources in this extension are XX because they are the kind of source you may not want to let just anyone loose with, hence a separate package so they don't have to be installed with less sensitive sources.

ConcreteCMS Marketplace v9.1.2

Content Source Plugins
  • Any Database
  • Any Database with Form
  • SQLite File Manager
  • SQLite File Manager With Form
  • SQLite File Path
  • SQLite File Path With Form
  • Site Database
  • Site Database with Form
Content Transform Plugins
  • SQL Extract
  • SQL Extract with Form

Additional Pages

About this Sidebar

Most of this sidebar is built using Universal Content Puller.

The Content Source is Page Area, set to pull the Sidebar area of the Universal Content Puller page and, within that, sliced to just the Page List.

The Content Transform is Selector, set to remove container and row classes that, when unnecessarily nested, could mess up the Bootstrap grid. The Content Display is Plain, which just outputs the transformed text.
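The effect of that transform step can be approximated in a few lines of Python. This is only a sketch of the idea, not how the Selector transform is implemented. The function name strip_grid_classes is hypothetical, and the regular expression assumes double-quoted class attributes, so it is not a general HTML parser.

```python
import re

# Bootstrap grid classes that cause trouble when nested unnecessarily.
UNWANTED = {"container", "row"}

# Assumption: class attributes are written with double quotes.
CLASS_ATTR = re.compile(r'\sclass="([^"]*)"')


def strip_grid_classes(html):
    """Remove unwanted classes, dropping the attribute if none remain."""
    def rewrite(match):
        kept = [name for name in match.group(1).split() if name not in UNWANTED]
        if not kept:
            return ""
        return ' class="' + " ".join(kept) + '"'
    return CLASS_ATTR.sub(rewrite, html)
```

Elements keep any other classes they carry, so only the grid behaviour is removed and the rest of the pulled markup is left alone.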

In the advanced settings, sanitization is disabled as we trust the source page and don't want to strip out any formatting or functionality from the pulled sidebar.