Content Migration Tips
When you move from an old house to a new one, you need to box up your belongings and label the boxes. You also need to figure out where everything will go in the new place. A move is also the best time to "clean house" and get rid of things you no longer need. Migrating content into a new CMS works the same way.
What is Content?
It seems almost unnecessary to ask this question, but the answer may be harder to pin down than you think. We tend to think of content as the written word, but the scope of "content" is much broader than that. Content can refer to all of the following:
- Pages of text
- Images
- Videos
- Metadata (such as keywords, dates of publication, authorship, etc.)
- Summaries or short descriptions of content
- Relationships between pieces of content (where does a piece of content fit into the bigger picture? Is it a landing page? A tertiary page? Featured or ancillary?)
- Formatting (basically HTML: what is bold, italic, bulleted, a heading, a subheading, etc. Remember that some Word formatting cannot be faithfully rendered in HTML.)
- Web readiness (is the content ready for the web? PDFs and Word documents should not be copied from directly; you'll need to remove the formatting first.)
Perform a Content Audit
Before you can begin moving content, you need to audit what you already have. An audit is where you determine which items you actually want to use, which ones need to be refreshed, and so on. Here are some things to consider as you perform your audit:
- Is the item in use?
- Is it current?
- Are there multiple versions? If so, which one is the definitive version?
- Do you have multiple sizes of the same image? Are they named and filed in a meaningful fashion?
- Does the item need to be moved into the CMS?
- What about metadata? Does the item have any, and is it complete?
You'll need to define a process for answering all of these questions. A process map is a good way to approach the problem: it is basically a logic flowchart with lots of "if, then" statements that guide your decision-making.
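As a rough illustration, the "if, then" logic of a process map can be written down as a small function. This is only a sketch: the `AuditItem` fields, the order of the checks, and the outcome labels are assumptions made for the example, not a prescribed audit process.

```python
from dataclasses import dataclass


@dataclass
class AuditItem:
    """One piece of existing content, described by the audit questions."""
    name: str
    in_use: bool
    is_current: bool
    is_definitive: bool   # False if another copy is the definitive version
    has_metadata: bool


def audit_decision(item: AuditItem) -> str:
    """Walk the (hypothetical) process map and return the next action."""
    if not item.in_use:
        return "discard"
    if not item.is_definitive:
        return "discard duplicate"
    if not item.is_current:
        return "refresh, then migrate"
    if not item.has_metadata:
        return "add metadata, then migrate"
    return "migrate"
```

Your own map will likely have more branches (image sizes, file naming, whether the item belongs in the CMS at all), but writing it out this way forces every question to have a definite outcome.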
Define and Classify Content
Chances are that when you move into a new CMS, each "page" in the CMS will actually consist of several pieces of content. In this step, you need to begin associating content so that you know that image A goes with page B and will appear in section C with metadata D.
The most logical approach is to map your content to each part of the Information Architecture, or site map, of your CMS. For example, you know you'll have a staff page. What goes into that page? You'll need an image for each person, their short bio, their contact info, and possibly other information. It's quite likely that each piece is in a different location, or, if you already have a staff page, you may need to modify it to fit within your new CMS. This is what this step is all about.
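One way to keep track of this mapping is a simple table of pages and the pieces each page needs, along with where each piece lives today. The sketch below assumes a hypothetical `about/staff` page and made-up source locations; the point is only that gaps become easy to spot.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Piece:
    """One piece of content that a page needs."""
    kind: str          # e.g. "photo", "short bio", "contact info"
    source: str = ""   # where it lives today; empty if not yet located


def unlocated(content_map: Dict[str, List[Piece]]) -> Dict[str, List[str]]:
    """Return, per page, the kinds of pieces that have no known source."""
    gaps = {}
    for page, pieces in content_map.items():
        missing = [piece.kind for piece in pieces if not piece.source]
        if missing:
            gaps[page] = missing
    return gaps


# Hypothetical site map entry for the staff page example.
content_map = {
    "about/staff": [
        Piece("photo", "old-site/images/staff/"),
        Piece("short bio", "HR shared drive"),
        Piece("contact info"),
    ],
}

print(unlocated(content_map))  # {'about/staff': ['contact info']}
```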
Break Content into Component Parts
As early as possible, start thinking about the page layouts you'll be using on the website. You may have custom page layouts created for you, but there are some common elements to consider:
- Title - All content in Plone requires some kind of Title. Indicate in each of your documents what that Title will be.
- Description - All content in Plone can make use of a Description. Descriptions help with site searches, but they're also used for summaries of content. Let's say you want to publish a Report as part of a page of Featured Reports. Each report will need a Description.
- Body Text - This is the bulk of each piece of text content.
- Thumbnail Image
- Metadata
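The component parts above can be captured as a checklist structure so that each document is broken down the same way before it is moved. This is a minimal sketch, not Plone's API: the `ContentItem` class and its field names are assumptions chosen to mirror the list.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class ContentItem:
    """The common parts of one piece of content, as listed above."""
    title: str
    description: str = ""
    body_text: str = ""
    thumbnail_image: Optional[str] = None   # file name or path, if any
    metadata: Dict[str, str] = field(default_factory=dict)

    def missing_parts(self) -> List[str]:
        """Name every part that is still empty for this item."""
        checks = {
            "title": self.title.strip(),
            "description": self.description.strip(),
            "body_text": self.body_text.strip(),
            "thumbnail_image": self.thumbnail_image,
            "metadata": self.metadata,
        }
        return [name for name, value in checks.items() if not value]
```

For example, `ContentItem(title="Annual Report").missing_parts()` lists everything except the title, which tells you what still has to be written or found before the report can appear on a Featured Reports page.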
Perform the Migration
Once you have reviewed and cataloged your content, you are probably ready to begin moving it into your CMS. This process may involve several steps:
- Remove non-web formatting such as Word or PDF formatting
- Map content to the information architecture of the site. Where does the content need to live?
- Copy and paste content from your original sources into the CMS
- You may need to establish a publication workflow to support this process. In other words, who gets to publish content in the CMS? Who should review it before publication? When should content be checked and refreshed?
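For the first step, content that was saved from Word as HTML can be reduced to plain text before it is pasted into the CMS. The sketch below uses only Python's standard `html.parser` module; the `strip_formatting` name and the choice of which tags start a new line are assumptions for illustration, and the approach does not read PDF or .docx files directly.

```python
from html.parser import HTMLParser


class TextOnly(HTMLParser):
    """Collect text and drop every tag, attribute, and style block."""

    BLOCK_TAGS = {"p", "div", "br", "li", "tr", "h1", "h2", "h3", "h4", "h5", "h6"}
    SKIP_TAGS = {"style", "script"}

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0   # greater than zero while inside <style> or <script>

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip += 1
        elif tag in self.BLOCK_TAGS:
            self.parts.append("\n")   # block-level tags start a new line

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def strip_formatting(html: str) -> str:
    """Return the text of an HTML fragment, one line per block element."""
    parser = TextOnly()
    parser.feed(html)
    parser.close()
    lines = [" ".join(line.split()) for line in "".join(parser.parts).splitlines()]
    return "\n".join(line for line in lines if line)
```

The result is unformatted text, so headings, bold, and lists have to be reapplied in the CMS editor. That is usually the point: the formatting you add back is formatting the web can render.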

