Questions Managers Should Ask Their Team When Contemplating Database Preservation

The Phone Rings.

Client: We have a website and database we need to migrate to Drupal.

Me: How big is the database?

Client: Huge!

The database is a hidden hurdle
Moving to a new platform means moving the database. Where does the data "live?" Do you own that space, or are you a renter? What's the cost of moving? It ain't cheap.

The Managerial Evaluation

Your website has been a commercial success ever since you upgraded some years ago. Revenues are strong. Traffic is up. The bottom line is solid. The database has expanded in proportion.

The staff who keep the site working have wrung out everything they can. They are grumbling, at least a little, about how the site does not do what they want it to do. They've created some workarounds to address that, but the situation is not ideal.

Your visitors are happy, but they are telling you that they want performance: more data, more pictures, more speed, more raw power. Your competitors seem to have heard about these desires, because they have relaunched their sites. Although your key performance indicators (KPIs) look solid, you don't know how much business the competitors are siphoning from you.

To stay fresh, you are considering a move up-market as the competition ups the ante.

This means moving from old platforms to new platforms. It means being able to jack into the old database and then to migrate it.

It means direct costs and hidden costs have to be addressed before taking on the challenge.

Old wine
Moving a database to a new platform is not as simple as moving old wine to new bottles.

No matter how much we might want it to be, it is not merely pouring old wine into new bottles. As with Y2K, once you penetrate the top layer, you enter a labyrinth of long-forgotten workflows, geological layers of code if you will, consisting of workarounds, kludges, and on-the-fly "fixes" (or worse yet, fix-attempts) that were supposed to postpone the inevitable: a wholesale site upgrade.

To be sure, with backups and redundancies, the content of a firm's codebase and database is stored somewhere. It can be restored.

Restoring the codebase is one thing, but moving it from one software platform to operate on another is not trivial. It is not just old wine into new bottles.

No Data Migration Project Is Simple.

Yes, there are tools. Yes, we've come a long way. Yes, everyone is onboard to make it work.

And yet retrieving the database repositories demands both scientific discipline and artistic insight: the stuff of jet fighter pilots.

"When done right, it looks like the easiest things in the world," says one of the Blue Angels aviators. He goes on to say: what is required is "total and utter concentration to the exclusion of all else."

Some people may say database migration isn't rocket science, but anyone who trivializes the process risks being blindsided.

If moving from software platform A to software platform B could be done without disturbing the database, the job of moving to a better platform would be relatively easy and it could be done in short order by a canned program ("wizard") or by rote.

When an organization needs to modernize its website, it inevitably requires migrating the existing database and making it comprehensible to the new website's software.

Database Folklore.

The database encapsulates everything that has gone before: good, bad, and in-between. Clients are tempted to say, and service companies are ready to believe, that migration will be quick, simple, and painless. But all too often, these prove to be myths.
Japanese Tea Ceremony: to a client, the new workflow might be seen as something that takes away a vital element. This is not always immediately apparent except to those who have responsibility for the day-to-day operation.

Myth One — New Websites Won't Be Reverse-Engineered.

In other words, there is an assumption that there will be little, if any, retraining cost, direct or indirect. Phrases like "everyone's on board" or "they can't wait to get away from the old way" are reported by the IT or Website Department, yet these statements are not always completely accurate. A new workflow disrupts the existing way of doing things.

The people (web staff) currently doing the day-to-day work understand the workflows. The workflows are inextricably linked to the database and how the data is stored and retrieved. The interface that the web staff employs is a direct outgrowth of how the database is called up and how it is formatted for consumption.
New Workflow: using the tea bag might be faster and more efficient, but if not everyone is on board with a new workflow, there will be surprises at launch.

The workflow has a logic, but it is not always logical.

A good deal of institutional knowledge, cost, and investment in web staff training is in place. Sometimes it is hard to get the staff to address what they like about the existing site and interface. Flush with the idea that a new site will "solve everything," teams spend too little time setting down the existing workflow and addressing what would happen if that workflow changed.

Our firm's approach is to place an "embedded" information architect or user experience person at the client site to interface with the staff to understand what their jobs entail and what these individuals are tasked to do with the existing site.

Clients often want to skip this, hoping the new site will sweep away everything that came before, but this can leave parts of the client's firm feeling that the baby has been thrown out with the bath water.

Why is no-reverse-engineering such an elusive goal?

Too many stakeholders — customers, staff, managers, you name it — see the existing workflow as having been handed down on stone tablets.

Like the Japanese tea ceremony, each step is seen as beautiful, necessary, and prized, or at least familiar, known, and comfortable.

At the beginning of a project involving an existing database, we are told that the task will be straightforward and uncomplicated, requiring only minor tweaks and no "major" work.

There is a cost that often is not appreciated until the alpha phase.

Myth Two — Out-Of-The-Box Drupal Is All That's Required.

Home Depot has an out-of-the-box solution for painting a ceiling, and if that's what's wanted, it is probably the right way to go.

Out-of-the-box Drupal provides out-of-the-box solutions. Simple websites with no legacy content can often be solved with out-of-the-box Drupal.

But if plain white with a roller won't do, an off-the-shelf solution won't provide what's needed and desired.

More complicated workflows and specialty functionality might have to be created on a sophisticated project with specialized needs.

Drupal is a great content manager, but out of the box, the platform is not a digital asset manager. That requires some finesse, especially if pre-existing digital assets are being folded into the new CMS.

There are over 21,000 contributed Drupal modules. Knowing which ones to use is not just a matter of selecting the most popular ones. An arcane workflow might need an arcane module, one that is still in development, or custom code to create the right workflow.

Pay me now or pay me later. While the development budget can be reduced by going with out-of-the-box modules with out-of-the-box workflows, the cost in time, money, and employee morale should be taken into consideration. There is a trade-off to be weighed.

  • What are the savings with an out-of-the-box solution?
  • What costs are we putting off when workflows are altered?

Questions Managers Might Ask As Part Of Website Upgrades.

Managers view software and websites mostly as tangible, logical, and mathematical products. This view applies mostly to code, but not so much to legacy databases, which are another animal. While servers are believed to be fungible, the data stored on the servers is a unique asset: a repository of almost everything that came before.

Has Your Firm Lost Any Institutional Knowledge Of Your Code Base And/Or Database?

Sometimes it is only when an organization looks to upgrade a codebase or database that its managers confront trade-off decisions that have long gone unaddressed.

As with the Y2K problem, the developers who structured the original data could not foresee the demands of the modern web. Too often documentation is incomplete. Too often the structure is arcane, tracing back to Web 1.0 origins, usually predating just about everyone associated with the new project.

Does Your Existing Database Require An Archaeological Dig?


  • Some foundations support nothing - previous structure removed.
  • No foundations, but a later structure overlaid.
  • Foundation revamped to hold more, but now crumbling.
  • A foundation that should never have been laid down.
  • A foundation whose purpose is a mystery.

Archaeological dig
Like archaeologists, the web development team examines the structure of the preexisting database(s), and their work often uncovers hidden, long-forgotten foundations upon which everything rests.

Digging into a preexisting database has the challenges of an archaeological dig. You never know what you'll turn up next. The picture may be incomplete. A complex foundation may exist, but what it supported has been swept away, while exactly the opposite can also be true - a large part of the structure may be supported on practically nothing.

Is Your Database Structure Laid Down Like Geological Layers?


  • The oldest layer supports everything else.
  • Different layer types fold in around each other.
  • Structure is random.
  • The latest structure hides most of the old structure.
  • It risks being metastable.

Database structures sometimes resemble geological maps.
The longer a database is around, the more is likely to have been laid on top of it or sandwiched into it, not to mention third-party integrations, which are often a moving target.

If it were just an old foundation that we discovered, things might make sense. However, bigger sites usually mean bigger databases and more functionality. So instead of an archaeological dig, we start to see a whole lot of parts: layers of data, some of them clearly left over from long-abandoned workarounds, that remain in place.

What are all these pieces? How did they get there? Do they still serve a function, and if they do, is it efficiently done?
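One modest way to start answering these questions is to take an inventory before anyone argues about what to keep. The sketch below is only an illustration: it assumes the legacy data has been exported to a SQLite file, and the function names and the demo tables (`articles`, `promo_2009_tmp`) are hypothetical. It lists every table with its row count so that empty or tiny tables stand out as candidates for a closer look.

```python
import sqlite3


def inventory_tables(conn):
    """Return (table_name, row_count) pairs for every user table, smallest first."""
    names = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    counts = []
    for name in names:
        # Names come from the schema itself; double-quote them as identifiers.
        quoted = '"' + name.replace('"', '""') + '"'
        (row_count,) = conn.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()
        counts.append((name, row_count))
    return sorted(counts, key=lambda pair: pair[1])


def build_demo_db():
    """Build a tiny in-memory database with hypothetical tables for illustration."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE articles (id INTEGER PRIMARY KEY, title TEXT);
        CREATE TABLE promo_2009_tmp (id INTEGER PRIMARY KEY, code TEXT);
        INSERT INTO articles (title) VALUES ('First'), ('Second');
        """
    )
    return conn


if __name__ == "__main__":
    conn = build_demo_db()
    for name, rows in inventory_tables(conn):
        flag = "  <- empty: ask whether anything still uses it" if rows == 0 else ""
        print(f"{name}: {rows} rows{flag}")
```

An empty table is not proof that a layer is dead, and a full one is not proof that it matters. The inventory only tells the team where to ask the staff who know the workflows.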

Is Your Existing Database Biological?


  • Each branch is a unique niche.
  • Branches tend to evolve independently.
  • Some branches are "hot," others are not.
  • Some parts are modern; some are atavisms.
  • Entire branches could be replaced by better ones, combined, or forgotten altogether.

Websites evolve in branches, each with its own logic based on function.
Understanding an existing web structure calls for a phylogenetic approach.

Why do people have an appendix? Why do all mammals have mammary glands -- even the males? Somewhere along the way these aspects came about and so far nature has not taken them out of future updates.

Data maps are helpful in showing the "how," but not always the "why," and in the absence of that, there is a tendency to replicate what went before.
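A first-pass data map of the "how" can sometimes be generated from the schema itself. The following is a minimal sketch under stated assumptions: the legacy data is available as a SQLite database, relationships were declared as foreign keys (many older databases never declared them), and the function names and demo tables are hypothetical. It maps each table to the tables it references and lists tables that have no declared links at all.

```python
import sqlite3


def foreign_key_map(conn):
    """Map each user table to the sorted list of tables it references."""
    tables = [
        row[0]
        for row in conn.execute(
            "SELECT name FROM sqlite_master "
            "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' ORDER BY name"
        )
    ]
    links = {}
    for table in tables:
        quoted = '"' + table.replace('"', '""') + '"'
        rows = conn.execute(f"PRAGMA foreign_key_list({quoted})").fetchall()
        # Column 2 of each PRAGMA row is the referenced (parent) table.
        links[table] = sorted({row[2] for row in rows})
    return links


def unlinked_tables(links):
    """Tables that neither reference nor are referenced by any other table."""
    referenced = {parent for parents in links.values() for parent in parents}
    return sorted(
        table
        for table, parents in links.items()
        if not parents and table not in referenced
    )


def build_linked_demo_db():
    """Hypothetical schema: one declared relationship and one isolated table."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE articles (
            id INTEGER PRIMARY KEY,
            author_id INTEGER REFERENCES authors(id),
            title TEXT
        );
        CREATE TABLE legacy_export (id INTEGER PRIMARY KEY, payload TEXT);
        """
    )
    return conn


if __name__ == "__main__":
    links = foreign_key_map(build_linked_demo_db())
    for table, parents in links.items():
        print(f"{table} -> {', '.join(parents) if parents else '(no declared links)'}")
    print("Isolated tables:", unlinked_tables(links))
```

A map like this shows which pieces are wired together. It cannot say why `legacy_export` exists or whether an undeclared relationship lives only in application code, so the "why" still has to come from the people who use the system.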

Key Questions The Organization Should Ask Itself Before Migrating Data.

  1. Has the organization identified those who will be impacted by a change in CMS? What are the direct and indirect costs?
  2. Is the organization prepared for unintended consequences of a new CMS?
  3. Has the organization fully documented its current website workflow?
  4. Has the organization sensibly identified aspects of the existing CMS that must be preserved? Are there workflows that must be carried over? At what cost?
  5. What documentation does the organization have regarding the original database, its structure, and its functionality?
  6. What modifications to workflow have been implemented on the fly between the time of the original project and the new one?
  7. Will the in-house users of the current system be the same users of the new system?
  8. Will the organization need to change to work with a more advanced CMS? At what cost?
  9. Will the organization need to bring on (additional) technical people to handle a more sophisticated site?
  10. Does the development shop need to have this information to create the new site?
  11. Has the new CMS uncovered, or will it uncover, other areas, such as server size and capacity, that have gone unaddressed? At what cost?