1. What Is ECM?
Modern organizations produce a lot of
information these days. Think about all the information that is
consumed and produced in your job: in meetings, in documents, e-mail,
videos, discussions, audio, marketing information, the stuff
that is on the web site, product catalogs, support documents,
consulting reports, RFPs, expense reports, travel logs, and so on. Now
think about what most organizations get sued about these days! What
almost always gets them in trouble is mishandling of information! If an
unruly customer breaks equipment at a fast food restaurant, don't you
wish you had your surveillance videos archived? But most modern
surveillance video equipment lets you store only a certain number of
days of footage before it is overwritten! Wouldn't it be nice if somehow that
information were backed up onto less expensive storage? Maybe even
automatically deleted 18 months after it was created?
What about e-mail? Chances are that e-mail sent by
the CEO is more important than the e-mail I send! So the CEO's e-mail
should be more discoverable than mine. Thus, depending upon the content
type (uh oh, did I just use a SharePoint term here, or is this a
business term, or both?) and the necessary metadata fields in that
content type, different rules may apply to any content.
Consider one more thing: how do you collect such
information in the first place? The funny thing is that organizations
are producing, and perhaps even collecting, some of this information
already. Don't project managers create project plans? Don't people
e-mail each other anyway, and don't IT administrators run backups on
your Exchange server? So, do we ask people to update yet another system
just to capture this information? Good luck implementing that! The
reality is that people are not going to do any more work than they
already do. So you must capture all this information without changing
people's workflows.
Thus, a good enterprise content management system is pervasive, not invasive.
Pervasive means
that this information should be captured, stored, and maintained as
part of the usual work processes people already follow! Thus, if you
use SharePoint as your enterprise blogging engine, or if you have
SharePoint workflows enforcing document routing, then by virtue of
using the tools that people use to perform their daily tasks, you are
able to apply enterprise content management principles to it readily.
NOTE
Enterprise content management (ECM) refers to
the technologies, strategies, methods, and tools used to capture,
manage, store, preserve, and deliver content and documents related to
an organization and its processes.
A similar term that is often confused with ECM is web content management (WCM).
This technology addresses the content creation, review, approval, and
publishing processes of web-based content. Key features include
creation and authoring tools or integrations, input and presentation
template design and management, content re-use management, and dynamic
publishing capabilities. In that sense, WCM is a subset of ECM. And
when you have the same platform (hopefully SharePoint 2010) doing both
WCM and ECM, your headache is greatly reduced.
But then what is document management? Document management (DM)
technology helps organizations better manage the creation, revision,
approval, and consumption of electronic documents. It provides key
features such as library services, document profiling, searching,
check-in, check-out, version control, revision history, and document
security.
But doesn't that make DM pretty much the same as
ECM? DM is similar to ECM, but not the same! Both DM and ECM facilitate
information lifecycle management, encourage collaboration, and help
manage information. But similarities aside, there are some key
differences between ECM and DM:
ECM can manage more than just documents, including videos and even hard copies. Thus ECM's scope is much wider than that of DM.
ECM brings in the larger science of formal records management. DM is
purely the management of documents, with or without any records
management.
1.1. What Is Records Management?
Records management, or RM,
is the practice of maintaining the records of an organization from the
time they are created up to their eventual disposal. Duties may include
classifying, storing, securing, and destroying (or in some cases,
archiving) records.
This brings up a whole bunch of interesting
associated concepts. The classification of the information largely
depends on the associated metadata collected. The associated metadata
in SharePoint terms is the structure of your content types.
The storage of these content types is preceded by another relevant concept, referred to as the file plan. The file plan
is a hierarchical structure that allows the management of various
content types organized throughout the tree. One item can appear at
multiple locations within the file plan, and rules can be specified on
each one of these nodes. These rules are also
referred to as retention policies and disposition workflows.
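To make the idea of a file plan more concrete, here is a minimal Python sketch of a tree whose nodes can carry rules. It is only an illustration, not how SharePoint stores a file plan: the class name `FilePlanNode`, the `retention_days` field, the node names, and the assumption that a node without its own rule falls back to its nearest ancestor's rule are all hypothetical.

```python
class FilePlanNode:
    """One node in a hypothetical file plan tree."""

    def __init__(self, name, retention_days=None, parent=None):
        self.name = name
        self.retention_days = retention_days  # None: no rule set on this node
        self.parent = parent
        self.children = {}
        self.items = []  # names of records filed at this node

    def add_child(self, name, retention_days=None):
        child = FilePlanNode(name, retention_days, parent=self)
        self.children[name] = child
        return child

    def path(self):
        """Return the node's location in the tree, e.g. 'Company/Finance'."""
        parts = []
        node = self
        while node is not None:
            parts.append(node.name)
            node = node.parent
        return "/".join(reversed(parts))

    def effective_retention(self):
        # Assumption: a node without its own rule uses its nearest ancestor's rule.
        node = self
        while node is not None:
            if node.retention_days is not None:
                return node.retention_days
            node = node.parent
        return None


root = FilePlanNode("Company", retention_days=2555)
finance = root.add_child("Finance")
invoices = finance.add_child("Invoices", retention_days=540)
legal = root.add_child("Legal")

# The same item may be filed at more than one location in the plan.
invoices.items.append("invoice-001.pdf")
legal.items.append("invoice-001.pdf")

for node in (invoices, legal):
    print(node.path(), node.effective_retention())
```

The point of the sketch is that the rule lives on the node, not on the item, so the same invoice can be subject to different rules depending on where it is filed.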
Retention policies refer to rules that
define how content moves from one bucket to another. For instance, when
I get a bill in the mail, after I am done cursing, I place it on my
table until I have paid it. Once I pay it, I put it in the drawer. When
I am sure that the payment has gone through, I move it from the drawer
to a little brown box. Finally, at the end of the year I move that
brown box into the garage. A few years later, I burn that brown box
along with all the memories of the bill I paid. Burning that brown box
is the equivalent of disposition, and all the rules I associated in
moving the content from one store to another are retention policies.
Organizations find these policies very important, because in moving the
content from one box to another, they trade discoverability for cost:
each move makes the content less discoverable but puts it on cheaper
storage, and disposition frees the storage altogether. Organizations
love saving that money, but they need to manage what gets
destroyed and when.
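The bill example can be sketched as a small Python function that maps a record's age to the bucket it should be in. This is only a toy model of a retention schedule: the stage names come from the story above, but the durations in `STAGES` and the function name `stage_on` are made up for illustration.

```python
from datetime import date

# Hypothetical schedule: (stage name, days spent in that stage).
STAGES = [
    ("table", 30),
    ("drawer", 60),
    ("brown box", 275),
    ("garage", 3 * 365),
]


def stage_on(created: date, today: date) -> str:
    """Return the stage a record is in on `today`.

    Once the record has outlived every stage, it is due for disposition.
    """
    age_days = (today - created).days
    boundary = 0
    for name, days in STAGES:
        boundary += days
        if age_days < boundary:
            return name
    return "disposed"


bill = date(2010, 1, 1)
for when in (date(2010, 1, 15), date(2010, 3, 1), date(2011, 6, 1), date(2015, 1, 1)):
    print(when, stage_on(bill, when))
```

Each move to the next stage corresponds to a retention rule firing, and the final `"disposed"` result corresponds to burning the brown box.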
Finally, there are various other relevant concepts.
Physical records management refers to the science of
managing non-electronic assets by using electronic systems. Unique
document IDs give every document in an organization a
configurable and meaningful document ID. Tiered storage models reduce
costs of storage within the organization by successively moving content
from one store to another. By default, SharePoint will store every
uploaded document in SQL Server. When was the last time you saw a
content database that was petabytes in size? And what was the cost of
running it? I can assure you that the fans on that petabyte server will
keep Washington D.C. warm without all the politicians' hot air.
All these concepts were supported by SharePoint 2007, but are vastly improved in SharePoint 2010.
In SharePoint 2010, every feature you see in the
records center and document center can be broken up and used
individually as features in any site you want. As a result, there is
this whole new concept of in-place records management,
in which users or automated processes can mark records right where they
use and produce them.
2. Document IDs
Organizations need to identify documents uniquely. Scalability and performance may be reasons why you
would want to separate the logical topology of your SharePoint
installation into various site collections and maybe even multiple web
sites. There can be other reasons, however, such as security,
navigation, or simply the process of moving a document between various
audiences. As a document moves through all these various site
collections, how do you give it a unique ID that guarantees the document
can always be found?
List items in SharePoint have an ID column, which is
an integer that constantly increases. It is unique within a
document library, but not unique across an organization. Also,
that list item ID can never be anything other than
an integer. In reality, organizations have their
own schemes for numbering documents, and especially when you have
documents spread across many sites and site collections, you want the
document IDs to be more meaningful and unique as they move across the
system. Also, you want these document IDs to be more permanent and
binding if they are to be useful.
A document ID in SharePoint 2010 is a pluggable
identifier for a document or a document set (described later). It also
provides a static URL or a permalink that opens the document or
document set associated with the ID, regardless of the location of the
document.
Thus, you can reference documents as permalinks—links
that don't change or break as the location of the document changes.
Also, the format and generation logic of the document IDs are
customizable. Let's see how this actually works! In order to use
document IDs, you have to first activate the Document ID Service under
site collection features as shown in Figure 1.
Activating this feature will schedule a
timer job that will configure the feature. After the Document ID
Service feature is configured, add a document in a document library in
the site collection. As you will see, the document now gets a unique
document ID (see Figure 2).
Note the URL for the document ID. It looks like this: http://sp2010/_layouts/DocIdRedir.aspx?ID=ZKUNP6SFESZK-1-2
This URL has no bearing on the document's location;
it relies on the document redirection service to remember where that
document is. And because the ID never changes, you can always count on
that permalink to work.
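As a rough illustration of why such a permalink survives a move, the following Python sketch builds and parses a URL of the shape shown above and resolves it through a lookup table. The dictionary `locations` is only a stand-in for the redirection service, not a description of how SharePoint actually finds the document, and the document paths and helper names are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlparse

REDIRECT_PAGE = "/_layouts/DocIdRedir.aspx"


def build_permalink(site_url: str, doc_id: str) -> str:
    """Build a permalink of the same shape as the one shown above."""
    return f"{site_url.rstrip('/')}{REDIRECT_PAGE}?{urlencode({'ID': doc_id})}"


def extract_doc_id(permalink: str) -> str:
    """Pull the document ID back out of the permalink's query string."""
    return parse_qs(urlparse(permalink).query)["ID"][0]


# Hypothetical stand-in for the redirection service: document ID -> current location.
locations = {"ZKUNP6SFESZK-1-2": "http://sp2010/Shared Documents/plan.docx"}


def resolve(permalink: str) -> str:
    return locations[extract_doc_id(permalink)]


link = build_permalink("http://sp2010", "ZKUNP6SFESZK-1-2")
print(link)
print(resolve(link))

# Moving the document only changes the lookup entry; the permalink stays the same.
locations["ZKUNP6SFESZK-1-2"] = "http://sp2010/sites/archive/Docs/plan.docx"
print(resolve(link))
```

The link never encodes a folder or library, which is why nothing about it has to change when the document moves.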
Also, with the document ID service now
activated, visit the document ID settings area, which can be accessed
under the Site Collection Administration area under Site Settings. The
specific URL for document ID settings is http://sp2010/_Layouts/DocIdSettings.aspx.
This page allows you to specify custom prefixes for your document
IDs, and thus ensure that the documents in different site collections
do not get conflicting document IDs.
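To see why a per-site-collection prefix avoids conflicts, consider this small Python sketch. It assumes IDs of the shape prefix-library-item, which is what the sample ID above suggests; the class `SiteCollection`, the prefixes `HR` and `FIN`, and the numbering logic are hypothetical and are not SharePoint's own generation code.

```python
from itertools import count


class SiteCollection:
    """Hypothetical site collection that numbers items per library."""

    def __init__(self, prefix: str):
        self.prefix = prefix
        self._counters = {}  # library ID -> running item counter

    def add_document(self, library_id: int) -> str:
        # Item IDs are plain increasing integers, unique only within one library.
        counter = self._counters.setdefault(library_id, count(1))
        item_id = next(counter)
        # Assumed ID shape: <prefix>-<library ID>-<item ID>.
        return f"{self.prefix}-{library_id}-{item_id}"


hr = SiteCollection("HR")
finance = SiteCollection("FIN")

# Both documents are item 1 in library 1 of their own site collection,
# yet the prefixed document IDs differ.
print(hr.add_document(1))
print(finance.add_document(1))
```

Without the prefix, both documents would simply be "item 1 in library 1", which is exactly the ambiguity the plain integer ID column suffers from.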