By now, you are familiar with numerous
facilities in SharePoint 2010. As exciting as these features may be
individually, they truly shine when used with each other.
Let's talk about scale in SharePoint 2010. I mentioned that SharePoint 2010 lists are a lot
more performant than SharePoint 2007 lists. I wasn't lying; they indeed are!
But even with better performing lists, you will at some point hit a
limit. Even if the limit is in millions of list items or tens of
thousands of documents, there is a limit—trust me, there is! Thus,
beyond a certain point, you will have to scale. But how exactly do you
scale in SharePoint? You can create more folders, more document
libraries, more lists, and maybe even more sites. Heck, you can create
even more site collections in more content databases, depending on your
scalability needs. Honestly, with the facility to scale using more and
more such containers, you can scale to almost infinity.
Sounds good on paper! But this creates a unique
challenge. Guess how many sites, site collections, or even document
libraries the average business user wants. Usually the answer is one.
Think about it—even if you are producing millions of documents,
wouldn't it be nice if there were a single place for you to input and
discover all your content? (Seriously, if these customers didn't pay me so well, I'd never work for them!)
The good news is that SharePoint can help! While
SharePoint allows you to scale to infinity for all practical purposes,
it also has facilities to make the content easily collected and easily
discoverable, no matter how big your store is. Discovering the content
can be done easily using document IDs. Thus, if you provision a
document center or records center site, you can easily find any
document using the document ID facility of SharePoint that I described
earlier. If the document IDs are difficult to remember, you can also
discover content by using search. So imagine this: you have content
being tagged using metadata. You can instantly filter documents within
a single document library using that metadata, or you can find them,
after a crawl-period delay, using search. Using advanced search, you can also
slice and dice all such collected content across whichever store you
prefer—a store that can span multiple web applications or even farms!
So discovery of content is possible, no matter how big your store is.
SharePoint gives you all the necessary facilities that you as an
architect can use to build a system suitable for a small or large
store. Content can remain discoverable using CAML filtering (small
store) or search (large store). In most projects, you will end up using
a combination of both.
Now let's talk about putting content into
SharePoint. Content that is being produced may need to be collected at
a different place from where it is being stored. Here are two very
different examples:
Content may be produced all across the farm
or even multiple farms. But content, especially important content that
you care about (based upon its metadata, perhaps?) may need special
treatment. For instance, you could create a tag called Important, and
all such content may get aggregated in a central document library from
all SharePoint farms in the organization. Or content might be
automatically routed to a central records center, perhaps to a
humungous document library with a well-defined structure of folders, so
that there is a common place to go and discover all content that the
organization cares about! As you will see later, the more appropriate
and fancier-sounding name for this humungous document library in a
records center is the file plan. Thus, there may
be a need to "aggregate" important content from all over the place into
a central place. I call this the "All roads lead to Rome" scenario.
The
second and perhaps just as important scenario is completely opposite.
Imagine that there is a central document library, okay let me use the
right term: the file plan of a records center. And users have the
ability to manually submit content into this file plan. You probably
don't want all the content to sit at the root level of the document
library. Imagine this: an e-mail–enabled document library, with each
"reply" to the e-mail seeming to have the same subject. Does each reply
overwrite the previous reply? Well obviously not! Similarly in a
records center, you probably want centrally submitted content to be
"fanned out" to appropriate folders (or maybe even separate stores), so
appropriate retention policies or storage facilities can be applied on
such content. I refer to this scenario as "fanning out of content".
Both of these ("All roads lead to Rome" and "fanning
out of content" or any combination thereof) can be implemented in SharePoint
2010 using a facility called the content organizer. Also, the best thing about SharePoint 2010's ECM facilities
is that everything I have talked about can be activated as features on any site
collection, so you can take advantage of it wherever you need it.
Let me illustrate the usage of the content organizer in
SharePoint 2010 on a site created from the blank site definition.
Content organization in SharePoint 2010 is implemented as two features that need to be activated on a site.
The first feature is called
DocumentRoutingResources, with FeatureID
0c8a9a47-22a9-4798-82f1-00e62a96006e. This feature adds the necessary
field types, and a new content type called Rule under the Content
Organizer Content Types. Go ahead and activate this feature on
http://sp2010 using the following command:
stsadm -o activatefeature -id 0c8a9a47-22a9-4798-82f1-00e62a96006e -url http://sp2010
Activating this feature will give you a new content type in the Content Type Gallery as shown in Figure 1.
This content type contains the various fields that define
the structure of a content organization rule, such as the priority, the
details of the rule, the destination of matched content, and so on. Next, you
need a facility to make use of these rules: something to store the
rules, and something to act upon them.
Those facilities are implemented as the
DocumentRouting feature, with FeatureID
7ad5272a-2694-4349-953e-ea5ef290e97c. This feature has an
ActivationDependency on the DocumentRoutingResources feature; thus, if
you try activating the DocumentRouting feature before the
DocumentRoutingResources feature, you will get the following error
message:
Dependency feature 'DocumentRoutingResources' (id: 0c8a9a47-22a9-4798-82f1-00e62a96006e) for
feature 'DocumentRouting' (id: 7ad5272a-2694-4349-953e-ea5ef290e97c) is not activated at this
scope.
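If you would rather not keep the activation order in your head, here is a minimal Python sketch that works it out for you and prints the stsadm commands in the right sequence. Only the feature names, the feature IDs, the dependency, and the stsadm syntax come from the text above. The FEATURES table and the activation_order and stsadm_commands helpers are hypothetical names I made up for illustration, and the script only builds command lines; it does not talk to SharePoint.

```python
# Hypothetical helper: orders features so that dependencies come first, then
# builds the stsadm command lines. Only the feature names, IDs, and the
# stsadm syntax come from the text; everything else is illustrative.

FEATURES = {
    "DocumentRouting": {
        "id": "7ad5272a-2694-4349-953e-ea5ef290e97c",
        "depends_on": ["DocumentRoutingResources"],
    },
    "DocumentRoutingResources": {
        "id": "0c8a9a47-22a9-4798-82f1-00e62a96006e",
        "depends_on": [],
    },
}


def activation_order(features):
    """Return feature names with every dependency listed before its dependent."""
    ordered = []

    def visit(name, path=()):
        if name in path:
            raise ValueError("circular dependency at " + name)
        if name in ordered:
            return
        for dependency in features[name]["depends_on"]:
            visit(dependency, path + (name,))
        ordered.append(name)

    for name in features:
        visit(name)
    return ordered


def stsadm_commands(features, url):
    """Build one 'stsadm -o activatefeature' command line per feature."""
    return [
        "stsadm -o activatefeature -id {0} -url {1}".format(features[name]["id"], url)
        for name in activation_order(features)
    ]


if __name__ == "__main__":
    for command in stsadm_commands(FEATURES, "http://sp2010"):
        print(command)
```

Even though DocumentRouting is listed first in the table, the DocumentRoutingResources command comes out first, which is exactly the order the ActivationDependency demands.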
Go ahead and activate the DocumentRouting feature on
the http://sp2010 site. Activating this feature will give you the
following:
It will give you two new custom actions under the site settings\site administration area:
Content Organizer Settings
Content Organizer Rules
It
will give you a new list called RoutingRules, which can be accessed by
clicking the Content Organizer Rules custom action created under the
site settings\site administration area.
It
will give you a new document library called the Drop Off Library, where
users should drop off content so it can be routed to various locations
per the rules defined in Content Organizer Rules.
Now, before I dive into setting up a rule and routing
new documents, let's visit the Content Organizer Settings custom
action under site settings\site administration. This is a layouts page (also
known as an application page) at /_layouts/DocumentRouterSettings.aspx.
Here you can see the various settings that apply to the content
organizer within this site. Following are the settings you can
configure for content organization in SharePoint 2010.
Redirect users to the Drop Off Library: If
this setting is checked, users who upload to any document library that the content
organizer rules know about will see the message shown in Figure 2.
Don't let this confuse you.
The content organizer isn't magic, and it won't show this message for
libraries that the rules have no idea about. This is generally useful for organizing
content that participates in content organizer rules.
Sending
to another site: By default, the content organizer is limited to moving
content within a site. In fact, when setting up a content organizer
rule, the browse button only shows you navigation within the current
site collection. However, by checking this check box, you can send the
content anywhere you wish.
Folder
partitioning: Folder partitioning is an important content organizer
tool. As I mentioned earlier, different documents frequently have
the same file name, and in the rule itself you can set up
de-duplication so that individual files are named
differently. As a global setting, however, you can also have the content
organizer create a new folder once a folder reaches a set number of documents. This ensures that no single
folder gets so large that the views on it become a performance hassle.
Also, to keep your storage costs low, you can set retention
policies to, say, archive everything other than the 2500
most recent documents, and so on.
Duplicate
submissions: The content organizer rules give you facilities to ensure
unique naming. However, if the content organizer administrators were
sloppy and didn't envision a particular case that caused a duplicate
submission, SharePoint 2010 lets you de-duplicate content either by
using SharePoint versioning (the default) or by making the file name
unique.
Preserving context: Sometimes in
records management projects, it is important to preserve audit log
histories and various properties. For example, if an insurance company
is being sued for leaking private information, you probably want
to use audit logs to find out who viewed that information. Just because
content was reorganized, you probably don't want to lose this
information. However, preserving audit log information can
significantly increase the size of your content databases, so
this setting is off by default.
Rule managers: These are the one or more individuals responsible for managing the content organization rules within a site collection.
Submission
points: By default, a document library called the Drop Off Library is
set up for you that allows you to drop content for organization.
However, using this setting you can enable more drop-off points,
notably a web service at /_vti_bin/OfficialFile.asmx, and an e-mail
address for e-mail–enabled document libraries. This facility can thus
be used to perform records management on e-mail within SharePoint.
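To make the folder partitioning and duplicate submission settings a little more concrete, here is a toy Python model of them. This is not SharePoint code. The Library class, the "Folder N" naming, and the "_1" suffix are all assumptions of mine for illustration; SharePoint has its own folder naming format and its own way of making file names unique.

```python
# Illustrative model only: this is NOT the SharePoint object model. It mimics
# the folder partitioning and duplicate submission settings with dictionaries.
import itertools


class Library:
    """A toy document library: folder name -> {file name -> list of versions}."""

    def __init__(self, max_items_per_folder=2500, use_versioning=True):
        self.max_items_per_folder = max_items_per_folder  # partitioning threshold
        self.use_versioning = use_versioning              # duplicate submissions setting
        self.folders = {"Folder 1": {}}                   # hypothetical folder naming

    def _current_folder(self):
        """Return the newest folder, creating the next one when it is full."""
        name = "Folder {0}".format(len(self.folders))
        if len(self.folders[name]) >= self.max_items_per_folder:
            name = "Folder {0}".format(len(self.folders) + 1)
            self.folders[name] = {}
        return name

    @staticmethod
    def _unique_name(file_name, files):
        """Append a counter before the extension until the name is unused."""
        stem, dot, ext = file_name.rpartition(".")
        if not dot:  # no extension at all
            stem, ext = file_name, ""
        for i in itertools.count(1):
            candidate = "{0}_{1}{2}{3}".format(stem, i, dot, ext)
            if candidate not in files:
                return candidate

    def submit(self, file_name, content):
        """Store a file and return (folder, stored file name)."""
        folder = self._current_folder()
        files = self.folders[folder]
        if file_name in files:
            if self.use_versioning:
                files[file_name].append(content)  # new version of the same item
                return folder, file_name
            file_name = self._unique_name(file_name, files)
        files[file_name] = [content]
        return folder, file_name


if __name__ == "__main__":
    library = Library(max_items_per_folder=2, use_versioning=False)
    for name in ["a.docx", "a.docx", "b.docx"]:
        print(library.submit(name, "some content"))
```

Note that in this sketch a duplicate only collides with files in the folder currently being filled. That is a simplification I chose to keep the model short, not a statement about how SharePoint resolves duplicates.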
Thus, as you can see, the content organizer is quite powerful. Let's see it in action. Set up a document library called Target. My intent is that any document with the word Important
in its name that is dropped in the Drop Off Library will end up in the Target document
library. Using the Content Organizer Rules custom action under site
settings\site administration, set up a new content organizer rule with
the following values:
Name: Important
Priority: 5. Priority is important because sometimes you may want to control
which rule runs before which rule. You can also choose to deactivate
any particular rule by choosing the inactive option.
Submission
Content Type: I chose to target the Document content type, but as you
can see, you can set up content organizer rules for different
content types. Depending on the content type, you may also have
different properties available to you for setting up the rule.
Also, content organization can span site
collections, and content types
can be shared across site collection boundaries. This section therefore
lets you specify in the rule whether the given content type has different
names in other site collections.
Conditions:
This is where you specify the rule conditions. All these conditions are
"And"-ed with each other. If you're wondering why Or is not an option,
remember that you can always set up multiple content rules to simulate
Or. The condition I set up was Name "contains all of" Important.
Target
location: If the rule matches, the content is moved to the specified
target location. The target location I specified is a document library
called Target. You can also choose to prevent accidental overwrites of
documents by separating them into their own folders based on a property.
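If it helps to see how such a rule gets evaluated, here is a small Python sketch of the idea. Again, this is a toy model and not the SharePoint object model: Condition, Rule, and route are hypothetical names. I am also assuming that a lower priority number wins, and that "contains all of" means every word in the value appears somewhere in the property, ignoring case.

```python
# Illustrative model only, not the SharePoint object model. Condition, Rule,
# and route() are hypothetical names used to show the matching logic.
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class Condition:
    prop: str   # property to test, for example "Name"
    value: str  # words that must all appear in the property value

    def matches(self, properties: Dict[str, str]) -> bool:
        # Assumed reading of "contains all of": every word in `value`
        # appears somewhere in the property, ignoring case.
        actual = properties.get(self.prop, "").lower()
        return all(word in actual for word in self.value.lower().split())


@dataclass
class Rule:
    name: str
    priority: int      # assumption: a lower number is evaluated first
    content_type: str
    target: str
    conditions: List[Condition] = field(default_factory=list)
    active: bool = True

    def matches(self, content_type: str, properties: Dict[str, str]) -> bool:
        if not self.active or content_type != self.content_type:
            return False
        # Conditions are And-ed with each other: every one must hold.
        return all(c.matches(properties) for c in self.conditions)


def route(rules: List[Rule], content_type: str,
          properties: Dict[str, str]) -> Optional[str]:
    """Return the target of the first matching rule in priority order, or None."""
    for rule in sorted(rules, key=lambda r: r.priority):
        if rule.matches(content_type, properties):
            return rule.target
    return None


rules = [
    Rule("Important", 5, "Document", "Target", [Condition("Name", "Important")]),
    # A second rule with the same target simulates "Or".
    Rule("Urgent", 5, "Document", "Target", [Condition("Name", "Urgent")]),
]

if __name__ == "__main__":
    print(route(rules, "Document", {"Name": "Important.docx"}))
    print(route(rules, "Document", {"Name": "notes.docx"}))
```

The second rule, Urgent, is not part of the walkthrough. I added it only to show how two rules pointing at the same target behave like an Or.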
Okay, good! So far you have set up a content rule
and a Target document library. Now drop a document called
Important.docx in the Drop Off Library. Note that as soon as you upload
the document, the content organizer shows you the message shown in Figure 3.
Because you just uploaded the document, the document
at this point is effectively checked out to you until you hit the
Submit button. The user at this point could just close the browser and
leave the document checked out, but, as you may have noted in the content
organizer settings, the content administrator can be notified after a
configurable number of days if any stray unorganized content is left in
the Drop Off Library.
Now, as soon as you fill out the other properties,
and hit the Submit button, the document is effectively checked in for
you and is routed to its final destination as shown in Figure 4.
As you can see, the content organizer rule was run
on the newly created document. And per the rule settings, the document
was saved to the Target document library, and the final location is
communicated to you. The sucky thing here, however, is that because the
documents are now flowing all over SharePoint, how will the user
remember that URL? Can you guess? Simple! Activate the Document ID
Service, and you are presented with a dialog box as shown in Figure 5.
So, as you can see, as more and more of
these features are used together, they become
increasingly compelling. Next, let's look at a feature that truly adds
a lot of value to ECM: content type synchronization.