One of the primary functions of LINQ is
querying data. In this section, we’ll dig a little deeper into how LINQ
works before looking at how to retrieve data using LINQ to SharePoint.
1. Query Limitations
LINQ works by passing lambda expressions to extension methods that have been declared to extend the IEnumerable and IEnumerable<T>
interfaces. By chaining together these extension methods, you can
create complex queries, all with compile-time syntax checking and
type-safe return values. Earlier we looked at LINQ to Objects as a
simple starting point to discuss how LINQ works. In the case of LINQ to
Objects, parsing queries is relatively straightforward since the
objects are in memory and LINQ is simply being used as a shortcut for
more traditional programming techniques. Ultimately, behind the scenes,
LINQ to Objects simply expands the query into the more verbose code
that we would have written in a non-LINQ world.
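To make the comparison concrete, here is a minimal sketch of a LINQ to Objects query next to the kind of hand-written loop it stands in for. The class and method names (LinqToObjectsDemo, WithLinq, WithoutLinq) are invented for this illustration, and the loop is only roughly equivalent to what LINQ does internally, not a literal expansion.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical demo class; the names are illustrative only.
public static class LinqToObjectsDemo
{
    // The query expressed as chained extension methods taking lambdas.
    public static List<string> WithLinq(IEnumerable<string> titles)
    {
        return titles
            .Where(t => t.StartsWith("A", StringComparison.Ordinal))
            .Select(t => t.ToUpperInvariant())
            .ToList();
    }

    // Roughly the code we would have written in a non-LINQ world.
    public static List<string> WithoutLinq(IEnumerable<string> titles)
    {
        var result = new List<string>();
        foreach (var t in titles)
        {
            if (t.StartsWith("A", StringComparison.Ordinal))
            {
                result.Add(t.ToUpperInvariant());
            }
        }
        return result;
    }
}
```

Both methods filter and transform the sequence in memory, which is why LINQ to Objects needs no translation step.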
There is one major difference between LINQ to
Objects and LINQ to SharePoint, however, and that is the location of
the data being queried. For LINQ to SharePoint, the data exists not in
memory—as is the case for LINQ to Objects—but in the SharePoint content
database. Because of this, LINQ to SharePoint has to work a bit harder
in parsing the LINQ queries. Rather than simply expanding the queries,
the parser must convert them to a syntax that can be used to query the
SharePoint content database directly. Since the SharePoint platform
defines its own query language in the form of the XML dialect CAML,
LINQ to SharePoint must translate all LINQ queries into CAML. These
CAML queries can then be processed directly by the SharePoint platform.
Once the results are returned, they are mapped onto strongly typed
entity objects, which are then returned as the query results.
Expression Trees
It’s worthwhile for you to understand how this
process works, because it has implications when it comes to creating
more complex queries—as you’ll see later. To find out a bit more, let’s
start with one of the extension methods that we’d commonly use in a
LINQ query. If we examine the Where extension method in more detail, we find the following method signature:
public static IQueryable<TSource> Where<TSource>(this IQueryable<TSource>
source, Expression<Func<TSource, bool>> predicate)
At first glance, you may think there’s too much
information in this statement and decide to skip ahead a few
paragraphs—but bear with me, because only one part of the signature is
actually important for the purposes of this discussion.
The method accepts two parameters: the first is an IQueryable<TSource> data source and the second is a generic Expression object whose type argument is a Func delegate. The Func
delegate type describes the signature of the code that’s been entered as
the lambda expression. Because the parameter is declared as an Expression rather than a plain Func, the compiler converts the lambda expression into an expression tree instead of compiling it to executable code. The expression tree
is where the magic that is LINQ takes place. By using expression trees,
you can programmatically analyze lambda expressions. As a result, you
can convert compiled expressions into something completely different by
applying logic to the expression tree. By using this process, LINQ to
SharePoint converts the lambda expressions that make up a query into
valid CAML syntax that can then be directly
executed against the content database. (As an aside, this is exactly
the same way that LINQ to SQL works—the only difference is the target
query syntax.)
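The idea of analyzing an expression tree and emitting a different query syntax can be sketched in a few lines. The following is a toy translator, not the real LINQ to SharePoint provider: the TaskItem entity and the ToyCamlTranslator class are hypothetical, and it handles only predicates of the form property == literal constant, producing a CAML-style Where fragment.

```csharp
using System;
using System.Linq.Expressions;
using System.Xml.Linq;

// Hypothetical entity used only for this illustration.
public class TaskItem
{
    public string Title { get; set; }
    public int Priority { get; set; }
}

// Toy translator (an assumption for illustration, not the provider's code).
public static class ToyCamlTranslator
{
    // Accepts an expression tree, not a compiled delegate, so the
    // structure of the lambda can be inspected at run time.
    public static XElement Translate<T>(Expression<Func<T, bool>> predicate)
    {
        // The body of "x => x.Prop == value" is a BinaryExpression of type Equal.
        var body = predicate.Body as BinaryExpression;
        if (body == null || body.NodeType != ExpressionType.Equal)
            throw new NotSupportedException("Only simple == comparisons are handled.");

        // Left side: the property access. Right side: a literal constant.
        // (A captured variable is not a ConstantExpression and is rejected here.)
        var member = body.Left as MemberExpression;
        var constant = body.Right as ConstantExpression;
        if (member == null || constant == null)
            throw new NotSupportedException("Expected property == literal constant.");

        string valueType = constant.Value is int ? "Integer" : "Text";

        // Emit the CAML-style fragment: <Where><Eq><FieldRef/><Value/></Eq></Where>
        return new XElement("Where",
            new XElement("Eq",
                new XElement("FieldRef", new XAttribute("Name", member.Member.Name)),
                new XElement("Value", new XAttribute("Type", valueType), constant.Value)));
    }
}
```

Calling ToyCamlTranslator.Translate<TaskItem>(t => t.Priority == 2) yields a Where element containing an Eq element that references the Priority field and the value 2. Anything outside this narrow pattern throws NotSupportedException, which mirrors, in miniature, why a provider cannot support every LINQ operator.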
All very interesting, you may be thinking, but why
is this relevant? Well, here’s the thing: Many extension methods are
defined for LINQ, and they all have a default implementation in LINQ to
Objects; it’s down to the creator of a new LINQ provider to
override them with a platform-specific implementation. However, CAML
doesn’t support all the available extension methods. Some operations
simply can’t be translated into CAML. In other implementations
such as LINQ to SQL, where a method can’t be implemented directly in
SQL, the standard LINQ to Objects method is used, meaning that a
portion of the query is performed using SQL, the results are then
loaded into memory, and any remaining operations are performed using
LINQ to Objects extension methods. Of course, this is all done behind
the scenes and is completely transparent to the user.
The issue with LINQ to SharePoint is that CAML is a
very limited language when compared to SQL, and as such a significant
number of the standard LINQ operations are not possible. If such
operations were left to revert to their default implementation in LINQ
to Objects, significant performance issues could result due to the
amount of in-memory processing that could be required. Without an
in-depth understanding of how LINQ works, an uninitiated developer
could easily create a LINQ query that, if executed concurrently by many
users, could severely degrade system performance.
Inefficient Queries
At the time of writing, the LINQ to SharePoint provider guards against
this problem by throwing an error if
an unimplemented extension method is used in a query. In previous
versions, this behavior could be controlled by setting the AllowInefficientQueries flag on the DataContext
object; however, the current version (Beta 2 at the time of writing) no
longer allows this flag to be publicly altered, and therefore
unimplemented expressions will not work with LINQ to SharePoint. This
may change in future releases of the product.
The following extension methods are considered
inefficient because they cannot be converted to CAML and are therefore not
implemented by LINQ to SharePoint:
Aggregate | All | Any | Average | Distinct | ElementAt
ElementAtOrDefault | Except | Intersect | Join | Max | Min
Reverse | SequenceEqual | Skip | SkipWhile | Sum
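When one of these operations is genuinely needed, one possible approach is to make the in-memory step explicit: filter with an operator that can be translated (such as Where), materialize the results, and only then apply the unimplemented operator using LINQ to Objects. The sketch below assumes a hypothetical Invoice entity and uses an ordinary in-memory list wrapped with AsQueryable as a stand-in for a SharePoint list, so it illustrates the shape of the pattern rather than actual provider behavior.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical entity used only for this illustration.
public class Invoice
{
    public string Customer { get; set; }
    public double Amount { get; set; }
}

public static class ExplicitInMemoryStep
{
    public static double TotalFor(IQueryable<Invoice> invoices, string customer)
    {
        // Step 1: a filter that a provider can translate and run at the source.
        // ToList() forces execution and pulls only the matching rows into memory.
        List<Invoice> matching = invoices
            .Where(i => i.Customer == customer)
            .ToList();

        // Step 2: Sum is on the unimplemented list, so it is applied to the
        // in-memory list, where LINQ to Objects handles it.
        return matching.Sum(i => i.Amount);
    }
}
```

The cost of the in-memory step is visible in the code, so keeping the filtered set small is the developer’s responsibility.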