BizTalk Server 2004, 2006, and 2009 have been built to use streams as a key part of the products' architecture. A stream
as a programming construct is a sequence of bytes with no fixed length.
When you begin to read a stream, you have no idea how long it is or
when it will end. The only control you have is over the size of the data
you will read at any one time. So what does this have to do with good
programming? It means that when you are dealing with extremely large
amounts of data, if you use a stream, you don't need to load all of this
data at once.
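As a rough illustration of reading in fixed-size pieces, the following sketch counts the bytes in a stream while holding only one buffer in memory at a time. The module name ChunkedReadSketch, the CountBytes function, and the 4,096-byte chunk size are assumptions made for this example; none of them come from BizTalk.

```vbnet
Imports System
Imports System.IO

Module ChunkedReadSketch
    ' Hypothetical helper: walks the stream chunk by chunk and returns the
    ' total number of bytes seen. Only one chunk is in memory at a time.
    Public Function CountBytes(ByVal source As Stream, _
                               ByVal chunkSize As Integer) As Long
        Dim chunk(chunkSize - 1) As Byte
        Dim total As Long = 0
        Dim bytesRead As Integer = source.Read(chunk, 0, chunk.Length)
        ' Read returns 0 only when the end of the stream is reached.
        While bytesRead > 0
            total += bytesRead
            bytesRead = source.Read(chunk, 0, chunk.Length)
        End While
        Return total
    End Function

    Sub Main()
        ' A 10,000-byte in-memory stream stands in for a large message.
        Using source As Stream = New MemoryStream(New Byte(9999) {})
            Console.WriteLine(CountBytes(source, 4096))
        End Using
    End Sub
End Module
```

The caller chooses the chunk size, which is the one piece of control a stream gives you; the total length is known only after the last read returns zero.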
In this way, streams make
dealing with large amounts of data more manageable. If you have worked
with BizTalk 2002 or earlier, you know that BizTalk would often produce
"out of memory" exceptions when processing large XML documents.
This was because BizTalk 2000 and 2002 used the XML DOM to parse
and load XML documents. The DOM is not a streaming model: it requires
the entire document to be loaded into memory before you can use it.
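To make the contrast with the DOM concrete, here is a minimal sketch that pulls one value out of an XML stream with the forward-only System.Xml.XmlReader instead of loading an XmlDocument. The module name ForwardOnlyXmlSketch, the FirstElementText function, and the sample XML are hypothetical and exist only for this illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Text
Imports System.Xml

Module ForwardOnlyXmlSketch
    ' Hypothetical helper: returns the text of the first element with the
    ' given name, or Nothing if there is none. The reader moves forward node
    ' by node, so the whole document is never held in memory at once.
    Public Function FirstElementText(ByVal source As Stream, _
                                     ByVal elementName As String) As String
        Using reader As XmlReader = XmlReader.Create(source)
            If reader.ReadToFollowing(elementName) Then
                Return reader.ReadElementContentAsString()
            End If
        End Using
        Return Nothing
    End Function

    Sub Main()
        Dim sample As String = "<order><id>42</id><note>rush</note></order>"
        Using source As Stream = New MemoryStream(Encoding.UTF8.GetBytes(sample))
            Console.WriteLine(FirstElementText(source, "id"))
        End Using
    End Sub
End Module
```

Because the reader stops as soon as the element is found, the rest of the document is never parsed; a DOM-based lookup would have to build the full tree first.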
In supporting the streaming
paradigm, the BizTalk product team has included three classes that
optimize how you can use streams in your pipeline components and
orchestrations. These classes allow you to do stream-based XPath
queries. Each of these classes is explained in the following sections.
1. VirtualStream
Included in the BizTalk SDK
under the \Program Files\Microsoft BizTalk Server
2009\SDK\Samples\Pipelines\ArbitraryXPathPropertyHandler directory is a
class file called VirtualStream.cs. This class is a stream implementation
that holds data in memory up to a threshold (4 MB by default) and keeps
the remaining data on disk in temporary files. The
ArbitraryXPathPropertyHandler sample in the SDK shows how to use this
class.
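The idea of keeping data in memory up to a threshold and spilling to disk beyond it can be sketched with standard System.IO classes. This is not the SDK's VirtualStream and makes no claim about how that class is written; the module name SpillToDiskSketch, the BufferWithThreshold function, and the buffer sizes are assumptions chosen for illustration.

```vbnet
Imports System
Imports System.IO

Module SpillToDiskSketch
    ' Hypothetical helper: copies source into a MemoryStream while the total
    ' stays within thresholdBytes. Once the threshold would be exceeded, the
    ' buffered data moves to a temporary file and copying continues there.
    Public Function BufferWithThreshold(ByVal source As Stream, _
                                        ByVal thresholdBytes As Integer) As Stream
        Dim target As Stream = New MemoryStream()
        Dim chunk(8191) As Byte
        Dim bytesRead As Integer = source.Read(chunk, 0, chunk.Length)
        While bytesRead > 0
            If TypeOf target Is MemoryStream AndAlso _
               target.Length + bytesRead > thresholdBytes Then
                ' The temp file is deleted automatically when the stream closes.
                Dim spill As New FileStream(Path.GetTempFileName(), FileMode.Create, _
                    FileAccess.ReadWrite, FileShare.None, 8192, FileOptions.DeleteOnClose)
                DirectCast(target, MemoryStream).WriteTo(spill)
                target.Dispose()
                target = spill
            End If
            target.Write(chunk, 0, bytesRead)
            bytesRead = source.Read(chunk, 0, chunk.Length)
        End While
        ' Rewind so the caller can read the buffered data from the start.
        target.Position = 0
        Return target
    End Function

    Sub Main()
        Dim payload(19999) As Byte
        ' Under the threshold: the data stays in memory.
        Using small As Stream = BufferWithThreshold(New MemoryStream(payload), 1048576)
            Console.WriteLine(small.GetType().Name)
        End Using
        ' Over the threshold: the data ends up in a temporary file.
        Using large As Stream = BufferWithThreshold(New MemoryStream(payload), 1024)
            Console.WriteLine(large.GetType().Name)
        End Using
    End Sub
End Module
```

Either way the caller gets back a seekable stream positioned at the start, so memory use is capped by the threshold rather than by the size of the message.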
2. SeekableReadOnlyStream
SeekableReadOnlyStream
is a stream class that provides fast, read-only,
seekable access to a stream. It is a wrapper class around a regular
stream object and can be used in cases where the base stream object is
not seekable and write access is not needed. An example of this class
can be found in the \Program Files\Microsoft BizTalk Server
2009\SDK\Samples\Pipelines\Schema Resolver Component directory.
3. XPathReader
The XPathReader class lives in
the Microsoft.BizTalk.XPathReader.dll assembly. It provides XPath query
access to a stream of XML, which allows very fast, read-only access to a
stream of data via an XPath expression. Normally, XPath queries require
the entire document to be loaded into memory, as with an XmlDocument.
Using the XPathReader, you can load your document via the
SeekableReadOnlyStream class mentioned previously and then wrap this
stream in an XmlTextReader.
The net effect is that you have a stream-based XPath query that does
not require the entire XML document to be loaded into memory. The
following example shows how this can be implemented in a pipeline
component. Note the use of the SeekableReadOnlyStream variable in the
Execute method. This is what makes your stream of data seekable and
read-only: after the XPath query has read it, the stream is rewound and
passed on as the body of the outgoing message, which improves the
performance and usability of the pipeline component.
Imports System
Imports Microsoft.BizTalk.Component.Interop
Imports Microsoft.BizTalk.Message.Interop
Imports System.Collections
Imports Microsoft.BizTalk.XPath
Imports System.Xml
Imports System.IO
Imports Microsoft.Samples.BizTalk.Pipelines.CustomComponent
Namespace ABC.BizTalk.Pipelines.Components
<ComponentCategory(CategoryTypes.CATID_PipelineComponent)> _
<ComponentCategory(CategoryTypes.CATID_Any)> _
Public Class PropPromoteComponent
Implements IComponent
Implements IComponentUI
Implements IBaseComponent
Implements IPersistPropertyBag
Private _PropertyName As String
Private _Namespace As String
Private _XPath As String
Public Property PropertyName() As String
Get
Return _PropertyName
End Get
Set(ByVal value As String)
_PropertyName = value
End Set
End Property
Public Property [Namespace]() As String
Get
Return _Namespace
End Get
Set(ByVal value As String)
_Namespace = value
End Set
End Property
Public Property XPath() As String
Get
Return _XPath
End Get
Set(ByVal value As String)
_XPath = value
End Set
End Property
Public Function Execute(ByVal ctx As IPipelineContext, _
ByVal msg As IBaseMessage) As IBaseMessage _
Implements IComponent.Execute
Dim xpathValue As Object = Nothing
Dim outMessage As IBaseMessage = ctx.GetMessageFactory.CreateMessage
Dim newBodyPart As IBaseMessagePart = _
ctx.GetMessageFactory.CreateMessagePart
newBodyPart.PartProperties = msg.BodyPart.PartProperties
Dim stream As SeekableReadOnlyStream = New _
SeekableReadOnlyStream( _
msg.BodyPart.GetOriginalDataStream)
Dim val As Object = msg.Context.Read(PropertyName, [Namespace])
If val Is Nothing Then
Throw New ArgumentNullException(PropertyName)
End If
msg.Context.Promote(PropertyName, [Namespace], val)
Dim xpc As XPathCollection = New XPathCollection
Dim xpr As XPathReader = New XPathReader(New XmlTextReader(stream), xpc)
xpc.Add(Me.XPath)
While xpr.ReadUntilMatch = True
Dim index As Integer = 0
While index < xpc.Count
If xpr.Match(index) = True Then
xpathValue = xpr.ReadString
' break
End If
index += 1
End While
End While
If xpathValue Is Nothing Then
Throw New ArgumentNullException("xpathValue")
End If
msg.Context.Write("SomeProperty", "http://ABC.BizTalk.Pipelines", _
xpathValue)
stream.Position = 0
newBodyPart.Data = stream
outMessage.Context = msg.Context
CopyMessageParts(msg, outMessage, newBodyPart)
Return outMessage
End Function
Public ReadOnly Property Icon() As IntPtr Implements IComponentUI.Icon
Get
Return IntPtr.Zero
End Get
End Property
Public Function Validate(ByVal projectSystem As Object) As IEnumerator _
Implements IComponentUI.Validate
Return Nothing
End Function
Public ReadOnly Property Description() As String _
Implements IBaseComponent.Description
Get
Return "Description"
End Get
End Property
Public ReadOnly Property Name() As String Implements IBaseComponent.Name
Get
Return "Property Promote"
End Get
End Property
Public ReadOnly Property Version() As String _
Implements IBaseComponent.Version
Get
Return "1"
End Get
End Property
Public Sub GetClassID(ByRef classID As Guid) _
Implements IPersistPropertyBag.GetClassID
Dim g As Guid = New Guid("FE537918-327B-4a0c-9ED7-E1B993B7897E")
classID = g
End Sub
Public Sub InitNew() Implements IPersistPropertyBag.InitNew
Throw New Exception("The method or operation is not implemented.")
End Sub
Public Sub Load(ByVal propertyBag As IPropertyBag, _
ByVal errorLog As Integer) Implements IPersistPropertyBag.Load
Dim prop As Object = Nothing
Dim nm As Object = Nothing
Dim xp As Object = Nothing
Try
propertyBag.Read("Namespace", nm, 0)
propertyBag.Read("PropertyName", prop, 0)
propertyBag.Read("XPATH", xp, 0)
Catch
Finally
If Not (prop Is Nothing) Then
PropertyName = prop.ToString
End If
If Not (nm Is Nothing) Then
[Namespace] = nm.ToString
End If
If Not (xp Is Nothing) Then
XPath = xp.ToString
End If
End Try
End Sub
Public Sub Save(ByVal propertyBag As IPropertyBag, _
ByVal clearDirty As Boolean, _
ByVal saveAllProperties As Boolean) Implements IPersistPropertyBag.Save
Dim prop As Object = PropertyName
Dim nm As Object = [Namespace]
Dim xp As Object = XPath
propertyBag.Write("PropertyName", prop)
propertyBag.Write("Namespace", nm)
propertyBag.Write("XPATH", xp)
End Sub
Private Sub CopyMessageParts(ByVal sourceMessage As IBaseMessage, _
ByVal destinationMessage As IBaseMessage, _
ByVal newBodyPart As IBaseMessagePart)
Dim bodyPartName As String = sourceMessage.BodyPartName
Dim c As Integer = 0
While c < sourceMessage.PartCount
Dim partName As String = Nothing
Dim messagePart As IBaseMessagePart = _
sourceMessage.GetPartByIndex(c, partName)
If Not (partName = bodyPartName) Then
destinationMessage.AddPart(partName, messagePart, False)
Else
destinationMessage.AddPart(bodyPartName, newBodyPart, True)
End If
c += 1
End While
End Sub
End Class
End Namespace