Yossi Dahan [BizTalk]


Monday, April 30, 2007

Loading custom pipeline component properties and performance

Needing to implement a pipeline component with properties slightly more complex than the odd simple-type property, I went back to Saravana Kumar's great whitepaper - Understanding Design-Time Properties for Custom Pipeline Components in BizTalk Server.

Before I go any further I should say I think this is a great whitepaper, and it's definitely worth reading.

I did, however, have one small reservation that I thought was worth sharing -

Somewhere around page 21 Saravana shows how to save and load properties which are collections; in his example he uses XML serialisation and deserialisation to switch between the in-memory collection and its persistable XML form and back.

Here are the code snippets from the white paper -

public virtual void Load(Microsoft.BizTalk.Component.Interop.IPropertyBag pb, int errlog)
{
    object val = ReadPropertyBag(pb, "CorrelationPropertiesCollection");
    if (val != null)
    {
        string corrPropertiesList = (string)val;

        XmlTextReader xml = new XmlTextReader(new StringReader(corrPropertiesList));
        XmlReaderSettings settings = new XmlReaderSettings();
        settings.IgnoreProcessingInstructions = true;
        XmlReader reader = XmlReader.Create(xml, settings);

        XmlSerializer ser = new XmlSerializer(typeof(CorrelationPropertiesCollection));
        CorrelationPropertiesCollection obj = (CorrelationPropertiesCollection)ser.Deserialize(reader);
        CorrelationSettings = obj;
    }
}

public virtual void Save(Microsoft.BizTalk.Component.Interop.IPropertyBag pb, bool fClearDirty, bool fSaveAllProperties)
{
    StringBuilder sb = new StringBuilder();
    StringWriter sw = new StringWriter(sb);

    XmlWriterSettings setting = new XmlWriterSettings();
    setting.OmitXmlDeclaration = true;
    XmlWriter writer = XmlWriter.Create(sw, setting);

    XmlSerializer ser = new XmlSerializer(typeof(CorrelationPropertiesCollection));
    ser.Serialize(writer, CorrelationSettings);

    object val = sb.ToString();
    pb.Write("CorrelationPropertiesCollection", ref val);
}

The problem with this is that XML serialisation is quite an expensive operation and, quite counter-intuitively (in my view), the Load method is called on every execution of the component; so having this sort of logic in Load can really slow down pipeline execution.

With all of that in mind, what I would say - if performance is not a critical issue, doing this is a great clean approach; if, on the other hand, low latency is important, serialisation should be avoided.

In our case we've decided to implement our own string parsing logic.
We didn't go as far as implementing ISerializable or anything like that, but simply added ToString and Parse methods to our collection that convert the collection to a delimited string and back. While this still involves some processing work on load, I suspect it is much quicker than serialisation.
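To give a rough idea of the ToString/Parse approach, here is a minimal sketch. It is not our actual code: the class name, the choice of `;` as the delimiter and the assumption that values never contain the delimiter are all made up for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a collection-typed component property.
// Assumes no value ever contains the delimiter character.
public class DelimitedPropertyCollection : List<string>
{
    private const char Delimiter = ';';

    // Collection -> "a;b;c"
    public override string ToString()
    {
        return string.Join(Delimiter.ToString(), this.ToArray());
    }

    // "a;b;c" -> collection; null or empty input yields an empty collection.
    public static DelimitedPropertyCollection Parse(string text)
    {
        DelimitedPropertyCollection result = new DelimitedPropertyCollection();
        if (string.IsNullOrEmpty(text))
        {
            return result;
        }
        result.AddRange(text.Split(Delimiter));
        return result;
    }
}
```

With something like this in place, Load would call Parse on the string read from the property bag, and Save would write the result of ToString. Note that this sketch cannot tell an empty collection from a collection holding a single empty string; a real implementation would need to decide how to handle that, and how to escape the delimiter.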


Monday, April 23, 2007

Error in receive pipeline: There is an error in XML document

I received the following email from Oleg Gershikov today and thought it was worth sharing:

There was an annoying problem with the tracking send pipeline in synchronous processes in the UAT environment. The process used to fail in a send pipeline component with the following exception in the event log:

“A response message sent to adapter "SOAP" on receive port "******" with URI "*********.asmx" is suspended.

Error details: There is an error in XML document (1, 2850). “

The error is confusing because the XML was valid and we don’t have any XML validation/parsing element in our send pipeline.

What was really helpful was the Negative Acknowledgement (NACK) message, which could be found in the suspended instance in the BTS Admin console:

<SOAP:Envelope xmlns:SOAP="http://schemas.xmlsoap.org/soap/envelope/" SOAP:encodingStyle="http://schemas.xmlsoap.org/soap/encoding/">
<faultcode>Microsoft BizTalk Server Negative Acknowledgment </faultcode>
<faultstring>An error occurred while processing the message, refer to the details section for more information </faultstring>
<ns0:NACK Type="NACK" xmlns:ns0="http://schema.microsoft.com/BizTalk/2003/NACKMessage.xsd">
<ErrorDescription>There is an error in XML document (1, 2867).</ErrorDescription>
<ErrorDetail>Login failed for user ''. The user is not associated with a trusted SQL Server connection.</ErrorDetail>

The ErrorDetail node contained the actual reason for the exception. It turned out to be a database connection problem, which was solved by disabling impersonation in the web.config file.

Conclusion: check NACK messages for SOAP adapter exceptions. They may contain important details.


Thursday, April 19, 2007

Wishlist - highlight remarks on shapes

It frustrates me that there isn't (in my view) a good way to annotate orchestrations.

There are two main ways to add remarks to your orchestration. One is to write text in a shape's description field - but this is not indicated in the designer in any way, so it is very unlikely anyone would notice your fantastic comment, which renders it useless. The other is to use remarks in expression shapes - only you have to have at least one real line of code there, which means you cannot have remarks in places where you don't actually need an expression shape without adding a silly, meaningless line of code to satisfy the compiler.

If being able to add floating remarks of some sort, or even having a simple remark shape, is too much to ask - can shapes with a description at least be highlighted in some way? A little star? A different frame? Anything? Wouldn't that be nice?

What do you think?


Notes on streaming pipeline components

Working with BizTalk for several years now, I've had the chance to develop quite a few pipeline components, and, as you'd expect, more often than not I'm developing them in a streaming fashion.

A lot has been written on developing streaming pipeline components, so there's little point repeating it all here - the main point is to minimise the memory footprint of your pipeline component, and by doing that in all the components in the pipeline – reducing the memory footprint of the pipeline as a whole.

Reducing memory consumption in server code is basic for any enterprise development: as many instances of the code may run in parallel, components that consume a lot of memory can quite easily “bring the server to its knees”.

So - I’m wrapping the message stream in my own (“eventing”) stream and implementing all the logic of my component in the events raised as the stream is being read by someone else (ideally the messaging agent, as it reads the message to write it to the message box). Doing so, I was confused into thinking, in one tired moment, that I would get bits of logic from different components in the same pipeline running in parallel, as events fire when the stream is being read.

I'll try to describe a simple scenario I used to test this -

I’ve created a test pipeline component that replaces the original stream with a stream that fires an event when the read method is called for the first time (“firstRead”).

The component returns the message with the wrapper stream to the pipeline immediately, but when the event fires, it sleeps for 5 seconds.

I’ve put 4 trace lines in the code - when the component is called and returns, and before and after the sleep.
I then created a receive pipeline that uses this component twice.

What I expected to see is a trace like this:

1. Enter Component (as the first component in the pipeline is being called)
2. Leave Component (first component returns message to pipeline)
3. Enter Component (second component being called)
4. Leave Component (second component returns message to pipeline)
5. Start Sleep (component 1 first read event fires)
6. Start Sleep (Component 2 first read event fires)
7. End sleep (component 1 - sleep is over for first component)
8. End Sleep (Component 2 -sleep is over for second component)

Notice that what I'm expecting to happen here is that the sleeps of components 1 and 2, used to simulate hard-working code, happen in parallel.

However, the trace I really had was this -

1. Enter Component
2. Leave Component
3. Enter Component
4. Leave Component
5. Start Sleep (component 1 first read event fires)
6. End sleep (sleep is over for first component)
7. Start Sleep (Component 2 first read event fires)
8. End Sleep (sleep is over for second component)

Spot the subtle(?) difference in lines 6 and 7

The event raised in the 2nd instance of the component does not get executed until the 1st component finishes execution; this is because all the code runs on the same thread.
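The experiment can be approximated outside BizTalk with a plain .NET stream wrapper. The sketch below is not the test component itself: `FirstReadStream`, `PipelineSimulation` and the 100 ms sleep are stand-ins made up for illustration, and a simple read loop plays the part of the messaging agent. This particular wrapper raises its event before delegating to the inner stream, so the outermost wrapper's handler runs first; a wrapper that raised it after delegating would reverse the order. Either way the two sleeps run back to back, because both handlers execute on the one thread doing the reading.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading;

// Hypothetical wrapper: raises FirstRead once, on the first call to Read.
public class FirstReadStream : Stream
{
    private readonly Stream inner;
    private bool hasRead;

    public event EventHandler FirstRead;

    public FirstReadStream(Stream inner) { this.inner = inner; }

    public override int Read(byte[] buffer, int offset, int count)
    {
        if (!hasRead)
        {
            hasRead = true;
            EventHandler handler = FirstRead;
            if (handler != null) handler(this, EventArgs.Empty); // runs on the reader's thread
        }
        return inner.Read(buffer, offset, count);
    }

    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}

public static class PipelineSimulation
{
    // Simulates one "component": wraps the stream and returns immediately.
    private static Stream Execute(string name, Stream input, List<string> trace)
    {
        trace.Add("Enter " + name);
        FirstReadStream wrapped = new FirstReadStream(input);
        wrapped.FirstRead += delegate
        {
            trace.Add("Start Sleep " + name);
            Thread.Sleep(100); // stands in for the 5-second sleep
            trace.Add("End Sleep " + name);
        };
        trace.Add("Leave " + name);
        return wrapped;
    }

    public static List<string> Run()
    {
        List<string> trace = new List<string>();
        Stream s = Execute("component 1", new MemoryStream(new byte[16]), trace);
        s = Execute("component 2", s, trace);

        // Stands in for the messaging agent draining the stream.
        byte[] buffer = new byte[4];
        while (s.Read(buffer, 0, buffer.Length) > 0) { }
        return trace;
    }
}
```

Calling `PipelineSimulation.Run()` returns eight trace entries in which each "Start Sleep" is immediately followed by its own "End Sleep"; the sleeps never interleave.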

What this demonstrates is that while streaming pipeline components can significantly reduce the memory footprint of your pipeline, they cannot speed up processing time by executing things in parallel. Not without explicit effort, at least.

As a side note I’ll mention that similarly the subscription evaluation in the message box does not happen until all the components have finished execution.
I believe this is because of the transaction used in receive-pipelines, so that if the pipeline fails at any stage BizTalk will not process the message.

Theoretically - a way around this, if you really want to achieve parallelism in the pipeline execution, as with any code, is to execute your code in a separate thread.

If I change the component to start a new thread when the event fires and sleep in the new thread, everything works much faster, including anything triggered by the arrival of the message in the message box (as the pipeline and the messaging agent no longer wait for the sleep to be over before executing their next tasks), but I understand this is not recommended.

To start with - you lose transactionality with BizTalk: if anything in the pipeline, or in the messaging agent's work with the message box, fails, you will not know about it and your code will still run. To make matters worse, the pipeline may then be executed again for the same message (be it because of adapter logic or an administrator's decision), and so your code will execute again for the same message. In many cases this is unacceptable.

In addition to this, I am led to believe there is some optimisation happening in the messaging agent around threading, and starting your own threads may interfere with it.

So the bottom line here - if you have to use threads in your pipelines to get parallelism, accept the risks you're taking and test all possible eventualities really carefully.
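Purely to illustrate what "starting a new thread when the event fires" means, here is a minimal sketch. `BackgroundWorkSketch` and `StartInBackground` are hypothetical names, and nothing in it addresses the risks described above: the work is completely detached from whatever the pipeline and the messaging agent do next.

```csharp
using System;
using System.Threading;

public static class BackgroundWorkSketch
{
    // Hypothetical body of a "first read" event handler: hands the slow work
    // to a new thread and returns at once, so the thread reading the stream
    // is not blocked.
    public static Thread StartInBackground(ThreadStart slowWork)
    {
        Thread worker = new Thread(slowWork);
        worker.IsBackground = true; // will not keep the host process alive
        worker.Start();
        return worker; // nothing here ties the work to the pipeline's transaction
    }
}
```

The returned thread is only there so a caller (a test, say) can wait for the work to finish; a pipeline component would normally have no natural place to do that, which is part of the problem.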

There’s a little more to play around with, especially checking all of this in a send pipeline as well, as I expect some differences, but I’m not sure when I’ll get the chance to look at it, hopefully soon.


Wednesday, April 11, 2007

Great tool for performance monitoring analysis

Ewan Fairweather recently introduced me to Clint Huffman’s PAL tool.

While I've only started looking at using it on our servers, from my few "playing around" sessions with the tool its usefulness is obvious.

It should be used carefully and responsibly of course, especially when it comes to determining thresholds and drawing conclusions, but as Ewan put it - "it certainly takes the pain out of perfmon analysis".

Highly recommended.

And while I'm at it - check out Clint's webcast, BizTalk Performance Methodology.

Monday, April 02, 2007

Comment on Kevin Lam's great post on dynamic direct bound ports

I would have left a comment on his blog entry, but comments have been disabled.

I'll start by saying that this whole series of posts by Kevin is great, and I found the last one particularly interesting.

Recently I was sent back to it by a question in the newsgroups, so I looked again at self-correlating dynamic ports, on which I have a small comment -

One of the examples given is sending (and receiving) messages in a loop. In this case, as you cannot initialise a correlation set inside the loop, you have to initialise it first outside the loop and then "follow" it in each subsequent send to make sure properties get promoted correctly. Using a dynamic self-correlating port can save that extra send shape outside the loop, as there's no need to initialise any correlation set.

My comment is that while this is great if you don't need any other promoted properties to route the message to its destination, I suspect this is rarely the case, or else you would not use direct bound ports.

If you do need to promote other properties, you probably still need to initialise a correlation set, and so you probably still need that send shape outside the loop.

In this case, having the dynamic port is nice, as it may save you the correlation set on the other side, but does not do much else for the sending orchestration.

Just my two cents.


Introduction to xml namespaces

No, I'm not going to write one here; on the contrary...

I'm lucky to, every now and then, have the pleasure of introducing someone to the fascinating world of BizTalk...

However, before we even get to look at BizTalk itself, I always try to make sure first that the concepts of XML and XML namespaces (coupled with XPath and XSL) are crystal clear, as these are fundamental to working with BizTalk.

To do so - I almost always refer them to Aaron Skonnard's great article Understanding XML Namespaces and/or Dare Obasanjo's XML Namespaces and How They Affect XPath and XSLT.

The only problem is that it always takes me 5 minutes to find these, hence this post.