Yossi Dahan [BizTalk]


Thursday, December 23, 2010

Note to self on workflow 4, context correlation and service parameters

Sorry if this is a bit vague, but I wanted to make a note of this before I forget, and figured that if it might help someone I might as well blog it as is -

I’ve set up a workflow with a context correlation scenario. In my case I used a Pick activity, though I don’t know if that’s significant: the trigger of one Pick branch has a ‘receive and send reply’ template in which the Receive activity is set to create a new instance (CanCreateInstance=true), and the trigger of another branch has a Receive activity configured to follow the correlation (CanCreateInstance=false).

When I ran my scenario, the call to the first service operation (the one creating the instance) worked fine, but when calling the second operation, which was intended to enter the correlated receive, I got the following error on the client -

“The execution of an InstancePersistenceCommand was interrupted because the instance key 'd7c81b72-21a3-cd67-e7178a39e708b766' was not associated to an instance. This can occur because the instance or key has been cleaned up, or because the key is invalid. The key may be invalid if the message it was generated from was sent at the wrong time or contained incorrect correlation data.”

It took some trial and error, and at the moment it doesn’t quite make sense to me, but something seemingly completely unrelated turned out to be the cause: initially, I had forgotten to assign the parameters of the second receive shape (the one following the correlation) to local variables.

If I do that, the problem goes away (!); remove the local variables in the UI, and the error is back. Weird!

I’m not quite sure yet why this has that effect, but it clearly does; I tried this several times and it happened every time.
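For reference, the shape of the second (correlated) Receive can be sketched in code. Everything here is illustrative and not from the actual workflow, which was built in the designer: the contract name, operation name and variable names are placeholders. The key detail is that the Content of the correlated Receive is explicitly bound to local variables:

```csharp
using System.Activities;
using System.ServiceModel.Activities;

// Illustrative sketch only; "IMyService", "Update" and "requestId"
// are placeholder names, not taken from the real workflow.
var ctxHandle = new Variable<CorrelationHandle>("ctxHandle");
var requestId = new Variable<string>("requestId");

var correlatedReceive = new Receive
{
    ServiceContractName = "IMyService",
    OperationName = "Update",
    CanCreateInstance = false,   // follows the existing instance
    CorrelatesWith = ctxHandle,  // the context correlation handle
    // The detail that mattered: binding the operation's parameters to
    // local variables. With this Content left unbound, the
    // InstancePersistenceCommand error above came back every time.
    Content = new ReceiveParametersContent
    {
        Parameters = { { "requestId", new OutArgument<string>(requestId) } }
    }
};
```

This is only a sketch of the designer configuration, not a runnable service; it depends on the WF4 assemblies and a matching workflow host.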

I’ll keep digging, but for the time being it was worth making a note of…


Friday, December 17, 2010

More on BAM

In some places we’re using inline WCF calls to services, for various reasons (a practice I’m not entirely comfortable with, but one I know many people advocate, so I guess the jury is out…)

One of the problems with this approach, in our case, is that we’re bypassing the elaborate tracking infrastructure we have in place which relies, largely, on BizTalk pipelines that don’t exist when using inline sends.

The actual solution is a topic for its own blog post, hopefully in the near future, but when I conducted some load tests of it I saw more of the phantom records in BAM, which meant I first had to look closely for their cause.

Trying to isolate the issue, as one does, I started to cut more and more logic out of the test scenario until I ended up with a very distilled piece of code. By this point I was no longer running any of my new code either, just using the API we have that wraps the BAM API to begin and end an activity in a strongly typed fashion.

The test method ended up having only two lines in it, which looked like this -

Tracking.BeginActivity(<activity name>, <activity id>)
Tracking.EndActivity(<activity id>)

The pseudo code for these two methods is as follows -

BeginActivity -

EventStream stream = new BufferedEventStream(<connection string to MsgBoxDB>, 1);
stream.BeginActivity(<activity name>, <activity id>);
stream.UpdateActivity(<activity id>, <data array>);

EndActivity -

EventStream stream = new BufferedEventStream(<connection string to MsgBoxDB>, 1);
stream.UpdateActivity(<activity id>, <data array>);
stream.EndActivity(<activity id>);

Running this code on its own works fine and no issues are observed, but under load it causes more of these ‘phantom records’, where updates to the activity appear to happen after it has been closed.

As the code at this stage was very simple (I had removed the calls to what I actually wanted to test), the issue became apparent -

Foolishly, I had left the FlushThreshold parameter of the BufferedEventStream constructor hardcoded to a constant 1!
This parameter tells the stream when to flush its data to the database, and it exists on both the BufferedEventStream and the DirectEventStream. A value of 0 tells the stream never to auto-flush, so an explicit call to Flush is required from the client code; any other value is the number of events to buffer before flushing them to the database.

Hardcoding the flush threshold to 1 is not only inefficient (in the case above I always perform two actions on the record, so a value of 1 always causes two round-trips to the database where they could have been batched into one), but also, due to the asynchronous nature of BAM, may cause events to be processed out of order, as they are persisted separately.

Ensuring the threshold is set to the correct value helps ensure that records, now persisted together, are processed in the right order, which sorts out the data inconsistency issue we’d seen.
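As a sketch of the fix (the connection string, activity name and data items below are placeholders, not our real tracking model; BufferedEventStream lives in Microsoft.BizTalk.Bam.EventObservation):

```csharp
using System;
using Microsoft.BizTalk.Bam.EventObservation;

// Placeholders - point this at the real BizTalkMsgBoxDb in practice.
var msgBoxConnectionString = "Integrated Security=SSPI;Data Source=.;Initial Catalog=BizTalkMsgBoxDb";
var activityId = Guid.NewGuid().ToString();

// A threshold of 2 means the two events below are flushed together:
// one round-trip instead of two, and the events are persisted (and so
// processed) in order, rather than separately as with a threshold of 1.
var stream = new BufferedEventStream(msgBoxConnectionString, 2);
stream.BeginActivity("MyActivity", activityId);
stream.UpdateActivity("MyActivity", activityId, "Status", "Started");

// Anything still buffered below the threshold can be flushed explicitly.
stream.Flush();
```

This obviously only runs against a BizTalk installation, but it shows the shape of the change: pick a threshold that matches the number of events you raise per unit of work, rather than a constant 1.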


Tuesday, December 07, 2010

Do you instrument your BizTalk application?

If you don’t, you should. If you do – you should make sure you do it well.

Either way, I just bumped into this brilliant paper