Yossi Dahan [BizTalk]


Saturday, April 05, 2008

...and then just when you actually needed HAT..

I'm pretty sure I'm not alone in suggesting that HAT (the Health and Activity Tracking tool) is somewhat, let's say, lacking...

There are quite a few annoying things about the tool, but one in particular has to be at the top of the list because, in my view, it means that in the one case where you really need HAT to help you out, it fails you miserably.

The "orchestration debugger" is a nice selling point for BizTalk: you develop your process and, assuming you have the relevant tracking settings turned on, you can go back to processes that have already completed and "replay" them to see which shapes were executed and which weren't.

This is really great when viewing completed processes, and also somewhat useful when setting a breakpoint in the process and attaching to it in HAT (although not as useful as one might think).

However - it is completely useless when dealing with suspended orchestrations.

If your orchestration gets suspended for whatever reason, you get a nice error message in the event log; in all likelihood the message will even contain the name of the shape in your orchestration in which the exception occurred. However - find the suspended instance in the admin console or in HAT, open the orchestration debugger - and you're in for a surprise: the viewer will only show you execution up to a few shapes BEFORE the actual shape that failed.

I'm not sure I have the story right, but I believe this "bug" (it's really a "side effect", more on this in a second) was introduced in BizTalk 2006 (but I no longer have 2004 installed to prove it was actually better beforehand).

As far as I know, one of the changes in BizTalk 2006 is around the way orchestrations handle exceptions. In BizTalk 2004 all unhandled exceptions in an orchestration would (if my memory serves me right) result in a suspended, non-resumable instance; in 2006 these instances are resumable. This suggests that BizTalk 2006 has to keep the state of the orchestration from BEFORE the error occurred, so that if an administrator chooses to resume the process (possibly after fixing whatever caused the suspension) it can start again and retry the action where it got suspended.

In order to achieve this, BizTalk probably keeps the last GOOD state of the orchestration in the database (from the last persistence point executed). In other words, where a suspension used to cause a persistence point, from 2006 it does not, and the orchestration's last persistence point is what's kept in the message box.

If that is correct, it would explain why the orchestration debugger only shows information up to a point before the shape that caused the suspension: it only has information up to the last persistence point.
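To make the theory above concrete, here is a toy model in Python. It is not BizTalk's engine or API, and the class, the shape names and the per-shape flags are all made up for illustration. It only shows the assumed behaviour: executed shapes become visible to a debugger once a persistence point is reached, and a suspension throws away anything executed since then.

```python
from dataclasses import dataclass, field


@dataclass
class ToyOrchestration:
    """Hypothetical model of the theory in this post, not BizTalk itself."""
    # Each shape is (name, is_persistence_point, fails); flags are invented.
    shapes: list
    # Shapes saved at the last persistence point: all a debugger could replay.
    persisted: list = field(default_factory=list)
    # Shapes executed since the last persistence point, not yet saved.
    in_memory: list = field(default_factory=list)

    def run(self):
        # Start (or resume) right after the last persisted shape.
        start = len(self.persisted)
        self.in_memory = []
        for name, is_persistence_point, fails in self.shapes[start:]:
            if fails:
                # Assumed 2006 behaviour: keep the last good state only,
                # so everything since the last persistence point is lost.
                self.in_memory = []
                return f"suspended at {name}"
            self.in_memory.append(name)
            if is_persistence_point:
                self.persisted.extend(self.in_memory)
                self.in_memory = []
        # Completion persists whatever is left.
        self.persisted.extend(self.in_memory)
        self.in_memory = []
        return "completed"


orch = ToyOrchestration(shapes=[
    ("Receive", False, False),
    ("Send", True, False),          # assumed persistence point
    ("Transform", False, False),
    ("CallService", False, True),   # the shape that throws
])
print(orch.run())       # suspended at CallService
print(orch.persisted)   # ['Receive', 'Send']
```

In this sketch "Transform" did run, but the debugger view (the persisted list) stops at "Send", a couple of shapes before the one that failed. Resuming after a fix would start again from "Transform", which is exactly the information I would like HAT to show.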

I don't know if this was an oversight when releasing 2006 or a conscious sacrifice, but I think it's a big pain point. It would have been great to see it all: where the exception occurred, the state of the orchestration at that point, and an indication of where the last persistence point was, so we could tell what will get executed when we resume the orchestration.

There are quite a few good uses for HAT. It's a great tool for knowing what's been executed on the server over time; it's not a bad tool for looking at how long a service took to run; it's even somewhat useful for checking the flow of a particular message through the engine using the message flow or the orchestration debugger view. But when it comes to finding the cause of a suspended orchestration, it is quite pointless, for the reason above.

So if you counted on it bailing you out when your process fails - you may as well switch off orchestration shape tracking for your processes.



  • Hi Yossi,

    Good post. In my opinion (unless I've just completely missed it) the other huge missing piece in HAT is the ability to see which server in a group each instance (or shape within an orchestration) has been executed on.

    This makes it difficult to identify if you have an issue on just one of your BizTalk servers within the group.

    By Anonymous, at 06/04/2008, 20:11

  • Oh - absolutely!

    By Yossi Dahan, at 07/04/2008, 07:05

  • Try restarting the host instance. This will force a persistence point, and you may find your issue is solved. It's a useful trick with lots of HAT anomalies :)

    By Anonymous, at 14/05/2008, 13:31

  • Thank you for your comment, but somehow I doubt that would help in this case.

    My understanding is that as the orchestration has been suspended, persistence has occurred and, in order to allow the process to resume, its state has effectively been rolled back. It is no longer maintained in the host, and so I find it hard to believe restarting the host will help.

    By Yossi Dahan, at 14/05/2008, 23:14

  • Oh Yossi, Yossi, Yossi, plus ca change! Just try it before you slag it off, huh?

    By Anonymous, at 15/05/2008, 09:45

  • I think I have, but ok - to be sure I promise to try this out...

    By Yossi Dahan, at 15/05/2008, 11:09
