Yossi Dahan [BizTalk]


Monday, December 31, 2007

Wish list: searching for expression in subscriptions

And while I'm on the subject of subscriptions (see my last post) -

In the BizTalk Administration Console you can run a query to view subscriptions (which is a huge step forward from the old subscription viewer for BizTalk 2004, I have to admit)

You can, however, only filter based on the subscription type (activation or correlation), the service name or the service ID.

What that means is that while you can find all the subscriptions that start a particular orchestration, you can't find all the orchestrations that would be started by a message arriving through a particular receive port, which would have been quite useful, don't you think?

What's even worse though, is that even if you were happy to do a lot of manual work, there doesn't seem to be any practical way of finding out this information using the admin console, although it is all there in the database -

When you bind an orchestration receive shape to a receive port, the subscription is created for you using the receive port's id rather than the port's name, so it would look something like -

http://schemas.microsoft.com/BizTalk/2003/system-properties.ReceivePortID == {E1E7FE08-D421-4D5A-8CD8-CA51E25FA508}

This is not very readable, but, unfortunately there is a much greater problem with it -

There doesn't seem to be a way to find a receive port's id through the admin console; so even if you did have the patience to go through all your receive ports, matching their GUIDs to the one in the subscription in the hope of figuring out which ones will get your orchestration activated, you would find it impossible to work out which receive port actually has the id '{E1E7FE08-D421-4D5A-8CD8-CA51E25FA508}'.

The only way to find this out is to go to the management database and look it up yourself in the bts_receiveport table, bearing in mind, of course, that this id will change on the next deploy.
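
To make the matching less painful, something along these lines could pull the GUID out of a subscription expression and translate it into a port name. This is only a sketch: ReceivePortLookup and its methods are made-up names, not part of BizTalk, and it assumes you have already loaded the id/name pairs from the bts_receiveport table yourself into a dictionary.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

// Hypothetical helper (not a BizTalk API): turns the ReceivePortID clause of a
// subscription expression into a readable receive port name.
public static class ReceivePortLookup
{
    // Matches "...ReceivePortID == {GUID}" and captures the GUID without the braces.
    private static readonly Regex PortIdClause = new Regex(
        @"ReceivePortID\s*==\s*\{(?<id>[0-9A-Fa-f]{8}(-[0-9A-Fa-f]{4}){3}-[0-9A-Fa-f]{12})\}");

    // Returns the GUID found in the expression, or null if there is no ReceivePortID clause.
    public static Guid? ExtractReceivePortId(string subscriptionExpression)
    {
        Match m = PortIdClause.Match(subscriptionExpression);
        return m.Success ? new Guid(m.Groups["id"].Value) : (Guid?)null;
    }

    // portNamesById is assumed to have been filled by the caller from the management database.
    public static string Describe(string subscriptionExpression, IDictionary<Guid, string> portNamesById)
    {
        Guid? id = ExtractReceivePortId(subscriptionExpression);
        if (id == null) return "(no ReceivePortID in expression)";

        string name;
        return portNamesById.TryGetValue(id.Value, out name)
            ? name
            : "(unknown receive port " + id.Value.ToString("B") + ")";
    }
}
```

Calling ReceivePortLookup.Describe with the expression shown above and a dictionary that contains that GUID would return the port's name; since the ids change on the next deploy, the dictionary has to be reloaded each time.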


BizTalk's Pub/Sub

The publish/subscribe mechanism in BizTalk is one of the key features of the product and is very useful and powerful.

I guess there's some learning curve around it, and that most first implementations of any BizTalk developer do not make much use of it (as they often start with all ports being directly bound) and that it takes a while before a BizTalk developer and the organization involved establish a good architecture and move more towards using a true publish-and-subscribe, loosely-coupled approach to implementation.

However, the existing model is not perfect; in my view (and I suspect it is shared by many) it has two main weak points -

  • The pub/sub is implemented on top of MS-SQL which introduces a significant performance overhead

  • The orchestration subscriptions are 'compiled' and cannot be configured without a build-and-deploy cycle

    The first point is quite an obvious one - there would be a latency associated with any implementation of a publish/subscribe mechanism; in the BizTalk case it involves writing the message and its metadata (context) to the message box (a SQL database) and having a separate process locate newly published messages, figure out which subscribers need to receive a copy of the message and manage the activation/correlation of the message-process interaction (as well as keeping a list of references for housekeeping etc).

    Reading and writing to the database, the polling interval of the subscription evaluation process, etc. all introduce latency, which, in certain scenarios, can be crucial.

    If the fragments of information floating around regarding Oslo are to be believed, then we might see an in-memory pub/sub mechanism in a future version of BizTalk (in addition to, not as a replacement for, the existing model I suspect) which, while it will no doubt come with a price (persistence, and therefore scalability and durability to some extent), will no doubt make supporting low-latency scenarios much easier.

    As for the second point -

    At first look the pub/sub in BizTalk is very flexible; in all the BizTalk demonstrations I can remember from the past the presenter would create a receive port and a couple of send ports and would edit the subscriptions of those ports in the administration console to show how easy it is to create content-based routing in BizTalk Server and configure it at runtime.

    In BizTalk 2006 you do not even have to restart the host to speed things up (as you did in demos with 2004); it happens pretty much instantly.

    However, the case with orchestrations is not that simple...

    The subscription for orchestrations is specified as a filter in the properties of the initializing receive shape in the process; this gets compiled into your assembly together with the process, and will be used to create the subscription when you deploy the orchestration.

    As far as I know, short of manipulating the management database yourself (which would not be supported) there's no way to change those subscriptions at runtime.

    If you want to change the subscription you have to change the filter in the orchestration, build, undeploy the old version and deploy the new one (or version the process and perform the side-by-side deployment)

    This is, in my view, an unnecessary pain in dynamic organizations (aren't they all?) that require changes often; and to that end developers had to find a solution to the "I need to be able to change that subscription from outside the process" requirement.

    That solution is often adding some routing metadata to messages in the form of context properties ('nextProcess', 'Operation', etc.) which would be set by publishing processes and/or pipeline components and use these in the filters (rather than the actual content data).

    So you could often see a pipeline component, often driven by some external configuration, that would check for certain bits in the message or its metadata and set these context properties based on the values it found; the premise is that pipeline components are easier to replace, but also - these components often use a database or a rules engine in one form or another to decide what goes into the message context, and by doing so introduce real flexibility, as advertised.
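
    Stripped of all the BizTalk plumbing, the idea looks roughly like the sketch below. Everything in it is a made-up stand-in: Message is a plain class with a dictionary playing the role of the message context, RoutingRule represents one row of the external configuration, and RoutingStamper does what the pipeline component would do. A real component would work against the BizTalk pipeline interfaces instead, which this sketch does not attempt to show.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical stand-in for a message: a body plus a bag of context properties.
public sealed class Message
{
    public string Body;
    public readonly Dictionary<string, string> Context = new Dictionary<string, string>();
}

// One externally configured rule: "if context property X equals Y, route to process Z".
public sealed class RoutingRule
{
    public string Property;
    public string ExpectedValue;
    public string NextProcess;
}

public static class RoutingStamper
{
    // Plays the role of the pipeline component: the first matching rule wins and its
    // target is written to the 'nextProcess' context property. Returns false when no
    // rule matched, in which case the context is left untouched.
    public static bool Stamp(Message message, IEnumerable<RoutingRule> rules)
    {
        foreach (RoutingRule rule in rules)
        {
            string actual;
            if (message.Context.TryGetValue(rule.Property, out actual) && actual == rule.ExpectedValue)
            {
                message.Context["nextProcess"] = rule.NextProcess;
                return true;
            }
        }
        return false;
    }
}
```

    The orchestration filter then only ever tests 'nextProcess', so changing who receives what becomes a matter of editing the rules (wherever they are stored) rather than rebuilding the process.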

    What all of this means is that we, developers, end up developing a pub/sub mechanism on top of the existing pub/sub simply because we need flexibility the product does not provide.

    I don't like this approach, but I end up doing this myself occasionally, simply because I have to.

    I could possibly understand why MS has decided to do so - there are benefits to editing the subscription expression within the orchestration (known types would be one thing), and also - one could argue that the process subscription is part of the process design and so changing it is likely to involve code changes as well which will require a re-build, but really - I think we would all have benefited from the ability to edit the orchestration subscription in the same way we can edit send port subscriptions - through the admin console.


    Thursday, December 27, 2007

    The message box as a service boundary

    For the last 18 months or so I've been working on a very exciting, and quite large, BizTalk implementation here in the UK, I'll leave the full details of it for now, but I can tell you that it involves all the nice buzzwords we keep hearing about SOA, SaaS, S+S, ESB at least to some degree (and with various level of quality, if we're honest)

    Anyway, as you can imagine we're using web services quite extensively - we expose a lot of them, and consume even more; some are internal to the company (but cross teams, although not so much platforms) and many are external (which do cross platform as well)

    The reasons to use service oriented architecture should be very clear to everyone by now, as are the famous four tenets of SOA.

    In our implementation we've abstracted the calls to all the internal web services through utility orchestrations which take a message in our canonical format, convert it to the service's format, call the web service and transform the response to the canonical format before returning it to the calling process; this way we can re-use those transformations, and have a central place to deal with each request, apply error handling, etc.

    From the parent process we then use call orchestration to initiate these utility orchestrations, passing in the request canonical message and receiving the response canonical message as an out param, which is quite efficient (when initiating orchestrations through the call orchestration shape the request does not go through the message box).

    As we're doing the transformations in these utility processes, we consider them to be in the boundary of our process, and not, obviously, within the boundary of the called service, for this reason we call the web service from the process rather than the actual assembly behind the web service.

    When we, within the utility process, call the web service, what actually happens is that the request message (now in the WS' format) gets published to the message box and is picked up by the send port, which passes it to the SOAP adapter which, in turn, serialises it and transmits it over the wire to the service; the service then deserialises the message on the other end before executing whatever code needs to be executed, and the entire process now repeats in the opposite direction.

    In this case the service boundary is the web service endpoint.

    A few weeks ago I had what I thought was a brilliant idea - why not treat the message box as the service boundary!?

    If I had a process that takes in the service's format of the message using a directly bound receive shape and a filter, executes the code internally (as we're now inside the service boundary we can use the service code directly from an expression shape, no need to go through a web service) and, when finished, publishes the response back to the message box (in its own format), I could simply publish a request message for that service and get the response published back for me; correlation would be needed, but this can be handled using self-correlating ports or a correlation set.

    The client process would do pretty much the same - it would use a utility process to transform the canonical format to the service's format and publish the request. It would then use correlation to receive the response and transform it back to the canonical format before returning it to the calling process (synchronously).
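
    To illustrate the shape of that exchange (and nothing more), here is a toy, in-memory sketch. ToyMessageBox, Envelope and ServiceBoundaryDemo are all made-up names; the "message box" here is just a synchronous list of filter/handler pairs, with none of the persistence, hosts or ports of the real thing, and the message types "PriceRequest"/"PriceResponse" are invented for the example.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical message: a type name, a correlation id and a payload.
public sealed class Envelope
{
    public string MessageType;
    public Guid CorrelationId;
    public string Payload;
}

// Toy, synchronous stand-in for the message box: subscriptions are filter + handler pairs.
public sealed class ToyMessageBox
{
    private readonly List<KeyValuePair<Predicate<Envelope>, Action<Envelope>>> subscriptions =
        new List<KeyValuePair<Predicate<Envelope>, Action<Envelope>>>();

    public void Subscribe(Predicate<Envelope> filter, Action<Envelope> handler)
    {
        subscriptions.Add(new KeyValuePair<Predicate<Envelope>, Action<Envelope>>(filter, handler));
    }

    public void Publish(Envelope message)
    {
        // Iterate over a copy so that handlers may subscribe or publish while we loop.
        foreach (var subscription in subscriptions.ToArray())
        {
            if (subscription.Key(message)) subscription.Value(message);
        }
    }
}

public static class ServiceBoundaryDemo
{
    // The "service": activated by a filter on the message type, runs its code
    // in-process and publishes the response back, echoing the correlation id.
    public static void HostService(ToyMessageBox box)
    {
        box.Subscribe(
            m => m.MessageType == "PriceRequest",
            m => box.Publish(new Envelope
            {
                MessageType = "PriceResponse",
                CorrelationId = m.CorrelationId,
                Payload = "price for " + m.Payload
            }));
    }

    // The "client": publishes a request and picks up only the response that carries
    // its own correlation id. (The toy never removes this subscription.)
    public static string Call(ToyMessageBox box, string payload)
    {
        Guid id = Guid.NewGuid();
        string response = null;
        box.Subscribe(
            m => m.MessageType == "PriceResponse" && m.CorrelationId == id,
            m => response = m.Payload);
        box.Publish(new Envelope { MessageType = "PriceRequest", CorrelationId = id, Payload = payload });
        return response;
    }
}
```

    After calling ServiceBoundaryDemo.HostService once, each ServiceBoundaryDemo.Call publishes a request and gets its own response back through the box, without either side referencing the other directly.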

    What would we save? - following this approach for at least some of our internal services can save us the need to serialise the messages over the network; in the web service case we have to go through the message box from the process to the send port anyway, so going through the message box from one process to another would not make a difference, but all the network traffic and the work by the SOAP adapter (which is far from being efficient) can be saved.

    This was a good idea (I thought anyway), but I suspect it won't work, as it has two main flaws (and I will be extremely happy to get some ideas around those) -

    Firstly - both subsystems will need to exist on the same BizTalk group so that they share the same message box and so we could use pub/sub to exchange messages between them (on its own this is not necessarily a problem, but it is the main cause for the next one, which is the big one)

    Secondly - the schemas will have to be shared -

    When you're adding a web reference to a web service from a standard .net project a proxy gets generated for you; that proxy will include a local version of all the classes used by the web service (these will be in YOUR code namespace rather than the service's but will serialize to the same XML).

    Equally - when you add a web reference in a BizTalk project, you get schemas generated so you can create messages to send and receive to/from the web service; these will be in the service's XML namespace as they have to represent the XML supported by it, and here lies the problem.

    If both the service implementation and the client implementation are on the same BizTalk group, the schemas will have to be shared as there's no way to deploy two schemas using the same root node and namespace and we all know that sharing schemas is a bad idea as it strongly couples the implementation together and that pretty much renders the idea useless (this, confusingly I suspect, means we're sharing a class and not a contract).

    Of course one could play around with the idea of having two BizTalk groups and communicating between them, and although you can choose better transports than SOAP for that internal communication I suspect that brings us closer to simply calling the web service and so I'd rather stay with that standard approach.

    Mapper vs. XSLT round 2

    I've received a good question today -

    "we had a little debate in the office today - what is faster - running a map with pure xsl or the standard way with functoids, what you think?"

    As I've blogged before - I'm a big supporter of writing custom XSL and not using the Mapper and Functoids in anything other than the simplest of maps; so - although performance is only one of my arguments - the answer should be obvious.

    Nevertheless I'll take the chance to answer properly again, although I suspect the question is not accurate enough -

    At runtime there's no difference between the two; the Mapper generates XSL (which you can see by "validating" the map in Visual Studio and following the link to the generated XSL file, which appears in the output window), so the question should be, in my view, whether the Mapper can generate as good XSL as a developer could, but as you can imagine the answer really depends on the particular scenario - how many functoids are you using? how are they working together? what's the size of the map? what's its complexity?

    Anyway, in my view there is a bottom line answer to that question and that is that under most real-world scenarios custom written XSL will almost always be better than generated one, but I'll try to explain a little bit more -

    When you're using Functoids in your map you're generally doing one of two things - you're either calling external assemblies or you're adding some XSL lines to perform some actions for you.

    The former one is easier to tackle - if you need external assemblies you can call them from custom XSL as well (as I've explained here); as the Mapper will do exactly the same, the performance impact will generally be identical in both cases (using mapper or custom XSL).
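
    As a rough illustration of hand-written XSL calling into .NET code, here is a plain .NET sketch using XslCompiledTransform and an extension object. Note the assumptions: NameHelper, CustomXslDemo, the 'urn:example:name-helper' namespace and the Person/Contact documents are all made up, and inside BizTalk the external assembly is wired up through the map's own configuration rather than through AddExtensionObject, which this sketch does not show.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

// Hypothetical class standing in for the "external assembly" the XSL calls into.
public class NameHelper
{
    public string ToUpper(string value) { return value.ToUpperInvariant(); }
}

public static class CustomXslDemo
{
    // Hand-written stylesheet; helper:ToUpper resolves to NameHelper.ToUpper at runtime.
    private const string Xslt = @"<xsl:stylesheet version='1.0'
    xmlns:xsl='http://www.w3.org/1999/XSL/Transform'
    xmlns:helper='urn:example:name-helper'
    exclude-result-prefixes='helper'>
  <xsl:output method='xml' omit-xml-declaration='yes'/>
  <xsl:template match='/Person'>
    <Contact>
      <Name><xsl:value-of select='helper:ToUpper(string(Name))'/></Name>
    </Contact>
  </xsl:template>
</xsl:stylesheet>";

    public static string Transform(string inputXml)
    {
        XslCompiledTransform transform = new XslCompiledTransform();
        using (XmlReader stylesheet = XmlReader.Create(new StringReader(Xslt)))
        {
            transform.Load(stylesheet);
        }

        // Bind the namespace used in the stylesheet to an instance of the helper class.
        XsltArgumentList arguments = new XsltArgumentList();
        arguments.AddExtensionObject("urn:example:name-helper", new NameHelper());

        StringWriter output = new StringWriter();
        using (XmlReader input = XmlReader.Create(new StringReader(inputXml)))
        using (XmlWriter writer = XmlWriter.Create(output, transform.OutputSettings))
        {
            transform.Transform(input, arguments, writer);
        }
        return output.ToString();
    }
}
```

    Passing a document such as a Person element with a Name child to CustomXslDemo.Transform produces a Contact element whose Name has been upper-cased by the .NET method; the XSL itself stays a few readable lines.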

    The latter is harder to tackle, as there's no one-rule-fits-all statement one can make - but here's a shot at it -

    The Mapper is a visual, generic, designer that generates code.
    As is always the case with these tools they come with a price, and that price is often the quality of the code generated; now - don't get me wrong - I don't argue that the Mapper is bad, or that it always generates bad, slow XSL; but if you know XSL well, there's no doubt you will write better code than a generator will.

    When you're adding a Functoid that does not call an external assembly you'll be doing one of three things -

  • You will be adding embedded C# code - most Functoids do this, look at the string manipulations as a simple example.

  • You will be adding a template based on input nodes - the Looping Functoid for example.

  • Or - you will be adding XSL structures or functions - the record count or value mapping Functoids for example.

    All three are perfectly fine, and even more so - if you try them out you'll see that the designer does generate quite nice XSL in all cases.

    The problem starts when, and this is inevitable in the real-world, the maps get more complex.

    Once you move out of the playing ground and into real scenarios, the maps get more complicated and the inefficiency of the generated code becomes both more apparent (as multiple Functoids need to work together to achieve the desired output the XSL gets 'uglier and uglier') and that inefficiency becomes a greater problem as it is repeated many times over a large-ish map.

    Bottom line from my perspective - if you (and the rest of the team) feel comfortable with XSL, use it - you will always achieve better scripts than any generator would. If you don't feel comfortable with XSL - learn it! It's easy! (and in the meantime use the Mapper).

    Tuesday, December 25, 2007


    I've never browsed to http://www.topxml.com/ directly and so, although I must have been directed to it hundreds of times through searches in Google and the like, I never actually paid any attention to it.

    Recently though, Mike Stephenson, a great person and a very smart cookie, has pointed them out to me, and not in a very flattering way -

    Mike has found an article he published on his blog appearing in TopXML without any reference to his name; judge for your self - here's the TopXML link and here is the link to Mike's original post

    To make matters worse, when Mike - who's a nice bloke as I've already mentioned - pointed this out to TopXML politely he received the following (rather stupid) response -

    The blogger (you) is clearly listed at the top of the page
    "Blogger : Geekswithblogs.net"

    and at the end of the post we provide another link to you:
    "Read comments or post a reply to : What is returned from my 2 way
    port when a fault is used?"

    I am providing TWO clear links back to you and giving credit to you.
    We're increasing your fame! :)

    Hmm...makes sense of course...(or it would have been if Mike's name was indeed 'geekswithblogs' but I don't believe it is)

    That kind of started me thinking - from a blogger's point of view - one could argue that, assuming your name does indeed appear next to your content when it's being aggregated or replicated in other web sites, there's little harm; if anything - you might be getting more exposure (assuming TopXML has a higher rank in search engines than one's blog).

    On the other hand, seeing my content on sites like TopXML takes it out of its context. It does not appear next to other (possibly related) posts of mine, next to links to other content on my site/blog, next to my about box (and the MVP logo), next to my favourite links and references to other bloggers, or whatever else I chose to put on there.

    Furthermore - if I was trying to make money out of advertisements on my site, it denies me that benefit as well. So, to put it simply - copying other people's content - without permission - in a systematic way (I'm not talking about the odd blog post which refers to someone else's post), with or without referencing the source, is very wrong in my view.

    Wednesday, December 12, 2007

    Exception, Orchestration, Serialisation.

    I was adding a custom exception yesterday to a helper class I’m calling from an orchestration.

    Usually, my exception handlers in the orchestration are quite short; this time, however, I wanted to do a bit more, which included calling a web service when a particular exception is caught.

    While implementing this I learnt something interesting (which, arguably, I should have known a long time ago – just to show how difficult it is to catch up on all the changes in the .net framework) -

    In .net framework 2.0 a Data property of type IDictionary was added to the Exception class, which on its own is not a problem, only that IDictionary is not serialisable and so could have proved rather difficult for anyone using Exception, especially in a BizTalk environment.

    Luckily (but not surprisingly) the .net framework team have implemented ISerializable in the Exception class, which helps, but does cause a small headache to the unsuspecting BizTalk developer (me).

    But first - I have to apologise - again I'm not familiar with all the details around this, and am resorting to pure guesses of a couple of points (will be happy to get more information if you care to enlighten me), still - I'm sure this will be useful to most people...

    When you mark a class as [Serializable], the runtime attempts to call a parameterless constructor to create an instance of the type as it deserialises it; the serialiser will then populate all the members of the class through their public properties (I suspect that this is, partly at least, why Xml Serialisation serialises public members only).

    When working with ISerializable, however, the runtime expects a constructor that takes SerializationInfo and StreamingContext as parameters; it is expected that the constructor will populate the members out of the SerializationInfo collection.

    I believe that the runtime interrogates the type to be deserialised and, once it finds that the type or any type in its inheritance path implements ISerializable it takes the second approach mentioned.

    Not realising the Exception class implements ISerializable, I did not have the expected constructor in my class, which meant that when BizTalk tried to deserialise the object (between the send shape calling the web service and the receive shape expecting the response) it failed, which now explains the error reported in the event log -

    The constructor to deserialize an object of type ‘[custom exception class name here]’ was not found.

    Adding the constructor with the two parameters to my custom exception class allowed it to pass the deserialisation with no errors; however - I was now facing a second problem. After indicating that my class implements ISerializable and adding the required constructor, the members of the Exception class, from which my class inherited, including the Data member, now deserialised correctly; my own class' members, however, did not.

    There are two ways to overcome this - I could have simply mapped my properties (after all I only had a couple of strings to keep with the exception) to the Exception's Data property (have the getter and setter of each property use the collection internally, so that all the data is captured in the Exception base class and serialised with it), or - I could implement ISerializable fully, which really only means -

    1. Firstly - adding my members to the SerializationInfo parameter of GetObjectData:

    public override void GetObjectData(SerializationInfo si, StreamingContext context)
    {
        // let the Exception base class add its own members (including Data) first
        base.GetObjectData(si, context);
        si.AddValue("member1Name", member1Value);
        si.AddValue("member2Name", member2Value);
    }

    2. Secondly - populating the members back in the constructor:

    protected MyCustomException(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
        // read back using the same keys that GetObjectData wrote
        member1Value = info.GetString("member1Name");
        member2Value = info.GetString("member2Name");
    }

    Voila! It all serialises and deserialises ok now. If only I didn't have to spend a whole day to figure this out!
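
    For completeness, the other option mentioned above (keeping the values in the base class' Data collection) might look something like the sketch below. OrderFailedException, OrderId and Customer are made-up names for illustration; the point is only that the properties read and write Exception.Data, so the base class carries them through serialisation and no GetObjectData override is needed. The two-parameter constructor is still required.

```csharp
using System;
using System.Runtime.Serialization;

// Hypothetical custom exception whose extra members live in Exception.Data.
[Serializable]
public class OrderFailedException : Exception
{
    public OrderFailedException(string message) : base(message) { }

    // Still required: Exception implements ISerializable, so deserialisation looks
    // for this constructor on the derived type.
    protected OrderFailedException(SerializationInfo info, StreamingContext context)
        : base(info, context) { }

    // Each property is a thin wrapper over the Data collection, which the
    // Exception base class serialises and deserialises itself.
    public string OrderId
    {
        get { return (string)Data["OrderId"]; }
        set { Data["OrderId"] = value; }
    }

    public string Customer
    {
        get { return (string)Data["Customer"]; }
        set { Data["Customer"] = value; }
    }
}
```

    The trade-off is that anything put into Data must itself be serialisable, and that the values are visible to (and changeable by) anyone who touches the Data collection directly; for a couple of strings that seemed acceptable to me.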


    Monday, December 10, 2007

    Wiki is coming to MSDN!

    This has been talked about for a while now but I've only seen it in action now -
    MSDN now supports wiki in parts - and most importantly for anyone reading this blog (I suspect) - in the BizTalk 2006 R2 documentation.

    Check out this page and see how Eric Stott has kindly pointed out a huge improvement that was mentioned almost as a by-the-way statement in the docs. Way to go Eric!

    So here ya go, now there's an easy way to share thoughts, ideas, additions and corrections to the MSDN content. Brilliant!

    Another easy way to get the list of BizTalk processes

    When you have more than one host instance running on a box, you need to find the correct BizTalk instance to attach to when you wish to debug a component.

    Last week Oleg found yet another elegant and simple way to do this -

    tasklist /SVC /FI "IMAGENAME eq btsntsvc.exe"

    Run this from a command line or add it as an external tool in Visual Studio (to get the result in the output window) and you will get a list of all the BizTalk hosts with their description and process id, so you'd know which one to attach to.

    Thursday, December 06, 2007

    So is this what web development's going to look like?

    Yesterday Microsoft Labs announced the experimental toolset codenamed "Volta".

    Using the name of another pioneering inventor (Alessandro Volta), what is Microsoft trying to hint at?

    Well - Volta is described as "an experimental developer toolset that allows developers to build standards-conformant, multi-tier web applications using established .NET languages, libraries and development tools"

    From the little I've seen, it brings web development even closer to any "classic" .net development, bringing even greater separation between code and presentation on the code side, while at the same time allowing an even smoother user experience, and - from a developer's perspective - it allows writing code first, and deciding whether it should be executed on the client or on the server later.

    The idea is that you write your code, with everything running on the client (which makes it easier to debug), and - as you get closer to the release - you move some attributes around and the code will be refactored to split execution between the client side and the server side.

    Naturally this means that Volta kindly takes care of all the communication and security code between the tiers, which removes the need for the developer to deal with all that "plumbing"; they even throw in some instrumentation code that lets you view the trace of the execution between the client and the server using the WCF Service Trace Viewer Tool.

    So - in a somewhat simplified statement - Volta is about refactoring your code to split it between tiers, and is then about hosting the server code.

    Hosting, I believe, is done at this point in a "Volta Server" executable; I'd expect this to evolve significantly into much more robust hosting options as Volta matures, and as MS has already done a lot in this area quite recently it's not difficult to see where this is going.

    The refactoring is done on the MSIL code generated during the build of your project, and not on the actual code, which is a nice approach (and means you can definitely use any .net language, as they should all end up with the same MSIL, right?! :-) )

    The goal of Volta, and the reason most of this is happening "behind the scenes", is to reduce even further the amount of things we have to deal with when building multi-tier web applications. Is this going to work?

    Like most developers, I guess I'm a little bit of a control freak when it comes to my code; I don't even like code generators and wizards, so the thought of something taking my code and fiddling with it - splitting it, adding a bit of this and a bit of that, and generating client side javascript to describe my classes - is a bit scary; but then again - isn't that just a normal phase one has to go through in the face of innovation? Well - I guess it depends on how good the innovation is :-) I'll have to wait and see how this evolves.

    There's no doubt in my mind that the code generation bit, and probably the client side more than the server side, is the Achilles heel here, and with Volta only being out a few hours, people have already started complaining that the libraries are too big, they are too slow etc.

    What's important to remember, when looking at all of this, is that this is still very much experimental - as far as I know (but I don't know much :-) ) there isn't a product roadmap yet that has Volta as a clear part of it.

    Think of this preview as a way to get involved early with stuff MS are playing with, and if you do - write about it, get as much feedback as you can out there; it will only help MS get a feeling for what's working and what's not working, to make sure that when it does find its way into a product of some sort it will deliver.

    So - it is not surprising that, even by Microsoft's own admission, Volta is not yet optimised; the javascript generated is not the most efficient or most elegant that can be created, and probably the same can be said of the server side code and the communication layer (but I haven't looked, so I can't possibly comment).

    This will definitely improve as MS keeps working on this, and as feedback is provided by us.

    Go and play!


    Saturday, December 01, 2007

    Techies needed for work at Microsoft

    Microsoft are recruiting techies for the centre in Dublin; might this be for you?
    Check out www.joinmicrosofteurope.com