Yossi Dahan [BizTalk]


Sunday, March 30, 2008

Why the New Configured Port Wizard is confusing

It is generally known, I believe, that BizTalk has somewhat of a steep learning curve.

BizTalk Server is by no means a simple product, but that's OK because we tend to do complex stuff with it :-)

Over the years, from version to version, Microsoft has invested a lot in making the server easier to learn and use. Some improvements were made in the UI, but most came through investment in the documentation, tutorials, examples and "community content"; I do believe this has made BizTalk much more accessible.

There are still quite a few things in BizTalk which, in my experience, tend to confuse new starters; one of them is the concept of orchestration ports and port types, and the "New Configured Port Wizard".

I find that too many developers use BizTalk without fully understanding the fact that it is a strongly typed system and without understanding the relationship between ports and port types (and messages and [multipart-]message-types).

This is not helped by the fact that you can quite easily develop an orchestration without explicitly defining either.

The New Configured Port Wizard defaults to creating a new port type whenever you configure a port; I suspect many developers never give it a second thought and simply create a type for each port they use. This way you can easily create quite a few copies of the same type and not even recognize it.

Furthermore, the wizard creates both the port and the port type, but it only completes the port type definition (the message-type portion) when you connect the port to a receive/send shape that has a message configured. I believe this confuses people further, for two reasons: 1) it does not make it clear that the message type is indeed part of the port type definition (alongside the access modifier and the message exchange pattern), and 2) because the port type definition only gets completed when you create the link, many developers do not understand why they cannot then replace the message in the receive/send shape, or connect another receive/send shape to the recently created port (the message types may not match).

If I had to guess, I would say the wizard works this way on the assumption that it lowers the entry barrier to developing BizTalk processes (people can develop processes without understanding the concept of types). In my view, though, this is exactly where the problem lies: developers produce code they do not fully understand, with all the problems that creates.


Monday, March 24, 2008

2 BizTalk Groups, 1 SQL Server

A couple of months ago we needed to deploy our existing solution to a new BizTalk group to serve a different client.

For various reasons we decided to dedicate a BizTalk server to this client, so we don't have to worry about the impact of deploying changes on our existing clients etc., but to share the SQL server.

Obviously this is not the ideal setup, but as the new client is in a completely different time zone to the existing ones, and neither environment is (yet) expected to have a high volume of traffic, we could be quite confident that the SQL server can support both environments (one during the day, the other at night).

Anyway, BizTalk is quite happy to support that: you can easily configure it to use different databases on the same DB server using the configuration wizard, and it all works quite well. Nothing to report.

That is - until we wanted to deploy our BAM activities on the server.

For the sake of this discussion let's assume that when we've configured the first BizTalk group we left the default 'BAMPrimaryImport' database name for the BAM main database, and that for the second group we use 'BAMPrimaryImport2'.

Deploying BAM activities using bm.exe is usually a no-brainer: the tool takes care of all the hassle of creating tables, relationships, views, indexes etc., as well as registering them all in the BAM repository.

However, the tool also generates an SSIS job for each activity for purge-and-archive purposes; these jobs are simply named after the activity they support and are then deployed to the SSIS server (which, as far as I know, is not partitioned by an entity such as a database), and this is where we faced a problem:

As the first group already had our BAM Activities deployed, the corresponding SSIS jobs were already deployed to the server with the generated names; when we came to deploy the activities for the second group, bm.exe went on to try and generate the SSIS packages using exactly the same (generated) names and failed saying such packages already exist.

As far as I know there is no way to control the names of these packages bm.exe would use and so we were a bit stuck.

Fortunately, renaming the existing jobs was fairly easy in SQL Server Management Studio and, as they are not referred to by anything (other than a schedule to run in SQL), fairly safe. So what we did was rename the packages created for the first group, so that bm.exe could create the packages required for the second group under the old names without failing.


Saturday, March 22, 2008

Orchestration Statuses

I was surprised to find out there's some confusion around the possible statuses a deployed orchestration can be in, but after spending 20 minutes browsing MSDN I realised I simply couldn't find one clear description of those, so here's my attempt -

An orchestration deployed into BizTalk server can be in one of four states -

  1. Unenlisted (unbound) - the process has been deployed to the server, but is unconfigured (host and/or port bindings are not set), unsubscribed and not running.
  2. Unenlisted (bound) - the process is configured, but is still unsubscribed and not running.
  3. Enlisted (stopped) - the process is fully configured and its subscriptions have been created, but it is stopped.
  4. Started - the process is ready to run (and will do so when activated by a message).

The first two are obvious, I should think, and so, probably, is the last. The Enlisted state, however, may cause some confusion: when enlisted but stopped, the orchestration's subscription is active in the message box, meaning it will be evaluated for every message published. If the evaluation succeeds, the message will be queued for this orchestration, but the orchestration will not start (as it is in the stopped state). The messages will remain waiting until the orchestration is started or they are terminated.
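
To make the difference between the states concrete, here is a small C# sketch of the rules described above. It is only a model for illustration: the enum, the class and the method names are made up for this post and are not part of any BizTalk API.

```csharp
using System;

// Hypothetical model of the four states described above (not a BizTalk type).
public enum OrchestrationState
{
    UnenlistedUnbound, // deployed; host and/or port bindings not set
    UnenlistedBound,   // configured; no subscriptions yet
    EnlistedStopped,   // subscriptions created; not running
    Started            // subscriptions created; running
}

public static class OrchestrationStateRules
{
    // Subscriptions exist only once the orchestration has been enlisted.
    public static bool HasSubscriptions(OrchestrationState state)
    {
        return state == OrchestrationState.EnlistedStopped
            || state == OrchestrationState.Started;
    }

    // What happens to a published message that matches the subscription.
    public static string OnMatchingMessage(OrchestrationState state)
    {
        if (!HasSubscriptions(state))
            return "no subscription, so nothing is queued for this orchestration";

        return state == OrchestrationState.Started
            ? "the orchestration is activated"
            : "the message is queued until the orchestration is started or the message is terminated";
    }
}
```

Calling OnMatchingMessage with EnlistedStopped shows the case that tends to surprise people: the message is accepted and waits, even though nothing is running.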


Sunday, March 16, 2008

Extracting values from a message in the pipeline

Another question I was asked recently is how to extract a field’s value from a message in a pipeline.

One example I’ve seen goes something like this –

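using System.Xml;               // XmlTextReader
using Microsoft.BizTalk.XPath;  // XPathReader, XPathCollection

XmlTextReader xtr = new XmlTextReader("books.xml");
XPathCollection xc = new XPathCollection();
// Add returns an index that identifies this xpath within the collection
int onloanQuery = xc.Add("/books/book[@on-loan]");
XPathReader xpr = new XPathReader(xtr, xc);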

Then, in order to get the value the stream needs to be read -

while (xpr.ReadUntilMatch())
{
    Console.Write("{0} was loaned ", xpr.GetAttribute("on-loan"));
}

You can, of course, replace the Console.Write with any action required on the data located; you should also check exactly which xpath was hit (if you have more than one in the collection).

The problem with this, as most of you may well know, is that it requires the component to read the entire stream during its execution.

This actually has two disadvantages. The first is performance: assuming this is a receive pipeline, BizTalk will have to read the stream anyway to write the message to the message box (in a send port the send adapter will do the same). If we could avoid the need to read the stream ourselves, and simply raise events from the stream as BizTalk’s internals read it, we would significantly improve the performance of our pipeline.

Secondly – since we’ve read the stream, we may have a problem now when we go back to return the stream to the pipeline; some streams we might receive are not seekable (such as anything coming from the HTTP or SOAP adapters) and so we can’t simply rewind them and we surely can’t return a stream pointing at the end of the message to the pipeline. It is enough to read Charles Young’s great series of articles about receive pipelines in BizTalk 2004 (http://geekswithblogs.net/cyoung/articles/12132.aspx) to see what sort of issue you might face.

Although some of these issues have since been addressed, the underlying problem remains: having read the stream, we must return a “touched” stream to the pipeline, which may or may not cause issues. As we can’t always be sure in what context our component will be used (can you assume send vs. receive? Can you assume a particular adapter will be used? Can you assume port maps will not be used?), we should look for a better way to do this. Luckily such a way exists that helps in most circumstances: BizTalk’s XPathMutatorStream.

Fortunately for me, Martijn Hoogendoorn already wrote about it a couple of years ago (check his blog entry); I just thought I’d point this out.
Obviously this stream was designed to allow replacement of values, but nothing prevents you from setting the output parameter to the input parameter and thus avoiding any changes.

As you can see in the post, using this stream means you never actually need to read the message’s stream yourself; you simply wrap the original stream with an instance of the xpath mutator stream, add the xpaths you want to the collection and return the wrapped stream with the message. When BizTalk reads the message’s stream it will now be reading your stream, which raises the appropriate event whenever one of your xpaths is hit. Marvellous!
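
The wrap-and-return idea itself is easy to show without any BizTalk assemblies. The sketch below is not XPathMutatorStream and does no xpath matching; ObservingStream and its DataRead event are hypothetical names I made up to illustrate the pattern: a forward-only wrapper that never reads ahead on its own and only raises an event as whoever consumes it pulls data through.

```csharp
using System;
using System.IO;

// Hypothetical pass-through wrapper (illustration only, not a BizTalk class).
// It observes the bytes the downstream consumer reads; it never reads by itself.
public sealed class ObservingStream : Stream
{
    private readonly Stream inner;

    // Raised with (buffer, offset, count) for each chunk handed to the consumer.
    public event Action<byte[], int, int> DataRead;

    public ObservingStream(Stream inner)
    {
        if (inner == null) throw new ArgumentNullException("inner");
        this.inner = inner;
    }

    public override int Read(byte[] buffer, int offset, int count)
    {
        int read = inner.Read(buffer, offset, count);
        if (read > 0 && DataRead != null)
            DataRead(buffer, offset, read);
        return read;
    }

    // Forward-only and read-only, so it works over non-seekable streams too.
    public override bool CanRead { get { return true; } }
    public override bool CanSeek { get { return false; } }
    public override bool CanWrite { get { return false; } }
    public override long Length { get { throw new NotSupportedException(); } }
    public override long Position
    {
        get { throw new NotSupportedException(); }
        set { throw new NotSupportedException(); }
    }
    public override void Flush() { }
    public override long Seek(long offset, SeekOrigin origin) { throw new NotSupportedException(); }
    public override void SetLength(long value) { throw new NotSupportedException(); }
    public override void Write(byte[] buffer, int offset, int count) { throw new NotSupportedException(); }
}
```

In a pipeline component the wrapper would take the place of the body part's original stream, and the handler attached to the event would do whatever the component needs with the data; the component returns immediately and the actual reading happens later, when the stream is consumed.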

From a performance point of view, I have not looked at the implementation of the stream, so I can’t say for sure that it is faster than reading the stream completely in your pipeline, although my gut feeling says it is. From a robustness perspective, though, this solution is definitely much better, as it eliminates all the problems one might encounter by reading a message’s stream in the pipeline.

When is this solution not good? When you need an entire XmlNode. This stream can return a single value (element or attribute, I believe), but if you need the entire contents of an XmlNode (with child elements or various attributes) it will not serve you well.


Saturday, March 15, 2008

Creating a message "from scratch"

A question I get asked repeatedly is how to create a message in an orchestration “from scratch”, i.e. – when the message is not meant to be created from any other message.

Initially one might think this does not make sense, but I actually find myself doing it quite often. Two typical scenarios: one is when a message has to be created before the process branches, to satisfy the compiler (and avoid the “use of unconstructed message” error); the other is when one needs to return a message that simply contains a few values obtained from, say, a database call or some calculation.

In this case often it is simpler to create the shell of the message with the elements/attributes required and then use xlang’s xpath function to push the values into the right places.

Whatever the scenario is, there are, I believe, two options to create empty messages (corresponding to the two options to create messages in orchestrations in general, actually) –

1. Using a map
2. Using a message assignment shape (and a helper method).

The first one is quite obvious: you create a map, pick any message you may have in the process (you are likely to have at least one message in your process already) and use that as your input message, with the message you want to create as your output message. You then face two alternatives. The first is to use xsl: you create an xsl file that effectively has the XML you want to create hard-coded in it, and instruct the map to use that xsl.

The input message is completely ignored (and so is completely irrelevant) and the output message always looks exactly the way you want it to. You are not sensitive to any changes in the schema of your input message.
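
To show what "completely ignored" means, here is a small sketch that runs such a hard-coded xsl outside BizTalk, using the .NET XslCompiledTransform class purely for demonstration (a BizTalk map would run the xsl for you). The Response element, its children and the namespace are made up for this example.

```csharp
using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class HardCodedXslDemo
{
    // The single template matches the document root and emits fixed XML;
    // nothing is ever selected from the input document.
    private const string Xsl =
        "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" +
        "<xsl:output method='xml' omit-xml-declaration='yes'/>" +
        "<xsl:template match='/'>" +
        "<ns0:Response xmlns:ns0='http://example.org/response'>" +
        "<Status>Unknown</Status><Total>0</Total>" +
        "</ns0:Response>" +
        "</xsl:template>" +
        "</xsl:stylesheet>";

    public static string Transform(string inputXml)
    {
        var transform = new XslCompiledTransform();
        using (var xslReader = XmlReader.Create(new StringReader(Xsl)))
            transform.Load(xslReader);

        var output = new StringWriter();
        using (var input = XmlReader.Create(new StringReader(inputXml)))
        using (var writer = XmlWriter.Create(output, transform.OutputSettings))
            transform.Transform(input, writer);

        // The writer has been disposed (and so flushed) by this point.
        return output.ToString();
    }
}
```

Passing two entirely different input documents to Transform gives the same output both times, which is exactly why it does not matter which message you pick as the map's input.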

The other alternative, which I like less, is to actually use the mapper; in this alternative you would probably map the root node of your input message to the root node of your output message and then set up default values in the output message for any element/attribute that you wish to include in your output message.

The reason I believe this is not as good as the first alternative is that it is less obvious what the output message will look like; one has to follow up on the nodes to see what is set (or test the map) to see the output, while in the xsl alternative one look at the xsl is usually enough to show what the output is going to look like.

Whichever alternative you choose, I used to think that using a map is a better option than using the message assignment shape (which I will describe shortly). Partly this was because I thought a map is the more standard way to create messages, so it is more obvious, looking at the process and the project, where such constructions take place; but mostly I believed that if you’re using xsl files to create the output, it is very easy to spot, read and change them in the solution when necessary (simply find the xsl files and change them, no need to look for anything else).

Thanks to several people, but mostly to Ben Gimblett, with whom I work on one of the projects I'm involved with and who insisted on not following my advice and on using helper methods to create messages, I now agree that using helper classes is the better way. This is mostly because you don’t have to use a dummy input message (as you do in the map), which, I have to agree, can get quite confusing to anyone trying to understand the process; but also because, I suspect (but have not tried to prove), a helper class will perform better than the mapper option.

When using a helper class you again have two alternatives –

You could have a message assignment shape in which you call a method that returns a .net type (class) that has been generated (using xsd.exe) from your schema.
All you need to do in your method now is create an instance of that generated return type – populate whatever members you require and return it.

In the assignment shape you assign the return value from the method to your message[part]; BizTalk takes care of the serialization and converts the class to the schema. Because the class and the schema both represent exactly the same thing, the serialization works just fine.
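
A minimal sketch of this first alternative might look like the code below. The Response class is a hand-written stand-in for what xsd.exe would generate from your schema, and the class, member, namespace and method names are all made up for this example. The ToXml method is there only so the result can be inspected outside BizTalk; inside an orchestration BizTalk does the serialization itself.

```csharp
using System.IO;
using System.Xml.Serialization;

// Stand-in for an xsd.exe-generated class (names and namespace are hypothetical).
[XmlRoot("Response", Namespace = "http://example.org/response")]
public class Response
{
    public string Status { get; set; }
    public int Total { get; set; }
}

public static class MessageFactory
{
    // Intended to be called from a message assignment shape; the returned
    // object is what gets assigned to the message (or message part).
    public static Response CreateResponse(string status, int total)
    {
        return new Response { Status = status, Total = total };
    }

    // For inspection outside BizTalk only: serialize the object the same
    // general way, with XmlSerializer, to see the XML it represents.
    public static string ToXml(Response response)
    {
        var serializer = new XmlSerializer(typeof(Response));
        var writer = new StringWriter();
        serializer.Serialize(writer, response);
        return writer.ToString();
    }
}
```

The assignment shape then needs only a single line that assigns the result of MessageFactory.CreateResponse(...) to the message being constructed.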

The benefit of this approach over the mapper option is mostly that there's no need for any dummy input message, no need to write xml or xsl; only very simple (and quite minimalist) code is required.

The downside – you need to generate those .net classes to represent any schema you wish to return, and maintain them as your schemas evolve.

The second alternative is simpler on the one hand as it does not involve generating and maintaining classes; it does, however, require a bit more wiring –

It starts the same way – a message assignment calls a method whose return value is assigned to the constructed message[part].

The difference is that the method does not return a strong type; instead it returns an XmlDocument whose contents are loaded from a compiled resource within the assembly.

The function takes in the name of the xml that needs to be used to create the constructed message, retrieves the resource from the resources in the assembly, loads it into an xml document and returns it to the caller.
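
Such a function could look roughly like the sketch below. It assumes the xml files are added to the helper project with their build action set to Embedded Resource; ResourceMessageFactory and Load are names I made up for this example.

```csharp
using System;
using System.IO;
using System.Reflection;
using System.Xml;

public static class ResourceMessageFactory
{
    // resourceName is the full manifest resource name, which is typically
    // "<default namespace>.<folder>.<file name>".
    public static XmlDocument Load(string resourceName)
    {
        return Load(Assembly.GetExecutingAssembly(), resourceName);
    }

    public static XmlDocument Load(Assembly assembly, string resourceName)
    {
        using (Stream stream = assembly.GetManifestResourceStream(resourceName))
        {
            // GetManifestResourceStream returns null when the name is not found.
            if (stream == null)
                throw new ArgumentException(
                    "No embedded resource named '" + resourceName + "'.", "resourceName");

            var document = new XmlDocument();
            document.Load(stream);
            return document;
        }
    }
}
```

Failing loudly on an unknown name is a deliberate choice here: a typo in the resource name then surfaces at the assignment shape instead of as an empty message further down the process.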

I find that this last approach works best for me in most cases – all the generated xmls are in one place which makes them easy to maintain (which I liked in the xsl option), there’s very little co-ordination that’s required – only the name of the xml file (or any other key one wishes to use) must be known to the caller and the xml resource should match the schema – but as it is stored in one location AS XML this is very easy to achieve and maintain.

Update: I've done some performance comparison between the different methods, the results of which can be read here
