Yossi Dahan [BizTalk]


Wednesday, October 29, 2008

About "Dublin"

Microsoft announced "Dublin" back in September, but up until now (PDC) there has been very little information about what it is actually going to look like.

Over at PDC we are finally able to attend some sessions, visit the various .net booths, get a first-hand look at the "bits" and ask questions about the new technologies.

However, at the time of writing this I have not yet attended any Dublin sessions, which really means anything I write here is speculation based on stuff I've seen and heard "in the corridors of PDC".

I have, however, spent enough time at the WCF/WF booth to probably make them call me various names behind my back (I'm sure) :-) and I think I have got at least an idea of what Dublin is.

Over the next couple of days we will hear and see more, as well as get the chance to play with "the bits" handed out yesterday afternoon, so I'm hoping to be able to give a fuller view of what it is (and correct any inaccuracies posted here now).

So - clearly at this point I can only speculate based on the stuff I've seen; think of this post as me sharing with you the (slow) process of learning what Dublin is (or, of course, you can stop reading now).

"Dublin". to begin with it is an "add on" (or extension, if you prefer) to IIS manager; once installed you get another set of options when administrating a virtual directory.

These options, as you can imagine, are directly related to configuring, managing and monitoring WAS-hosted WCF services; and so here's the first important point - Dublin is not a new host, it is a management layer on top of the existing host - WAS.

Another point comes out of this - and again I'm speculating, mostly trying to make sense of the stuff I've seen over the last couple of days - if "Dublin" is not a new host, where does WF fit in? Well - MS are pushing the WCF+WF message really hard (or WCF-activated Workflow).

Until now I mostly thought of WF as a workflow capability one can add into one's application as an integral process; sure, WF can work with WCF very well and there is, and has been for a while now, a good story around this and several examples; but, I guess coming from my BizTalk perspective, I thought that if you needed a long-running, WCF-activated process you'd use BizTalk; now MS are pushing a different alternative - a WF+WCF solution hosted in WAS.

The thing is that this is not a new option; as I've said, it has been around for some time now; the difference is mostly strategic, I think - there's still a lot of place for BizTalk (more than I thought last week, admittedly) - but if you need a fairly lightweight solution for WCF-activated workflow this is definitely a good option.

Back to "Dublin" - if WCF + WF have been around for a while - what is new? well - as I've suggested the main feature you'd see or hear about is administration - using Dublin's extension to IIS you would be able manage your deployed services more easily; this includes things like configuring the tracking and persistence databases, configuring tracking settings on your services (and workflow), and - I'm told - possibly configuring a lot of aspects of your endpoints (although this was not demonstrated).

So - one way to think about it, and I hope I'm not doing it any injustice, is as an improved and extended WCF configuration editor embedded into IIS (as well as supporting some aspects of WF configuration).

Another aspect, possibly one with a higher impact, is the ability to view long-running instances; MS expect your services to activate WF (there's little benefit in using the Dublin features otherwise, although the configuration tools are still very much relevant).

As your workflow runs it may wait for external events, sometimes for minutes (or hours, or days), in which case it is more than likely (one would hope) that it will get persisted using a persistence service; Dublin has a UI over that service (or is it over the database? in which case it is limited to the out-of-the-box persistence implementation) which allows you to view all your running (and "dehydrated") workflows; this is something we BizTalk guys pretty much take for granted, but it is a blessing for anyone serious about using WF.
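To make that a little more concrete, here is a minimal sketch of what the workflow host does underneath with the WF 3.5 APIs (the connection strings and database names are my own assumptions, nothing Dublin-specific): the out-of-the-box SQL persistence and tracking services write to the databases that a tool like "Dublin" could then surface in its UI.

using System;
using System.Workflow.Runtime;
using System.Workflow.Runtime.Hosting;
using System.Workflow.Runtime.Tracking;

// A sketch only - in a WAS-hosted scenario the runtime would be created for you,
// but the persistence and tracking services attached are the same idea.
public class HostSetup
{
    public static WorkflowRuntime CreateRuntime()
    {
        WorkflowRuntime runtime = new WorkflowRuntime();

        // Persist ("dehydrate") idle workflow instances to SQL Server.
        runtime.AddService(new SqlWorkflowPersistenceService(
            "Data Source=.;Initial Catalog=WFPersistence;Integrated Security=True",
            true,                       // unload instances when they become idle
            TimeSpan.FromMinutes(2),    // instance ownership duration
            TimeSpan.FromSeconds(30))); // polling interval for expired timers

        // Record tracking events so the history of each instance can be queried later.
        runtime.AddService(new SqlTrackingService(
            "Data Source=.;Initial Catalog=WFTracking;Integrated Security=True"));

        runtime.StartRuntime();
        return runtime;
    }
}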

If you have WF tracking enabled you can drill down into the tracking data to see exactly what the workflow has been through already, which is very useful, but at the moment it is a bit rough - you effectively get the contents of the database displayed on screen, nothing like the visualisation WCF tracing has to offer.

I'm told that there are thoughts (or was it concrete plans?) to take this area further and look at combining the WCF and WF tracing/tracking, offering a better, consolidated view of the data, one that actually helps make sense of it all; but that is not there in this very early version.

There's a bit more to the administration capabilities, but that gives you, I hope, an idea of what's on the cards - better management of WCF+WF scenarios on WAS.

In addition to that a few other, quite cool, features are talked about; for example, "Dublin" introduces the ability to add a forwarding service to the solution; this is a pretty bog-standard WCF service that gets added to your solution as an SVC file (specific to the scenario you are building), but it uses an implementation provided by MS internally (in a referenced assembly, presumably); I'm told you would configure some rules that will route incoming requests to your back-end services (currently these rules are configured through the config file, but clearly one can imagine this being managed through the IIS administration console as well).

Basically the service exposes an endpoint which you can configure; this would be the endpoint exposed externally to your services' consumers; on the other end it would consume your back-end services using the same or other bindings (power comes with responsibility!); in between it would run some logic provided by MS that evaluates some rules (provided by you) to determine where the request should go; I'm not sure what these rules will include at this point, but the sample I've seen included configuration with xpaths to elements in the request and expected values, which suggests support for content-based routing.
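As an illustration only - the real forwarding service is implemented by MS and driven by rules in configuration, so the contract, xpath and endpoint names below are purely hypothetical - this is roughly how content-based routing can be done today with a generic, "catch-all" WCF contract:

using System.ServiceModel;
using System.ServiceModel.Channels;
using System.Xml.XPath;

// Hypothetical sketch: a pass-through contract that accepts any action.
[ServiceContract]
public interface IPassThrough
{
    [OperationContract(Action = "*", ReplyAction = "*")]
    Message Process(Message request);
}

public class ForwardingService : IPassThrough
{
    public Message Process(Message request)
    {
        // Read the body once and evaluate an xpath against it to pick a destination.
        XPathNavigator body =
            new XPathDocument(request.GetReaderAtBodyContents()).CreateNavigator();
        XPathNavigator match = body.SelectSingleNode("//*[local-name()='OrderType']");
        string endpointName = (match != null && match.Value == "Priority")
            ? "PriorityBackEnd"     // hypothetical client endpoint names from config
            : "StandardBackEnd";

        // Re-create the message (the original body has been consumed) and forward it.
        Message forwarded = Message.CreateMessage(
            request.Version, request.Headers.Action, body.ReadSubtree());
        ChannelFactory<IPassThrough> factory =
            new ChannelFactory<IPassThrough>(endpointName);
        try
        {
            return factory.CreateChannel().Process(forwarded);
        }
        finally
        {
            factory.Close();
        }
    }
}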

Again - this brings a very powerful feature from the BizTalk world to the WCF/WF world, which is a blessing, but with the lack of a message context and pipelines in WCF things are bound to look a little bit different.

This mechanism might also be useful for handling correlation scenarios when load balancing across a server farm, but I'm not sure whether there isn't a separate built-in mechanism for that.

I have not yet seen it, but I understand that "Dublin" will introduce some support for load balancing, mostly around long-running scenarios where requests have to resume with previous state; I understand that this might be done by routing the request to the machine holding the state, or moving the state to the machine processing the request (or both?); details around this seem not to have been finalised (or made public) yet.

As I've said, I'll try to provide more details as I find them and as I start to play around with the copy I received yesterday (which won't happen before next week, realistically; I need to get *some* sleep!). My bottom line for now: if you were expecting a revolution in how you will be designing, configuring and, mostly, hosting your WCF and WF solutions (over and above the changes to those technologies in .net 4.0) you might think this is bad news, but the reality is that, quite simply, no revolution was needed - you could do all this stuff before, and now it just got easier. Mostly, in my view, Microsoft are signalling the direction they wish to see these technologies go; sure, there's a lot of place for WF inside your applications; "Dublin" shows there's a lot of place for it outside them as well.

One last note - there's a lot more everyone needs to digest as a result of the various announcements at PDC (mostly Azure and the cloud services platform), and these are strongly related: WCF-activated WF hosted in WAS, the forwarding service with its content-based routing capabilities, load balancing and better administration are key to utilising the .net services in the cloud and extending your workflow out of your app, into your enterprise and into the cloud. Fascinating stuff!


Monday, October 27, 2008

Lots more information is now published on Oslo

Now that the main announcements at PDC are happening, MS are starting to release a lot more information about it all.

Check out http://modelsremixed.com/ as well as http://msdn.microsoft.com/en-us/oslo/default.aspx


At PDC MS just announced Azure and a set of online services

I'm lucky enough to be attending PDC this year, where Microsoft have just announced Windows Azure - the O/S for "the cloud" - as well as a set of online services to be released.

Both are very big and very exciting and, naturally, very much related; as Ray Ozzie said, one thing you could clearly say about Microsoft is that they have always been one of the biggest clients of their own technologies, so when they are talking about releasing an O/S for the cloud and a set of fairly extensive online services (yet to be seen) you can expect a big correlation between the two, each influencing the other over the next few months (and beyond).

Azure, to be used by Microsoft in its own data centres and to be made available to the paying public via commercial agreements based on a combination of resources required and SLAs agreed (and met!), would allow companies to deploy their solutions (a web app was one thing briefly demonstrated) to "the cloud" or - if you prefer a more concrete definition - Microsoft's data centres, first in the US and then worldwide. Through a portal-like admin console you can deploy a solution developed and tested on your local development environment, not much differently from any other project you would have done before; this is crucial if adoption is to be wide - and it seems Microsoft are keen, and on the right track, to keep familiarity (and thus productivity) high, as well as, obviously, integration with existing tools.

Commissioning more resources is a case of tweaking settings on the portal (and dishing out cash, of course).

It will be very interesting to see how this gets adopted outside Microsoft, the key motivation being, of course, providing scalability and redundancy to deployed applications at a fraction of the cost otherwise required, as well as high flexibility in both these areas (supporting peak times, for example); but also, quite possibly, simply a lower cost of hosting and running the applications (for smaller businesses?).

Even more interesting is the idea of online services - obviously these will all be hosted in the same data centres running on the same O/S, so all the questions around those apply, but another layer of considerations is added - what will the capabilities of all these services be? How flexible will they be? How will they perform, and what will the learning curve be like? Can Microsoft be trusted enough for companies to safeguard possibly their most precious data - and processes - in its data centres?

Ray Ozzie mentioned we're entering the fifth generation of software, the era of the "web tier"; there has been a lot of hype about "the cloud" recently - some of it good, some of it bad - and all of it suggests that we're on the brink of a big change.

I've decided to step out of the "announcements" streak and look more closely at what, I would imagine, would be the first question everyone would (or at least should) ask: how will this be secured?

I'm expecting that Kim Cameron's session on the "identity roadmap for software + services" will provide a good window into some of the aspects that need to be considered, and as I've been doing quite a lot of work recently on federated identity (posts to come) I'm particularly interested in this topic at the moment; enough to convince me to skip the "Lap around cloud services" happening next door.

Stay tuned.


Wednesday, October 22, 2008

Business Rules Engine - Have your say!

Microsoft have published a survey regarding the Business Rules Engine; what do you like about it? What don't you like? How do you use it? Have your say here.


Saturday, October 11, 2008

Fun with Message Creation in BizTalk

Back in March I posted this entry about creating messages "from scratch" in BizTalk.

The post started a bit of an online discussion and a slightly more intensive offline discussion about the various ways to create messages and the differences between them.

As part of that discussion, Randal van Splunteren and I exchanged some emails, and Randal took the time and effort to create a test solution to compare the performance characteristics of the various methods, which I helped validate.

Randal has been kind enough to let me summarise our findings in this blog (and it only took me 6 months... but I have my excuses), so here it is -

The scenario we've used to test is as follows -

There is one main orchestration that takes in a ‘command’ message using a file receive location; in this command message you can define the method to create a message:

  • Map with Defaults (1)
  • Map with xsl (2)
  • Assignment with serialization (3)
  • Assignment with resource file (4)
  • Using undocumented API (5)

The first four options create messages according to the four methods I described in my blog post; the fifth one uses the CreateXmlInstance API suggested by Randal as a comment on my original post.

In the command message you can also set the number of messages that must be created;

Finally you can set whether the method should use caching; we've implemented a very simple caching mechanism for the assignment and undocumented API methods (caching the generated instance in all three methods so it can be re-used in subsequent calls); for the map methods the caching parameter is ignored because BizTalk has its own caching for those.
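This is not Randal's actual code, but a sketch of the sort of "very simple caching mechanism" described above: the generated instance xml is kept in a static dictionary keyed by creation method, so subsequent calls skip the expensive generation step and just load a fresh XmlDocument from the cached string.

using System;
using System.Collections.Generic;
using System.Xml;

public static class InstanceCache
{
    private static readonly Dictionary<string, string> cache = new Dictionary<string, string>();
    private static readonly object syncRoot = new object();

    public static XmlDocument GetOrCreate(string method, Func<string> createInstanceXml)
    {
        string xml;
        lock (syncRoot)
        {
            if (!cache.TryGetValue(method, out xml))
            {
                xml = createInstanceXml(); // expensive path: serialisation, resource file or API call
                cache[method] = xml;
            }
        }
        XmlDocument document = new XmlDocument();
        document.LoadXml(xml);             // always hand back a fresh XmlDocument instance
        return document;
    }
}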

When a particular test is finished the main orchestration writes out a ‘report’ message (again using file adapter) which contains the number of elapsed ticks the test took.

I ran all the scenarios 5 times and averaged the results; between each test I restarted the host to get as close to a like-for-like comparison as I could, so these numbers do not reflect the true runtime performance of a live server, only the difference between the approaches; initially I ran all the tests creating 1 message at a time, here are the results:

msgs  Map using defaults  Map using xsl  Assign using serialisation  Assign using resource  Assign using API
1     13,243,663          12,687,314     8,153,346                   8,135,461              36,374,565
1     13,385,005          12,888,630     6,905,139                   8,620,287              36,468,805
1     12,837,338          13,943,338     9,272,362                   8,815,033              37,723,069
1     15,630,602          13,298,954     6,679,173                   8,027,708              35,877,260
1     12,729,576          12,765,337     7,113,975                   9,174,668              36,919,198
Avg   13,565,237          13,116,715     7,624,799                   8,554,631              36,672,579

Or to put it graphically - [chart of the averages above]

Then I ran all the tests again, this time creating 100 messages at a time -

msgs  Map using defaults  Map using xsl  Assign using serialisation  Assign using resource  Assign using API
100   15,195,199          15,254,912     9,158,223                   8,951,018              231,352,547
100   14,421,621          16,523,637     9,259,892                   8,700,856              226,704,695
100   15,199,198          15,010,499     8,476,670                   10,222,202             232,357,798
100   16,725,023          15,684,085     9,110,269                   9,866,252              227,806,462
100   15,349,885          14,475,857     9,101,879                   10,295,228             226,928,786
Avg   15,378,185          15,389,798     9,021,387                   9,607,111              229,030,058


Lastly, I ran the 3 non-mapper versions with caching enabled -

# messages  Assign using serialisation (cached)  Assign using resource (cached)  Assign using API (cached)
100         9,696,044                            9,478,015                       41,350,100
100         8,288,120                            10,087,574                      37,410,620
100         9,156,289                            10,473,718                      36,493,118
100         8,715,621                            10,001,671                      40,628,198
100         8,289,295                            9,951,817                       37,919,237
Average     8,829,074                            9,998,559                       38,760,255


So, what have I spotted?

Well, to start with, comparing my results with those Randal had, I learnt that my laptop is much slower than his machine... (but you can't see that from the results, nor, I suspect, do you care...)

But seriously -

  • It is interesting to see how, with the exception of the API scenario, there is very little difference between the generation of 1 message and the generation of 100.
  • It is quite obvious that the API call is much slower than the rest, but that does not surprise me considering the amount of work involved (getting the schema from the database, generating the instance off the XSD retrieved...).
  • For that reason, it is also quite obvious that this method benefited the most from the use of the cache (but was still significantly slower than the others), as the cache prevented the repeated access to the database and the xml generation.
  • By the same token, caching did not make a very significant difference in the other scenarios, but again - I wouldn't consider that surprising (as there's very little work involved).
  • And of course - it is clear that using an assignment shape to create messages, using either serialisation or a resource file, is indeed the fastest way (serialisation being a little faster on my machine).

I hope you find this useful and again - many thanks to Randal for all his effort in helping me get this out.


Sunday, October 05, 2008

Wish list: Transforming unknown message

The other day I ended up developing a process that would take one of two message types (using XmlDocument as the underlying message type), something I don't usually advocate, but I agree of course that occasionally it is the right way to go.

Naturally the first thing I wanted to do was to convert the message from either format to a single message type.

I normally write my own xsl (as opposed to using the mapper) and so I knew that template-based xsl scripts are geared towards this type of work through the use of apply-templates matching logic;

In theory I should be able to create a map that takes a message of type XmlDocument and spits out a single format; the xsl script under the covers would use apply-templates to match the possible root nodes and execute the correct template to map to the target format.

This is only theoretical though as the compiler does not let you build a project with a btm file that has System.Xml.XmlDocument as the input type; a schema (type) must be selected.

Furthermore - the orchestration designer does not let you select a message as an input or output if it is not strongly typed (although that's just a designer issue; if you edit the ODX directly the compiler is happy enough - not that I suggest anyone do that).

This leaves the following alternatives -

You could write your own code to run xsl scripts, and call that from your process passing in the XmlDocument.
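Here is a sketch of that first option (the stylesheet path and names are illustrative only): load an xsl that uses apply-templates to match either possible root node, run it over the untyped XmlDocument, and return the result as the single canonical format.

using System.IO;
using System.Xml;
using System.Xml.Xsl;

public static class UntypedTransformer
{
    private static readonly XslCompiledTransform stylesheet = LoadStylesheet();

    private static XslCompiledTransform LoadStylesheet()
    {
        XslCompiledTransform xslt = new XslCompiledTransform();
        xslt.Load(@"C:\Maps\ToCanonical.xslt"); // hypothetical path to the xsl script
        return xslt;
    }

    public static XmlDocument ToCanonical(XmlDocument source)
    {
        using (MemoryStream buffer = new MemoryStream())
        {
            stylesheet.Transform(source, null, buffer); // XmlDocument is IXPathNavigable
            buffer.Position = 0;
            XmlDocument result = new XmlDocument();
            result.Load(buffer);
            return result;
        }
    }
}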

You could, of course, have multiple maps - each with a different input message type- all using the same xsl underneath the hood.

You could decide to hack a little bit and create your btm with any schema as the input type (selecting one of the alternatives for the input might be a good idea, or a specially created schema to indicate the scenario at hand); you could then use the transform function in an expression shape to run that map, passing in the XmlDocument message as the input message.
You can do that because the transform function does not actually validate the input (or the output) against the schema at runtime; only the out-of-the-box transform shape does that, so as long as you avoid it you're ok.
Of course this has the implication of having a btm file that does not correctly represent reality (in terms of the input message), but I guess for some it's easier to live with that than for others...
