Yossi Dahan [BizTalk]


Sunday, September 27, 2009

Don’t be scared of custom disassemblers

During the last few weeks I’ve been asked to review two separate projects, for two separate companies, developed – naturally – by two separate teams.

The two things both projects had in common were that they both had to deal with legacy “flat files” and they both chose to process these files outside BizTalk using custom code.

In both these cases I completely agree with the decision to use custom code to parse the incoming files. As good as the out-of-the-box flat file support in BizTalk is (made significantly better by the introduction of the flat file wizard, once one gets the hang of using it), there's no avoiding writing custom code to parse flat files every now and then; some file formats are pretty challenging, with different record types, conditional records, interleaving segments etc.

I do not agree, however, with the decision to perform this custom parsing outside BizTalk.

I’m pretty sure I would not even bother posting this point had I not seen two of these in the same month, but the fact that I did suggests it may be worth posting a quick note.

One of the projects had the code in a console app, called from a Windows scheduled task; the application would pick up files from a folder, parse them, and drop the XML representation in another folder for BizTalk to consume.

The other had a Windows service monitoring a folder, picking up any files, parsing them to a different, simplified, flat file format (!), and dropping them in another folder for BizTalk to consume.

Both of these introduce another component to the mix; such a component needs its own error handling, its own monitoring, deployment strategy, operations manual etc. Similarly, it involves a fair bit of re-inventing the wheel: writing code to monitor folders, read files, and write files, all stuff that BizTalk does out of the box.

What would have been the correct approach then? Quite simple: a custom disassembler in the receive pipeline.

Writing a custom disassembler is quite simple. At the end of the day, it boils down to developing a class library which implements a few simple interfaces; the main one, IDisassemblerComponent, defines two methods, Disassemble and GetNext (the other interfaces are even simpler, almost insignificant in terms of effort).

Disassemble gets the source message and potentially parses it up front; GetNext is called repeatedly by the pipeline to return zero or more parsed messages, until it returns null. Simple.
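To illustrate the shape of that contract, here is a plain-C# sketch of the Disassemble/GetNext pattern. The class, the record format and the XML it emits are all made up for illustration; the real interface is Microsoft.BizTalk.Component.Interop.IDisassemblerComponent, whose methods work with IPipelineContext and IBaseMessage rather than the strings used here.

```csharp
using System;
using System.Collections.Generic;

// A simplified stand-in for the IDisassemblerComponent contract:
// Disassemble parses the source message up front; GetNext hands back
// the parsed messages one by one and returns null when there are no more.
public class FlatFileDisassemblerSketch
{
    private Queue<string> parsedRecords;

    // Called once with the incoming message body.
    public void Disassemble(string sourceMessage)
    {
        parsedRecords = new Queue<string>();
        // Hypothetical flat file format: one record per line, fields separated by '|'.
        foreach (var line in sourceMessage.Split(new[] { '\n' }, StringSplitOptions.RemoveEmptyEntries))
        {
            var fields = line.Split('|');
            // Emit each record as a small XML fragment for BizTalk to consume.
            parsedRecords.Enqueue(
                string.Format("<Record><Id>{0}</Id><Value>{1}</Value></Record>", fields[0], fields[1]));
        }
    }

    // Called repeatedly by the pipeline until it returns null.
    public string GetNext()
    {
        return parsedRecords.Count > 0 ? parsedRecords.Dequeue() : null;
    }
}
```

A real component would also implement the smaller interfaces (IBaseComponent, IComponentUI, IPersistPropertyBag), but as noted above those are almost boilerplate.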

In one of the projects I’ve since taken their existing code (the console app), refactored it into a class library, and wrapped it in a custom disassembler class that calls it; converting the scenario to a BizTalk pipeline and performing the key “developer testing” took less than a day.

Why did they not do it to begin with? Whilst sometimes there are valid reasons, technical or otherwise, I suspect that in this case it was simply unfamiliarity with BizTalk and some lack of confidence in the development team’s ability to learn and implement it (or their belief in themselves). These are valid concerns for any project manager, but I would suggest that a better course of action would have been to spend some time looking at what it takes to implement a custom disassembler, seeing that it’s not at all that scary, and by doing so learning more about a product used in the solution (BizTalk) while achieving a better architecture and a more maintainable approach.


Sunday, September 13, 2009

Paolo Salvatori on ways to create messages in orchestration helper classes

I’ve been toying with message creation a few times in the past, and recently turned to Paolo for help with a question; Paolo has an amazing blog and has now posted some of his wisdom around ways to create messages from a helper class for an orchestration; well worth reading (as is any entry on this fantastic blog, really!)

Friday, September 11, 2009

Loosely Coupled….Part II

In a previous post I’ve mentioned our constant attempt to strike the right balance when it comes to loosely coupled services; I’ve mentioned that we were looking at two different scenarios – loosely coupling calls to services outside BizTalk and loosely coupling calls to services inside BizTalk (once implemented within the BizTalk group)

I’ve also mentioned that our solution is composed of a few distinct ‘areas’ (each one generally encapsulated as a BizTalk Application), which we consider, in most cases, to be service boundaries, and – within one ‘flow’ of an incoming request message, we will often have to cross one or more of these boundaries to achieve our end goal.

In most cases in our solution, the ‘subscriber’ service would use the schema of the ‘publisher’ service for its incoming message; this roughly follows the principle of a service proxy, albeit somewhat upside down, for practical reasons. Only that (and that’s a much bigger difference) we don’t create a copy of the schema as a service proxy would, but rather reference the schema of the publisher directly (through a shared assembly). This, of course, creates a strong dependency between the two, and over time this has caused us a lot of headache around deployments, as whenever we wanted to update the publisher we’d have to remove the subscriber too.

Recently we have experimented with following more closely the service proxy approach and instead of using the same referenced schema (using a shared assembly), we’ve created an identical copy of it – same root node and namespace - in the ‘subscriber’ side.

The assumption was that we would publish a message using the publisher’s copy of the schema and receive it using the subscriber’s copy; as the message itself would look exactly the same, and would inevitably have exactly the same message type, it would be picked up by the subscriber successfully.

Had it worked, we would have been able to avoid the dependency between the subscriber and the publisher, which would help us gain much-needed flexibility in the publisher to support, and change for, multiple subscribers.

Theoretically, if the publisher schema had to change (say, to support functionality required by other subscribers), as long as the change was backwards compatible, such as added elements, we could replace the publisher’s copy of the schema but leave the subscriber’s copy as is, until such a point that we need or want to update the subscriber process.
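As a hypothetical illustration of such a backwards-compatible change (the element names here are made up, and the post itself shows no XSD), the publisher’s updated copy of the schema could add an optional element, so that messages produced without it still validate against the old copy the subscriber holds:

```xml
<!-- Publisher's updated copy: NewField is optional (minOccurs="0"),
     so existing message instances remain valid against both copies -->
<xs:element name="Order">
  <xs:complexType>
    <xs:sequence>
      <xs:element name="OrderId" type="xs:string" />
      <xs:element name="NewField" type="xs:string" minOccurs="0" />
    </xs:sequence>
  </xs:complexType>
</xs:element>
```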

Well, in BizTalk 2006 this would have worked just fine; unfortunately, from R2 onward, it no longer does. When an orchestration receives a message, it often does so based on a subscription that includes the ‘messaging message type’ (root node and namespace); however, starting with R2, an additional check has been introduced, comparing the full .NET type name of the schema used by the publisher message with the full .NET type name of the schema used by the subscriber, assembly, version and all.

This check obviously fails in our scenario, and our fancy loose-coupling solution no longer works (in R2 or 2009).

I think this check is actually a result of code introduced as a hotfix for BizTalk 2004, which, for one reason or another, did not make its way into BizTalk 2006 but did into later versions; I’m not sure, but either way it’s important to note the workaround described at the bottom of the hotfix description, as it appears this behaviour can be turned off, though one would have to check the potential impact carefully.

What else could we do?

Well, one pattern we know works fairly well is the broker pattern: there’s the publisher, with its own schema; there’s the subscriber, with its own, completely different, schema; and there’s a broker, a third process that has dependencies on both and contains a map to convert one to the other. On the plus side, this gives us all the flexibility we need; at any one point we only need to deploy two entities (the publisher or the subscriber, plus the broker), which is good enough. Having the process, with a map, allows us to use multi-part messages if we deem them suitable, and all the complexity we need in the mapping. On the down side, there are more artifacts to deploy and manage but, more crucially, one extra message box hop which, in a low-latency scenario such as ours, is not a small price to pay.

Another option is to simply expose the subscriber as a service and call it as such. There are big benefits to doing that, including the fact that we can now have a copy of the schema, in the form of a proxy or without one, and we have also decoupled the services in terms of BizTalk groups: the other service can be anywhere, although this was never a requirement for us. However, we’re paying in more pub/sub again, as well as more IO and quite possibly more complexity.

Theoretically we could have also used XmlDocument (or any other generic wrapper, for that matter) to convey the message, but a) I don’t like typeless interchanges and b) this does not work well in cases where correlation is required, as the following receive tends to short-cut the subscriber and pick up the request as a response; that is, unless you’re willing to introduce two wrappers, one for the request and another for the response.


Tuesday, September 08, 2009

Debugging Expression shape code

Sandro Pereira has posted a question, and answer, in the BizTalk newsgroup (he also described his answer, in detail, on his blog) about debugging expression code in Visual Studio

He wasn’t referring to debugging code in helper classes, but code in expression and assignment shapes.

My answer was that this was not possible, but Sandro quickly proved me wrong, as he demonstrates in his answer and blog post, and this got me thinking. Firstly, despite knowing about the option to use the generated code (and actually using it on very rare occasions to understand a certain BizTalk behaviour), I had never thought of using it for debugging, and that is an interesting thought.

However, I had to wonder: how come I never came across the need to? In all those years of BizTalk development, not once can I remember thinking: oh! I could solve this if only I could debug the piece of code in this shape.

The reason, I think, is twofold:

1. I rarely have more than 2-3 lines of code in an expression shape of any kind; if it’s not straight assignments, it’s going into a helper method; it’s cleaner, it’s more reusable, and it’s easier to debug.

2. I use trace. A lot. And so every few shapes or so, and certainly in expression shapes, I will have a trace line that outputs important information about the state and the flow of the process to a log file; this proves to be invaluable when troubleshooting issues on the live environments, but is also really helpful in development.
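A minimal sketch of what such a trace helper might look like (the class name and message format are my own invention; System.Diagnostics.Trace itself is standard .NET, and an expression shape would simply call the static method):

```csharp
using System;
using System.Diagnostics;

// A hypothetical helper an expression shape could call, e.g.
// TraceHelper.TraceInfo("MyOrchestration", "Received order " + orderId);
// Where the output goes is decided by the TraceListeners configured
// for the host process (e.g. a file listener in BTSNTSvc.exe.config).
public static class TraceHelper
{
    public static void TraceInfo(string source, string message)
    {
        // Prefix each line with a timestamp and the calling process/shape,
        // so log files can be correlated when troubleshooting live issues.
        Trace.WriteLine(string.Format("{0:u} [{1}] {2}", DateTime.UtcNow, source, message));
    }
}
```

Keeping all tracing behind one helper also makes it trivial to switch the output target, or silence it, without touching the orchestrations.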


Monday, September 07, 2009

On ASMX, WCF, namespaces and generated schemas (in BizTalk 2006)

Recently we’ve started to consume a new version of a web service we’ve been using for a while.
We knew that, as a whole, not much had changed, only that they had now moved to WCF; they would have migrated their classes to VS 2008 but would expose pretty much the same functions, using pretty much the same parameters.

Still, it appears that BizTalk now insists on generating multiple schemas for the web reference, and as more of the service is moved across, more schemas are introduced.

This caused Oleg a fair amount of pain: when new schemas were introduced, they would re-order the existing ones, so reference1.xsd (in the web reference) would suddenly become reference2.xsd, which in turn broke our maps.

The process of finding out the logic behind which schemas are created was fairly short and simple, but as I’ve documented it I thought I might as well share it -

Initial observation revealed that whilst the ASMX service’s WSDL file contains all the schemas needed, the WCF services use import statements in the WSDL file; the schemas exist in separate ‘files’.

ASMX services always use the XmlSerializer; WCF services use the DataContractSerializer by default, but can be configured to use the XmlSerializer if required.

Here’s a walk-through of the scenarios we’ve compared (using BizTalk 2006):

Standard WCF project, DataContractFormat
We’ll start by comparing the standard WCF sample project generated when you create a new WCF service application in Visual Studio 2008 -

[ServiceContract]
public interface IService1
{
    [OperationContract]
    string GetData(int value);

    [OperationContract]
    CompositeType GetDataUsingDataContract(CompositeType composite);
}

[DataContract]
public class CompositeType
{
    bool boolValue = true;
    string stringValue = "Hello ";

    [DataMember]
    public bool BoolValue
    {
        get { return boolValue; }
        set { boolValue = value; }
    }

    [DataMember]
    public string StringValue
    {
        get { return stringValue; }
        set { stringValue = value; }
    }
}



Looking at the WSDL generated, 3 schemas are imported –

1. The usual generic types

2. The definition of the CompositeType type

3. The definition of the service’s messages (GetData, GetDataResponse, GetDataUsingDataContract, GetDataUsingDataContractResponse)





Adding a web reference to this service from a BizTalk 2006 project, we can see it represents this fairly accurately:


We can see all 3 schemas downloaded from the service; within the reference.map generated code, a single reference.odx defines the methods in the form of ports and web-messages, and reference.xsd defines the CompositeType schema.



Equivalent project in an ASMX service

I’ve created an equivalent ASMX service, which looks like this –



[WebService(Namespace = "http://tempuri.org/")]
public class Service1 : System.Web.Services.WebService
{
    [WebMethod]
    public CompositeType HelloWorld(CompositeType composite)
    {
        CompositeType response = new CompositeType();
        response.StringValue = "Hello World";
        return response;
    }
}

public class CompositeType
{
    bool boolValue = true;
    string stringValue = "Hello ";

    public bool BoolValue
    {
        get { return boolValue; }
        set { boolValue = value; }
    }

    public string StringValue
    {
        get { return stringValue; }
        set { stringValue = value; }
    }
}



Publishing this service, I can see its WSDL contains (it does not use import, but that proved to be insignificant) a single schema that represents the service’s messages and the CompositeType definition.



Consuming this service from a BizTalk 2006 project, only the WSDL file is downloaded (there are no ‘external schemas’ to worry about), but within the reference.map pretty much the same odx and xsd files are generated; no real difference between ASMX and WCF here.



Next I’ve looked at changing the serializer the WCF service works with from DataContract to XmlSerializer –



Standard WCF project, XmlSerializerFormat

Now we will change the serializer to XmlSerializer by adding the XmlSerializerFormat attribute to both the service and the data contracts:



[ServiceContract]
[XmlSerializerFormat]
public interface IService1
{


…and



[DataContract]
[XmlSerializerFormat]
public class CompositeType
{


The WSDL in this case includes only one import, for a single schema representing both the service messages and the CompositeType schema (basic types are not exposed), and BizTalk now only has one schema downloaded; but again, the reference.map code remained identical (one ODX, one schema).



How will adding a second namespace affect these behaviours? Let’s investigate.



WCF project, two namespaces DataContractFormat

To demonstrate this I’ll add another data contract, AnotherCompositeType, specify an explicit namespace for it, and include it as a second parameter to the GetDataUsingDataContract operation:



[DataContract(Namespace = "http://SomeNamespace")]
public class AnotherCompositeType
{
    bool boolValue = true;
    string stringValue = "Hello ";

    [DataMember]
    public bool BoolValue
    {
        get { return boolValue; }
        set { boolValue = value; }
    }

    [DataMember]
    public string StringValue
    {
        get { return stringValue; }
        set { stringValue = value; }
    }
}

[OperationContract]
CompositeType GetDataUsingDataContract(CompositeType composite, AnotherCompositeType anotherComposite);



Using DataContractFormat again, but with two classes representing two different namespaces, we’re now getting yet another schema (the fourth one) representing the added data contract; had the namespaces of both data contracts been the same, the DataContractFormat would have included them in the same schema.



On the BizTalk side, the reference.map code now also contains a second schema: one describes the original CompositeType, and a second describes the added type, AnotherCompositeType. Here as well, had the two types been in the same namespace, a single schema would exist, describing both.



Let’s look at the same again, using the XmlSerializerFormat



WCF project, two namespaces XmlSerializerFormat

Adding the XmlSerializerFormat, I also have to remember to include the XmlRoot attribute to set the namespace, as this serializer does not look at the DataContract attribute:



[DataContract(Namespace = "http://SomeNamespace")]
[XmlSerializerFormat]
[XmlRoot(Namespace = "http://SomeNamespace")]
public class AnotherCompositeType
{
    bool boolValue = true;
    string stringValue = "Hello ";

    [DataMember]
    public bool BoolValue
    {
        get { return boolValue; }
        set { boolValue = value; }
    }

    [DataMember]
    public string StringValue
    {
        get { return stringValue; }
        set { stringValue = value; }
    }
}



Now the WSDL for this service, using the XmlSerializerFormat, imports two schemas: one for the service messages and the CompositeType schema, which all reside in the same namespace, and a second for AnotherCompositeType, which is defined in a separate namespace.



Consuming this from BizTalk and again I’m getting two schemas – one for each namespace.

So far, switching between DataContractFormat and XmlSerializerFormat made no difference to the generated code under reference.map, but it did change the way the WSDL is constructed (imports vs. embedded schemas) and therefore the downloaded components (WSDL and schemas vs. WSDL only).



Note: another thing I’ve noticed is that when a new schema needs to be generated under the reference.map code, as a result of a change to the service, updating the web reference does not seem to do so; I had to delete the web reference and re-add it to see the newly added schema.



Last, let’s look at how the ASMX service behaves with two namespaces.



ASMX service with two namespaces

I’ve added the second class, and added it as a parameter to my web method



[XmlRoot(Namespace = "http://AnotherNamespace")]
public class AnotherCompositeType
{
    bool boolValue = true;
    string stringValue = "Hello ";

    public bool BoolValue
    {
        get { return boolValue; }
        set { boolValue = value; }
    }

    public string StringValue
    {
        get { return stringValue; }
        set { stringValue = value; }
    }
}


And, still consistent: when consumed from BizTalk 2006 I’m getting only the WSDL downloaded (the two schemas are embedded), but the reference.map code contains two schemas, one for each namespace.



To summarise -



Using the DataContractFormat you will always get one schema for generic types, one schema for the service’s messages, and then one schema for each namespace any other types are declared in (0..n)



Using the XmlSerializerFormat, schemas are embedded in the WSDL file, and you get one per XML namespace used.



As far as BizTalk generated code is concerned, however, there’s no difference between the two.



 



What this meant to us – well – we understand better, but there’s still not much we can do.



In our case we control the service and, in fact, we know that the only reason we encounter multiple XML namespaces in the service contract is that the various classes exist in several .NET namespaces and the team had not supplied the DataContract attribute on any class; they had certainly not supplied the namespace parameter to that attribute, which meant the .NET namespace was used as the XML namespace, resulting in multiple namespaces and therefore multiple schemas.
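This default is easy to see outside BizTalk: serialize a class that has no DataContract namespace and the CLR namespace shows up in the XML namespace. The class and namespace names below are made up for illustration, and the serializer’s POCO support shown here requires .NET 3.5 SP1 or later.

```csharp
using System;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

namespace Legacy.Orders
{
    // No [DataContract] attribute (and so no Namespace parameter), so the
    // DataContractSerializer derives the XML namespace from the CLR namespace:
    // http://schemas.datacontract.org/2004/07/Legacy.Orders
    public class OrderHeader
    {
        public string OrderId { get; set; }
    }

    public static class Program
    {
        public static string Serialize()
        {
            var serializer = new DataContractSerializer(typeof(OrderHeader));
            var sb = new StringBuilder();
            using (var writer = XmlWriter.Create(sb))
            {
                serializer.WriteObject(writer, new OrderHeader { OrderId = "42" });
            }
            return sb.ToString();
        }

        public static void Main()
        {
            Console.WriteLine(Serialize());
        }
    }
}
```

With classes spread over several CLR namespaces, each namespace yields its own XML namespace, and hence its own generated schema, exactly as we observed.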



Once that team added the attribute, and used a consistent XML namespace throughout, our immediate problem was solved; however, had it been a third party’s service we would not have had that luxury, and we would have had to update our code whenever we updated our web reference, even if only new types were added in a backwards-compatible way (as the schema ordering may have changed).



On that note, it’s probably easier to simply rename the schemas (and the underlying .NET types) under the reference.map generated code than to update the referencing maps and messages.



Thursday, September 03, 2009

Pro Business Activity Monitoring in BizTalk Server 2009

Richard Seroter wrote a review of this book on his blog

Whilst I haven’t yet finished reading the book, I completely agree with Richard – this book is very well written, and is doing a fantastic job explaining this fascinating, and often misunderstood, if not completely overlooked, capability of BizTalk.

The book also does a very good job of looking at scenarios outside BizTalk Server, making it well worth reading for any enterprise solution architect.

Highly recommended.

Wednesday, September 02, 2009

Creating messages from scratch - revisited briefly

A while back I posted about the different ways to create messages in an orchestration, and later some performance comparisons between them.

Mostly for fun, I ran a quick test on my newly installed laptop; I did not put in nearly as much effort as I had previously, so don’t make too much of these numbers, but I was amazed to see that all the tests ran pretty much 10 times faster.

Now, it’s a new BizTalk (2009), a new SQL Server (2008), a new operating system (Windows 7) and a new(-ish) laptop (a Thinkpad T61), so there’s no way to know how much each component contributed to the improvement, but it is amazing how much difference just one year can make!

Well – not at all scientific, but I found it interesting anyway!


Tuesday, September 01, 2009

Webcast: BizTalk Server 2009 Performance on Hyper-V and Physical Deployments

Ewan Fairweather is doing a webcast TODAY on BizTalk 2009 performance tests he’s done both on and off Hyper-V infrastructure.
I’ve seen bits of it before and it is highly recommended; you can pretty much count on this being extremely useful if you’re serious about your BizTalk deployment.

Register here