Yossi Dahan [BizTalk]


Friday, April 28, 2006

Converting XDRs to XSDs

Working on a big migration project from BizTalk 2002 to BizTalk 2006, we needed to convert all the existing message definitions, created in XDR, to XSD.


Due to the size of the current implementation (and hence the number of schemas involved) I was asked to look at automating the conversion process.


The following describes the options considered and the conclusion reached. I'd love to hear from anyone who has experience with this issue, either to confirm my conclusion or to highlight more options.


Reusing the “Add Schema Wizard”

Having experimented in the past with reusing the logic behind the “Well Formed XML” wizard, I expected this to be straightforward (and obviously the best option). I was wrong.


Apparently, while the WFX and DTD wizards are pretty much external components and therefore easy to use outside BizTalk and VS (a good indication of this is the fact that they need to be enabled by running a VBS script), the XDR converter seems to be embedded in the product and cannot, so it seems, be reused.


All my attempts, web searches and even a few good contacts at MS did not bear any fruit. I could not find a way to reuse this logic.


Using XSD.exe

XSD.exe claims to be able to convert XDRs to XSDs simply by passing an XDR file as input.


However, in most of my attempts, the tool generated the XSD with an additional root node - “NewDataSet” or “Schema”.


While I’m not sure why this is done, with no switches available for XSD.exe in this mode I don’t see how it can be avoided.
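
If XSD.exe had to be used regardless, one could imagine post-processing its output to drop the extra root node. The sketch below is purely illustrative - it assumes the wrapper is a single top-level element (named "NewDataSet", as in the case above) with the real element declarations nested inside it, which I have not verified against every shape XSD.exe can emit:

```python
# Illustrative post-processing sketch only: hoist the element declarations
# nested under the extra wrapper element up to the schema root, then drop
# the wrapper. The wrapper name and nesting shape are assumptions.
import xml.etree.ElementTree as ET

XS = "http://www.w3.org/2001/XMLSchema"
ET.register_namespace("xs", XS)

def strip_wrapper(xsd_text, wrapper_name="NewDataSet"):
    root = ET.fromstring(xsd_text)
    for top in list(root):
        if top.tag == f"{{{XS}}}element" and top.get("name") == wrapper_name:
            for decl in top.iter(f"{{{XS}}}element"):
                if decl is not top and decl.get("name"):
                    root.append(decl)   # promote to top level
            root.remove(top)            # drop the wrapper
    return ET.tostring(root, encoding="unicode")
```

Even then, this only addresses the extra root node; it says nothing about the BizTalk-specific content of the schema.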


In addition – comparing the output generated by XSD.exe with the output of the “Add Generated Schema” wizard, differences are clearly visible. It seems the wizard adds some BizTalk-specific nodes as annotations to help the editor work with the schema; these cannot be expected to be added by a generic tool such as XSD.exe.


This, as well as the suspicion that the wizard performs more, less visible, BizTalk-specific processing, suggests we should look only at BizTalk-specific conversion utilities, as they are more likely to do the conversion to BizTalk's liking.


XDR to XSD conversion tool from MS

Looking for another BizTalk specific tool I re-discovered this little gem.

This was first introduced as an external script (JS+XSL) in BizTalk 2000 (before XSD was formalised); later, in 2002, it was embedded into the editor, and in 2004/2006 it seems to be used internally by the “Add Schema Wizard” (a good indication of this is that the XDRtoXSD.xsl script file still exists in the BizTalk program files folder).


The 2000 version is not at all happy with 2002 XDRs – probably because of changes to the contents of the XDR files between versions.


The 2002 version converts the files, but when they are added to a VS 2005 solution, errors are displayed around the usage of the ##targetNamespace wildcard, which causes errors due to potential ambiguity.


In addition, I suspect it does not handle the BizTalk-specific logic in the conversion (as mentioned in the previous section on XSD.exe), as the output generated by the XSL is different from the output generated by the “Add Schema Wizard”.


Migration Toolkit

A Migration toolkit was provided by Microsoft to help convert BizTalk 2002 AIC components to BizTalk 2004 pipeline components and adapters.


While I have not actually tried the tool, I went through its documentation, and it does not seem to do anything with regard to schemas.


Migration project

A Migration project type was introduced in BizTalk 2004 to help convert BizTalk 2002 projects to BizTalk 2004. The wizard associated with this project type does convert XDRs to XSDs, but I suspect it uses the same “Add Schema Wizard” logic and is not available for reuse.


Visual Studio 2005 “Create Schema” menu option

At some point down the line I also looked at this option; although manual, I hoped it would shed some light on how the process can be done. If it proved possible, it could have been automated using macros.


However, while it does succeed in generating an XSD with no warnings or errors, and this XSD can even be added to a BizTalk project (and the project builds OK), the output differs from the one generated by the wizard, which is a cause of potential errors. I therefore believe this is not a viable way forward, at least not without further experimentation.


Conclusion

Based on this I came to the conclusion that the best way forward is to resort to manual conversion.

This is still relatively painless, as it takes less than a minute to convert a schema, and the results are quite acceptable.


It is important to note that I did not do any deep validation of the wizard's output; the generated XSD should be examined more carefully before concluding that it fully represents the messages expected.


If the manual process is totally unacceptable, a way forward could be to investigate the differences between the output of the XSL script at the heart of the MS conversion tool and the output of the “Add Schema Wizard”, and either modify the script to avoid the differences or wrap it in code that will.
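
If someone does go down that route, a quick first step (sketched here in Python purely for illustration - this is not a BizTalk tool, and the inputs are assumed to be the two XSD outputs as strings) is to canonicalise both documents before diffing, so that attribute order and tag form don't show up as spurious differences:

```python
# Illustrative sketch: diff two XML documents after C14N canonicalisation
# (requires Python 3.8+ for ET.canonicalize), so that attribute ordering
# and self-closing vs. paired tags are normalised away first.
import difflib
import xml.etree.ElementTree as ET

def xml_diff(xml_a, xml_b):
    a = ET.canonicalize(xml_a).splitlines()
    b = ET.canonicalize(xml_b).splitlines()
    return list(difflib.unified_diff(a, b, lineterm=""))
```

An empty result means the two documents are equivalent after canonicalisation; anything left over is a real structural difference worth investigating.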

"Unexpected end of document. This is not a new document. The dissassembly is incomplete. " and BAM

It took me a short while to figure this out, so I thought it worth posting, as it might be useful for someone else out there..

I'm using BAM to track messaging properties, using the TPE.

I've made a mistake in the tracking of one of the properties I used on the send port (a type mismatch).

This caused the messaging process to fail, and the following error appeared in the event log - "Unexpected end of document. This is not a new document. The dissassembly is incomplete. "

Not a very clear error, but anyway...

Strangely enough, if I make the same mistake but use the property from the receive port (rather than the send port), a BAM error is logged in the event log but the process completes OK.

This got me thinking - do I want my process to stop if tracking fails? Well - that depends on the process; sometimes I do, sometimes I don't. I wonder if this is configurable somewhere... Anyway - at least it should be consistent, shouldn't it?

Thursday, April 27, 2006

Wrong way to implement polling?

Consider the following scenario:


  • Application A, on a legacy platform, needs to deliver a message to application B.
  • The legacy platform can only be polled for messages; it cannot send them.
  • The legacy platform includes logic to accept requests for outgoing messages from internal applications and to queue them ("outbox" functionality).
  • It also includes logic to accept polling calls from outside the platform for the "next" message, returning it to the caller.


However, an implementation I recently saw did not queue the actual outgoing message. Instead, it queued a request to generate the outgoing message, in the form of a callback.


So - when application A reaches the point at which it needs to send the message, it registers the fact that it needs to deliver a message. What is actually stored is the method callback information.


When BizTalk polls for messages, the callback is invoked to generate the message, which is then returned to the BizTalk polling call.


To me this brings a few issues to the table -


  • It creates a dependency between BizTalk and the application, so the application (and everything needed to generate the message) must be available when BizTalk polls for the message.
  • Application A intended to deliver the message at a certain point in time; BizTalk polled for it later. If the message is created when BizTalk polls for it (as opposed to when the application sent it), the underlying data may have changed and the message may be created differently.
  • In addition, if the process of creating the message fails, it is much harder to resolve; if the message was created as part of a user interaction, for instance, the user has probably moved on.
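
To make the second issue concrete, here is a toy sketch - all class and method names invented, nothing BizTalk-specific - contrasting an outbox that queues the rendered message with one that queues only a callback:

```python
# Toy illustration of the two outbox designs (all names invented).
# Queueing the rendered message freezes its content at send time;
# queueing a callback rebuilds it at poll time from the data as it is THEN.
from collections import deque

class MessageOutbox:
    """Stores the rendered message - content frozen at send time."""
    def __init__(self):
        self._queue = deque()
    def send(self, message):
        self._queue.append(message)
    def poll(self):
        return self._queue.popleft() if self._queue else None

class CallbackOutbox:
    """Stores only a callback - the message is built when polled."""
    def __init__(self):
        self._queue = deque()
    def send(self, build_message):
        self._queue.append(build_message)
    def poll(self):
        return self._queue.popleft()() if self._queue else None

# The underlying data changes between "send" and "poll":
data = {"status": "NEW"}
mbox, cbox = MessageOutbox(), CallbackOutbox()
mbox.send(f"status={data['status']}")          # snapshot taken now
cbox.send(lambda: f"status={data['status']}")  # recipe stored, not data
data["status"] = "CANCELLED"
print(mbox.poll())  # status=NEW        - what the application meant to send
print(cbox.poll())  # status=CANCELLED  - rebuilt from the changed data
```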

Tuesday, April 25, 2006

Setting send port filter based on properties from a schema of another application

I don't usually hurry to raise the "bug" flag, but after playing with this for an hour or so, this is the only explanation I can find (if someone thinks otherwise, please correct me).


I have a very simple messaging scenario I use for demo purposes in which a receive port takes an xml message from a file drop, parses it, promotes some fields into the message's context and publishes it to the message box.

A send port then subscribes to this message and delivers it to…you guessed it right - a file drop.


I was playing around with promoted properties and subscriptions the other day when I stumbled onto this -


A very standard scenario - create a subscription based on the existence of a promoted property.

So I went to add a filter to the send port by selecting the property I wanted ("Sabra.BizTalk.Demos.PO.Schemas.Status" in my case) and the operator "Exists", nothing special here.

Except that the property did not appear in the list.


I did the usual checks - promoted properties defined correctly, property schema deployed…everything seemed OK. The only suspicion I had was that the promoted property (and the message schema) were deployed to another application (since it really belongs to another demo I'm doing).

But surely this shouldn't matter. After all - if I drop a message it gets parsed OK, no matter which application the schema was deployed to. The application is only a logical container, not a physical one.


But it did.


I moved the schemas to the same application, and I could now successfully select the property from the list.


To me this does not make any sense. It is quite reasonable to have an application subscribe to messages published by another application, not to mention generic services that may want to subscribe to messages with certain properties - for error handling, archiving or SLA tracking implementations, for instance.


It also did not fit with my understanding of a BizTalk application as a logical grouping of artefacts only, with no real impact on the runtime.

So - I decided to play around - I configured the send port with the subscription and then moved the schemas back to their real application.


Checked the port configuration - it still showed the subscription OK.


Restarted the host - no errors.

Ran the scenario - runs ok, the subscription caused the send port to pick up the message with no issues.


Out of curiosity, I tried to open the property combo box in the filter to see if my property was now there - as soon as I did, the selected value disappeared, and again my property did not appear in the list.



After confirming that the runtime did not care which application held the schemas, and that once deployed and configured properly everything works fine - plus the fact that this behaviour does not make sense from a design perspective - I had to conclude that this is indeed a bug in the Administration Console and not a real restriction.


While there are workarounds for this issue - either temporarily moving the schemas to the relevant application, as I did (not very useful in the real world, as the schemas are at the base of the dependency tree and doing so may require removing the entire contents of the first application), or editing the port configuration through a bindings file or code - they surely complicate things and make the administration console and the applications concept much less usable.
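
For reference, the bindings-file workaround means editing the send port's filter, which is stored in the exported bindings file as an escaped XML string inside the port's Filter element. The fragment below shows the general shape only (unescaped here for readability, property name taken from my example above); the Operator attribute is a numeric enum code, and the value used for "Exists" here is an assumption - the safest reference is a filter you configured in the console yourself and then exported:

```xml
<!-- Indicative shape only - in the actual bindings file this inner XML
     is stored as an escaped string. Verify the numeric Operator code
     for "Exists" against a filter exported from your own group. -->
<Filter>
  <Group>
    <Statement Property="Sabra.BizTalk.Demos.PO.Schemas.Status"
               Operator="6" />
  </Group>
</Filter>
```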


As such I'm sure (hope?) MS will fix this rather soon.


Update -


After raising this with Microsoft, I was very happy to hear back from them (you have to appreciate the effort).


In the email they confirmed that the runtime does not enforce the restriction described above, but they do recommend following the rule the design-time behaviour suggests, in order to avoid cross-application dependencies that are difficult to track.


Apparently the proper way to do this (and it is entirely my fault that I never realised this option exists) is to create a reference between the two applications (by right-clicking the application and selecting Add->Reference from the context menu).


This links the two applications together, and when you export your application to an MSI you get a nice screen informing you which other applications should be imported into the target group as well in order to get the solution to work. Pretty nice.

It will also, of course, let you select the properties in the send port filter property combo box...


Tuesday, April 04, 2006

POP3 adapter in BizTalk 2006

I've been doing some work recently with the POP3 adapter.

This is a very useful adapter and I find myself using it in many places.

In fact, I always thought it was weird that one was not shipped with BizTalk 2004; luckily, at the time, GotDotNet came to the rescue.


Now in BizTalk 2006 we get a fuller, more tightly integrated adapter, which is great.

I found that there are a few things worth noting, though -


The adapter has a built-in MIME decoder. This somewhat duplicates the MIME decoder pipeline component, and consideration should be given to what to implement where.


The settings on a receive location using the adapter let you decide whether or not to apply MIME decoding in the adapter.

If you choose to do so, you can either ask the adapter to treat a specific part of the email as the email body (part 1 being the email body, and any attachments additional parts) or - and this is a real gem - you can specify a content type to pick up.

If you do this, the adapter will look at the list of parts and pick up the first one with the specified content type.


Both options are very useful but should be handled with care.

In many cases, especially when using an email client to create MIME messages, it is hard to predict the order in which the attachments will appear in the message - at least in my experience. This makes specifying the part number to use potentially risky.

The same applies to the content-type feature, as you cannot be sure which attachment will appear first in the message. If you know you will only have one XML attachment (and the rest are…say…images) then this is great. If you may have more than one XML attachment you may be in trouble.
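
The selection rule is easy to picture outside BizTalk. The sketch below uses Python's stdlib email package purely as a stand-in (this is not the adapter's code) to show why "first part with a matching content type" is order-sensitive:

```python
# Illustrative stand-in for the adapter's "first part with this content
# type" selection rule - not the adapter's actual implementation.
from email import message_from_string

def first_part_with_type(raw_email, content_type):
    """Return the decoded payload of the first non-multipart part whose
    content type matches, or None if no part matches."""
    msg = message_from_string(raw_email)
    for part in msg.walk():
        if not part.is_multipart() and part.get_content_type() == content_type:
            return part.get_payload(decode=True)
    return None
```

With two text/xml attachments in the message, only the first is ever returned - which is exactly the risk described above.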


A second nice feature of the adapter, which is MIME related, is the ability to decrypt the message. But again - attention should be paid to the following quote from the BizTalk 2006 help file - "Encrypted Messages Received by the POP3 Adapter that are sent to the Suspended Queue may be Viewable in Clear Text"


If you decrypt the message in the adapter and the message gets suspended for any reason, it will be stored in clear text. Now it really depends how much you trust your operations guys…(and girls)


Attention should also be paid to having the correct certificates in the correct store. As usual.


All this said, it is nice that the adapter actually allows you to disable the MIME decoding feature.

This means the adapter will pass the whole email message, MIME structure and all, to the pipeline. There you can use either the existing MIME decoder or any custom code you require to process it correctly.


In my recent implementation using the adapter, we actually needed to receive emails with several attachments each, but instead of treating these as multi-part messages we needed to extract the attachments and treat each attachment as an independent message.


We considered two alternatives. The first was indeed to disable the MIME decoding in the adapter and develop a custom disassembler that would read the stream, break the MIME message into its parts, and pass all the resulting messages back to the pipeline.

The second approach was to let the adapter decode the message and pass in a multi-part message. We would still have a custom disassembler, but instead of knowing all about MIME processing, all it needs to do is loop over the message parts and form a message from each part.

Either disassembler would need to call a payload disassembler, such as the XmlDisassembler, to process each message, perform property promotion, etc.
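
The core loop of that second disassembler - take a decoded multi-part message and emit one independent message per part - can be sketched outside the BizTalk API with Python's stdlib email package (the pipeline plumbing and the hand-off to the XmlDisassembler are not shown):

```python
# Illustrative sketch of "one message per attachment", using the stdlib
# email package as a stand-in for the pipeline's multi-part message.
from email import message_from_string

def split_into_messages(raw_email):
    """Return each leaf part's decoded payload as an independent 'message'.
    In the real disassembler each item would be handed on to a payload
    disassembler (e.g. the XmlDisassembler) for parsing and promotion."""
    msg = message_from_string(raw_email)
    if not msg.is_multipart():
        return [msg.get_payload(decode=True)]
    return [part.get_payload(decode=True)
            for part in msg.walk()
            if not part.is_multipart()]
```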


I think it's quite obvious what we chose (well, OK - the second option, if it wasn't obvious). The main reason was that, although we expect a minor performance hit, the code is so much simpler and therefore safer to implement and maintain.