Yossi Dahan [BizTalk]


Monday, November 29, 2010

Controlling the credentials used by the SOAP adapter from an orchestration

A long time ago I looked into how one could provide the credentials to use when accessing a web service through the SOAP adapter from within the process.

The context for this is, of course, a multi-tenancy solution, where the system calls a third party service, but needs to make that call in the context of the current user it is serving. Each instance of the process may be serving a different user/company and so needs to provide different credentials to the back-end systems.

As you can read from that post, at the time I wasn’t very successful, but I have now solved that mystery, and it turns out it’s quite straightforward.

The thing that baffled me was that the SOAP adapter’s property schema did contain the necessary context properties (Authentication Scheme, Username and Password), but I could not get the adapter to use them when they were set from the calling orchestration.

Whatever I did, the adapter seemed to use the pre-defined settings (from the send port).

Eventually, and after talking to Paolo Salvatori, the answer dawned on me -

In many (if not all) cases, the way BizTalk communicates send port settings to the adapters is via the message context. The adapter defines a property schema and BizTalk populates the settings when creating an instance of the pipeline in the outgoing message.

Given this, when I set the context properties in my orchestration and then sent the message through the port, BizTalk overwrote my properties with the ones defined on the send port; the adapter did in fact read the context properties, only my values were no longer there.

If the adapter configuration UI allowed not writing any authentication details into the message context, my values would have remained and it all would have worked first time, but sadly that option does not exist, and the adapter will always write the authentication scheme, and, if not Anonymous, the username and password properties.

The workaround, however, was quite simple – I created my own properties for the username and password, and a pipeline component to read these and copy them to the adapter’s context properties. As my pipeline component runs after BizTalk has applied the send port configuration, my component overwrites those settings and wins.
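The heart of such a component is only a few lines – a minimal sketch of the Execute method, assuming a custom property schema whose namespace (`https://MySolution.Properties`) and property names are hypothetical; the `soap-properties` namespace is the one the SOAP adapter actually reads:

```csharp
// Sketch of the Execute method of a custom send pipeline component
// (IComponent from Microsoft.BizTalk.Component.Interop).
public Microsoft.BizTalk.Message.Interop.IBaseMessage Execute(
    Microsoft.BizTalk.Component.Interop.IPipelineContext context,
    Microsoft.BizTalk.Message.Interop.IBaseMessage msg)
{
    const string customNs = "https://MySolution.Properties"; // hypothetical
    const string soapNs =
        "http://schemas.microsoft.com/BizTalk/2003/soap-properties";

    // Read the values the orchestration wrote into the message context.
    string username = msg.Context.Read("Username", customNs) as string;
    string password = msg.Context.Read("Password", customNs) as string;

    if (!string.IsNullOrEmpty(username))
    {
        // This component runs after BizTalk has stamped the send port
        // configuration into the context, so these writes win.
        msg.Context.Write("AuthenticationScheme", soapNs, "Basic");
        msg.Context.Write("Username", soapNs, username);
        msg.Context.Write("Password", soapNs, password);
    }

    return msg;
}
```

If the orchestration didn’t set the custom properties, the component leaves the context alone and the send port’s own configuration applies as normal.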

Fight fire with fire!


Wednesday, November 24, 2010

The tale of the mysterious BAM records

We’ve been using BAM to log the activity of our solution which is running in production under a reasonable load.
Most of the data comes from receive and send pipelines but some is written from orchestrations or even custom code as well, all of it is stored using the BAM API, mostly using the buffered event stream, but occasionally using the direct event stream and orchestration event stream.

Generally speaking, this had been working very well, and the log we kept proved invaluable.

A few months ago, though, we started seeing incidents where BizTalk processing would simply grind to a halt.
The first symptom was fairly easy to spot – we saw the Tracking Data performance counter climb steadily, and CPU on the SQL box supporting BizTalk would sit in the high nineties constantly.

Data was being transferred from the message box TrackingData tables to the BAMPrimaryImport, so it wasn’t the previously reported sequencing error; it was simply not going fast enough – we were suddenly writing more data than BizTalk was able to process.

Initially we thought we may have reached the breaking point for our solution on the existing infrastructure, but we quickly realised this didn’t make sense – yes, we were running some load, but not enough to warrant this.

A closer look revealed that the active tables in the BAMPrimaryImport database were very large, and so, needing to avert the impending production issue, the team decided to truncate all the active tables, knowing that we would lose some tracking data but, hopefully, keep the system running and buy us some time.

Now – generally speaking this is not the best idea – you mustn’t touch these tables manually – but given the alternative of a non-functioning system, we decided to take the risk, and it kind of worked: once the tables were truncated, BizTalk happily cleared the TrackingData tables in the message box.

So – we knew we had a problem – but what caused it? Well – me, for the most part!

We knew the active tables kept growing, and a closer look revealed we were getting a lot of ‘orphaned’ records – these are records with very partial information, with many nulls, and in which – crucially – the IsVisible column is null.

A quick search revealed that IsVisible should really never be null – such records will never move from the active table to the completed table – and that this happens when a record is updated AFTER it has been completed and moved out of the active table. I must have done something wrong, then!

Turns out we had at least a couple of issues -

The first was a simple misunderstanding on my part -

We have 4 activities, which are hierarchical; we use the relationships to define the connection between the activities.
In one place, having already created an instance of ‘Activity A’ we went on to create a new instance of ‘Activity B’, end it, and then add a reference from the ‘Activity B’ instance to the ‘Activity A’ instance.

Turns out that if you add a reference FROM an activity instance that has already been completed, BAM will ‘kindly’ create a new record in the active table (it doesn’t matter what you’re referring to – if it’s an activity, closed or open makes no difference; only the source is important).

Now – learning that, it does make sense – but my defence would be that it seemed quite logical to me to create entity A, create entity B, and then define the relationship between them, rather than having to do that half-way through the creation of entity B.
In any case – this was easy enough to fix, even if it took a bit more effort than it would appear given the broader story of our tracking; but, as these things go – once you know the answer, coding is never the difficult part.
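In code, the fix amounts to reordering two calls against the BAM event stream. A sketch using the Microsoft.BizTalk.Bam.EventObservation API – the activity names, instance IDs and the `es` variable are made up for illustration:

```csharp
// es is an EventStream (e.g. a BufferedEventStream); all names are hypothetical.

// What we had (wrong): once EndActivity has run, AddReference FROM this
// instance re-creates a partial row in the active table, IsVisible = null.
es.BeginActivity("ActivityB", activityBId);
// ... update activity items ...
es.EndActivity("ActivityB", activityBId);
es.AddReference("ActivityB", activityBId, "Activity", "ActivityA", activityAId);

// The fix: add the reference while the source instance is still open.
es.BeginActivity("ActivityB", activityBId);
// ... update activity items ...
es.AddReference("ActivityB", activityBId, "Activity", "ActivityA", activityAId);
es.EndActivity("ActivityB", activityBId);
```

The reference target (‘ActivityA’ here) can be open or closed; it is only the source instance that must not yet be completed.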

The second issue was more clearly a bug than a ‘misunderstanding’ – it would not surprise you if I said all of my pipeline code is streaming, and that works fine without too many issues.
In one of our scenarios we create an activity in the send pipeline of a solicit-response port, and close it in the receive pipeline.
Recently I added some asynchronous logic in there that, in some circumstances, will update an activity shortly after the message had been delivered.

Unfortunately, under load, and due to the asynchronous nature of BAM, this update may be delayed.
If this happens, and the system we’re calling happened to respond quickly, my update may occur after the activity was closed by the receive pipeline, and in that case we end up with this ‘null record’ again.

This latter problem could potentially be solved using continuation, ensuring that the completed record does not get moved from the active table until any related records are completed as well, but this does not fit our scenario very well.

In any case, resolving these two issues saw the number of bad records in the active table drop by about 96%. We still get a few a day, which I still need to iron out, but the rate is very slow now, and we know how to surgically remove them until we sort this out, so our environment is much healthier.

The TrackingData counter that would often show 9,000 records before, now rarely climbs above 200.

To summarise I would make the following points -

  • Obviously there’s a lesson I’ve learnt about how BAM references work, and my assumption that it’s ok to create a reference from a closed activity was wrong.
  • The other reminder is to be very careful with asynchronous coding, something I’m sure we’ll see more and more of in the future. It is very powerful, but can get hairy.
  • There’s also a lesson about carefully checking the contents of the BAM active tables, not only the completed tables, after any BAM-related change, as well as a reminder of how important it is to fully understand any mechanism you use!
  • Last – it seems to me that, whilst I put my hand up and admit I was wrong – it’s not a good story if such a simple mistake by a developer can bring down the entire server. Surely BAM should be more defensive and either raise an error or deal with someone updating a record that has been closed; writing records that will never get cleaned up cannot be a good story, don’t you think?


Wednesday, November 03, 2010

MockingBird - Reminder to self

If you’re using MockingBird (I quite like it!), you’re on windows 7, and you keep getting http 404 -

Make sure the application pool used is configured for the ‘classic’ .NET pipeline (as opposed to the ‘integrated’ one).

I keep forgetting!

More on Bindings

Recently I’ve worked on splitting our binding files -

Until now we would normally build our application, deploy it locally, sort out ports and stuff, export the binding and package that with our deployment framework; we would typically also create copies of the binding files for the various environments (dev, test, UAT, production) and modify values in them as needed before adding these copies to the framework as well.

The thing is that half of these binding files were static – the list of schemas and orchestrations, and the binding of the orchestrations to hosts and ports, did not change from one environment to another; for the most part it is only port properties that change.

And so, from a maintenance point of view, whenever we added an orchestration to an application, we had to edit several binding files, even if we hadn’t introduced any new ports.

Looking to improve on this we’ve split the binding file – we have one, mostly shared, file containing the stuff that doesn’t change (modules), and then one file per environment containing the stuff that does change (ports).

Now when we add a new orchestration, we only have to modify one binding file.

The question was – what do we do in our deployment framework!?

When adding a binding file as a resource, the UI suggests that by not specifying an environment name for the binding file you instruct BizTalk to apply it to all environments. When you go ahead and do that, you see that the binding file gets added with ‘ENV:ALL’ as the ‘destination location’ – as opposed to, for example, ‘ENV:Production’ if you type in ‘Production’ as the environment name.

Easy enough, I thought; but being the pessimistic guy that I am, I figured there was no chance the API would take an empty environment name, so I decided to put ALL as the environment name; after all – that’s what it stores, isn’t it?

Running my script I could see the binding file being added as a resource, and I could see that the destination location appears as I expected it to in the administration console, so I took the next step of exporting the MSI, deleting the application and trying to import from the MSI.

Here was the sign that something wasn’t quite right – when prompted to choose environments, ‘ALL’ was listed as one of the possible options; this didn’t seem right, but I carried on regardless with my experiment and selected the ‘Production’ environment.

The MSI was imported successfully, and the ports from the shared binding file were used despite me not picking the ‘ALL’ environment, which is what I wanted. Good. But why did ‘ALL’ appear there?

To start with, I wanted to go back to the ‘standard behaviour’ and see what this looks like if I add the binding file through the UI without providing an environment name. Could it be that the ‘ALL’ option was always there and I simply never paid attention to it?

As it turns out – no. Removing the binding file I had added via script, and re-adding it through the UI without providing any environment name, re-added it with ENV:ALL as the destination location; but on exporting the MSI and re-importing, the list of environments to choose from did not include the name ‘ALL’.

So – clearly there’s a behavioural difference.

I changed our script to add the binding without specifying an environment name. Despite my unfounded doubts this worked like a charm, and again in the admin console I could see my binding with the ENV:ALL destination location.
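If you are scripting this with BTSTask rather than a custom API wrapper, the difference comes down to whether the TargetEnvironment property is supplied at all – a sketch, with the application and file names made up:

```shell
rem Shared binding file - no TargetEnvironment supplied, stored as ENV:ALL
BTSTask AddResource /ApplicationName:MyApp /Type:System.BizTalk:BizTalkBinding /Overwrite /Source:Bindings.Shared.xml

rem Environment-specific binding file - stored as ENV:Production
BTSTask AddResource /ApplicationName:MyApp /Type:System.BizTalk:BizTalkBinding /Overwrite /Source:Bindings.Production.xml /Property:TargetEnvironment=Production
```

The point is to resist the temptation to pass `TargetEnvironment=ALL` explicitly for the shared file – leaving the property out is what gets the well-behaved ENV:ALL.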

This time, when exporting and importing the MSI, the ‘ALL’ environment did not appear in the list in the import MSI wizard, but whichever environment I selected, the shared binding file was applied. Mission accomplished.

Conclusions -

Have faith in the API. But also – there is a little bit of magic, and not all ENV:ALL entries are alike – it looks as if, on the one hand, any binding file with the destination location ‘ENV:ALL’ will be applied to all environments, but only if you add it without specifying an environment name will it NOT appear in the environments list. Magic!