Yossi Dahan [BizTalk]

Saturday, February 21, 2009

Serialisation, mixed content and string[]

When you generate a class from a schema in which an element is configured to allow mixed content (child elements and attributes as well as text), you should expect the generated field that holds the text to be a string array.

So, if you have a schema that looks like this:

<?xml version="1.0" encoding="utf-8"?>
<xs:schema targetNamespace="http://tempuri.org/XMLSchema.xsd" elementFormDefault="qualified" xmlns="http://tempuri.org/XMLSchema.xsd" xmlns:mstns="http://tempuri.org/XMLSchema.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema">
  <xs:element name="SomeElement">
    <xs:complexType mixed="true">
      <xs:sequence>
        <xs:element name="Child1" type="xs:string"/>
        <xs:element name="Child2" type="xs:string"/>
        <xs:element name="Child3" type="xs:string"/>
      </xs:sequence>
      <xs:attribute name="SomeAttribute" type="xs:string"/>
    </xs:complexType>
  </xs:element>
</xs:schema>

(‘SomeElement’ being a complex type allowing mixed content)

The fields in the generated class would look like this:

public partial class SomeElement {

    private string child1Field;

    private string child2Field;

    private string child3Field;

    private string[] textField;

    private string someAttributeField;
    .
    .
    .

The reason for the array of strings (instead of just one string field) is that an XML instance corresponding to the schema might look like this:

<SomeElement xmlns="http://tempuri.org/XMLSchema.xsd" SomeAttribute="someAttributeValue">
  Some free text
  <Child1>Child1 text</Child1>
  Some more free text
  <Child2>Child2 text</Child2>
  yet some more free text
  <Child3>Child3 text</Child3>
</SomeElement>

So, by using a string array to hold the text, the deserialiser can keep each portion of text separate.
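To make this concrete, here is a minimal sketch of the deserialisation step. The SomeElement class below is a hand-written approximation of what the class generator produces (an assumption on my part: the real generated code wraps private fields in public properties), and the names MixedContentDemo and ReadTextParts are invented for illustration.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written approximation of the generated class (assumption: the real
// generated code exposes the private fields through public properties).
[XmlType(AnonymousType = true, Namespace = "http://tempuri.org/XMLSchema.xsd")]
[XmlRoot(Namespace = "http://tempuri.org/XMLSchema.xsd", IsNullable = false)]
public class SomeElement
{
    public string Child1 { get; set; }
    public string Child2 { get; set; }
    public string Child3 { get; set; }

    // Each run of free text in the element becomes one entry in this array.
    [XmlText]
    public string[] Text { get; set; }

    [XmlAttribute]
    public string SomeAttribute { get; set; }
}

public static class MixedContentDemo
{
    // Sample instance written on one line so whitespace does not cloud the result.
    public const string SampleXml =
        "<SomeElement xmlns=\"http://tempuri.org/XMLSchema.xsd\" SomeAttribute=\"someAttributeValue\">" +
        "Some free text<Child1>Child1 text</Child1>" +
        "Some more free text<Child2>Child2 text</Child2>" +
        "yet some more free text<Child3>Child3 text</Child3>" +
        "</SomeElement>";

    // Hypothetical helper: deserialise the XML and return the text portions.
    public static string[] ReadTextParts(string xml)
    {
        var serializer = new XmlSerializer(typeof(SomeElement));
        using (var reader = new StringReader(xml))
        {
            var element = (SomeElement)serializer.Deserialize(reader);
            return element.Text;
        }
    }

    public static void Main()
    {
        foreach (var part in ReadTextParts(SampleXml))
        {
            Console.WriteLine("[" + part + "]");
        }
    }
}
```

Each portion of text comes back as its own array entry, but nothing in the array records which child element it sat next to.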

Initially I thought this allowed the structure to represent the original XML accurately, but that is not quite the case: you still cannot tell for certain where each string portion sat relative to the child elements, especially if the source XML contains a few elements with no text between them. That, I suspect, is why, when I serialise the instance back to XML, I actually get:

<?xml version="1.0"?>
<SomeElement xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:xsd="http://www.w3.org/2001/XMLSchema" SomeAttribute="someAttributeValue" xmlns="http://tempuri.org/XMLSchema.xsd">
  <Child1>Child1 text</Child1>
  <Child2>Child2 text</Child2>
  <Child3>Child3 text</Child3>
Some free text

Some more free text

yet some more free text
</SomeElement>
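If the position of each text portion matters, one option is to skip the serialiser and walk the nodes with LINQ to XML, which hands back text and elements in document order. This is not what the generated class does; it is only an alternative I am sketching, and the names MixedContentOrder and DescribeNodes are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Xml.Linq;

public static class MixedContentOrder
{
    // Sample instance written on one line so whitespace does not cloud the result.
    public const string SampleXml =
        "<SomeElement xmlns=\"http://tempuri.org/XMLSchema.xsd\" SomeAttribute=\"someAttributeValue\">" +
        "Some free text<Child1>Child1 text</Child1>" +
        "Some more free text<Child2>Child2 text</Child2>" +
        "yet some more free text<Child3>Child3 text</Child3>" +
        "</SomeElement>";

    // Hypothetical helper: list the child nodes of the root in document order,
    // labelling each one as either free text or a child element.
    public static List<string> DescribeNodes(string xml)
    {
        var result = new List<string>();
        XElement root = XElement.Parse(xml);
        foreach (XNode node in root.Nodes())
        {
            if (node is XText text)
            {
                result.Add("text: " + text.Value.Trim());
            }
            else if (node is XElement child)
            {
                result.Add("element: " + child.Name.LocalName);
            }
        }
        return result;
    }

    public static void Main()
    {
        foreach (var line in DescribeNodes(SampleXml))
        {
            Console.WriteLine(line);
        }
    }
}
```

The trade-off is that you give up the strongly typed class and work with nodes directly, so this only makes sense when the interleaving really matters.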

Now, I don’t particularly like this sort of XML, and I shy away from mixed content; I don’t believe snippets like my samples above are useful, and specifically I don’t think mixing elements and text is particularly nice.


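However, consider an element with an attribute and some text. The following is quite reasonable, I think. It is tempting to reach for mixed content here, although strictly speaking a complex type with simple content (an xs:simpleContent extension of xs:string that adds the attribute) describes it without mixed="true":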


<Phone type="mobile">some text here</Phone>
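As a sketch of how the serialiser can treat this simpler shape: assuming the element is declared as a complex type with simple content (an xs:simpleContent extension of xs:string) rather than with mixed="true", a single string marked [XmlText] is enough to hold the text. The Phone class and the PhoneDemo helper below are hand-written for illustration, not generated from any schema in this post.

```csharp
using System;
using System.IO;
using System.Xml.Serialization;

// Hand-written class for illustration (assumption: the element is in no namespace).
[XmlRoot("Phone")]
public class Phone
{
    [XmlAttribute("type")]
    public string Type { get; set; }

    // A single string is enough because there are no child elements
    // to split the text into separate portions.
    [XmlText]
    public string Value { get; set; }
}

public static class PhoneDemo
{
    // Hypothetical helper: deserialise a Phone element from a string.
    public static Phone Read(string xml)
    {
        var serializer = new XmlSerializer(typeof(Phone));
        using (var reader = new StringReader(xml))
        {
            return (Phone)serializer.Deserialize(reader);
        }
    }

    public static void Main()
    {
        Phone phone = Read("<Phone type=\"mobile\">some text here</Phone>");
        Console.WriteLine(phone.Type + ": " + phone.Value);
    }
}
```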




3 Comments:

  • Hi Yossi,

    To me it seems like a bug in the DOM model when the back-and-forth XML <--> object transformations are ambiguous.
    At first glance we should use the order in the string[] array. But the XML standard says that attributes do NOT have an order, only elements do. Oops!

    By Blogger Leonid Ganeline, at 01/04/2009, 19:39  

  • Hi,

    So what are you saying? That setting the Mixed property to true in a schema is not advised?

    Thanx,

    Yonathan.

    By Blogger Yonathan, at 26/08/2009, 16:01  

  • Not really; personally I don't like it in many cases, as I pointed out, but I think it's completely valid.

    I bumped into this while doing some serialisation and de-serialisation. As long as you know what to expect and what the limitations are (not being able to represent the message accurately in the deserialised format), and you can live with them, it's a perfectly valid usage.

    By Blogger Yossi Dahan, at 26/08/2009, 17:40  
