When I first started writing the back-end code for CrimeSpot.net, I was confronted with a dilemma: I had to import two different versions of Atom and three of RSS, all of which had slightly different formats. I had two options. I could create a separate routine within the program to import each of these formats, or I could transform each of them to a single format using XSL templates.
I decided to use templates and import a single, common XML format. Originally, I chose to do this because it made it very simple to separate program and data. Combining the two is one of my biggest pet peeves. By doing it this way, I could just create an entry in the database for each input type and include an XSL file to change it to the common form.
Subsequent events have shown this to be a wise decision.
Why? Well, I have been fooling around with one of the features of the .NET framework – the ability to take objects within programs and “serialize” them to XML files. Normally this is used so that you can retain the object’s value between instances of the program. If you need that object back at a later time, you can “deserialize” that XML file back into an object.
But when you’re deserializing, there’s no reason that the XML must come from an object that was previously serialized. You can use any XML file that matches the object’s format. With a little work, you can even import a collection of objects.
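To make that concrete, here is a minimal sketch that deserializes hand-written XML that never came from a serializer. The Feed class, its fields, and the sample XML are hypothetical and exist only for this illustration; the only requirement is that the element names line up with the public members.

```vbnet
Imports System
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical class used only for this illustration.
Public Class Feed
    Public Name As String
    Public Url As String
End Class

Module DeserializeDemo
    Sub Main()
        ' Hand-written XML: it was never produced by XmlSerializer,
        ' but its element names match the public fields of Feed.
        Dim feedXml As String = _
            "<Feed><Name>Example</Name><Url>http://example.com/rss</Url></Feed>"

        Dim serializer As New XmlSerializer(GetType(Feed))
        Using reader As New StringReader(feedXml)
            Dim item As Feed = CType(serializer.Deserialize(reader), Feed)
            Console.WriteLine(item.Name & " -> " & item.Url)
        End Using
    End Sub
End Module
```

By default the root element must match the class name and each child element must match a public field or property name, which is exactly what an XSL template can be written to produce.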
This helps me tremendously because the objects I will be importing need some processing before they can be saved. In particular, I need to inspect a date/time field and capture the offset from Universal time (UTC, aka GMT). This information is lost once the text is parsed into a Date value, so I need to capture it while the date is still just text.
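One way to do this is sketched below, under some loud assumptions: the FeedItem class and its member names are hypothetical, the date text is assumed to be in an ISO 8601 style such as Atom uses, and DateTimeOffset requires .NET 3.5 or later. The serializer hands the raw text to a String property, and the setter pulls out the offset before anything converts the value to a plain date.

```vbnet
Imports System
Imports System.Globalization
Imports System.IO
Imports System.Xml.Serialization

' Hypothetical item class; the names are illustrative only.
Public Class FeedItem
    Private PublishedValue As String

    ' Filled in by the property setter below; never read from or written to XML.
    <XmlIgnore()> _
    Public PublishedUtc As DateTime
    <XmlIgnore()> _
    Public UtcOffset As TimeSpan

    ' The serializer assigns the raw text here, so the offset can be
    ' captured while the date is still just a string.
    <XmlElement("Published")> _
    Public Property PublishedText() As String
        Get
            Return PublishedValue
        End Get
        Set(ByVal value As String)
            PublishedValue = value
            Dim parsed As DateTimeOffset = _
                DateTimeOffset.Parse(value, CultureInfo.InvariantCulture)
            UtcOffset = parsed.Offset
            PublishedUtc = parsed.UtcDateTime
        End Set
    End Property
End Class

Module OffsetDemo
    Sub Main()
        Dim itemXml As String = _
            "<FeedItem><Published>2006-03-01T09:30:00-05:00</Published></FeedItem>"

        Dim serializer As New XmlSerializer(GetType(FeedItem))
        Using reader As New StringReader(itemXml)
            Dim item As FeedItem = CType(serializer.Deserialize(reader), FeedItem)
            ' The offset survives alongside the normalized UTC value.
            Console.WriteLine(item.UtcOffset.ToString())
            Console.WriteLine(item.PublishedUtc.ToString("u"))
        End Using
    End Sub
End Module
```

Dates in other formats (the RFC 822 style used by RSS, for instance) may need their own parsing step; this sketch does not cover them.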
And .NET supports saving XML directly into a database (via the DataSet object), so when I’m done, I can just serialize the object and save the resulting XML. This approach may have performance issues, but it’s simple and elegant, and I can always buy a faster computer.
UPDATE: Here’s a little source code to show how this works. This code will read XML from a DataSet and import it into a collection of objects. First, the object classes:
Public Class SourceTypes
    ' Backing field for the wrapped collection.
    Private TypeList As New SourceTypeList

    ' Emit each list item as a <SourceType> element directly under the root.
    <System.Xml.Serialization.XmlElementAttribute("SourceType", Form:=System.Xml.Schema.XmlSchemaForm.Unqualified)> _
    Public Property Types() As SourceTypeList
        Get
            Return Me.TypeList
        End Get
        Set(ByVal TypeList As SourceTypeList)
            Me.TypeList = TypeList
        End Set
    End Property

    ' XmlSerializer requires a public parameterless constructor.
    Public Sub New()
    End Sub
End Class
This class is a serialization wrapper. It exists only to provide a convenient XML representation of the collection of SourceType objects. For information on the SourceTypeList class, please see this post. Incidentally, the XmlElementAttribute causes the list not to have an XML element of its own; instead it presents the list items directly below the root element.
Here is the SourceType class:
' Requires Imports System.Xml.Xsl at the top of the file.
Public Class SourceType
    Public SourceTypeID As Long
    Public Name As String
    Public Description As String
    Public ItemField As String
    Public UpdateField As String
    Public UpdateCheckRX As String
    Public UpdateSelectRX As String
    Public UpdateReplaceRX As String
    Private TemplateString As String

    ' Excluded from serialization; built from the XSL text instead.
    <System.Xml.Serialization.XmlIgnore()> _
    Public TemplateTransform As New XslCompiledTransform
    .
    .
    .
End Class
I have simplified the class a bit. It had a property that accepted an XSL string and used it to initialize the TemplateTransform field. I can’t emphasize enough how helpful properties are when using serialization. They make it easy to do some processing without having to explicitly invoke any methods. Here, the XmlIgnore attribute prevents the TemplateTransform field from participating in serialization.
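Here is a rough sketch of what such a property could look like. It is a reconstruction, not the original code: the class name TemplatedSourceType, the property name Template, and the sample stylesheet are all assumptions made for illustration.

```vbnet
Imports System
Imports System.IO
Imports System.Xml
Imports System.Xml.Serialization
Imports System.Xml.Xsl

' Cut-down, hypothetical version of SourceType with only the template members.
Public Class TemplatedSourceType
    Private TemplateString As String

    ' Not serialized; rebuilt whenever Template is assigned.
    <XmlIgnore()> _
    Public TemplateTransform As New XslCompiledTransform

    ' Hypothetical property name. The serializer would assign the XSL text
    ' here during deserialization, and the setter compiles it as a side effect.
    Public Property Template() As String
        Get
            Return TemplateString
        End Get
        Set(ByVal value As String)
            TemplateString = value
            If String.IsNullOrEmpty(value) Then Exit Property
            Using reader As XmlReader = XmlReader.Create(New StringReader(value))
                TemplateTransform.Load(reader)
            End Using
        End Set
    End Property
End Class

Module TemplateDemo
    Sub Main()
        ' A trivial stylesheet that writes the text "ok" for any input.
        Dim xsl As String = _
            "<xsl:stylesheet version='1.0' xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>" & _
            "<xsl:output method='text'/>" & _
            "<xsl:template match='/'>ok</xsl:template>" & _
            "</xsl:stylesheet>"

        Dim source As New TemplatedSourceType
        source.Template = xsl   ' compiles the stylesheet; no explicit Load call

        Dim result As New StringWriter
        Using feedReader As XmlReader = XmlReader.Create(New StringReader("<feed/>"))
            source.TemplateTransform.Transform(feedReader, Nothing, result)
        End Using
        Console.WriteLine(result.ToString())
    End Sub
End Module
```

Because the work happens in the setter, the transform is ready as soon as the object has been deserialized, with no separate initialization pass over the collection.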
Now here’s the guts of the program, where we instantiate the class from the DataSet (which we will assume has already been filled):
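' Requires Imports System.IO and Imports System.Xml.Serialization at the top of the file.
Dim TypeList As SourceTypes
Dim TypeSerializer As New XmlSerializer(GetType(SourceTypes))
Dim TypeReader As StringReader
' GetXml returns the entire DataSet as one XML string.
TypeReader = New StringReader(TypeSet.GetXml)
TypeList = CType(TypeSerializer.Deserialize(TypeReader), SourceTypes)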
It may be more efficient to use DataSet.WriteXml and an XML reader here; I haven’t tested it. The result is an object that contains a collection of SourceType objects.
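For anyone who wants to try the WriteXml route, here is an untested-for-speed sketch of how it might look. The classes are hypothetical one-field stand-ins (with a generic List in place of SourceTypeList), and the DataSet is built in memory rather than filled from a database. Note that the DataSet name and table name become the root and row element names, so they have to match what the serializer expects.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Data
Imports System.IO
Imports System.Xml
Imports System.Xml.Serialization

' Hypothetical stand-ins for the classes above, trimmed to one field.
Public Class SourceTypeSketch
    Public Name As String
End Class

<XmlRoot("SourceTypes")> _
Public Class SourceTypesSketch
    <XmlElement("SourceType")> _
    Public Types As New List(Of SourceTypeSketch)
End Class

Module WriteXmlDemo
    Sub Main()
        ' Build a small DataSet in memory instead of filling it from a database.
        Dim TypeSet As New DataSet("SourceTypes")
        Dim table As DataTable = TypeSet.Tables.Add("SourceType")
        table.Columns.Add("Name", GetType(String))
        table.Rows.Add(New Object() {"Atom 1.0"})
        table.Rows.Add(New Object() {"RSS 2.0"})

        Dim TypeSerializer As New XmlSerializer(GetType(SourceTypesSketch))
        Dim TypeList As SourceTypesSketch

        ' Write the XML to a stream and read it back with an XmlReader,
        ' rather than building one large string with GetXml.
        Using xmlStream As New MemoryStream
            TypeSet.WriteXml(xmlStream)
            xmlStream.Position = 0
            Using reader As XmlReader = XmlReader.Create(xmlStream)
                TypeList = CType(TypeSerializer.Deserialize(reader), SourceTypesSketch)
            End Using
        End Using

        Console.WriteLine(TypeList.Types.Count)
    End Sub
End Module
```

Whether this is actually faster than GetXml would need measuring on real data; the sketch only shows that the two approaches produce the same object.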
As always, please drop a note in the comments if this helps.
