Back before I migrated from XML to regular expressions, I used XSL transforms to change various flavors of RSS and Atom feeds into a common format for importing. XSLT had a very nice function in it called normalize-space(). This function would take a string and return you that same string, except with all instances of multiple whitespace characters reduced to a single space. This was pretty handy: I needed to count words so I could create a short extract, and it meant I’d only need to worry about a single space at a time.
When I moved the GetExtract functionality into Visual Basic, I figured I didn’t need to worry about this, since I would be using the String.Split function to create an array of words, and that function would be smart enough to deal with consecutive spaces, right? Turns out I wasn’t giving Bill Gates and his minions enough credit. When the Split function is confronted with two or more consecutive spaces, it does indeed count some of them as words*. A web search didn’t turn up a native .NET way to do this, so I had to implement it myself.
And as it turns out, it’s pretty simple. I just used the regular expression \s\s+ to match any run of two or more whitespace characters – the first \s matches a single whitespace character, and \s+ matches one or more occurrences after it.
Here’s all the code required:
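' Requires: Imports System.Text.RegularExpressions
Public Shared Function NormalizeWhitespace(ByVal InputStr As String) As String
    ' Match any run of two or more whitespace characters
    Dim NormRx As Regex = New Regex("\s\s+")
    ' Trim the ends, then collapse each run to a single space
    Return NormRx.Replace(InputStr.Trim(), " ")
End Function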
That’s it, and it works like a champ.
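One difference from XSLT worth knowing about: normalize-space() also turns a lone tab or line break into a space, while \s\s+ leaves it alone because it only matches runs of two or more. Here is a rough sketch of the two side by side. The module and function names are made up for illustration, and CollapseAll swaps in the pattern \s+, which is not the pattern used above.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports Microsoft.VisualBasic

Module WhitespaceSketch
    ' The pattern from the post: only runs of two or more whitespace characters.
    Function CollapseRuns(ByVal value As String) As String
        Return Regex.Replace(value.Trim(), "\s\s+", " ")
    End Function

    ' Hypothetical variant: every run of whitespace, including a single tab or newline.
    Function CollapseAll(ByVal value As String) As String
        Return Regex.Replace(value.Trim(), "\s+", " ")
    End Function

    Sub Main()
        ' Three spaces between "one" and "two", a single tab between "two" and "three".
        Dim sample As String = "  one   two" & vbTab & "three  "
        Console.WriteLine("[" & CollapseRuns(sample) & "]")
        Console.WriteLine("[" & CollapseAll(sample) & "]")
    End Sub
End Module
```

The first call keeps the tab between "two" and "three"; the second replaces it with a space. If all you do afterwards is split on spaces, that difference can matter.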
* As it happens I didn’t check to see if it was counting the spaces themselves as words, or if it was creating words that were empty strings (i.e. the text “between” consecutive spaces). Either way, I was getting extracts that had 10 or 12 words instead of the desired 25.
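For what it's worth, the answer to the footnote is the second one: Split produces an empty string for the gap between each pair of consecutive separators. Since .NET 2.0 there is also an overload that takes StringSplitOptions.RemoveEmptyEntries and drops those empty strings. A minimal sketch, with made-up module and function names, and assuming the words are separated only by plain spaces (tabs and line breaks would need to be added to the separator list):

```vb
Imports System

Module SplitSketch
    ' Default behavior: consecutive spaces produce empty strings in the result.
    Function SplitRaw(ByVal value As String) As String()
        Return value.Split(" "c)
    End Function

    ' Same split, but empty entries are discarded.
    Function SplitWords(ByVal value As String) As String()
        Return value.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries)
    End Function

    Sub Main()
        ' Two spaces after "one", three spaces after "two".
        Dim sample As String = "one  two   three"
        Console.WriteLine(SplitRaw(sample).Length)    ' 6: three words plus three empty strings
        Console.WriteLine(SplitWords(sample).Length)  ' 3
    End Sub
End Module
```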
So a few years ago, as I was clicking the links in my blogroll for the 10th time that day, I wondered if there was a better way of finding out when my favorite sites were updated. This was before feed readers were really popular, and anyway I wanted more of a Google News layout. Then I had an epiphany: site feeds were just XML files, and I had already created a bunch of XML-processing scripts in ASP for my Bleeker Books site (and God help me, it’s still using them).
So over a long weekend I cobbled together a program that would read these feeds into a database and spit them back out in a nice format, and CrimeSpot was born.
Over the next year or so I recoded the site in .NET, and I have to say those tools made it a lot easier. But I still ran into a problem from time to time, one that I couldn’t do anything about: every so often, I couldn’t import a feed. It would have some sort of formatting problem that made it an invalid XML document, and my program would throw up its hands and give up.
Usually this was because of Microsoft Word – if you copy the contents of a Word document and paste it as HTML, a lot of the formatting information gets converted in a weird way. In particular, you end up with a lot of tags that look like <o:p>. To XML, that looks like an undefined namespace, and the document can’t be read. More broadly, any error anywhere in an XML document causes the entire document to be unreadable.
As I said, this has been going on for a while, but as I add feeds I can see that it’s going to be a more and more common problem. So I finally made a command decision. Processing these documents as XML is out. From now on I’m going to use regular expressions to extract the data I want.
For those of you not in the know, regular expressions are pattern matching tools that can find and extract information from a longer document. In practical terms, this means that as long as the tags surrounding the content are correct, I can retrieve the information I want. Any errors in the content itself I can clean up once I’ve got it.
This goes back to Postel’s Law, “Be conservative in what you send, be liberal in what you receive.” In my case, this means I have to make my best effort to accept the data that I’m given, ignoring errors whenever possible. And using regular expressions makes that possible.
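To make the difference concrete, here is a rough sketch of both approaches run against the same broken fragment. The feed text, the module name, and the helper functions are all invented for illustration, and the title pattern is only an assumption about what a tolerant extraction might look like, not the code CrimeSpot uses.

```vb
Imports System
Imports System.Text.RegularExpressions
Imports System.Xml

Module FeedSketch
    ' Returns the text between the first <title> and </title>, or Nothing if there is none.
    Function ExtractTitle(ByVal feed As String) As String
        Dim m As Match = Regex.Match(feed, "<title>(.*?)</title>", _
            RegexOptions.IgnoreCase Or RegexOptions.Singleline)
        If m.Success Then Return m.Groups(1).Value
        Return Nothing
    End Function

    ' True if the strict XML parser accepts the whole document.
    Function ParsesAsXml(ByVal feed As String) As Boolean
        Try
            Dim doc As New XmlDocument()
            doc.LoadXml(feed)
            Return True
        Catch ex As XmlException
            Return False
        End Try
    End Function

    Sub Main()
        ' The <o:p> tag uses a namespace prefix that is never declared.
        Dim feed As String = "<item><title>Post title</title>" & _
            "<description>Pasted from Word<o:p></o:p></description></item>"
        Console.WriteLine(ParsesAsXml(feed))   ' False: one bad tag rejects the whole document
        Console.WriteLine(ExtractTitle(feed))  ' Post title
    End Sub
End Module
```

The regular expression never looks at the description, so the stray Word markup there can't stop it from finding the title.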
Now, I love XML (and XSLT, too), and I use it a lot. In fact, XML is my Golden Hammer – I can find a way to work it into just about every project. But in this case I’m working with data that’s not entirely under my control, and I need to be as flexible as I can. And hammers aren’t noted for flexibility!
When Microsoft introduced the SqlDataSource, they made it very easy to add database connectivity to your ASP.NET pages with a minimum of programming. The SqlDataSource (and the related AccessDataSource) allows you to define database connections declaratively – that is, you define what data it will retrieve and how to manipulate it inside the web page, instead of writing code to provide these functions. Here’s a sample of what an SqlDataSource might look like:
<asp:SqlDataSource ID="SampleDataSource" runat="server"
ConnectionString="Data Source=localhost;Initial Catalog=SampleDB;Integrated Security=True"
SelectCommand="GetSampleData" SelectCommandType="StoredProcedure">
Since SQL Server can return multiple data sets in a single request, the SqlDataSource also made it easy to work with hierarchical data. In other words, if you have an Author record, it’s pretty simple to get a list of Book records associated with it. Microsoft has an in-depth article on this worth reading. For now we’ll just look at a simple example.
Let’s start by assuming that we have a data source called AuthorDatasource with tables named Author and Book, and a relation between them named Author-BookRelation. We then create a FormView called AuthorFormView:
<asp:FormView ID="AuthorFormView" runat="server"
DataSourceID="AuthorDatasource"
DataMember="Author"
DataKeyNames="AuthorID">
Next we’ll create a DataList to display the books associated with this author:
<asp:DataList ID="BookList" runat="server"
DataSource='<%# Container.DataItem.CreateChildView("Author-BookRelation") %>'
DataKeyField="AuthorBookID">
Instead of assigning a DataSourceID, we’re providing the actual data using CreateChildView to create a DataView. When you provide the data this way, however, you do lose the ability to use the declared insert, update, and delete statements for the child data – you have to implement this yourself, in code. You can still use the declared commands on the parent, though.
(Note: In this example it isn’t particularly useful to use CreateChildView – it would be trivial to create another data source and select only the books assigned to this author. It’s much nicer to be able to do this when you have multiple parent records, or parent-child-grandchild data.)
But direct data access is the old school way of doing things. Microsoft has introduced a new datasource, the ObjectDataSource, which facilitates separating the display of data (the web page) from the business logic and database layers. And ObjectDataSource does not work with hierarchical data. Or… does it?
Actually there is a way to get it to work, and it’s not that complicated.
Going back to our Author – Book example, let’s create a couple of objects. (I’m not even going to pretend to use a data-access layer here – we’ll pretend the data appears by magic.) First we’ll look at the Library object, which will supply the lists of authors and books:
Public Class Library
Public Shared Function GetAuthors() As List(Of Author)
...
Return AuthorList
End Function
End Class
If you wanted to select a single author, you could create a function GetAuthorByID which would accept a parameter of the appropriate type. Next let’s look at the Author object:
Public Class Author
Public AuthorID As Integer
Public AuthorName As String
Public AuthorDescription As String
End Class
Now that we’ve defined the objects we want to use, let’s create the ObjectDataSource:
<asp:ObjectDataSource ID="AuthorDatasource" runat="server"
SelectMethod="GetAuthors" TypeName="Library" DataObjectTypeName="Author">
What does this tell us? It tells us that ASP.NET will use the Library class’s GetAuthors method to return a list of type Author. We can then bind our controls to the fields such as AuthorName, etc.
So – how would we get and bind a list of books, eh, smart guy? We add one more property to our Author object:
Public Class Author
...
Public AuthorBooks As List(Of Book)
End Class
A DataSource property can work with any class that implements either IEnumerable or IList (or so I’ve been told). So, updating the code we used above, we now have this:
<asp:DataList ID="BookList" runat="server"
DataSource='<%# CType(Container.DataItem, Author).AuthorBooks %>'
DataKeyField="AuthorBookID">
I don’t know if the CType is absolutely required but it doesn’t hurt. The result – we can now painlessly display the list of books from the Author object.
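Here is a small console sketch of the same parent-child shape, so you can see the object graph the DataList walks. The Book class, its Title property, and the sample data are all hypothetical. I've also declared the members as properties rather than public fields, on the assumption that you'll want to use Eval or Bind in your templates: as far as I know, those resolve property names and don't see public fields. The short Public Property syntax needs VB 10 or later.

```vb
Imports System
Imports System.Collections.Generic

Public Class Book
    Public Property Title As String
End Class

Public Class Author
    Public Property AuthorName As String
    Public Property AuthorBooks As List(Of Book)
End Class

Public Class Library
    ' Hard-coded sample data standing in for a real data-access layer.
    Public Shared Function GetAuthors() As List(Of Author)
        Dim first As New Book()
        first.Title = "First Book"
        Dim second As New Book()
        second.Title = "Second Book"

        Dim sampleAuthor As New Author()
        sampleAuthor.AuthorName = "Sample Author"
        sampleAuthor.AuthorBooks = New List(Of Book)()
        sampleAuthor.AuthorBooks.Add(first)
        sampleAuthor.AuthorBooks.Add(second)

        Dim result As New List(Of Author)()
        result.Add(sampleAuthor)
        Return result
    End Function
End Class

Module HierarchySketch
    Sub Main()
        ' The outer loop plays the part of the parent control,
        ' the inner loop the part of the nested DataList.
        For Each currentAuthor As Author In Library.GetAuthors()
            Console.WriteLine(currentAuthor.AuthorName)
            For Each currentBook As Book In currentAuthor.AuthorBooks
                Console.WriteLine("  " & currentBook.Title)
            Next
        Next
    End Sub
End Module
```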
You can load up the data either by passing a DataSet object from your data access layer and then grinding through all the data, or (my favorite) you can define the XML output of your DataSet with an XSD schema, use the GetXml method, and deserialize this XML into the objects directly.
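The second approach might look something like this. It's only a sketch: the XML string is hand-written to stand in for whatever GetXml would return, and the element names, the Book class, and the LoadAuthor helper are assumptions made for the example. XmlSerializer does the actual work of turning the XML into objects.

```vb
Imports System
Imports System.Collections.Generic
Imports System.IO
Imports System.Xml.Serialization

Public Class Book
    Public Title As String
End Class

<XmlRoot("Author")> _
Public Class Author
    Public AuthorID As Integer
    Public AuthorName As String
    ' Each <Book> element directly under <Author> becomes one list entry.
    <XmlElement("Book")> _
    Public AuthorBooks As New List(Of Book)()
End Class

Module DeserializeSketch
    ' Turns one <Author> document into an Author object with its books attached.
    Function LoadAuthor(ByVal authorXml As String) As Author
        Dim serializer As New XmlSerializer(GetType(Author))
        Using reader As New StringReader(authorXml)
            Return DirectCast(serializer.Deserialize(reader), Author)
        End Using
    End Function

    Sub Main()
        ' Hand-written sample; a real DataSet's XML would follow its own schema.
        Dim authorXml As String = _
            "<Author><AuthorID>1</AuthorID><AuthorName>Sample Author</AuthorName>" & _
            "<Book><Title>First Book</Title></Book>" & _
            "<Book><Title>Second Book</Title></Book></Author>"
        Dim loaded As Author = LoadAuthor(authorXml)
        Console.WriteLine(loaded.AuthorName & ": " & loaded.AuthorBooks.Count & " books")
    End Sub
End Module
```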
Of course I’m leaving out the code to update the books, insert new records, etc., but that should be simple for any semi-intelligent programmer (right?).
Now I don’t know everything about ASP.NET, this data isn’t a particularly good example, and I’m sure there’s plenty of naivete in what I’ve written here, but I still think this can be a useful technique to display related data using the ObjectDataSource.
If this code helps you out, be sure to let me know in the comments.
ASP.NET’s DataList control lets you display a list of data, but also lets you edit the entries in this list (unlike the Repeater control, for example). One thing it won’t let you do, however, is to add a new item to the list. Here’s a simple technique to do just that.
(I actually didn’t come up with this myself – I found it on the Internet. Unfortunately I didn’t keep the URL so I can’t give proper credit.)
The list portion of a DataList is a collection of DataListItem objects, each of which has an ItemType property. In addition to the types representing list entries (Item, SelectedItem, EditItem, etc.) there are the types Header and Footer. This means that the header and footer templates are part of the list, not separate from it, and controls that you place there can be accessed by the DataList events.
Let’s look at an example. In this case we’re editing a database of server information. We’ll use the DataList to add a list of applications to each server. So each DataList Item will display a single field containing the name of the application. When we edit the item, it will display a drop-down list containing all of the applications.
In addition, I’m going to put a drop-down list containing the same list of applications in the footer of the DataList control:

As you can see, there are link buttons on each row labeled “Edit” and “Delete”, and a link button labeled “Add” next to the drop-down list in the footer.
This Add button is the one we’re interested in. Here’s the markup defining it:
<asp:LinkButton ID="InsertButton" runat="server"
CommandName="Insert">Add</asp:LinkButton>
The item of interest, and the one that does the work, is CommandName="Insert". As it happens, the DataList does not have an Insert command. But no worries – as I explain in this article, ASP.NET makes it easy to add our own commands.
To do this, we instruct the DataList control to let us define our own handlers for its commands by adding the OnItemCommand attribute to its definition:
<asp:DataList ID="ServerAppList" runat="server" DataSourceID="ServerAppDataSource"
DataKeyField="ServerAppID" OnItemCommand="ServerAppList_ItemCommand">
Next we’ll need to define an event handler for ItemCommand:
Protected Sub ServerAppList_ItemCommand(ByVal sender As Object, _
ByVal e As System.Web.UI.WebControls.DataListCommandEventArgs)
I’m not going to cover each of the commands – I already went over several of them in the aforementioned article, so you can refer to that if you need some guidance. What we’re interested in here is the Insert command. Picking up inside the Select Case statement, we first get a reference to the drop-down list named AppList and extract its value:
Select Case e.CommandName
…
Case "Insert"
Dim AppListCtrl As DropDownList = DirectCast(e.Item.FindControl("AppList"), DropDownList)
Dim AppName As String = AppListCtrl.SelectedValue
In this case we’re using a datasource with an InsertParameter named AppName, so next we’ll set the value of this parameter and insert the record:
ServerAppDataSource.InsertParameters("AppName").DefaultValue = AppName
ServerAppDataSource.Insert()
Lastly, we’ll make sure that no record is selected for editing, and bind the control:
ServerAppList.EditItemIndex = -1
ServerAppList.DataBind()
I’m glossing over a bit here – in order to link the new record back to the parent of the list (the server record, in this case) you’ll need to supply a foreign key to the parent record as an InsertParameter. And of course you’ll need to code the other commands. But do that, and this technique will help make a useful control even better.
Update: After looking over this, I realized I never stated exactly why you should handle this in the ItemCommand event instead of, say, in the InsertButton_Click event. The reason: in the ItemCommand event, you get the parameter e of type DataListCommandEventArgs. This parameter has a property Item that points to the row from which the command was called. Therefore you can use e.Item.FindControl to get the controls containing the values to add.
Without this parameter, finding controls in the header or footer of a DataList can be problematic. The header and footer are DataListItem objects much like the other rows, but as far as I can tell they are not included in DataList.Items, which holds only the data-bound rows. Instead you have to loop through the DataList’s child controls, testing each item’s ItemType to find the correct row, after which you can use FindControl to get the controls you want. It’s much simpler the other way.
I have two hobbies: writing and programming. In both of these it’s very important to be able to track your revisions and recover previous versions. For the past year or so I have used the TortoiseSVN client for source control, but I didn’t really understand what I was doing.
Until the other day, when I read Eric Sink’s excellent source control HOWTO. This covers pretty much everything you would need to know about editing documents and saving your changes.
Better still, not only does it apply to computer code, it applies to stories equally well. Finish a draft? Save it with a tag. An editor asks for revisions? Create a branch that may be later merged into the main trunk. Decide you liked that scene you deleted 2 months ago? It’s still there.
Plus it’s handy for procrastination. Think of all the time you could waste getting this set up right when you should be writing prose/code!