Back before I migrated from XML to regular expressions, I used XSL transforms to change various flavors of RSS and Atom feeds into a common format for importing. XSLT had a very nice function in it called normalize-space(). This function would take a string and return that same string with leading and trailing whitespace stripped and every run of whitespace characters reduced to a single space. This was pretty handy: I needed to count words so I could create a short extract, and it meant I’d only need to worry about a single space at a time.
When I moved the GetExtract functionality into Visual Basic, I figured I didn’t need to worry about this, since I would be using the String.Split function to create an array of words, and that function would be smart enough to deal with consecutive spaces, right? Turns out I was giving Bill Gates and his minions too much credit. When the Split function is confronted with two or more consecutive spaces, it does indeed count some of them as words*. A web search didn’t turn up a native .NET way to normalize whitespace, so I had to implement it myself.
And as it turns out it’s pretty simple. I just used the regular expression \s\s+ to match any sequence of more than a single whitespace character – \s matches whitespace, and + means one or more occurrences.
Here’s all the code required:
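' Requires: Imports System.Text.RegularExpressions
Public Shared Function NormalizeWhitespace(ByVal InputStr As String) As String
    ' Collapse each run of two or more whitespace characters into one space.
    Dim NormRx As Regex = New Regex("\s\s+")
    Return NormRx.Replace(InputStr.Trim, " ")
End Function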
That’s it, and it works like a champ.
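One thing worth noting: \s\s+ leaves a lone tab or line break alone, whereas XSLT's normalize-space() turns those into spaces too. Here is a minimal sketch of a variant that uses \s+ instead, assuming that closer match to normalize-space() is what you want. The module and function names are made up for illustration.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module NormalizeDemo
    ' Hypothetical variant of NormalizeWhitespace: \s+ matches every run of
    ' whitespace, including a single tab or line break, and replaces it with
    ' one plain space.
    Public Function NormalizeAllWhitespace(ByVal input As String) As String
        Return Regex.Replace(input.Trim(), "\s+", " ")
    End Function

    Sub Main()
        Dim sample As String = "  one   two" & vbTab & "three" & vbCrLf & "four  "
        ' Prints: one two three four
        Console.WriteLine(NormalizeAllWhitespace(sample))
    End Sub
End Module
```

The trade-off is that single spaces get "replaced" with themselves, which costs a little extra work but keeps the pattern simple.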
* As it happens I didn’t check to see if it was counting the spaces themselves as words, or if it was creating words that were empty strings (i.e. the text “between” consecutive spaces). Either way, I was getting extracts that had 10 or 12 words instead of the desired 25.
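For the curious, the short sketch below shows what Split does with consecutive spaces: it returns empty strings for the gaps between them. It also shows the StringSplitOptions.RemoveEmptyEntries option (available since .NET 2.0), which skips those empty entries. The module and function names are only for illustration.

```vbnet
Imports System

Module SplitDemo
    ' Counts array entries from a plain Split, empty strings included.
    Public Function CountWordsNaive(ByVal s As String) As Integer
        Return s.Split(" "c).Length
    End Function

    ' Counts only non-empty entries.
    Public Function CountWords(ByVal s As String) As Integer
        Return s.Split(New Char() {" "c}, StringSplitOptions.RemoveEmptyEntries).Length
    End Function

    Sub Main()
        Dim sample As String = "one  two   three"
        Console.WriteLine(CountWordsNaive(sample)) ' 6: three words plus three empty strings
        Console.WriteLine(CountWords(sample))      ' 3
    End Sub
End Module
```

This only handles spaces; tabs and line breaks would still need the regular expression approach.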
So a few years ago, as I was clicking the links in my blogroll for the 10th time that day, I wondered if there was a better way of finding out when my favorite sites were updated. This was before feed readers were really popular, and anyway I wanted more of a Google News layout. Then I had an epiphany: site feeds were just XML files, and I had already created a bunch of XML-processing scripts in ASP for my Bleeker Books site (and God help me, it’s still using them).
So over a long weekend I cobbled together a program that would read these feeds into a database and spit them back out in a nice format, and CrimeSpot was born.
Over the next year or so I recoded the site in .NET, and I have to say those tools made it a lot easier. But I still ran into a problem from time to time, one that I couldn’t do anything about: every so often, I couldn’t import a feed. It would have some sort of formatting problem that made it an invalid XML document, and my program would throw up its hands and give up.
Usually this was because of Microsoft Word – if you copy the contents of a Word document and paste it as HTML, a lot of the formatting information gets converted in a weird way. In particular, you end up with a lot of tags that look like <o:p>. To an XML parser, that looks like an undeclared namespace prefix, and the document can’t be read. More broadly, a single well-formedness error anywhere in an XML document makes the entire document unreadable.
As I said, this has been going on for a while, but as I add feeds I can see that it’s going to be a more and more common problem. So I finally made a command decision. Processing these documents as XML is out. From now on I’m going to use regular expressions to extract the data I want.
For those of you not in the know, regular expressions are pattern matching tools that can find and extract information from a longer document. In practical terms, this means that as long as the tags surrounding the content are correct, I can retrieve the information I want. Any errors in the content itself I can clean up once I’ve got it.
This goes back to Postel’s Law, “Be conservative in what you send, be liberal in what you receive.” In my case, this means I have to make my best effort to accept the data that I’m given, ignoring errors whenever possible. And using regular expressions makes that possible.
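To make the idea concrete, here is a minimal sketch of pulling titles out of a feed with a regular expression. It is not the actual CrimeSpot code: the feed fragment, the pattern and the names ExtractTitles and FeedExtractDemo are all assumptions made up for illustration.

```vbnet
Imports System
Imports System.Collections.Generic
Imports System.Text.RegularExpressions

Module FeedExtractDemo
    ' Pulls the text of every <title> element out of a feed, treating the
    ' feed as plain text rather than parsing it as XML.
    Public Function ExtractTitles(ByVal feed As String) As List(Of String)
        Dim titles As New List(Of String)
        ' Lazy match (.*?) so each title stops at its own closing tag.
        Dim titleRx As New Regex("<title[^>]*>(.*?)</title>", _
            RegexOptions.Singleline Or RegexOptions.IgnoreCase)
        For Each m As Match In titleRx.Matches(feed)
            titles.Add(m.Groups(1).Value.Trim())
        Next
        Return titles
    End Function

    Sub Main()
        ' Made-up fragment: <o:p> uses an undeclared prefix, so an XML parser
        ' would reject the whole document.
        Dim feed As String = _
            "<item><title>First post</title>" & _
            "<description><o:p>Pasted from Word</o:p></description></item>" & _
            "<item><title>Second post</title>" & _
            "<description>Plain text</description></item>"

        For Each title As String In ExtractTitles(feed)
            Console.WriteLine(title)
        Next
    End Sub
End Module
```

The broken <o:p> markup inside the description never gets in the way, because the pattern only cares that the title tags themselves are intact.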
Now, I love XML (and XSLT, too), and I use it a lot. In fact, XML is my Golden Hammer – I can find a way to work it into just about every project. But in this case I’m working with data that’s not entirely under my control, and I need to be as flexible as I can. And hammers aren’t noted for flexibility!
When Microsoft introduced the SqlDataSource, they made it very easy to add database connectivity to your ASP.NET pages with a minimum of programming. The SqlDataSource (and the related AccessDataSource) allow you to define database connections declaratively – that is, you define what data they will retrieve and how to manipulate it inside the web page, instead of writing code to provide these functions. Here’s a sample of what an SqlDataSource might look like:
<asp:SqlDataSource ID="SampleDataSource" runat="server"
    ConnectionString="Data Source=localhost;Initial Catalog=SampleDB;Integrated Security=True"
    SelectCommand="GetSampleData" SelectCommandType="StoredProcedure">
</asp:SqlDataSource>
Since SQL Server can return multiple data sets in a single request, the SqlDataSource also made it easy to work with hierarchical data. In other words, if you have an Author record, it’s pretty simple to get a list of Book records associated with it. Microsoft has an in-depth article on this worth reading. For now we’ll just look at a simple example.
Let’s start by assuming that we have a data source called AuthorDatasource with tables named Author and Book, and a relation between them named Author-BookRelation. We then create a FormView called AuthorFormView:
<asp:FormView ID="AuthorFormView" runat="server"
DataSourceID="AuthorDatasource"
DataMember="Author"
DataKeyNames="AuthorID">
Next we’ll create a DataList to display the books associated with this author:
<asp:DataList ID="BookList" runat="server"
DataSource='<%# Container.DataItem.CreateChildView("Author-BookRelation") %>'
DataKeyField="AuthorBookID">
Instead of assigning a DataSourceID, we’re providing the actual data using CreateChildView to create a DataView. When you provide the data this way, however, you do lose the ability to use the declared insert, update, and delete statements for the child data – you have to implement this yourself, in code. You can still use the declared commands on the parent, though.
(Note: In this example it isn’t particularly useful to use CreateChildView – it would be trivial to create another data source and select only the books assigned to this author. It’s much nicer to be able to do this when you have multiple parent records, or parent-child-grandchild data.)
But direct data access is the old-school way of doing things. Microsoft has introduced a new data source, the ObjectDataSource, which facilitates separating the display of data (the web page) from the business logic and database layers. And ObjectDataSource does not work with hierarchical data. Or… does it?
Actually there is a way to get it to work, and it’s not that complicated.
Going back to our Author – Book example, let’s create a couple of objects. (I’m not even going to pretend to use a data-access layer here – we’ll pretend the data appears by magic.) First we’ll look at the Library object, which will supply the lists of authors and books:
Public Class Library
Public Shared Function GetAuthors() As List(Of Author)
...
Return AuthorList
End Function
End Class
If you wanted to select a single author, you could create a function GetAuthorByID which would accept a parameter of the appropriate type. Next let’s look at the Author object:
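Public Class Author
    ' Properties rather than public fields: data-binding expressions such as
    ' Eval only see properties. (Auto-implemented properties need VB 10 or later.)
    Public Property AuthorID As Integer
    Public Property AuthorName As String
    Public Property AuthorDescription As String
End Class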
Now that we’ve defined the objects we want to use, let’s create the ObjectDataSource:
<asp:ObjectDataSource ID="AuthorDatasource" runat="server"
SelectMethod="GetAuthors" TypeName="Library" DataObjectTypeName="Author">
What does this tell us? It tells us that ASP.NET will use the Library class’s GetAuthors method to return a list of type Author. We can then bind our controls to members such as AuthorName, etc.
So – how would we get and bind a list of books, eh, smart guy? We add one more property to our Author object:
Public Class Author
...
Public Property AuthorBooks As List(Of Book)
End Class
A DataSource property can work with any class that implements either IEnumerable or IListSource (or so I’ve been told), and List(Of Book) is enumerable. So, updating the code we used above, we now have this:
<asp:DataList ID="BookList" runat="server"
DataSource='<%# CType(Container.DataItem, Author).AuthorBooks %>'
DataKeyField="AuthorBookID">
I don’t know if the CType is absolutely required but it doesn’t hurt. The result – we can now painlessly display the list of books from the Author object.
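Here is a minimal, self-contained sketch of the object model, for anyone who wants to try the parent-child shape outside a web page. The Book class, its BookTitle property and the placeholder data are assumptions made up for illustration; the real data would come from your data-access layer. It uses auto-implemented properties, so it needs VB 10 or later.

```vbnet
Imports System
Imports System.Collections.Generic

Public Class Book
    Public Property BookTitle As String
End Class

Public Class Author
    Public Property AuthorName As String
    ' The child list that the nested DataList binds to.
    Public Property AuthorBooks As New List(Of Book)
End Class

Public Class Library
    ' Stand-in for the data-access layer: returns placeholder data only.
    Public Shared Function GetAuthors() As List(Of Author)
        Dim first As New Author With {.AuthorName = "Author A"}
        first.AuthorBooks.Add(New Book With {.BookTitle = "Book A1"})
        first.AuthorBooks.Add(New Book With {.BookTitle = "Book A2"})

        Dim second As New Author With {.AuthorName = "Author B"}
        second.AuthorBooks.Add(New Book With {.BookTitle = "Book B1"})

        Dim authorList As New List(Of Author)
        authorList.Add(first)
        authorList.Add(second)
        Return authorList
    End Function
End Class

Module LibraryDemo
    Sub Main()
        ' The outer loop plays the role of the parent control, the inner loop
        ' the role of the nested DataList bound to AuthorBooks.
        For Each a As Author In Library.GetAuthors()
            Console.WriteLine(a.AuthorName)
            For Each b As Book In a.AuthorBooks
                Console.WriteLine("  " & b.BookTitle)
            Next
        Next
    End Sub
End Module
```

The nested loops mirror what the page does: each parent item hands its own child list to the inner control.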
You can load up the data either by passing a DataSet object from your data access layer and then grinding through all the data, or (my favorite) you can define the XML output of your DataSet with an XSD schema, use the GetXml method, and deserialize this XML into the objects directly.
Of course I’m leaving out the code to update the books, insert new records, etc., but that should be simple for any semi-intelligent programmer (right?).
Now I don’t know everything about ASP.NET, this data isn’t a particularly good example, and I’m sure there’s plenty of naivete in what I’ve written here, but I still think this can be a useful technique to display related data using the ObjectDataSource.
If this code helps you out, be sure to let me know in the comments.
ASP.NET’s DataList control lets you display a list of data, but also lets you edit the entries in this list (unlike the Repeater control, for example). One thing it won’t let you do, however, is to add a new item to the list. Here’s a simple technique to do just that.
(I actually didn’t come up with this myself – I found it on the Internet. Unfortunately I didn’t keep the URL so I can’t give proper credit.)
The list portion of a DataList is a collection of DataListItem objects, each of which has an ItemType property. In addition to the types representing list entries (Item, SelectedItem, EditItem, etc.) there are the types Header and Footer. This means that the header and footer templates are part of the list, not separate from it, and controls that you place there can be accessed by the DataList events.
Let’s look at an example. In this case we’re editing a database of server information. We’ll use the DataList to add a list of applications to each server. So each DataList Item will display a single field containing the name of the application. When we edit the item, it will display a drop-down list containing all of the applications.
In addition, I’m going to put a drop-down list containing the same list of applications in the footer of the DataList control:

As you can see, there are link buttons on each row labeled “Edit” and “Delete”, and a link button labeled “Add” next to the drop-down list in the footer.
This Add button is the one we’re interested in. Here’s the markup defining it:
<asp:LinkButton ID="InsertButton" runat="server"
CommandName="Insert">Add</asp:LinkButton>
The item of interest, and the one that does the work, is CommandName="Insert". As it happens, the DataList does not have an Insert command. But no worries – as I explain in this article, ASP.NET makes it easy to add our own commands.
To do this, we instruct the DataList control to let us define our own handlers for its commands by adding the OnItemCommand attribute to its definition:
<asp:DataList ID="ServerAppList" runat="server" DataSourceID="ServerAppDataSource"
DataKeyField="ServerAppID" OnItemCommand="ServerAppList_ItemCommand">
Next we’ll need to define an event handler for ItemCommand:
Protected Sub ServerAppList_ItemCommand(ByVal sender As Object, _
ByVal e As System.Web.UI.WebControls.DataListCommandEventArgs)
I’m not going to cover each of the commands – I already went over several of them in the aforementioned article, so you can refer to that if you need some guidance. What we’re interested in here is the Insert command. Picking up inside the Select Case statement, we first get a reference to the drop-down list named AppList and extract its value:
Select Case e.CommandName
…
Case "Insert"
Dim AppListCtrl As DropDownList = CType(e.Item.FindControl("AppList"), DropDownList)
Dim AppName As String = AppListCtrl.SelectedValue
In this case we’re using a datasource with an InsertParameter named AppName, so next we’ll set the value of this parameter and insert the record:
ServerAppDataSource.InsertParameters("AppName").DefaultValue = AppName
ServerAppDataSource.Insert()
Lastly, we’ll make sure that no record is selected for editing, and bind the control:
ServerAppList.EditItemIndex = -1
ServerAppList.DataBind()
I’m glossing over a bit here – in order to link the new record back to the parent of the list (the server record, in this case) you’ll need to supply a foreign key to the parent record as an InsertParameter. And of course you’ll need to code the other commands. But do that, and this technique will help make a useful control even better.
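Putting the pieces together, the whole handler might look something like the sketch below. It assumes a code-behind class for a page that declares ServerAppList and ServerAppDataSource in its markup. The class name ServerEdit, the ServerID insert parameter and reading the parent key from the query string are all hypothetical stand-ins for however your page tracks the parent server record.

```vbnet
Imports System.Web.UI.WebControls

' Hypothetical code-behind class; ServerAppList and ServerAppDataSource are
' the controls declared in the page markup.
Partial Public Class ServerEdit
    Inherits System.Web.UI.Page

    ' Wired up through OnItemCommand="ServerAppList_ItemCommand" in the markup.
    Protected Sub ServerAppList_ItemCommand(ByVal sender As Object, _
            ByVal e As DataListCommandEventArgs)
        Select Case e.CommandName
            Case "Insert"
                ' e.Item is the footer row, because that is where the Add button lives.
                Dim appListCtrl As DropDownList = _
                    TryCast(e.Item.FindControl("AppList"), DropDownList)
                If appListCtrl Is Nothing Then Return

                ServerAppDataSource.InsertParameters("AppName").DefaultValue = _
                    appListCtrl.SelectedValue
                ' Assumption: an InsertParameter named ServerID carries the foreign
                ' key, and the parent key arrives on the query string.
                ServerAppDataSource.InsertParameters("ServerID").DefaultValue = _
                    Request.QueryString("ServerID")
                ServerAppDataSource.Insert()

                ' Leave edit mode and refresh the list.
                ServerAppList.EditItemIndex = -1
                ServerAppList.DataBind()
        End Select
    End Sub
End Class
```

TryCast is used instead of a direct cast so that a renamed or missing drop-down list results in a quiet no-op rather than an exception; whether that is the right behavior depends on your page.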
Update: After looking over this, I realized I never stated exactly why you should handle this in the ItemCommand event instead of, say, in the InsertButton_Click event. The reason: in the ItemCommand event, you get the parameter e of type DataListCommandEventArgs. This parameter has a property Item that points to the row from which the command was called. Therefore you can use e.Item.FindControl to get the controls containing the values to add.
Without this parameter, finding controls in the header or footer of a DataList can be problematic. The header and footer are DataListItem objects much like the other rows, but the DataList.Items collection only holds the data-bound rows, so you have to loop through the DataList’s Controls collection, testing each item’s ItemType, to find the correct row, after which you can use FindControl to get the controls you want. It’s much simpler the other way.
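For comparison, here is a rough sketch of the long way around. As far as I know, the Items collection contains only the data-bound rows, so this walks the DataList's child controls instead. The module and function names are made up for illustration.

```vbnet
Imports System.Web.UI
Imports System.Web.UI.WebControls

Public Module DataListHelpers
    ' Returns the footer row of a DataList, or Nothing if no footer was created.
    Public Function FindFooter(ByVal list As DataList) As DataListItem
        For Each ctrl As Control In list.Controls
            Dim item As DataListItem = TryCast(ctrl, DataListItem)
            If item IsNot Nothing AndAlso item.ItemType = ListItemType.Footer Then
                Return item
            End If
        Next
        Return Nothing
    End Function
End Module
```

With that in place, something like FindFooter(ServerAppList).FindControl("AppList") would get you the drop-down list from a button's Click handler, after checking the result for Nothing.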
When programming an ASP.NET application, it’s very simple to use the built-in Button and LinkButton controls to perform actions like adding a record to a database or saving changes. When you create one of these buttons, the HTML looks something like this:
<asp:LinkButton ID="EditButton" runat="server" CommandName="Edit">Edit</asp:LinkButton>
Notice the CommandName attribute. Built-in commands include Insert, Update, Delete, Edit, and Cancel. If you are using a container such as a DataGrid or a DataList, you can customize these commands by adding an event handler:
<asp:DataList ID="AppList" runat="server" DataSourceID="AppDataSource"
DataKeyField="AppID" OnEditCommand="AppList_EditCommand">
But what if, instead of redefining the standard command, you want to create your own? This is very, very simple, and you implement it using the OnItemCommand event handler. The ItemCommand also applies to objects like the FormView, and adds a lot of flexibility to your applications. First add it to the container definition (a FormView in this example):
<asp:FormView ID="AppFormView" runat="server" DataSourceID="AppDataSource"
DataKeyNames="AppID" OnItemCommand="AppFormView_ItemCommand">
You can do a lot of cool stuff in the ItemCommand procedure, but here we’re going to focus on a common problem: calculating fields before saving to the database. For example, if you are uploading a file using the FileUpload control, you can modify the Update command to save it in the database, or you may want to record the time each record is modified and the username of the person who modified it.
To implement the various commands, edit the <name>_ItemCommand procedure in your code and add a Select Case statement:
Protected Sub AppFormView_ItemCommand(ByVal sender As Object, _
ByVal e As System.Web.UI.WebControls.FormViewCommandEventArgs)
Select Case e.CommandName
Case "Cancel"
…
Let’s look at how to handle some of the standard commands, using the FormView as an example. These will be different for a DataList or other objects – that code can generally be found in the MSDN Library.
Most of the commands have a built-in method – for example, if the command is “Update”, the FormView comes with an UpdateItem method. Cancel is an exception; to implement this command, run FormView.DataBind to replace any changed fields with the values retrieved from the database.
Now we’re ready to create our own custom command. Let’s take a look at it first, then I’ll explain later:
<asp:LinkButton ID="RetireButton" runat="server"
CommandName="Retire">Retire</asp:LinkButton>
…
Case "Retire"
AppDataSource.UpdateParameters("AppRetired").DefaultValue = True.ToString()
AppFormView.UpdateItem(True)
So – the first step is to create a button or link button invoking your custom command, in this case a command to retire an application. Once the button is created, you can add a Case to the ItemCommand procedure with the name of the new command, and then fill in your code. For this example we are using an update parameter to store the new value. You can also place this value in a bound control, as we’ll see in a minute.
There are a couple of limitations here. First, you can only use the parameter method if there are no controls bound to this field. If there are any, the parameter’s default value will be replaced by the value in the control. Second, if you will not be setting this parameter every time you update this record, then you must use a bound field. Otherwise when you update without specifying a value, the data in the database will be overwritten by the default. In practical terms this means that you’ll probably be writing this value to a bound control.
Since we only want to mark an application as retired once, we’ll instead stick this value in a bound field, and it will be “remembered” even when we don’t supply it:
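Case "Retire"
    ' HiddenField.Value is a String, so store the flag as text.
    Dim AppRetiredCtrl As HiddenField = CType(AppFormView.FindControl("AppRetired"), HiddenField)
    AppRetiredCtrl.Value = True.ToString()
    AppFormView.UpdateItem(True)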
All we had to do here is find the control (a HiddenField) and stuff in the value. When we call UpdateItem this new value will be saved to the database.
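For reference, here is roughly how the complete handler might look as a code-behind sketch. The class name AppEdit is hypothetical, and AppFormView is assumed to be declared in the page markup with a HiddenField named AppRetired bound in its edit template. As far as I know, UpdateItem expects the FormView to be in edit mode when it is called.

```vbnet
Imports System.Web.UI.WebControls

' Hypothetical code-behind class; AppFormView is the FormView declared in
' the page markup.
Partial Public Class AppEdit
    Inherits System.Web.UI.Page

    ' Wired up through OnItemCommand="AppFormView_ItemCommand" in the markup.
    Protected Sub AppFormView_ItemCommand(ByVal sender As Object, _
            ByVal e As FormViewCommandEventArgs)
        Select Case e.CommandName
            Case "Retire"
                Dim retiredCtrl As HiddenField = _
                    TryCast(AppFormView.FindControl("AppRetired"), HiddenField)
                If retiredCtrl IsNot Nothing Then
                    ' HiddenField.Value is a String; Boolean.TrueString is "True".
                    retiredCtrl.Value = Boolean.TrueString
                    ' True asks the FormView to run validation before updating.
                    AppFormView.UpdateItem(True)
                End If
        End Select
    End Sub
End Class
```

The null check is a design choice: if the hidden field is not in the current template, the command does nothing rather than throwing.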
Customizing the built-in commands and adding your own new ones is a powerful way to build your application without writing too much extra code. If these examples help you out, please leave a note in the comments and let me know.
Clarification: I didn’t cover handling the standard FormView commands such as “Insert”, but to be clear: you should not call FormView.InsertItem (for example) from inside the ItemCommand handler if the CommandName actually is “Insert”. A FormView control always executes the built-in actions for the standard commands, even if you have a custom handler defined inside the ItemCommand procedure. So you don’t have to call FormView.UpdateItem for the “Update” command, or FormView.DeleteItem for the “Delete” command. Your code is in addition to the default actions of these commands, and takes place before they are executed.
If you are creating a custom command, then you will need to invoke the appropriate method, InsertItem or UpdateItem or whatever.
Things are a bit different for a DataList. It isn’t as tightly integrated with its data source, and doesn’t even have methods such as InsertItem – the programmer is responsible for handling these commands himself inside ItemCommand, and will usually implement something like DataSource.Update (or .Insert, or whatever).
I wasn’t too clear about this in my own mind and as a result I wasn’t clear in this post. Sorry about that.