Using Regular Expressions
Regular expressions are a powerful tool for manipulating strings, often used to extract or modify data in complicated strings. In a sense, regular expressions are a separate language that lets you specify the pattern of the text you are trying to match. Instead of focusing on the details of regular expression syntax, we'll discuss how to use regular expressions from C# and ASP.NET pages. A useful reference and brief overview of regular expressions can be found at the web page.
To begin using regular expressions, you'll need access to the System.Text.RegularExpressions namespace. Add the directive using System.Text.RegularExpressions; to the top of your C# file (in an ASP.NET page, use <%@ Import Namespace="System.Text.RegularExpressions" %>). In the .NET Framework, the Regex classes live in the System assembly, which Visual Studio.NET projects reference by default, so you normally do not need to add anything through the "Add Reference" dialog.
Matching
One simple goal of regular expressions is to find relevant data. The .NET Framework supports regular expression operations with the Regex class. Because Regex exposes static methods, simple operations do not require instantiating a Regex object. To test whether a data string contains a match for a regular expression, we call the Regex.IsMatch method, which returns true if the pattern is found anywhere in the string:
string strData = "Sally sells sea shells by the seashore";
string strSearch = "s\\w{6,}";
if (Regex.IsMatch(strData, strSearch)) { ... }
The code above attempts to match the regular expression s\w{6,} (the letter 's' followed by at least 6 word characters) against the string "Sally sells sea shells by the seashore". Only "seashore" satisfies it, because matching is case-sensitive by default and the other words are too short. Notice that constructing strSearch required a double backslash (\\) to denote a single backslash. This is because the backslash is the escape character in ordinary C# string literals.
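The double backslash can be avoided with a C# verbatim string literal (the @ prefix), which many people find easier to read for patterns. The following is a minimal sketch; the class name VerbatimPatternDemo and its constants are illustrative and not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class VerbatimPatternDemo
{
    // Escaped form: each "\\" in the literal becomes one backslash in the string.
    public const string Escaped = "s\\w{6,}";

    // Verbatim form: the @ prefix turns off backslash escapes in the literal.
    public const string Verbatim = @"s\w{6,}";

    public static void Main()
    {
        string strData = "Sally sells sea shells by the seashore";

        // Both literals produce the same seven-character pattern.
        Console.WriteLine(Escaped == Verbatim);               // True

        // "seashore" is 's' followed by seven word characters, so the test succeeds.
        Console.WriteLine(Regex.IsMatch(strData, Verbatim));  // True
    }
}
```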
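Now we can determine whether a match was made, but what if we want to know what the match was? We can continue the code by calling the Regex.Match method, which returns an object of type Match. A Match object represents a single match, starting with the first one found in the string. Its Success property reports whether anything matched, and calling NextMatch returns the following match, so a loop can walk through all of them.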
string strData = "Sally sells sea shells by the seashore";
string strSearch = "s\\w{6,}";
string strMatches = "";
// Regex.Match returns the first match; Success is false when nothing matched,
// so a separate IsMatch check is not needed.
Match match = Regex.Match(strData, strSearch);
while (match.Success) {
    // Append the matched text, then advance to the next match.
    strMatches = strMatches + match.Value + "<br>";
    match = match.NextMatch();
}
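As an alternative to stepping through matches one at a time, the Regex.Matches method returns a MatchCollection that can be iterated with foreach. The sketch below is one way to collect every match into a list; the class MatchCollectionDemo, the helper FindAll, and the looser pattern s\w{4,} are illustrative assumptions, chosen so that more than one word matches.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class MatchCollectionDemo
{
    // Returns the text of every non-overlapping match of pattern in input.
    public static List<string> FindAll(string input, string pattern)
    {
        var results = new List<string>();
        foreach (Match m in Regex.Matches(input, pattern))
        {
            results.Add(m.Value);
        }
        return results;
    }

    public static void Main()
    {
        string strData = "Sally sells sea shells by the seashore";

        // Prints "sells", "shells" and "seashore", one per line.
        foreach (string word in FindAll(strData, @"s\w{4,}"))
        {
            Console.WriteLine(word);
        }
    }
}
```

With the original pattern s\w{6,}, FindAll returns a single entry, "seashore".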
Replacing
Another large part of working with regular expressions is manipulating string data by specifying search and replace strings. One common use is replacing the characters '<' and '>' with &lt; and &gt; respectively, so that HTML tags are displayed as text instead of being interpreted by the browser. This can be accomplished with Regex.Replace, a static method that returns the result of the search and replace.
string strData = "<b>Sally sells sea shells by the seashore</b>";
strData = Regex.Replace(strData, "<", "&lt;");
strData = Regex.Replace(strData, ">", "&gt;");
Finally, you'll probably want to see a more complicated use of Regex.Replace. For example, we can rewrite the code above to make a single Regex.Replace call as follows:
string strData = "<b>Sally sells sea shells by the seashore</b>";
string strSearch = "<(?<tagname>[/\\w]*?)>";
string strReplace = "&lt;${tagname}&gt;";
strData = Regex.Replace(strData, strSearch, strReplace);
This last example demonstrates several useful features of regular expressions. First, the text matched between the angle brackets is captured in a group named "tagname" by writing ?<tagname> at the start of the parenthesized group. The replacement string can then refer to the captured text as ${tagname}. The search pattern also uses a character class ([/\w] matches either a / or a word character; the doubled backslash in the code is only C# string escaping) and non-greedy matching (the ? after * makes the * match as few characters as possible). Note that this pattern only matches tags made up of slashes and word characters, so a tag with attributes, or a stray < or >, is left unchanged.
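The named group can also be read directly from a Match object through its Groups collection. The sketch below wraps both uses in small helpers and assumes the intended replacement text is the HTML entities &lt; and &gt;. The class NamedGroupDemo and the methods EscapeTags and FirstTagName are illustrative names, not part of the original example.

```csharp
using System;
using System.Text.RegularExpressions;

public static class NamedGroupDemo
{
    // Same pattern as in the example above, written as a verbatim string.
    private const string TagPattern = @"<(?<tagname>[/\w]*?)>";

    // Rewrites simple tags such as <b> and </b> so they display as text.
    public static string EscapeTags(string html)
    {
        return Regex.Replace(html, TagPattern, "&lt;${tagname}&gt;");
    }

    // Returns the captured name of the first simple tag, or "" if there is none.
    public static string FirstTagName(string html)
    {
        Match m = Regex.Match(html, TagPattern);
        return m.Success ? m.Groups["tagname"].Value : "";
    }

    public static void Main()
    {
        string strData = "<b>Sally sells sea shells by the seashore</b>";

        // Prints: &lt;b&gt;Sally sells sea shells by the seashore&lt;/b&gt;
        Console.WriteLine(EscapeTags(strData));

        // Prints: b
        Console.WriteLine(FirstTagName(strData));
    }
}
```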