Strings and Characters, Part 1

Find solutions to some typical programming problems with C#. This chapter covers the String data type and the Char data type. The recipes show how to compare strings in various ways, encode and decode strings, break strings apart, and put them back together again. (From C# Cookbook by Stephen Teilhet and Jay Hilyard, O'Reilly Media, ISBN: 0596003390, 2004.)

July 21, 2004

Introduction

String usage abounds in just about all types of applications. The System.String type does not derive from System.ValueType and is therefore considered a reference type. The string alias is built into C# and can be used instead of the full name.

The FCL does not stop with just the string class; there is also a System.Text.StringBuilder class for performing string manipulations and the System.Text.RegularExpressions namespace for searching strings. This chapter will cover the string class and the System.Text.StringBuilder class.

The System.Text.StringBuilder class provides an easy, performance-friendly way of manipulating strings. It duplicates much of the functionality of the string class, but because a StringBuilder modifies its contents in place rather than creating a new string for every change, the duplicated functionality is more efficient than its string equivalent.

2.1 Determining the Kind of Character

Problem

You have a variable of type char and wish to determine the kind of character it contains—a letter, digit, number, punctuation character, control character, separator character, symbol, whitespace, or surrogate character. Similarly, you have a string variable and want to determine the kind of character in one or more positions within this string.

Solution

Use the built-in static methods on the System.Char structure shown here:

Char.IsControl
Char.IsDigit
Char.IsLetter
Char.IsNumber
Char.IsPunctuation
Char.IsSeparator
Char.IsSurrogate
Char.IsSymbol
Char.IsWhiteSpace

Discussion

The following examples demonstrate how to use the methods shown in the Solution section in a function to return the kind of a character. First, create an enumeration to define the various types of characters:

public enum CharKind
{
    Control,
    Digit,
    Letter,
    Number,
    Punctuation,
    Separator,
    Surrogate,
    Symbol,
    Whitespace,
    Unknown
}

Next, create a method that contains the logic to determine the type of a character and to return a CharKind enumeration value indicating that type:

public static CharKind GetCharKind(char theChar)
{
    if (Char.IsControl(theChar))
    {
        return CharKind.Control;
    }
    else if (Char.IsDigit(theChar))
    {
        return CharKind.Digit;
    }
    else if (Char.IsLetter(theChar))
    {
        return CharKind.Letter;
    }
    else if (Char.IsNumber(theChar))
    {
        return CharKind.Number;
    }
    else if (Char.IsPunctuation(theChar))
    {
        return CharKind.Punctuation;
    }
    else if (Char.IsSeparator(theChar))
    {
        return CharKind.Separator;
    }
    else if (Char.IsSurrogate(theChar))
    {
        return CharKind.Surrogate;
    }
    else if (Char.IsSymbol(theChar))
    {
        return CharKind.Symbol;
    }
    else if (Char.IsWhiteSpace(theChar))
    {
        return CharKind.Whitespace;
    }
    else
    {
        return CharKind.Unknown;
    }
}

If, however, a character in a string needs to be evaluated, use the overloaded static methods on the Char structure. The following code modifies the GetCharKind method to accept a string variable and a character position in that string. The character position determines which character in the string is evaluated:

public static CharKind GetCharKindInString(string theString, int charPosition)
{
    if (Char.IsControl(theString, charPosition))
    {
        return CharKind.Control;
    }
    else if (Char.IsDigit(theString, charPosition))
    {
        return CharKind.Digit;
    }
    else if (Char.IsLetter(theString, charPosition))
    {
        return CharKind.Letter;
    }
    else if (Char.IsNumber(theString, charPosition))
    {
        return CharKind.Number;
    }
    else if (Char.IsPunctuation(theString, charPosition))
    {
        return CharKind.Punctuation;
    }
    else if (Char.IsSeparator(theString, charPosition))
    {
        return CharKind.Separator;
    }
    else if (Char.IsSurrogate(theString, charPosition))
    {
        return CharKind.Surrogate;
    }
    else if (Char.IsSymbol(theString, charPosition))
    {
        return CharKind.Symbol;
    }
    else if (Char.IsWhiteSpace(theString, charPosition))
    {
        return CharKind.Whitespace;
    }
    else
    {
        return CharKind.Unknown;
    }
}

The GetCharKind method accepts a character as a parameter and performs a series of tests on that character using the Char type’s built-in static methods. An enumeration of all the different types of characters is defined and is returned by the GetCharKind method.

Table 2-1 describes each of the static Char methods:

Char method      Description
IsControl        A control code in the ranges \u007F, \u0000–\u001F, and \u0080–\u009F.
IsDigit          Any Unicode decimal digit, including 0–9.
IsLetter         Any alphabetic letter.
IsNumber         Any Unicode number: decimal digits plus other numeric characters such as Roman numerals and fractions. Hexadecimal letters (A–F) do not qualify.
IsPunctuation    Any punctuation character.
IsSeparator      A space separating words, a line separator, or a paragraph separator.
IsSurrogate      Any surrogate character in the range \uD800–\uDFFF.
IsSymbol         Any mathematical, currency, or other symbol character. Includes characters that modify surrounding characters.
IsWhiteSpace     Any space character and the following characters: \u0009, \u000A, \u000B, \u000C, \u000D, \u0085, \u2028, \u2029.

The following code example determines whether the fifth character (the charPosition parameter is zero-based) in the string is a digit:

if (GetCharKindInString("abcdefg", 4) == CharKind.Digit) {...}
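Because several of the Char tests overlap, the order of the if/else chain in GetCharKind decides which kind is reported. The following is a minimal sketch, not part of the original recipe, that lists every test a character passes; the class name CharKindSketch and the method MatchingTests are hypothetical, and the generic List<string> assumes .NET 2.0 or later.

```csharp
using System;
using System.Collections.Generic;

public static class CharKindSketch
{
    // Hypothetical helper: returns the name of every Char.IsXxx test that
    // theChar passes, in the same order that GetCharKind checks them.
    public static List<string> MatchingTests(char theChar)
    {
        List<string> matches = new List<string>();
        if (Char.IsControl(theChar)) matches.Add("Control");
        if (Char.IsDigit(theChar)) matches.Add("Digit");
        if (Char.IsLetter(theChar)) matches.Add("Letter");
        if (Char.IsNumber(theChar)) matches.Add("Number");
        if (Char.IsPunctuation(theChar)) matches.Add("Punctuation");
        if (Char.IsSeparator(theChar)) matches.Add("Separator");
        if (Char.IsSurrogate(theChar)) matches.Add("Surrogate");
        if (Char.IsSymbol(theChar)) matches.Add("Symbol");
        if (Char.IsWhiteSpace(theChar)) matches.Add("WhiteSpace");
        return matches;
    }
}
```

For a space character this returns Separator and WhiteSpace, and for a tab it returns Control and WhiteSpace, which is why GetCharKind reports Separator and Control for those characters rather than Whitespace. Reorder the tests in GetCharKind if a different kind should take precedence.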

See Also

See the “Char Structure” topic in the MSDN documentation. 

Buy the book!If you've enjoyed what you've seen here, or to get more information, click on the "Buy the book!" graphic. Pick up a copy today!

Visit the O'Reilly Network http://www.oreillynet.com for more online content.

2.2 Determining Whether a Character Is Within a Specified Range

Problem

You need to determine whether a character in a char data type is within a range, such as between 1 and 5 or between A and M.

Solution

Use the built-in comparison support for the char data type. The following code shows how to use the built-in comparison support:

public static bool IsInRange(char testChar, char startOfRange, char endOfRange)
{
    if (testChar >= startOfRange && testChar <= endOfRange)
    {
        // testChar is within the range
        return (true);
    }
    else
    {
        // testChar is NOT within the range
        return (false);
    }
}

There is only one problem with that code. If the testChar, startOfRange, and endOfRange characters have different cases, the result may not be what you expect. Adding the following code, which makes all characters uppercase, to the beginning of the IsInRange method solves this problem:

testChar = char.ToUpper(testChar);
startOfRange = char.ToUpper(startOfRange);
endOfRange = char.ToUpper(endOfRange);

Discussion

The IsInRange method accepts three parameters. The first is testChar, the character to test. The last two parameters are the starting and ending characters, respectively, of a range of characters. The testChar parameter must be between startOfRange and endOfRange or equal to one of these parameters for this method to return true; otherwise, false is returned.

The IsInRange method can be called in the following manner:

bool inRange1 = IsInRange('c', 'a', 'g');
bool inRange2 = IsInRange('c', 'a', 'b');
bool inRange3 = IsInRange((char)32, 'a', 'g');

The first call to this method returns true, since c is between a and g. The second call returns false, since c is not between a and b. The third call shows how an integer value that represents a character can be cast to char and passed to this method.

Note that this method tests whether testChar falls within the range startOfRange to endOfRange, inclusive. To test whether testChar lies strictly between startOfRange and endOfRange, excluding both endpoints, modify the if statement as follows:

if (testChar > startOfRange && testChar < endOfRange)
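The case handling and the inclusive test can be combined into one method. The following is a rough sketch under a few assumptions: the class name CharRangeSketch and the ignoreCase parameter are hypothetical additions, and char.ToUpperInvariant (available from .NET 2.0) stands in for the recipe's char.ToUpper so that the result does not depend on the current culture.

```csharp
using System;

public static class CharRangeSketch
{
    // Inclusive range test. When ignoreCase is true, all three characters
    // are uppercased first so that mixed-case arguments compare as expected.
    public static bool IsInRange(char testChar, char startOfRange,
                                 char endOfRange, bool ignoreCase)
    {
        if (ignoreCase)
        {
            testChar = char.ToUpperInvariant(testChar);
            startOfRange = char.ToUpperInvariant(startOfRange);
            endOfRange = char.ToUpperInvariant(endOfRange);
        }

        return testChar >= startOfRange && testChar <= endOfRange;
    }
}
```

With this sketch, IsInRange('C', 'a', 'g', true) returns true, while IsInRange('C', 'a', 'g', false) returns false because uppercase letters have lower code values than lowercase ones.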


2.3 Controlling Case Sensitivity when Comparing Two Characters

Problem

You need to compare two characters for equality, but you need the flexibility of performing a case-sensitive or case-insensitive comparison.

Solution

Use the Equals instance method on the char structure to compare the two characters:

public static bool IsCharEqual(char firstChar, char secondChar)
{
    return (IsCharEqual(firstChar, secondChar, false));
}

public static bool IsCharEqual(char firstChar, char secondChar, bool caseSensitiveCompare)
{
    if (caseSensitiveCompare)
    {
        return (firstChar.Equals(secondChar));
    }
    else
    {
        return (char.ToUpper(firstChar).Equals(char.ToUpper(secondChar)));
    }
}

The first overloaded IsCharEqual method takes only two parameters: the characters to be compared. This method then calls the second IsCharEqual method with three parameters. The third parameter on this method call defaults to false so that when this method is called, you do not have to pass in a value for the caseSensitiveCompare parameter—it will automatically default to false.

Discussion

Using the ToUpper method in conjunction with the Equals method on the char structure allows us to choose whether to take into account the case of the characters when comparing them. To perform a case-sensitive comparison of two char variables, simply use the Equals method, which performs a case-sensitive comparison. Performing a case-insensitive comparison requires that both characters be converted to their uppercase values before the Equals method is invoked (they could just as easily be converted to their lowercase equivalents, but this recipe uses uppercase). Converting both characters to uppercase removes any case difference between them, so the case-sensitive Equals method behaves as though it were a case-insensitive comparison.

You can further extend the overloaded IsCharEqual methods to handle the culture of the characters passed in to it:

public static bool IsCharEqual(char firstChar, CultureInfo firstCharCulture, char secondChar, CultureInfo secondCharCulture)
{
    return (IsCharEqual(firstChar, firstCharCulture, secondChar, secondCharCulture, false));
}

public static bool IsCharEqual(char firstChar, CultureInfo firstCharCulture, char secondChar, CultureInfo secondCharCulture, bool caseSensitiveCompare)
{
    if (caseSensitiveCompare)
    {
        return (firstChar.Equals(secondChar));
    }
    else
    {
        return (char.ToUpper(firstChar, firstCharCulture).Equals(char.ToUpper(secondChar, secondCharCulture)));
    }
}

The addition of the CultureInfo parameters to these methods allows us to pass in the culture information for the characters that we are calling ToUpper on. This information allows the ToUpper method to correctly uppercase each character based on its culture-specific details (i.e., the language, region, etc., of the character).

Note that you must include the following using directives to compile this code:

using System;
using System.Globalization;
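To see why the culture can matter, consider a simplified sketch that applies a single culture to both characters. The class name CultureCompareSketch and the method EqualIgnoringCase are hypothetical and are not part of the recipe.

```csharp
using System;
using System.Globalization;

public static class CultureCompareSketch
{
    // Hypothetical helper: case-insensitive comparison of two characters
    // using the casing rules of one culture for both of them.
    public static bool EqualIgnoringCase(char first, char second, CultureInfo culture)
    {
        return char.ToUpper(first, culture) == char.ToUpper(second, culture);
    }
}
```

Calling EqualIgnoringCase('i', 'I', CultureInfo.InvariantCulture) returns true. With new CultureInfo("tr-TR") the same call is expected to return false on systems that have Turkish culture data installed, because Turkish casing rules uppercase i to the dotted capital I (U+0130) rather than to I.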


2.4 Finding All Occurrences of a Character Within a String

Problem

You need a way of searching a string for multiple occurrences of a specific character.

Solution

Use IndexOf in a loop to determine how many occurrences of a character exist, as well as identify their location within the string:

using System;
using System.Collections;

public static int[] FindAllOccurrences(char matchChar, string source)
{
    return (FindAllOccurrences(matchChar, source, -1, false));
}

public static int[] FindAllOccurrences(char matchChar, string source, int maxMatches)
{
    return (FindAllOccurrences(matchChar, source, maxMatches, false));
}

public static int[] FindAllOccurrences(char matchChar, string source, bool caseSensitivity)
{
    return (FindAllOccurrences(matchChar, source, -1, caseSensitivity));
}

public static int[] FindAllOccurrences(char matchChar, string source, int maxMatches, bool caseSensitivity)
{
    ArrayList occurrences = new ArrayList();
    int foundPos = -1; // -1 represents not found
    int numberFound = 0;
    int startPos = 0;
    char tempMatchChar = matchChar;
    string tempSource = source;

    if (!caseSensitivity)
    {
        tempMatchChar = char.ToUpper(matchChar);
        tempSource = source.ToUpper();
    }

    do
    {
        // Search for the (possibly uppercased) copy of the character
        foundPos = tempSource.IndexOf(tempMatchChar, startPos);
        if (foundPos > -1)
        {
            startPos = foundPos + 1;
            numberFound++;

            if (maxMatches > -1 && numberFound > maxMatches)
            {
                break;
            }
            else
            {
                occurrences.Add(foundPos);
            }
        }
    } while (foundPos > -1);

    return ((int[])occurrences.ToArray(typeof(int)));
}

Discussion

The FindAllOccurrences method is overloaded to allow the last two parameters (maxMatches and caseSensitivity) to be set to a default value if the developer chooses not to pass in one or both of these parameters. The maxMatches parameter defaults to -1, indicating that all matches are to be found. The caseSensitivity parameter defaults to false to allow for a case-insensitive search.

The FindAllOccurrences method starts out by determining whether case sensitivity is turned on. If false was passed in to the caseSensitivity parameter, both matchChar and source are set to all uppercase. This prevents a case-sensitive search.

The main loop in this method is a simple do loop that terminates when foundPos returns -1, meaning that no more matchChar characters can be found in the source string. We use a do loop so that the IndexOf operation would be executed at least one time before the check in the while clause is performed to determine whether there are any more character matches to be found in the source string.

Once a match is found by the IndexOf method, the numberFound variable is incremented by one to indicate that another match was found, and startPos is moved past the previously found match to indicate where the next IndexOf operation should start. The startPos is increased to the starting position of the last match found plus one. The +1 is needed so that we do not keep matching the same character that was previously matched; without it, an infinite loop would occur as soon as at least one match was found in the source string.

Finally, a check is made to determine whether we are done searching for matchChar characters. If the maxMatches parameter is set to -1, the code keeps searching until it arrives at the end of the source string. Any other number indicates the maximum number of matchChar characters to search for. The maxMatches parameter limits the number of matches that can be made in the source string. If this check indicates that we are able to keep this match, it is stored in the occurrences ArrayList.
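On .NET 2.0 and later, the same IndexOf loop can be written with a generic List<int>, which avoids boxing each position into the ArrayList. The following is a sketch under stated assumptions rather than the book's implementation: the class name FindOccurrencesSketch is hypothetical, the last parameter is named caseSensitive, and ToUpperInvariant replaces ToUpper so that results do not vary with the current culture.

```csharp
using System;
using System.Collections.Generic;

public static class FindOccurrencesSketch
{
    // Returns the positions of matchChar in source. A maxMatches value of -1
    // means "find all matches"; any other value caps the number returned.
    public static int[] FindAllOccurrences(char matchChar, string source,
                                           int maxMatches, bool caseSensitive)
    {
        List<int> occurrences = new List<int>();

        // Work on uppercased copies when the search ignores case.
        char target = caseSensitive ? matchChar : char.ToUpperInvariant(matchChar);
        string haystack = caseSensitive ? source : source.ToUpperInvariant();

        int pos = haystack.IndexOf(target);
        while (pos > -1 && (maxMatches < 0 || occurrences.Count < maxMatches))
        {
            occurrences.Add(pos);
            // Resume just past the last match so it is not found again.
            pos = haystack.IndexOf(target, pos + 1);
        }

        return occurrences.ToArray();
    }
}
```

For example, FindAllOccurrences('r', "BlueTealRedredGreenRedYellow", -1, false) returns the positions 8, 11, 15, and 19, and passing 2 for maxMatches returns only 8 and 11.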


2.5 Finding the Location of All Occurrences of a String Within Another String

Problem

You need to search a string for every occurrence of a specific string. In addition, the case-sensitivity, or insensitivity, of the search needs to be controlled.

Solution

Using IndexOf or IndexOfAny in a loop, we can determine how many occurrences of a character or string exist as well as their locations within the string. To find each occurrence of a case-sensitive string in another string, use the following code:

using System;
using System.Collections;

public static int[] FindAll(string matchStr, string searchedStr, int startPos)
{
    int foundPos = -1; // -1 represents not found
    int count = 0;
    ArrayList foundItems = new ArrayList();

    do
    {
        foundPos = searchedStr.IndexOf(matchStr, startPos);
        if (foundPos > -1)
        {
            startPos = foundPos + 1;
            count++;
            foundItems.Add(foundPos);

            Console.WriteLine("Found item at position: " + foundPos.ToString());
        }
    } while (foundPos > -1 && startPos < searchedStr.Length);

    return ((int[])foundItems.ToArray(typeof(int)));
}

If the FindAll method is called with the following parameters:

int[] allOccurrences = FindAll("Red", "BlueTealRedredGreenRedYellow", 0);

the string "Red" is found at locations 8 and 19 in the string searchedStr. This code uses the IndexOf method inside a loop to iterate through each found matchStr string in the searchedStr string.

To find a case-sensitive character in a string, use the following code:

public static int[] FindAll(char MatchChar, string searchedStr, int startPos)
{
    int foundPos = -1; // -1 represents not found
    int count = 0;
    ArrayList foundItems = new ArrayList();

    do
    {
        foundPos = searchedStr.IndexOf(MatchChar, startPos);
        if (foundPos > -1)
        {
            startPos = foundPos + 1;
            count++;
            foundItems.Add(foundPos);

            Console.WriteLine("Found item at position: " + foundPos.ToString());
        }
    } while (foundPos > -1 && startPos < searchedStr.Length);

    return ((int[])foundItems.ToArray(typeof(int)));
}

If the FindAll method is called with the following parameters:

int[] allOccurrences = FindAll('r', "BlueTealRedredGreenRedYellow", 0);

the character 'r' is found at locations 11 and 15 in the string searchedStr. This code uses the IndexOf method inside a do loop to iterate through each MatchChar character found in the searchedStr string. Overloading the FindAll method to accept either a char or string type avoids the overhead of converting the char to a string.

To find each case-insensitive occurrence of a string in another string, use the following code:

public static int[] FindAny(string matchStr, string searchedStr, int startPos)
{
    int foundPos = -1; // -1 represents not found
    int count = 0;
    ArrayList foundItems = new ArrayList();

    // Factor out case-sensitivity
    searchedStr = searchedStr.ToUpper();
    matchStr = matchStr.ToUpper();

    do
    {
        foundPos = searchedStr.IndexOf(matchStr, startPos);
        if (foundPos > -1)
        {
            startPos = foundPos + 1;
            count++;
            foundItems.Add(foundPos);

            Console.WriteLine("Found item at position: " + foundPos.ToString());
        }
    } while (foundPos > -1 && startPos < searchedStr.Length);

    return ((int[])foundItems.ToArray(typeof(int)));
}

If the FindAny method is called with the following parameters:

int[] allOccurrences = FindAny("Red", "BlueTealRedredGreenRedYellow", 0);

the string "Red" is found at locations 8, 11, and 19 in the string searchedStr. This code uses the IndexOf method inside a loop to iterate through each found matchStr string in the searchedStr string. The search is rendered case-insensitive by using the ToUpper method on both the searchedStr and the matchStr strings.

To find any of several characters in a string, use the following code:

public static int[] FindAny(char[] MatchCharArray, string searchedStr, int startPos)
{
    int foundPos = -1; // -1 represents not found
    int count = 0;
    ArrayList foundItems = new ArrayList();

    do
    {
        foundPos = searchedStr.IndexOfAny(MatchCharArray, startPos);
        if (foundPos > -1)
        {
            startPos = foundPos + 1;
            count++;
            foundItems.Add(foundPos);

            Console.WriteLine("Found item at position: " + foundPos.ToString());
        }
    } while (foundPos > -1 && startPos < searchedStr.Length);

    return ((int[])foundItems.ToArray(typeof(int)));
}

If the FindAny method is called with the following parameters:

int[] allOccurrences = FindAny(new char[] {'R', 'r'}, "BlueTealRedredGreenRedYellow", 0);

the characters 'r' or 'R' are found at locations 8, 11, 15, and 19 in the string searchedStr. This code uses the IndexOfAny method inside a loop to iterate through each character in searchedStr that matches one of the characters in MatchCharArray. The search is rendered case-insensitive by using an array of char containing all characters, both upper- and lowercase, to be searched for.

Discussion

In the example code, the foundPos variable contains the location of the found character/string within the searchedStr string. The startPos variable contains the next position in which to start the search. The IndexOf or IndexOfAny method is used to perform the actual searching. The count variable simply counts the number of times the character/string was found in the searchedStr string.

The example uses a do loop so that the IndexOf or IndexOfAny operation is executed at least one time before the check in the while clause determines whether there are any more character/string matches to be found in the searchedStr string. This loop terminates when foundPos returns -1 (meaning that no more characters/strings can be found in the searchedStr string) or when an out-of-bounds condition exists. When foundPos equals -1, there are no more instances of the match value in the searchedStr string; therefore, we can exit the loop. If, however, the startPos overshoots the last character element of the searchedStr string, an out-of-bounds condition exists and an exception is thrown. To prevent this, always check to make sure that any positioning variables that are modified inside of the loop, such as the startPos variable, are within their intended bounds.

Once a match is found by the IndexOf or IndexOfAny method, the if statement body is executed to increment the count variable by one and to move the startPos up past the previously found match. The count variable is incremented by one to indicate that another match was found. The startPos is increased to the starting position of the last match found plus 1. Adding 1 is necessary so that we do not keep matching the same character/string that was previously matched, which would cause an infinite loop to occur in the code if at least one match was found in the searchedStr string. To see this behavior, remove the +1 from the code.

There is one potential problem with this code. Consider the case where:

searchedStr = "aaaa";
matchStr = "aa";

The code contained in this recipe would match "aa" three times.

(aa)aa
a(aa)a
aa(aa)

This situation may be fine for some applications, but not if you need it to return only the following matches:

(aa)aa
aa(aa)

To do this, change the following line in the do loop:

startPos = foundPos + 1;

to this:

startPos = foundPos + matchStr.Length;

This code moves the startPos pointer beyond the first matched string, disallowing any internal matches.
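The difference between the two step sizes can be sketched in one method that takes the choice as a parameter. The class name OverlapSketch and the allowOverlap parameter are hypothetical, and the sketch assumes .NET 2.0 or later for List<int> and for the IndexOf overloads that take a StringComparison (an ordinal search is used here in place of the recipe's culture-sensitive default).

```csharp
using System;
using System.Collections.Generic;

public static class OverlapSketch
{
    // Finds every position of matchStr in searchedStr. With allowOverlap set
    // to false, the search resumes after the whole match, not one character on.
    public static int[] FindAll(string matchStr, string searchedStr, bool allowOverlap)
    {
        List<int> found = new List<int>();
        if (string.IsNullOrEmpty(matchStr))
        {
            return found.ToArray(); // nothing sensible to search for
        }

        int step = allowOverlap ? 1 : matchStr.Length;
        int pos = searchedStr.IndexOf(matchStr, StringComparison.Ordinal);
        while (pos > -1)
        {
            found.Add(pos);
            // pos + step never exceeds searchedStr.Length, because the match
            // itself fits inside the string, so this start index is valid.
            pos = searchedStr.IndexOf(matchStr, pos + step, StringComparison.Ordinal);
        }

        return found.ToArray();
    }
}
```

Searching "aaaa" for "aa" with allowOverlap set to true yields positions 0, 1, and 2; with allowOverlap set to false it yields only 0 and 2.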

To convert this code to use a while loop rather than a do loop, the foundPos variable must be initialized to 0 and the while loop should be written as follows:

while (foundPos >= 0 && startPos < searchedStr.Length)
{
    foundPos = searchedStr.IndexOf(matchStr, startPos);
    if (foundPos > -1)
    {
        startPos = foundPos + 1;
        count++;
    }
}

See Also

See the “String.IndexOf Method” and “String.IndexOfAny Method” topics in the MSDN documentation.


2.6 The Poor Man’s Tokenizer

Problem

You need a quick method of breaking up a string into a series of discrete tokens or words.

Solution

Use the Split instance method of the string class. For example:

string equation = "1 + 2 - 4 * 5";
string[] equationTokens = equation.Split(new char[1]{' '});

foreach (string Tok in equationTokens)
    Console.WriteLine(Tok);

This code produces the following output:

1
+
2
-
4
*
5

The Split method may also be used to separate people’s first, middle, and last names. For example:

string fullName1 = "John Doe";
string fullName2 = "Doe,John";
string fullName3 = "John Q. Doe";

string[] nameTokens1 = fullName1.Split(new char[3]{' ', ',', '.'});
string[] nameTokens2 = fullName2.Split(new char[3]{' ', ',', '.'});
string[] nameTokens3 = fullName3.Split(new char[3]{' ', ',', '.'});

foreach (string tok in nameTokens1)
{
    Console.WriteLine(tok);
}
Console.WriteLine("");

foreach (string tok in nameTokens2)
{
    Console.WriteLine(tok);
}
Console.WriteLine("");

foreach (string tok in nameTokens3)
{
    Console.WriteLine(tok);
}

This code produces the following output:

John
Doe

Doe
John

John
Q

Doe

Notice the blank line in the output for fullName3: Split returns an empty string for the gap between the adjacent '.' and space delimiters. This is correct behavior; if you do not want to process the empty token, your code can skip it.
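On .NET 2.0 and later, the empty tokens can be dropped by Split itself through the StringSplitOptions.RemoveEmptyEntries option. The following is a small sketch; the class name TokenizerSketch and the method Tokenize are hypothetical wrappers around that overload.

```csharp
using System;

public static class TokenizerSketch
{
    // Splits text on any of the separator characters and discards the
    // empty strings produced by adjacent separators.
    public static string[] Tokenize(string text, params char[] separators)
    {
        return text.Split(separators, StringSplitOptions.RemoveEmptyEntries);
    }
}
```

Calling Tokenize("John Q. Doe", ' ', ',', '.') returns the three tokens John, Q, and Doe, with no empty entry between Q and Doe.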

Discussion

If you have a consistent string whose parts, or tokens, are separated by well-defined characters, the Split function can tokenize the string. Tokenizing a string consists of breaking the string down into well-defined, discrete parts, each of which is considered a token. In the two previous examples, the tokens were either parts of a mathematical equation (numbers and operators) or parts of a name (first, middle, and last).

There are several drawbacks to this approach. First, if the string of tokens is not separated by any well-defined character(s), it will be impossible to use the Split method to break up the string. For example, if the equation string looked like this:

string equation = "1+2-4*5";

we would clearly have to use a more robust method of tokenizing this string (see Recipe 8.7 for a more robust tokenizer).

A second drawback is that a string of tokenized words must be entered consistently in order to gain meaning from the tokens. For example, if we ask users to type in their names, they may enter any of the following:

John Doe

Doe John

John Q Doe

If one user enters in his name the first way and another user enters it the second way, our code will have a difficult time determining whether the first token in the string array represents the first or last name. The same problem will exist for all of the other tokens in the array. However, if all users enter their names in a consistent style, such as First Name, space, Last Name, we will have a much easier time tokenizing the name and understanding what each token represents.

See Also

See Recipe 8.7; see the “String.Split Method” topic in the MSDN documentation.


2.7 Controlling Case Sensitivity when Comparing Two Strings

Problem

You need to compare the contents of two strings for equality. In addition, the case sensitivity of the comparison needs to be controlled.

Solution

Use the Compare static method on the string class to compare the two strings. Whether the comparison is case-insensitive is determined by the third parameter of one of its overloads. For example:

string lowerCase = "abc";
string upperCase = "AbC";

int caseSensitiveResult = string.Compare(lowerCase, upperCase, false);
int caseInsensitiveResult = string.Compare(lowerCase, upperCase, true);

The caseSensitiveResult value is -1 (indicating that lowerCase is “less than” upperCase) and the caseInsensitiveResult is zero (indicating that lowerCase “equals” upperCase).

Discussion

Using the static string.Compare method allows us the freedom to choose whether to take into account the case of the strings when comparing them. This method returns an integer indicating the lexical relationship between the two strings. A zero means that the two strings are equal, a negative number means that the first string is less than the second string, and a positive number indicates that the first string is greater than the second string.

By setting the last parameter of this method (the ignoreCase parameter) to true or false, we can determine whether the Compare method takes into account the case of both strings when comparing. Setting this parameter to true forces a case-insensitive comparison and setting this parameter to false forces a case-sensitive comparison. In the case of the overloaded version of the method with no ignoreCase parameter, comparisons are always case-sensitive.
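When only equality matters, not ordering, .NET 2.0 and later also offer string.Equals with a StringComparison argument. The following sketch swaps that API in for string.Compare and uses ordinal rules, so it is an alternative under stated assumptions rather than the recipe's method; the class name StringEqualitySketch and the method AreEqual are hypothetical.

```csharp
using System;

public static class StringEqualitySketch
{
    // Equality test with switchable case sensitivity. Ordinal comparison
    // looks at character values only and ignores culture-specific rules.
    public static bool AreEqual(string first, string second, bool ignoreCase)
    {
        StringComparison mode = ignoreCase
            ? StringComparison.OrdinalIgnoreCase
            : StringComparison.Ordinal;

        return string.Equals(first, second, mode);
    }
}
```

With the strings from the Solution, AreEqual("abc", "AbC", true) returns true and AreEqual("abc", "AbC", false) returns false.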

See Also

See the “String.Compare Method” topic in the MSDN documentation.


2.8 Comparing a String to the Beginning or End of a Second String

Problem

You need to determine whether a string is at the head or tail of a second string. In addition, the case sensitivity of the search needs to be controlled.

Solution

Use the EndsWith or StartsWith instance methods on a string object. Comparisons with EndsWith and StartsWith are always case-sensitive. The following code compares the value in the string variable head to the beginning of the string test:

string head = "str";
string test = "strVarName";
bool isFound = test.StartsWith(head);

The following example compares the value in the string variable tail to the end of the string test:

string tail = "Name";
string test = "strVarName";
bool isFound = test.EndsWith(tail);

In both examples, the isFound Boolean variable is set to true, since each string is found in test.

To do a case-insensitive comparison, employ the static string.Compare method. The following two examples modify the previous two examples by performing a case-insensitive comparison. The first is equivalent to a case-insensitive StartsWith string search:

string head = "str";
string test = "strVarName";
int isFound = string.Compare(head, 0, test, 0, head.Length, true);

The second is equivalent to a case-insensitive EndsWith string search:

string tail = "Name";
string test = "strVarName";
int isFound = string.Compare(tail, 0, test, (test.Length - tail.Length), tail.Length, true);

Discussion

Use the StartsWith or EndsWith instance methods to do a case-sensitive search for a particular string at the beginning or end of a string. The equivalent case-insensitive comparison requires the use of the static Compare method in the string class. If the return value of the Compare method is zero, a match was found. Any other number means that a match was not found.
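The Compare calls can be wrapped in helpers that return a bool and guard against a head or tail that is longer than the string being tested (in the EndsWith case, such a tail would otherwise produce a negative start index). This is a sketch only: the class name AffixSketch and both method names are hypothetical, and it uses the Compare overload that takes StringComparison.OrdinalIgnoreCase (.NET 2.0 and later) instead of the ignoreCase flag shown in the Solution.

```csharp
using System;

public static class AffixSketch
{
    // Case-insensitive equivalent of test.StartsWith(head).
    public static bool StartsWithIgnoreCase(string test, string head)
    {
        if (head.Length > test.Length)
        {
            return false; // head cannot fit inside test
        }

        return string.Compare(head, 0, test, 0, head.Length,
                              StringComparison.OrdinalIgnoreCase) == 0;
    }

    // Case-insensitive equivalent of test.EndsWith(tail).
    public static bool EndsWithIgnoreCase(string test, string tail)
    {
        if (tail.Length > test.Length)
        {
            return false; // avoids a negative start index below
        }

        return string.Compare(tail, 0, test, test.Length - tail.Length,
                              tail.Length, StringComparison.OrdinalIgnoreCase) == 0;
    }
}
```

For the test string strVarName, StartsWithIgnoreCase(test, "STR") and EndsWithIgnoreCase(test, "name") both return true.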

See Also

See the “String.StartsWith Method,” “String.EndsWith Method,” and “String.Compare Method” topics in the MSDN documentation.


2.9 Inserting Text into a String

Problem

You have some text (either a char or a string value) that needs to be inserted at a specific location inside of a second string.

Solution

Using the Insert instance method of the string class, a string or char can easily be inserted into a string. For example, in the code fragment:

string sourceString = "The Inserted Text is here -><-";

sourceString = sourceString.Insert(28, "Insert-This");
Console.WriteLine(sourceString);

the string "Insert-This" is inserted between the > and < characters in sourceString. The result is:

The Inserted Text is here ->Insert-This<-

Inserting the character in insertChar into sourceString between the > and < characters is shown here:

string sourceString = "The Inserted Text is here -><-";
char insertChar = '1';

sourceString = sourceString.Insert(28, Convert.ToString(insertChar));
Console.WriteLine(sourceString);

There is no overloaded method for Insert that takes a char value, so using a string of length one is the next best solution.

Discussion

There are two ways of inserting strings into other strings, unless, of course, you are using the regular expression classes. The first involves using the Insert instance method on the string class. This method is also slower than the others since strings are immutable, and, therefore, a new string object must be created to hold the modified value. In this recipe, the reference to the old string object is then changed to point to the new string object. Note that the Insert method leaves the original string untouched and creates a new string object with the inserted characters.

To add flexibility and speed to your string insertions, use the Insert instance method on the StringBuilder class. This method is overloaded to accept all of the built-in types. In addition, the StringBuilder object optimizes insertion by modifying its own internal character buffer in place rather than creating a new copy of the text for each change.

If we use the StringBuilder class instead of the string class to insert a string, our code appears as:

StringBuilder sourceString =
    new StringBuilder("The Inserted Text is here -><-");
sourceString.Insert(28, "Insert-This");
Console.WriteLine(sourceString);

The character insertion example would be changed to the following code:

char charToInsert = '1';
StringBuilder sourceString =
    new StringBuilder("The Inserted Text is here -><-");
sourceString.Insert(28, charToInsert);
Console.WriteLine(sourceString);

Note that when using the StringBuilder class, you must also import the System.Text namespace with a using System.Text; directive (or fully qualify the class name).
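To illustrate the overloads of StringBuilder.Insert mentioned above, here is a small sketch that inserts an int, a char, and a bool without converting them to strings first. The class name BuilderInsertDemo and the method Build are hypothetical and exist only for this example.

```csharp
using System;
using System.Text;

public class BuilderInsertDemo
{
    // Hypothetical helper: inserts values of several built-in types
    // at the same position of one StringBuilder.
    public static string Build()
    {
        StringBuilder sb = new StringBuilder("-><-");
        sb.Insert(2, 7);      // int overload:  buffer is now "->7<-"
        sb.Insert(2, 'A');    // char overload: buffer is now "->A7<-"
        sb.Insert(2, true);   // bool overload: buffer is now "->TrueA7<-"
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Build());   // ->TrueA7<-
    }
}
```

Because every call inserts at index 2, each new value lands in front of the previously inserted one.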

See Also

See the “String.Insert Method” topic in the MSDN documentation.


2.10 Removing or Replacing Characters Within a String

Problem

You have some text within a string that needs to be either removed or replaced with a different character or string. Because the replacement is simple, you do not need the overhead of a regular expression.

Solution

To remove a substring from a string, use the Remove instance method on the string class. For example:

string name = "Doe, John";
name = name.Remove(3, 1);
Console.WriteLine(name);

This code creates a new string and then sets the name variable to refer to it. The string contained in name now looks like this:

Doe John

If performance is critical, and particularly if the string removal operation occurs in a loop so that the operation is performed multiple times, you can instead use the Remove method of the StringBuilder object. The following code modifies the str variable so that its value becomes 12345678:

StringBuilder str = new StringBuilder("1234abc5678", 12);
str.Remove(4, 3);
Console.WriteLine(str);

To replace a delimiting character within a string, use the following code:

string commaDelimitedString = "100,200,300,400,500";
commaDelimitedString = commaDelimitedString.Replace(',', ':');
Console.WriteLine(commaDelimitedString);

This code creates a new string and then makes the commaDelimitedString variable refer to it. The string in commaDelimitedString now looks like this:

100:200:300:400:500

To replace a place-holding string within a string, use the following code:

string theName = "Mary";
string theObject = "car";
string ID = "This <OBJECT> is the property of <NAME>.";
ID = ID.Replace("<OBJECT>", theObject);
ID = ID.Replace("<NAME>", theName);
Console.WriteLine(ID);

Each call to Replace creates a new string, and the ID variable is reassigned to refer to it. After both calls, the string in ID looks like this:

This car is the property of Mary.

As when removing a portion of a string, you may, for performance reasons, choose to use the Replace method of the StringBuilder class instead. For example:

string newName = "John Doe";

str = new StringBuilder("name = <FIRSTNAME>");
str.Replace("<FIRSTNAME>", newName);
Console.WriteLine(str.ToString());

str.Replace('=', ':');
Console.WriteLine(str.ToString());

str = new StringBuilder("name1 = <FIRSTNAME>, name2 = <FIRSTNAME>");
str.Replace("<FIRSTNAME>", newName, 7, 12);
Console.WriteLine(str.ToString());
str.Replace('=', ':', 0, 7);
Console.WriteLine(str.ToString());

This code produces the following results:

name = John Doe
name : John Doe
name1 = John Doe, name2 = <FIRSTNAME>
name1 : John Doe, name2 = <FIRSTNAME>

Note that when using the StringBuilder class, you must import the System.Text namespace with a using System.Text; directive (or fully qualify the class name).

Discussion

The string class provides two methods that allow easy removal and modification of characters in a string: the Remove instance method and the Replace instance method. The Remove method deletes a specified number of characters starting at a given location within a string. This method returns a new string object containing the modified string.

The Replace instance method that the string class provides is very useful for removing characters from a string and replacing them with a new character or string. At any point where the Replace method finds an instance of the string passed in as the first parameter, it will replace it with the string passed in as the second parameter. The Replace method is case-sensitive and returns a new string object containing the modified string. If the string being searched for cannot be found in the original string, the method returns the original string unchanged.
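The case sensitivity of Replace is easy to overlook, so a short sketch may help. The class name ReplaceCaseDemo and the helper ReplaceText are hypothetical names used only for this illustration; the only library call involved is String.Replace.

```csharp
using System;

public class ReplaceCaseDemo
{
    // Hypothetical wrapper around String.Replace, kept separate so it is easy to test.
    public static string ReplaceText(string source, string oldValue, string newValue)
    {
        return source.Replace(oldValue, newValue);
    }

    public static void Main()
    {
        string text = "Error: disk error";

        // Only the lowercase "error" matches; the capitalized "Error" is left alone.
        Console.WriteLine(ReplaceText(text, "error", "failure"));  // Error: disk failure

        // "ERROR" does not occur with this exact casing, so the text comes back unchanged.
        Console.WriteLine(ReplaceText(text, "ERROR", "failure"));  // Error: disk error
    }
}
```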

The Replace and Remove methods on a string object always create a new string object that contains the modified text. If this action hurts performance, consider using the Replace and Remove methods on the StringBuilder class.

The Remove method of the StringBuilder class is not overloaded and is straightforward to use. Simply give it a starting position and the number of characters to remove. This method returns a reference to the same instance of the StringBuilder object whose Remove method was called.

The Replace method of the StringBuilder class allows for fast character or string replacement to be performed on the original StringBuilder object. All of its overloads return a reference to the same instance of the StringBuilder object whose Replace method was called.

Note that this method is case-sensitive.
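Because both Remove and Replace return the same StringBuilder instance, the calls can be chained. The following is a minimal sketch of that idea; the class name BuilderChainingDemo and the method CleanUp are hypothetical, and the input text simply combines the values used earlier in this recipe.

```csharp
using System;
using System.Text;

public class BuilderChainingDemo
{
    // Hypothetical helper: removes three characters starting at index 4,
    // fills in the <FIRSTNAME> placeholder, then swaps '=' for ':'.
    public static string CleanUp(string raw, string firstName)
    {
        StringBuilder sb = new StringBuilder(raw);
        sb.Remove(4, 3)
          .Replace("<FIRSTNAME>", firstName)
          .Replace('=', ':');
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(CleanUp("1234abc5678, name = <FIRSTNAME>", "John Doe"));
        // prints: 12345678, name : John Doe

        // The value returned by Remove is the builder itself, not a copy.
        StringBuilder sb = new StringBuilder("1234abc5678");
        StringBuilder returned = sb.Remove(4, 3);
        Console.WriteLine(Object.ReferenceEquals(sb, returned));  // True
    }
}
```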

See Also

See the “String.Replace Method,” “String.Remove Method,” “StringBuilder.Replace Method,” and “StringBuilder.Remove Method” topics in the MSDN documentation.


2.11 Encoding Binary Data as Base64

Problem

You have a byte[], which could represent some binary information such as a bitmap. You need to encode this data into a string so that it can be sent over a binary-unfriendly transport such as email.

Solution

Using the static method Convert.ToBase64CharArray on the Convert class, a byte[] may be encoded to a char[] equivalent, and the char[] can then be converted to a string:

using System;

public static string Base64EncodeBytes(byte[] inputBytes)
{
    // Each 3-byte sequence in inputBytes is converted to a 4-character
    // sequence
    long arrLength = (long)(4.0d * inputBytes.Length / 3.0d);
    if ((arrLength % 4) != 0)
    {
        // increment the array length to the next multiple of 4
        // if it is not already divisible by 4
        arrLength += 4 - (arrLength % 4);
    }

    char[] encodedCharArray = new char[arrLength];
    Convert.ToBase64CharArray(inputBytes, 0, inputBytes.Length, encodedCharArray, 0);

    return (new string(encodedCharArray));
}

Discussion

The Convert class makes encoding between a byte[] and a char[] and/or a string a simple matter. The ToBase64CharArray method fills the specified character array with converted bytes, and also returns an integer specifying the number of elements written to the resulting char[], which, in this recipe, is discarded. As you can see, the parameters for this method are quite flexible. It provides the ability to start and stop the conversion at any point in the input byte array and to add elements starting at any position in the resulting char[].
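As an illustration of the offset parameters and the return value, the sketch below encodes only part of a byte array and uses the returned count to build the string. The class name Base64OffsetDemo and the helpers EncodedLength and EncodeRange are hypothetical; the integer formula 4 * ((n + 2) / 3) is an alternative way of computing the same buffer size as the floating-point calculation in the Solution.

```csharp
using System;

public class Base64OffsetDemo
{
    // Base64 emits 4 characters for every group of up to 3 input bytes.
    public static int EncodedLength(int byteCount)
    {
        return 4 * ((byteCount + 2) / 3);
    }

    // Hypothetical helper: encodes 'count' bytes of 'input' starting at 'offset'.
    public static string EncodeRange(byte[] input, int offset, int count)
    {
        char[] buffer = new char[EncodedLength(count)];
        // The return value is the number of characters written to buffer.
        int written = Convert.ToBase64CharArray(input, offset, count, buffer, 0);
        return new string(buffer, 0, written);
    }

    public static void Main()
    {
        byte[] data = { 0x48, 0x65, 0x6C, 0x6C, 0x6F };   // ASCII bytes of "Hello"

        Console.WriteLine(EncodeRange(data, 0, data.Length));   // SGVsbG8=
        Console.WriteLine(EncodeRange(data, 0, 3));             // SGVs
    }
}
```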

To encode a bitmap file into a string that can be sent to some destination via email, you could use the following code:

FileStream fstrm = new FileStream(@"C:\WINNT\winnt.bmp", FileMode.Open, FileAccess.Read);
BinaryReader reader = new BinaryReader(fstrm);
byte[] image = new byte[reader.BaseStream.Length];
for (int i = 0; i < reader.BaseStream.Length; i++)
{
    image[i] = reader.ReadByte();
}
reader.Close();
fstrm.Close();
string bmpAsString = Base64EncodeBytes(image);

The bmpAsString string can then be sent as the body of an email message. To decode an encoded string to a byte[], see Recipe 2.12.

See Also

See Recipe 2.12; see the “Convert.ToBase64CharArray Method” topic in the MSDN documentation.


2.12 Decoding a Base64-Encoded Binary

Problem

You have a string that contains information such as a bitmap encoded as base64. You need to decode this data (which may have been embedded in an email message) from a string into a byte[] so that you can access the original binary.

Solution

Using the static method Convert.FromBase64CharArray on the Convert class, an encoded char[] and/or string may be decoded to its equivalent byte[]:

using System;

public static byte[] Base64DecodeString(string inputStr)
{
    byte[] decodedByteArray =
        Convert.FromBase64CharArray(inputStr.ToCharArray(),
                                    0, inputStr.Length);
    return (decodedByteArray);
}

Discussion

The static FromBase64CharArray method on the Convert class makes decoding an encoded base64 string a simple matter. This method returns a byte[] that contains the decoded elements of the string.

If you receive a file via email, such as an image file (.bmp), that was previously converted to a string, you could convert it back into its original bitmap file with something like the following:

byte[] imageBytes = Base64DecodeString(bmpAsString);
fstrm = new FileStream(@"C:\winnt_copy.bmp", FileMode.CreateNew, FileAccess.Write);
BinaryWriter writer = new BinaryWriter(fstrm);
writer.Write(imageBytes);
writer.Close();
fstrm.Close();

In this code, the bmpAsString variable was obtained from the code in the Discussion section of Recipe 2.11. The imageBytes byte[] is the bmpAsString string converted back to a byte[], which can then be written back to disk.
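A quick way to gain confidence in the encode and decode steps is a round-trip check. The sketch below is not the recipe's method: it swaps in Convert.ToBase64String and Convert.FromBase64String, which work directly with strings, in place of the char[]-based calls used in Recipes 2.11 and 2.12. The class name Base64RoundTripDemo and the method RoundTrips are hypothetical.

```csharp
using System;

public class Base64RoundTripDemo
{
    // Hypothetical helper: encodes the bytes, decodes the text again,
    // and reports whether the result matches the input byte for byte.
    public static bool RoundTrips(byte[] original)
    {
        string encoded = Convert.ToBase64String(original);
        byte[] decoded = Convert.FromBase64String(encoded);

        if (decoded.Length != original.Length)
        {
            return false;
        }
        for (int i = 0; i < original.Length; i++)
        {
            if (decoded[i] != original[i])
            {
                return false;
            }
        }
        return true;
    }

    public static void Main()
    {
        // Arbitrary binary data, including values outside the printable range.
        byte[] sample = { 0, 1, 2, 253, 254, 255, 42 };
        Console.WriteLine(RoundTrips(sample));   // True
    }
}
```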

To encode a byte[] to a string, see Recipe 2.11.

See Also

See Recipe 2.11; see the “Convert.FromBase64CharArray Method” topic in the MSDN documentation.
