Regular Expressions in VBScript
By: Nilpo
    2009-02-17


    Regular Expressions in VBScript - Constructing Patterns


    (Page 3 of 4 )

    At this point regular expressions don’t seem any better than VBScript’s own string functions.  In fact, they only seem to take more code!  But that’s because you haven’t seen the magic of patterns yet.  What if you wanted to know how many words were in a sentence?

    strTest = "This is my test string."

    Set objRegExp = New RegExp
    objRegExp.Global = True         ' find every match, not just the first
    objRegExp.IgnoreCase = True
    objRegExp.Pattern = "\w+"       ' one or more word characters, i.e. one word

    Set colMatches = objRegExp.Execute(strTest)
    WScript.Echo colMatches.Count   ' 5

    Or which words begin with the letter “t”?

    strTest = "This is my test string."

    Set objRegExp = New RegExp
    objRegExp.Global = True
    objRegExp.IgnoreCase = True              ' so "This" matches as well as "test"
    objRegExp.Pattern = "\bt[a-z]+\b"        ' a whole word that starts with t

    Set colMatches = objRegExp.Execute(strTest)
    For Each objMatch In colMatches
       WScript.Echo objMatch.Value           ' This, then test
    Next

    You can quickly see how patterns can make all of the difference.  But what are patterns and how do you make them?  In its simplest form, a pattern is a string of literal characters to be matched.  On top of that, a set of reserved and escaped characters can be used to control match positions, occurrences, wild cards, and more.  We’ll begin with match positions as listed in Table 1 below.

    Table 1: Position Matching

    Symbol   Description
    ^        Matches the beginning of a string.
             “^This” matches the word “This” if it appears at the beginning of a string.
    $        Matches the end of a string.
             “\.$” matches a period at the end of a string.
    \b       Matches a word boundary.
             “\bt” matches the letter t at the beginning of a word.
    \B       Matches a non-word boundary.
             “\Bx\B” matches any letter x that does not appear at the beginning or end of a word.
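As a rough illustration of the position symbols, the sketch below runs a few of them against the same test string used earlier. The variable names (blnStartsWithThis and so on) are made up for this example, and IgnoreCase is left off so that only a lowercase t counts.

```vbscript
strTest = "This is my test string."

Set objRegExp = New RegExp
objRegExp.Global = True
objRegExp.IgnoreCase = False

' ^ anchors the match to the beginning of the string.
objRegExp.Pattern = "^This"
blnStartsWithThis = objRegExp.Test(strTest)            ' True

' $ anchors the match to the end; \. is a literal period.
objRegExp.Pattern = "\.$"
blnEndsWithPeriod = objRegExp.Test(strTest)            ' True

' \bt finds a lowercase t at the start of a word ("test").
objRegExp.Pattern = "\bt"
intWordStartT = objRegExp.Execute(strTest).Count       ' 1

' \Bt\B finds a lowercase t strictly inside a word ("string").
objRegExp.Pattern = "\Bt\B"
intInnerT = objRegExp.Execute(strTest).Count           ' 1

WScript.Echo blnStartsWithThis & " " & blnEndsWithPeriod & " " & intWordStartT & " " & intInnerT
```

Note that the t characters at the start and end of “test” do not satisfy “\Bt\B”, because each one touches a word boundary.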

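    After positioning, you’ll want to match literal characters.  Alphanumeric characters are treated as literals.  However, some characters have special meanings in a pattern; to match one of those literally, you must escape it with a back-slash.  A back-slash followed by certain letters or digits also forms an escape sequence for characters such as a new line or a tab.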

    Table 2: Matching Literals

    Symbol         Description
    Alphanumeric   Matches any alphanumeric character literally.
    \n             Matches a new line
    \f             Matches a form feed
    \r             Matches a carriage return
    \t             Matches a horizontal tab
    \v             Matches a vertical tab
    \?             Matches a ?
    \*             Matches a *
    \+             Matches a +
    \.             Matches a .
    \|             Matches a |
    \{             Matches a {
    \}             Matches a }
    \\             Matches a \
    \[             Matches a [
    \]             Matches a ]
    \(             Matches a (
    \)             Matches a )
    \xxx           Matches the ASCII character expressed by the Octal number.
                   “\50” matches “(” or Chr(40)
    \xdd           Matches the ASCII character expressed by the Hex number.
                   “\x28” matches “(” or Chr(40)
    \uxxxx         Matches the Unicode character expressed by the four-digit Hex number.
                   “\u00A3” matches “£”
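To see why escaping matters, here is a small sketch that compares an unescaped period with an escaped one. The test string and variable names are invented for illustration.

```vbscript
strTest = "Total (approx.): 3.5 or 345?"

Set objRegExp = New RegExp
objRegExp.Global = True

' Unescaped, "." is a wild card, so "3.5" matches both "3.5" and "345".
objRegExp.Pattern = "3.5"
intUnescaped = objRegExp.Execute(strTest).Count    ' 2

' Escaped, "\." matches only a literal period.
objRegExp.Pattern = "3\.5"
intEscaped = objRegExp.Execute(strTest).Count      ' 1

' "\x28" is the hex escape for "(", the same character that "\(" matches.
objRegExp.Pattern = "\x28"
blnHasParen = objRegExp.Test(strTest)              ' True

WScript.Echo intUnescaped & " " & intEscaped & " " & blnHasParen
```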

    Once you have the ability to match literal characters, you’ll probably find the need to expand a bit.  You may want to match any one character in a range of characters, or perhaps everything except a specified character.  This is done with character classes.

    Table 3: Matching Character Classes

    Symbol   Description
    [xyz]    Matches any character in the character set.  Hyphens denote ranges.
             “[a-z]” matches any character a through z
    [^xyz]   Matches any character not in the character set.
             “[^0-9]” matches any non-digit character
    .        Matches any character except \n.
    \w       Matches any word character.  Equivalent to [a-zA-Z_0-9]
    \W       Matches any non-word character.  Equivalent to [^a-zA-Z_0-9]
    \d       Matches any digit.  Equivalent to [0-9]
    \D       Matches any non-digit character.  Equivalent to [^0-9]
    \s       Matches any whitespace character.  Equivalent to [ \t\r\n\v\f]
    \S       Matches any non-whitespace character.  Equivalent to [^ \t\r\n\v\f]
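Character classes pair naturally with the Replace method. The following is only a sketch with an invented test string: it counts digits with “\d”, then strips everything that is not a digit with “[^0-9]”, then removes whitespace with “\s”.

```vbscript
strTest = "Room 42, Floor 7"

Set objRegExp = New RegExp
objRegExp.Global = True

' \d matches one digit at a time: 4, 2 and 7.
objRegExp.Pattern = "\d"
intDigits = objRegExp.Execute(strTest).Count          ' 3

' [^0-9] matches every non-digit; replacing them with "" leaves only the digits.
objRegExp.Pattern = "[^0-9]"
strDigitsOnly = objRegExp.Replace(strTest, "")        ' "427"

' \s matches each whitespace character.
objRegExp.Pattern = "\s"
strNoSpaces = objRegExp.Replace(strTest, "")          ' "Room42,Floor7"

WScript.Echo intDigits & " " & strDigitsOnly & " " & strNoSpaces
```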

    At this point, your patterns will still be matching one character at a time.  To unleash the power of regular expressions, you need to match repeating characters.

    Table 4: Matching Repetition

    Symbol   Description
    {x}      Matches x occurrences.
             “\d{5}” matches 5 digits.
    {x,}     Matches x or more occurrences.
             “\d{2,}” matches 2 or more consecutive digits.
    {x,y}    Matches x to y occurrences.
             “\d{2,3}” matches no less than two digits and no more than three.
    ?        Matches 0 or 1 occurrence.  Equivalent to {0,1}.
             “\d?” matches 0 or 1 digit.
    *        Matches 0 or more occurrences.  Equivalent to {0,}.
             “\d*” matches 0 or more digits.
    +        Matches 1 or more occurrences.  Equivalent to {1,}.
             “\d+” matches 1 or more digits.
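The repetition symbols can be tried out with a sketch like the one below. The test string is invented, and the “\b” anchors are an addition for this example: without them, “\d{5}” or “\d{2,3}” would also match part of a longer run of digits.

```vbscript
strTest = "Codes: 7, 42, 123, 90210"

Set objRegExp = New RegExp
objRegExp.Global = True

' Exactly five digits standing alone as a word.
objRegExp.Pattern = "\b\d{5}\b"
Set colMatches = objRegExp.Execute(strTest)
strFiveDigits = colMatches.Item(0).Value                ' "90210"

' Two or three digits standing alone: 42 and 123.
objRegExp.Pattern = "\b\d{2,3}\b"
intTwoOrThree = objRegExp.Execute(strTest).Count        ' 2

' One or more digits: each run of digits is a single match.
objRegExp.Pattern = "\d+"
intRuns = objRegExp.Execute(strTest).Count              ' 4

WScript.Echo strFiveDigits & " " & intTwoOrThree & " " & intRuns
```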

    Finally, grouping and alternation offer the ability to make extremely complex regular expressions.  Grouping allows you to match clauses.  Alternation allows you to add more than one clause and match any one of them.

    Table 5: Grouping and Alternation

    Symbol   Description
    ()       Grouping creates a clause.  Clauses may be nested.
             “(ab)?(c)” matches “abc” or “c”.
    ()|()    Alternation groups clauses into one expression and then matches any one of the clauses.
             “(ab)|(cd)|(ef)” matches “ab”, “cd”, or “ef”.
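Here is a minimal sketch of the two patterns from Table 5. It also reads the text captured by each group through the SubMatches collection of a match; the input strings and variable names are invented for the example.

```vbscript
Set objRegExp = New RegExp
objRegExp.Global = True

' Grouping: (ab) is optional, (c) is required.
objRegExp.Pattern = "(ab)?(c)"
Set colMatches = objRegExp.Execute("abc")
strWhole  = colMatches.Item(0).Value            ' "abc"
strGroup1 = colMatches.Item(0).SubMatches(0)    ' "ab"
strGroup2 = colMatches.Item(0).SubMatches(1)    ' "c"

' Alternation: any one of the three clauses may match.
objRegExp.Pattern = "(ab)|(cd)|(ef)"
intAlternatives = objRegExp.Execute("ab cd ef gh").Count   ' 3

WScript.Echo strWhole & " " & strGroup1 & " " & strGroup2 & " " & intAlternatives
```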

    Regular expressions also allow a feature called back referencing.  Back referencing allows you to reuse the text matched by an earlier group later in the same expression.  This is done by providing a back-slash followed by a digit, where the digit is the number of the group.  For example, the expression “(\w+)\s+\1” matches any word that occurs twice in a row: whatever the first group matched must appear again after the whitespace.
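A back reference can be sketched as follows. The test string is invented, and the “\b” anchors are an assumption added for this example: without them, the “is” at the end of “This” followed by “ is” would also count as a doubled word. In the replacement text, “$1” stands for whatever the first group matched.

```vbscript
strTest = "This is is my test test string."

Set objRegExp = New RegExp
objRegExp.Global = True
objRegExp.IgnoreCase = True

' A whole word, some whitespace, then the same word again.
objRegExp.Pattern = "\b(\w+)\s+\1\b"

Set colMatches = objRegExp.Execute(strTest)
intDoubled = colMatches.Count                    ' 2 ("is is" and "test test")

' Collapse each doubled word to a single copy.
strFixed = objRegExp.Replace(strTest, "$1")      ' "This is my test string."

WScript.Echo intDoubled & ": " & strFixed
```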
