Extracting Metadata - Extracting Metadata from Existing Applications and Source Code
An existing application can be a metadata source. Wow, that’s a weird concept. But it can serve as one in three ways: through its source code, through a Component Object Model (COM) type library, or through reflection on a .NET application. It turns out that source code is an extremely difficult metadata source to use, and I suggest you avoid it unless there’s compelling information to extract from it. Reflection, on the other hand, is an easy metadata source to use.
TIP: Use existing applications as a metadata source when interfacing with them makes up a significant part of your application.
Using Reflection
Reflection is a set of tools in .NET that allows you to query the assembly header (often called metadata, but I think that’s confusing in this context) and retrieve information about the types within the assembly. Type is the .NET word for classes, enums, structures, and so on. In addition to basic type information, the header describes the methods, properties, constants, fields, and nested types within the class or structure. (See Footnote 7.) Retrieving this information allows you to build a metadata file describing the type, which can be useful in generating code that accesses the types.
(Footnote 7. Fields are variables declared at the top of your class outside the scope of any method or property.)
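The idea of querying a compiled type for its members and writing the result out as metadata can be sketched in a few lines. The sketch below is an analogy only: it uses Python's introspection rather than .NET reflection, and the Customer class, the describe_type function, and the XML element names are all hypothetical choices made for illustration, not a format taken from the book.

```python
import inspect
import xml.etree.ElementTree as ET


class Customer:
    """Hypothetical sample type used only for this illustration."""

    MAX_NAME_LENGTH = 50  # class-level constant

    def __init__(self, name):
        self.name = name

    @property
    def display_name(self):
        return self.name.title()

    def rename(self, new_name):
        self.name = new_name


def describe_type(cls):
    """Build an XML element describing the public members declared on cls."""
    root = ET.Element("Type", Name=cls.__name__)
    for member_name, member in sorted(vars(cls).items()):
        if member_name.startswith("_"):
            continue  # skip private and special members
        if isinstance(member, property):
            ET.SubElement(root, "Property", Name=member_name)
        elif inspect.isfunction(member):
            method = ET.SubElement(root, "Method", Name=member_name)
            for param in inspect.signature(member).parameters:
                if param != "self":
                    ET.SubElement(method, "Parameter", Name=param)
        else:
            # Anything else declared at class level is treated as a constant
            ET.SubElement(root, "Constant", Name=member_name)
    return root


metadata = describe_type(Customer)
print(ET.tostring(metadata, encoding="unicode"))
```

The resulting XML lists one constant, one property, and one method with its parameter. A real .NET version would walk the types in an assembly instead, but the shape of the job is the same: enumerate members, classify them, and record them in a structured file.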
For example, knowing the controls on a form allows you to automate binding or scatter/gather processing, particularly if you use careful naming for the controls. Careful naming is a specific technique discussed in the “Introducing Careful Naming” section. Careful naming allows names themselves to carry usable information. Attributes can also help you sort out what each method and property actually does. You can use reflection to generate an XML representation of any .NET EXE or DLL, whether or not you have the source code. This allows you to automate the interface to a component even if you don’t have access to its source code.
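To make the point about names carrying usable information concrete, here is a minimal sketch of deriving bindings from control names. The convention it assumes (a three-letter lowercase prefix for the control type followed by the data field name) is a hypothetical stand-in, not the convention from the “Introducing Careful Naming” section, and the prefix table and function names are invented for illustration. It is written in Python rather than a .NET language.

```python
import re

# Hypothetical prefix table; this naming convention is an assumption
CONTROL_PREFIXES = {
    "txt": "TextBox",
    "chk": "CheckBox",
    "cbo": "ComboBox",
}

# Three lowercase letters, then a field name starting with a capital
NAME_PATTERN = re.compile(r"^([a-z]{3})([A-Z]\w*)$")


def parse_control_name(control_name):
    """Return (control_type, field_name), or None if the name carries no binding info."""
    match = NAME_PATTERN.match(control_name)
    if match is None:
        return None
    prefix, field_name = match.groups()
    control_type = CONTROL_PREFIXES.get(prefix)
    if control_type is None:
        return None  # unknown prefix, so not a data-bound control
    return control_type, field_name


def build_bindings(control_names):
    """Map each recognizable control name to the data field it should bind to."""
    bindings = {}
    for name in control_names:
        parsed = parse_control_name(name)
        if parsed is not None:
            bindings[name] = parsed[1]
    return bindings


controls = ["txtLastName", "chkIsActive", "btnSave", "Label1"]
bindings = build_bindings(controls)
print(bindings)
```

In this sample, only txtLastName and chkIsActive yield bindings; btnSave and Label1 are skipped because their names do not match a data-bound prefix. A generator could use a mapping like this to emit the scatter/gather code for a form.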
TIP: To use reflection, you’ll have to compile your application. Approaches that result in half-built classes you’re planning to finish out with code generation often don’t compile and thus aren’t usable with reflection.
Using TypeLibs
You may want to extract metadata from an existing COM application that was written in Visual Basic 6, C++, or another language. Extracting the defined interface into metadata allows you to generate code that interacts with these applications. You could also use code generation to create wrapper objects on either the COM or .NET side to reduce the chattiness of interop conversations.
NOTE: Chapter 4 explains why it’s hard to protect handcrafted code in Visual Basic (VB) 6 and thus why it’s a poor language for application-wide code generation. However, if you want a static wrapper class without handcrafted code, generating VB 6 code works just fine.
You can use TlbInf32.dll (which you can download from Microsoft) to provide information about most COM type libraries that’s similar to the information provided by .NET’s reflection features. If you search MSDN for this file, you’ll find how to download it, as well as several articles about using it.
Using Source Code
It’d be great to get metadata directly from your source code. This would allow you to extract comments, track internal conditions, work with the internals of files in other languages such as Visual Basic 6, and incorporate regions. Unfortunately, it’s a really hard job. The fundamental reason it’s hard is that source code is poorly structured. Taking poorly structured information and placing it into consistent, highly structured XML, while accounting for all possible variations, makes parsing very complex. To do it right, you have to incorporate at least variable file openings, nested types, complex class and method headers (XML comments), multiple namespaces and classes, and nested regions; and you’ll want to handle attributes at every level. Whew!
NOTE: One reason you might want source code parsing is to capture header comments in VB .NET, similar to C# XML comments. Because XML comments are planned for the Whidbey version of Visual Basic .NET, you might just want to be patient. (See Footnote 8.) If you want XML comments in VB today in version 1.0 or 1.1, search Google for VB .NET XML comments. At the time of this writing, commercial products from at least two vendors (Fesersoft’s VB.NET XML Comments Creator and VBXC—VB.NET XML), an entry on GotDotNet (http://www.gotdotnet.com), and an open-source project at SourceForge (http://www.sourceforge.net) are available.
(Footnote 8. Whidbey is the code name for the next version of the Visual Studio family after the Visual Studio 2003 version.)
If you have other reasons you want to parse .NET source code into metadata, you can try the automation model of Visual Studio. The EnvDTE namespace contains the automation model and parses out the source code structure, which is the hard part. It isn’t going to hand you the comments and it requires the file be open in the editor, but it may make some jobs easier.
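The general approach of letting an existing parser recover the structure, rather than writing your own, can be sketched as follows. This is an analogy in Python, not EnvDTE: it uses Python's standard ast module on a small Python source string, and the sample source, function names, and XML element names are hypothetical. Like the automation model described above, this parser hands back classes, methods, and nested types but drops ordinary comments.

```python
import ast
import xml.etree.ElementTree as ET

# Hypothetical sample source, used only for this illustration
SOURCE = '''
class Invoice:
    # This ordinary comment is discarded by the parser.
    def total(self, tax_rate):
        return 0

    class LineItem:
        def extend(self):
            return 0
'''


def describe_class(node):
    """Convert a parsed class definition into an XML element."""
    element = ET.Element("Class", Name=node.name)
    for child in node.body:
        if isinstance(child, ast.FunctionDef):
            ET.SubElement(element, "Method", Name=child.name)
        elif isinstance(child, ast.ClassDef):
            element.append(describe_class(child))  # nested types recurse
    return element


def describe_source(source):
    """Parse source text and describe its top-level classes as XML."""
    root = ET.Element("Source")
    for node in ast.parse(source).body:
        if isinstance(node, ast.ClassDef):
            root.append(describe_class(node))
    return root


structure = describe_source(SOURCE)
print(ET.tostring(structure, encoding="unicode"))
```

The nested LineItem class appears inside the Invoice element, and the comment text does not appear anywhere in the output, which mirrors the trade-off described above: the structure comes cheaply, but anything the parser throws away has to be recovered some other way.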
This is from Code Generation in Microsoft .NET, by Kathleen Dollard (Apress, ISBN 1590591372).