Preparing an MCMS Website for Search with SharePoint

Last week, we began examining SharePoint as a way to enable search capabilities on a web site that uses Microsoft Content Management Server. This week, we roll up our sleeves and get our site ready for content indexing. This article is excerpted from chapter five of the book Advanced Microsoft Content Management Server Development, written by Lim Mei Ying et al. (PACKT, 2005; ISBN: 1904811531).

October 12, 2006

Enable Guest Access for Tropical Green

Because our site will be publicly available, visitors must not be required to log in, so we need to enable guest access for the Tropical Green site. Full instructions are given in the first book, Building Websites with Microsoft Content Management Server (Packt Publishing, January 2005, ISBN 1-904811-16-7); see the section Welcoming Guests to the Site in Chapter 18.

If you have already configured your site for guest access, you can skip this step.

In order to allow guests into our site, we need to:

  1. Create a new MCMS Guest Account in the domain or as a local user on the server.
  2. Using the SCA, configure MCMS to allow guests and to use the account created in step 1 as the Guest Login Account.
  3. Now that MCMS is configured to allow guests into the site, add the account created in step 1 to a subscribers' rights group and grant the rights group access to all channels, resource galleries, and template galleries that currently exist in the site, except for the Members channel located in the Gardens channel.

We've chosen to enable guest access to the site for simplicity. However, it is possible for SharePoint to index our site using Forms or Windows Authentication. For Forms Authentication, we would need to create a special home page that automatically logs the user in with a predefined account to gain access to the site so that SharePoint could start its crawl. We would have to grant the SharePoint crawler account permission to the appropriate subscriber rights group.

If you choose to create an alternate home page that automatically logs SharePoint into your site, keep in mind that any user could use this page to gain access to your site. Take special care if you use this method, for example by adding an IP restriction to the page so that only the SharePoint server can access it.
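
The IP restriction mentioned above would normally be configured on the web server itself. Purely to illustrate the check, here is a minimal Python sketch of an allow-list test. The function name, the `ALLOWED_CRAWLERS` tuple, and the address in it are hypothetical and are not part of MCMS or SharePoint.

```python
import ipaddress

# Hypothetical allow-list: replace with the address (or subnet) of your
# SharePoint server. 192.0.2.10 is a documentation-only example address.
ALLOWED_CRAWLERS = ("192.0.2.10",)


def is_crawler_allowed(remote_addr, allowed=ALLOWED_CRAWLERS):
    """Return True only if remote_addr falls inside one of the allowed entries."""
    try:
        client = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Malformed addresses are rejected rather than trusted.
        return False
    # ip_network() accepts a single address ("192.0.2.10") or a CIDR
    # range ("10.0.0.0/8"), so both styles can appear in the allow-list.
    return any(client in ipaddress.ip_network(entry) for entry in allowed)
```

An auto-login page would call a check like this before signing the crawler in, and refuse the request for every other caller.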

Output Caching and Last-Modified Dates of Postings

ASP.NET is not particularly sophisticated when it comes to conditional requests: it simply returns an HTTP status code of 200 (OK) for every request and does not send a Last-Modified HTTP header. When SharePoint Portal Server performs an incremental index crawl, it sends an HTTP GET request for every page it finds on the site. If a page has previously been indexed and returned a Last-Modified header, SharePoint Portal Server sends a conditional HTTP GET request that includes an If-Modified-Since HTTP header carrying the date previously returned in the Last-Modified header. If the response is HTTP status code 304 (Not Modified), SharePoint will not index the page again. However, because ASP.NET always returns status code 200, the site is never crawled incrementally by SPS and effectively undergoes a full index every time the gatherer runs. As MCMS template files are actually a special kind of ASP.NET Web Form, this also affects postings based on these template files. Refer to Chapter 4, Preparing Postings for Search Indexing, for instructions on how to address this problem.
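
To make the exchange concrete, here is a minimal Python sketch of the decision a server would make for a conditional GET. It is not the Chapter 4 solution and does not use any MCMS or ASP.NET API; the function name `conditional_get` is an assumption made for this illustration.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def conditional_get(last_modified, if_modified_since=None):
    """Return (status, headers) for a page last changed at last_modified.

    last_modified is a timezone-aware datetime; if_modified_since is the raw
    header value sent by the crawler, or None on a first visit.
    """
    # HTTP dates carry whole seconds only, so drop microseconds before comparing.
    last_modified = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall through to a full response
        if since is not None and since.tzinfo is None:
            since = since.replace(tzinfo=timezone.utc)
        if since is not None and last_modified <= since:
            return 304, headers  # unchanged: the crawler can skip re-indexing

    return 200, headers  # new or changed: send the full page
```

On the first crawl there is no If-Modified-Since header, so the page is returned with status 200 and a Last-Modified value. On the next incremental crawl the crawler echoes that value back and receives 304 unless the posting has changed in the meantime.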

One thing to consider is channel rendering scripts and postings that contain dynamic lists of links to other postings. Although the scripts and postings themselves may not have changed since the last index, the content they generate can change from one request to the next. If you want this generated content to be searchable, make sure that postings containing such controls are always indexed by not returning a Last-Modified HTTP header for them.
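
The rule in the previous paragraph can be sketched as a small Python helper. The name `caching_headers` and the `has_dynamic_links` flag are hypothetical; how a template would actually know that it renders dynamic links is left to the site's own logic.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def caching_headers(last_modified, has_dynamic_links):
    """Build the response headers that influence incremental crawling.

    Pages whose output depends on other postings get no Last-Modified header,
    so the crawler has nothing to send back and always re-indexes them.
    """
    if has_dynamic_links:
        return {}
    stamp = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    return {"Last-Modified": format_datetime(stamp, usegmt=True)}
```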


The Connector SearchMetaTagGenerator Control

The last modification we need to make is to add a control that ships with MCMS Connector for SharePoint Technologies. The SearchMetaTagGenerator outputs standard and/or custom page properties as META tags. We can also use it to control which properties are output and even add our own custom properties. Adding the SearchMetaTagGenerator control to your templates is very easy. Let's add it to our Plant.aspx template:

  1. Open the Plant.aspx file in Design view.
  2. In the Toolbox, select the Content Management Server tab, and drag the SearchMetaTagGenerator to the top of our template.

    If you don't see the SearchMetaTagGenerator in the Toolbox and you've installed the MCMS Connector for SharePoint Technologies, right-click on the Toolbox and select Add/Remove Items. In the Customize ToolBox dialog, click Browse, navigate to the Microsoft Content Management Server\Server\bin\ directory, and select the Microsoft.ContentManagement.SharePoint.WebControls.dll assembly. Finally, click OK in the Customize ToolBox dialog. You should now see the additional controls in your Toolbox.

    If you haven't installed the MCMS Connector for SharePoint Technologies, refer to Appendix B, MCMS Connector for SharePoint Technologies, for download and installation information.
  3. Click the SearchMetaTagGenerator control we just added, and in the property window, select one of the following PropertyTypes:

    • CustomProperties: Generates META tags for custom page properties.
    • StandardProperties: Generates META tags for standard page properties (such as DisplayName, DisplayPath).
    • CustomAndStandardProperties: Generates META tags for both custom and standard page properties (default).
    • PropertiesFromXMLFile: Generates META tags for the properties specified in the SearchPropertyCollection.xml file. More on this in just a moment. 
  4. For now, let's choose the CustomAndStandardProperties property type.

     
  5. Because the Visual Studio .NET designer won't allow us to drop controls into the <head></head> portion of the page, we need to move the code declaration of the SearchMetaTagGenerator control from the body of the page to the heading. Switch to HTML view, find the SearchMetaTagGenerator control we just added, and move it up between the <head> and </head> tags.
  6. As with any changes, we should now rebuild the Tropical Green project.
  7. Open a browser, and navigate through the site to a plant posting in the plant catalog section of the site. Take a moment to view the source of the posting you navigate to. Notice all the extra META tags that have been added. Here's an example:

Notice the FIRSTSAVEDBY property listed at the top of the META tags. This is a custom property that has been added to the posting. It is added to the META tags because we selected the CustomAndStandardProperties property type in the SearchMetaTagGenerator control. The other META tags are the standard properties generated by the SearchMetaTagGenerator control.

When rendered, any posting implemented with the Plant template will contain META tags in the <HEAD> portion of the page for each of the page's custom properties and standard properties.
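
Instead of reading the page source by eye, you could list the generated META tags with a short script. The following is a minimal Python sketch using only the standard library. The class and function names are assumptions for this illustration, and any HTML you feed it in a quick test is your own sample, not actual output of the control.

```python
from html.parser import HTMLParser


class MetaTagCollector(HTMLParser):
    """Collect name/content pairs from META tags in an HTML document."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META NAME=...>
        # and <meta name=...> are handled alike.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name:
            # Normalize property names to upper case for easy lookup.
            self.properties[name.upper()] = attributes.get("content") or ""


def extract_meta_properties(html):
    """Return a dict mapping upper-cased META names to their content values."""
    collector = MetaTagCollector()
    collector.feed(html)
    collector.close()
    return collector.properties
```

Passing the saved source of a plant posting to `extract_meta_properties` returns a dictionary in which you can check that a custom property such as FIRSTSAVEDBY is present.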

One of the items available to us in the PropertyType field is PropertiesFromXMLFile. This option allows us to specify exactly which properties will be exported as META tags using an XML file located at Microsoft Content Management Server\Server\IIS_CMS\WssIntegration\SearchPropertyCollection.xml.

Once you have specified which properties you want to use, including custom properties you've added, you need to tell SharePoint to index these properties in the crawl. The console application SearchPropertiesSetup.exe included with MCMS Connector will tell SharePoint about the updated XML file. Run it using the following syntax:

SearchPropertiesSetup.exe -file "<path to file>\SearchPropertyCollection.xml"

The SearchPropertiesSetup.exe utility can be found in the following location:
<install drive>:\Program Files\MCMS 2002 Connector for SharePoint Technologies\WSS\bin\.

Go ahead and execute the SearchPropertiesSetup.exe utility as above because our custom search solution will use one of the META tags it generates.

If you change the SearchPropertyCollection.xml file, you will need to re-execute the SearchPropertiesSetup.exe utility.

The MCMS Connector for SharePoint Technologies includes a help file with instructions on how to modify the XML file. Be aware that a Microsoft Support Knowledge Base article exists addressing an error in the help file instructions. The MSKB article A problem occurs when you add the SearchMetaTagGenerator control to a template in Content Management Server 2002 Connector for SharePoint Technologies (#872932) contains corrected instructions.

Our Tropical Green site is now configured to allow guests to visit the site, our templates have been modified to be more SPS search friendly, and we have included additional metadata in the <HEAD> section of all our rendered postings. Let's proceed to create a content source in SharePoint to index our site.

Configuring SharePoint Portal Server Search

With our MCMS site ready for indexing, we now turn to SPS. First, we will configure SharePoint to index our Tropical Green site. After creating the index, we'll create a source group that will contain the content source. Source groups are used to group content sources together in a logical collection. In our case, we'll have a single content source in our source group. The source group is what we'll reference when we create our search logic in the Tropical Green site.

The next few steps assume you've created a portal in SPS. Refer to Appendix A, Setting up MCMS and SPS on the Same Virtual Server, for instructions on how to create a portal.

While Appendix A details how to configure a virtual server to host an MCMS site and SharePoint portal at the same time, we do not want to do that for this chapter. We need two virtual servers, one for the www.tropicalgreen.net MCMS site and the other for the SharePoint portal.tropicalgreen.net site. Appendix A details how to create a new virtual server and a new SharePoint Portal Server portal.

Creating a New Content Source

The first step in configuring SPS search is to create a content source. One way to accomplish this is to use the SearchSetup.exe command-line tool included with the MCMS Connector. This utility can be found in the MCMS 2002 Connector for SharePoint Technologies\WSS\bin folder. The SearchSetup.exe utility creates the necessary content sources in SharePoint, along with the site rules that include or exclude the appropriate content, covering the root channel and all top-level channels in your site hierarchy. For more information on the SearchSetup.exe utility, refer to the help included with the MCMS Connector.

To use the MCMS Connector search controls SearchInputControl and SearchResultControl, you need to create your content source and source group in your SharePoint portal with the SearchSetup.exe utility, because these controls are hard-coded to look for a specific SharePoint source group named "CMSChannels". To complete the two search examples in this chapter, create two sets of content sources by following the steps in this section: one using the SearchSetup.exe utility, and one created manually.

Creating a Content Source with the MCMS Connector Utility

Let's use the MCMS Connector SearchSetup.exe command-line utility to create a new content source and source group:

  1. Open a command prompt and change the current directory to the following MCMS Connector default utility directory:
    cd "C:\Program Files\MCMS 2002 Connector for SharePoint Technologies\WSS\Bin"
  2. Enter the following command to create a new content source that will index our Tropical Green website, using the MCMS guest account to crawl the content, and initiate the crawl immediately after creating the content source (replacing the user and password credentials with your MCMS guest account credentials):

    searchsetup.exe -url "http://www.tropicalgreen.net/TropicalGreen/" ^
                    -crawl "1" ^
                    -user "<domain>\SearchCrawler" ^
                    -password "<password>" ^
                    -portalurl "http://portal.tropicalgreen.net"

The table below describes each of the possible switches:

Switch Description
url: The MCMS URL that will be used by SharePoint as the start point of the crawl.
crawl: Indicates whether or not a crawl is performed immediately after SharePoint creates the content source. A value of "1" instructs SharePoint to perform a crawl immediately. Otherwise, set it to a value of "0" to stop SharePoint from crawling the site.
user: The user account that has access to the MCMS content to be indexed.
password: Password of the user account.
portalurl: URL of the SharePoint portal server that will contain the content source.

You only have to run this command-line program once, not every time you update the site. If you need to perform a full crawl of the site again, you can do so by resetting the content source and executing a full crawl. Refer to the SharePoint Portal Server documentation for more information on this.
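
If you script your environment setup, the switches described above can be assembled programmatically. The following Python sketch only builds the argument list. It uses the five switches listed in this section and nothing else, and the function name `build_searchsetup_args` is an assumption for this illustration.

```python
def build_searchsetup_args(url, portal_url, user, password,
                           crawl_now=True, exe="searchsetup.exe"):
    """Return the searchsetup.exe command as a list of arguments.

    Passing a list (rather than one long string) lets the caller hand it to
    a process launcher without worrying about quoting spaces or backslashes.
    """
    return [
        exe,
        "-url", url,
        "-crawl", "1" if crawl_now else "0",  # "1" crawls right away, "0" does not
        "-user", user,
        "-password", password,
        "-portalurl", portal_url,
    ]
```

On the server, the resulting list could be passed to `subprocess.run(args, check=True)` from the utility's directory. Avoid hard-coding the password in a script that is checked into source control.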

Now that we have a new content source created, let's create a new search scope to make it easier to test our search results.

Creating a New Search Scope

While SharePoint is indexing our site, we should go ahead and create a search scope.

  1. Open the General Content Settings and Indexing Status page by browsing to your portal and clicking the Site Settings link in the upper right. Under the Search Settings and Indexed Content section, click the Configure search and indexing link. Then click the Manage search scopes link. 
  2. On the Manage Search Scopes page, click the New Search Scope button. When prompted to create a new search scope, enter the following:

Field Value
Name: TropicalGreen.net (SearchSetup.exe)
Topics and Areas: Include no topic or area in this scope
Content Source Groups: Limit the scope to the following groups of content sources: CMSChannels

  3. After clicking OK, SharePoint will take us back to the Manage Search Scopes page with our new scope.
  4. Let's get back to the search configuration page. Click the Site Settings link in the heading of the Manage Search Scopes page. Then click the Configure search and indexing link under the Search Settings and Indexed Content section.
  5. At this point, we should make sure everything is configured correctly. We've created a content source and added that source to a new source group. By now, SharePoint should have finished indexing our site (unless you added hundreds of postings to it). Look at the Non-portal content column. If you see errors, warnings, or zero documents indexed, examine the log; some errors might not be errors at all, while others may indicate errors within the MCMS site itself.

    One common error, The address could not be found, is usually caused by links to empty channels that are not configured to use channel rendering scripts. Since we'd expect guests to receive this error when browsing the site, it's not surprising that the SharePoint gatherer ran into the same problem. This is not a problem with the SharePoint index, but rather with the structure of our site: channels that could be empty should have channel rendering scripts or be hidden from the navigation.

  6. If there are no problems, we can test our index. Click the Home link in the portal navigation to get to the homepage. In the upper-right corner, select TropicalGreen.net (SearchSetup.exe) in the dropdown (the whole name may not appear due to design constraints on the width of the dropdown), enter ficus in the search box, and click the green arrow to execute the search. The search should find the posting in the plant catalog.



    Your search results may not match what is indicated in the image above as your postings may have been modified recently.
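
The "address could not be found" errors described in the steps above come from empty channels that are visible in the navigation but have no channel rendering script. The following Python sketch expresses that rule against a plain in-memory model of a channel tree. It does not use the MCMS Publishing API; the `Channel` class, its fields, and `find_problem_channels` are hypothetical names chosen for this illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Channel:
    """Simplified stand-in for a channel: just the facts the rule needs."""
    name: str
    postings: list = field(default_factory=list)
    has_rendering_script: bool = False
    hidden: bool = False
    children: list = field(default_factory=list)


def find_problem_channels(channel, path=""):
    """Return paths of visible, empty channels that lack a rendering script."""
    full_path = f"{path}/{channel.name}"
    problems = []
    if not channel.postings and not channel.has_rendering_script and not channel.hidden:
        problems.append(full_path)
    # Walk the whole hierarchy, since any level can contain an empty channel.
    for child in channel.children:
        problems.extend(find_problem_channels(child, full_path))
    return problems
```

Each path returned is a channel that should either receive a channel rendering script or be hidden from the navigation before the next crawl.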

We now have a SharePoint content source indexing our Tropical Green site and a search scope that targets it. While this search scope can be used within the portal to search our site, we will use it via the SPS Query Service Web Service from our MCMS site to provide search functionality to our users.

Please check back next week for the continuation of this article.
