Searching MCMS with SharePoint

If you have a web site that uses Microsoft Content Management Server, you probably noticed that the CMS doesn't offer any search capabilities out of the box. But you're not stuck without search; this article explores your options. It is excerpted from chapter five of the book Advanced Microsoft Content Management Server Development, written by Lim Mei Ying et al. (PACKT, 2005; ISBN: 1904811531).

October 05, 2006

For as long as content-centric websites have been around, there has been a need to search their content. Many of the most successful dot-com businesses have been search sites such as Yahoo! and Google. Every few months a new search site opens its doors, and many of these perform aggregate searches of multiple sites simultaneously. At the other end of the spectrum, many site owners require a search capability that returns results only for their specific site.

Microsoft Content Management Server (MCMS), while a very robust content management solution, does not offer any search capabilities out of the box. However, just because you have an MCMS website doesn't mean you are stuck without search capabilities.

MCMS Search Options

There are quite a few ways to implement searching on an MCMS website, each with varying costs, implementation complexity, and limitations. As usual, each option has its advantages and disadvantages. Google (http://www.google.com) provides a Web Service API for you to submit queries against at no cost, but you are limited to 1,000 searches per day, and there are some licensing requirements regarding logo placement. Coveo (http://www.coveo.com) provides a free, no-expiration license for its Enterprise Search product, but it's limited to searching 5,000 documents (searching more than 5,000 requires a license to be purchased from Coveo). Mondosoft's (http://www.mondosoft.com) MondoSearch seamlessly integrates into MCMS and offers up a robust feature set, but it's not free.

Microsoft's enterprise portal solution, SharePoint Portal Server 2003 (SPS), contains a powerful and customizable search engine. The indexes SPS creates can be searched by submitting a Microsoft SQL Full-Text query via a Web Service. If your organization has already implemented, or plans to deploy, SPS, then you could leverage it as your MCMS search engine. Be aware, however, that if your site is publicly accessible, this solution may be less compelling, because a SharePoint External Connector license would be required. For this reason, the SharePoint search solution we look at is typically a viable option only for intranet-based MCMS sites.

In this chapter, we will leverage SPS's search to provide a robust search capability for our Tropical Green MCMS site. On the way, we'll configure SharePoint to index our Tropical Green site. We will also try out some free components you can use in your MCMS site to execute search queries against the SharePoint index.

Microsoft SharePoint Portal Server Search

To fully leverage SharePoint Portal Server Search to your advantage, you need to understand how it works and how to configure it. Before we explain how it works, there are a few key components that need to be understood:

  1. A content source contains the information that will be indexed. Content sources can be external websites, file shares, Windows SharePoint Services sites, Microsoft Exchange public folders, or other systems that provide a protocol handler for SharePoint Search such as Lotus Notes.
  2. Index files contain crawled content from one or more content sources. Aggregating and cataloging content from disparate content sources enables future search queries to be much more efficient. Index files can also be copied or propagated to SharePoint Web servers for more efficient searching. Two indexes are created by default when you create a new portal: Portal_Content and Non_Portal_Content. As expected, the former contains all content stored in the portal while the latter contains content outside of the portal.
  3. Search scopes are used to provide a logical grouping of content sources for end users to search. For example, a company may have multiple internal file shares and websites. An employee looking for a specific document doesn't care whether it's in site A or file share B; they just know it's out there. An administrator can create multiple content sources and group them together in a single search scope that the user can search against. In addition, search scopes can be configured to include only specific portions of a website, providing even more granular control over what content is indexed and searchable by your users.
  4. The SharePoint gatherer is responsible for crawling all content sources, extracting content, removing noise words (such as 'and', 'a', 'the', and 'or', to name only a few; noise word files are customizable, so you can add your own noise words), and creating index files that will be used when search queries are executed.

The gatherer is part of the MSSearch service, which performs the content crawling and creates the index files. The service runs on schedules that you can configure through the SharePoint Central Administration tool. At each scheduled time, MSSearch activates the gatherer, which generates a master index for search queries.

An end user uses a search scope to select a collection of content sources to query. SharePoint looks at the catalog containing the content sources and determines the best candidates that match the search query.
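
The relationship between content sources, noise words, the index, and search scopes can be pictured with a toy model. The following is only an illustrative Python sketch, not how MSSearch is implemented: the function names, the sample URLs, and the tiny noise word set are all assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical noise word list; SharePoint keeps its own in customizable files.
NOISE_WORDS = {"and", "a", "the", "or"}


def build_index(content_sources):
    """Crawl every source and map each non-noise word to the documents containing it."""
    index = defaultdict(set)
    for source_name, documents in content_sources.items():
        for url, text in documents.items():
            for word in text.lower().split():
                word = word.strip(".,;:!?")
                if word and word not in NOISE_WORDS:
                    index[word].add((source_name, url))
    return index


def search(index, scope, query):
    """Return sorted URLs that contain every query word and belong to a source in the scope."""
    words = [w for w in query.lower().split() if w not in NOISE_WORDS]
    if not words:
        return []
    hits = set.intersection(*(index.get(w, set()) for w in words))
    return sorted(url for source, url in hits if source in scope)


# Two made-up content sources: a website and a file share.
sources = {
    "TropicalGreen": {
        "/TropicalGreen/Plants/Hibiscus.htm": "The hibiscus is a tropical plant.",
        "/TropicalGreen/Gardens/Tips.htm": "Water the garden and prune the plant.",
    },
    "FileShare": {
        "//server/docs/plants.txt": "A plant catalog.",
    },
}
index = build_index(sources)

# A search scope is just a named group of content sources.
website_scope = {"TropicalGreen"}
print(search(index, website_scope, "the plant"))
```

The index is built once, at crawl time, and the scope is applied only at query time, which is why one index can serve several differently scoped searches.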


Preparing the MCMS Site for Indexing

Before we can configure SharePoint to index our MCMS site, there are a few steps we need to take to make the indexing more efficient and useful. First and foremost, check if your site has the MCMS option Map Channel Names to Host Header Names set. If so, you'll need to disable it because one of the two options we have, utilizing the MCMS Connector, does not support host header names. For the rest of this chapter, we will assume our site exists in the top-level channel TropicalGreen.

If your site uses the Map Channel Names to Host Header Names option, you may need to rename the top level channel to reflect the channel we'll use in this example (namely TropicalGreen).

In addition, our example assumes you've set up MCMS and SharePoint according to Appendix A, Setting up MCMS & SPS on the Same Virtual Server. If your MCMS Web Entry Point and SharePoint portal are not in the same virtual server, this requirement may not affect you.

Second, we'll configure our site for guest access. The majority of our Tropical Green site is intended to be available to any anonymous visitor. While we do have one restricted section of our site, we will set up a new account that will have read access to our entire site for use by SharePoint as it crawls our site. Then we'll filter the results to ensure that the user running the search will only see items in the search results he or she has access to.

Next, we need to address how MCMS and output caching behave on requests for postings. The default page rendering behavior of MCMS is not performance-friendly to SPS searching. Because all MCMS requests return an HTTP status code of 200, SharePoint will always perform full crawls of our site and not an incremental crawl. We have already explained the details of what happens with each index crawl request and implemented a solution in Chapter 4, Preparing Postings for Search Indexing.
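
To see why a constant 200 status forces full crawls, it helps to look at how a conditional request normally works: the crawler sends the date of its last visit, and the server answers 304 (Not Modified) when the page has not changed since. The Python sketch below illustrates that general HTTP mechanism only. It is not the Chapter 4 solution, and the function name `crawl_response` is made up for this example.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def crawl_response(last_modified, if_modified_since=None):
    """Return (status, headers) for a page last changed at `last_modified` (UTC-aware)."""
    # HTTP dates only have one-second resolution.
    last_modified = last_modified.replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}
    if if_modified_since:
        try:
            since = parsedate_to_datetime(if_modified_since)
        except (TypeError, ValueError):
            since = None  # unparseable header: fall back to a full response
        if since is not None:
            if since.tzinfo is None:
                since = since.replace(tzinfo=timezone.utc)
            if last_modified <= since:
                return 304, headers  # unchanged: the crawler can skip this page
    return 200, headers  # changed or first visit: the crawler must re-index


changed = datetime(2006, 10, 2, 8, 0, 0, tzinfo=timezone.utc)
first_status, first_headers = crawl_response(changed)
second_status, _ = crawl_response(changed, first_headers["Last-Modified"])
print(first_status, second_status)
```

A site that never takes the 304 branch gives the crawler no way to tell changed pages from unchanged ones, so every crawl behaves like a full crawl.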

Finally, we'll add a control, supplied with the MCMS Connector for SharePoint Technologies, to our templates that makes additional metadata properties available to the index crawler, giving additional information for users searching our site.

Disabling Channel Names to Host Header Names Mapping

One of the examples we'll run through in this chapter involves using the MCMS Connector for SharePoint Technologies. The search controls shipped with the MCMS Connector do not support the host header mapping feature, so we cannot map channel names to host header names. If your site employs this option, you'll need to disable it. In addition, we should rename the top-level channel www.tropicalgreen.net to TropicalGreen, which is much more convenient because the channel name will now become part of the path in the URL.

The MCMS Connector for SharePoint Technologies requires the .NET Framework 1.1. It will not function properly on a site running version 1.0 of the .NET Framework.

This change may cause some User Controls in our site to throw errors as they reference a channel path that no longer exists. Check the following files to make sure any references to /Channels/www.tropicalgreen.net/ are changed to /Channels/TropicalGreen/:

  • /Login.aspx
  • /UserControls/RightMenu.ascx
  • /UserControls/SiteMapTree.ascx
  • /UserControls/TopMenu.ascx

You'll probably want to add a file to the root of the website that automatically redirects users to the site's channel. Name the file default.aspx and give it the following line:

  <% Response.Redirect("/TropicalGreen/") %>

Any requests for http://www.tropicalgreen.net will now be redirected to http://www.tropicalgreen.net/TropicalGreen/.


If your solution requires the Map Channel Names to Host Header Names feature, the MCMS Connector search solution will not be appropriate for your needs. You can, however, build your own custom search solution as described in detail later in this chapter.

Assigning a Search Account

 

Our Tropical Green site has both a public section of the site and a members-only section. If an anonymous user, or guest, executes a search, they should only see results from the public portion of the site. However, if an authenticated user executes a search query, they should see appropriate results from both the public and private portions of the site.
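
This filtering of results by the searcher's rights can be sketched as follows. The Python below is only a conceptual illustration: the `trim_results` helper, the role names, and the sample URLs are assumptions for the example, not part of MCMS or the MCMS Connector.

```python
def trim_results(results, user_roles):
    """Keep only the results that guests, or one of the user's roles, may read."""
    visible = []
    for result in results:
        allowed = result["allowed_roles"]
        if "guest" in allowed or allowed & user_roles:
            visible.append(result["url"])
    return visible


# Hypothetical raw results, as the crawler account (which can read everything) would see them.
raw_results = [
    {"url": "/TropicalGreen/Plants/Hibiscus.htm", "allowed_roles": {"guest"}},
    {"url": "/TropicalGreen/Members/Newsletter.htm", "allowed_roles": {"members"}},
]

print(trim_results(raw_results, set()))        # anonymous visitor
print(trim_results(raw_results, {"members"}))  # authenticated member
```

The index itself contains everything the crawl account can read, so the trimming has to happen on every query, just before the results are shown.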

In order for SharePoint to index our entire site, including the members-only section, we need to create a new account that will have access to the entire site. We'll then configure SharePoint to use this account when indexing. Let's assume we have an account already created called MCMSBOOK\SearchCrawler. The first thing we need to do is configure SharePoint Portal Server to use this account when crawling content.

  1. Start the SharePoint Central Administration by pointing to Start | All Programs | SharePoint Portal Server | SharePoint Central Administration.
  2. Under the section Server Configuration, click the Configure Server Farm Account Settings link. 
  3. Enter the search crawler account credentials in the Default Content Access Account section and click OK.

Now we need to grant our SearchCrawler account subscriber rights to the entire Tropical Green website.

We're going to assume you have already installed the MCMS Connector for SharePoint Technologies as its installer creates an MCMS Search subscriber group in Site Manager for use in searching your MCMS channel structure. Refer to Appendix B for assistance in installing the MCMS Connector.

  1. Start Site Manager by pointing to Start | All Programs | Microsoft Content Management Server | Site Manager.
  2. Select the User Roles button on the left panel within Site Manager.
  3. Select the Subscribers user role. 
  4. Then right-click the MCMS Search User Subscribers role and select Properties.
  5. Click the Group Rights tab to view all the channels, templates, and resources the MCMS Search User role has rights to. All channels, templates, and resources should be checked.
  6. Click the Group Members tab and click the Modify button.
  7. Enter the MCMSBOOK\SearchCrawler user that we added above as the SharePoint crawl account and click OK.
  8. Click OK again to close the property window.

We have now configured SharePoint to crawl our site using the dedicated account and granted the account access to all content within the Tropical Green site.

Please check back next week for the continuation of this article.
