Preparing an MCMS Website for Search with SharePoint
Last week, we began examining SharePoint as a way to enable search capabilities on a web site that uses Microsoft Content Management Server. This week, we roll up our sleeves and get our site ready for content indexing. This article is excerpted from chapter five of the book Advanced Microsoft Content Management Server Development, written by Lim Mei Ying et al. (PACKT, 2005; ISBN: 1904811531).
Enable Guest Access for Tropical Green
Because our site will be publicly available, visitors should not have to log in. To let guests view our Tropical Green site, we need to enable guest access. Full instructions are given in the first book, Building Websites with Microsoft Content Management Server (Packt Publishing, January 2005, ISBN 1-904811-16-7); see the section Welcoming Guests to the Site in Chapter 18.
If you have already configured your site for guest access, you can skip this step.
In order to allow guests into our site, we need to:
1. Create a new MCMS Guest Account in the domain or as a local user on the server.
2. Using the SCA, configure MCMS to allow guests and to use the account created in step 1 as the Guest Login Account.
3. Now that MCMS is configured to allow guests into the site, add the account created in step 1 to a subscribers' rights group and grant the rights group access to all channels, resource galleries, and template galleries that currently exist in the site, except for the Members channel located in the Gardens channel.
We've chosen to enable guest access to the site for simplicity. However, it is possible for SharePoint to index our site using Forms or Windows Authentication. For Forms Authentication, we would need to create a special home page that automatically logs the user in with a predefined account to gain access to the site so that SharePoint could start its crawl. We would have to grant the SharePoint crawler account permission to the appropriate subscriber rights group.
If you choose to create an alternate home page that automatically logs SharePoint into your site, keep in mind that any user could use this page to gain access to your site. Take special care if you use this method, for example by adding an IP restriction to the page so that only the SharePoint server can access it.
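The IP restriction idea can be illustrated with a minimal, language-neutral sketch. It is written in Python only to show the logic; it is not MCMS or ASP.NET code, and on a real server the restriction would more likely be applied in the web server's configuration. The address 192.0.2.10 and the names ALLOWED_CRAWLERS and is_crawler_allowed are hypothetical placeholders.

```python
import ipaddress

# Hypothetical address of the SharePoint server; replace with the real one.
ALLOWED_CRAWLERS = [ipaddress.ip_network("192.0.2.10/32")]


def is_crawler_allowed(remote_addr):
    """Return True only if the caller's address is on the allow-list."""
    try:
        addr = ipaddress.ip_address(remote_addr)
    except ValueError:
        # Anything that is not a valid IP address is rejected outright.
        return False
    return any(addr in network for network in ALLOWED_CRAWLERS)
```

An auto-login page guarded this way would run the check before logging anyone in and refuse every request for which it returns False.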
Output Caching and Last-Modified Dates of Postings
ASP.NET is not particularly sophisticated when it comes to HTTP status codes and caching headers: it simply returns an HTTP status code of 200 (OK) for every request and does not send a Last-Modified HTTP header. When SharePoint Portal Server (SPS) performs an incremental index crawl, it sends an HTTP GET request for every page it finds on the site. If a page has previously been indexed and it returned a Last-Modified header, SharePoint Portal Server sends a conditional HTTP GET request that includes an If-Modified-Since HTTP header carrying the date previously returned in the Last-Modified HTTP header. If the response is an HTTP status code 304 (Not Modified), SharePoint will not index the page again.
However, because ASP.NET always returns status code 200, the site will never be incrementally crawled by SPS, and will effectively undergo a full index every time the gatherer is executed. As MCMS template files are actually a special kind of ASP.NET Web Form, this also affects postings based on these template files. Refer to Chapter 4, Preparing Postings for Search Indexing, for instructions on how to address this problem.
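The conditional GET exchange described above can be sketched in a few lines. This is a minimal illustration in Python of the decision a server has to make, assuming it knows when the page was last modified; it is not the ASP.NET or MCMS solution from Chapter 4, and the names respond and _as_utc are hypothetical.

```python
from datetime import datetime, timezone
from email.utils import format_datetime, parsedate_to_datetime


def _as_utc(dt):
    """Treat naive datetimes as UTC and normalise aware ones to UTC."""
    if dt.tzinfo is None:
        return dt.replace(tzinfo=timezone.utc)
    return dt.astimezone(timezone.utc)


def respond(last_modified, request_headers):
    """Return (status, headers) for a GET, honouring If-Modified-Since."""
    # HTTP dates have one-second resolution, so drop microseconds first.
    last_modified = _as_utc(last_modified).replace(microsecond=0)
    headers = {"Last-Modified": format_datetime(last_modified, usegmt=True)}

    since = request_headers.get("If-Modified-Since")
    if since:
        try:
            since_dt = _as_utc(parsedate_to_datetime(since))
        except (TypeError, ValueError):
            since_dt = None  # unparseable date: fall back to a full response
        if since_dt is not None and last_modified <= since_dt:
            return 304, headers  # unchanged: the crawler can skip re-indexing
    return 200, headers  # new or changed: send the full page
```

A crawler that stored the Last-Modified value from its first visit and sends it back as If-Modified-Since gets 304 until the page changes, which is what makes an incremental crawl cheaper than a full one.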
One thing to consider is channel rendering scripts and postings that contain dynamic lists of links to other postings. Although the scripts and postings themselves may not have changed since the last index, the content they generate can change from one request to the next. If you want this generated content to be searchable, make sure that postings containing such controls are always indexed by not returning a Last-Modified HTTP header for them.
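The rule for dynamic postings amounts to a single branch when the response headers are built. The following Python sketch only illustrates that branch; the function name cache_headers and the has_dynamic_content flag are hypothetical, and how a real MCMS template would detect dynamic content is not covered here.

```python
from datetime import datetime, timezone
from email.utils import format_datetime


def cache_headers(last_modified, has_dynamic_content):
    """Build the caching headers for a posting.

    Postings flagged as dynamic get no Last-Modified header, so a crawler
    has no date to send back in If-Modified-Since and must fetch the page
    in full on every crawl.
    """
    if has_dynamic_content:
        return {}
    # Naive datetimes are interpreted as local time by astimezone().
    stamp = last_modified.astimezone(timezone.utc).replace(microsecond=0)
    return {"Last-Modified": format_datetime(stamp, usegmt=True)}
```

The trade-off is that every dynamic posting is re-indexed on each crawl, so the flag is best reserved for postings whose generated links really need to be searchable.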