Extracting Google-Indexed Web Site Pages Using MS Excel
By: Codex-M
    2009-06-11

    Table of Contents:
  • Extracting Google-Indexed Web Site Pages Using MS Excel
  • Understanding the Google Search Result
  • The Process
  • Explaining the Results



    Extracting Google-Indexed Web Site Pages Using MS Excel - The Process


    (Page 3 of 4)

    This method requires no Visual Basic programming in Excel, only plain manipulation using the built-in filters and text functions.

    Step 1. Enter the search query site:thisisyourwebsite.com in the Google search box (substituting your own domain), then select the appropriate portion of data in the Google search results.

    Only select the indexed pages and nothing else; see the sample screen shot below (highlighted regions are the selected areas to be copied and pasted into the Excel spreadsheet).

    Do not include ads or sponsored results in the text selection.
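
    For readers who prefer to open the results page directly, the query from Step 1 can be turned into a search URL. This is only a minimal sketch: the helper name build_site_query_url is hypothetical, and the https://www.google.com/search?q=... address form is an assumption about how the query is submitted, not something this article prescribes.

```python
from urllib.parse import urlencode


def build_site_query_url(domain: str) -> str:
    """Return a Google search URL for the query site:<domain>.

    Hypothetical helper for illustration; the URL form is an assumption.
    """
    query = f"site:{domain}"
    # urlencode percent-encodes the colon in the query as %3A
    return "https://www.google.com/search?" + urlencode({"q": query})


if __name__ == "__main__":
    # Placeholder domain taken from Step 1; replace it with your own site
    print(build_site_query_url("thisisyourwebsite.com"))
```

    The selection and copy in Step 1 are still done by hand in the browser; the sketch only saves typing the query.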

    Step 2. Right-click the selection and choose Copy. Open MS Excel, right-click cell A1, choose Paste Special, and paste as "Text."

    After pasting the data, it should look like this:

    With the data pasted as text, the next steps filter the column for the relevant information:

    Step 3. Click on cell A1, then on the Excel menu bar click "Data," then "Filter," and finally "AutoFilter."

    Step 4. Click the AutoFilter drop-down arrow that now appears on cell A1, and then click "Custom." The Custom AutoFilter dialog box will appear.

    Step 5. In the first drop-down menu under "Show rows where:" select "contains," and then in the second drop-down menu type www.aspfree.com (or your own domain). This keeps only the rows containing www.aspfree.com, that is, only the indexed URLs of the domain.

    Step 6. In the Custom AutoFilter dialog box, find the "And" and "Or" options. Select "And."

    Step 7. Below the And/Or options there are two additional drop-down menus. In the left (first) drop-down, select "contains," and in the right (second) drop-down, type "cached." When everything is set, click OK.
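
    The filter built in Steps 3 to 7 keeps a row only if it contains the domain AND the word "cached." The same logic can be sketched in Python for readers who want to check it outside Excel. This is a minimal illustration, not the article's method: the function name filter_indexed_rows and the sample rows are made up, and the matching is assumed to be case-insensitive, as Excel's "contains" filter behaves.

```python
def filter_indexed_rows(rows, domain, marker="cached"):
    """Keep rows that contain both the domain and the marker word.

    Mirrors a two-condition "contains ... And contains ..." filter.
    Matching is case-insensitive (an assumption about the filter's behavior).
    """
    domain = domain.lower()
    marker = marker.lower()
    return [
        row for row in rows
        if domain in row.lower() and marker in row.lower()
    ]


if __name__ == "__main__":
    # Made-up rows imitating text pasted into column A; not real search results
    pasted_rows = [
        "Example page title",
        "A short snippet of text describing the page",
        "www.aspfree.com/example-page/ - 45k - Cached - Similar pages",
        "Another page title",
        "www.aspfree.com/another-page/ - 30k - Cached - Similar pages",
    ]
    for row in filter_indexed_rows(pasted_rows, "www.aspfree.com"):
        print(row)
```

    Title and snippet rows drop out because they lack the domain, the word "cached," or both, which is why the two conditions are joined with "And" rather than "Or."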

    If you found the above steps confusing, make sure the Custom AutoFilter dialog box looks like this after following Steps 1 to 7:

    © 2003-2009 by Developer Shed. All rights reserved.