Extracting Google-Indexed Web Site Pages Using MS Excel - The Process
This method does not involve any Visual Basic programming in Excel, only plain manipulation with the built-in text functions.
Step 1: Enter the search query site:thisisyourwebsite.com in the Google search box, then select the relevant portion of the search results.
Only select the indexed pages and nothing else; see the sample screen shot below (highlighted regions are the selected areas to be copied and pasted into the Excel spreadsheet).
Do not include Ads and Sponsored results in the text selection.
Step 2: Right-click the selection and choose Copy. Open MS Excel, right-click cell A1, choose Paste Special, and paste as "Text."
After pasting the data, it should look like this:
Next, filter the column for the relevant information:
Step 3: Click cell A1, then on the Excel menu bar click "Data," then "Filter," and finally "AutoFilter."
Step 4: Click the AutoFilter drop-down arrow that appears in cell A1, then click "Custom." The Custom AutoFilter dialog box will appear.
Step 5: In the first drop-down menu under "Show rows where:" select "contains," and in the second drop-down menu type the domain you searched for in Step 1. This example uses www.aspfree.com. The filter then keeps only the rows that contain www.aspfree.com, which are the indexed URLs of the domain.
Step 6: In the Custom AutoFilter dialog box, find the "And" and "Or" options and select "And."
Step 7: Below the And/Or options there are two more drop-down menus. In the left (first) one select "contains," and in the right (second) one type "cached." When everything is set, click OK.
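For readers who want to check the logic outside Excel, the filter set up in Steps 5 to 7 can be sketched in a few lines of Python. This is only an illustration, not part of the Excel method: the function name filter_rows and the sample rows are hypothetical, and the sketch assumes that the "contains" test ignores letter case, as Excel's AutoFilter does.

```python
def filter_rows(rows, domain, marker="cached"):
    """Keep rows that contain BOTH substrings (the "And" option in Step 6).

    The comparison ignores case, on the assumption that this matches the
    behaviour of the "contains" condition in Excel's Custom AutoFilter.
    """
    domain = domain.lower()
    marker = marker.lower()
    return [row for row in rows
            if domain in row.lower() and marker in row.lower()]


# Hypothetical rows, standing in for what column A might hold after Step 2.
rows = [
    "ASP Free - Home",
    "Tutorials and articles for ASP developers ...",
    "www.aspfree.com/ - 45k - Cached - Similar pages",
    "Some ASP Article",
    "www.aspfree.com/c/a/BrainDump/ - 60k - Cached - Similar pages",
    "Visit www.aspfree.com for more",
]

matches = filter_rows(rows, "www.aspfree.com")
for row in matches:
    print(row)
```

Requiring both conditions is what drops a row such as the last sample one, which mentions the domain but is not a URL line from the search results.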
If you found the steps above confusing, make sure the Custom AutoFilter dialog box looks like this after following Steps 1 to 7:
More By Codex-M