2. Specifying File Types to Include or Exclude

From previous discussions, you know what the Windows
Search service is designed to index. What you don’t know is how the Windows Search
service determines which types of files and folders to index. It does so
according to the file extension. File extensions and file types go hand in hand. File type
associations determine what type of data is stored in a file and how the
file should be handled when opened. When you open most types of files, a
helper application handles the display of the file. For example, when
you open a document file with the .doc extension, Microsoft Office Word is used
to display the document. The Windows Search service uses the information that it knows
about file types and file extensions to help it index files more
efficiently. More specifically, Windows 7 assigns a file filter to each
file extension, and this filter determines exactly how files with a
particular extension are indexed. Table 1 provides
an overview of the standard file filters. As you install additional applications on
your computer, additional file filters may be installed as well to
improve indexing of related application files.

Table 1. File filters used by the Windows Search service

Filter name | Filter description |
---|---|
File Properties filter | This filter is used with binary files, media files, and other nontext-based file formats. As the name implies, this filter retrieves only the filename and file properties. It does not filter the contents of a file, but it is extremely useful when searching for image files, which are rich with metadata such as file size, camera type, and more. |
HTML filter | This filter is designed to work with files formatted using Hypertext Markup Language (HTML). Because this filter recognizes HTML markup tags, you can use it to extract filenames, file properties, and file contents. Because this filter also understands <META> tags, you can also use it to extract meta tag properties within the <HEAD> and </HEAD> tags of an HTML file. |
Microsoft Office Document filter, Microsoft Office Filter, Office Open XML Format filters | These filters are designed to work with documents in Microsoft Office, including the documents for Word, Excel, and PowerPoint. Because these filters recognize Office document formats, Windows Search uses them to extract text contents and properties unique to Office. |
MIME filter | This filter is designed to work with email attachments formatted using the Multipurpose Internet Mail Extensions (MIME) file format. For messages containing attachments, this filter helps the Windows Search service identify the associated file type so that the attachment’s contents can be indexed appropriately. |
Plain Text filter | This filter is designed to improve indexing of plain-text files and file types not registered for use with specific applications. It filters filenames, file properties, and file contents. This is the default filter, and it is not able to recognize any document formats. It handles files as a sequence of ASCII or Unicode characters. |
XML filter | This filter is designed to work with files formatted using the eXtensible Markup Language (XML). Because this filter recognizes XML markup tags, you can use it to extract filenames, file properties, and file contents. |
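
The association between file extensions and filters can be pictured as a simple lookup. The following Python sketch is only a conceptual model, not how the Windows Search service is implemented: the extension-to-filter pairs in `FILTERS_BY_EXTENSION` are illustrative assumptions, and the function name `filter_for` is hypothetical. The fallback to the Plain Text filter follows the "default filter" description in Table 1.

```python
import os

# Hypothetical extension-to-filter table for illustration only. The real
# associations are maintained by Windows and by installed applications.
FILTERS_BY_EXTENSION = {
    ".jpg": "File Properties filter",
    ".html": "HTML filter",
    ".doc": "Microsoft Office Document filter",
    ".txt": "Plain Text filter",
    ".xml": "XML filter",
}

# Table 1 describes the Plain Text filter as the default filter.
DEFAULT_FILTER = "Plain Text filter"


def filter_for(filename):
    """Return the name of the filter assumed to handle this file."""
    _, extension = os.path.splitext(filename)
    # Extensions are compared case-insensitively in this sketch.
    return FILTERS_BY_EXTENSION.get(extension.lower(), DEFAULT_FILTER)
```

In this model, a file whose extension is missing from the table, or a file with no extension at all, falls through to the default filter.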
You can specify file types that the Windows Search service should include or exclude when indexing files by completing the following steps:

1. Click Start and then click Control Panel. In the Control Panel, click Large Icons or Small Icons on the View By list (to return to the original view, click the View By list and select Category). Finally, click Indexing Options.

2. In the Indexing Options dialog box, click Advanced to display the Advanced Options dialog box shown in Figure 3.

3. On the Index Settings tab, select the “Index encrypted files” checkbox if you want the Windows Search service to index files that have been encrypted. Selecting or clearing this option will cause the Windows Search service to completely rebuild the indexes on your computer.

4. If you want to improve indexing of non-English characters, select the “Treat similar words with diacritics as different words” checkbox. A diacritic is a mark above or below a letter that indicates a change in the way it is pronounced or stressed.

   NOTE If you select “Treat similar words with diacritics as different words,” you’ll see a warning prompt stating that the Windows Search service will completely rebuild the indexes for indexed locations on your computer. The Windows Search service needs to rebuild the indexes completely to include previously ignored or substituted characters.

5. On the File Types tab, shown in Figure 4, each file extension and filter association is listed. If a file extension is selected, the Windows Search service includes files of this type when indexing. If a file extension is not selected, the Windows Search service excludes files of this type when indexing. Select or clear file extensions as appropriate.

6. When you install new applications, those applications may register new filters with the Windows Search service and configure related file extensions to use these filters. This is the best way to add indexing functionality. If you want to add support for a particular file extension, type the file extension in the text box provided and then click Add.

7. To change the way files with a particular extension are indexed, select the file extension and then click either Index Properties Only or Index Properties and File Contents.

   Change the way indexing works for a file extension only when you are sure the indexing configuration you’ve chosen works. Generally speaking, you can always stop indexing the contents of a particular file type but rarely can you index the contents of a file type that isn’t already being indexed. Trying to index the contents of a nontext-based file type can cause indexing problems.

8. Click OK to save your settings.
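
To see why changing the diacritics setting forces a full rebuild, it can help to model how a word might be stored in an index under each setting. The Python sketch below is a toy illustration using the standard `unicodedata` module. The function names `strip_diacritics` and `index_key` are hypothetical, and nothing here reflects the actual internals of the Windows Search service.

```python
import unicodedata


def strip_diacritics(word):
    """Remove combining marks, so 'résumé' becomes 'resume'."""
    decomposed = unicodedata.normalize("NFD", word)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def index_key(word, diacritics_sensitive):
    """Return the key under which a word would be stored in a toy index."""
    key = word.casefold()
    if not diacritics_sensitive:
        # Words that differ only by diacritics share one key.
        key = strip_diacritics(key)
    return unicodedata.normalize("NFC", key)
```

With `diacritics_sensitive=False`, "résumé" and "resume" produce the same key. With `diacritics_sensitive=True`, they produce different keys. Because the stored keys differ between the two settings, entries built under one setting cannot simply be reused under the other.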
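
The choices made on the File Types tab can be summarized as two settings per extension: whether the extension is selected, and whether only properties or both properties and contents are indexed. The following Python sketch models that decision. The class `FileTypeSetting` and the function `what_to_index` are hypothetical names, and treating an unlisted extension as excluded is an assumption made for this sketch only.

```python
from dataclasses import dataclass


@dataclass
class FileTypeSetting:
    """One hypothetical row of the File Types tab."""
    selected: bool        # True if the extension's checkbox is selected
    index_contents: bool  # False corresponds to Index Properties Only


def what_to_index(extension, settings):
    """Return 'nothing', 'properties', or 'properties and contents'."""
    setting = settings.get(extension.lower())
    # Assumption: an extension that is not listed is treated as excluded.
    if setting is None or not setting.selected:
        return "nothing"
    if setting.index_contents:
        return "properties and contents"
    return "properties"


# Example settings, chosen purely for illustration.
example_settings = {
    ".txt": FileTypeSetting(selected=True, index_contents=True),
    ".jpg": FileTypeSetting(selected=True, index_contents=False),
    ".log": FileTypeSetting(selected=False, index_contents=True),
}
```

Here `.jpg` is indexed by properties only, which matches the guidance above that indexing the contents of a nontext-based file type can cause problems.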