

Internet Data Query Files

Names Section
Query Section
Effect of Parameters on Query Performance


Internet Data Query files (files with an .idq extension) for Microsoft Index Server (together with the form parameters) specify the query that Microsoft Index Server will run. The .idq file is divided into two sections, the names section and the query section. The names section is optional, and need not be supplied for standard queries.

Note   All paths to .idq files must be the full path name from a virtual root, not a relative path or a physical path. In other words, all paths must start with a slash and cannot contain “.” or “..” components. See the following examples:

Valid Paths
/scripts/myquery.idq
/scripts/samples/search/query.idq

Invalid Paths
c:\inetsrv\scripts\myquery.idq
scripts/query.idq
/samples/../scripts/query.idq

The .idq files cannot be on a virtual root pointing to a remote Uniform Naming Convention (UNC) share.
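The path rules in the note above can be checked mechanically. The following is a minimal Python sketch, not part of Index Server; the function name `is_valid_idq_path` is hypothetical, and rejecting backslashes and requiring the .idq extension are assumptions added for illustration.

```python
def is_valid_idq_path(path: str) -> bool:
    """Return True if `path` looks like a full virtual-root path to an .idq file."""
    # A full path from a virtual root starts with a forward slash; this
    # rules out relative paths and physical paths such as c:\inetsrv\...
    if not path.startswith("/"):
        return False
    # Assumption: backslashes never appear in a virtual path.
    if "\\" in path:
        return False
    # No "." or ".." components are allowed anywhere in the path.
    components = path.split("/")[1:]
    if any(part in (".", "..") for part in components):
        return False
    # Assumption: only files with the .idq extension are of interest here.
    return path.lower().endswith(".idq")


if __name__ == "__main__":
    for candidate in ("/scripts/myquery.idq",
                      "scripts/query.idq",
                      "/samples/../scripts/query.idq"):
        print(candidate, is_valid_idq_path(candidate))
```

The check cannot tell whether a virtual root points to a remote UNC share; that depends on the server configuration.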


Names Section

The names section of the Internet Data Query file defines nonstandard column names that can be referred to in the query. The columns refer to ActiveX™ properties that have been created in document files with IPropertyStorage, or in the Microsoft® Office summary and custom properties. The globally unique identifier (GUID) for Microsoft Office is 0xF29F85E0,0x4FF9,0x1068,0xAB9108002B27B3D9. The following sample defines a few of the ActiveX Summary Information properties:

[Names]
#Property set for OLE document properties
DocTitle                                  = F29F85E0-4FF9-1068-AB91-08002B27B3D9 2
DocSubject( DBTYPE_STR|DBTYPE_BYREF )     = F29F85E0-4FF9-1068-AB91-08002B27B3D9 3
DocAuthor( DBTYPE_STR|DBTYPE_BYREF )      = F29F85E0-4FF9-1068-AB91-08002B27B3D9 4
DocEditTime( DBTYPE_DATE )                = F29F85E0-4FF9-1068-AB91-08002B27B3D9 0xa
DocLastPrinted( DBTYPE_DATE )             = F29F85E0-4FF9-1068-AB91-08002B27B3D9 0xb
DocPageCount( DBTYPE_I4 )                 = F29F85E0-4FF9-1068-AB91-08002B27B3D9 0xe
DocWordCount( DBTYPE_I4 )                 = F29F85E0-4FF9-1068-AB91-08002B27B3D9 0xf
SalesRegion( DBTYPE_WSTR | DBTYPE_BYREF ) = D5CDD505-2E9C-101B-9397-08002B2CF9AE "SalesRegion"

Within the section, any blank line or any line beginning with a number sign (#) is ignored. Other lines consist of a friendly name, optionally followed by a datatype in parentheses, followed by an equal sign (=), then a GUID identifying the property set for the column, followed by either a number or a string giving the PROPID or the property name, respectively. If no datatype is provided, DBTYPE_WSTR is assumed.

The friendly name is the token used in query restrictions, sort specifications, and so on. Multiple friendly names can point to the same property. For example, the friendly name “Author” might be replaced by “Auteur” if an author property is to be shown to a French audience. Friendly names cannot contain spaces or special characters such as angle brackets, equal signs, exclamation points, commas, periods, and asterisks (>=<!,.*).

The GUID, together with the PROPID or property name, identifies the property within the ActiveX property namespace. See the Win32 Software Development Kit (SDK) for more information on ActiveX properties. The PROPID can be specified as a decimal (base 10) or hexadecimal (base 16) number. In the latter case, the number must be preceded by 0x. Property names must be enclosed in quotation marks. For example, “10” (a property name) is not the same as 10 (a PROPID).
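The line syntax described above can be illustrated with a small parser. This is a hedged Python sketch, not the parser Index Server uses; `NameEntry` and `parse_names_line` are hypothetical names, and accepting a bare (unquoted) word as a property name is an assumption made because the meta property example later on this page writes the name that way.

```python
import re
from typing import NamedTuple, Optional, Union

LINE_RE = re.compile(
    r"""^\s*(?P<name>[^\s(=]+)\s*            # friendly name
        (?:\(\s*(?P<dtype>[^)]*?)\s*\))?\s*  # optional datatype
        =\s*(?P<guid>[0-9A-Fa-f-]{36})\s+    # property set GUID
        (?P<prop>.+?)\s*$                    # PROPID or property name
    """,
    re.VERBOSE,
)


class NameEntry(NamedTuple):
    friendly_name: str
    datatype: str              # for example "DBTYPE_STR|DBTYPE_BYREF"
    guid: str
    prop: Union[int, str]      # int = PROPID, str = property name


def parse_names_line(line: str) -> Optional[NameEntry]:
    """Parse one line of a [Names] section; return None for ignored lines."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None            # blank lines and comments are ignored
    m = LINE_RE.match(stripped)
    if m is None:
        raise ValueError(f"not a valid names entry: {line!r}")
    dtype = m.group("dtype")
    if dtype:
        datatype = "|".join(part.strip() for part in dtype.split("|"))
    else:
        datatype = "DBTYPE_WSTR"   # default when no datatype is given
    token = m.group("prop")
    prop: Union[int, str]
    if len(token) >= 2 and token[0] == token[-1] == '"':
        prop = token[1:-1]         # quoted: a property name
    elif re.fullmatch(r"0[xX][0-9A-Fa-f]+", token):
        prop = int(token, 16)      # hexadecimal PROPID
    elif re.fullmatch(r"[0-9]+", token):
        prop = int(token)          # decimal PROPID
    else:
        prop = token               # assumption: bare word is a property name
    return NameEntry(m.group("name"), datatype, m.group("guid"), prop)
```

With this sketch, the `DocEditTime` line from the sample yields a PROPID of 10, and the `SalesRegion` line yields the property name `SalesRegion`. The sketch does not check the friendly name against the list of forbidden characters.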

The datatype is used during restriction parsing to correctly interpret user input. The following table lists the datatypes supported, their equivalent ActiveX mnemonics, and any formatting restrictions.

Datatype       ActiveX mnemonic  Formatting restrictions
DBTYPE_I1      VT_I1             Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number, for example, 0x3F8.
DBTYPE_UI1     VT_UI1            Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.
DBTYPE_I2      VT_I2             Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.
DBTYPE_UI2     VT_UI2            Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.
DBTYPE_I4      VT_I4             Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.
DBTYPE_UI4     VT_UI4            Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.
DBTYPE_I8      VT_I8             Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.
DBTYPE_UI8     VT_UI8            Integer. Expressed in either decimal (base 10) or hexadecimal (base 16) notation. The latter requires 0x before the number.
DBTYPE_R4      VT_R4             Real number. Can be expressed in scientific notation.
DBTYPE_R8      VT_R8             Real number. Can be expressed in scientific notation.
DBTYPE_CY      VT_CY             Currency. Expressed as two integers, separated by a period, for example, 100.55. Cannot be preceded by $, ¥, £, and so on. This datatype does not specify the currency format.
DBTYPE_DATE    VT_DATE           Date. Expressed as an absolute in two forms: yyyy/mm/dd and yyyy/mm/dd hh:mm:ss. Also expressed as a relative date: -#y, -#m, -#w, -#d, -#h, -#n, -#s where the letters correspond to year, month, week, day, hour, minute and second, respectively. Positive relative dates into the future are not supported.
DBTYPE_BOOL    VT_BOOL           Boolean. Expressed as TRUE or FALSE.
DBTYPE_STR     VT_LPSTR          String. Any input accepted.
DBTYPE_WSTR    VT_LPWSTR         Unicode string. Any input accepted.
DBTYPE_BSTR    VT_BSTR           Basic string. Any input accepted.
DBTYPE_GUID    VT_CLSID          GUID (Globally Unique IDentifier). Expressed as xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx.
DBTYPE_BYREF   (not applicable)  Older operator. Should be added to strings. For example: DBTYPE_WSTR | DBTYPE_BYREF.
DBTYPE_VECTOR  VT_VECTOR         Older operator. Vector properties are fully supported.
VT_FILETIME    VT_FILETIME       Expressed as an absolute in two forms: yyyy/mm/dd and yyyy/mm/dd hh:mm:ss. Also expressed as a relative date: -#y, -#m, -#w, -#d, -#h, -#n, -#s where the letters correspond to year, month, week, day, hour, minute and second, respectively. Positive relative dates into the future are not supported.
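The date formats listed for DBTYPE_DATE and VT_FILETIME can be recognized with a few lines of code. The following Python sketch is only an illustration of the two absolute forms and the relative form; the function name `parse_date_restriction` is hypothetical, and it assumes that a relative date consists of a single count and a single lowercase unit letter. It does not resolve a relative date to a point in time, because the page does not say how the reference time is chosen.

```python
import re
from datetime import datetime
from typing import Tuple, Union

# Unit letters for relative dates, as listed in the table.
UNITS = {"y": "year", "m": "month", "w": "week", "d": "day",
         "h": "hour", "n": "minute", "s": "second"}


def parse_date_restriction(text: str) -> Union[datetime, Tuple[int, str]]:
    """Return a datetime for absolute dates, or (count, unit) for relative ones."""
    text = text.strip()
    # Relative form: a minus sign, a count, and one unit letter, e.g. -2w.
    relative = re.fullmatch(r"-([0-9]+)([ymwdhns])", text)
    if relative:
        return int(relative.group(1)), UNITS[relative.group(2)]
    # Absolute forms: yyyy/mm/dd hh:mm:ss and yyyy/mm/dd.
    for fmt in ("%Y/%m/%d %H:%M:%S", "%Y/%m/%d"):
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    # Anything else, including positive relative dates, is rejected.
    raise ValueError(f"unsupported date expression: {text!r}")
```

For example, `parse_date_restriction("-2w")` returns `(2, "week")`, while `parse_date_restriction("+1d")` raises `ValueError` because relative dates into the future are not supported.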

A set of standard friendly names is always available, even if they are not explicitly defined in the names section. See List of Property Names on the “Query Language” page. For other Microsoft Office properties, see the Microsoft Office Software Development Kit (SDK). For properties available with other products, see the documentation from each independent software vendor.

The HTML filter extracts text from the content field of a meta element. For example, if an HTML file has this line:

<META NAME="DESCRIPTION" CONTENT="Sample query form for Microsoft Index Server">

Then a user can query the information in the content field, namely “Sample query form for Microsoft Index Server”, by using the HTML meta property. The GUID for the meta property is D1B5D3F0-C0B3-11CF-9A92-00A0C908DBF1 and the property name is specified by the name field, or the HTTP-EQUIV field. In the above example, the property name is DESCRIPTION. Thus a friendly name, say MetaDescription, for the meta property can be defined as

MetaDescription(DBTYPE_WSTR) = D1B5D3F0-C0B3-11CF-9A92-00A0C908DBF1 description

The GUID for the meta property is stored in a registry parameter located at:

HKEY_LOCAL_MACHINE
 \System
  \CurrentControlSet
   \Control\HtmlFilter
    \MetaTagClsid
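To see which name and content pairs a page exposes as meta properties, the meta elements can be listed with a short script. This Python sketch uses the standard `html.parser` module and is not the Index Server HTML filter; the class name `MetaCollector` is hypothetical, and uppercasing the property name is an assumption (the example above defines the property as `description` for a tag that uses `DESCRIPTION`).

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect the NAME (or HTTP-EQUIV) and CONTENT fields of meta elements."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        # HTMLParser reports tag and attribute names in lowercase.
        if tag != "meta":
            return
        fields = dict(attrs)
        # The property name comes from the name field or the http-equiv field.
        name = fields.get("name") or fields.get("http-equiv")
        content = fields.get("content")
        if name and content is not None:
            # Assumption: property names are compared without regard to case.
            self.properties[name.upper()] = content


if __name__ == "__main__":
    collector = MetaCollector()
    collector.feed('<META NAME="DESCRIPTION" '
                   'CONTENT="Sample query form for Microsoft Index Server">')
    print(collector.properties)
```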

Query Section

The query section of the .idq file specifies parameters that will be used in the query. It can refer to form variables and can include conditional expressions to set a variable to alternative values depending upon some condition. The section begins with a [Query] tag, and is followed by a set of parameters. Here is a simple .idq file:

[Query]
CiScope=/
CiColumns=FileName
CiRestriction=#filename *.*
CiTemplate=/Scripts/Template.htx

The preceding four parameters are required. In many cases, one or more parameters will be passed down from a form. Here is a very simple form:

<FORM ACTION="/scripts/simple.idq" METHOD="GET">
Query : <INPUT TYPE="TEXT" NAME="Restriction" SIZE="60" MAXLENGTH="100" VALUE="">
<INPUT TYPE="SUBMIT" VALUE="Execute Query">
</FORM>

This form can work with the following .idq file to pass parameters through from the user:

[Query]
CiScope=/
CiColumns=FileName
CiRestriction=%Restriction%
CiTemplate=/Scripts/Template.htx
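The effect of the %Restriction% reference can be modeled as a simple text substitution. The Python sketch below only illustrates the idea; it is not how Index Server is implemented. The function name `substitute_form_variables` is hypothetical, and replacing an unknown variable with an empty string is an assumption.

```python
import re


def substitute_form_variables(idq_text: str, form: dict) -> str:
    """Replace each %name% reference with the matching form variable."""
    def replace(match):
        # Assumption: a variable the form did not send becomes empty.
        return str(form.get(match.group(1), ""))
    return re.sub(r"%(\w+)%", replace, idq_text)


if __name__ == "__main__":
    idq = ("[Query]\n"
           "CiScope=/\n"
           "CiColumns=FileName\n"
           "CiRestriction=%Restriction%\n"
           "CiTemplate=/Scripts/Template.htx\n")
    # The form field is named Restriction, matching the reference above.
    print(substitute_form_variables(idq, {"Restriction": "#filename *.htm"}))
```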

Conditional expressions can also be used in .idq files, in exactly the same manner as in .htx files. In addition to the four parameters shown earlier, there are many other optional parameters. Common additions include CiSort and CiForceUseCi. See the full list of additions.


Warning   Be careful when substituting parameters for the CiTemplate parameter because you could unintentionally allow files in execute-only scripts directories to be sent over the network. For example, if an .idq file contained the line

CiTemplate=%CiTemplate%

a client could send a URL that contained the following line in the query string:

CiTemplate=/scripts/mysecretfile.pl

With this string, an unauthorized user could read the contents of a confidential file.

It is better to switch among different .htx files by passing just the base name of the file and adding the script directory and file name extension in the parameter substitution. The following file, Sample.idq, shows how to do this:

[Query]
CiColumns=FileName
CiRestriction=%q%
CiTemplate=/scripts/%t%.htx
CiSort=%s%
CiScope=/

The query can be executed with a URL like the following:

http://computername/scripts/sample.idq?q=ActiveX&t=form1
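If requests pass through code you control before they reach the .idq file, the base name can be checked against a restrictive pattern so that it cannot carry path separators or a different extension. This Python sketch is an illustration under that assumption; `template_path` and the allowed character set are hypothetical choices, not rules defined by Index Server.

```python
import re

# Assumption: template base names use only letters, digits, "_" and "-".
SAFE_BASE_NAME = re.compile(r"[A-Za-z0-9_-]+")


def template_path(base_name: str) -> str:
    """Build the CiTemplate value from a base name, rejecting anything unusual."""
    if not SAFE_BASE_NAME.fullmatch(base_name):
        raise ValueError(f"unsafe template name: {base_name!r}")
    # The directory and the extension are fixed here, never taken from the client.
    return f"/scripts/{base_name}.htx"


if __name__ == "__main__":
    print(template_path("form1"))   # /scripts/form1.htx
```

A value such as `mysecretfile.pl` or `../secret` is rejected because it contains characters outside the allowed set.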

Effect of Parameters on Query Performance

The fastest query is a sequential query that uses the content index. Certain parameter settings will force the query engine to use a less efficient method to resolve the query. To guarantee fast queries, set CiSort to nothing (or descending by rank), set CiForceUseCi to TRUE, and do not reference CiMatchedRecordCount, CiRecordsNextPage, or CiTotalNumberPages in the .htx template.

Note: A Uniform Resource Locator (URL) or a form-based query can send up to 4 kilobytes (K) of data. If a query larger than 4K is sent, the behavior is unpredictable. The query size includes all variables sent from the browser to the .idq file.
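A client-side check can catch oversized queries before they are sent. The Python sketch below assumes that 4K means 4096 bytes and that the size is measured on the URL-encoded query string; neither detail is stated on this page, and the function name `query_fits_limit` is hypothetical.

```python
from urllib.parse import urlencode

# Assumption: the 4K limit is 4096 bytes of URL-encoded data.
MAX_QUERY_BYTES = 4 * 1024


def query_fits_limit(variables: dict) -> bool:
    """Return True if all variables together stay within the assumed limit."""
    encoded = urlencode(variables)   # includes every variable sent to the .idq file
    return len(encoded.encode("ascii")) <= MAX_QUERY_BYTES


if __name__ == "__main__":
    print(query_fits_limit({"q": "ActiveX", "t": "form1"}))
```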

Sequential versus Nonsequential Execution

A query can be executed sequentially (results fetched as needed) or it can be executed nonsequentially (results cached on the server). A sequential query requires fewer server resources, but also has some limitations. Backwards scrolling (CiBookmarkSkipCount < 0) will re-execute the query and scroll forward to the specified position. Sequential queries cannot refer to the following variables: CiMatchedRecordCount, CiRecordsNextPage, and CiTotalNumberPages.
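The recommendations for fast queries can be turned into a simple configuration check. The following Python sketch is a hypothetical helper, not an Index Server tool: it takes the [Query] parameters as a dictionary and the text of the .htx template, and reports anything that departs from the guidance above. Finding variable references by a case-insensitive substring search is an assumption about how templates name them.

```python
from typing import Dict, List

COUNT_VARIABLES = ("CiMatchedRecordCount", "CiRecordsNextPage",
                   "CiTotalNumberPages")


def fast_query_problems(params: Dict[str, str], htx_text: str) -> List[str]:
    """List the settings that may prevent a fast, sequential query."""
    problems = []
    # CiSort should be empty or descending by rank, written Rank[d].
    sort = params.get("CiSort", "").strip()
    if sort and sort.lower() != "rank[d]":
        problems.append(f"CiSort is {sort!r}, not empty or Rank[d]")
    # CiForceUseCi should be TRUE so that the content index is always used.
    if params.get("CiForceUseCi", "").strip().upper() != "TRUE":
        problems.append("CiForceUseCi is not TRUE")
    # Sequential queries cannot refer to the count variables.
    lowered = htx_text.lower()
    for variable in COUNT_VARIABLES:
        if variable.lower() in lowered:
            problems.append(f"template references {variable}")
    return problems
```

An empty list means that none of the conditions named on this page were found; it does not prove that the query will run sequentially.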

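Either of the following actions will force a query to be nonsequential:

Setting CiSort to anything other than nothing or descending by rank
Referring to CiMatchedRecordCount, CiRecordsNextPage, or CiTotalNumberPages in the .htx template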

Enumerated versus Indexed Resolution

Executing queries that must be enumerated can also slow down performance. Most queries are resolved by using the content index, but certain conditions force the query engine to recursively search the disk to locate matching files. These queries include:

Queries can be forced to use the content index by setting CiForceUseCi to TRUE in the .idq file. The query engine will then always use the content index, but query results may be out of date for recently modified files. If the content index was used for a query, and some files on disk have been modified more recently than their contents have been filtered, the built-in variable CiOutOfDate will be set to the value 1. In some cases, a query is simply too complex to be resolved solely through use of the content index. In these cases, the built-in variable CiQueryIncomplete will be set to 1. Content queries can always use the content index, although their results may be out of date.

Deferring Nonindexed Trimming

Index Server includes special support for optimizing content queries that are sorted descending by rank (CiSort = Rank[d]). For such queries, minimal information can be retrieved from the index before additional property and security tests are performed. However, if the total number of results matching the query is greater than CiMaxRecordsInResultSet, additional testing must be performed during index retrieval to remove items that fail the additional property and security tests. This frees up space in the result set for items matching the full query. This processing uses up resources, and can be deferred by setting CiDeferNonIndexedTrimming to TRUE. The query will then pick CiMaxRecordsInResultSet items first and trim those, so the end result may contain fewer than CiMaxRecordsInResultSet matching items. For queries with the scope set to the entire corpus, on a server with little or no security, consider setting CiDeferNonIndexedTrimming to TRUE to improve performance.
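The difference between the two trimming strategies can be shown with a toy model. In this Python sketch, `ranked` stands for index matches already ordered by rank, `passes` stands for the additional property and security tests, and `max_records` plays the role of CiMaxRecordsInResultSet. All names are hypothetical, and the real query engine is certainly more involved.

```python
from typing import Callable, List


def trim_during_retrieval(ranked: List[str],
                          passes: Callable[[str], bool],
                          max_records: int) -> List[str]:
    """Test every item while retrieving, so failures never take up a slot."""
    results: List[str] = []
    for item in ranked:
        if len(results) >= max_records:
            break
        if passes(item):
            results.append(item)
    return results


def trim_deferred(ranked: List[str],
                  passes: Callable[[str], bool],
                  max_records: int) -> List[str]:
    """Pick the first max_records items, then trim those that fail the tests."""
    candidates = ranked[:max_records]
    return [item for item in candidates if passes(item)]


if __name__ == "__main__":
    ranked = ["a", "b", "c", "d", "e"]
    visible = lambda item: item not in {"b", "c"}
    print(trim_during_retrieval(ranked, visible, 3))  # ['a', 'd', 'e']
    print(trim_deferred(ranked, visible, 3))          # ['a']
```

In the deferred case only three items are ever tested, which is cheaper, but two of them fail and nothing replaces them. When few items fail the tests, as on a server with little or no security, the two strategies return nearly the same results.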




© 1996 by Microsoft Corporation. All rights reserved.