Monday, March 26, 2012

How to make full-text search accent-insensitive?

My SQL Server 2000 does not use the accent insensitive collation setting
(collation containing _AI) in full-text serches:
While SELECT * FROM <table> WHERE <column> LIKE '%a%' returns 'Muse',
SELECT * FROM <table> WHERE CONTAINS(*, 'a') does not.

Setting 'default full-text language' to neutral (0) does not help.

How can I make full-text searches accent insensitive?

Thanks for any ideas!
Matthias"Matthias HALDIMANN" <matthias.haldimann@.epfl.ch> wrote in message
news:4039d459$1@.epflnews.epfl.ch...
> My SQL Server 2000 does not use the accent insensitive collation setting
> (collation containing _AI) in full-text serches:
> While SELECT * FROM <table> WHERE <column> LIKE '%a%' returns 'Muse',
> SELECT * FROM <table> WHERE CONTAINS(*, 'a') does not.
> Setting 'default full-text language' to neutral (0) does not help.
> How can I make full-text searches accent insensitive?
> Thanks for any ideas!
> Matthias

You'll probably get a better reply if you post this in
microsoft.public.sqlserver.fulltext, since it's a relatively specialized
area.

Simon|||The "solution" for those who are interested:

There is NO solution! This is a known bug, full-text search is ALWAYS
accent-sensitive. All you can do is wait for a future update that may
correct this.

Matthias

"Matthias HALDIMANN" <matthias.haldimann@.epfl.ch> wrote in message
news:4039d459$1@.epflnews.epfl.ch...
> My SQL Server 2000 does not use the accent insensitive collation setting
> (collation containing _AI) in full-text serches:
> While SELECT * FROM <table> WHERE <column> LIKE '%a%' returns 'Muse',
> SELECT * FROM <table> WHERE CONTAINS(*, 'a') does not.
> Setting 'default full-text language' to neutral (0) does not help.
> How can I make full-text searches accent insensitive?
> Thanks for any ideas!
> Matthias|||Making accent insensitive searches with Full Text Search
Installing an accent insensitive version of Microsoft Search
Service

It is really a shame that Microsoft did not provide a solution for doing case
insensitive Full Text Search (FTS) before 2005. It is reaslly a lack of
consideration for all of their customers speaking or using language
that has accents. This is why I decided to post these instructions,
because there is a way around it. It is a post by Alex Hubner that
pointed me in the right direction. His solution works, I tried it.
He did not give the details on how to do it, so I decided I would:

* Get SharePoint Portal Server 2001 Service Pack 3 (SP3): KB837017 from
http://www.microsoft.com/downloads/...=15677a92-3470-
465f-9f63-e621094103e0&DisplayLang=en. There are five files to download:

oFile Name: SPSFull1.exe
File Size: 27937 KB
oFile Name: SPSFull2.exe
File Size: 25464 KB
oFile Name: SPSFull3.exe
File Size: 25975 KB
oFile Name: SPSFull4.exe
File Size: 27206 KB
oFile Name: SPSFull5.exe
File Size: 25198 KB

You can download and unpack the five downloaded files, but in fact, you only
need the content of SPSFull3.exe. A directory called SharePointPortalInstall will
be created.

Note: Do not use SharePoint Portal Server 2003, since its directory structure is
different, and the Microsoft Search Services don't seem to be a separate and
independent module in this version.

* Only the MS-Search part of the package is needed. You can find it in
SharePointPortalInstall\Server\Search. The installation program is called
SearchStp.exe. This is the installation program we will use to re-install Microsoft
Search Service and make searches accent insensitive. This will allow, for
example, the get the same search results not matter if the user enters a keyword
with or without accents (Eg.: 'Montreal' or 'Montral');
* You can use the documentation found on
http://support.microsoft.com/?kbid=827449 to re-install the Microsoft Search
Services. However, it is not necessary to go through all that trouble. I did so the
first time. However, some registry keys used by SQL Server and MS Search
Services were missing after. Gladfully, I had made a backup of all the registry
keys that we are asked to delete in this documentation. Also, the MS Search
Services have been installed into a different directory, so I had to do a search
through the registry to replace the old path by the new one where it had not been
updated by the installation program. In fact, you can skip this paragraph. I'm just
adding this information in case you would need it;
* Here are the registry keys I backup prior to this installation, just in case:

.HKEY_LOCAL_MACHINE\Software\Microsoft\Search
.HKEY_LOCAL_MACHINE\System\CurrentControlSet\Servi ces\MSSCNTRS
.HKEY_LOCAL_MACHINE\System\CurrentControlSet\Servi ces\MSSEARCH
.HKEY_LOCAL_MACHINE\System\CurrentControlSet\Servi ces\MSSGATHERER
.HKEY_LOCAL_MACHINE\System\CurrentControlSet\Servi ces\MSSGTHRSVC
.HKEY_LOCAL_MACHINE\System\CurrentControlSet\Servi ces\MSSINDEX

* The second time I installed the Accent Insensitive MS Search Services, I only
used section 2 of the documentation found on
http://support.microsoft.com/?kbid=827449 called Install the Microsoft Search
Service. However, I used the SearchStp.exe file downloaded earlier instead of
using the one they specify, and I stopped the Microsoft Search Service before
doing the installation even if it is not specified in the documentation. As specified
in this document, "To view the original domain name and user account, you can
use the most recent SQL Server setup log file (SqlstpN.log). In the SqlstpN.log
file, locate the line where the SQL Server Setup program ran the Ftsetup.exe
program. Additionally, make sure that the information is the same as the
information that is included in the command." On my computer, this file was
called sqlstp.log and it was located in the Windows directory.
* You can probably also follow the steps in section 3, but I haven't done so.

Once this is all done, you can use Enterprise Manager to create indexes on your database.
There is one thing though to know. When using Enterprise Manager to manage a remote
server, Tools->Full Text Indexing remains greyed out. I had to start Enterprise Manager
locally on the server in order to gain access to the Full Text Indexing tool.

Creating indexes
In order to create Full Text Search indexes on a table, the table needs to have a primary
key. So make sure all the tables you want to index have primary keys.

Here are the steps to create indexes. The first time you create an index, you will need to
create a catalog:

* Start Enterprise Manager and open the your database Table View;
* Select the table you want to index using FTS (Full Text Search) and right-click on
it.
* In the menu that just opened with the right-click, select Full-Text Index Table,
and then in the sub-menu select Define Full-Text Indexing on a Table. This will
start the Welcome to the SQL Server Full-Text Indexing Wizard.

* Click on Next;
* On the Next Screen, select the key to index, and click on Net;
* Select the field(s) to index and choose the proper language settings in the second
column. Click on Next;
* You will then have to create a Catalog (An FTS Catalog). Give it a name and
enter a location where it should be stored. You can use the default path. Click on
Next;
* You will then have to create a schedule for the indexation process. Click on New
Catalog Schedule, and define a Schedule. Let's make it a Full Population. Maybe
an Incremental Population would work, but I am not sure in that case what
happens if information is removed from the database being indexed. Once the
schedule is defined, click on Ok and then on Next;
* Click on Finish, and the catalogue will be created:

We are now ready to create an index on another table. This time, it is not necessary to
create a new catalogue. Simply reuse to one we previously created. Same thing for the
schedule; the one already created can be reused.

We now have a catalogue, but the indexes are still empty, since they are scheduled to be
generated in the future (based on the schedule you created). We should now do a full
population of the indexes. In SQL Server Enterprise Manager, right click on each table
you wish to index, select Full-Text Index Table, and then select Start Full Population.

You can now use a query similar to the one bellow in SQL Query Analyzer to test
whether the case insensitive indexing is working properly:

SELECT *
FROM table_name
WHERE CONTAINS(field_name, 'raphal')

otable_name should be replaced by the name of the table you want to search;
ofield_name should be replaced by the name of the field you want to search;

Use this query with the same word with and without accents, and you should get the same
result both times.

Enjoy!

Jean-Franois Beauchamp
IT Consultant
jackojf-fts at yahoo.com

Quote:

Originally Posted by Matthias HALDIMANN

The "solution" for those who are interested:

There is NO solution! This is a known bug, full-text search is ALWAYS
accent-sensitive. All you can do is wait for a future update that may
correct this.

Matthias

"Matthias HALDIMANN" <matthias.haldimann@.epfl.ch> wrote in message
news:4039d459$1@.epflnews.epfl.ch...
> My SQL Server 2000 does not use the accent insensitive collation setting
> (collation containing _AI) in full-text serches:
> While SELECT * FROM <table> WHERE <column> LIKE '%a%' returns 'Muse',
> SELECT * FROM <table> WHERE CONTAINS(*, 'a') does not.
> Setting 'default full-text language' to neutral (0) does not help.
> How can I make full-text searches accent insensitive?
> Thanks for any ideas!
> Matthias
>

No comments:

Post a Comment