Sorting query results by relevance in SQL Server 2000 sounds attractive, but is it really achievable in practice?
This morning I saw an article on the Internet that uses Microsoft Index Server to perform full-text queries (I had seen this before, and Computer Management also has such a query function).
My IIS default web server has more than 100,000 HTML documents under g:/wwwroot.
Test:
strSearch = "SELECT DocTitle, Path, FileName, Characterization, Size, write, RANK" & _
    " FROM SCOPE()" & _
    " WHERE CONTAINS ('" & Request.Form("txtSearchFor") & "') ORDER BY RANK desc"
I also sorted the results by relevance. I did not measure the exact time cost, but it felt acceptable, and paging through the results was very fast. The biggest disadvantage seems to be that it can only index static pages.
In the afternoon, I built a full-text index in SQL Server 2000 on an older database with more than 500,000 records (mainly song names and artist names), so that I could start testing in the evening.
Test 1: "select top 26 * from song1 where contains(songtitle,'love')". The results are not processed in any way; they simply come back in ascending order of ID.
The time cost stays at roughly 0.016s. The speed is very satisfying; at least it does not feel slow.
Test 2: Sort by relevance using the rank value, with "order by rank desc" or "order by rank asc". The quality of the ordering is satisfactory: the results are fairly accurate, and this holds whether multiple keywords are combined with or or with and. The time cost, however, is unbearable to me. It is between 6s and 8s, and the CPU usage is relatively high.
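For reference, SQL Server 2000 exposes the rank value through the rowset functions CONTAINSTABLE and FREETEXTTABLE rather than through the CONTAINS predicate, so a ranked version of Test 1 can be sketched roughly as below. This is only a sketch under assumptions: it supposes that song1 has an integer key column named id that serves as the full-text key (the column name is a guess; the notes above only mention "IDs"). It also passes the optional top_n_by_rank argument so that only the 26 highest-ranked matches are returned. Whether that actually reduces the 6s to 8s seen here has not been measured.

```sql
-- Sketch only: assumes song1.id is the full-text key column (hypothetical name).
-- CONTAINSTABLE returns one row per match with the columns [KEY] and [RANK];
-- the last argument (top_n_by_rank) limits the result to the 26 best matches.
SELECT s.*, ft.[RANK]
FROM song1 AS s
INNER JOIN CONTAINSTABLE(song1, songtitle, 'love', 26) AS ft
    ON s.id = ft.[KEY]
ORDER BY ft.[RANK] DESC
```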
The relevance ranking of other search engines I have seen on the Internet is relatively fast. I have not studied the open source Lucene because I do not know Java.
But I think that if the relevance calculation were performed for each keyword at indexing time, the query would not be slow. I find this quite frustrating.
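The idea of scoring each keyword at indexing time can be sketched in plain T-SQL as follows. Everything here is hypothetical: the table song_keyword, its columns, and the assumption that some offline job fills it with one row per (keyword, song) pair and a precomputed score. It is not how the SQL Server full-text engine stores its index, only an illustration of the idea.

```sql
-- Hypothetical table: one row per (keyword, song) pair, scored when the index is built.
CREATE TABLE song_keyword (
    keyword  varchar(64) NOT NULL,
    score    int         NOT NULL,   -- precomputed relevance, higher is better
    song_id  int         NOT NULL,   -- assumed to reference song1.id (hypothetical name)
    CONSTRAINT PK_song_keyword
        PRIMARY KEY CLUSTERED (keyword, score DESC, song_id)
)

-- The rows for a keyword are stored in descending score order by the clustered key,
-- so the optimizer should be able to read the first 26 without sorting every match.
SELECT TOP 26 s.*, k.score
FROM song_keyword AS k
INNER JOIN song1 AS s ON s.id = k.song_id
WHERE k.keyword = 'love'
ORDER BY k.score DESC
```

This only covers a single keyword. Combining several keywords with or or and would require merging the per-keyword scores at query time, which the sketch does not attempt.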