Why not let the user choose the character set to index at runtime? That is, the user could pick from options ("ASCII only", "ASCII with @ and .", "Unicode French only", etc.). Each index would cover only a small set of characters, so the index file stays small, but because the user picks which set to use, you still have the power to search multiple languages.

--
SA Jesse Kornblum
Chief, Computer Investigations and Operations
Air Force Office of Special Investigations
DSN 857-1143 Commercial 240-857-1143
Fax 857-0963 STU-III 857-0965
email: jesse.kornblumat_private
siprnet: jesse.kornblumat_private
http://afosi-web/xos/xosi/

-----------------------------------------------------------------
This list is provided by the SecurityFocus ARIS analyzer service.
For more information on this free incident handling, management
and tracking system please see: http://aris.securityfocus.com
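
A rough sketch of how runtime selection might look, in Python. The option names are the examples from the message; what each set actually contains (letters and digits for "ASCII only", the accented letters listed for French) and the build_index helper are assumptions made for illustration, not part of any existing tool.

```python
import string
from collections import defaultdict

# Assumption: "ASCII only" means ASCII letters and digits.
ASCII_ALNUM = set(string.ascii_letters + string.digits)

# Hypothetical option table; the keys are the examples from the message.
CHARSETS = {
    "ASCII only": ASCII_ALNUM,
    "ASCII with @ and .": ASCII_ALNUM | set("@."),
    # Assumption: French is ASCII plus the usual accented letters.
    "Unicode French only": ASCII_ALNUM
    | set("àâæçéèêëîïôœùûüÿÀÂÆÇÉÈÊËÎÏÔŒÙÛÜŸ"),
}


def build_index(text, charset_name):
    """Map each lowercased token to the offsets where it starts.

    A token is a maximal run of characters from the chosen set. Every
    other character is a separator and is never stored in the index.
    """
    allowed = CHARSETS[charset_name]
    index = defaultdict(list)
    start = None
    # The trailing NUL is a sentinel that flushes the last token.
    for pos, ch in enumerate(text + "\0"):
        if ch in allowed:
            if start is None:
                start = pos
        elif start is not None:
            index[text[start:pos].lower()].append(start)
            start = None
    return dict(index)
```

With "ASCII only", the text "mail bob@example.com now" is indexed as five short tokens (mail, bob, example, com, now). With "ASCII with @ and ." the whole address becomes a single searchable token. The trade-off is the one described in the message: a smaller set gives fewer distinct tokens and a smaller index, at the cost of what can be searched.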
This archive was generated by hypermail 2b30 : Tue May 27 2003 - 08:53:40 PDT