0000106: Change in search should not take ! & other kinds of signs - AniDB Bug Tracker

ID	Project	Category	View Status	Date Submitted	Last Update

0000106	HTML & PERL	Feature Request - Database	public	2005-02-25 17:29	2014-09-02 15:13

Reporter	OnegaiNL	Assigned To	DerIdiot
Priority	normal	Severity	trivial	Reproducibility	always
Status	resolved	Resolution	fixed

Summary	0000106: Change in search should not take ! & other kinds of signs
Description	I discovered this annoyance when I searched for Yakitate Japan: the title is actually Yakitate!! Japan, so nothing was found. I think the search should be changed so that it either ignores such signs in the name or treats them as optional. Optional would probably work better, since some anime series have similar names and differ only by a sign.
Tags	No tags attached.

exp 2005-02-25 22:20 administrator ~0000217	and how would that look in SQL?

OnegaiNL 2005-02-25 22:59 reporter ~0000227	hehe i dunno :P, you're the coders here not me ^_^ so i don't know how to code it >.<

egg 2005-02-26 00:40 reporter ~0000229	Well, if you use 'similar to' instead of 'ilike' and replace each space ' ' with '[^A-Za-z0-9]+', that would still allow word-ending searches with the trailing space AND it would match even if there is punctuation where the space was. Unfortunately 'similar to' is case sensitive, so you would have to do something like:
'%[yY][aA][kK][iI][tT][aA][tT][eE][^A-Za-z0-9]+[jJ][aA][pP][aA][nN]%'
Also, to handle the case people have been complaining about with a trailing space: say you put in 'word ', this would match a non-alphanumeric or the end of the title:
name SIMILAR TO '%[wW][oO][rR][dD]([^A-Za-z0-9]+%|$)'
I just checked the docs and PostgreSQL supports regular expressions, so you could have something like:
name ~* 'yakitate\\W+japan'
name ~* 'word(\\W|$)'
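To make the pattern-building idea from this note concrete, here is a minimal sketch in Python rather than SQL. It is not AniDB code: the function name `tolerant_pattern` and the constant `NON_ALNUM` are hypothetical, and it uses Python's `re` module in place of PostgreSQL's `~*` operator. It joins the query words with `[^A-Za-z0-9]+` and, when the query ends in a space, requires a non-alphanumeric or the end of the title after the last word.

```python
import re

# Character class from the note: anything that is not an ASCII letter or digit.
NON_ALNUM = "[^A-Za-z0-9]"

def tolerant_pattern(query: str) -> re.Pattern:
    """Build a case-insensitive pattern that accepts punctuation between words.

    Hypothetical helper for illustration only.
    """
    words = query.split()
    # Each run of whitespace in the query may match any run of non-alphanumerics.
    body = (NON_ALNUM + "+").join(re.escape(w) for w in words)
    # A trailing space means "the last word must end here".
    if words and query[-1:].isspace():
        body += "(?:" + NON_ALNUM + "|$)"
    return re.compile(body, re.IGNORECASE)

if __name__ == "__main__":
    print(bool(tolerant_pattern("yakitate japan").search("Yakitate!! Japan")))
    print(bool(tolerant_pattern("word ").search("wordplay")))
```

Compiling with `re.IGNORECASE` plays the role of `~*`, so the `[yY][aA]...` expansion needed for 'similar to' is not required here.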

egg 2005-02-26 01:46 reporter ~0000232 Last edited: 2005-02-26 01:58	Another way to do it would be to create another column for searching, built by doing things like:
1) Convert to lower case
2) Convert accented characters to non-accented ones???
3) Strip out non-alphanumerics
4) Append a space
5) Strip out consecutive spaces
I would leave the original search in there, so that if the user did enter some formatting it would still match. When you search the search_name column, you would apply the same rules to the query that you used to create the data for that column. For instance, if you did a search for "Yakitate! Japan", the system would do the following; the first part would not find anything, but the second one would:
name ilike '%Yakitate! Japan%' or search_name like '%yakitate japan%'
One way to do it is to only run the second part of the query if the first one doesn't return any results, with a message like: "No entries matched your query, did you mean these:"
After an initial conversion this column would be relatively easy to maintain; you would just have to update it when a title gets updated. I think this system is more versatile and probably faster than the other (I don't know the efficiency of DB regexps...), but it would mean additional storage space for the column and index.
Actually the ideal solution would be to do the normal search, display the results and put a [find more] link on the screen. If there were no results, or the user clicks [find more], then do the secondary search on the search column, but broken up by word, so it would be:
search_name LIKE '%yakitate%' AND search_name LIKE '%japan%'
That way, even if there was another word in there or the order was different, the anime could still be found.
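The five normalization steps and the per-word AND lookup from this note could be sketched roughly as follows. This is an illustration under assumptions, not the AniDB implementation: `make_search_name` and `matches_all_words` are hypothetical names, accents are removed via Unicode NFKD decomposition (one possible reading of step 2), and plain substring tests stand in for the `LIKE '%word%'` clauses.

```python
import re
import unicodedata

def make_search_name(title: str) -> str:
    """Derive the value for a search_name column from a title (hypothetical helper)."""
    lowered = title.lower()                                   # 1) lower case
    decomposed = unicodedata.normalize("NFKD", lowered)       # 2) split accents off letters...
    unaccented = "".join(ch for ch in decomposed
                         if not unicodedata.combining(ch))    #    ...and drop the accent marks
    alnum_only = re.sub(r"[^a-z0-9 ]", "", unaccented)        # 3) strip non-alphanumerics
    with_space = alnum_only + " "                             # 4) append a space
    return re.sub(r" +", " ", with_space)                     # 5) collapse consecutive spaces

def matches_all_words(query: str, search_name: str) -> bool:
    """Per-word AND search: every normalized query word must occur in search_name."""
    words = make_search_name(query).split()
    return bool(words) and all(word in search_name for word in words)

if __name__ == "__main__":
    stored = make_search_name("Yakitate!! Japan")
    print(repr(stored))
    print(matches_all_words("Japan, Yakitate", stored))
```

Because the query goes through the same `make_search_name` rules as the stored titles, punctuation and word order in the query no longer matter for the secondary search.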

exp 2005-02-26 08:33 administrator ~0000235	Well, I am not sure how big the performance hit of something like this would be. But the [find more] (automatically activated on 0 results) approach sounds pretty useful. I'd probably do:
1. the current search
on find more:
2. ilike search with one word at a time (AND)
3. regexp search which ignores special chars
if there are still no results and the search query contained multiple words:
4. ilike search with one word at a time (OR)
BYe! EXP
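One reading of this staged fallback, sketched in Python over an in-memory list of titles instead of the real database. Everything here is an assumption for illustration: the function name `find_titles`, the returned stage number, and the interpretation that the multiple-words condition applies to stage 4 only. Substring tests on lower-cased strings stand in for `ilike`.

```python
import re

def find_titles(query: str, titles: list[str]) -> tuple[int, list[str]]:
    """Return (stage that produced hits, matching titles); stage 0 means no hits.

    Hypothetical sketch of the four-stage search, not AniDB code.
    """
    q = query.strip().lower()
    words = q.split()

    # Stage 1: the current search, a case-insensitive substring match.
    hits = [t for t in titles if q in t.lower()]
    if hits:
        return 1, hits

    # Stage 2: every query word must appear somewhere in the title (AND).
    hits = [t for t in titles if all(w in t.lower() for w in words)]
    if hits:
        return 2, hits

    # Stage 3: regexp that ignores special characters; only the alphanumeric
    # runs of the query are kept, separated by any run of other characters.
    runs = re.findall(r"[A-Za-z0-9]+", q)
    if runs:
        pattern = re.compile("[^A-Za-z0-9]+".join(runs), re.IGNORECASE)
        hits = [t for t in titles if pattern.search(t)]
        if hits:
            return 3, hits

    # Stage 4: only for multi-word queries, any single word may match (OR).
    if len(words) > 1:
        hits = [t for t in titles if any(w in t.lower() for w in words)]
        if hits:
            return 4, hits

    return 0, []

if __name__ == "__main__":
    catalogue = ["Yakitate!! Japan", "Japan Sinks", "Naruto"]
    for example in ["yakitate", "Yakitate Japan", "Yakitate!!! Japan", "Bleach Japan"]:
        print(example, "->", find_titles(example, catalogue))
```

Returning the stage alongside the hits would let the page show a "did you mean these" notice whenever the results came from a fallback stage rather than the normal search.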

ninjamask 2007-08-09 00:28 updater ~0001422	done?

exp 2007-08-09 05:34 administrator ~0001427	done? after merely 2 years? you're way too optimistic :P this is anidb we're talking about }:o) well, at least it has been decided that the search definitely needs improvement. a couple of ideas have popped up, but so far nothing has been implemented (aside from the search assist feature which also aims at improving search usability). BYe! EXP

DerIdiot 2014-09-02 15:12 administrator ~0003408	use the fulltext search which will be more prominent in the next release

Date Modified	Username	Field	Change
2005-02-25 17:29	OnegaiNL	New Issue
2005-02-25 22:20	exp	Note Added: 0000217
2005-02-25 22:59	OnegaiNL	Note Added: 0000227
2005-02-26 00:40	egg	Note Added: 0000229
2005-02-26 01:46	egg	Note Added: 0000232
2005-02-26 01:58	egg	Note Edited: 0000232
2005-02-26 08:33	exp	Note Added: 0000235
2005-02-26 08:33	exp	Assigned To	=> exp
2005-02-26 08:33	exp	Status	new => acknowledged
2007-08-09 00:28	ninjamask	Note Added: 0001422
2007-08-09 05:34	exp	Note Added: 0001427
2007-12-06 11:11	epoximator	Category	Bug Report - Database => Feature Request - Database
2011-02-02 21:44	exp	Assigned To	exp =>
2014-09-01 13:24	DerIdiot	Relationship added	duplicate of 0000448
2014-09-02 15:12	DerIdiot	Note Added: 0003408
2014-09-02 15:12	DerIdiot	Status	acknowledged => resolved
2014-09-02 15:12	DerIdiot	Resolution	open => fixed
2014-09-02 15:12	DerIdiot	Assigned To	=> DerIdiot