View Issue Details

ID: 0000106
Project: HTML & PERL
Category: Feature Request - Database
View Status: public
Last Update: 2014-09-02 15:13
Reporter: OnegaiNL
Assigned To: DerIdiot
Priority: normal
Severity: trivial
Reproducibility: always
Status: resolved
Resolution: fixed
Summary: 0000106: Change in search should not take ! & other kinds of signs
Description: I discovered this annoyance because I searched for Yakitate Japan, but the series is actually called Yakitate!! Japan, so nothing was found. I think the search should be changed so that it either doesn't treat those signs as part of the name or treats them as optional. I think optional would work better, since some anime series have similar names and differ only by a sign.
Tags: No tags attached.

Relationships

duplicate of 0000448 (closed, DerIdiot): fuzzy search

Activities

exp

2005-02-25 22:20

administrator   ~0000217

and how would that look in SQL?

OnegaiNL

2005-02-25 22:59

reporter   ~0000227

hehe i dunno :P, you're the coders here not me ^_^
so i don't know how to code it >.<

egg

2005-02-26 00:40

reporter   ~0000229

Well, if you use 'similar to' instead of 'ilike' and replace each space ' ' with '[^A-Za-z0-9]+', that would still allow word-ending searches with the trailing space AND it would match even if there is punctuation where the space was. Unfortunately 'similar to' is case sensitive, so you would have to do something like:
'%[yY][aA][kK][iI][tT][aA][tT][eE][^A-Za-z0-9]+[jJ][aA][pP][aA][nN]%'

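Also, to handle the case people have been complaining about with a trailing space: say you put in 'word '. This should match 'word' followed by a non-alphanumeric, or 'word' at the end of the title. 'similar to' has to match the whole string and has no '$' anchor, so the end-of-title case needs its own pattern:
name SIMILAR TO '%[wW][oO][rR][dD][^A-Za-z0-9]%' OR name SIMILAR TO '%[wW][oO][rR][dD]'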

I just checked the docs and PostgreSQL supports regular expressions, so you could have something like:
name ~* 'yakitate\\W+japan'
name ~* 'word(\\W|$)'
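The two regex patterns above can be tried outside the database. Here is a minimal sketch in Python, assuming its `re` module behaves closely enough to PostgreSQL's `~*` for patterns this simple. `build_pattern` is a hypothetical helper for illustration, not something from the AniDB code.

```python
import re

def build_pattern(query: str) -> re.Pattern:
    """Turn a user query into a case-insensitive regex.

    The gap between two words becomes \\W+, so spaces or any punctuation
    between the words still match. A trailing space in the query means
    "the last word must end here".
    """
    words = query.split()
    body = r"\W+".join(re.escape(w) for w in words)
    if query.endswith(" "):
        body += r"(\W|$)"
    return re.compile(body, re.IGNORECASE)

titles = ["Yakitate!! Japan", "Wordsworth", "One Word"]

# Punctuation between the words no longer blocks the match.
print([t for t in titles if build_pattern("yakitate japan").search(t)])

# Trailing space: "word" must be followed by a non-word char or the end.
print([t for t in titles if build_pattern("word ").search(t)])
```

Escaping each word with `re.escape` keeps characters the user typed from being read as regex syntax; the same concern applies to any pattern built from user input in SQL.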

egg

2005-02-26 01:46

reporter   ~0000232

Last edited: 2005-02-26 01:58

Another way to do it would be to create another column for searching that would do things like:
1) Convert to lower case
2) Convert accented characters to non-accented ones???
3) Strip out non-alphanumerics
4) Append a space
5) Strip out consecutive spaces

I would leave the original search in there so that in case they did enter some formatting it would match. When you search the search_name column, then you would apply the same rules you had used to create the data for that column.
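The rules for filling such a column could look roughly like this. It is only a sketch in Python under a few assumptions: step 3 replaces non-alphanumerics with a space instead of deleting them (so words stay separated), only ASCII letters and digits are kept (titles in other scripts would need different handling), and `make_search_name` is a made-up name.

```python
import re
import unicodedata

def make_search_name(name: str) -> str:
    """Build the normalized value for a hypothetical search_name column."""
    # 1) Convert to lower case.
    text = name.lower()
    # 2) Convert accented characters to non-accented ones:
    #    decompose them, then drop the combining marks.
    text = unicodedata.normalize("NFKD", text)
    text = "".join(ch for ch in text if not unicodedata.combining(ch))
    # 3) Replace runs of non-alphanumerics with a single space
    #    (assumption: replace rather than delete, so words do not merge).
    text = re.sub(r"[^a-z0-9]+", " ", text)
    # 5) Strip consecutive/leading/trailing spaces, then
    # 4) append exactly one trailing space.
    return " ".join(text.split()) + " "

print(repr(make_search_name("Yakitate!! Japan")))
```

The same function would be applied to the user's query before comparing it against the column, so both sides follow identical rules.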

For instance if you did a search for "Yakitate! Japan", the system would do the following, the first part would not find anything, but the second one would...
name ilike '%Yakitate! Japan%' or search_name like '%yakitate japan%'

One way to do it is to only run the second part of the query if the first one doesn't return any results, and show its results under something like:
"No entries matched your query, did you mean these:"

After an initial conversion this column would be relatively easy to maintain; you would just have to refresh it whenever a title gets updated...

I think this system is more versatile and probably faster than the other (I don't know the efficiency of DB regexps...), but it would mean additional storage space for the column and index.

Actually the ideal solution would be to do the normal search and display the results and put a [find more] link on the screen. If there were no results, or the user clicks the [find more] then do the secondary search on the search column, but break it up by word, so it would be:

search_name LIKE '%yakitate%' AND search_name LIKE '%japan%'

So that even if there was another word in there or the order was different, the anime still could be found.
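A rough sketch of building that per-word query with bound parameters, in Python. The table name `anime`, the helper names, and the use of an in-memory SQLite database are all assumptions made so the example runs on its own; the real schema and database will differ.

```python
import re
import sqlite3

def split_words(query: str) -> list[str]:
    # Same idea as the search column: lower case, alphanumerics only.
    return re.sub(r"[^a-z0-9]+", " ", query.lower()).split()

def build_word_query(query: str) -> tuple[str, list[str]]:
    """Return SQL plus parameters: one LIKE clause per word, joined by AND."""
    words = split_words(query)
    if not words:
        # Nothing searchable was typed: return a query that matches no rows.
        return "SELECT name FROM anime WHERE 0", []
    where = " AND ".join("search_name LIKE ?" for _ in words)
    return f"SELECT name FROM anime WHERE {where}", [f"%{w}%" for w in words]

# Tiny throwaway database to try it out (hypothetical schema).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE anime (name TEXT, search_name TEXT)")
db.executemany("INSERT INTO anime VALUES (?, ?)", [
    ("Yakitate!! Japan", "yakitate japan "),
    ("Japan Sinks", "japan sinks "),
])

# Word order and punctuation in the query no longer matter.
sql, params = build_word_query("Japan, Yakitate")
rows = [r[0] for r in db.execute(sql, params)]
print(rows)
```

Because the words are reduced to letters and digits before they reach the query, they cannot contain LIKE wildcards, and passing them as parameters keeps user input out of the SQL text.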

exp

2005-02-26 08:33

administrator   ~0000235

well,

I am not sure how big the performance hit of something like this would be.
But the [find more] (automatically activated on 0 results) approach sounds pretty useful.
I'd probably do:
1. the current search
on find more:
2. ilike search with one word at a time (AND)
3. regexp search which ignores special chars
if there are still no results and the search query contained multiple words
4. ilike search with one word at a time (OR)
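The order of fallbacks listed above can be sketched as follows. This is an illustration in Python over an in-memory list of titles, not the actual AniDB search: plain substring tests stand in for `ilike`, and the sketch tries steps 2 and 3 one after the other and stops at the first step that finds anything, which is one possible reading of the plan. `tiered_search` is a hypothetical name.

```python
import re

def tiered_search(query: str, titles: list[str]) -> tuple[int, list[str]]:
    """Return (step number that matched, hits); (0, []) if nothing matched."""
    q = query.lower()
    words = q.split()

    # Step 3 pattern: word tokens joined by \W+, so special chars are ignored.
    tokens = re.findall(r"\w+", q)
    loose = (
        re.compile(r"\W+".join(re.escape(t) for t in tokens), re.IGNORECASE)
        if tokens else None
    )

    steps = [
        # 1. the current search: whole query as a case-insensitive substring
        lambda t: q in t.lower(),
        # 2. one word at a time, all words required (AND)
        lambda t: bool(words) and all(w in t.lower() for w in words),
        # 3. regexp search which ignores special chars
        lambda t: loose is not None and bool(loose.search(t)),
    ]
    if len(words) > 1:
        # 4. one word at a time, any word is enough (OR)
        steps.append(lambda t: any(w in t.lower() for w in words))

    for number, matches in enumerate(steps, start=1):
        hits = [t for t in titles if matches(t)]
        if hits:
            return number, hits
    return 0, []

titles = ["Yakitate!! Japan", "Japan Sinks", "Bread Club"]
print(tiered_search("japan yakitate", titles))
print(tiered_search("japan bread", titles))
```

Returning the step number lets the page say how loose the match was, for example to label results that only came from the OR fallback.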

BYe!
EXP

ninjamask

2007-08-09 00:28

updater   ~0001422

done?

exp

2007-08-09 05:34

administrator   ~0001427

done? after merely 2 years? you're way too optimistic :P
this is anidb we're talking about }:o)

well, at least it has been decided that the search definitely needs improvement. a couple of ideas have popped up, but so far nothing has been implemented (aside from the search assist feature which also aims at improving search usability).

BYe!
EXP

DerIdiot

2014-09-02 15:12

administrator   ~0003408

use the fulltext search which will be more prominent in the next release

Issue History

Date Modified Username Field Change
2005-02-25 17:29 OnegaiNL New Issue
2005-02-25 22:20 exp Note Added: 0000217
2005-02-25 22:59 OnegaiNL Note Added: 0000227
2005-02-26 00:40 egg Note Added: 0000229
2005-02-26 01:46 egg Note Added: 0000232
2005-02-26 01:58 egg Note Edited: 0000232
2005-02-26 08:33 exp Note Added: 0000235
2005-02-26 08:33 exp Assigned To => exp
2005-02-26 08:33 exp Status new => acknowledged
2007-08-09 00:28 ninjamask Note Added: 0001422
2007-08-09 05:34 exp Note Added: 0001427
2007-12-06 11:11 epoximator Category Bug Report - Database => Feature Request - Database
2011-02-02 21:44 exp Assigned To exp =>
2014-09-01 13:24 DerIdiot Relationship added duplicate of 0000448
2014-09-02 15:12 DerIdiot Note Added: 0003408
2014-09-02 15:12 DerIdiot Status acknowledged => resolved
2014-09-02 15:12 DerIdiot Resolution open => fixed
2014-09-02 15:12 DerIdiot Assigned To => DerIdiot