Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

enhancement: add several improvements to local search #1612

Merged
merged 4 commits into from
Apr 22, 2017

Conversation

uchuhimo
Copy link
Contributor

@uchuhimo uchuhimo commented Apr 21, 2017

Improvements:

  • Let search input get focus automatically when search panel pops up

  • Clean search input and search result when search panel closes

  • When search panel pops up, click anywhere outside search panel can close it

  • Sort result by hit count in article. Demo:

    before:

    search-result-order-before

    after:

    search-result-order-after
  • Padding search icon and button. Comparison:

    before:

    search-panel-before

    after:

    search-panel-after
  • Support to highlight keywords in title. Demo:

    highlight-keywords-demo
  • [fix] Let titles be shown with correct letter case instead of all lower case. Demo:

    before:

    search-lowercase

    after:

    search-correct-case
  • [fix] When you input several keywords, all the keywords' result will be shown. Demo:

    before:

    search-show-one

    after:

    search-fix-letter-case
  • Support to show first n highlighted results per article. Originally, search panel can only show first one result per article. Add a new configuration item local_search.top_n_per_article to config it. Default value of local_search.top_n_per_article is 1 to keep compatibility. Demo:

    local_search.top_n_per_article = 0:

    search-show-0

    local_search.top_n_per_article = 1 (default):

    search-show-1

    local_search.top_n_per_article = 3:

    search-show-3

    local_search.top_n_per_article = -1 (show all results):

    search-show-all

    Note: I adjust the result concatenation algorithm to show specified number of results without overlapping highlighted word between adjacent results. It complicates the algorithm. If anyone has simpler solution, tell me ^_^

@ivan-nginx
Copy link
Collaborator

It's cool! It's exactly what i was want from searchbox. I will test it soon.

@uchuhimo
Copy link
Contributor Author

@ivan-nginx Thanks for your testing. I cannot find all the corner cases by myself. Found any bug, tell me ^_^
P.S. surprised to find someone response to my pull request so quickly

@uchuhimo uchuhimo changed the title feature: add several improvements to local search enhancement: add several improvements to local search Apr 21, 2017
@ivan-nginx
Copy link
Collaborator

ivan-nginx commented Apr 21, 2017

@uchuhimo

  1. In current search and in your patch — all titles are lowercases:
    image
    Can u do their true cases, like in post titles?
    P.S. In search.xml is true cases.
  2. If we want to search by 2 (or more) words, script will search only by 1 word. On old search was same.
    image
    image

P.S. Need to test it on English language. I think what Chinese language worked fine by both this bugs.

@uchuhimo
Copy link
Contributor Author

uchuhimo commented Apr 22, 2017

@ivan-nginx I fix both of the bugs. Now title is shown with correct letter case, and result is shown with all keyword hits. Demo:

search-fix-letter-case

I also adjust the algorithm to order results by the hit count in article.

I will update pull request later. Welcome for more bug reports.

- fix: let titles be shown with correct letter case instead of all lower
case
- fix: when you input several keywords, all the keywords' result will be
shown
- feature: sort result by hit count in article
@ivan-nginx
Copy link
Collaborator

@uchuhimo seems u profi. GJ and thank's!

@ivan-nginx ivan-nginx merged commit 340c984 into iissnan:master Apr 22, 2017
@uchuhimo uchuhimo deleted the feature-local-search branch April 22, 2017 16:55
@uchuhimo
Copy link
Contributor Author

@ivan-nginx

What does "profi" means? Google Translate tells me that "profi" is German and means "professional", is it right?

I guess "profi" is abbreviation of "professional" in English. But I'm not sure. I'm not a native English speaker.

Just out of curiosity ^_^

@ivan-nginx
Copy link
Collaborator

@uchuhimo > I guess "profi" is abbreviation of "professional" in English.
That's right.
Check your e-mail, please.

@ivan-nginx
Copy link
Collaborator

@uchuhimo u got e-mail? What u think?
Also, there is some bugs, i can reproducte it in test English site, if u hard to understand Russian (what i give u to test).

@uchuhimo
Copy link
Contributor Author

uchuhimo commented Apr 23, 2017

@ivan-nginx seems your e-mail is treated as spam by Outlook. I have checked them just now.

You report 2 bugs in your e-mails:

  • Search by multiwords not working or working not right
    Can you give me a search.xml in English site and some keywords to reproduce the bug? It will help me a lot.
  • With big search data, browser is not responding for a long seconds
    I did not tune it for performance. Since I have not written over 1000 line of JavaScript, I'm not sure I can solve it. But I will try my best. It will helps if you can send a large enough search.xml in English site for reproduction.

Are there any other bugs? Send them to me together. Thanks~

@ivan-nginx
Copy link
Collaborator

ivan-nginx commented Apr 23, 2017

@uchuhimo no no, that's all bugs for now. I just think what u don't understand this bugs or anotther any things, cause u not give some answers.

Can you give me a search.xml in English site and some keywords to reproduce the bug? It will help me a lot.

Yes, i will send it on your e-mail.

I'm not sure I can solve it.

Need just to add button to search at the left, for example. Issue is that when you press the buttons on keyboard — appear some delay and search do on each button pressed. I think, we may add some option for auto search and for search by press search button, that's what i mean.

@uchuhimo
Copy link
Contributor Author

@ivan-nginx

I just think what u don't understand this bugs or anotther any things, cause u not give some answers.

Sorry for the delay. I cannot get notification if a e-mail is treated as spam.

I think, we may add some option for auto search and for search by press search button, that's what i mean.

How about trigger search by pressing enter? It seems more intuitive than pressing search button, which is on the left side and hard to reach (at least for me).

@uchuhimo
Copy link
Contributor Author

@ivan-nginx

article is find by word "you", but not by multiwords "you can".

OK, it's not a bug. The logic is:

  1. "you can" will be split into "you" and "can", all the position of them will be found
  2. Since the default value of local_search.top_n_per_article is 1, search panel will show the first hit, which contains not more than 100 characters. Thus "you can find the answer in troubleshooting or you can ask me on GitHub" is not shown.

You can set local_search.top_n_per_article to 2 to get your desired result in this case (or set local_search.top_n_per_article to -1 to get all results).

The problem is: should I explain "you can" in this way? "you" + "can" or "you can", which is desired?

@ivan-nginx
Copy link
Collaborator

ivan-nginx commented Apr 23, 2017

@uchuhimo this is simple example. The problem is with search in search.xml at sites with big data (example you get by e-mail). For example, if we want to search by multiwords "you can give it from code" and we have many many other articles with words in start with "you" or "you can" — this articles will be shown firstly, and, probally, needed article with "you can give it from code" not be shown.
By the way, if we want start search with multiwords "you can give it from code", search must shown in 1 results article, what contain this multiwords firstly, and, if other separated words exists, must shown it secondly, thirdly, etc.

Another words: i want to find "we wait for car" ("мы ждём машину"), and i have many many results by words "we" ("мы"), but needed article with this multiwords not exists in list. This multiwords wasn't find what u want.
image

The problem with search multiwords is: need see at top in list that article, which absolutely contain this multiwords.

@ivan-nginx
Copy link
Collaborator

ivan-nginx commented Apr 23, 2017

@uchuhimo > OK, it's not a bug.
I understand, what this is not a bug. But, this is not search by multiwords what i need with big data (over 100 contains there). And if i set -1, it may delay for 1-2 minutes at each key pressed. :)

@ivan-nginx
Copy link
Collaborator

ivan-nginx commented Apr 23, 2017

@uchuhimo u understand what i mean? Give me answer, please.

@uchuhimo
Copy link
Contributor Author

uchuhimo commented Apr 23, 2017

@ivan-nginx I get what you mean. To solve this problem completely, I have to search every combination of keywords, and then sort them by their match rate. By splitting N keywords and search each of them, I scan all the articles by N passes; By combining N keywords and search each of the combinations, I scan all the articles by 2^N passes, which is apparently unacceptable, especially when the number of articles is huge.

There are several solutions/workarounds:

  • Use Natural language processing (NLP) to reduce the possible combination of keywords. But NLP will consume a lot of CPU/memory resources. It's not realistic for browser environment.
  • Use "" to imply that "don't split it". It's Google's search rule, but not intuitive enough.
  • Search both of the original string and the split string, and give higher priority for results containing the original string. It means, if you search "we wait for car", results contain "we wait for car" will be shown first, and then the others. However, results containing "we wait" or "wait for car" are not guaranteed to show before results containing "we" only.

I think the third proposal is most suitable. How about you? Any better suggestion?

@ivan-nginx
Copy link
Collaborator

ivan-nginx commented Apr 23, 2017

@uchuhimo i'm for 3rd variant, of course. Need to show firstly by 100% contains multiwords, and others.. others results may be shown after main search keys. Search is search — if i search by "1 2 3 4 5" — search must me give article, what include "1 2 3 4 5" firstly with 100% contain, isn't it?

However, results containing "we wait" or "wait for car" are not guaranteed to show before results containing "we" only.

This is no matter. If i search by template of words and get that's what i searched — this is already victory for search. Other "we" or "we wait" no matter, if i get my searched article firstly by searched keywords.

P.S. I talk for now from user-side only. ;)

@uchuhimo
Copy link
Contributor Author

@ivan-nginx OK, I will add two improvements in the next pull request:

  • Give higher priority for results containing the whole search string
  • add a new flag local_search.auto_search, search will be triggered by pressing enter if the value is false (It's the default behavior considering sites containing many posts)

Any supplement?

@ivan-nginx
Copy link
Collaborator

ivan-nginx commented Apr 23, 2017

Give higher priority for results containing the whole search string.

Yes, it's will be nice. I'm talking about it.

add a new flag local_search.auto_search, search will be triggered by pressing enter if the value is false

And i may add to this option button. Will be button or Enter. Yep.

@uchuhimo
Copy link
Contributor Author

@ivan-nginx I submit a new pull request for these 2 improvements, check it please.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants