## What does this PR do?
This should fix #3164.
The problem is that `httpx` keeps making breaking changes to its library, so we just have to adjust the code a little to make it work with the new version of the library.
## Related issues
Closes #3164
* [mod] /image_proxy: don't decompress images
* [fix] image_proxy: always close the httpx response
previously, when the content type was not an image or some other error
occurred, the httpx response was not closed
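For illustration, a minimal sketch of the close-on-every-path idea, not searx's actual image_proxy code, assuming an httpx streaming request:

```python
import httpx

client = httpx.Client(http2=False)  # HTTP/1, see the next commit

def fetch_image(url: str) -> bytes:
    # stream the response so headers can be inspected before reading the body
    resp = client.send(client.build_request('GET', url), stream=True)
    try:
        if not resp.headers.get('Content-Type', '').startswith('image/'):
            raise ValueError('not an image')
        return resp.read()
    finally:
        resp.close()  # also runs on the error paths, releasing the connection
```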
* [mod] /image_proxy: use HTTP/1 instead of HTTP/2
httpx: HTTP/2 is slow when a lot of data is downloaded.
https://github.com/dalf/pyhttp-benchmark
also, using HTTP/1 decreases the load average
* [mod] searx.utils.dict_subset: rewrite with comprehension
Co-authored-by: Alexandre Flament <alex@al-f.net>
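For reference, a comprehension-based dict_subset can be as small as this sketch (the exact signature in searx.utils may differ):

```python
def dict_subset(dictionary, properties):
    """Pick the given keys out of a dict, silently skipping absent keys."""
    return {k: dictionary[k] for k in properties if k in dictionary}
```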
* [enh] Add Tineye reverse image search
Other optional parameters:
"&sort=crawl_date" can be appended to search_string to sort results by date.
"&domain=example.org" can be appended to search_string to get results from just one domain.
Public instances could get timed out relatively quickly (for 3600 s).
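A sketch of how those parameters could be appended (the search_string layout is hypothetical, not the engine's exact code):

```python
search_string = '/result_json/?page={page}&url={url}'  # hypothetical base
search_string += '&sort=crawl_date'     # sort results by date
search_string += '&domain=example.org'  # results from just one domain
```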
* [enh] Add Tineye to settings.yml
Check if that's the right shortcut.
* [mod] Fix checks
* [mod] Try to fix checks
* [mod] Use four spaces for indentation
And set paging back to True
Co-authored-by: Noémi Ványi <kvch@users.noreply.github.com>
In case of a CAPTCHA, raise a SearxEngineCaptchaException and suspend for 7 days.
When get_sc_code() fails, raise a SearxEngineResponseException and suspend for
7 days.
[1] https://github.com/searxng/searxng/pull/695
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
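A minimal sketch of how the exceptions might be raised, assuming the searx.exceptions classes shown and a suspended_time argument in seconds (the CAPTCHA detection here is illustrative):

```python
from searx.exceptions import (
    SearxEngineCaptchaException,
    SearxEngineResponseException,
)

SUSPEND_TIME = 7 * 24 * 3600  # 7 days

def response(resp):
    if 'captcha' in str(resp.url):  # illustrative CAPTCHA detection
        raise SearxEngineCaptchaException(suspended_time=SUSPEND_TIME)
    ...

def get_sc_code():
    ...
    # when the token cannot be scraped, give up for a while as well
    raise SearxEngineResponseException('sc code not found')
```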
Startpage has introduced new anti-scraping measures that make SearXNG instances
run into CAPTCHAs:
1. some arguments have been removed and a new `sc` argument has been added.
2. the search path changed from `do/search` to `sp/search`
3. a POST request is no longer needed
Closes: https://github.com/searxng/searxng/issues/692
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
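Put together, the adjusted request could look like this sketch (argument names are illustrative; the reverse engineered details may differ):

```python
from urllib.parse import urlencode

def request(query, params):
    args = {
        'query': query,
        'page': params['pageno'],
        'sc': get_sc_code(),  # the new `sc` token, scraped beforehand
    }
    params['url'] = 'https://www.startpage.com/sp/search?' + urlencode(args)
    params['method'] = 'GET'  # a POST request is no longer needed
    return params
```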
## What does this PR do?
Gives the user the possibility to search their own prowlarr instances.
Info: https://wiki.servarr.com/en/prowlarr
GitHub: https://github.com/Prowlarr/Prowlarr
## Why is this change important?
Prowlarr searches multiple upstream search providers, thus allowing that functionality to be used through searx.
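A minimal sketch of the engine request, assuming Prowlarr's v1 search endpoint and API-key header (base_url and api_key would come from settings.yml):

```python
from urllib.parse import urlencode

base_url = 'http://localhost:9696'  # assumption: a local Prowlarr instance
api_key = 'XXX'                     # set per instance in settings.yml

def request(query, params):
    params['url'] = f'{base_url}/api/v1/search?{urlencode({"query": query})}'
    params['headers']['X-Api-Key'] = api_key
    return params
```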
* [enh] Add autocompleter from Brave
Raw response example: https://search.brave.com/api/suggest?q=how%20to:%20with%20j
Headers are needed in order to get a 200 response, thus Searx user-agent is used.
Another URL parameter could be '&rich=false' or '&rich=true'.
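A minimal sketch of the suggest call, assuming the OpenSearch-style response shape `[query, [suggestions, ...]]` seen in the raw example above (the client and User-Agent handling are illustrative):

```python
import httpx

def brave_suggestions(query: str) -> list:
    resp = httpx.get(
        'https://search.brave.com/api/suggest',
        params={'q': query},
        # a browser-like User-Agent is needed to get a 200 response
        headers={'User-Agent': 'Mozilla/5.0 (X11; Linux x86_64)'},
    )
    resp.raise_for_status()
    return resp.json()[1]
```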
Languages are supported by mapping the language to a domain. If the domain is
not found in :py:obj:`lang2domain`, the URL ``<lang>.search.yahoo.com`` is used.
Closes #3020
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
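A minimal sketch of the fallback (the lang2domain entries shown are illustrative):

```python
lang2domain = {
    'zh_chs': 'hk.search.yahoo.com',  # illustrative special cases
    'zh_cht': 'tw.search.yahoo.com',
}

def base_domain(lang):
    # default: <lang>.search.yahoo.com
    return lang2domain.get(lang, f'{lang}.search.yahoo.com')
```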
Add a configurable setting to rank search results higher when part of the
domain (e.g. 'en' in 'en.wikipedia.org' or 'de' in 'beispiel.de')
matches the selected search language. Does not apply to e.g. 'be' in
'youtube.com'.
Closes #206
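A minimal sketch of the whole-label comparison, not searx's actual code: 'en' matches the first label of 'en.wikipedia.org' and 'de' the last label of 'beispiel.de', while 'be' never matches inside 'youtube.com' because labels are compared as a whole:

```python
from urllib.parse import urlparse

def domain_matches_language(url: str, lang: str) -> bool:
    labels = urlparse(url).netloc.lower().split('.')
    return lang in (labels[0], labels[-1])

assert domain_matches_language('https://en.wikipedia.org/wiki/Foo', 'en')
assert domain_matches_language('https://beispiel.de/', 'de')
assert not domain_matches_language('https://youtube.com/', 'be')
```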
The patch introduced earlier broke the behaviour for instance
admins running searx from packages. This fix aims to provide
compatibility for everyone.
Closes #3061
The implementation uses the Qwant API (https://api.qwant.com/v3). The API is
undocumented but can be reverse engineered by reading the network log of
https://www.qwant.com/ queries.
This implementation is used by different qwant engines in the settings.yml::

  - name: qwant
    categories: general
    ...
  - name: qwant news
    categories: news
    ...
  - name: qwant images
    categories: images
    ...
  - name: qwant videos
    categories: videos
    ...
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
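A minimal sketch of an API query built from those settings (the endpoint layout is inferred from the network log; names are illustrative):

```python
from urllib.parse import urlencode

category_to_keyword = {
    'general': 'web', 'news': 'news',
    'images': 'images', 'videos': 'videos',
}

def request(query, params):
    keyword = category_to_keyword[params['category']]  # illustrative lookup
    args = {'q': query, 'count': 10, 'offset': (params['pageno'] - 1) * 10}
    params['url'] = f'https://api.qwant.com/v3/search/{keyword}?{urlencode(args)}'
    return params
```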
* [fix] Update about section of Invidious
Another website and new documentation
* [fix] Correct engine name for Rumble (#11)
* [fix] Wording for Filtron error message (#12)
- Change the Libgen provider and use https by default.
- Uncomment Urbandictionary but disable it by default; it is working.
- Uncomment Ebay as it is working correctly.
(For Ebay in the future: base_url should be configurable from settings.yml, just like peertube or invidious.)
searx/data/engines_languages.json stores language information for
several search engines in a JSON-encoded dict that maps engine "types" to
their supported languages; for instance there is an entry "google",
mapping to the supported languages of the google engine.
However, the lookup code did not use the engine 'type' (as in: the
filename searx/engines/<enginetype>.py), but instead the manually
configured engine name from settings.yml when querying. This is
problematic as soon as users start to specify additional engine
instances with custom names in the config file, as for instance
suggested as a workaround for multilingual search in the manual[0]:
> engines:
> - name : google english
> engine : google
> language : english
Here, the engine name "google english" will be used for the lookup in
the json file, where it does not exist. The empty supported_languages then
leads to a type error later in the processing call chain.
This patch changes the behaviour to use the engine's entry "type"
("google" in the above example) for the lookup. This should fix bug #2928.
0: https://searx.github.io/searx/user/search_syntax.html#multilingual-search
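A minimal sketch of the corrected lookup (names are illustrative):

```python
import json

with open('searx/data/engines_languages.json', encoding='utf-8') as f:
    engines_languages = json.load(f)

# before: keyed on the user-chosen name from settings.yml
# engines_languages.get('google english')  # -> None, breaks later
# after: keyed on the engine module, i.e. searx/engines/<enginetype>.py
supported_languages = engines_languages.get('google')
```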
## What does this PR do?
Fixes the self_info plugin to support uppercase IP queries.
## Why is this change important?
This PR solves the mild annoyance of retyping IP in lowercase.
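A minimal sketch of the case-insensitive match (the plugin's real pattern may differ):

```python
import re

# compile once; IGNORECASE accepts "ip", "IP", "Ip", ...
ip_query = re.compile('^ip', re.IGNORECASE)

def is_ip_query(query: str) -> bool:
    return bool(ip_query.search(query.strip()))
```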
## Related issues
Closes #2888
Implement a scraper for DuckDuckGo-Lite [1]. The existing DuckDuckGo [2]
engine does not support paging. DuckDuckGo-Lite is much faster, less verbose,
and does have a paging option (reverse engineered from the input form of [1]).
[1] https://lite.duckduckgo.com/lite
[2] https://duckduckgo.com/
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
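A minimal sketch of the paging request (form field names are illustrative, reverse engineered from the lite input form):

```python
def request(query, params):
    params['url'] = 'https://lite.duckduckgo.com/lite'
    params['method'] = 'POST'
    params['data'] = {'q': query}
    if params['pageno'] > 1:
        # the form posts an offset to fetch the next page of results
        params['data']['s'] = (params['pageno'] - 1) * 20
    return params
```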
Suggestions should be added too.
suggestion_xpath: //div[@class="text-gray h6"]/a
You can try it with:
!brave recurzuoin
Suggested-by: @allendema in https://github.com/searx/searx/issues/2857#issuecomment-904837023
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
BTW, add an about section to the YAML configuration
It now shows descriptions with their correct URLs when there are videos in the
search results, pulling content_xpath from snippet-description instead of
snippet-content.
Suggested-by: @eagle-dogtooth https://github.com/searx/searx/issues/2857#issuecomment-869119968
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Instead of raising an exception and thereby hiding all results of the engine,
it makes sense to remove that requirement in order to allow the implementation of
search engines that do not always have a description. In fact, some search
engines that have a description in 99% of cases, like Brave Search or Mojeek,
crash completely if for some reason they include a result with no description.
To test this patch try Mojeek:
!mjk xyz
before and after the patch.
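A minimal sketch of the relaxed extraction, assuming an allow_none style flag in searx.utils.extract_text (names are illustrative):

```python
from searx.utils import eval_xpath, extract_text

def result_description(dom_result, content_xpath):
    # before: a missing description raised and hid all of the engine's results
    # after: tolerate it and fall back to an empty string
    return extract_text(eval_xpath(dom_result, content_xpath), allow_none=True) or ''
```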
Suggested-by: 0xhtml in https://github.com/searx/searx/discussions/2933
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>