Same behaviour behaviour than Whoogle [1]. Only the google engine with the
"Default language" choice "(all)"" is changed by this patch.
When searching for a locate place, the result are in the expect language,
without missing results [2]:
> When a language is not specified, the language interpretation is left up to
> Google to decide how the search results should be delivered.
The query parameters are copied from Whoogle. With the ``all`` language:
- add parameter ``source=lnt``
- don't use parameter ``lr``
- don't add a ``Accept-Language`` HTTP header.
The new signature of function ``get_lang_info()`` is:
lang_info = get_lang_info(params, lang_list, custom_aliases, supported_any_language)
Argument ``supported_any_language`` is True for google.py and False for the other
google engines. With this patch the function now returns:
- query parameters: ``lang_info['params']``
- HTTP headers: ``lang_info['headers']``
- and as before this patch:
- ``lang_info['subdomain']``
- ``lang_info['country']``
- ``lang_info['language']``
[1] https://github.com/benbusby/whoogle-search
[2] https://github.com/benbusby/whoogle-search/releases/tag/v0.5.4
Make 'soft_max_redirects' configurable per Xpath engine::
- name : <engine-name>
engine : xpath
soft_max_redirects: 1
...
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
settings.yml:
* outgoing.networks:
* can contains network definition
* propertiers: enable_http, verify, http2, max_connections, max_keepalive_connections,
keepalive_expiry, local_addresses, support_ipv4, support_ipv6, proxies, max_redirects, retries
* retries: 0 by default, number of times searx retries to send the HTTP request (using different IP & proxy each time)
* local_addresses can be "192.168.0.1/24" (it supports IPv6)
* support_ipv4 & support_ipv6: both True by default
see https://github.com/searx/searx/pull/1034
* each engine can define a "network" section:
* either a full network description
* either reference an existing network
* all HTTP requests of engine use the same HTTP configuration (it was not the case before, see proxy configuration in master)
Springer Nature is a global publisher dedicated to providing service to research
community [1] with official API [2].
To test this PR, first get your API key following this page:
https://dev.springernature.com/signup
In searx/engines/springer.py at line 24, add this API key. I left my own key,
commented out in the line aboce. Feel free to use it, if needed.
[1] https://www.springernature.com/
[2] https://dev.springernature.com/
In the EU there exists a "General Data Protection Regulation" [1] aka GDPR (BTW:
very user friendly!) which requires consent to tracking. To get the consent
from the user, youtube requests are redirected to confirm and get a CONSENT
Cookie from https://consent.youtube.com
This patch adds a CONSENT Cookie to the youtube request to avoid redirection.
[1] https://en.wikipedia.org/wiki/General_Data_Protection_Regulation
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
Reported-by: https://github.com/searx/searx/issues/2774
I also found some items missing a thumbnail and I used text_extract for content
and title, to remove unneeded whitespaces.
BTW: added bandcamp's favicon
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
fr.wikipedia.org (and it seems not other wikipedia websites),
adds HTML to api_result['displayTitle'].
(Search for '!wp :fr Braid' for example)
The commit uses api_result['title']
The get_cliend_id() function:
* fetches https://soundcloud.com
* then fetches each referenced javascript URL to get the client id.
This commit fetches the javascript URLs in the reverse order: the client id is in the last javascript URL.