61 Commits

Author SHA1 Message Date
Brett Kosinski 3c84af95ba
Fix scraping of 'sc' value from homepage (#3397)
Looking at the current HTML for the Startpage front page, the previous
footer logo element is no longer present.  This change scrapes the "sc"
parameter from one of the hidden HTML form elements, which should
(hopefully) be a bit more stable long term, since that form is used by
Startpage to submit requests to the engine.
2022-10-31 22:34:43 +01:00
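
The approach described in the commit above (reading the "sc" value from a hidden form element) can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the engine's actual code; the sample markup, the class name HiddenInputParser and the function name extract_sc are assumptions made up for this example.

```python
from html.parser import HTMLParser


class HiddenInputParser(HTMLParser):
    """Collect name/value pairs of <input type="hidden"> elements."""

    def __init__(self):
        super().__init__()
        self.hidden = {}

    def handle_starttag(self, tag, attrs):
        if tag != "input":
            return
        attributes = dict(attrs)
        if attributes.get("type") == "hidden" and "name" in attributes:
            self.hidden[attributes["name"]] = attributes.get("value", "")


def extract_sc(html_text):
    """Return the value of the hidden "sc" input, or None if it is absent."""
    parser = HiddenInputParser()
    parser.feed(html_text)
    return parser.hidden.get("sc")


# Hypothetical markup, invented for this example; the real page may differ.
SAMPLE = """
<form action="/sp/search" method="post">
  <input type="hidden" name="sc" value="abc123">
  <input type="text" name="query">
</form>
"""

print(extract_sc(SAMPLE))
```

Returning None when the field is missing lets the caller decide how to react, for example by raising an engine error.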
Kian-Meng Ang 629ebb426f
Fix typos (#3366)
Found via `codespell -S ./searx/translations,./searx/data,./searx/static -L ans,te,fo,doubleclick,tthe,dum`
2022-09-29 23:06:59 +02:00
Noémi Ványi 85034b49ef
Remove `httpx` and use `requests` instead (#3305)
## What does this PR do?

This PR prepares for removing `httpx`, and reverts back to `requests`.

## Why is this change important?

`httpx` hasn't proven itself to be faster or better than `requests`. On the other
hand it has caused issues on Windows.

=============================================
Please update your environment to use requests instead of httpx.
=============================================
2022-07-30 20:56:56 +02:00
Noémi Ványi 148090df12 Minor fixes to satisfy the linter 2022-01-21 17:59:10 +01:00
Alexandre Flament d592159cc5 [fix] startpage: workaround to use the startpage network
workaround for the issue #762
2022-01-21 17:59:10 +01:00
Markus Heiser 036d80ed20 [mod] startpage engine: add comment about Startpage's FFox add-on
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser a4bc089091 [fix] startpage engine: fetch CAPTCHA & issues related to PR-695
In case of CAPTCHA raise a SearxEngineCaptchaException and suspend for 7 days.
When get_sc_code() fails raise a SearxEngineResponseException and suspend for 7
days.

[1] https://github.com/searxng/searxng/pull/695

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
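
The error handling described in the commit above might look roughly like the sketch below. The two exception classes are hypothetical stand-ins for SearxEngineCaptchaException and SearxEngineResponseException, and recognising a CAPTCHA by the "/sp/captcha" path in the final URL is an assumption, not something the commit message states.

```python
SUSPEND_SECONDS = 7 * 24 * 3600  # 7 days, as stated in the commit message


class EngineCaptchaError(Exception):
    """Hypothetical stand-in for SearxEngineCaptchaException."""

    def __init__(self, message="CAPTCHA", suspended_time=SUSPEND_SECONDS):
        super().__init__(message)
        self.suspended_time = suspended_time


class EngineResponseError(Exception):
    """Hypothetical stand-in for SearxEngineResponseException."""

    def __init__(self, message="bad response", suspended_time=SUSPEND_SECONDS):
        super().__init__(message)
        self.suspended_time = suspended_time


def check_startpage_response(final_url, sc_code):
    """Raise an error carrying the suspension time when the request failed."""
    # Assumption: a CAPTCHA redirect is recognisable by this path fragment.
    if "/sp/captcha" in final_url:
        raise EngineCaptchaError()
    # An empty or missing sc code means get_sc_code() did not succeed.
    if not sc_code:
        raise EngineResponseError("could not get the sc code")
```

Carrying the suspension time on the exception lets whatever code catches it decide how long to stop querying the engine.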
Markus Heiser 1076d7e52e [fix] Get an actual `sc` argument from startpage's home page.
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser a6184ac32c [pylint] Startpage engine
Fix remarks from pylint

Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
Markus Heiser 4750586fb0 [fix] startpage engine - avoid captcha
Startpage has introduced new anti-scraping measures that make SearXNG instances
run into captchas:

1. some arguments have been removed and a new `sc` argument has been added.
2. search path changed from `do/search` to `sp/search`
3. POST request is no longer needed

Closes: https://github.com/searxng/searxng/issues/692
Signed-off-by: Markus Heiser <markus.heiser@darmarit.de>
2022-01-21 17:59:10 +01:00
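
The three changes listed in the commit above amount to building a plain GET URL on the `sp/search` path that carries the `sc` argument. A minimal sketch follows; the base URL constant, the `query` parameter name and the function name build_search_url are assumptions for illustration, and the real engine may send further arguments.

```python
from urllib.parse import urlencode

# Assumed base URL, used only for this illustration.
BASE_URL = "https://www.startpage.com"


def build_search_url(query, sc_code):
    """Build a GET URL on the new sp/search path, including the sc argument."""
    # "query" is an assumed parameter name; "sc" comes from the commit message.
    args = {"query": query, "sc": sc_code}
    return f"{BASE_URL}/sp/search?{urlencode(args)}"


print(build_search_url("hello world", "abc"))
```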
Alexandre Flament ca93a01844 [mod] dynamically set language_support variable
The language_support variable is set to True by default,
and set to False in only 5 engines.

Apart from the documentation and the /config URL, this variable is not used.

This commit removes the variable definition from the engines and
sets the value according to the length of supported_languages: False when the
length is 0, True otherwise.

Close #2485
2021-02-01 17:10:37 +01:00
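
The rule described in the commit above is small enough to show in a few lines. This is only a sketch of the stated rule; the function name derive_language_support is hypothetical and is not taken from the code base.

```python
def derive_language_support(supported_languages):
    """Return False when no languages are listed, True otherwise."""
    return len(supported_languages) > 0


print(derive_language_support([]))            # engine without language list
print(derive_language_support(["en", "de"]))  # engine with language list
```

Because only the length is inspected, the same function works whether an engine stores its languages as a list or as a dict.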
Alexandre Flament a4dcfa025c [enh] engines: add about variable
move meta information from the comment to the about variable
so the preferences and the documentation can show this information
2021-01-14 20:57:17 +01:00
lucky13820 fea8958e99
Fix StartPage result titles showing the URL
Fix issue 2395, where the StartPage result title shows the URL. https://github.com/searx/searx/issues/2395
2020-12-16 13:54:14 -08:00
joshu9h 8260435c8b
[Fix] Startpage 2020-12-13 15:43:50 +01:00
Alexandre Flament 3038052c79 [mod] remove unused import
use
from searx.engines.duckduckgo import _fetch_supported_languages, supported_languages_url  # NOQA
so it is possible to easily remove all unused import using autoflake:
autoflake --in-place --recursive --remove-all-unused-imports searx tests
2020-11-14 14:11:02 +01:00
Alexandre Flament 2006eb4680 [mod] move extract_text, extract_url to searx.utils 2020-10-02 18:13:56 +02:00
Marc Abonce Seguin 41800835f9 fetch supported languages for startpage engine 2020-09-22 11:37:44 +02:00
Spühler Stefan 4f90fb6a92 [Fix] Startpage ValueError on Spanish date format
dateutil.parser.parse() does not know the Spanish date format, which
leads to a ValueError. Fixes #1870

Traceback (most recent call last):
  File "/usr/local/searx/searx/search.py", line 160, in search_one_http_request_safe
    search_results = search_one_http_request(engine, query, request_params)
  File "/usr/local/searx/searx/search.py", line 97, in search_one_http_request
    return engine.response(response)
  File "/usr/local/searx/searx/engines/startpage.py", line 102, in response
    published_date = parser.parse(date_string, dayfirst=True)
  File "/usr/local/searx/searx-ve/lib/python3.6/site-packages/dateutil/parser/_parser.py", line 1358, in parse
    return DEFAULTPARSER.parse(timestr, **kwargs)
  File "/usr/local/searx/searx-ve/lib/python3.6/site-packages/dateutil/parser/_parser.py", line 649, in parse
    raise ValueError("Unknown string format:", timestr)
ValueError: ('Unknown string format:', '24 Ene 2013')
2020-03-09 09:31:20 +01:00
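
One way to tolerate a date string such as '24 Ene 2013' is to translate the month abbreviation before building the date and to fall back to None when the string is not understood. The sketch below shows that idea with the standard library only. It is not necessarily the fix applied in the commit above; the month table and the function name parse_day_month_year are assumptions for this example.

```python
from datetime import datetime

# Spanish month abbreviations mapped to month numbers (assumed table).
SPANISH_MONTHS = {
    "ene": 1, "feb": 2, "mar": 3, "abr": 4, "may": 5, "jun": 6,
    "jul": 7, "ago": 8, "sep": 9, "oct": 10, "nov": 11, "dic": 12,
}


def parse_day_month_year(date_string):
    """Parse 'DD Mon YYYY' with a Spanish month; return None on failure."""
    try:
        day, month, year = date_string.split()
        month_number = SPANISH_MONTHS[month.lower().rstrip(".")]
        return datetime(int(year), month_number, int(day))
    except (ValueError, KeyError):
        # Unknown format: let the caller skip the published date
        # instead of failing the whole response.
        return None


print(parse_day_month_year("24 Ene 2013"))
```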
Dalf 85b3723345 [mod] speed optimization
compile XPath only once
avoid redundant call to urlparse
get_locale(webapp.py): avoid useless call to request.accept_languages.best_match
2019-11-15 09:33:15 +01:00
Adam Tauber ed1c1bdb04 [fix] pep8 2019-10-14 15:09:39 +02:00
Adam Tauber 77a70fe541 [fix] update startpage engine - closes #1601 2019-10-14 14:18:41 +02:00
Noémi Ványi b63d645a52 Revert "remove 'all' option from search languages"
This reverts commit 4d1770398a.
2019-01-07 21:19:00 +01:00
Noémi Ványi aeb6dab187
Merge branch 'master' into master 2019-01-04 22:14:40 +01:00
Michael Pfitzner 44ce51f0c5 restore startpage search results 2018-12-14 21:38:48 +01:00
dimqua 0d86ed9c7e update startpage.py 2018-12-11 21:45:47 +03:00
marc 4d1770398a remove 'all' option from search languages 2017-12-06 01:20:15 -06:00
Adam Tauber 52e615dede [enh] py3 compatibility 2017-05-15 12:02:30 +02:00
marc f62ce21f50 [mod] fetch supported languages for several engines
utils/fetch_languages.py gets languages supported by each engine and
generates engines_languages.json with each engine's supported language.
2016-12-13 19:58:10 -06:00
marc a11948c71b Add language support for more engines. 2016-12-13 19:32:43 -06:00
marc 149802c569 [enh] add supported_languages on engines and auto-generate languages.py 2016-12-13 19:32:00 -06:00
Adam Tauber 16bdc0baf4 [mod] do not escape html content in engines 2016-12-09 18:59:19 +01:00
stepshal b3ab221b98 Fix anomalous backslash in string 2016-07-11 23:53:13 +07:00
Adam Tauber bd22e9a336 [fix] pep8 compatibility 2016-01-18 12:47:31 +01:00
Thomas Pointhuber 4508c96667 [enh] fix content fetching, parse published date from description 2015-10-24 16:19:47 +02:00
Thomas Pointhuber 996c96ffff [fix] block ixquick search URLs 2015-08-24 11:31:30 +02:00
Thomas Pointhuber 23b9095cbf [fix] improve result handling of startpage engine 2015-08-24 11:28:55 +02:00
Cqoicebordel f1c10f4fe4 Startpage's unit test 2015-02-06 17:31:10 +01:00
Cqoicebordel b4b666e703 Flake8 2015-01-15 20:27:30 +01:00
Cqoicebordel fa0330f0ff Fix startpage
Fix issue with unicode characters in startpage: we shouldn't urlencode them if we are using POST.
Should fix #169. @dimqua can you confirm?
2015-01-15 20:18:40 +01:00
Adam Tauber c8be128e97 [mod] ignore startpage unicode errors 2015-01-09 11:21:46 +01:00
Adam Tauber b1234ee889 [fix] startpage engine compatibility 2014-11-17 10:19:23 +01:00
Thomas Pointhuber 678a80f043 fix startpage engine and add comments
* add language support
* remove not required code
* improve google-ad detection (no false detection anymore, I hope)
* other improvements
2014-09-02 19:57:01 +02:00
Adam Tauber 111a86d355 [fix] html escape 2014-08-06 14:43:44 +02:00
asciimoo 7db4558de7 [mod][fix] startpage engine updates 2014-02-18 16:14:31 +01:00
asciimoo c1d7d30b8e [mod] len() removed from conditions 2014-02-11 13:13:51 +01:00
asciimoo 68a0832524 [enh] search language support updates 2014-01-31 05:10:49 +01:00
asciimoo 14f4083ba1 [fix] print removed 2014-01-30 02:13:43 +01:00
asciimoo 9a74113b1c [enh] startpage paging init 2014-01-30 02:10:32 +01:00
asciimoo 85b81be35b [fix] pep8 2014-01-24 09:35:27 +01:00
pw3t b82ba74a7d Merge branch 'ixquick' of https://github.com/pw3t/searx into ixquick
Conflicts:
	searx/engines/startpage.py
2014-01-23 22:17:19 +01:00