searx/searx/engines/gigablast.py


"""
Gigablast (Web)
@website https://gigablast.com
@provide-api yes (https://gigablast.com/api.html)
@using-api yes
@results JSON
@stable yes
@parse url, title, content
"""
# pylint: disable=missing-function-docstring, invalid-name
from time import time
from json import loads
from searx.url_utils import urlencode
# engine dependent config
categories = ['general']
paging = True
number_of_results = 10
language_support = True
safesearch = True
# search-url
base_url = 'https://gigablast.com/'
# do search-request
def request(query, params):  # pylint: disable=unused-argument
    # see API http://www.gigablast.com/api.html#/search
    # Note that the API has some quirks ..
    query_args = dict(
        c='main',
        n=number_of_results,
        format='json',
        q=query,
        # The gigablast HTTP client sends a random number and a nsga argument
        # (the values don't seem to matter)
        rand=int(time() * 1000),
        nsga=int(time() * 1000),
    )

    page_no = params['pageno'] - 1
    if page_no:
        # API quirk; adds +2 to the number_of_results
        offset = page_no * number_of_results
        query_args['s'] = offset

    if params['language'] and params['language'] != 'all':
        query_args['qlangcountry'] = params['language']
        query_args['qlang'] = params['language'].split('-')[0]

    if params['safesearch'] >= 1:
        query_args['ff'] = 1

    search_url = 'search?' + urlencode(query_args)
    params['url'] = base_url + search_url

    return params
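
# Illustrative sketch (not part of the engine; the query and params values are
# assumptions): for query='searx' and
# params={'pageno': 2, 'language': 'en-US', 'safesearch': 1},
# request() builds a URL along the lines of
#   https://gigablast.com/search?c=main&n=10&format=json&q=searx
#       &rand=...&nsga=...&s=10&qlangcountry=en-US&qlang=en&ff=1
# (rand and nsga are millisecond timestamps, so they change on every call).
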
# get response from search-request
def response(resp):
    results = []

    response_json = loads(resp.text)

    for result in response_json['results']:
        # see "Example JSON Output (&format=json)"
        # at http://www.gigablast.com/api.html#/search

        # sort out meaningless result
        title = result.get('title')
        if len(title) < 2:
            continue

        url = result.get('url')
        if len(url) < 9:
            continue

        content = result.get('sum')
        if len(content) < 5:
            continue

        # extend fields
        subtitle = result.get('title')
        if len(subtitle) > 3:
            title += " - " + subtitle

        results.append(dict(
            url=url,
            title=title,
            content=content,
        ))

    return results
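
# Minimal local-test sketch (an assumption, not part of the engine: resp is
# anything with a .text attribute, and the payload below is invented rather
# than real Gigablast output):
#
#   class _FakeResponse:
#       text = '{"results": [{"title": "foo", "url": "https://example.org/", "sum": "short summary"}]}'
#
#   print(response(_FakeResponse()))
#   # -> [{'url': 'https://example.org/', 'title': 'foo', 'content': 'short summary'}]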