searx/blog/intro-offline.html

174 lines
9.6 KiB
HTML

<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta name="viewport" content="width=device-width, initial-scale=1.0" />
<meta name="viewport" content="width=device-width, initial-scale=1">
<title>Preparation for offline engines &#8212; Searx Documentation (Searx-0.18.0.tex)</title>
<link rel="stylesheet" href="../_static/pygments.css" type="text/css" />
<link rel="stylesheet" href="../_static/searx.css" type="text/css" />
<link rel="stylesheet" type="text/css" href="../_static/tabs.css" />
<script id="documentation_options" data-url_root="../" src="../_static/documentation_options.js"></script>
<script src="../_static/jquery.js"></script>
<script src="../_static/underscore.js"></script>
<script src="../_static/doctools.js"></script>
<link rel="index" title="Index" href="../genindex.html" />
<link rel="search" title="Search" href="../search.html" />
<link rel="next" title="Limit access to your searx engines" href="private-engines.html" />
<link rel="prev" title="Searx admin interface" href="admin.html" />
<script>DOCUMENTATION_OPTIONS.URL_ROOT = '../';</script>
</head><body>
<div class="related" role="navigation" aria-label="related navigation">
<h3>Navigation</h3>
<ul>
<li class="right" style="margin-right: 10px">
<a href="../genindex.html" title="General Index"
accesskey="I">index</a></li>
<li class="right" >
<a href="../py-modindex.html" title="Python Module Index"
>modules</a> |</li>
<li class="right" >
<a href="private-engines.html" title="Limit access to your searx engines"
accesskey="N">next</a> |</li>
<li class="right" >
<a href="admin.html" title="Searx admin interface"
accesskey="P">previous</a> |</li>
<li class="nav-item nav-item-0"><a href="../index.html">Searx Documentation (Searx-0.18.0.tex)</a> &#187;</li>
<li class="nav-item nav-item-1"><a href="index.html" accesskey="U">Blog</a> &#187;</li>
<li class="nav-item nav-item-this"><a href="">Preparation for offline engines</a></li>
</ul>
</div>
<div class="document">
<div class="documentwrapper">
<div class="bodywrapper">
<div class="body" role="main">
<div class="section" id="preparation-for-offline-engines">
<h1>Preparation for offline engines<a class="headerlink" href="#preparation-for-offline-engines" title="Permalink to this headline"></a></h1>
<div class="section" id="offline-engines">
<h2>Offline engines<a class="headerlink" href="#offline-engines" title="Permalink to this headline"></a></h2>
<p>To extend the functionality of searx, offline engines are going to be
introduced. An offline engine is an engine which does not need Internet
connection to perform a search and does not use HTTP to communicate.</p>
<p>Offline engines can be configured as online engines, by adding those to the
<cite>engines</cite> list of <a class="reference external" href="https://github.com/searx/searx/blob/master/searx/settings.yml">settings.yml</a>. Thus, searx
finds the engine file and imports it.</p>
<p>Example skeleton for the new engines:</p>
<div class="highlight-python notranslate"><div class="highlight"><pre><span></span><span class="kn">from</span> <span class="nn">subprocess</span> <span class="kn">import</span> <span class="n">PIPE</span><span class="p">,</span> <span class="n">Popen</span>
<span class="n">categories</span> <span class="o">=</span> <span class="p">[</span><span class="s1">&#39;general&#39;</span><span class="p">]</span>
<span class="n">offline</span> <span class="o">=</span> <span class="bp">True</span>
<span class="k">def</span> <span class="nf">init</span><span class="p">(</span><span class="n">settings</span><span class="p">):</span>
<span class="k">pass</span>
<span class="k">def</span> <span class="nf">search</span><span class="p">(</span><span class="n">query</span><span class="p">,</span> <span class="n">params</span><span class="p">):</span>
<span class="n">process</span> <span class="o">=</span> <span class="n">Popen</span><span class="p">([</span><span class="s1">&#39;ls&#39;</span><span class="p">,</span> <span class="n">query</span><span class="p">],</span> <span class="n">stdout</span><span class="o">=</span><span class="n">PIPE</span><span class="p">)</span>
<span class="n">return_code</span> <span class="o">=</span> <span class="n">process</span><span class="o">.</span><span class="n">wait</span><span class="p">()</span>
<span class="k">if</span> <span class="n">return_code</span> <span class="o">!=</span> <span class="mi">0</span><span class="p">:</span>
<span class="k">raise</span> <span class="ne">RuntimeError</span><span class="p">(</span><span class="s1">&#39;non-zero return code&#39;</span><span class="p">,</span> <span class="n">return_code</span><span class="p">)</span>
<span class="n">results</span> <span class="o">=</span> <span class="p">[]</span>
<span class="n">line</span> <span class="o">=</span> <span class="n">process</span><span class="o">.</span><span class="n">stdout</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span>
<span class="k">while</span> <span class="n">line</span><span class="p">:</span>
<span class="n">result</span> <span class="o">=</span> <span class="n">parse_line</span><span class="p">(</span><span class="n">line</span><span class="p">)</span>
<span class="n">results</span><span class="o">.</span><span class="n">append</span><span class="p">(</span><span class="n">results</span><span class="p">)</span>
<span class="n">line</span> <span class="o">=</span> <span class="n">process</span><span class="o">.</span><span class="n">stdout</span><span class="o">.</span><span class="n">readline</span><span class="p">()</span>
<span class="k">return</span> <span class="n">results</span>
</pre></div>
</div>
</div>
<div class="section" id="development-progress">
<h2>Development progress<a class="headerlink" href="#development-progress" title="Permalink to this headline"></a></h2>
<p>First, a proposal has been created as a Github issue. Then it was moved to the
wiki as a design document. You can read it here: <a class="reference external" href="https://github.com/searx/searx/wiki/Offline-engines"> Offline-engines</a>.</p>
<p>In this development step, searx core was prepared to accept and perform offline
searches. Offline search requests are scheduled together with regular offline
requests.</p>
<p>As offline searches can return arbitrary results depending on the engine, the
current result templates were insufficient to present such results. Thus, a new
template is introduced which is caplable of presenting arbitrary key value pairs
as a table. You can check out the pull request for more details see
<a class="reference external" href="https://github.com/searx/searx/pull/1700">PR 1700</a>.</p>
</div>
<div class="section" id="next-steps">
<h2>Next steps<a class="headerlink" href="#next-steps" title="Permalink to this headline"></a></h2>
<p>Today, it is possible to create/run an offline engine. However, it is going to be publicly available for everyone who knows the searx instance. So the next step is to introduce token based access for engines. This way administrators are able to limit the access to private engines.</p>
</div>
<div class="section" id="acknowledgement">
<h2>Acknowledgement<a class="headerlink" href="#acknowledgement" title="Permalink to this headline"></a></h2>
<p>This development was sponsored by <a class="reference external" href="https://nlnet.nl/discovery">Search and Discovery Fund</a> of <a class="reference external" href="https://nlnet.nl/">NLnet Foundation</a> .</p>
<div class="line-block">
<div class="line">Happy hacking.</div>
<div class="line">kvch // 2019.10.21 17:03</div>
</div>
</div>
</div>
<div class="clearer"></div>
</div>
</div>
</div>
<span id="sidebar-top"></span>
<div class="sphinxsidebar" role="navigation" aria-label="main navigation">
<div class="sphinxsidebarwrapper">
<p class="logo"><a href="../index.html">
<img class="logo" src="../_static/searx_logo_small.png" alt="Logo"/>
</a></p>
<h3>Project Links</h3>
<ul>
<li><a href="https://github.com/searx/searx">Source</a>
<li><a href="https://github.com/searx/searx/wiki">Wiki</a>
<li><a href="https://searx.space">Public instances</a>
<li><a href="https://twitter.com/Searx_engine">Twitter</a>
<li><a href="https://github.com/searx/searx/issues">Issue Tracker</a>
</ul><h3>Navigation</h3>
<ul>
<li><a href="../index.html">Overview</a>
<ul>
<li><a href="index.html">Blog</a>
<ul>
<li>Previous: <a href="admin.html" title="previous chapter">Searx admin interface</a>
<li>Next: <a href="private-engines.html" title="next chapter">Limit access to your searx engines</a></ul>
</li>
</ul>
</li>
</ul>
<div id="searchbox" style="display: none" role="search">
<h3 id="searchlabel">Quick search</h3>
<div class="searchformwrapper">
<form class="search" action="../search.html" method="get">
<input type="text" name="q" aria-labelledby="searchlabel" />
<input type="submit" value="Go" />
</form>
</div>
</div>
<script>$('#searchbox').show(0);</script>
</div>
</div>
<div class="clearer"></div>
</div>
<div class="footer" role="contentinfo">
&#169; Copyright 2015-2020, Adam Tauber, Noémi Ványi.
Created using <a href="https://www.sphinx-doc.org/">Sphinx</a> 3.5.2.
</div>
<script src="../_static/version_warning_offset.js"></script>
</body>
</html>