<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
    <title>Thomas Koch – Posts tagged "search"</title>
    <link href="https://blog.koch.ro/tags/search.atom.xml" rel="self" />
    <link href="https://blog.koch.ro" />
    <id>https://blog.koch.ro/tags/search.atom.xml</id>
    <author>
        <name>Thomas Koch</name>
        
        <email>thomas+blog@koch.ro</email>
        
    </author>
    <updated>2025-07-10T13:30:23Z</updated>
    <entry>
    <title>Rebuild search with trust</title>
    <link href="https://blog.koch.ro/posts/2024-01-20-rebuild-search-with-trust.html" />
    <id>https://blog.koch.ro/posts/2024-01-20-rebuild-search-with-trust.html</id>
    <published>2024-01-20T00:00:00Z</published>
    <updated>2025-07-10T13:30:23Z</updated>
    <summary type="html"><![CDATA[<div class="info">
    Posted on January 20, 2024
    
</div>
<div class="info">
    
    Tags: <a title="All pages tagged &#39;debian&#39;." href="/tags/debian.html" rel="tag">debian</a>, <a title="All pages tagged &#39;free software&#39;." href="/tags/free%20software.html" rel="tag">free software</a>, <a title="All pages tagged &#39;life&#39;." href="/tags/life.html" rel="tag">life</a>, <a title="All pages tagged &#39;search&#39;." href="/tags/search.html" rel="tag">search</a>, <a title="All pages tagged &#39;decentralization&#39;." href="/tags/decentralization.html" rel="tag">decentralization</a>
    
</div>

<p>Finally, there is something people can agree on:</p>
<ul>
<li>2023-08-28, OSNews: <a href="https://www.osnews.com/story/136829/the-end-of-the-googleverse/">The end of the Googleverse</a></li>
<li>2023-07-28, Cory Doctorow: <a href="https://pluralistic.net/2023/07/28/microincentives-and-enshittification/">Microincentives and Enshittification</a></li>
<li>2023-10-03, Cory Doctorow: <a href="https://pluralistic.net/2023/10/03/not-feeling-lucky/">Google’s enshittification memos</a></li>
<li>2024-01-15, Tim Bray: <a href="https://www.tbray.org/ongoing/When/202x/2024/01/15/Google-2024">Mourning Google</a></li>
</ul>
<p>Apparently, Google Search is not good anymore. And I’m not the only one
thinking about decentralization to fix it:</p>
<p><a href="https://media.ccc.de/v/37c3-lightningtalks-58060-honey-i-federated-the-search-engine-finding-stuff-online-post-big-tech">Honey I federated the search engine - finding stuff online post-big tech</a> - a
lightning talk at the recent Chaos Communication Congress</p>
<p>The speaker, however, did not mention <a href="https://en.wikipedia.org/wiki/Distributed_search_engine">that</a> <a href="https://wiki.p2pfoundation.net/Distributed_Search_Engines">there</a> <a href="https://blog.florence.chat/a-distributed-search-engine-for-the-distributed-web-39c377dc700e">have</a> <a href="https://web.archive.org/web/20230902052010/https://hackernoon.com/is-the-concept-of-a-distributed-search-engine-potent-enough-to-challenge-googles-dominance-l1s44t2">already</a> <a href="https://web.archive.org/web/20200914192255/https://github.com/nvasilakis/yippee">been</a> <a href="https://www.techdirt.com/2014/07/08/distributed-search-engines-why-we-need-them-post-snowden-world/">many</a>
<a href="https://github.com/kearch/kearch">attempts</a> at building distributed search engines. So why do I think that such
an attempt could finally succeed?</p>
<ul>
<li>More people are searching for alternatives to Google.</li>
<li>Mainstream hard disks are incredibly large.</li>
<li>Mainstream internet connections are incredibly fast.</li>
<li>Google is bleeding talent.</li>
<li>Most of the building blocks are available as free software.</li>
<li>“Success” depends on your definition…</li>
</ul>
<p>My definition of success is:</p>
<blockquote>
<p>A mildly technical computer user (able to install software) has access to a
search engine that provides them with superior search results compared to
Google for at least a few predefined areas of interest.</p>
</blockquote>
<p>The exact algorithm used by Google Search to rank websites is a secret even to
most Googlers. Still, it is clear that it relies heavily on big data: billions
of queries, a comprehensive web index and user behaviour data. None of this is
available to us.</p>
<p>A distributed search engine, however, can rely on user input instead. The
admin of each node seeds that node's ranking with a personal selection of
trusted sites. They connect their node with the nodes of people they trust. This
results in a web of (transitive) trust, much like PGP.</p>
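<p>To make the idea a bit more concrete, here is a minimal sketch of how such
transitive trust could be turned into site scores. Everything in it is an
assumption for illustration: the function name <code>site_trust</code>, the
multiplicative decay per hop, the hop limit, and the node and site names are
made up and are not taken from YaCy or any other existing project.</p>

```python
from collections import deque


def site_trust(start, peers, seeds, max_hops=3):
    """Return {site: score} as seen from the node `start`.

    peers: {node: {peer: trust}}  with trust in (0, 1]  (hypothetical format)
    seeds: {node: {site: score}}  with score in (0, 1]  (hypothetical format)

    Assumption: trust decays multiplicatively with every hop, and for each
    site the best score over all explored paths is kept.
    """
    best_node = {start: 1.0}            # strongest trust found for each node
    queue = deque([(start, 1.0, 0)])    # (node, trust in node, hops so far)
    scores = {}
    while queue:
        node, trust, hops = queue.popleft()
        # Sites seeded by this node, weighted by how much we trust the node.
        for site, score in seeds.get(node, {}).items():
            scores[site] = max(scores.get(site, 0.0), trust * score)
        if hops == max_hops:
            continue
        # Follow peer links only when they improve the trust in that peer.
        for peer, weight in peers.get(node, {}).items():
            via = trust * weight
            if via > best_node.get(peer, 0.0):
                best_node[peer] = via
                queue.append((peer, via, hops + 1))
    return scores


# Hypothetical example network: alice trusts bob, bob trusts carol.
peers = {"alice": {"bob": 0.8}, "bob": {"carol": 0.5}}
seeds = {
    "alice": {"a.example": 1.0},
    "bob": {"b.example": 0.9},
    "carol": {"b.example": 1.0, "c.example": 1.0},
}
ranking = site_trust("alice", peers, seeds)
```

<p>In this made-up network, alice's own seed keeps its full score, bob's seed
is discounted by alice's trust in bob (about 0.72), and carol's seed is
discounted twice (0.4). A real node would have to combine such scores with
text relevance and would need some protection against peers that turn out to
be untrustworthy.</p>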
<p>For comparison, imagine you are searching for something in a world without
computers: You ask the people around you. They probably forward your
question to their peers.</p>
<p>I have already had a look at <a href="https://yacy.net">YaCy</a>. It is active, somewhat usable and has a friendly
maintainer. Unfortunately, the codebase shows its age. It takes a newcomer a
lot of time to find their way around, and it contains a lot of
cruft. Nevertheless, YaCy is a good example that decentralized search
software can be built even by a small team or just one person.</p>
<p>I have started working on my own software in Haskell and keep my notes here:
<a href="https://de.populus.wiki/wiki/Populus:DezInV">Populus:DezInV</a>. Since I’m learning Haskell along the way, there is nothing
there to see yet. Additionally, I took a yak shaving break to learn <a href="/tags/nix.html">Nix</a>.</p>
<p>By the way: <a href="https://thepeoplesvoice.tv/google-lite-duckduckgo-signs-secret-deal-with-bill-gates-to-track-users-online/">DuckDuckGo is not the alternative</a>. And while I would encourage you
to also try Yandex for a second opinion, I don’t consider this a solution.</p>
]]></summary>
</entry>

</feed>
