Today we’re reviewing Algolia. This blog post is part of our review series where we uncover best-in-class SaaS solutions for developers. Reviewing other API services helps us come up with ideas for improving Stream, our API for building scalable and personalized feeds. You can try an interactive tutorial of Stream here.
Getting Started with Algolia
Algolia is a hosted, full-text, numerical, and faceted search engine capable of delivering real-time results in milliseconds. Built for developers, Algolia provides a powerful API that lets you quickly and seamlessly implement real-time search within your website or application, without dealing with or having to maintain infrastructure.
Algolia offers a 14-day free trial for new users with no credit card required, and they even offer a Hacker plan that is 100% free for up to 10,000 records and 100,000 operations per month.
To better acquaint ourselves with the Algolia service, we’ll create a simple application to guide you through the process of constructing a search experience for your users. For this guide, you’ll need an Algolia account, which can be created by following their signup process and selecting a datacenter.
Once your sign-up is complete, create your search experience with these steps:
- Create your index
- Import the data you want to be searchable
- Configure the relevance of results you would like to receive
- Implement the search UI using one of Algolia’s front-end API clients
Creating an Index & Importing Your Data to Search
The first step to building your search experience on Algolia is configuring the data that you would like to search. This data will be modeled into an index that holds records. An index is an entity within Algolia into which you import data through the indexing process, and which you query later through the search process. The records are the individual data objects containing the attributes that we will search and filter on.
To create an index, click on “Create Index” under the “Indices” section of the dashboard. From there, you’ll be provided with a prompt to name your index. For the purpose of this tutorial, let’s go with “programming-languages” as our index name.
Index Your Data
There are many ways to accomplish indexing, but the recommended way is to import existing data into Algolia. The initial import and index creation can be done with an API client or the online dashboard.
Algolia performs well with semi-structured data. In most cases it's plug-and-play. As your data (and needs) become more complex, some formatting of the data beforehand can ensure that you are using Algolia in the most effective way. Check out this guide for more information on indexing your data.
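As a small illustration of formatting data beforehand, here is a minimal sketch that turns a raw row into a record. The helper name toRecord and the name and popularity attributes are hypothetical and chosen for this tutorial's dataset; the one fixed point is that Algolia identifies each record by its objectID.

```javascript
// Hypothetical helper: normalize a raw row into a record ready for indexing.
function toRecord(row) {
  const name = String(row.name).trim();
  return {
    // objectID is Algolia's unique record identifier; deriving it from the
    // language name is an assumption made for this example.
    objectID: name.toLowerCase().replace(/\s+/g, "-"),
    name,
    // Store numeric attributes as numbers (not strings) so they can be used
    // for ranking or numerical filtering; missing values fall back to 0.
    popularity: Number(row.popularity) || 0,
  };
}
```

Keeping the objectID stable across imports means a later update of the same language overwrites the existing record instead of creating a duplicate.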
Option 1: Using the Dashboard
After naming your new index, the navigation system will change to let you manually enter data or upload a file. On the left sidebar under the "Browse" tab, click on "Add New Records", and add your records via an import file (JSON or CSV).
Shown below is the import step, using a file named languages.json (GitHub Gist). This file represents an index of popular programming languages that we will use as our search dataset.
Option 2: Using the API
Algolia allows you to create and manage indices using their API. To learn more on how to index data using an API client or integration, check out the documentation here.
To proceed with the API option, you’ll need to get ahold of your API credentials, which can be found under the “API Keys” section in the left navigation.
For this example, we’ve written a sample script with Node.js using the official Algolia NPM package to loop over the languages and import each language into the "programming-languages" index.
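A minimal sketch of such an import might look like the following. It assumes an index object that exposes saveObjects(records), as the object returned by initIndex does in the official JavaScript client (v4), and it sends records in batches rather than one at a time. The function names and the batch size are illustrative.

```javascript
// Split an array into chunks of at most `size` items.
function chunk(items, size) {
  const chunks = [];
  for (let i = 0; i < items.length; i += size) {
    chunks.push(items.slice(i, i + size));
  }
  return chunks;
}

// Import records batch by batch and return how many were sent.
// `index` is assumed to expose saveObjects(records) returning a promise.
async function importLanguages(index, languages, batchSize = 1000) {
  let imported = 0;
  for (const batch of chunk(languages, batchSize)) {
    await index.saveObjects(batch); // each record should carry an objectID
    imported += batch.length;
  }
  return imported;
}
```

In a real script, index would typically come from something like algoliasearch(appId, adminApiKey).initIndex("programming-languages"), using the credentials from the “API Keys” section, and languages would be the parsed contents of languages.json.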
Now that we have data imported, we can use the “Browse” tab to query the dataset or view the data as raw JSON.
After the initial import, the index needs to be kept in sync with your database. This can be accomplished by implementing synchronization logic using one of the API clients. The process should automatically replicate additions, deletions, and updates from your database to your Algolia indices.
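One way to picture that synchronization logic is as a diff between what the database holds and what the index holds. The sketch below is an assumption about how you might structure it, not a feature of the API clients: planSync is a hypothetical helper, and both sides are assumed to be keyed by objectID.

```javascript
// Work out which records need to be saved (new or changed) and which
// objectIDs need to be deleted so the index matches the database.
function planSync(dbRows, indexedRecords) {
  const indexed = new Map(indexedRecords.map((r) => [r.objectID, r]));
  const seen = new Set();
  const toSave = [];

  for (const row of dbRows) {
    seen.add(row.objectID);
    const existing = indexed.get(row.objectID);
    // JSON comparison is a simplification: it is sensitive to key order.
    if (!existing || JSON.stringify(existing) !== JSON.stringify(row)) {
      toSave.push(row);
    }
  }

  // Anything indexed but no longer present in the database gets removed.
  const toDelete = indexedRecords
    .map((r) => r.objectID)
    .filter((id) => !seen.has(id));

  return { toSave, toDelete };
}
```

The resulting plan could then be applied with the client's saveObjects and deleteObjects calls. For large datasets, pushing changes as they happen in your application is usually cheaper than diffing everything.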
Configuring Your Relevance
“Searchable Attributes” (named attributesToIndex in older versions of the API and searchableAttributes in current ones) lets us list the attributes we want the engine to search, ordered by importance.
Searching Against Our Dataset
The returned result set would look something like this:
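As a rough illustration, and assuming the response shape of the official JavaScript client (v4), a search resolves to an object with a hits array plus metadata such as nbHits. The sketch below runs a query and keeps only the fields needed for display. The searchLanguages helper and the name attribute are hypothetical.

```javascript
// Run a query against the index and summarize the response.
// `index.search(query, params)` is assumed to return a promise resolving to
// an object with at least `hits` (array of records) and `nbHits` (total).
async function searchLanguages(index, query) {
  const response = await index.search(query, { hitsPerPage: 10 });
  return {
    total: response.nbHits,
    names: response.hits.map((hit) => hit.name),
  };
}
```

Each hit is the stored record itself, so any attribute you imported (plus its objectID) is available on it.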
Settings can be customized to tune the search behavior. For example, you can add a custom sort by popularity (assuming that we had included popularity in our dataset during the indexing phase) to the already great built-in relevance:
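A minimal sketch of such a configuration, assuming an index object that exposes setSettings(settings) and a dataset with name, description and popularity attributes (the attribute names are hypothetical):

```javascript
// Build the relevance settings: which attributes to search, in order of
// importance, and a custom ranking on a numeric attribute.
function buildRelevanceSettings() {
  return {
    // Matches in `name` weigh more than matches in `description`.
    searchableAttributes: ["name", "description"],
    // Sort by popularity, highest first, as part of the custom criterion.
    customRanking: ["desc(popularity)"],
  };
}

// Push the settings to the index (assumed to expose setSettings).
async function configureRelevance(index) {
  const settings = buildRelevanceSettings();
  await index.setSettings(settings);
  return settings;
}
```

The same two settings can be edited from the dashboard; doing it in code mainly helps when you want the configuration under version control.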
Building The UI
Now that we have a newly created index and a ranking strategy including Searchable Attributes and Custom Ranking, creating a user-friendly UI is as simple as leveraging some of the provided libraries from Algolia, including autocomplete.js to build dropdown menus, and instantsearch.js to build as-you-type results pages.
Instant Search Library
instantsearch.js is a library of UI widgets to help you build the best instant-search experience for your users. For example, we can easily implement an instant search on our programming languages index that displays matching languages as the user types. Instant Search can be expanded with facets, filters and more to build an amazing search experience.
Algolia supports typo tolerance through a custom algorithm that is based on the Damerau–Levenshtein distance. This custom algorithm provides more relevant results in an as-you-type search experience than more traditional techniques like tokenization, lemmatization and stemming.
When Algolia compares two words, it counts one typo every time a character is missing, unnecessary, or substituted, or when two characters are transposed. Since it’s very rare that typing mistakes happen on the first letter, Algolia counts two typos if the first letter is involved.
Note: Uppercase/lowercase, accents, and other special characters are always ignored, and never counted as a typo.
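To make the counting rule concrete, here is a minimal sketch. It is not Algolia's actual algorithm: it uses the optimal string alignment variant of the Damerau–Levenshtein distance, ignores case and accents, and approximates the first-letter rule by adding one extra typo whenever the first letters differ. The function names are hypothetical.

```javascript
// Lowercase the word and strip combining accent marks.
function normalizeWord(word) {
  return word.normalize("NFD").replace(/[\u0300-\u036f]/g, "").toLowerCase();
}

// Count typos between two words: one per missing, extra, substituted or
// transposed character, plus one more if the first letters differ.
function countTypos(query, candidate) {
  const a = normalizeWord(query);
  const b = normalizeWord(candidate);

  // d[i][j] = edits needed to turn a[0..i) into b[0..j)
  const d = [];
  for (let i = 0; i <= a.length; i++) {
    d.push(new Array(b.length + 1).fill(0));
    d[i][0] = i;
  }
  for (let j = 0; j <= b.length; j++) d[0][j] = j;

  for (let i = 1; i <= a.length; i++) {
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      d[i][j] = Math.min(
        d[i - 1][j] + 1, // unnecessary character in the query
        d[i][j - 1] + 1, // missing character in the query
        d[i - 1][j - 1] + cost // substituted character
      );
      // Two adjacent characters swapped count as a single typo.
      if (i > 1 && j > 1 && a[i - 1] === b[j - 2] && a[i - 2] === b[j - 1]) {
        d[i][j] = Math.min(d[i][j], d[i - 2][j - 2] + 1);
      }
    }
  }

  const distance = d[a.length][b.length];
  const firstLetterPenalty = distance > 0 && a[0] !== b[0] ? 1 : 0;
  return distance + firstLetterPenalty;
}
```

With this sketch, countTypos("pyhton", "python") is 1 (a transposition), countTypos("Python", "python") is 0 (case is ignored), and countTypos("jython", "python") is 2 (the first letter is involved).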
Typos are taken into account in the typo criterion of the Ranking Formula. By default, this criterion is at the first position of the Formula, which makes it the most impactful on the ranking.
When the typo criterion is in the first position of the Ranking Formula, you’ll never see an object with more typos ranked higher than an object with fewer typos. We highly recommend keeping this default configuration, as it is relevant in the vast majority of use cases.
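The effect of putting a criterion first can be sketched as a tie-breaking sort: later criteria only matter between results that are equal on all earlier ones. The example below is a simplification that uses just two criteria, and the typos and popularity fields are hypothetical (the real formula includes more criteria than these).

```javascript
// Order hits by number of typos (fewest first); popularity only breaks
// ties between hits that have the same number of typos.
function rankHits(hits) {
  return [...hits].sort(
    (a, b) => a.typos - b.typos || b.popularity - a.popularity
  );
}
```

Because the typo comparison runs first, a hugely popular record with one typo still ranks below an obscure record with none.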
Alright, so we’ve covered some of the basics around what Algolia is capable of, and we’ve indexed our programming languages. Let’s work on retrieving it using a sample application:
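A minimal sketch of the retrieval side, assuming an index object whose search(query, params) returns a promise resolving to an object with a hits array. It builds a source function of the form (query, callback), which is the shape a dropdown library such as autocomplete.js is assumed to accept for a custom dataset source. The helper name makeLanguageSource is hypothetical.

```javascript
// Build a source function that a dropdown widget can call on every
// keystroke. It asks the index for at most `maxResults` hits and hands
// them to the widget's callback.
function makeLanguageSource(index, maxResults = 10) {
  return function source(query, callback) {
    index
      .search(query, { hitsPerPage: maxResults })
      .then(
        (response) => response.hits,
        () => [] // if the search fails, show an empty dropdown
      )
      .then(callback);
  };
}
```

In the browser, the index would be created with a search-only API key rather than the admin key used for importing, and the source would be wired to a text input by the UI library of your choice.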
Once complete, you’ll have a functional autocomplete box! Type in a programming language of your choice, and you’ll see up to 10 results come back in your result set.
Alternatives to Algolia
If you’re not 100% satisfied with Algolia, there are alternatives that accomplish similar results, such as Solr (open-source and self-hosted) or Elasticsearch (open-source or hosted). Both of these are built on Apache Lucene, and their search syntax is very similar.
Amazon Elasticsearch Service provides a fully managed Elasticsearch service which makes it easy to deploy, operate, and scale Elasticsearch for log analytics, full-text search, application monitoring, and more. Amazon Elasticsearch offers built-in integrations with Kibana, Logstash, and AWS services including Amazon Kinesis Firehose, AWS Lambda, and Amazon CloudWatch so that you can go from raw data to actionable insights quickly.
Final Algolia Review Verdict
We hope that this Quick Start helped you get up to speed with Algolia. As with many of the services we recommend, Algolia is dead simple to integrate. We hope you feel the same way we do about the power and simplicity of Algolia and would love to hear your thoughts below.
PS. You might also like our ImgIX review
Also published on Medium.