28 November 2013

While working on an application with Riak, I was a bit uncomfortable with having to resort to hacky ways of storing data just to suit the querying methods available in Riak v1.4.

I found out that some really cool features are coming in Riak 2.0, including bundled Solr. This layer, it seems, is a ground-up rewrite codenamed Yokozuna, set to replace the search that came with Riak v1.4. Basho has a technical preview release, Riak 2.0pre5, available for download as of the time of writing.

UPDATE: I suggest you use the develop branch of the riak repo, which has tagged releases. As of writing, the latest tagged release is 2.0pre7. Clone the repo, run make rel inside it and enjoy your release of Riak in the rel/riak directory inside the repo.

Since this is just a preview release, the docs haven’t been updated much. Sometimes you’ll have to dig into the source code yourself. And unfortunately, at least for Erlang, there are no tagged versions of the Riak client that work with the preview releases.

Real stuff

Once you download the preview release for your OS, untar it (and compile it if required). For Mac OS X, the preview release ships compiled binaries.

cd into the directory and edit etc/riak.conf to turn on Yokozuna or Riak Search 2.0, depending on your pre-release version.

## use this for Riak 2.0pre5
yokozuna = on

## use this for Riak 2.0pre7
search = on

Also change the storage backend to leveldb. UPDATE: Eric Redmond says any memory backend will work fine. Thank you for the tip :]

storage_backend = leveldb
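Before starting the node, it can help to double-check that the edits actually landed in the file. Here is a minimal sketch, assuming a plain `key = value` layout like the lines above. The `conf_value` helper is a hypothetical name of mine, not something that ships with Riak, and it treats the key as a regular expression, so it is only meant for simple keys like these.

```shell
# Print the value of the last uncommented "key = value" line in a config file.
# $1 = key (for example: search, yokozuna, storage_backend)
# $2 = path to the config file (for example: etc/riak.conf)
# Hypothetical helper for illustration, not part of Riak.
conf_value() {
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}
```

For example, `conf_value storage_backend etc/riak.conf` should print leveldb if the setting above is in place.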

Start Riak using the binary bin/riak. You are all good now.
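A node can take a moment to come up, so here is a small sketch that polls `riak ping` until it answers. The `wait_for_riak` name, the retry count and the one-second pause are my own assumptions for illustration; only the `ping` subcommand comes from Riak itself.

```shell
# Poll "riak ping" until the node answers or the attempts run out.
# $1 = path to the riak script (defaults to bin/riak)
# $2 = number of attempts (defaults to 10)
# Returns 0 once ping succeeds, 1 if it never does.
wait_for_riak() {
  riak_bin="${1:-bin/riak}"
  attempts="${2:-10}"
  while [ "$attempts" -gt 0 ]; do
    if "$riak_bin" ping >/dev/null 2>&1; then
      return 0
    fi
    attempts=$((attempts - 1))
    # Pause between attempts, but not after the last one.
    [ "$attempts" -gt 0 ] && sleep 1
  done
  return 1
}
```

Used from the release directory, that would look like `bin/riak start && wait_for_riak && echo "node is up"`.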

The instructions for using the preview release can be found in the readme on the master branch of the Yokozuna GitHub repo. For Riak 2.0pre5, the exact readme version is this one. The commands below are a mix of both (hey, they seem to work, so why not?).

First create an index called my_index:

curl -XPUT -i 'http://localhost:8093/yz/index/my_index'

UPDATE: If you are using the latest tagged release from GitHub, use the URL http://localhost:8098/search/index/my_index instead.

In Riak 2.0, you can share bucket properties across buckets by creating bucket types. To create a bucket type and associate it with the index you just created, run this:

bin/riak-admin bucket-type create awesomeness '{"props":{"yz_index":"my_index"}}'
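As far as I know, Riak 2.0 also wants a newly created bucket type to be activated before buckets can use it. The sketch below wraps the create step with `activate` and `status`, which are `riak-admin bucket-type` subcommands in Riak 2.0. Whether your particular preview build needs the activate step is an assumption on my part, and both function names are hypothetical.

```shell
# Build the JSON props argument that ties a bucket type to a search index.
# The property name "yz_index" is the one used in the command above.
bucket_type_props() {
  printf '{"props":{"yz_index":"%s"}}\n' "$1"
}

# Create a bucket type for an index, activate it, then print its status.
# $1 = bucket type name, $2 = index name
# Assumes it is run from the release directory, next to bin/riak-admin.
create_search_type() {
  bin/riak-admin bucket-type create "$1" "$(bucket_type_props "$2")" &&
    bin/riak-admin bucket-type activate "$1" &&
    bin/riak-admin bucket-type status "$1"
}
```

With the names from this post, that would be `create_search_type awesomeness my_index`.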

Using bucket types with the Erlang client

The code snippets are in Elixir, but they should be pretty easy to read as Erlang. /me runs and hides

To associate a bucket with a bucket type, refer to the bucket as a nested resource of the bucket type.

So instead of /buckets/your_fav_bucket, it goes something like /bucket_types/awesomeness/buckets/your_fav_bucket/. The first variant still works, but for a bucket to be associated with a bucket type, you’ll have to use the long one. None of this matters if you are using one of the Riak clients for your language of choice.
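If you do end up talking HTTP by hand, the two path shapes can be sketched as a tiny helper. `bucket_path` is a hypothetical name of mine. The default prefix is the one quoted above, but the released Riak 2.0 docs use `types` as the prefix instead, so I left it overridable rather than guess which one your build expects.

```shell
# Build the HTTP path for a bucket, with or without a bucket type.
# $1 = bucket
# $2 = bucket type (optional)
# $3 = type prefix (optional, defaults to the "bucket_types" quoted in the post)
bucket_path() {
  if [ -n "${2:-}" ]; then
    printf '/%s/%s/buckets/%s/\n' "${3:-bucket_types}" "$2" "$1"
  else
    printf '/buckets/%s\n' "$1"
  fi
}
```

So `bucket_path your_fav_bucket awesomeness` gives the long form, and passing `types` as a third argument switches the prefix.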

To create an object in a bucket, without a bucket type, use the following:

:riakc_obj.new("bucket", :undefined, "Wow this is just as before")

If you have a bucket associated with a bucket type, the only change is that you have to pass the bucket type and the bucket as a tuple.

:riakc_obj.new({"awesomeness", "your_fav_bucket"}, :undefined, "And this is a small change")

You can also get the bucket type of a Riak object with:

:riakc_obj.bucket_type(obj)  # where obj is a riak object.

I found this stuff by reading the source code of the riak-erlang-client.

Querying Solr

You can now add objects to your Riak bucket and start using Solr to search. On localhost, the Solr URL would be:

http://localhost:8098/search/your_index?q=your_query

Replace your_index with the name of the index you would like to query, and your_query with the Solr query.
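Solr queries tend to contain colons, spaces and asterisks, so it is safer to let curl do the URL encoding than to paste the query into the URL yourself. A minimal sketch, assuming the localhost URL shape shown above. `search_url` and `search_riak` are hypothetical helper names, and `wt=json` is the standard Solr parameter for asking for a JSON response.

```shell
# Base URL for querying an index.
# $1 = index name
# $2 = Riak base URL (optional, defaults to the localhost address used above)
search_url() {
  printf '%s/search/%s\n' "${2:-http://localhost:8098}" "$1"
}

# Run a Solr query against an index over HTTP.
# $1 = index name, $2 = Solr query
# -G makes curl send the --data-urlencode pairs as a URL-encoded query string.
search_riak() {
  curl -sS -G "$(search_url "$1")" \
    --data-urlencode "q=$2" \
    --data-urlencode "wt=json"
}
```

For example, `search_riak my_index '*:*'` asks Solr for every document in the index.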

There’s no well-maintained Solr client in Erlang or Elixir. Until there is one, using the HTTP API directly is the only way. I’ll probably write one along the way while working on my application.

What’s cool?

Solr has very sophisticated querying mechanisms. But once you add Solr, at least in the Rails projects I’ve worked on, it becomes another thing to start up in production.

Riak not only starts up Solr, but also feeds it with data and syncs the index across the Riak cluster. I remember reading that it leverages Solr’s Distributed API.

If you are building an application, there are now only two things to start and manage: your application and Riak. This has to be the best Q4 news ever.

From Eric Redmond’s talk, which contains a quote from Ryan Zezeski’s talk…

Write it like Riak. Query it like Solr.

Thanks

  • Thanks to @RuneSkouLarsen for pointing out the correct url for index creation.
  • Thanks to lenary on the #Riak freenode channel for linking me to useful resources.
  • Thanks to Eric Redmond for the tip about being able to use any memory backend for Riak Search 2.0.