RavenDB - The Image Gallery Project (XIV) - Implementing a real-time tag search

Published on 2010-10-21

The code for this and all other entries in this series can be found here: http://github.com/robashton/RavenGallery/

We don’t currently have any search functionality in the image browser which means the browser is all but useless, so let’s look at adding that with some RavenDB magic.

In an RDBMS project, performing loose searches on tags and other fields of our documents would be a non-trivial operation in both implementation and execution. In RavenDB we can just execute a LINQ query and know that an index will be created in the background to make this operation ludicrously fast.

Some background: The Web bit

To demo the kind of functionality we can get from RavenDB, I’m going to implement a search-as-you-type system which constantly asks RavenDB for the search results for a given term.

Because I’m trying to do this as a more real-world example, I’m not going to cheat by doing partial page updates and passing XHTML all over the show with JavaScript. I’m going to rip apart my original controller action and make it return just the view.

Instead, I am going to use some client-side templating with jquery-tmpl, and call a service that returns the relevant view as a blob of JSON. Any change to the textbox just means this call gets made again and the view re-populated.

        public ActionResult Browse()
        {
            return View();
        }

        public ActionResult _GetBrowseData(ImageBrowseInputModel input)
        {
            var model = viewRepository.Load<ImageBrowseInputModel, ImageBrowseView>(input);
            return Json(model, JsonRequestBehavior.AllowGet);
        }

(Microsoft, in their infinite wisdom, don’t allow GET requests for JSON by default, to protect us from our own stupidity, hence the JsonRequestBehavior.AllowGet above. AGH!!!)

Along with:

    <script id="browsing-image-template" type="text/x-jquery-tmpl">
        <div class="browsing-image">
             <h4>${Title}</h4>
             <img src="/Resources/Image/${Filename}" alt="${Title}" />
        </div>
    </script>

    <div id="image-browser">
    </div>

and

    populateImageBrowser: function (page, pageSize, searchText) {
        // Encode the search text so characters such as & or # cannot break the query string
        var query = '?page=' + page
                + '&pageSize=' + pageSize
                + '&searchText=' + encodeURIComponent(searchText);

        $.ajax({
            dataType: "json",
            url: '/Image/_GetBrowseData' + query,
            error: function (xhr, ajaxOptions) {
                alert(xhr.status + ':' + xhr.responseText);
            },
            success: function (data) {
                // Render one template instance per returned item
                $('#browsing-image-template')
                    .tmpl(data.Items)
                    .appendTo('#image-browser');
            }
        });
    }

I’m going to have a textbox on the page and listen for changes every time a key is pressed, calling that method again to do the search each time. I’ll not bother covering that though, as it’s common enough functionality. I just wanted to show what I’m doing with my view now that I’ve moved to JSON instead of server-side HTML.
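For readers who do want to see roughly what that wiring could look like, here is a minimal sketch. It is not the code from the repository: `createSearchHandler`, the `#search-text` id and the `imageBrowser` object are names assumed purely for illustration, and the page size of 20 is just a placeholder. The handler skips key presses that don't change the text (arrow keys, shift and so on) so we don't fire pointless requests.

```javascript
// Hypothetical helper (not from the project): returns a handler that re-runs
// the search only when the textbox content has actually changed.
function createSearchHandler(browser, pageSize) {
    var lastText = null;
    return function (text) {
        if (text === lastText) {
            return false; // e.g. arrow keys or shift: nothing new to search for
        }
        lastText = text;
        // A new search term always starts again at the first page
        browser.populateImageBrowser(0, pageSize, text);
        return true;
    };
}

// Wiring sketch; it only runs where jQuery and the browser object exist.
// '#search-text' and 'imageBrowser' are assumed names.
if (typeof $ !== 'undefined' && typeof imageBrowser !== 'undefined') {
    var onSearchChanged = createSearchHandler(imageBrowser, 20);
    $('#search-text').keyup(function () {
        onSearchChanged($(this).val());
    });
}
```

Keeping the "has it changed?" logic out of the jQuery callback means it can be exercised without a page or a browser.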

Querying a document with collections in RavenDB using LINQ

I’ve added SearchText to my InputModel so that gets bound automatically via the query string, so all we need to do now is create an index that actually allows SearchText to be used.

I’ve added a test for this new SearchText property in the view factory integration test, which looks something like this:

        [Test]
        public void WhenLoadIsInvokedWithTagSearch_ExpectedResultsAreReturned()
        {
            PopulateStore();

            var result = this.ViewFactory.Load(new ImageBrowseInputModel()
            {
                Page = 0,
                PageSize = 100,
                SearchText = "tag5"
            }).Items.FirstOrDefault();

            WaitForIndexing();

            Assert.AreEqual("Title5", result.Title);
        }

PopulateStore just throws a hundred documents in with various tags, and I know that one of the documents has a tag with name ‘tag5’ and a title of ‘Title5’. If I run this, it fails because I haven’t updated the code to fit our new requirements.

Here is the code that implements the desired functionality:

        public ImageBrowseView Load(ImageBrowseInputModel input)
        {
            // Adjust the model appropriately
            input.PageSize = input.PageSize == 0 || input.PageSize > 20 ? 20 : input.PageSize;

            // Perform the paged query
            var query = documentSession.Query<ImageDocument>()
                    .Skip(input.Page * input.PageSize)
                    .Take(input.PageSize);

            // Add a clause for search text if necessary
            if(!string.IsNullOrEmpty(input.SearchText)){
                query = query.Where(x=>x.Tags.Any(tag=>tag.Name.StartsWith(input.SearchText)));
            }

            // And enact this query
            var items = query
                .ToArray()
                .Select(x => new ImageBrowseItem(x.Title, x.Filename));

            return new ImageBrowseView(
                input.Page,
                input.PageSize,
                items);
        }

The important bit to take away here is that we add a Where clause where Any Tag has a name that starts with the search text passed in, and this just works. We don’t add that clause if we haven’t got any search text, because asking for documents with an empty tag would most likely yield no results. (Yeah, I have tests for that too.)
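To make the semantics of that query concrete, here is a rough in-memory equivalent over a plain array. It is only a sketch: `loadBrowseView` and the lower-case property names are made up for illustration, and it has nothing to do with how RavenDB actually executes the query. On a plain array the filter has to run before the paging, and the comparison here is case-sensitive, which may not match how the real index treats text.

```javascript
// Hypothetical in-memory stand-in for the view factory's Load method.
function loadBrowseView(images, input) {
    // Same clamp as the C# version: a page size of 0 or above 20 becomes 20
    var pageSize = (input.pageSize === 0 || input.pageSize > 20) ? 20 : input.pageSize;

    // Only filter when there is some search text
    var matches = images;
    if (input.searchText) {
        matches = images.filter(function (image) {
            // "Any tag whose name starts with the search text"
            return image.tags.some(function (tag) {
                return tag.name.indexOf(input.searchText) === 0;
            });
        });
    }

    // Skip/Take, then pluck just the fields the view needs
    var start = input.page * pageSize;
    var items = matches.slice(start, start + pageSize).map(function (image) {
        return { title: image.title, filename: image.filename };
    });

    return { page: input.page, pageSize: pageSize, items: items };
}
```

Calling it with `{ page: 0, pageSize: 100, searchText: 'tag5' }` mirrors the integration test above: the page size is clamped to 20 and only images with a tag beginning with "tag5" come back.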

The index that was created for us

As mentioned in the previous entry, when performing ad-hoc queries against RavenDB, indexes are created for us in the background. Here is what the index for the above query would look like if we had created it ourselves.

    public class Images_ByTag : AbstractIndexCreationTask<ImageDocument>
    {
        public Images_ByTag()
        {
            Map = docs => from doc in docs
                          from tag in doc.Tags
                          select new
                          {
                              tag.Name
                          };
        }
    }

When creating our mapping, we effectively say “Get all the documents, get all their tags, create an index entry for each of those tags”. A search on “Name” via this index results in RavenDB searching the index entries and collating the matches into a document lookup. This is a flattening of the document, and it is what happens whenever we query specific properties within collections.
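If the flattening is hard to picture, the following toy version may help. It is a deliberately naive sketch in plain JavaScript, not how RavenDB or Lucene store anything: `mapImagesByTag` and `findDocIdsByTagPrefix` are invented names, and the "index" is just an array of entries that each remember which document they came from.

```javascript
// Map step: one index entry per (document, tag) pair.
function mapImagesByTag(docs) {
    var entries = [];
    docs.forEach(function (doc) {
        doc.tags.forEach(function (tag) {
            entries.push({ name: tag.name, docId: doc.id });
        });
    });
    return entries;
}

// Query step: scan the entries for a prefix match on the tag name and
// collate the hits back into a distinct list of document ids.
function findDocIdsByTagPrefix(entries, prefix) {
    var seen = {};
    var ids = [];
    entries.forEach(function (entry) {
        if (entry.name.indexOf(prefix) === 0 && !seen[entry.docId]) {
            seen[entry.docId] = true;
            ids.push(entry.docId);
        }
    });
    return ids;
}
```

A document with two matching tags produces two entries but is only returned once, which is the "collating" part of the description above.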

What this has given us

As the user types, images with tags that match the current search text are displayed within the search results area. Lucene indexes are seriously fast and this is a good demonstration of that.

There are still some improvements that could be made at this time.

This could be made more user friendly by showing suggestions as the user types by listing tags in the system that start with the current text (auto-complete), and we could also search the title and description (if there was a description).

We are also still returning the entire document each time and then just plucking the relevant fields from it. This is rather heavyweight with all that data travelling across the wire, and still needs changing so that we only transmit the fields from the document that we want in our end view model. (Projections)

That makes clear what we’ll be doing next…


Rob Ashton
