Advanced Solr/Lucene topics: High-performance nested search for e-commerce applications
Jul 19, 2016

Solr/Lucene has emerged over the last few years as a leading open-source search platform for large-scale e-commerce search engines. Systems based on Solr power major sites including Macy’s, Kohl’s, Walmart, Etsy, and many others. An increasing number of tier-1 digital retailers are building their next-generation search and catalog navigation platforms on the Solr technology stack, often replacing commercial engines such as Oracle Endeca, FAST, or Mercado.
Grid Dynamics has been one of the early adopters of Solr for large-scale, complex e-commerce catalogs with millions of SKUs, providing a highly optimized, omni-channel, “Black Friday-ready” experience for some of the world’s largest retailers.
Our engineers have made numerous contributions to Solr and Lucene, spoken at technical conferences, and published many technical blog posts that help Solr developers. One area of extensive research and innovation for our team is nested document structures, which are very useful for modeling complex e-commerce catalogs. The Nested Search and Faceting approach solves many of the performance and scalability challenges encountered when building large-scale, contextualized, omnichannel catalogs. It is especially useful for efficient, scalable processing of relationships (such as “chair is a part of furniture set”) and their attributes in a search system that is otherwise optimized for documents that are completely independent of one another.
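To make the idea of a nested document concrete, here is a minimal sketch of a product with two SKUs. It assumes Solr's JSON update format, where child documents are listed under the `_childDocuments_` key. The field names (`type_s`, `color_s`, `size_s`) and the sample data are illustrative assumptions, not taken from any real catalog.

```python
import json

# Hypothetical catalog item: one product (parent) with two SKUs (children).
# Field names such as "type_s" and "color_s" are illustrative assumptions.
product = {
    "id": "chair-1",
    "type_s": "product",
    "name_s": "Dining chair",
    "_childDocuments_": [
        {"id": "chair-1-red-s", "type_s": "sku", "color_s": "red", "size_s": "S"},
        {"id": "chair-1-blue-m", "type_s": "sku", "color_s": "blue", "size_s": "M"},
    ],
}


def to_block(doc):
    """Flatten a nested document into block order: children first, parent last."""
    children = doc.get("_childDocuments_", [])
    parent = {k: v for k, v in doc.items() if k != "_childDocuments_"}
    return children + [parent]


# The order in which the documents of one block sit next to each other in the index.
block = to_block(product)

# JSON body one might send to a Solr update handler (not sent anywhere here).
payload = json.dumps([product])
```

The point of the flattening step is that the parent and its children are stored contiguously as one block, which is what lets relationships between them be resolved cheaply at query time.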
We were early adopters and active advocates of the Block Join (also known as index-time join) approach to implementing nested search. This approach is supported in Lucene by Block Join Query (BJQ) and in Solr by the Block Join Facet component, which we contributed to the Solr/Lucene codebase.
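The following toy sketch illustrates the idea behind an index-time join. It is a simplified model in plain Python, not the Lucene implementation: documents are assumed to be stored in blocks with children immediately before their parent, so a single forward pass can map matching children to parents and aggregate child facet values at the parent level. The index contents, field names, and function names are hypothetical. In Solr, a comparable child-to-parent query would look roughly like `{!parent which="type_s:product"}color_s:red`, again with assumed field names.

```python
from collections import Counter

# Toy "index": children are stored immediately before their parent (one block each).
index = [
    {"id": "sku-1", "type": "sku", "color": "red"},
    {"id": "sku-2", "type": "sku", "color": "blue"},
    {"id": "chair", "type": "product"},
    {"id": "sku-3", "type": "sku", "color": "blue"},
    {"id": "table", "type": "product"},
]


def is_product(doc):
    """Parent filter: marks the last document of every block."""
    return doc["type"] == "product"


def to_parent_join(index, is_parent, child_matches):
    """Return ids of parents with at least one matching child in their block."""
    parents = []
    matched = False
    for doc in index:
        if is_parent(doc):
            if matched:
                parents.append(doc["id"])
            matched = False  # the next block starts after this parent
        elif child_matches(doc):
            matched = True
    return parents


def parent_level_facet(index, is_parent, field):
    """Count, per child field value, how many parents have a child with that value."""
    counts = Counter()
    seen = set()
    for doc in index:
        if is_parent(doc):
            counts.update(seen)  # each value counts once per parent
            seen = set()
        elif field in doc:
            seen.add(doc[field])
    return dict(counts)


red_products = to_parent_join(index, is_product, lambda d: d.get("color") == "red")
blue_products = to_parent_join(index, is_product, lambda d: d.get("color") == "blue")
color_facet = parent_level_facet(index, is_product, "color")
```

With this data, `red_products` contains only `chair`, `blue_products` contains both products, and `color_facet` counts products rather than SKUs for each color. Because the relationship is encoded in document order at index time, no lookup by key is needed at query time.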
For a more complete introduction to this topic, please see the joint talk at Lucene Revolution 2015 given by Eugene Steinberg of Grid Dynamics and Peter Gazaryan of Macys.com. Both slides and a video are online.
In this series of blog posts on Nested Search for E-commerce, we are republishing a collection of revised and updated posts written over the last two years. These posts are aimed at search engine developers, so the material is presented in fairly technical form and assumes good familiarity with the Solr/Lucene framework. Specifically, the posts cover the following aspects of BJQ design:
Post 1: Introduction to Block Join Faceting
Post 2: High-Performance Join in Solr with BlockJoinQuery
Post 3: How to Implement Block Join Faceting in Solr/Lucene
Post 4: Using Block Join to Improve Search Efficiency with Nested Documents in Solr
Post 5: The Segmented Filter Cache and Block Join Query Parser in Solr
Post 6: Searching Grandchildren and Siblings with Solr Block Join
Post 7: A Frustrating Personal Experience with Unfaceted Search
If you have questions about any of the topics covered in this series of posts or more generally related to the design and tuning of search engines, please drop us a line and one of our Search Architects will follow up promptly.