dancroak.com




"Solr" ~ Erik Hatcher

RubyConf 2007

Solr is a search server built on top of Lucene. Very fast. Offers caching, replication, and highlighting. Active community.

Used by CNET, Internet Archive, Netflix, digg.

Lucene is a Java search engine library. Powers Technorati, Monster, Wikipedia, others.

Inverted index under the covers. Pulls the words (“terms”) out of each document. Hierarchy: Documents > Fields > Terms
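A toy sketch of the inverted index idea, in plain Ruby. This is not how Lucene implements it; the sample documents, field names, and the `search` helper are made up for illustration. Each term maps to the ids of the documents that contain it.

```ruby
# Made-up documents: id => fields (Documents > Fields > Terms)
docs = {
  1 => { title: "Programming Ruby", body: "ruby is a dynamic language" },
  2 => { title: "Lucene in Action", body: "lucene is a search library" }
}

# term => list of document ids containing that term
index = Hash.new { |hash, term| hash[term] = [] }

docs.each do |id, fields|
  fields.each_value do |text|
    # Split each field into lowercase terms
    text.downcase.scan(/\w+/).each do |term|
      index[term] << id unless index[term].include?(id)
    end
  end
end

# Hypothetical lookup helper: returns ids of documents containing the term
def search(index, term)
  index.fetch(term.downcase, [])
end

ruby_hits = search(index, "Ruby")
```

Looking up a term is a single hash access, which is why this layout suits search better than scanning every document.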

Includes relevance scoring. An analysis process removes stop words (“and”, “or”) and reduces words to their stems (strips “-ing”, “-ed”).
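A naive sketch of that analysis step, assuming a tiny stop word list and plain suffix stripping. Lucene's real analyzers and stemmers are far more careful; the `analyze` and `stem` helpers here are hypothetical names, not part of any library.

```ruby
# Hypothetical stemmer: strips a known suffix, leaving very short words alone
def stem(word, suffixes = %w[ing ed])
  suffix = suffixes.find { |s| word.end_with?(s) && word.length > s.length + 2 }
  suffix ? word[0...-suffix.length] : word
end

# Hypothetical analyzer: lowercase, tokenize, drop stop words, then stem
def analyze(text, stop_words = %w[a an and or the of])
  text.downcase.scan(/[a-z]+/)
      .reject { |word| stop_words.include?(word) }
      .map { |word| stem(word) }
end

terms = analyze("Searching and indexed documents")
```

The same analysis has to run on both the indexed text and the query, otherwise a query for "searching" would never match the stored stem "search".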

Search results come back as Ruby hashes.
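A rough sketch of what that looks like, assuming the response was requested in Solr's Ruby format (`wt=ruby`), which is text that can be evaluated into a hash. The sample string below is invented for illustration, including the `title_text` field name; a real response would come from a Solr server over HTTP.

```ruby
# Invented sample shaped like a Solr response in Ruby format
raw = "{'responseHeader'=>{'status'=>0,'QTime'=>2}," \
      "'response'=>{'numFound'=>1,'start'=>0," \
      "'docs'=>[{'id'=>'42','title_text'=>'Lucene in Action'}]}}"

# eval is only reasonable when the text comes from a Solr server you trust
result = eval(raw)

# From here on it is ordinary hash and array access
titles = result['response']['docs'].map { |doc| doc['title_text'] }
```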

solr-ruby is a Ruby DSL for HTTP communication to Solr.

Can set up a master Solr server with load-balanced, read-only searcher slaves.

Relevancy score can be tuned.

The schema consists of dynamicFields. Any searchable fields need to be of type text.

Erik is showing a tool called “Luke” that lets you browse your Lucene index.

ActiveRecord can take a simple hash, so Erik wrote a simple mapper that takes the Solr results as a Ruby hash and just calls Book.create(mapped_book).
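A guess at what such a mapper could look like, not Erik's actual code. It assumes the Solr fields are dynamicFields named with suffixes such as `_text`, and that the model's attributes are the names without the suffix. The `map_solr_doc` helper and the sample document are hypothetical.

```ruby
# Hypothetical mapper: strips dynamic-field suffixes and symbolizes the keys
def map_solr_doc(doc, suffixes = %w[_text _facet])
  doc.each_with_object({}) do |(key, value), attrs|
    suffix = suffixes.find { |s| key.end_with?(s) }
    name = suffix ? key[0...-suffix.length] : key
    attrs[name.to_sym] = value
  end
end

# Made-up Solr document
solr_doc = { 'title_text' => 'Lucene in Action', 'author_text' => 'Erik Hatcher' }

mapped_book = map_solr_doc(solr_doc)
# With ActiveRecord loaded and a Book model defined, the hash could then be
# passed along as in the talk: Book.create(mapped_book)
```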

Faceting is supported.
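For anyone unfamiliar with the term: faceting means counting the matching documents per value of some field, so a UI can show something like "ruby (2), search (1)" next to the results. A plain Ruby sketch of the idea, with made-up results and a made-up `subject` field; Solr computes these counts on the server.

```ruby
# Made-up search results
results = [
  { title: "Programming Ruby", subject: "ruby" },
  { title: "The Ruby Way",     subject: "ruby" },
  { title: "Lucene in Action", subject: "search" }
]

# Count how many results fall under each subject value
facet_counts = results.each_with_object(Hash.new(0)) do |doc, counts|
  counts[doc[:subject]] += 1
end
```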

Needs help with DSL/API guidance and documentation.