Search Results

Search found 284 results on 12 pages for 'solr'.


Boost Solr results based on the field that contained the hit

- by TomFor

Hi, I was browsing the web looking for an indexing and search framework and stumbled upon Solr. A functionality that we absolutely need is to boost results based on which field contained the hit. A small example: Consider a record like this: <movie> <title>The Dark Knight</title> <alternative_title>Batman Begins 2</alternative_title> <year>2008</year> <director>Christopher Nolan</director> <plot>Batman, Gordon and Harvey Dent are forced to deal with the chaos unleashed by an anarchist mastermind known only as the Joker, as it drives each of them to their limits.</plot> </movie> I want to combine, for example, the title, alternative_title and plot fields into one search field, which isn't too difficult after looking at the Solr/Lucene documentation and tutorials. However, I also want movies that have a hit in title to score higher than hits on alternative_title, and those in turn should score higher than hits in the plot field. Is there any way to indicate this kind of scoring in the XML, or do we need to develop some custom scoring algorithm? Please also note that the example I've given is fictional and the real data will probably contain 100+ fields. Thanks in advance, Tom
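
One common way to weight hits by field is the DisMax query parser's qf parameter, which takes a space-separated list of field^boost entries. The following is a minimal sketch of building such a request, not a definitive answer to the post: the base URL and the boost values 10, 5 and 1 are assumptions chosen for illustration, and the field names are taken from the example record above.

```python
from urllib.parse import urlencode


def build_boosted_query(user_query, field_boosts,
                        base_url="http://localhost:8983/solr/select"):
    """Build a DisMax request URL that weights each field differently.

    base_url is a hypothetical local Solr endpoint; adjust it to your setup.
    """
    # qf is a space-separated list of field^boost entries.
    qf = " ".join(f"{field}^{boost}" for field, boost in field_boosts.items())
    params = {"defType": "dismax", "q": user_query, "qf": qf}
    # urlencode escapes the spaces and the ^ characters for use in a URL.
    return f"{base_url}?{urlencode(params)}"


# Illustrative boosts: title counts most, then alternative_title, then plot.
url = build_boosted_query(
    "batman", {"title": 10, "alternative_title": 5, "plot": 1}
)
print(url)
```

With 100+ fields, the boost table could be kept in configuration and passed in as the dictionary, so that the weights can be tuned without touching the indexed XML.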

Custom Solr sorting

- by Tom

Hello everyone, I've been asked to do an evaluation of Solr as an alternative to a commercial search engine. The application currently has a very particular way of sorting results using something called "buckets". I'll try to explain in a bit of detail: In the interface they have 2 fields: "what" and "where". Both fields are actually sets of fields (what = category, name, contact info... and where = country, state, region, city...), so the copyField feature of Solr immediately comes to mind. Now, based on the field that generated the actual match, the result should end up in a specific bucket. In particular, the first bucket contains all the result documents that have an exact match on the category field, the second bucket all exact matches on name, the third partial matches on category, the fourth partial matches on name, the fifth matches on contact info, etc. Then, within each of those first-tier buckets, all results are placed in second-tier buckets depending on what location was matched: city, then region, then province and so on. To complicate things even more, there is also a third-tier bucket where results are placed according to the value of a ranking field: all documents with the value 1 in the ranking field go in bucket 1 and so on. And finally, results should be randomized within the third-tier bucket. On top of this, they obviously want support for facets and paging. My apologies for the long mail, but I would greatly appreciate feedback and/or suggestions. I'm aware that this is a very particular problem, but everything that points me in the right direction is helpful. Cheers, Tom

SOLR - Boost function (bf) to increase score of documents whose date is closest to NOW

- by Mechanic

Hi all, I have a Solr instance containing documents which have a 'startTime' field ranging from last month to a year from now. I'd like to add a boost query/function to boost the scores of documents whose startTime field is close to the current time. So far I have seen a lot of examples which use rord to add boosts to documents that are newer, but I have never seen an example of something like this. Can anyone tell me how to do it please? Thanks
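
Solr's function queries include recip(x,m,a,b), which computes a/(m*x+b), and ms(a,b), which returns a difference in milliseconds. A boost that peaks at the current time and decays in both directions could be sketched by wrapping the difference in abs. The snippet below only reproduces that arithmetic in Python to show the shape of the curve. It assumes that abs and ms are available in the Solr version in use and that startTime is a date type ms() supports; the constants (3.16e-11, roughly one over the number of milliseconds in a year, with a = b = 1) are illustrative, not a recommendation.

```python
from datetime import datetime, timedelta, timezone

# Candidate value for the bf parameter (an assumption, not taken from the post).
BF_PARAM = "recip(abs(ms(NOW,startTime)),3.16e-11,1,1)"


def recip(x, m, a, b):
    """Same formula as Solr's recip function: a / (m*x + b)."""
    return a / (m * x + b)


def closeness_boost(start_time, now):
    """Boost that is 1.0 when start_time == now and decays symmetrically."""
    delta_ms = abs((now - start_time).total_seconds() * 1000.0)
    return recip(delta_ms, 3.16e-11, 1.0, 1.0)


now = datetime(2010, 1, 1, tzinfo=timezone.utc)
for days in (-30, 0, 30, 365):
    boost = closeness_boost(now + timedelta(days=days), now)
    print(f"{days:>4} days from now: boost {boost:.3f}")
```

With these constants a document one year away from NOW, in either direction, gets about half the boost of a document starting right now.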

Exclude draft articles from Solr index with Sunspot

- by Bogdan Gusiev

I have an indexed model called Article and I don't want Solr to index unpublished articles. class Article < ActiveRecord::Base searchable do text :title text :body end end How can I specify that an article that is not #published? should not be indexed?

Solr Multicore Admin Problem

- by Daniel M

I'm trying to add a URL-based security constraint to Solr deployed in WebSphere 6.1. If I specify the core name in the URL of the constraint, then the admin URL for that core gives a 404. Has anyone had any success with this, or any suggestions? Cheers

solr plugin for symfony?

- by fayer

Using Doctrine with symfony is very easy because it's fully integrated into the framework. I wonder if it is possible to integrate Solr with symfony too (e.g., via a plugin). Thanks

Cheat sheets for Lucene/Solr?

- by noname

Is there any cheat sheet out there for Lucene/Solr query parameters and schema.xml elements (all the analyzers, tokenizers, etc.)? Or is there somewhere else I can find ALL query parameters? I can't find any with Google.

Detailed information in Lucene/Solr results

- by Hans Stricker

After having performed a search in Lucene/Solr without having specified a field, how can I know in which fields of a result document the search string was found (and how often)?

Can Apache Solr output HTML instead of XML?

- by Josh

The question is simple - we have a sample / test Solr app running that only responds with XML right now. Is there an easy way to change that output to HTML? Running Tomcat as the app server.

Solr/Lucene user click based ranking

- by Danim

I am facing the problem of sorting Lucene results based on a user click log. I would like the most accessed results to come first. Does anyone know how to configure or implement such a property in Lucene or Solr? Thank you very much.

Indexing different type of Entities/Objects with Solr Lucene

- by Yos

Let's say I want to index my shop using Solr/Lucene. I have many types of entities: Products, Product Reviews, Articles. How do I get Lucene to index those types, each with a different schema?

Solr query results using *

- by agentile

I want to provide for partial matching, so I am tacking on * to the end of search queries. What I've noticed is that a search query of gatorade will return 12 results whereas gatorade* returns 7. So * seems to be 1 or many as opposed to 0 or many ... how can I achieve this? Am I going about partial matching in Solr all wrong? Thanks.

solr schema for article->paragraph structure

- by Ke

Hi guys, I want to index some articles and show the paragraph number in the search result. So I guess the Solr schema should look like this: article_id, paragraph_number, paragraph_content. Therefore, I need to parse each article first, extract the paragraphs and index them one by one. I'm worried about the performance, since one article can contain 100 paragraphs. Any suggestions?
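
The paragraph-per-document layout described above can be sketched as a small preprocessing step. This is only an illustration under stated assumptions: paragraphs are assumed to be separated by blank lines, and the extra id field (article id plus paragraph number) is a hypothetical unique key that is not part of the schema in the post.

```python
def article_to_paragraph_docs(article_id, text):
    """Split an article into one document per paragraph.

    Assumes paragraphs are separated by blank lines. The 'id' field is a
    hypothetical unique key built from the article id and paragraph number.
    """
    paragraphs = [p.strip() for p in text.split("\n\n") if p.strip()]
    return [
        {
            "id": f"{article_id}_{number}",
            "article_id": article_id,
            "paragraph_number": number,
            "paragraph_content": paragraph,
        }
        for number, paragraph in enumerate(paragraphs, start=1)
    ]


docs = article_to_paragraph_docs("a1", "First paragraph.\n\nSecond paragraph.")
for doc in docs:
    print(doc["id"], doc["paragraph_number"], doc["paragraph_content"])
```

All paragraph documents of an article can then be sent to Solr in a single update request rather than one request per paragraph, which is usually where the indexing cost would otherwise go.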

Solr automatic startup script

- by Camran

I have followed this tutorial: http://wiki.apache.org/solr/SolrJetty but I can't get it working. I don't know what to put in the etc/defaults/jetty file. Does anybody know how to configure this? I have Ubuntu 9 Server. Thanks

Solr OR query for different combination of facets

- by Ritesh M Nayak

I have a sample Solr schema as follows: isPublic = boolean, source = facebook | twitter | wordpress. I want to write a query which returns all documents from the index which match either isPublic = true, or isPublic = false and source = facebook. Something like this: solrUrl/?q=blah&fq=(isPublic:true OR (isPublic:false AND source:facebook)) Is such a thing possible, or should I search the index two times, once with each of these conditions, and then combine and de-duplicate the results?
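
The filter in the post is ordinary Lucene query syntax with nested boolean clauses, so the main practical concern when sending it is URL-encoding the spaces, colons and parentheses. A minimal sketch of building the request follows; the base URL is an assumption standing in for the post's solrUrl, and q=blah is copied from the post.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Filter taken from the post, sent as a single fq parameter.
FILTER = "isPublic:true OR (isPublic:false AND source:facebook)"


def build_filter_url(base_url, query, filter_query):
    """Return a select URL with q and fq safely URL-encoded."""
    return f"{base_url}?{urlencode({'q': query, 'fq': filter_query})}"


# Hypothetical endpoint; replace it with the real solrUrl.
url = build_filter_url("http://localhost:8983/solr/select", "blah", FILTER)
print(url)

# Decoding the URL again shows that the filter survives the round trip.
print(parse_qs(urlsplit(url).query)["fq"][0])
```

If every document is guaranteed to carry an isPublic value, the filter is logically equivalent to the shorter isPublic:true OR source:facebook. Documents without the field would behave differently under the two forms.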

Solr search score in the range from 0 to 1

- by spacemonkey

Hi, Is it possible to configure Solr so that the document similarity score would be in the range for example from 0 (no match) to 1 (complete document and query match). Thanks!

Apache Solr: Setting HTTP Response Headers From solrconfig.xml For CORS

- by Noah Freitas

Is it possible to set up the sending of a custom HTTP response header from within the solrconfig.xml file? I am thinking that it might be possible to add some configuration to the <requestDispatcher>, since it controls caching headers. I am sure this is possible in the servlet container configuration (Jetty, Tomcat, etc.), but I would like to do this from within Solr's configuration files if at all possible. If this makes any difference, I am attempting to set an Access-Control-Allow-Origin header for CORS AJAX requests for a different host.

Highlighting in Solr 1.4 - requireFieldMatch

- by Mark Redding

I have an object: Title: foo; Summary: foo bar; Body: this is a published story about a foo and a bar. All three are set up as fields with stored=true. When the user searches across my system for the word "foo", I would like to highlight foo in all three places. When the user searches for the word foo in the title ("title:foo"), I only want to highlight foo within the title. When I added hl.requireFieldMatch=true and hl.usePhraseHighlighter=true as part of my query to Solr, I was unable to get the highlighting in all three places when doing a generic non-fielded search. Is there a way to get both scenarios to work? I had these two items turned off, but I am adding in some fielded portions of the query that the user does not see, which only display Published items, for instance. The problem is that (foo AND status:published) is causing the word published in the body to highlight when the user only searched for the word "foo".

Random noise in Solr score

- by Andrea Campi

I am looking for a way of introducing random noise into my scoring function, and I'm at a loss on how to best proceed. Some background: We use Solr for a web application that manages large-ish sets of photos for agencies. One customer has an interesting requirement for scoring: 'quality' field, maintained by editors, from 1 (highest) to 3 (lowest); 'date' field, boosting more recent photos; I would probably use a logarithmic function; However, due to how the stock photo market works, this will likely result in many similar photos appearing together. Their request is to give 'quality' a large boost, but introduce some randomness so that photos will not appear in a strict date order. Any idea? EDITED: a key requirement is to have "stable" query results: if I search twice for "tropical island" I can get a slightly different result set, but if I ask for the first page, then the second, then the first, I'd better get the same results :)
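
The "stable" requirement suggests deriving the noise deterministically from a seed rather than drawing fresh random numbers on each request. The sketch below illustrates that idea with a hash of (seed, document id). It is a swapped-in client-side technique, not a Solr feature and not the poster's method: the function names, the use of the query string as seed, and the amplitude of 0.1 are all assumptions.

```python
import hashlib


def stable_noise(seed, doc_id, amplitude=0.1):
    """Deterministic pseudo-random offset for a (seed, doc_id) pair.

    The same seed and document id always give the same offset, so repeated
    requests (page 1, page 2, page 1 again) see the same ordering.
    """
    digest = hashlib.sha256(f"{seed}:{doc_id}".encode("utf-8")).digest()
    fraction = int.from_bytes(digest[:8], "big") / 2**64  # between 0 and 1
    return (2 * fraction - 1) * amplitude


def rerank(docs, seed, amplitude=0.1):
    """Sort docs by score plus seeded noise, highest first."""
    return sorted(
        docs,
        key=lambda doc: doc["score"] + stable_noise(seed, doc["id"], amplitude),
        reverse=True,
    )


docs = [
    {"id": "p1", "score": 3.00},
    {"id": "p2", "score": 2.98},
    {"id": "p3", "score": 1.00},
]
print([doc["id"] for doc in rerank(docs, seed="tropical island")])
```

Keeping the amplitude small relative to the 'quality' boost means the noise only reshuffles photos whose scores are already close, such as near-duplicates that differ mainly by date.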

SOLR date faceting and BC / BCE dates / negative date ranges

- by Nigel_V_Thomas

Date ranges including BC dates: is this possible? I would like to return facets for all years between 11000 BCE (BC) and 9000 BCE (BC) using SOLR. A sample query, with date ranges converted to ISO 8601, might be: q=*:*&facet.date=myfield_earliestDate&facet.date.end=-92009-01-01T00:00:00&facet.date.gap=%2B1000YEAR&facet.date.other=all&facet=on&f.myfield_earliestDate.facet.date.start=-112009-01-01T00:00:00 However, the returned results seem to suggest that the dates are in the positive range, i.e. CE, not BCE. See the sample returned results: <response> <lst name="responseHeader"> <int name="status">0</int> <int name="QTime">6</int> <lst name="params"> <str name="f.vra.work.creation.earliestDate.facet.date.start">-112009-01-01T00:00:00Z</str> <str name="facet">on</str> <str name="q">*:*</str> <str name="facet.date">vra.work.creation.earliestDate</str> <str name="facet.date.gap">+1000YEAR</str> <str name="facet.date.other">all</str> <str name="facet.date.end">-92009-01-01T00:00:00Z</str> </lst> </lst> <result name="response" numFound="9556" start="0">ommitted</result> <lst name="facet_counts"> <lst name="facet_queries"/> <lst name="facet_fields"/> <lst name="facet_dates"> <lst name="vra.work.creation.earliestDate"> <int name="112010-01-01T00:00:00Z">0</int> <int name="111010-01-01T00:00:00Z">0</int> <int name="110010-01-01T00:00:00Z">0</int> <int name="109010-01-01T00:00:00Z">0</int> <int name="108010-01-01T00:00:00Z">0</int> <int name="107010-01-01T00:00:00Z">0</int> <int name="106010-01-01T00:00:00Z">0</int> <int name="105010-01-01T00:00:00Z">0</int> <int name="104010-01-01T00:00:00Z">0</int> <int name="103010-01-01T00:00:00Z">0</int> <int name="102010-01-01T00:00:00Z">0</int> <int name="101010-01-01T00:00:00Z">0</int> <int name="100010-01-01T00:00:00Z">5781</int> <int name="99010-01-01T00:00:00Z">0</int> <int name="98010-01-01T00:00:00Z">0</int> <int name="97010-01-01T00:00:00Z">0</int> <int name="96010-01-01T00:00:00Z">0</int> <int name="95010-01-01T00:00:00Z">0</int> <int name="94010-01-01T00:00:00Z">0</int> <int name="93010-01-01T00:00:00Z">0</int> <str name="gap">+1000YEAR</str> <date name="end">92010-01-01T00:00:00Z</date> <int name="before">224</int> <int name="after">0</int> <int name="between">5690</int> </lst> </lst> </lst> </response> Any ideas why this is the case? Can Solr handle negative dates such as -112009-01-01T00:00:00Z?

Per query relevance elevation for solr?

- by plusplus

I want to tune the relevance of solr search results on a per user basis - based on the number of times the user has clicked through a result before. Frequently hit items FOR THAT USER should rise to the top of their search results. Is there a way to provide custom boost/elevation for particular document ids on the query? I'm thinking in the order of ~100s of particular documents to elevate. The elevation should have no effect if the rest of the query doesn't find those documents. Alternatively, if this isn't possible, what is a sane way for setting up an alternative indexing approach that would make this possible? Could I add a field per user in the index to store their scores? I'm thinking in the order of 1000 users. The major drawback of that approach is the number of times a document would need to be reindexed (i.e. each time it was used by the user).

Best Practice of Field Collapsing in SOLR 1.4

- by Dominik

I need a way to collapse duplicate (defined in terms of a string field with an id) results in Solr. I know that such a feature is coming in the next version (1.5), but I can't wait for that. What would be the best way to remove duplicates using the current stable version 1.4? Given that finding duplicates in my case is really easy (comparison of a string field), should it be a Filter, should I overwrite the existing SearchComponent or write a new Component, or use some external libraries like carrot2? The overall result count should reflect the shortened result.

What does `^` mean here in Solr?

- by Rahul Mehta

I am confused here and want to clear up my doubt. I think it is a stupid question, but I want to know. "Use a TokenFilter that outputs two tokens (one original and one lowercased) for each input token. For queries, the client would need to expand any search terms containing upper case characters to two terms, one lowercased and one original. The original search term may be given a boost, although it may not be necessary given that a match on both terms will produce a higher score. text:NeXT ==> (text:NeXT^10 OR text:next)" What does this ^ mean here? http://wiki.apache.org/solr/SolrRelevancyCookbook#Relevancy_and_Case_Matching
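
In Lucene query syntax, term^N is a boost: it multiplies that term's contribution to the score by N, so in the quoted example a match on the original-case term counts for more than a match on the lowercased one. The client-side expansion the wiki passage describes could be sketched as follows. The function name and the default boost of 10 are illustrative, with the boost value mirroring the wiki example.

```python
def expand_case(field, term, boost=10):
    """Expand a term containing upper case into a boosted original plus a
    lowercased variant; leave all-lowercase terms unchanged."""
    lowered = term.lower()
    if term == lowered:
        # Nothing to expand: the original is already lowercase.
        return f"{field}:{term}"
    # ^boost raises the score contribution of the exact-case match.
    return f"({field}:{term}^{boost} OR {field}:{lowered})"


print(expand_case("text", "NeXT"))  # (text:NeXT^10 OR text:next)
print(expand_case("text", "next"))  # text:next
```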

Use different Solr Similarity algo for every search

- by snickernet

Hi guys, is it possible in Solr 1.4 to specify which similarity class to use for every search within a single index? Let's say I have 2 types of search (keyword and brand). For keyword search, I want to use the DefaultSimilarity class. But for brand search, I want to use my CustomSimilarity class. I've been modifying the schema.xml to specify a single similarity class to use. But I now have this requirement to use 2 different similarity classes. I'll be glad to hear your thoughts on this. Thanks in advance.

Why Solr admin query page interprets UTF-8 as ISO-8859-1

- by Scott Chu

I deployed a WAR to my Tomcat 6.0.35 on Win7 64-bit, and when I use the full-interface query page (I mean form.jsp) in Solr Admin to query 2 Chinese characters (say, C1C2), the debug info shows: <lst name="debug"> <str name="rawquerystring">æ°è</str> <str name="querystring">æ°è</str> <str name="parsedquery">NEWSID:æ°è</str> <str name="parsedquery_toString">NEWSID:æ°è</str> ... You can see C1C2 becomes æ°è. When I deploy the same WAR file to Tomcat on Linux or on a colleague's Win7 64-bit computer, the encoding works well. Does anyone know why, and how can I avoid this problem? Thanks in advance!
