Search Results

Search found 393 results on 16 pages for 'lucene'.

  • Couple o' quick questions on Apache Lucene

    - by Doug
    I don't want to start any religious wars, but a quick Google search indicates that Apache Lucene is the preferred open source tool for indexing and searching. Are there others? What file format does Lucene use to store its index file(s)? Thanks in advance. Doug

  • Lucene.NET performance

    - by Paul Knopf
    I have a website that runs off a third-party search provider that is expensive. I am going to roll my own. Is Lucene.NET capable of handling ~25,000 products (or documents), each with maybe ten attributes used for filtering? I am looking to do a "narrow/drill down" or "faceted" search. Does that sound like too much to ask from Lucene.NET?

  • lucene index missing files

    - by Akhil
    I have the _0.cfs file of a Lucene index directory, but segments.gen and segments_2 are missing. Can I generate the segments.gen and segments_2 files without having to regenerate the _0.cfs file? Do these "segments" files contain any index-specific data that would force me to regenerate the entire index? Or can I just create the two "segments" files by copying them from another Lucene index directory generated with the same Lucene version?

  • Resources for getting started with Lucene.Net?

    - by Matt Dotson
    I'm building a simple site that allows users to post text content and I want to add it to a search index as it gets posted, so my site search is up to date. From what I can tell Lucene.NET is a good full text search framework. I've found very few examples of how to use it though. Can anyone post some good references for learning about Lucene?

  • Apache Lucene or another Search in iPhone app

    - by lostInTransit
    Hi, I would like to implement search functionality within my iPhone app that can search for terms within all the documents in the application. I believe I cannot use Apache Lucene directly since it is written in Java. Can I use Lucy, which is a C port of Lucene (I am not sure whether Perl and Ruby would work on it)? Or is there any other open-source search engine that I can use for search within my iPhone app? Thanks

  • C# Lucene get all the index

    - by ngc224
    Hello, I am working on a Windows application using Lucene. I want to get all the indexed keywords and use them as the source for an auto-suggest on the search field. How can I retrieve all the indexed keywords in Lucene? I am fairly new to C#. Code itself is appreciated. Thanks.

  • Solr/Lucene user click based ranking

    - by Danim
    I am facing the problem of sorting Lucene results based on a user click log. I would like the more frequently accessed results to come first. Does anyone know how to configure or implement such a property in Lucene or Solr? Thank you very much.

  • How do i implement tag searching with lucene?

    - by acidzombie24
    I haven't used Lucene. The last time I asked (many months ago, maybe a year), people suggested Lucene. As an example, say there are items tagged like this:

      1. apples carrots
      2. apples
      3. carrots
      4. apple banana

    If a user searches apples, I don't care about any preference among 1, 2 and 4. However, what I have seen many forums do, and hated, is that when a user searches apples carrots, 2 and 3 get high results while 1 is hard to find, even though it matches my search more closely. Also, I would like the ability to search carrots -apples, which would only get me 3. I am not sure what should happen if I search carrots banana, but as long as results 2 and 3 are lower priority than 1 when I search apples carrots, I'll be happy. Can Lucene do this, and where do I start? I see a lot of classes, and many of them talk about docs. What should I use for tagging?
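
The ranking asked for above can be sketched without any Lucene calls. This is only a plain-Java illustration, assuming the four example items are `apples carrots`, `apples`, `carrots` and `apple banana`; the class `TagSearch` and its methods are hypothetical names, and there is no stemming, so `apple` does not match `apples` here. Items that match more of the query tags rank first, and a leading `-` excludes a tag.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, not part of Lucene: ranks items by how many query tags they match.
final class TagSearch {

    // ASSUMPTION: the four example items from the question, numbered 1 to 4.
    static Map<Integer, Set<String>> sample() {
        Map<Integer, Set<String>> items = new LinkedHashMap<>();
        items.put(1, new HashSet<>(Arrays.asList("apples", "carrots")));
        items.put(2, new HashSet<>(Arrays.asList("apples")));
        items.put(3, new HashSet<>(Arrays.asList("carrots")));
        items.put(4, new HashSet<>(Arrays.asList("apple", "banana")));
        return items;
    }

    // Returns item ids ordered by number of matched tags (descending), then by id.
    static List<Integer> search(Map<Integer, Set<String>> itemTags, String query) {
        Set<String> wanted = new HashSet<>();
        Set<String> banned = new HashSet<>();
        for (String term : query.trim().toLowerCase().split("\\s+")) {
            if (term.startsWith("-") && term.length() > 1) {
                banned.add(term.substring(1)); // "-apples" excludes items tagged apples
            } else if (!term.isEmpty()) {
                wanted.add(term);
            }
        }
        Map<Integer, Integer> scores = new HashMap<>();
        for (Map.Entry<Integer, Set<String>> entry : itemTags.entrySet()) {
            Set<String> tags = entry.getValue();
            if (!Collections.disjoint(tags, banned)) {
                continue; // item carries an excluded tag
            }
            int matched = 0;
            for (String tag : wanted) {
                if (tags.contains(tag)) {
                    matched++;
                }
            }
            if (matched > 0) {
                scores.put(entry.getKey(), matched);
            }
        }
        List<Integer> ids = new ArrayList<>(scores.keySet());
        ids.sort((a, b) -> {
            int byScore = Integer.compare(scores.get(b), scores.get(a));
            return byScore != 0 ? byScore : Integer.compare(a, b);
        });
        return ids;
    }

    public static void main(String[] args) {
        System.out.println(search(sample(), "apples carrots"));  // prints [1, 2, 3]
        System.out.println(search(sample(), "carrots -apples")); // prints [3]
    }
}
```

In Lucene itself the same idea would presumably map to optional and prohibited clauses in a boolean query, but that mapping is not shown here.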

  • Read huge free text docs in one file for lucene indexing

    - by Jun
    I have heaps of free-text news docs in one big file. The structure of each news doc is:

        (Header line) Category, Doc1, Date (day, month, year)
        (body text) ... ... ...
        (Header line) Category, Doc2, Date (day, month, year)
        (body text) ... ... ...

    Extracting each doc from the big file costs too much time and is not efficient. Therefore, I decided to read the file line by line and feed the information to Lucene at the same time. I wrote C# code to index each doc like this:

        StreamReader sr = new StreamReader(file);
        string line = "";
        while ((line = sr.ReadLine()) != null)
        {
            // How can I tell a doc header line from a text line, and get the
            // metadata and all the text lines of a doc for Lucene to index?
            // Also, the text was read by OCR, which cannot give correct line
            // separation. Captions are mixed with content text.
            // Iterate the process till the end of the file.
        }

    With thanks
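
One way to approach the loop described above is to test every line against a header pattern and hand a finished doc to a callback as soon as the next header appears. The sketch below is in Java rather than C#, and it rests on a loud assumption: that a header line looks like `Sports, Doc1, 12 March 2009`. The real header layout is not shown in the question, so the regular expression would need adjusting. `NewsDocSplitter`, `NewsDoc` and the `sink` callback are hypothetical names; the callback is where a document would be added to the index.

```java
import java.util.Iterator;
import java.util.function.Consumer;
import java.util.regex.Matcher;
import java.util.regex.Pattern;
import java.util.stream.Stream;

// Hypothetical helper: splits one big stream of lines into docs without loading the whole file.
final class NewsDocSplitter {

    // ASSUMPTION: header lines look like "Category, DocN, day month year".
    private static final Pattern HEADER = Pattern.compile(
            "^\\s*([^,]+),\\s*(Doc\\d+),\\s*(\\d{1,2})\\s+(\\p{L}+)\\s+(\\d{4})\\s*$");

    static final class NewsDoc {
        final String category;
        final String id;
        final String date;
        final StringBuilder body = new StringBuilder();

        NewsDoc(String category, String id, String date) {
            this.category = category;
            this.id = id;
            this.date = date;
        }
    }

    // Reads lines one at a time; each completed doc is passed to sink exactly once.
    static void split(Stream<String> lines, Consumer<NewsDoc> sink) {
        NewsDoc current = null;
        for (Iterator<String> it = lines.iterator(); it.hasNext();) {
            String line = it.next();
            Matcher m = HEADER.matcher(line);
            if (m.matches()) {
                if (current != null) {
                    sink.accept(current); // previous doc is complete
                }
                String date = m.group(3) + " " + m.group(4) + " " + m.group(5);
                current = new NewsDoc(m.group(1).trim(), m.group(2), date);
            } else if (current != null) {
                current.body.append(line).append('\n'); // body text, captions included
            }
        }
        if (current != null) {
            sink.accept(current); // last doc in the file
        }
    }

    public static void main(String[] args) {
        Stream<String> lines = Stream.of(
                "Sports, Doc1, 12 March 2009", "first body line", "second body line",
                "Politics, Doc2, 3 April 2010", "another body line");
        split(lines, doc -> System.out.println(doc.id + " (" + doc.category + ", " + doc.date + ")"));
    }
}
```

For a real file the lines could come from `java.nio.file.Files.lines(path)`, which reads lazily. Because OCR output may produce body lines that happen to look like headers, a stricter pattern is probably worth the effort.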

  • Building a case for solr

    - by Midhat
    Our product consists of multiple applications, all using Lucene. Two of the applications I am involved with have Lucene indexes of about 3 GB and 12 GB. Another team is building an application for which they estimate the Lucene index size to be close to 1 terabyte. New documents are added to the indexes approximately every 15 days. We do not have any apparent performance issues with the current applications. So my question is: should we be using Solr now? When should one stop using Lucene and graduate to Solr? Are there any disadvantages or problems with using Solr? The client applications are written in ASP.NET, but I assume they will be able to use a Solr server via SolrNet.

  • Tokenizing Twitter Posts in Lucene

    - by Amaç Herdagdelen
    Hello, my question in a nutshell: does anyone know of a TwitterAnalyzer or TwitterTokenizer for Lucene?

    More detailed version: I want to index a number of tweets in Lucene and keep terms like @user or #hashtag intact. StandardTokenizer does not work because it discards the punctuation (but it does other useful things, like keeping domain names and email addresses or recognizing acronyms). How can I have an analyzer that does everything StandardTokenizer does but does not touch terms like @user and #hashtag?

    My current solution is to preprocess the tweet text before feeding it into the analyzer and replace the characters with other alphanumeric strings. For example:

        String newText = text.replaceAll("#", "hashtag");
        newText = newText.replaceAll("@", "addresstag");

    Unfortunately, this method breaks legitimate email addresses, but I can live with that. Does that approach make sense? Thanks in advance! Amaç
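
The preprocessing idea above can be narrowed so that email addresses survive. The sketch below is one possible variant, not an existing Lucene component: it replaces `@` and `#` only when they start a token, using regular-expression lookarounds. `TweetPreprocessor` is a hypothetical name, and the placeholder strings `hashtag` and `addresstag` are taken from the question.

```java
import java.util.regex.Pattern;

// Hypothetical preprocessing step to run before the text reaches the analyzer.
final class TweetPreprocessor {

    // "@" counts as a mention only if it is not preceded by a character that
    // can appear in the local part of an email address.
    private static final Pattern MENTION = Pattern.compile("(?<![A-Za-z0-9._%+-])@(?=\\w)");

    // "#" counts as a hashtag only if it is not glued to a preceding word character.
    private static final Pattern HASHTAG = Pattern.compile("(?<!\\w)#(?=\\w)");

    static String preprocess(String tweet) {
        String text = HASHTAG.matcher(tweet).replaceAll("hashtag");
        return MENTION.matcher(text).replaceAll("addresstag");
    }

    public static void main(String[] args) {
        // prints: addresstaguser mail bob@example.com about hashtaglucene
        System.out.println(preprocess("@user mail bob@example.com about #lucene"));
    }
}
```

The placeholders still collide with tweets that literally contain the words "hashtag" or "addresstag", so less common marker strings may be a safer choice.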

  • Lucene Error While Reading binary block : java.io.EOFException

    - by tushar Khairnar
    Hi, I am getting java.io.EOFException while reading a binary block from a Lucene index. I am storing a Java object as a byte array in a Lucene index field and reading it when a hit occurs. Here is the stack trace:

        Caused by: java.io.EOFException
            at java.io.ObjectInputStream$PeekInputStream.readFully(ObjectInputStream.java:2281)
            at java.io.ObjectInputStream$BlockDataInputStream.readShort(ObjectInputStream.java:2750)
            at java.io.ObjectInputStream.readStreamHeader(ObjectInputStream.java:780)
            at java.io.ObjectInputStream.<init>(ObjectInputStream.java:280)
            at org.terracotta.modules.searchable.util.SerializationUtil$OIS.<init>(SerializationUtil.java:20)

    I have some background threads that write into the index, but I buffer the writes and then apply them in batches of about 1000. Occasionally I also call optimize() on the index. When I write, I re-open the IndexReader. Is this happening because of the IndexReader re-opening call? Thanks. Regards, Tushar
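
An `EOFException` raised from `readStreamHeader` inside the `ObjectInputStream` constructor is what Java produces when the byte array is empty or shorter than the stream header. Whether that is what happens in this index is an assumption; the sketch below only shows the symptom in isolation and one defensive way to read such a field. `StoredObjectCodec` and its methods are hypothetical names, and no Lucene call is involved.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.EOFException;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.io.UncheckedIOException;

// Hypothetical helper for objects stored as byte arrays in an index field.
final class StoredObjectCodec {

    static byte[] serialize(Serializable value) {
        try (ByteArrayOutputStream bytes = new ByteArrayOutputStream();
             ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(value);
            out.flush(); // make sure everything reached the byte buffer
            return bytes.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Returns null instead of failing when the stored bytes are missing or truncated.
    static Object deserializeOrNull(byte[] data) {
        if (data == null || data.length == 0) {
            return null; // the constructor below would throw EOFException here
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(data))) {
            return in.readObject();
        } catch (EOFException e) {
            return null; // bytes ended before the object was complete
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("Stored bytes are not a readable object", e);
        }
    }

    // Demonstrates the symptom: an empty array fails while reading the stream header.
    static boolean emptyInputThrowsEof() {
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(new byte[0]))) {
            return false;
        } catch (EOFException e) {
            return true;
        } catch (IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        System.out.println(deserializeOrNull(serialize("hello"))); // prints hello
        System.out.println(emptyInputThrowsEof());                 // prints true
    }
}
```

Logging the length of the byte array returned for the failing hit would show quickly whether the stored value is empty or cut short.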

  • Lucene.NET search index approach

    - by Tim Peel
    Hi, I am trying to put together a test case for using Lucene.NET on one of our websites. I'd like to do the following:

      - Index a single unique id.
      - Index a comma-delimited string of terms or tags.

    For example:

        Item 1: Id = 1
        Tags = Something,Separated-Term

    I will then structure the search so I can look for documents by tag, i.e.

        tags:something OR tags:separate-term

    I need to maintain the exact term value in order to search against it. I have something running, and the search query is being parsed as expected, but I am not seeing any results. Here's some code.

    My parser (_luceneAnalyzer is passed into my indexing service):

        var parser = new QueryParser(Lucene.Net.Util.Version.LUCENE_CURRENT, "Tags", _luceneAnalyzer);
        parser.SetDefaultOperator(QueryParser.Operator.AND);
        return parser;

    My Lucene.NET document creation:

        var doc = new Document();
        var id = new Field(
            "Id",
            NumericUtils.IntToPrefixCoded(indexObject.id),
            Field.Store.YES,
            Field.Index.NOT_ANALYZED,
            Field.TermVector.NO);
        var tags = new Field(
            "Tags",
            string.Join(",", indexObject.Tags.ToArray()),
            Field.Store.NO,
            Field.Index.ANALYZED,
            Field.TermVector.YES);
        doc.Add(id);
        doc.Add(tags);
        return doc;

    My search:

        var parser = BuildQueryParser();
        var query = parser.Parse(searchQuery);
        var searcher = Searcher;
        TopDocs hits = searcher.Search(query, null, max);
        IList<SearchResult> result = new List<SearchResult>();
        float scoreNorm = 1.0f / hits.GetMaxScore();
        for (int i = 0; i < hits.scoreDocs.Length; i++)
        {
            float score = hits.scoreDocs[i].score * scoreNorm;
            result.Add(CreateSearchResult(searcher.Doc(hits.scoreDocs[i].doc), score));
        }
        return result;

    I have two documents in my index, one with the tag "Something" and one with the tags "Something" and "Separated-Term". It's important for the - to remain in the terms, as I want an exact match on the full value. When I search with "tags:Something" I do not get any results.

    Questions: What Analyzer should I be using to achieve the search index I am after? Are there any pointers for putting together a search such as this? Why is my current search not returning any results? Many thanks
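
One approach sometimes used for exact tag matching is to split the comma-delimited string up front and treat each tag as a single, unanalyzed term, normalized the same way at index time and at query time. The Java sketch below shows only that normalization step; `TagTerms` is a hypothetical name and nothing here calls Lucene.NET. Separately, the field in the question is created as `Tags` while the sample query uses `tags:`; as far as I know, Lucene field names are case-sensitive, so that mismatch may be worth checking first.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper: turns "Something,Separated-Term" into exact, lowercased terms.
final class TagTerms {

    static List<String> split(String commaDelimited) {
        List<String> terms = new ArrayList<>();
        for (String raw : commaDelimited.split(",")) {
            // Trim and lowercase only; hyphens and other characters stay intact.
            String term = raw.trim().toLowerCase(Locale.ROOT);
            if (!term.isEmpty()) {
                terms.add(term);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        System.out.println(split("Something,Separated-Term")); // prints [something, separated-term]
    }
}
```

Each resulting term would then go into its own field instance with analysis turned off, and the query side would apply the same `split` logic to what the user typed.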

  • Lucene numDocs and docFreq on custom similarity class

    - by David A
    Hi all, I'm building an application with Lucene (I'm new to it) and I'm facing some problems. My application uses the Lucene 2.4.0 library with a custom Similarity implementation (the jar is imported). In my app I calculate docFreq and numDocs manually (I add up the values from all indexes and then compute a global value to use on every query), and I want to use those values in a custom Similarity implementation to calculate a new IDF. The problem is that I don't know how to use (or send) the new docFreq and numDocs values from my app in that new Similarity implementation, as I don't want to change Lucene's code apart from this extra class. Any suggestions or examples? I read the docs, but I don't know how to approach this. Thanks
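
The arithmetic behind the global IDF described above can be written down on its own. The sketch assumes the formula used by Lucene's `DefaultSimilarity` in the 2.x line, `log(numDocs / (docFreq + 1)) + 1`, and assumes the indexes hold disjoint sets of documents so that the counts can simply be summed. `GlobalIdf` is a hypothetical class; how the result is passed into a custom Similarity subclass is not shown.

```java
// Hypothetical helper: combines per-index statistics into one IDF value.
final class GlobalIdf {

    // ASSUMPTION: same shape as DefaultSimilarity's idf in Lucene 2.x.
    static float idf(long docFreq, long numDocs) {
        return (float) (Math.log(numDocs / (double) (docFreq + 1)) + 1.0);
    }

    // Sums docFreq and numDocs over all indexes, then applies the formula once.
    static float globalIdf(long[] docFreqPerIndex, long[] numDocsPerIndex) {
        long docFreq = 0;
        for (long value : docFreqPerIndex) {
            docFreq += value;
        }
        long numDocs = 0;
        for (long value : numDocsPerIndex) {
            numDocs += value;
        }
        return idf(docFreq, numDocs);
    }

    public static void main(String[] args) {
        // A term found in 1 of 10 docs in one index and 2 of 20 docs in another.
        System.out.println(globalIdf(new long[] {1, 2}, new long[] {10, 20}));
    }
}
```

A custom Similarity could then look the precomputed value up per term, for example from a map filled in before the query runs; that lookup structure is an assumption, not something the question describes.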

  • Lucene Search for japanese characters

    - by Pranali Desai
    Hi all, I have implemented Lucene for my application and it works very well until something like Japanese characters is introduced. The problem is that if I have the Japanese string ?????????????? and I search with ?, which is the first character, then it works well, whereas if I use more than one Japanese character (????) in the search token, the search fails and no document is found. Are Japanese characters supported in Lucene? What settings are needed to get it working?
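
Lucene ships a `CJKAnalyzer` that, as far as I know, indexes Chinese, Japanese and Korean text as overlapping two-character tokens (bigrams), so that multi-character queries can match without a dictionary. The plain-Java sketch below illustrates that tokenization idea only; it is not the analyzer's actual code, `CjkBigrams` is a hypothetical name, and whether the analyzer choice is the cause of the failure above is an assumption. The `?` characters in the question also suggest that an encoding problem should be ruled out.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration of overlapping bigram tokenization for CJK text.
final class CjkBigrams {

    static List<String> bigrams(String text) {
        int[] codePoints = text.codePoints().toArray();
        List<String> tokens = new ArrayList<>();
        if (codePoints.length == 1) {
            tokens.add(new String(codePoints, 0, 1)); // a lone character stays a unigram
            return tokens;
        }
        for (int i = 0; i + 1 < codePoints.length; i++) {
            tokens.add(new String(codePoints, i, 2)); // characters i and i+1
        }
        return tokens;
    }

    public static void main(String[] args) {
        // "\u65e5\u672c\u8a9e" is the three-character word for "Japanese language".
        System.out.println(bigrams("\u65e5\u672c\u8a9e").size()); // prints 2
    }
}
```

The same tokenization has to be applied to the query string, which in Lucene means using the same analyzer for indexing and for query parsing.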
