Solr schema for article->paragraph structure

Posted by Ke on Stack Overflow
Published on 2010-06-15T10:18:36Z Indexed on 2010/06/15 10:22 UTC

Hi guys,

I want to index some articles and show the paragraph number in each search result. So I guess the Solr schema should look like this:

article_id, paragraph_number, paragraph_content

Therefore, I need to parse each article first, extract its paragraphs, and index them one by one.
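
To make the idea concrete, here is a minimal sketch of the parsing step in Python. It assumes paragraphs are separated by blank lines, and the function name article_to_docs and the combined "id" key are hypothetical, made up for illustration. It only builds one dictionary per paragraph using the three fields above; actually sending them to Solr is left out.

```python
import re


def article_to_docs(article_id, text):
    """Split an article into one Solr-style document per paragraph."""
    # Assumption: paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    return [
        {
            # Hypothetical unique key: article id plus paragraph number.
            "id": f"{article_id}_{number}",
            "article_id": article_id,
            "paragraph_number": number,
            "paragraph_content": paragraph,
        }
        for number, paragraph in enumerate(paragraphs, start=1)
    ]


docs = article_to_docs("a1", "First paragraph.\n\nSecond paragraph.\n\n\nThird.")
for doc in docs:
    print(doc["paragraph_number"], doc["paragraph_content"])
```

With this layout an article of 100 paragraphs turns into 100 small documents, all sharing the same article_id.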

I'm worried about indexing performance, since a single article can contain 100 paragraphs and each one would become a separate Solr document.

Any suggestions?

© Stack Overflow or respective owner
