java.lang.Object
  org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource
    org.apache.lucene.benchmark.byTask.feeds.ContentSource
      org.apache.lucene.benchmark.byTask.feeds.TrecContentSource
public class TrecContentSource
Implements a ContentSource over the TREC collection.
Supports the following configuration parameters (in addition to those of ContentSource):
- The TrecDocParser class to use for parsing the content of the TREC documents (default=TrecGov2Parser).
- The HTMLParser class to use for parsing the HTML parts of the TREC documents (default=DemoHTMLParser).
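The two parser settings above could be supplied as plain properties. The following is a minimal sketch only: the property keys `trec.doc.parser` and `html.parser`, and the package of the two parser classes, are assumptions, since this page names neither. `TrecConfigSketch` is a hypothetical helper that uses only `java.util.Properties`.

```java
import java.util.Properties;

// Sketch only. The keys and fully qualified class names below are assumptions;
// this page lists the parser defaults by simple name and omits the property keys.
final class TrecConfigSketch {

    /** Builds a property set that names both parsers explicitly. */
    static Properties trecProperties() {
        Properties props = new Properties();
        // Assumed key for the TrecDocParser implementation (documented default: TrecGov2Parser).
        props.setProperty("trec.doc.parser",
                "org.apache.lucene.benchmark.byTask.feeds.TrecGov2Parser");
        // Assumed key for the HTMLParser implementation (documented default: DemoHTMLParser).
        props.setProperty("html.parser",
                "org.apache.lucene.benchmark.byTask.feeds.DemoHTMLParser");
        return props;
    }

    public static void main(String[] args) {
        Properties props = trecProperties();
        for (String key : props.stringPropertyNames()) {
            System.out.println(key + "=" + props.getProperty(key));
        }
    }
}
```

Because both values match the documented defaults, leaving the two settings out should select the same parsers.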
| Field Summary | |
|---|---|
| static String | DOC |
| static String | DOCNO |
| static String | NEW_LINE: separator between lines in the buffer |
| static String | TERMINATING_DOC |
| static String | TERMINATING_DOCNO |
| Fields inherited from class org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource |
|---|
encoding, forever, logStep, verbose |
| Constructor Summary | |
|---|---|
| TrecContentSource() | |
| Method Summary | |
|---|---|
| void | close(): Called when reading from this content source is no longer required. |
| DocData | getNextDocData(DocData docData): Returns the next DocData from the content source. |
| Date | parseDate(String dateStr) |
| void | resetInputs(): Resets the input for this content source, so that the test would behave as if it was just started, input-wise. |
| void | setConfig(Config config): Sets the Config for this content source. |
| Methods inherited from class org.apache.lucene.benchmark.byTask.feeds.ContentItemsSource |
|---|
addBytes, addItem, collectFiles, getBytesCount, getConfig, getItemsCount, getTotalBytesCount, getTotalItemsCount, printStatistics, shouldLog |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final String DOCNO
public static final String TERMINATING_DOCNO
public static final String DOC
public static final String TERMINATING_DOC
public static final String NEW_LINE
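The constants above name the markers that delimit a TREC record. As a rough illustration of how such markers can be used, the sketch below pulls the document number out of a record held in a string. The tag values (`<DOC>`, `<DOCNO>` and their closing forms), the `NEW_LINE` value, and the sample document number are assumptions based on common TREC markup; this page does not list the actual constant values. `TrecTagSketch` and `extractDocNo` are hypothetical names.

```java
import java.util.Optional;

// Sketch only: these values are assumed, not taken from TrecContentSource itself.
final class TrecTagSketch {
    static final String DOC = "<DOC>";
    static final String TERMINATING_DOC = "</DOC>";
    static final String DOCNO = "<DOCNO>";
    static final String TERMINATING_DOCNO = "</DOCNO>";
    static final String NEW_LINE = System.getProperty("line.separator");

    /** Returns the trimmed text between the DOCNO tags, or empty if either tag is missing. */
    static Optional<String> extractDocNo(String record) {
        int start = record.indexOf(DOCNO);
        if (start < 0) {
            return Optional.empty();
        }
        start += DOCNO.length();
        int end = record.indexOf(TERMINATING_DOCNO, start);
        if (end < 0) {
            return Optional.empty();
        }
        return Optional.of(record.substring(start, end).trim());
    }

    public static void main(String[] args) {
        // A made-up record, assembled with the assumed separators.
        String record = DOC + NEW_LINE
                + DOCNO + " SAMPLE-0001 " + TERMINATING_DOCNO + NEW_LINE
                + "body text" + NEW_LINE
                + TERMINATING_DOC;
        System.out.println(extractDocNo(record).orElse("(no DOCNO)"));
    }
}
```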
| Constructor Detail |
|---|
public TrecContentSource()
| Method Detail |
|---|
public Date parseDate(String dateStr)
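public void close()
throws IOException
Description copied from class: ContentItemsSource
Called when reading from this content source is no longer required.
close in interface Closeable
close in class ContentItemsSource
Throws: IOException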
public DocData getNextDocData(DocData docData)
throws NoMoreDataException, IOException
Description copied from class: ContentSource
Returns the next DocData from the content source. Implementations must account for multi-threading, as multiple threads can call this method simultaneously.
getNextDocData in class ContentSource
Throws: NoMoreDataException, IOException
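One way to meet the multi-threading requirement is to guard the shared read position with a lock and fill the caller's object outside it. The sketch below shows that pattern under stated assumptions; it does not show how TrecContentSource itself is implemented. `DocDataSketch`, `NoMoreDataSketchException`, and `SynchronizedSourceSketch` are hypothetical stand-ins, so the example compiles without Lucene on the classpath.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-ins for NoMoreDataException and DocData.
class NoMoreDataSketchException extends Exception {}

class DocDataSketch {
    private String name;
    void setName(String name) { this.name = name; }
    String getName() { return name; }
}

final class SynchronizedSourceSketch {
    private final Iterator<String> names; // shared state, guarded by "this"

    SynchronizedSourceSketch(List<String> names) {
        this.names = names.iterator();
    }

    /** Hands out each input name exactly once, even with concurrent callers. */
    DocDataSketch getNextDocData(DocDataSketch docData) throws NoMoreDataSketchException {
        String next;
        synchronized (this) { // only the shared iterator needs the lock
            if (!names.hasNext()) {
                throw new NoMoreDataSketchException();
            }
            next = names.next();
        }
        docData.setName(next); // the caller-owned object is filled outside the lock
        return docData;
    }

    /** Drains a source from several threads and returns how many distinct names were seen. */
    static int drainConcurrently(int docCount, int threadCount) {
        List<String> input = new ArrayList<>();
        for (int i = 0; i < docCount; i++) {
            input.add("doc-" + i);
        }
        SynchronizedSourceSketch source = new SynchronizedSourceSketch(input);
        Set<String> seen = ConcurrentHashMap.newKeySet();
        List<Thread> workers = new ArrayList<>();
        for (int t = 0; t < threadCount; t++) {
            Thread worker = new Thread(() -> {
                DocDataSketch reusable = new DocDataSketch(); // one instance per thread
                try {
                    while (true) {
                        seen.add(source.getNextDocData(reusable).getName());
                    }
                } catch (NoMoreDataSketchException e) {
                    // input exhausted; this worker is done
                }
            });
            workers.add(worker);
            worker.start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(drainConcurrently(1000, 4)); // prints 1000
    }
}
```

Each worker reuses its own `DocDataSketch`, so the only contended state is the iterator inside the synchronized block.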
public void resetInputs()
throws IOException
Description copied from class: ContentItemsSource
Resets the input for this content source, so that the test would behave as if it was just started, input-wise.
NOTE: the default implementation resets the number of bytes and items generated since the last reset, so it's important to call super.resetInputs in case you override this method.
resetInputs in class ContentItemsSource
Throws: IOException
public void setConfig(Config config)
Description copied from class: ContentItemsSource
Sets the Config for this content source. If you override this method, you must call super.setConfig.
setConfig in class ContentItemsSource
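The two override rules above (call super.resetInputs and super.setConfig) can be illustrated with a small stand-alone sketch. `BaseSourceSketch` is a hypothetical stand-in for ContentItemsSource that keeps only byte and item counters and a config reference, with `java.util.Properties` standing in for Config. The `sketch.label` key is invented for the example.

```java
import java.util.Properties;

// Hypothetical stand-in for the base class; it models only the counters and the config.
class BaseSourceSketch {
    private long bytesCount;
    private int itemsCount;
    private Properties config;

    void addItem(long bytes) { itemsCount++; bytesCount += bytes; }
    int getItemsCount() { return itemsCount; }
    long getBytesCount() { return bytesCount; }
    Properties getConfig() { return config; }

    /** Resets the counters accumulated since the last reset. */
    void resetInputs() { bytesCount = 0; itemsCount = 0; }

    /** Stores the config so that getConfig() can return it later. */
    void setConfig(Properties config) { this.config = config; }
}

final class CustomSourceSketch extends BaseSourceSketch {
    private int position; // subclass-specific read position
    private String label; // subclass-specific setting

    void read(long bytes) { position++; addItem(bytes); }
    int getPosition() { return position; }
    String getLabel() { return label; }

    @Override
    void resetInputs() {
        super.resetInputs(); // without this call the base counters would keep growing
        position = 0;
    }

    @Override
    void setConfig(Properties config) {
        super.setConfig(config); // without this call getConfig() would return null
        label = config.getProperty("sketch.label", "unnamed"); // invented key
    }

    public static void main(String[] args) {
        CustomSourceSketch source = new CustomSourceSketch();
        source.setConfig(new Properties());
        source.read(10);
        source.read(5);
        source.resetInputs();
        System.out.println(source.getItemsCount() + " items, position " + source.getPosition());
    }
}
```

Omitting either super call in this sketch leaves the base state stale (counters that never reset, or a null config), which is the failure the notes above warn about.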