org.apache.lucene.benchmark.byTask.feeds
public interface HTMLParser
Method Summary | |
---|---|
DocData | parse(String name, Date date, Reader reader, DateFormat dateFormat): Parse the input Reader and return DocData. |
DocData | parse(String name, Date date, StringBuffer inputText, DateFormat dateFormat): Parse the inputText and return DocData. |
Parameters:
name - name of the result doc data. If null, an attempt is made to set it from the parsed data.
date - date of the result doc data. If null, an attempt is made to set it from the parsed data.
reader - reader of the HTML text to parse.
dateFormat - date formatter to use for extracting the date.
Returns: Parsed doc data.
Throws: IOException, InterruptedException
Parameters: inputText - the HTML text to parse.
See Also: HTMLParser
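
The relationship between the two parse overloads can be illustrated with a minimal sketch. It does not use Lucene itself: `SketchDocData`, `SketchHtmlParser`, and `NaiveHtmlParser` are hypothetical stand-ins that only mirror the signatures documented above, so the example compiles with the JDK alone. The regex-based tag stripping and the choice of the page title as the fallback name are assumptions made for illustration, not the behavior of any Lucene implementation. Date extraction is omitted, so `dateFormat` is accepted but unused.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.text.DateFormat;
import java.util.Date;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class HtmlParserSketch {

    // Hypothetical, simplified stand-in for Lucene's DocData.
    static class SketchDocData {
        final String name;
        final Date date;
        final String title;
        final String body;

        SketchDocData(String name, Date date, String title, String body) {
            this.name = name;
            this.date = date;
            this.title = title;
            this.body = body;
        }
    }

    // Stand-in interface mirroring the two documented parse signatures.
    interface SketchHtmlParser {
        SketchDocData parse(String name, Date date, Reader reader, DateFormat dateFormat)
                throws IOException, InterruptedException;

        SketchDocData parse(String name, Date date, StringBuffer inputText, DateFormat dateFormat)
                throws IOException, InterruptedException;
    }

    // Naive implementation: regex tag stripping, not a real HTML parser.
    static class NaiveHtmlParser implements SketchHtmlParser {
        private static final Pattern TITLE =
                Pattern.compile("<title>(.*?)</title>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);
        private static final Pattern TAG = Pattern.compile("<[^>]*>");

        @Override
        public SketchDocData parse(String name, Date date, Reader reader, DateFormat dateFormat)
                throws IOException, InterruptedException {
            // Read the whole input into memory.
            StringBuilder sb = new StringBuilder();
            char[] buf = new char[4096];
            int n;
            while ((n = reader.read(buf)) != -1) {
                sb.append(buf, 0, n);
            }
            String html = sb.toString();

            Matcher m = TITLE.matcher(html);
            String title = m.find() ? m.group(1).trim() : "";

            // If no name was supplied, fall back to parsed data (assumption: the title).
            String resolvedName = (name != null) ? name : title;

            // Drop the title element, replace remaining tags with spaces, collapse whitespace.
            String withoutTitle = m.replaceFirst("");
            String body = TAG.matcher(withoutTitle).replaceAll(" ").replaceAll("\\s+", " ").trim();

            // Date extraction is omitted in this sketch; the supplied date is passed through.
            return new SketchDocData(resolvedName, date, title, body);
        }

        @Override
        public SketchDocData parse(String name, Date date, StringBuffer inputText, DateFormat dateFormat)
                throws IOException, InterruptedException {
            // The StringBuffer overload simply wraps the text in a Reader and delegates.
            return parse(name, date, new StringReader(inputText.toString()), dateFormat);
        }
    }

    public static void main(String[] args) throws Exception {
        String html = "<html><head><title>Hello</title></head>"
                + "<body><p>Some <b>bold</b> text.</p></body></html>";
        SketchHtmlParser parser = new NaiveHtmlParser();
        SketchDocData doc = parser.parse(null, null, new StringBuffer(html), null);
        System.out.println(doc.name + " | " + doc.body);
    }
}
```

Delegating the `StringBuffer` overload to the `Reader` overload keeps the parsing logic in one place, which is one plausible reading of why the second method's documentation points back to the first.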