org.apache.lucene.benchmark.utils

Class ExtractReuters

public class ExtractReuters extends Object

Splits the Reuters SGML documents into simple text files, each containing the Title, Date, Dateline, and Body of one document.
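The splitting step described above can be illustrated with a minimal, self-contained sketch. It is not the implementation of ExtractReuters: the class name ReutersFieldSketch, the method extractFields, the regular expression, and the sample record are all assumptions made for illustration. It only assumes that each field appears as a paired SGML element such as <TITLE>...</TITLE>.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Lucene: pulls the four fields named in the
// class description out of one Reuters-style SGML record.
public class ReutersFieldSketch {

    // Matches a TITLE, DATE, DATELINE, or BODY element. DOTALL lets the body
    // span several lines; the backreference \1 requires a matching close tag.
    private static final Pattern FIELD = Pattern.compile(
            "<(TITLE|DATE|DATELINE|BODY)>(.*?)</\\1>", Pattern.DOTALL);

    // Returns tag name -> trimmed element text, in order of appearance.
    public static Map<String, String> extractFields(String sgml) {
        Map<String, String> fields = new LinkedHashMap<>();
        Matcher m = FIELD.matcher(sgml);
        while (m.find()) {
            fields.put(m.group(1), m.group(2).trim());
        }
        return fields;
    }

    public static void main(String[] args) {
        // Made-up record in the general shape of a Reuters SGML document.
        String sample = "<REUTERS>\n"
                + "<DATE>26-FEB-1987 15:01:01.79</DATE>\n"
                + "<TEXT>\n"
                + "<TITLE>SAMPLE HEADLINE</TITLE>\n"
                + "<DATELINE>    LONDON, Feb 26 - </DATELINE>"
                + "<BODY>First line of the story.\nSecond line.\n</BODY>"
                + "</TEXT>\n"
                + "</REUTERS>";
        extractFields(sample).forEach((tag, text) ->
                System.out.println(tag + ": " + text));
    }
}
```

A subclass that overrides extractFile(File sgmFile) could apply a different pattern of this kind to change which fields are kept, though the actual parsing logic inside ExtractReuters is not documented on this page.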
Constructor Summary
ExtractReuters(File reutersDir, File outputDir)
Method Summary
void extract()
protected void extractFile(File sgmFile)
Override if you wish to change what is extracted
static void main(String[] args)

Constructor Detail

ExtractReuters

public ExtractReuters(File reutersDir, File outputDir)

Method Detail

extract

public void extract()

extractFile

protected void extractFile(File sgmFile)
Override if you wish to change what is extracted

Parameters: sgmFile

main

public static void main(String[] args)
Copyright © 2000-2007 Apache Software Foundation. All Rights Reserved.