Package Bio :: Package Medline :: Module NLMMedlineXML |
|
This module provides code to work with NCBI's XML format for Medline.

Functions:
choose_format    Pick the right data format to use to index an XML file.
index            Index a Medline XML file.
index_many       Index multiple Medline XML files.

Classes | |
---|---|
Citation | Holds information about a Medline citation. |
CitationParser | Parses a citation into a Record object. |
_IndexerHandler | Handles the results from the nlmmedline_format. |
_SavedDataHandle | |

Function Summary | |
---|---|
choose_format(data) -> module | |
index(handle[, index_fn]) -> list of (PMID, MedlineID, start, end) | |
index_many(files_or_paths, index_fn[, nprocs]) | |

Function Details |
---|
choose_format(data) -> module. Look at some data and choose the right format to parse it. data should be the first 1000 characters or so of the file. The returned module has two attributes, citation_format and format: citation_format is a Martel format that parses one citation, and format parses the whole file. |
index(handle[, index_fn]) -> list of (PMID, MedlineID, start, end). Index a Medline XML file. Returns the location of each record as offsets from the beginning of the handle. index_fn is an optional callback (default None) that takes the parameters (PMID, MedlineID, start, end) and is called as soon as each record is indexed. |
index_many(files_or_paths, index_fn[, nprocs]). Index multiple Medline XML files. nprocs defaults to 1. files_or_paths can be a single file, a path, a list of files, or a list of paths. index_fn is a callback function called as index_fn(file, event, data), where file is the file being indexed, event is one of "START", "RECORD", or "END", and data is extra data that depends on the event. "START" and "END" events indicate when indexing of a file begins and ends. "RECORD" is passed whenever a new record has been indexed; in that case data is a tuple of (pmid, medline_id, start, end), where start and end give the location of the record as offsets from the beginning of the file. For all other events data is None. |
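
The callback protocol described for index_many can be sketched without the module itself. The example below is a minimal illustration, not code from this package: make_index_fn, read_record, the file name demo.xml, and the sample identifiers are all hypothetical, and the events are fired by hand where index_many would normally fire them. It also assumes that start and end are half-open offsets (end points just past the record), which the documentation above does not state.

```python
import io


def make_index_fn(index):
    """Return a callback with the (file, event, data) signature.

    `index` is a plain dict that the callback fills in as
    {file: [(pmid, medline_id, start, end), ...]}.
    Hypothetical helper, not part of the documented module.
    """
    def index_fn(file, event, data):
        if event == "START":
            # A new file is about to be indexed; data is None.
            index[file] = []
        elif event == "RECORD":
            # data is a (pmid, medline_id, start, end) tuple.
            index[file].append(data)
        elif event == "END":
            # Indexing of this file has finished; data is None.
            pass
        else:
            raise ValueError("unexpected event: %r" % (event,))
    return index_fn


def read_record(handle, start, end):
    """Read one record back using its offsets.

    Assumes `end` is exclusive; adjust if the real offsets are inclusive.
    """
    handle.seek(start)
    return handle.read(end - start)


# Stand-in data: two fake "records" in an in-memory file.
text = "<A>first</A><B>second</B>"

index = {}
callback = make_index_fn(index)

# Simulate the sequence of calls described in the documentation.
callback("demo.xml", "START", None)
callback("demo.xml", "RECORD", ("111", "aaa", 0, 12))
callback("demo.xml", "RECORD", ("222", "bbb", 12, 25))
callback("demo.xml", "END", None)

# Use the collected offsets to pull each record out of the file.
for pmid, medline_id, start, end in index["demo.xml"]:
    print(pmid, read_record(io.StringIO(text), start, end))
```

Collecting the tuples per file keeps the callback free of I/O, so the same index can later drive random access into the original files; the list returned by index has the same (PMID, MedlineID, start, end) shape and could be used with read_record in the same way.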
Generated by Epydoc 2.1 on Mon Aug 27 16:13:10 2007 | http://epydoc.sf.net |