POI 1.0 Vision Document

Preface

(21-Jan-02) While this document is full of useful introductory information about the project, and I do suggest that those interested in getting involved read it, it is woefully out of date. We deliberately allowed it to run out of date because it is a good reflection of what the original vision was for POI 1.0. You'll note that some of the terminology is no longer used in quite the same way. I've made some minor corrections where reading this confused me. An example: in some places this document may refer to the POI API instead of the POIFS API. When this vision was written we had an incomplete understanding of the project. Lastly, the scope of the project expanded dramatically near the end of the 1.0 cycle. Our vision at the time was to focus merely on the Excel port (having no idea how the project would grow or be received) and provide the OLE 2 Compound Document port so that others could port later formats. We now plan to spearhead these ports under the umbrella of the POI project.

So, you've been warned. Read on, but just realize that we had a fuzzy view of things to come, and hindsight is 20-20. If I recall, the major holes were: a complete understanding of the OLE 2 Compound Document format, the Excel file format, and exactly how Cocoon 2 Serializers worked. (That just about covers the whole range, huh?)

1. Introduction

1.1 Purpose of this document
The purpose of this document is to collect, analyze and define high-level requirements, user needs and features of the HSSF Serializer for Cocoon 2 and related libraries. The HSSF Serializer is a Java class that supports the Serializer interface from the Cocoon 2 project and outputs a format compatible with that used by the spreadsheet program Microsoft Excel '97. The HSSF Serializer will be responsible for converting XML spreadsheet-like documents into Excel-compatible XLS spreadsheets.

1.2 Project Overview
Many web apps today hit a brick wall when it comes to the user request that they be able to easily manipulate their reports and data extracts in the popular Microsoft Excel spreadsheet format. This often causes inferior technologies to be chosen for the project simply because they easily support this format. This project seeks to extend existing XML, Java and Apache Cocoon 2 project technologies by:

2. User Description

2.1 User/Market Demographics
There are a number of enthusiastic users of XML, UNIX and Java technology. In addition, the Microsoft solution for outputting Office document formats often involves actually manipulating the software as an OLE Server. This method provides extremely low performance and extremely high overhead, and is only capable of handling one document at a time.

2.2. User environment
The users of this software shall be developers in a Java environment on any operating system, or power users who are capable of XML document generation/deployment.

2.3. Key User Needs
The OLE 2 Compound Document format is undocumented for all practical purposes and cryptic for all impractical purposes. Developer needs in this area include documentation and an easy-to-use library for reading and writing in this format without requiring the developer to have intimate knowledge of the format.

There is currently no good way to write to Microsoft Excel documents from Java or, for that matter, from a non-Microsoft Windows based platform. Developers need an easy-to-use library that supports a reasonable feature set and allows separation of data from formatting/stylistic concerns.

There is currently no good way to transform XML data to Microsoft Excel. Apache's Cocoon 2 project supplies a complete framework for XML, but nothing for outputting in Excel's XLS format. Developers and power users alike need a simple method to output XML documents to Excel through server-side processing.

3. Project Overview

3.1. Project Perspective
The produced code shall be licensed under the Apache License as used by the Cocoon 2 project and maintained on a project page until such time as the Cocoon 2 developers accept it as a donation (at which time the copyright will be turned over to them).

3.2. Project Position Statement
For developers in a Java and/or XML environment, this project will provide all the tools necessary for outputting XML data in the Microsoft Excel format. This project seeks to make the use of Microsoft Windows based servers unnecessary for file format considerations and to fully document the OLE 2 Compound Document format.
The project aims not only to provide the tools for serializing XML to Excel's file format and the tools for writing to that file format from Java, but also to provide the tools for later projects to convert other OLE 2 Compound Document formats to pure Java APIs.

3.3. Summary of Capabilities
HSSF Serializer for Apache Cocoon 2
3.4. Assumptions and Dependencies

4. Project Features
The POIFS API will include:
The HSSF API will include:

4.1 POI Filesystem API
The POI Filesystem API includes:

4.2 HSSF API
The HSSF API includes:

4.3 HSSF Serializer
The HSSF Serializer subproject:

5. Other Product Requirements

5.1. Applicable Standards
All Java code will be 100% pure Java.

5.2. System Requirements
The minimum system requirements for POIFS are:
The minimum system requirements for HSSF are:
The minimum system requirements for the HSSF Serializer are:

5.3. Performance Requirements
All components must perform well enough to be practical for use in a webserver environment (especially the Cocoon2/Tomcat/Apache combo).

5.4. Environmental Requirements
The software will run primarily in developer environments. We should make some allowances for not-highly-technical users to write XML documents for the HSSF Serializer. All other components will assume intermediate Java 2 knowledge. No XML knowledge will be required except for using the HSSF Serializer. As much documentation as is practical shall be required for all components, as XML is relatively new, and the concepts introduced for writing spreadsheets and writing to POI filesystems will be brand new to Java and to many Java developers.

6. Documentation Requirements

6.1 POI Filesystem
The filesystem as read and written by POI shall be fully documented and explained so that the average Java developer can understand it.

6.2. POI API
The POI API will be fully documented through Javadoc. A walkthrough of using the high-level POI API shall be provided. No documentation outside of the Javadoc shall be provided for the low-level POI APIs.

6.3. HSSF File Format
The HSSF File Format as implemented by the HSSF API will be fully documented. No documentation will be provided for features that are supported by the Excel 97 File Format but not by the HSSF API. Care will be taken not to infringe on any "legal stuff".

6.4. HSSF API
The HSSF API will be documented through Javadoc. A walkthrough of using the high-level HSSF API shall be provided. No documentation outside of the Javadoc shall be provided for the low-level HSSF APIs.

6.5. HSSF Serializer
The HSSF Serializer will be documented through Javadoc.

6.6 HSSF Serializer Tag language
The XML tag language, along with its function and usage, shall be fully documented. Examples will be provided as well.

7. Terminology

7.1 Filesystem
The term "filesystem" shall refer only to the POI formatted archive.

7.2 File
The term "file" shall refer to the embedded data stream within a POI filesystem. This will be the actual embedded document.