community-commits mailing list archives

From "Rob Weir (JIRA)" <>
Subject [jira] [Created] (COMDEV-76) [GSoC] Test Document Generator/Permutator
Date Fri, 22 Mar 2013 15:51:15 GMT
Rob Weir created COMDEV-76:

             Summary: [GSoC] Test Document Generator/Permutator
                 Key: COMDEV-76
             Project: Community Development
          Issue Type: New Feature
            Reporter: Rob Weir

Apache OpenOffice is the leading open source desktop office suite.   Our most recent release
has had over 40 million downloads.

The default document format for OpenOffice is Open Document Format (ODF).  But we can also
work with Microsoft document formats, including the legacy binary formats (DOC/XLS/PPT) and the
newer XML formats (DOCX/XLSX/PPTX).

A continuing challenge is finding an efficient way to test our support of these document formats.
 It is extremely laborious to create test documents.  Imagine, for example, that we want to verify
that we can correctly process table cell formatting.  We have variations in text styles, in
border styles, in fills, in alignment, etc.  A complete test would require a large number
of manually created test cases.

Is it possible to do better than this?  Can test documents be automatically generated?  

Presumably, yes, they can be automatically generated.  We have open source libraries, in Java,
that can read and write ODF and Microsoft documents:

The Apache ODF Toolkit for ODF documents:

Apache POI for Microsoft documents:

But can this be made really easy, so that a QA tester, not a programmer, can generate test cases
easily?  Can we find a way to specify a test scenario and then generate a range of test
documents in all three formats?

Can we be smart about this and generate complete X*Y*Z sets of test cases as well as fractional
factorial designs?  For example,
the factors for a text style might be: typeface, font size, weight, color, background color
and alignment.  A test of all combinations would lead to an enormous number of test cases,
because of the huge number of colors and typefaces.  But to be useful, we only need a subset
of these test cases, the ones that are likely to reveal bugs.  How can we be intelligent about
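
To make the X*Y*Z idea above concrete, here is a minimal Java sketch.  It is only an
illustration under stated assumptions: the class TestCasePermutator, its method names and
the factor levels are all hypothetical and are not part of the ODF Toolkit or Apache POI.
It builds the complete set of combinations, then cuts it down with a simple greedy
pairwise-coverage filter.  That filter is a stand-in for a true fractional factorial
design, not an implementation of one, and it does not produce a minimal set.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical helper for illustration; not part of the ODF Toolkit or Apache POI.
final class TestCasePermutator {

    // Complete X*Y*Z set: one map (factor -> level) per combination of levels.
    static List<Map<String, String>> fullFactorial(Map<String, List<String>> factors) {
        List<Map<String, String>> cases = new ArrayList<>();
        cases.add(new LinkedHashMap<>());
        for (Map.Entry<String, List<String>> factor : factors.entrySet()) {
            List<Map<String, String>> expanded = new ArrayList<>();
            for (Map<String, String> partial : cases) {
                for (String level : factor.getValue()) {
                    Map<String, String> next = new LinkedHashMap<>(partial);
                    next.put(factor.getKey(), level);
                    expanded.add(next);
                }
            }
            cases = expanded;
        }
        return cases;
    }

    // Every pair of "factor=level" settings that a single test case exercises.
    static List<String> pairsOf(Map<String, String> testCase) {
        List<String> settings = new ArrayList<>();
        for (Map.Entry<String, String> e : testCase.entrySet()) {
            settings.add(e.getKey() + "=" + e.getValue());
        }
        List<String> pairs = new ArrayList<>();
        for (int i = 0; i < settings.size(); i++) {
            for (int j = i + 1; j < settings.size(); j++) {
                pairs.add(settings.get(i) + " & " + settings.get(j));
            }
        }
        return pairs;
    }

    // Greedy filter: keep a case only if it covers at least one pair not seen so far.
    // Every pair in the full set stays covered. Assumes at least two factors.
    static List<Map<String, String>> pairwiseSubset(List<Map<String, String>> all) {
        Set<String> covered = new HashSet<>();
        List<Map<String, String>> kept = new ArrayList<>();
        for (Map<String, String> testCase : all) {
            List<String> pairs = pairsOf(testCase);
            if (!covered.containsAll(pairs)) {
                covered.addAll(pairs);
                kept.add(testCase);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // Example factor levels are made up for this sketch.
        Map<String, List<String>> factors = new LinkedHashMap<>();
        factors.put("weight", Arrays.asList("normal", "bold"));
        factors.put("alignment", Arrays.asList("left", "center", "right"));
        factors.put("fontSize", Arrays.asList("10pt", "12pt", "24pt"));

        List<Map<String, String>> all = fullFactorial(factors);
        List<Map<String, String>> reduced = pairwiseSubset(all);
        System.out.println(all.size() + " full cases, " + reduced.size() + " after pairwise filter");
    }
}
```

Each resulting map could then be handed to a format-specific writer to produce one test
document.  A real tool would likely want a proper covering-array or fractional factorial
generator in place of the greedy filter, since the greedy pass depends on iteration order.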

The specifications for the document formats are available as well.  So we have a formal description
of the schema for ODF and OpenXML.  Is that information useful?  Can we have "schema-directed
test document creation"?

As you can see, there is a broad range of things that could be done here, limited only by
the time, skill and interests of the student.  One could easily develop new ideas and research
here that could be publishable.  The results would be useful to Apache OpenOffice, of course,
but could potentially be applicable more broadly, to other products and other markup languages.

Skills needed:

-- Java programming ("Core Java"), good working knowledge, but don't need to be a guru or

-- Knowledge of XML

-- Helps to have some awareness of QA, e.g., what "test coverage" is and why it is important.

This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see:
