xmlgraphics-fop-dev mailing list archives

From "Piyush Khandelwal (Jira)" <j...@apache.org>
Subject [jira] [Updated] (FOP-2937) [PATCH]Post PDF generation, Soft reference of PDFObject in PDFReference are not immediately garbage collected leading to excessive memory usage.
Date Mon, 18 May 2020 09:19:00 GMT

     [ https://issues.apache.org/jira/browse/FOP-2937?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Piyush Khandelwal updated FOP-2937:
-----------------------------------
    Summary: [PATCH]Post PDF generation, Soft reference of PDFObject in PDFReference are not
immediately garbage collected leading to excessive memory usage.  (was: Post PDF generation,
Soft reference of PDFObject in PDFReference are not immediately garbage collected leading
to excessive memory usage.)

> [PATCH]Post PDF generation, Soft reference of PDFObject in PDFReference are not immediately
garbage collected leading to excessive memory usage.
> ------------------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: FOP-2937
>                 URL: https://issues.apache.org/jira/browse/FOP-2937
>             Project: FOP
>          Issue Type: Improvement
>    Affects Versions: 2.3, 2.4
>            Reporter: Piyush Khandelwal
>            Priority: Major
>         Attachments: pdfreference.patch
>
>
> A PDFReference object holds a SoftReference to its PDFObject (PDFPage, PDFLabel, PDFName, etc.).
> When generating a huge PDF (*I tried with a PDF of around 150 thousand pages and
12 GB of RAM*), many of these references linger, waiting for the garbage collector to
collect them.
> But the GC won't collect them as long as the JVM can recover enough memory without throwing
an out-of-memory error.
> Here is some data from my testing to help explain the issue:
> Stats for generating 1 PDF:
> *FO size:* 2.03 GB
> *Generated PDF No. of Pages:* Around 150 K
> RAM: 12 GB
> Peak memory reached during generation: 11.3 GB
> Residual memory after forced GC: 9 GB
> The FO mainly contains tabular data, with each page sequence having a maximum of 500 rows.
> Analyzing the memory dump showed many references to PDFPage, PDFName, etc.
> *Question:* Is there any specific reason for using SoftReference in the PDFReference class
instead of WeakReference?
> Testing with SoftReference changed to WeakReference in PDFReference shows the following improvements,
without any issues in generation:
> Stats for generating 5 PDFs in parallel:
> *FO size:* 2.03 GB
> *Generated PDF No. of Pages:* Around 150 K
> RAM: 12 GB
> Peak memory reached during generation: 4 GB
> Residual memory after forced GC: 300 MB
> So, by changing SoftReference to WeakReference, I was able to generate 5 PDFs of 150K
pages each in parallel with a maximum of 4 GB of RAM, without any generation issues.
> These figures show a clear memory benefit from changing to WeakReference.
> But as I don't understand the complete internal details of how FOP works, I would like
to understand whether we can target this change and, if not, what the reason is for using SoftReference.
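
For readers unfamiliar with the distinction raised in the description, here is a minimal Java sketch of how the two reference types differ. It is not FOP code: the class RefDemo and its hold() helper are hypothetical names made up for illustration, and the byte array only stands in for a PDF object. The outcome after System.gc() depends on the JVM and on heap pressure, so the comments describe typical behaviour rather than a guarantee.

```java
import java.lang.ref.Reference;
import java.lang.ref.SoftReference;
import java.lang.ref.WeakReference;

// Hypothetical demo class; it does not reproduce FOP's PDFReference.
class RefDemo {

    // Wraps the target in a WeakReference (weak == true) or a SoftReference.
    static <T> Reference<T> hold(T target, boolean weak) {
        if (weak) {
            return new WeakReference<T>(target);
        }
        return new SoftReference<T>(target);
    }

    public static void main(String[] args) {
        byte[] payload = new byte[1024 * 1024]; // stand-in for a PDF object
        Reference<byte[]> soft = hold(payload, false);
        Reference<byte[]> weak = hold(payload, true);

        // While a strong reference exists, both wrappers return the object.
        System.out.println("soft holds object: " + (soft.get() == payload));
        System.out.println("weak holds object: " + (weak.get() == payload));

        payload = null; // drop the only strong reference
        System.gc();    // a request only; the JVM is free to ignore it

        // Typical, but not guaranteed, with plenty of free heap:
        // the weak referent has been cleared, the soft referent is still held.
        System.out.println("soft cleared after GC: " + (soft.get() == null));
        System.out.println("weak cleared after GC: " + (weak.get() == null));
    }
}
```

A weakly reachable object becomes eligible for clearing at the next collection, whereas the JVM may keep a softly reachable object until memory runs short. It only guarantees that soft references are cleared before an OutOfMemoryError is thrown. That would be consistent with the high residual memory reported above, although the sketch itself does not measure FOP.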



--
This message was sent by Atlassian Jira
(v8.3.4#803005)
