Need help with temporary files

Norbert Thiebaud nthiebaud at gmail.com
Wed Jul 22 08:49:19 PDT 2015


On Wed, Jul 22, 2015 at 10:20 AM, Caolán McNamara <caolanm at redhat.com> wrote:
> On Wed, 2015-07-22 at 13:15 +0200, Noel Grandin wrote:
>>
>> On 2015-07-22 01:07 PM, Caolán McNamara wrote:
>> > On Wed, 2015-07-22 at 10:43 +0100, Caolán McNamara wrote:
>> >> On Wed, 2015-07-22 at 10:28 +0100, Caolán McNamara wrote:
>> >>> On Tue, 2015-07-21 at 22:41 +0200, Matúš Kukan wrote:
>> >>>> Hi there,
>> >>>>
>> >>>> I am working on a bug around saving a big file in Writer:
>> >>>> https://bugs.documentfoundation.org/show_bug.cgi?id=88314
>> >>>>
>> >> E_MFILE, too many open files, so the problem is a file handle leak.
>> >
>> > See https://gerrit.libreoffice.org/#/c/17289/ for a possible solution.
>> > That odt has > 14k files in it and in parallel deflate mode each one
>> > gets a separate ZipOutputEntry which all exist at the same time until
>> > the threads are completed. Each ZipOutputEntry has an open temp file so
>> > it runs out of file handles.
>> >
>>
>> We should be using some kind of task-manager/thread-pool to limit the number of active threads to something reasonable.
>> Running more than approx no_cores * 3 threads is going to __reduce__ performance.
>
> That's already the case, there are only no_cores active threads at a
> time, it's just that the final .zip is stitched up from 14k Entries (each
> with an open stream) at the end of the process. While in non-thread mode
> each entry is created, processed and disposed of serially so
> there's only one entry alive at a time.

Couldn't you have an (n+1)th thread whose job is to collect the worker
threads' temporary output and process it as you go?
I assume the last step is to take all the fragments and put them into
one final big file. That could be done by having each worker thread use
a pipe to send its output to a 'merger' thread (one output fragment at
a time, with a minimal header), which pulls from these pipes using
select() or poll().
As each worker thread finishes, it posts an EOT message and closes its
pipe. The merger thread then also plays the role of the thread joiner
that most likely exists today anyway.

You would have at most 2*n + 1 fds open for that, where n is the
number of cores.
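
A rough sketch of the pipe-plus-merger idea in Python, purely to
illustrate the shape of it. This is not the LibreOffice code: the names
(HEADER, worker, merge, run), the 8-byte header layout, and the use of
zlib.compress as a stand-in for deflating a zip entry are all
assumptions made up for the example. It also assumes a POSIX system,
where pipes can be polled.

```python
import os
import selectors
import struct
import threading
import zlib

# Hypothetical minimal header: entry id and payload length.
HEADER = struct.Struct("<II")


def worker(write_fd, entries):
    """Deflate each (entry_id, data) pair and stream it down the pipe."""
    with os.fdopen(write_fd, "wb") as pipe:
        for entry_id, data in entries:
            payload = zlib.compress(data)
            pipe.write(HEADER.pack(entry_id, len(payload)) + payload)
    # Leaving the with-block closes the write end: that is the EOT signal.


def merge(read_fds, sink):
    """Pull complete fragments from whichever pipe is ready into sink."""
    sel = selectors.DefaultSelector()
    buffers = {fd: bytearray() for fd in read_fds}
    for fd in read_fds:
        sel.register(fd, selectors.EVENT_READ)
    while buffers:
        for key, _ in sel.select():
            fd = key.fd
            chunk = os.read(fd, 65536)
            if not chunk:  # EOF: this worker has finished
                sel.unregister(fd)
                os.close(fd)
                del buffers[fd]
                continue
            buf = buffers[fd]
            buf += chunk
            # Emit every complete fragment currently buffered.
            while len(buf) >= HEADER.size:
                entry_id, length = HEADER.unpack_from(buf)
                end = HEADER.size + length
                if len(buf) < end:
                    break
                sink.append((entry_id, bytes(buf[HEADER.size:end])))
                del buf[:end]
    sel.close()


def run(jobs, n_workers):
    """Spread jobs over n_workers threads; return fragments as they arrive."""
    read_fds, threads = [], []
    for i in range(n_workers):
        r, w = os.pipe()
        read_fds.append(r)
        t = threading.Thread(target=worker, args=(w, jobs[i::n_workers]))
        t.start()
        threads.append(t)
    sink = []
    merge(read_fds, sink)  # the merger runs in the calling thread here
    for t in threads:
        t.join()
    return sink
```

Only the 2*n pipe ends are open at any time, plus whatever the sink
holds (one fd if it is the final file), regardless of how many entries
there are. In this sketch the sink is just a list and the merger runs
in the caller's thread rather than a dedicated one.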

Norbert

PS: I'm just bikeshedding based on generic concepts. I do not know the
specifics of that particular code.

