[poppler] Question on policy w.r.t. code modernization

Carlos Garcia Campos carlosgc at gnome.org
Tue Dec 29 11:28:28 PST 2015


Albert Astals Cid <aacid at kde.org> writes:

> El Tuesday 29 December 2015, a les 12:28:28, Adam Reichold va escriure:
>> Hello again,
>> 
>> Am 29.12.2015 um 12:04 schrieb Albert Astals Cid:
>> > El Monday 28 December 2015, a les 20:32:31, Adam Reichold va escriure:
>> >> Hello,
>> >> 
>> >> Am 28.12.2015 um 18:55 schrieb Albert Astals Cid:
>> >>> El Friday 25 December 2015, a les 14:38:12, Adam Reichold va escriure:
>> >>>> Hello,
>> >>>> 
>> >>>> remembering how I tried to push some minor optimizations to GooString
>> >>>> some time ago, I wanted to ask about the project's policy on how to
>> >>>> handle code modernizations in the future.
>> >>>> 
>> >>>> To me, it seems like stuff that is pervasive like GooList or GooString
>> >>>> is unlikely to be changed with the currently available man power due to
>> >>>> the risk of regressions. One way of resolving this is IMHO to try to
>> >>>> remove the code completely and replace it with externally maintained
>> >>>> and
>> >>>> highly optimized components, e.g. using std::vector and std::string
>> >>>> from
>> >>>> the standard library instead of GooList and GooString. But this will
>> >>>> probably make it rather hard to merge xpdf changes in the future. Then
>> >>>> again, this seems to be hardly happening as is.
>> >>> 
>> >>> Do we actually get any benefit for using std::string instead of
>> >>> GooString?
>> >>> I can only see it creating regressions and lots of work for what is
>> >>> probably a very marginal speed improvement (happy to be proven wrong)
>> >>> and
>> >>> also the class does not give us any maintainance overhead since it's
>> >>> "done".
>> >> 
>> >> Basically, these are just examples to start this general discussion with
>> >> a concrete narrative. And while these modules certainly do what they
>> >> were intended to do, I would argue that their interfaces and
>> >> implementations are inferior to what most of the standard libraries on
>> >> various platforms provide. Also, better usage of the STL would improve
>> >> type safety, e.g. std::vector<T> instead of GooList storing void*, which
>> >> could also help with some of the issue mentioned below.
>> >> 
>> >> What is considered marginal or done, can be debated IMHO. Poppler is
>> >> used by a lot of people, so even saving a percent or two of runtime or
>> >> memory would surely save quite a bit in absolute terms.
>> > 
>> > Yeah, that very same lot of people will complain if you change the API
>> > (even if we've told them not to use it lots of people do).
>> > 
>> >> Also similar
>> >> projects did make similar optimizations for a possibly smaller user
>> >> base, e.g. Okular's text processing code does employ the more compact
>> >> small buffer optimization I proposed for GooString. Of course, the same
>> >> numbers imply opposite relationships w.r.t. regressions which is why I
>> >> would like to know how the maintainers think about this.
>> >> 
>> >> Phrased more generally, my question is whether this project sees room to
>> >> change things that are not broken. Or an explicit statement that changes
>> >> are restricted to adding features and fixing bugs if that is the case,
>> >> i.e. if the fundamentals of Poppler are considered "done".
>> > 
>> > Personally I think replacing GooString with std::string everywhere is a
>> > hell lot of work and I will not do it in my spare time.
>> 
>> This is definitely not about trying to tell you what to do in your spare
>> time. It is about what kind of patches actually have a chance of being
>> reviewed and accepted, e.g. if they significantly break away from
>> Poppler's xpdf roots. (Also, the GooList and GooString examples are just
>> examples. Fixing const methods and copy constructors everywhere is
>> pretty the same category IMHO.)
>> 
>> > If you want to do it, prove there's no regressions and prove there's a
>> > measurable gain, we can talk about merging the patch.
>> 
>> So can I interpret this together with Carlos' answer as: One can
>> significantly break away from the current code organization, but the bar
>> w.r.t. test cases and benchmarks will be pretty high? 
>
> Any patch needs to provide and improvement better than the risk it causes.

Indeed, code readability or less error prone code are also improvements.

> That goes with all the patches be it new features, bug fixes or internal 
> rewrites.
>
> The thing is it is genearlly easier to match that condition for new features 
> and bug fixes than for internal rewrites, but the conditions are the same.
>
>> (I am not sure if
>> "prove" is to be taken literally? Because if so, I would indeed
>> interpret this as a strict no.)
>
> If it doesn't cause regressions over the thousands of files i have here it is 
> good enough ;)

Yes, my point was that merging xpdf changes shouldn't be a reason to not
improve our code anymore. But of course any rewrite shouldn't introduce
any regression. Unfortunately our regression tools don't measure
performance at all.

> Cheers,
>   Albert
>
>> 
>> Best regards, Adam.
>> 
>> > Cheers,
>> > 
>> >   Albert
>> >   
>> >>> If we're going to diverge for the sake of modernizing the source code i
>> >>> think that makes more sense adding overrides to overriden functions,
>> >>> adding const to const functions and making sure that classes that should
>> >>> not be copied/ assigned by value have the copy constructor and
>> >>> assignment
>> >>> operators deleted.
>> >> 
>> >> I completely agree. As stated above, the chosen examples were just what
>> >> got me started thinking about this again recently.
>> >> 
>> >>>> So I guess that basically, my question is on the opinions of the
>> >>>> maintainers to detach the Poppler code base from the xpdf code base as
>> >>>> a
>> >>>> prerequisite to thoroughly modernizing it. Related to this is the
>> >>>> question on whether raising the language standard to C++11 or C++14 is
>> >>>> considered possible.
>> >> 
>> >> Best regards, Adam.
>> > 
>> > _______________________________________________
>> > poppler mailing list
>> > poppler at lists.freedesktop.org
>> > http://lists.freedesktop.org/mailman/listinfo/poppler
>
> _______________________________________________
> poppler mailing list
> poppler at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/poppler

-- 
Carlos Garcia Campos
PGP key: http://pgp.mit.edu:11371/pks/lookup?op=get&search=0x523E6462
-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 180 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/poppler/attachments/20151229/842c854e/attachment-0001.sig>


More information about the poppler mailing list