<html xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:w="urn:schemas-microsoft-com:office:word" xmlns:m="http://schemas.microsoft.com/office/2004/12/omml" xmlns="http://www.w3.org/TR/REC-html40">
<head>
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered medium)">
<style><!--
/* Font Definitions */
@font-face
        {font-family:"Cambria Math";
        panose-1:2 4 5 3 5 4 6 3 2 4;}
@font-face
        {font-family:Calibri;
        panose-1:2 15 5 2 2 2 4 3 2 4;}
/* Style Definitions */
p.MsoNormal, li.MsoNormal, div.MsoNormal
        {margin:0in;
        font-size:11.0pt;
        font-family:"Calibri",sans-serif;}
a:link, span.MsoHyperlink
        {mso-style-priority:99;
        color:blue;
        text-decoration:underline;}
span.EmailStyle19
        {mso-style-type:personal-reply;
        font-family:"Calibri",sans-serif;
        color:windowtext;}
.MsoChpDefault
        {mso-style-type:export-only;
        font-size:10.0pt;}
@page WordSection1
        {size:8.5in 11.0in;
        margin:1.0in 1.0in 1.0in 1.0in;}
div.WordSection1
        {page:WordSection1;}
--></style>
</head>
<body lang="EN-US" link="blue" vlink="purple" style="word-wrap:break-word">
<div class="WordSection1">
<p class="MsoNormal">I believe the issue is that Evince, when copying and pasting text, does *<b>not</b>* look at the tags but instead just uses the content stream (via TextOutputDev or equivalent).<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<p class="MsoNormal">Leonard<o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div style="border:none;border-top:solid #B5C4DF 1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal" style="margin-bottom:12.0pt"><b><span style="font-size:12.0pt;color:black">From:
</span></b><span style="font-size:12.0pt;color:black">poppler &lt;poppler-bounces@lists.freedesktop.org&gt; on behalf of Germán Poo-Caamaño &lt;gpoo@gnome.org&gt;<br>
<b>Date: </b>Thursday, June 24, 2021 at 9:49 AM<br>
<b>To: </b>poppler@lists.freedesktop.org &lt;poppler@lists.freedesktop.org&gt;<br>
<b>Subject: </b>Re: [poppler] Why poppler, which supports tagged PDFs, doesn't recognize some of the tags as a whole?<o:p></o:p></span></p>
</div>
<div>
<p class="MsoNormal">On Thu, 2021-06-24 at 10:44 +0200, Albert Astals Cid wrote:<br>
> On Thursday, 24 June 2021, at 7:48:45 (CEST), Denis Bitouzé<br>
> wrote:<br>
> > Hi,<br>
> > <br>
> > the attached `test.pdf` file is properly tagged, as you can check by<br>
> > loading it at:<br>
> > <br>
> >   ┌────<br>
> >   │ <a href="https://www.ngpdf.com/loadFile">https://www.ngpdf.com/loadFile</a><br>
> >   └────<br>
> > <br>
> > and then looking at:<br>
> > <br>
> >   ┌────<br>
> >   │ <a href="https://www.ngpdf.com/editor/editFile">https://www.ngpdf.com/editor/editFile</a><br>
> >   └────<br>
> > <br>
> > You can see that each line of the code:<br>
> > <br>
> >   ┌────<br>
> >   │ \pdfdict_new:n   {l_my_action_dict}<br>
> >   │ \pdfdict_put:nnn {l_my_action_dict}{Type}{/Action}<br>
> >   │ \pdfdict_put:nnn {l_my_action_dict}{S}{/URI}<br>
> >   │ \pdfdict_put:nnn {l_my_action_dict}{URI}{(<a href="https://www.latex-project.org/">https://www.latex-project.org/</a>)}<br>
> >   └────<br>
> > <br>
> > is a single tag.<br>
> > <br>
> > Nevertheless this code, if copied e.g. from Evince 3.38.1, is<br>
> > pasted not as it is but as:<br>
> <br>
> That would be a question for the Evince developers (some of them are<br>
> on this list, I guess, so you may still get an answer).<br>
> <br>
> The fact that poppler has facilities to "see" the contents of tagged<br>
> PDFs doesn't mean that Evince is using them.<br>
<br>
I am unsure what the report or question is about. Is it about<br>
presenting/seeing each tag separately or copying/pasting the text in<br>
the tags?<br>
<br>
If the latter, that corresponds to poppler-glib.<br>
<br>
-- <br>
Germán Poo-Caamaño<br>
<a href="https://calcifer.org/">https://calcifer.org/</a><br>
<br>
<br>
<br>
_______________________________________________<br>
poppler mailing list<br>
poppler@lists.freedesktop.org<br>
<a href="https://lists.freedesktop.org/mailman/listinfo/poppler">https://lists.freedesktop.org/mailman/listinfo/poppler</a><o:p></o:p></p>
</div>
</div>
</body>
</html>