Bug 49864

Summary: Wrong font id used when first word of a line has certain style applied (xml)
Product: poppler
Component: pdftohtml
Reporter: Luis Parravicini <lparravi>
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED MOVED
Severity: normal
Priority: medium
CC: koleygr
Version: unspecified
Hardware: All
OS: other
Attachments: Test files to reproduce the bug

Description Luis Parravicini 2012-05-13 05:46:10 UTC
Created attachment 61552
Test files to reproduce the bug

When generating an XML version of a PDF, the font id assigned to a line of text seems to be that of the first word of that line.

This causes the following bug: if the first word in a line is in italics, the font id output for the whole line is that of the italic word, not that of the rest of the line.

I've created a file in LibreOffice (I've come across this problem with PDFs created by other programs, so it's not a problem in the way LibreOffice generates the PDF) with four lines like the following text (italic words are marked here with <i> tags):

Line 1
line 2
<i>line</i> 3
line <i>4</i>

All the text has the same font and size applied. The XML generated is:

<page number="1" position="absolute" top="0" left="0" height="1263" width="892">
        <fontspec id="0" size="16" family="Times" color="#000000"/>
        <fontspec id="1" size="16" family="Times" color="#000000"/>
<text top="85" left="85" width="46" height="20" font="0">Line 1</text>
<text top="106" left="85" width="41" height="20" font="1"><i>line</i> 2</text>
<text top="126" left="85" width="40" height="20" font="0">line 3</text>
<text top="147" left="85" width="41" height="20" font="0">line <i>4</i></text>
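
The pattern in the output above can be checked mechanically. The following is a minimal Python sketch, not part of poppler: it parses the quoted <page> fragment with the standard library, reports which lines start with an italic word, and groups fontspec entries whose attributes are identical. The helper names (summarize_lines, duplicate_fontspecs) are hypothetical, and the closing </page> tag is an assumption added so that the fragment parses.

```python
import xml.etree.ElementTree as ET

# The <page> fragment quoted in the report. The closing </page> tag is an
# assumption added here so that the fragment is well-formed XML.
SAMPLE = """<page number="1" position="absolute" top="0" left="0" height="1263" width="892">
        <fontspec id="0" size="16" family="Times" color="#000000"/>
        <fontspec id="1" size="16" family="Times" color="#000000"/>
<text top="85" left="85" width="46" height="20" font="0">Line 1</text>
<text top="106" left="85" width="41" height="20" font="1"><i>line</i> 2</text>
<text top="126" left="85" width="40" height="20" font="0">line 3</text>
<text top="147" left="85" width="41" height="20" font="0">line <i>4</i></text>
</page>"""


def summarize_lines(page_xml):
    """Return (font id, starts_with_italic, plain text) for each <text> element."""
    page = ET.fromstring(page_xml)
    rows = []
    for text in page.iter("text"):
        children = list(text)
        # A line starts with italics when nothing precedes its first child
        # and that child is an <i> element.
        starts_italic = (
            bool(children)
            and children[0].tag == "i"
            and not (text.text or "").strip()
        )
        rows.append((text.get("font"), starts_italic, "".join(text.itertext())))
    return rows


def duplicate_fontspecs(page_xml):
    """Group fontspec ids that share identical size, family and color."""
    page = ET.fromstring(page_xml)
    groups = {}
    for spec in page.iter("fontspec"):
        key = (spec.get("size"), spec.get("family"), spec.get("color"))
        groups.setdefault(key, []).append(spec.get("id"))
    return [ids for ids in groups.values() if len(ids) > 1]


if __name__ == "__main__":
    for font_id, starts_italic, content in summarize_lines(SAMPLE):
        print(font_id, starts_italic, repr(content))
    print(duplicate_fontspecs(SAMPLE))
```

On the quoted fragment, the only line flagged as starting with italics is also the only one carrying font="1", and fontspec 0 and 1 have identical size, family and color, so the id alone does not tell the two apart.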

Comment 1 Luis Parravicini 2012-05-13 05:51:50 UTC
This was tested with pdftohtml 0.20.0.

Comment 2 GitLab Migration User 2018-08-20 21:49:13 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/91.
