Bug forwarded from Evince: https://bugzilla.gnome.org/show_bug.cgi?id=632365
"The attached PDF shows the new Indian Rupee Sign (₹) exported from Inkscape;
with the smaller text at the bottom of the PDF it is possible to drag over and
select it for copy and paste.
However, neither "Select All", or dragging over the symbol itself highlights
the large symbol for copy-and-paste.
The large symbol is textual in nature (this can be confirmed by re-opening the
PDF in Inkscape, or using pdftohtml).
Ideally it would be possible to highlight and select all text in a document,
regardless of size. In this case the document had been created specifically to
encourage people to copy-and-paste the correct symbol into their own documents!"
The test case is attached to the original bug report. I can confirm that pdftotext doesn't include the first large symbol, while acroread allows selecting and copy/pasting it.
Created attachment 70502
Allow large chars in TextPage
The large symbol is not selectable because TextPage::addChar rejects characters larger than the page size. This patch removes that check, though I do not know why it was added in the first place.
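For readers following along, the effect of the check can be sketched roughly as below. This is a simplified illustration and not the actual poppler code: GlyphBox, PageSize, acceptsGlyph and the rejectOversized flag are hypothetical names invented for this sketch.

```cpp
// Hypothetical, simplified stand-ins; these are not poppler's real types.
struct GlyphBox {
    double width;   // glyph width in page units
    double height;  // glyph height in page units
};

struct PageSize {
    double width;
    double height;
};

// Returns true if the glyph would be kept in the text page.
// With rejectOversized == true this mimics the behaviour described above:
// a glyph wider or taller than the page is dropped.
// With rejectOversized == false (the behaviour after the patch, as
// described in this comment) the size alone never causes rejection.
bool acceptsGlyph(const GlyphBox &glyph, const PageSize &page, bool rejectOversized)
{
    if (rejectOversized &&
        (glyph.width > page.width || glyph.height > page.height)) {
        return false;
    }
    return true;
}
```

With the size check enabled, a glyph taller than the page is rejected and never becomes selectable; with it disabled, the same glyph is kept.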
Created attachment 70504
Use page size for max value in TextPage::visitSelection
The previous patch will cause TextPage::visitSelection to skip the "Indian ₹upee Sign" text because its bottom edge falls outside the page size. This also affects poppler_page_get_text, which indirectly calls visitSelection.
This patch fixes that by using the page size if the TextBlock's border is outside the page.
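The idea can be sketched as a clamp of the block's maximum coordinates to the page size. This is only an illustration under assumptions: Rect and clampMaxToPage are hypothetical names, and the real visitSelection code is structured differently.

```cpp
#include <algorithm>

// Hypothetical bounding box; not poppler's TextBlock.
struct Rect {
    double xMin;
    double yMin;
    double xMax;
    double yMax;
};

// If the block's right or bottom border lies outside the page,
// use the page size as the maximum value instead.
Rect clampMaxToPage(const Rect &block, double pageWidth, double pageHeight)
{
    Rect clamped = block;
    clamped.xMax = std::min(block.xMax, pageWidth);
    clamped.yMax = std::min(block.yMax, pageHeight);
    return clamped;
}
```

A block whose bottom edge extends past the page then ends at the page boundary rather than being skipped, while blocks fully inside the page are unchanged.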
Jason, just to make sure: does only one of the patches have to be applied, or both?
(In reply to comment #3)
> Jason, just to make sure: does only one of the patches have to be applied, or both?
Both need to be applied.
The "Allow large chars" patch fixes the bug. The "Use page size" patch fixes a side effect.
I've committed the first patch. I'll leave the second to Carlos, as evince is the only one that uses the visitSelection code.
Pushed the second patch to git master. Thanks!