Bug 85702

Summary: Not able to find (only) ordinals (ª,º,...)
Product: poppler Reporter: Moenne-loccoz Frédéric <frederic.moenne-loccoz>
Component: general Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED MOVED QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: All   
OS: All   
Whiteboard:
Attachments: patch to poppler library
test file

Description Moenne-loccoz Frédéric 2014-10-31 15:57:29 UTC
Created attachment 108731 [details]
patch to poppler library

Hello,
I am opening this bug on behalf of GNOME Evince users, who asked for ordinal characters to be distinguished during text search. The characters are:

0x207F=ⁿ, 0x00AA=ª, 0x00BA=º, 0x00B9=¹, 0x00B2=², 0x00B3=³, 0x2074=⁴, 0x2075=⁵, 0x2076=⁶, 0x2077=⁷, 0x2078=⁸, 0x2079=⁹, ...

Besides, these characters are currently treated as their plain counterparts (for example º=o, ª=a),
which produces false search results with too many matches.
This bug was first opened in GNOME Bugzilla (see bug #429985). The issue has been vetted there, and our conclusion is the following:
these characters are replaced because the poppler library applies compatibility decomposition (NFKC normalization) to character strings.
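
To illustrate the effect, here is a minimal Python sketch using the standard `unicodedata` module. It is not poppler's code; the name `nfkc_table` is only for illustration. It shows how NFKC folds each of the characters listed above into a plain letter or digit:

```python
import unicodedata

# The characters listed in this report.
ORDINALS = "ⁿªº¹²³⁴⁵⁶⁷⁸⁹"

def nfkc_table(chars):
    """Map each character to its NFKC-normalized form."""
    return {c: unicodedata.normalize("NFKC", c) for c in chars}

# Print one line per character: code point, original, folded form.
for original, folded in nfkc_table(ORDINALS).items():
    print(f"U+{ord(original):04X} {original} -> {folded}")
```

After this folding, a search for "º" cannot be told apart from a search for "o", which would explain the extra matches.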
Here is the solution that I propose: exclude ordinal characters from NFKC normalization, as done in my patch.
It might be better to create a function that excludes a character from the normalization process, taking its Unicode code point as a parameter. That would allow easy configuration for poppler library users.
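
A rough sketch of that idea, again in Python rather than in poppler's C++, could look like the following. The function name `normalize_except` and its `keep` parameter are hypothetical and do not exist in poppler or in the attached patch:

```python
import unicodedata

def normalize_except(text, keep):
    """NFKC-normalize text, leaving code points listed in `keep` untouched."""
    keep = set(keep)
    out = []
    run = []  # pending characters that will be normalized together
    for ch in text:
        if ord(ch) in keep:
            # Flush the pending run, then copy the excluded character as-is.
            out.append(unicodedata.normalize("NFKC", "".join(run)))
            run.clear()
            out.append(ch)
        else:
            run.append(ch)
    out.append(unicodedata.normalize("NFKC", "".join(run)))
    return "".join(out)

# Keep the two ordinal indicators, normalize everything else.
print(normalize_except("n\u00ba \ufb01le", {0x00AA, 0x00BA}))
```

One limitation of this sketch: the text on either side of an excluded character is normalized separately, so a combining mark that directly follows an excluded character is not composed with it.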
I hope it helps. Thanks for your excellent work,

Frederic
Gnome Software Contributor
Comment 1 Moenne-loccoz Frédéric 2014-10-31 16:08:04 UTC
Created attachment 108732 [details]
test file
Comment 2 GitLab Migration User 2018-08-21 10:43:13 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/345.
