Summary: | Support unicode args and console output on windows | ||
---|---|---|---|
Product: | poppler | Reporter: | Adrian Johnson <ajohnson> |
Component: | utils | Assignee: | poppler-bugs <poppler-bugs> |
Status: | RESOLVED FIXED | QA Contact: | |
Severity: | normal | ||
Priority: | medium | ||
Version: | unspecified | ||
Hardware: | Other | ||
OS: | Windows (All) | ||
Whiteboard: | |||
Attachments: |
Move UTF8.h to UnicodeMapFuncs.h
sort encodings
fix some mingw warnings
support unicode on windows console
support unicode on windows console v2 |
Description
Adrian Johnson
2017-11-12 00:48:20 UTC
Created attachment 135409: Move UTF8.h to UnicodeMapFuncs.h

Created attachment 135410: sort encodings

Created attachment 135411: fix some mingw warnings

Created attachment 135412: support unicode on windows console

One problem I did find while testing is that this doesn't work with pdftotext outputting to stdout, because the TextOutputDev printf is in the poppler lib, not in pdftotext.cc. However, I don't think this is much of an issue, as you would not normally output pdftotext to the console due to the size of the output. Normally you would pipe it through "more". The unicode console output is intended more for things like pdfinfo, and file names.

The next thing I want to look at is the windows filename support in poppler. Currently it is a mess of duplicated code and #ifdefs everywhere. I'd like to simplify it to pass UTF-8 filenames through the code and just convert to wchar_t at the point where the file is opened (in goo/gfile.cc).

goo/gfile.cc currently has:
- openFile() takes UTF-8 filenames and has its own UTF-8 to wchar_t conversion.
- GooFile::open() has separate UTF-8 and wchar_t variants.
- the GdirEntry code does not support unicode on windows.

You need to copy the license text from the decoder, not just link to the web.

Are you sure we're not breaking people's code by changing the PDFDoc constructor behaviour?

Created attachment 135422: support unicode on windows console v2

Include license.

(In reply to Albert Astals Cid from comment #7)
> Are you sure we're not breaking people's code by changing the PDFDoc
> constructor behaviour?

Old behavior on win32 is that it calls GooFile::open(const GooString *fileName), which uses CreateFileA. This only supports ASCII. So the new code is backwards compatible.

Having to have Win32Console win32Console(&argc, &argv); in each and every one of the apps is a bit meh, but i guess there's no other way around it.
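For readers unfamiliar with the pattern, the Win32Console object mentioned above can be pictured roughly as follows. This is only a sketch under stated assumptions, not the code from the attached patch: ConsoleArgs and utf16ToUtf8 are hypothetical names, and the sketch covers only the argument side (re-reading the command line as UTF-16 and handing the program a UTF-8 argv), not console output. On non-Windows builds the constructor leaves argc and argv untouched.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>
#ifdef _WIN32
#include <windows.h>
#include <shellapi.h> // CommandLineToArgvW; link with shell32
#endif

// Hypothetical helper: encode a UTF-16 string as UTF-8.
// Unpaired surrogates are replaced with U+FFFD.
std::string utf16ToUtf8(const std::u16string &in)
{
    std::string out;
    for (std::size_t i = 0; i < in.size(); ++i) {
        char32_t cp = in[i];
        if (cp >= 0xD800 && cp <= 0xDBFF && i + 1 < in.size() &&
            in[i + 1] >= 0xDC00 && in[i + 1] <= 0xDFFF) {
            // Combine a high/low surrogate pair into one code point.
            cp = 0x10000 + ((cp - 0xD800) << 10) + (in[i + 1] - 0xDC00);
            ++i;
        } else if (cp >= 0xD800 && cp <= 0xDFFF) {
            cp = 0xFFFD;
        }
        if (cp < 0x80) {
            out += static_cast<char>(cp);
        } else if (cp < 0x800) {
            out += static_cast<char>(0xC0 | (cp >> 6));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else if (cp < 0x10000) {
            out += static_cast<char>(0xE0 | (cp >> 12));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        } else {
            out += static_cast<char>(0xF0 | (cp >> 18));
            out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
            out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
            out += static_cast<char>(0x80 | (cp & 0x3F));
        }
    }
    return out;
}

// Hypothetical stand-in for the Win32Console idea: on Windows, replace
// argc/argv with UTF-8 copies of the wide command line. The object owns
// the converted strings, so it must outlive every use of argv.
class ConsoleArgs
{
public:
    ConsoleArgs(int *argc, char ***argv)
    {
        assert(argc != nullptr && argv != nullptr);
#ifdef _WIN32
        int n = 0;
        wchar_t **wargv = CommandLineToArgvW(GetCommandLineW(), &n);
        if (!wargv) {
            return; // keep the original narrow arguments
        }
        for (int i = 0; i < n; ++i) {
            const std::wstring w(wargv[i]);
            storage.push_back(utf16ToUtf8(std::u16string(w.begin(), w.end())));
        }
        LocalFree(wargv);
        for (std::string &s : storage) {
            ptrs.push_back(&s[0]);
        }
        ptrs.push_back(nullptr);
        *argc = n;
        *argv = ptrs.data();
#endif
    }
    ConsoleArgs(const ConsoleArgs &) = delete;
    ConsoleArgs &operator=(const ConsoleArgs &) = delete;

private:
    std::vector<std::string> storage; // UTF-8 argument text
    std::vector<char *> ptrs;         // argv-style pointer array
};
```

In a utility's main() such an object would be constructed first, for example ConsoleArgs args(&argc, &argv);, which is why a line like this ends up in every app.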
About the improvements, do you want to finish them before committing or do you want to commit what you already have?

I'd like to commit this first. I've got some other work in progress that I'd like to finish first, then I'll open a new bug for the windows file/path unicode improvements.

I noticed also that the code has a mixture of mostly gmalloc/gfree but sometimes new/delete. Do you have a preference for one or the other in new code?

(In reply to Adrian Johnson from comment #11)
> I'd like to commit this first. I've got some other work in progress that I'd
> like to finish first then I'll open a new bug for the windows file/path
> unicode improvements.

I don't see anything obviously wrong there and since it's mostly windows changes (which personally is less important for me) i guess you can go on.

> I noticed also that the code has a mixture of mostly gmalloc/gfree but
> sometimes new/delete. Do you have a preference for one or the other in new
> code?

The only benefit i see in gmalloc is the checkoverflow variants, otherwise i'd say use new/delete ?

pushed |
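To illustrate the "checkoverflow" point from the discussion above: what an overflow-checked allocator adds is a check that count * size does not wrap around before the allocation is made. A minimal sketch of that idea, where checkedAllocN is a hypothetical name and not poppler's gmalloc API:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <limits>

// Hypothetical helper: allocate room for `count` elements of `size` bytes.
// Returns nullptr instead of allocating a too-small buffer when the
// multiplication count * size would overflow std::size_t.
void *checkedAllocN(std::size_t count, std::size_t size)
{
    assert(size > 0); // precondition: element size is never zero
    if (size == 0 || count > std::numeric_limits<std::size_t>::max() / size) {
        return nullptr;
    }
    return std::malloc(count * size);
}
```

The caller tests the result for nullptr and releases it with std::free(). By comparison, a plain malloc(count * size) would silently allocate a short buffer if the product wrapped.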