Summary: | Windows filenames should be in UTF-8 encoding | ||
---|---|---|---|
Product: | cairo | Reporter: | Tom Schoonjans <Tom.Schoonjans> |
Component: | win32 backend | Assignee: | cairo-bugs mailing list <cairo-bugs> |
Status: | RESOLVED FIXED | QA Contact: | cairo-bugs mailing list <cairo-bugs> |
Severity: | normal | ||
Priority: | medium | ||
Version: | unspecified | ||
Hardware: | Other | ||
OS: | Windows (All) | ||
Whiteboard: | |||
i915 platform: | i915 features: | ||
Attachments: |
Patch to use UTF-8 filenames on Windows
Patch to use UTF-8 filenames on Windows Windows UTF-8 patch |
Description
Tom Schoonjans
2017-11-01 15:55:42 UTC
This is a really good idea. The patch only calls a function named _cairo_utf8_to_utf16 (which already exists for font support), it does not call the functions cairo_win32_filename_from_unicode or cairo_win32_filename_from_ansi. These functions are also really confusing as it sounds like they return a windows filename, but in fact are conversions *from* windows filenames to utf8. I would just omit those functions. A few comments on this, hopefully pointing out that a number of people are strongly in favor of doing it: http://utf8everywhere.org/ http://cppcms.com/files/nowide/html/ Created attachment 135220 [details] [review] Patch to use UTF-8 filenames on Windows Glad to read you like my proposal. I am attaching a patch that accomplishes exactly this. Looks right to me, only problem is the comments mention the non-existent function cairo_win32_filename_from_unicode(). Probably should just read like this: @filename: name of PNG file to load. On Windows this is encoded in UTF-8. Created attachment 135230 [details] [review] Patch to use UTF-8 filenames on Windows Sorry about that. I removed those references from the patch Comment on attachment 135230 [details] [review] Patch to use UTF-8 filenames on Windows Review of attachment 135230 [details] [review]: ----------------------------------------------------------------- Any specific reason why not all calls to fopen() would get this new treatment? (Although I do agree that it would be a bit stupid for those cases with hardcoded ASCII file names, but still, consistency-wise it might be a good idea) $ git grep fopen | wc -l 60 $ git grep fopen src/ | wc -l 21 (Only src/ might make sense since stuff outside of libcairo does not get access to libcairo-internal symbols) I don't have much of an opinion here, I'm just asking. And API-doc-wise I want to mention the following callers of _cairo_output_stream_create_for_filename which in turn end up getting the file name from some public API: $ git grep _cairo_output_stream_create_for_filename src/cairo-output-stream-private.h:_cairo_output_stream_create_for_filename (const char *filename); src/cairo-output-stream.c:_cairo_output_stream_create_for_filename (const char *filename) src/cairo-pdf-surface.c: output = _cairo_output_stream_create_for_filename (filename); src/cairo-ps-surface.c: stream = _cairo_output_stream_create_for_filename (filename); src/cairo-script-surface.c: stream = _cairo_output_stream_create_for_filename (filename); src/cairo-svg-surface.c: stream = _cairo_output_stream_create_for_filename (filename); src/cairo-xml-surface.c: stream = _cairo_output_stream_create_for_filename (filename); _cairo_fopen() belongs in cairo-misc.c. Just before _cairo_win32_tmpfile() would be a good place.
It needs to pass through the status returned by cairo_utf8_to_utf16. So something like:
cairo_status_t
_cairo_fopen (const char *utf8_filename, const char *mode, FILE **file_out);
And add a comment that the status is for the filename conversion. It can return success and file_out = NULL.
I wouldn't bother using it for hard coded file names. Only the calls to fopen() where the user supplies the filename.
> On Unix platforms, the filename is
> + * passed directly to the system
It is all platforms except windows.
Also, I'd be inclined to use a for/while loop and fixed size uint16_t array to convert the mode. Seems pointless using the overhead of _cairo_utf8_to_utf16() for a string of between 1 and 3 ASCII characters. Created attachment 135258 [details] [review] Windows UTF-8 patch I think I addressed the comments with this new patch except the last suggestion regarding using a for/while loop for the fopen mode. Though I agree that there will be some overhead caused by using _cairo_utf8_to_utf16, I think that it's a reasonable one, especially since these methods are unlikely to be called a lot during the execution of a program. I have also not gone ahead with replacing every occurrence in the source of fopen to _cairo_fopen. Unless I missed one, the remaining ones are in DEBUG_* sections only, and they are always passed ASCII string filenames, either directly or indirectly. I personally am not thrilled that the function has two different methods of returning errors. I would instead just return null if the utf8->utf16 conversion fails and force errno to whatever it gets if you send invalid utf16 as a filename to Windows. Then it is easier to drop-in replacement of fopen. If you don't do that it may be better to return the status number for any failure, rather than the case statements that remain below both calls to this. (In reply to Bill Spitzak from comment #10) > I personally am not thrilled that the function has two different methods of > returning errors. I would instead just return null if the utf8->utf16 > conversion fails and force errno to whatever it gets if you send invalid > utf16 as a filename to Windows. Then it is easier to drop-in replacement of > fopen. It is an internal function. All internal functions that can generate a cairo status return the status number so that it will be propagated through to the cairo object that provides a get status. In that case maybe it should contain the case statement that is now after the calls to it currently, which checks errno and turns it into status codes. The idea is that if it returns anything other than SUCCESS then the caller can return that code immediately. And SUCCESS means the file really opened. Pushed Many thanks for merging this patch in! In case there are plans for another 1.14.x release, would it be possible to include this patch in there as well? > In case there are plans for another 1.14.x release, would it be possible to
> include this patch in there as well?
It is more a feature than a bug fix. Not worth the small chance it may break something.
|
Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.