Bug 23364 - Useless digits for float numbers in Cairo output
Status: RESOLVED WONTFIX
Alias: None
Product: cairo
Classification: Unclassified
Component: pdf backend
Version: 1.8.8
Hardware: Other All
Importance: medium normal
Assignee: Adrian Johnson
QA Contact: cairo-bugs mailing list
Reported: 2009-08-17 03:11 UTC by clark
Modified: 2015-10-18 05:21 UTC


Attachments
uncompressed Cairo output (40.98 KB, application/pdf)
2009-08-17 03:12 UTC, clark

Description clark 2009-08-17 03:11:10 UTC
I delved into the source of a PDF file output by Cairo and found many useless digits in its floating-point numbers. Take the following section for example:
#####
<000a0006>Tj
2.979353 1.142857 Td  
<0006000100010003>Tj
0.449219 -1.142857 Td
<000a0008>Tj
3.12221 1.142857 Td
<0006000100010003>Tj
0.449219 -1.142857 Td
######

Maybe we can change 
2.979353 1.142857 Td  
to 
2.98 1.14 Td

This will reduce the size of Cairo output. I notice the same thing in many other places, such as the operands of rg, RG, etc. I attach one PDF file from Cairo for reference.
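
To get a rough feel for the saving proposed here, the fragment quoted above can be rewritten with fewer decimals and the byte counts compared. This is only a sketch: the helper name `round_operands` and the regular expression are illustrative assumptions, not part of Cairo, and the saving on a short fragment need not match the saving on a whole (possibly compressed) file.

```python
import re

# Sample copied from the fragment quoted in the description.
STREAM = """<000a0006>Tj
2.979353 1.142857 Td
<0006000100010003>Tj
0.449219 -1.142857 Td
<000a0008>Tj
3.12221 1.142857 Td
<0006000100010003>Tj
0.449219 -1.142857 Td
"""

# Matches decimal numbers such as 2.979353 or -1.142857.
# Hex strings like <000a0006> contain no '.', so they are left alone.
NUMBER = re.compile(r"-?\d+\.\d+")


def round_operands(stream: str, decimals: int) -> str:
    """Rewrite every decimal number in `stream` with at most `decimals` places."""
    def repl(match: re.Match) -> str:
        text = f"{float(match.group()):.{decimals}f}"
        # Drop trailing zeros and a dangling decimal point.
        return text.rstrip("0").rstrip(".") if "." in text else text
    return NUMBER.sub(repl, stream)


shortened = round_operands(STREAM, 2)
print(shortened)
print(len(STREAM), "->", len(shortened), "bytes")
```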

Thanks.
Comment 1 clark 2009-08-17 03:12:18 UTC
Created attachment 28700
uncompressed Cairo output
Comment 2 clark 2009-08-17 03:14:22 UTC
(In reply to comment #0)

Comment 3 clark 2009-08-17 23:03:27 UTC
Can anyone take a look at this issue?
Comment 4 Adrian Johnson 2009-08-18 04:23:19 UTC
I am reluctant to reduce the precision of the PDF backend by too much, as some people may require a higher precision for their workflow. Looking at PDFs generated by Adobe Distiller, for example, I see they are using 3 decimal digits for values in points. Some other PDFs are using more. So aiming to match the precision of Distiller seems reasonable to me.

The Td operator specifies a relative change to the text position in text units. The precision required depends on the font size. Assuming a worst case of a 100 point font used anywhere on the page, two decimal places in addition to the 3 specified above are required to maintain precision. As it is a relative adjustment, the error accumulates each time the Td operator is used. So allow another digit as well, bringing the total to the 6 digits that are currently used.
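
The arithmetic behind this estimate can be sketched as follows. The function name, the run of 100 identical moves, and the offset 8/7 (which prints as the 1.142857 seen in the sample) are assumptions chosen for illustration. The sketch also assumes the worst case in which every rounding error has the same sign, and that one text unit corresponds to `font_size` points.

```python
def drift_in_points(step: float, count: int, decimals: int, font_size: float) -> float:
    """Position error, in points, after `count` identical Td moves.

    Each move is rounded independently to `decimals` places, so the
    per-move rounding error adds up. One text unit is taken to be
    `font_size` points (an assumption for this sketch).
    """
    rounded = round(step, decimals)
    return abs(rounded - step) * count * font_size


# Hypothetical worst case: a 100 point font and 100 consecutive Td moves.
for decimals in (2, 3, 6):
    drift = drift_in_points(8 / 7, count=100, decimals=decimals, font_size=100.0)
    print(f"{decimals} decimals: drift of about {drift:.4f} points")
```

Under these assumptions, six decimals keep the accumulated drift at the level of the third decimal place in points, while two decimals let it grow to many points.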

One option to reduce the number of decimals is to keep track of the error, as is done with the TJ operator, so that subsequent Td operations stay within a tolerance.
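
A generic way to keep track of the error is to carry each rounding residual into the next offset, so the absolute position never drifts by more than half a unit in the last emitted digit. The sketch below illustrates that idea only; it is not taken from Cairo's TJ handling, and `emit_td_offsets` is a hypothetical name.

```python
def emit_td_offsets(offsets, decimals):
    """Round each relative offset, carrying the rounding error forward.

    `carry` holds the difference between the exact position and the
    position implied by the values emitted so far.
    """
    emitted = []
    carry = 0.0
    for offset in offsets:
        target = offset + carry          # what we still owe, including past error
        rounded = round(target, decimals)
        carry = target - rounded         # residual to fold into the next move
        emitted.append(rounded)
    return emitted


exact = [8 / 7] * 100                    # hypothetical run of identical moves
tracked = emit_td_offsets(exact, 2)
naive = [round(x, 2) for x in exact]

print("naive drift:  ", abs(sum(exact) - sum(naive)))
print("tracked drift:", abs(sum(exact) - sum(tracked)))
```

With error tracking, the emitted values are no longer all identical, but the final position stays within the rounding tolerance of a single value regardless of how many moves are chained.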

In the case of rg/RG I see that Distiller is using 5 digits. I'll look into adjusting this.

In both these cases the number of decimals does not contribute significantly to the file size.
Comment 5 clark 2009-08-18 18:28:54 UTC
(In reply to comment #4)

Adrian, thanks for the reply.
I am wondering whether the Cairo library could provide an interface for customizing the number of digits.

Our app does not require such high precision. We attach more importance to the size of the PDF file. As you can see from the attachment, the size can be reduced by nearly 10% if we can customize the number of digits.
Comment 6 Behdad Esfahbod 2009-08-18 18:51:50 UTC
Actually the API is there: cairo_set_tolerance().  The PDF backend can be fixed to respect that.
Comment 7 clark 2009-08-18 21:49:12 UTC
(In reply to comment #6)
> Actually the API is there: cairo_set_tolerance().  The PDF backend can be fixed
> to respect that.

Hi Behdad,

Thanks for informing me of this. We just checked the source of cairo_set_tolerance() and called it in our app in the way we expected it to be used. However, it still does not work. The Cairo library version we are using is 1.8.8.

Can you help verify this?

Thanks
Comment 8 Behdad Esfahbod 2009-08-19 06:57:17 UTC
I meant to say the API is there, but it's not working.
Comment 9 clark 2009-08-19 06:59:55 UTC
(In reply to comment #8)
> I meant to say the API is there, but it's not working.

Will this interface be supported in the coming release?

Thanks
Comment 10 clark 2009-08-20 18:47:52 UTC
(In reply to comment #4)

Adrian, are you guys planning to support cairo_set_tolerance() as Behdad said?

Thanks
Comment 11 Adrian Johnson 2009-08-21 05:46:50 UTC
If you really want to control the output precision now you can edit the "#define SIGNIFICANT_DIGITS_AFTER_DECIMAL" at the top of src/cairo-output-stream.c.
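
For readers who want to see what changing such a constant would do to the output, here is a small sketch of fixed-decimal formatting with trailing zeros trimmed. It is not the code from src/cairo-output-stream.c. It only mimics what the sample in the description suggests (3.12221 is printed with five decimals, so trailing zeros appear to be dropped), and `format_pdf_number` is a hypothetical name.

```python
def format_pdf_number(value: float, digits_after_decimal: int = 6) -> str:
    """Format `value` with a fixed number of decimals, then trim trailing zeros."""
    text = f"{value:.{digits_after_decimal}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")
    # Avoid emitting a negative zero after rounding.
    return "0" if text in ("-0", "") else text


for digits in (6, 3, 2):
    print(digits, format_pdf_number(2.979353, digits), format_pdf_number(3.12221, digits))
```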

I don't think overloading the meaning of cairo_set_tolerance() to control floating point precision in vector backends as well as the tolerance of curve flattening in image backends is good API design.

Using a tolerance in device units does not make sense for all the floating point values emitted by the PDF backend. For example in PDF floats are used for:

 - coordinates
 - colors
 - Td operator for text
 - pattern matrices
 - function domains

Coordinates are already limited to 3 decimal places due to the use of
24.8 fixed point.
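
The link between 24.8 fixed point and 3 decimal places can be checked numerically: 8 fractional bits give a step of 1/256 = 0.00390625, and rounding to 3 decimals introduces an error of at most 0.0005, which is less than half that step. The helper names below are illustrative assumptions, not Cairo functions.

```python
FRACTION_BITS = 8                      # the ".8" in 24.8 fixed point
SCALE = 1 << FRACTION_BITS             # 256
STEP = 1 / SCALE                       # smallest representable increment


def to_fixed(value: float) -> int:
    """Quantize a float to the nearest 24.8 fixed-point integer."""
    return round(value * SCALE)


def survives(fixed: int, decimals: int) -> bool:
    """True if printing with `decimals` places and re-quantizing is lossless."""
    printed = f"{fixed / SCALE:.{decimals}f}"
    return to_fixed(float(printed)) == fixed


# Try every fractional part around zero, for negative and positive values.
samples = range(-512, 513)
print("step:", STEP)
print("3 decimals lossless:", all(survives(i, 3) for i in samples))
print("2 decimals lossless:", all(survives(i, 2) for i in samples))
```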

Controlling the precision of colors is not possible with a parameter in device units. Users may wish to control the precision of color independently of coordinates.

Neither does it make sense to use a value in device units to control the precision of values in pattern matrices. These should be set using number of significant figures.
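
The difference between a fixed number of decimal places and a number of significant figures shows up as soon as a matrix entry is small. The sketch below uses a made-up matrix entry and a hypothetical helper, `format_significant`. It avoids exponent notation because PDF real numbers are written in plain decimal form.

```python
import math


def format_significant(value: float, figures: int) -> str:
    """Format `value` with about `figures` significant figures, without an exponent."""
    if value == 0:
        return "0"
    exponent = math.floor(math.log10(abs(value)))
    # Number of decimals needed to reach the requested significant figures.
    # Large values are simply printed as integers.
    decimals = max(figures - 1 - exponent, 0)
    return f"{value:.{decimals}f}"


entry = 0.000123456                    # hypothetical pattern matrix entry
print("3 decimal places:     ", f"{entry:.3f}")
print("3 significant figures:", format_significant(entry, 3))
```

With three decimal places the entry collapses to zero, which would make the matrix degenerate, while three significant figures keep its magnitude.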

The mesh gradient patches on my 'mesh' branch add an additional complication when making precision adjustable. The number of bits per coordinate and the number of bits per color component in the shading stream are selectable.

There really isn't any way to control the output precision of all these different floats with one parameter. Maybe the proposed debug environment variable would be a good way to set the precision, particularly as many applications are unlikely to make available a user-configurable setting for output precision.
Comment 12 clark 2009-08-23 23:30:17 UTC
(In reply to comment #11)

Thanks for your continued explanations.
Comment 13 Carl Worth 2009-08-28 13:37:11 UTC
(In reply to comment #11)
> If you really want to control the output precision now you can edit the
> "#define SIGNIFICANT_DIGITS_AFTER_DECIMAL" at the top of
> src/cairo-output-stream.c.

Hi Adrian,

Thanks for providing an idea that can be used immediately, (even if only
at compile time).

> I don't think overloading the meaning of cairo_set_tolerance() to control
> floating point precision in vector backends as well as the tolerance of curve
> flattening in image backends is good API design.

[snip many details of many uses of floating-point values in PDF backend]

> There really isn't any way to control the output precision of all these
> different floats with one parameter.

That's definitely true. In the meantime, cairo_set_tolerance exists today
and to the extent that it has a well-defined semantic, I think the PDF
backend should implement that semantic wherever it makes sense.

This bug report isn't really the right place for API discussion, so I
suggest that any proposals for changes to the documentation of
cairo_set_tolerance, (if anyone has any), go happen on the cairo
mailing list.

>                                      Maybe the proposed debug environment
> variable would be a good way to set the precision particularly as many
> applications are unlikely to make available a user configurable setting for
> output precision.

That's probably a very good approach for getting some run-time control
for what you described above, (some of which seems obviously separate
from a notion of "tolerance", which in my view is quite tightly bound
to a geometric context).

-Carl

Comment 14 Martin 2015-01-06 06:01:44 UTC
Greetings from 2015!

As a gentle push back against the initial post, and in support of Comment 4, please see my comments for bug#57021 in LibreOffice.

It looks as though they reduced the output to just one decimal place. For an A4 page the discretisation errors become noticeable at 500-1200% zoom in Adobe Reader, which I would consider a bug (given that LibreOffice Draw goes up to 3000% zoom).

I mean, what's the point of vector graphics when you get discretisation errors the moment you start to zoom in beyond 100%?

Having said that, this means that for the two decimal places suggested by clark, you can just about notice the discretisation at 5000% zoom. My Adobe Reader 9 on Linux only goes up to 6400%, so this is something I could live with. However the Adobe Distiller setting mentioned by Adrian Johnson - three decimal places - is a safer bet.
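
These zoom levels can be sanity-checked with a little arithmetic. The sketch assumes coordinates in points (72 per inch) and a viewer that maps one inch to 96 screen pixels at 100% zoom. Both the 96 dpi figure and the function name are assumptions for illustration.

```python
POINTS_PER_INCH = 72.0


def step_in_pixels(decimals: int, zoom_percent: float, screen_dpi: float = 96.0) -> float:
    """On-screen size of the smallest coordinate step at a given zoom level."""
    step_points = 10 ** -decimals
    return step_points / POINTS_PER_INCH * screen_dpi * zoom_percent / 100.0


for decimals, zoom in ((1, 1000), (2, 5000), (3, 6400)):
    print(f"{decimals} decimals at {zoom}% zoom: {step_in_pixels(decimals, zoom):.3f} px")
```

Under these assumptions, one decimal place yields steps larger than a pixel at around 1000% zoom, two decimal places stay below a pixel at 5000%, and three decimal places remain well below a pixel even at 6400%.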
Comment 15 Adrian Johnson 2015-10-18 05:21:30 UTC
Closing this as I have no intention of changing the output. If you really need a reduced precision, comment 11 contains a workaround.

