Bug 36653 - Missing functions for setting document's properties
Summary: Missing functions for setting document's properties
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general (show other bugs)
Version: unspecified
Hardware: Other All
: medium normal
Assignee: Jakub Alba
QA Contact:
URL:
Whiteboard:
Keywords:
: 71999 (view as bug list)
Depends on:
Blocks:
 
Reported: 2011-04-28 01:29 UTC by Rodrigo Moya
Modified: 2016-12-15 14:01 UTC (History)
6 users (show)

See Also:
i915 platform:
i915 features:


Attachments
Patch to add this new API (5.68 KB, patch)
2011-04-28 01:29 UTC, Rodrigo Moya
Details | Splinter Review
glib info-dict API (3.14 KB, patch)
2014-10-01 19:35 UTC, Scott West
Details | Splinter Review
core document info dict API (6.74 KB, patch)
2014-10-04 10:10 UTC, Scott West
Details | Splinter Review
core document info dict API rev 2 (5.58 KB, patch)
2014-10-05 19:13 UTC, Scott West
Details | Splinter Review
Poppler core patch (6.21 KB, patch)
2016-01-28 23:44 UTC, Jakub Alba
Details | Splinter Review
Poppler core patch (8.32 KB, patch)
2016-01-28 23:58 UTC, Jakub Alba
Details | Splinter Review
Poppler core patch (10.65 KB, patch)
2016-01-31 18:52 UTC, Jakub Alba
Details | Splinter Review
Poppler core patch (11.76 KB, patch)
2016-02-08 20:17 UTC, Jakub Alba
Details | Splinter Review
Poppler glib patch (16.02 KB, patch)
2016-02-08 20:20 UTC, Jakub Alba
Details | Splinter Review
Poppler cpp patch (18.36 KB, patch)
2016-02-15 14:56 UTC, Jakub Alba
Details | Splinter Review
patch (70.62 KB, patch)
2016-02-23 23:20 UTC, Jakub Alba
Details | Splinter Review
[PATCH 1/7] Added XRef modification flag (3.33 KB, patch)
2016-06-10 22:23 UTC, Jakub Alba
Details | Splinter Review
[PATCH 2/7] Added DocInfo setters & getters (9.38 KB, patch)
2016-06-10 22:25 UTC, Jakub Alba
Details | Splinter Review
[PATCH 3/7] cpp: Added functions to save a document (2.25 KB, patch)
2016-06-10 22:27 UTC, Jakub Alba
Details | Splinter Review
[PATCH 4/7] cpp: Added document property setters & getters (12.67 KB, patch)
2016-06-10 22:27 UTC, Jakub Alba
Details | Splinter Review
[PATCH 5/7] qt4: Added document property setters & getters (12.73 KB, patch)
2016-06-10 22:29 UTC, Jakub Alba
Details | Splinter Review
[PATCH 6/7] qt5: Added document property setters & getters (12.75 KB, patch)
2016-06-10 22:30 UTC, Jakub Alba
Details | Splinter Review
[PATCH 7/7] glib: Added document property setters & simplified getters (16.01 KB, patch)
2016-06-10 22:34 UTC, Jakub Alba
Details | Splinter Review
[PATCH 2/7] Added DocInfo setters & getters (9.43 KB, patch)
2016-06-17 10:24 UTC, Jakub Alba
Details | Splinter Review
[PATCH 7/7] glib: Added document property setters & simplified getters (16.01 KB, patch)
2016-06-17 10:25 UTC, Jakub Alba
Details | Splinter Review
[PATCH 1/7] Added XRef modification flag (3.79 KB, patch)
2016-06-17 12:30 UTC, Jakub Alba
Details | Splinter Review
[PATCH 2/7] Added DocInfo setters & getters (9.37 KB, patch)
2016-06-17 12:30 UTC, Jakub Alba
Details | Splinter Review
glib: make document metatag gobject properties writeable (4.27 KB, patch)
2016-07-16 23:46 UTC, Jakub Alba
Details | Splinter Review
[PATCH 1/2] qt4: fix memory leaks in Document::modificationDate() and Document::creationDate() (1.16 KB, patch)
2016-07-24 11:48 UTC, Jakub Alba
Details | Splinter Review
[PATCH 2/2] qt5: fix memory leaks in Document::modificationDate() and Document::creationDate() (1.16 KB, patch)
2016-07-24 11:48 UTC, Jakub Alba
Details | Splinter Review

Description Rodrigo Moya 2011-04-28 01:29:28 UTC
Created attachment 46136 [details] [review]
Patch to add this new API

The glib frontend misses functions for setting some of the document's properties, like author, subject and title. The attached patch adds them
Comment 1 Rodrigo Moya 2011-04-28 01:30:19 UTC
I wrote a simple test program to test it, so pasting here just in case it's useful:

#include <poppler.h>

int
main (int argc, char *argv[])
{
	PopplerDocument *document;
	GError *error = NULL;

	g_type_init ();

	if (argc != 6) {
		g_print ("Usage: %s filename output new_title new_author new_subject\n", argv[0]);
		return -1;
	}

	document = poppler_document_new_from_file (argv[1], NULL, &error);
	if (!document) {
		g_print ("Error opening %s: %s\n", argv[1], error->message);
		g_error_free (error);

		return -1;
	}

	g_print ("Changing title from %s to %s\n",
		 poppler_document_get_title (document),
		 argv[3]);
	poppler_document_set_title (document, argv[3]);

	g_print ("Changing author from %s to %s\n",
		 poppler_document_get_author (document),
		 argv[4]);
	poppler_document_set_author (document, argv[4]);

	g_print ("Changing subject from %s to %s\n",
		 poppler_document_get_subject (document),
		 argv[5]);
	poppler_document_set_subject (document, argv[5]);

	error = NULL;
	if (!poppler_document_save (document, argv[2], &error)) {
		g_print ("Error saving %s: %s\n", argv[2], error->message);
		g_error_free (error);

		return -1;
	}

	g_object_unref (document);

	return 0;
}
Comment 2 Carlos Garcia Campos 2011-04-28 08:27:52 UTC
Comment on attachment 46136 [details] [review]
Patch to add this new API

Hey rodrigo, patch looks goot to me. Common parts should be in the core, so that other frontends could use them too. Frontends currently get the info dict directly, so it's probably a good time to add a PDFInfo class in the core with setters and getters. See comments inline.

>From 3ce14bcbf198fc93365d664b0db5745680967764 Mon Sep 17 00:00:00 2001
>From: Rodrigo Moya <rodrigo@gnome-db.org>
>Date: Thu, 28 Apr 2011 10:24:05 +0200
>Subject: [PATCH] glib: Add _set functions for some PopplerDocument properties
>
>---
> glib/poppler-document.cc |  154 ++++++++++++++++++++++++++++++++++++++++++++++
> glib/poppler-document.h  |    6 ++
> 2 files changed, 160 insertions(+), 0 deletions(-)
>
>diff --git a/glib/poppler-document.cc b/glib/poppler-document.cc
>index 710a5b3..360aa42 100644
>--- a/glib/poppler-document.cc
>+++ b/glib/poppler-document.cc
>@@ -607,6 +607,46 @@ char *_poppler_goo_string_to_utf8(GooString *s)
>   return result;
> }
> 
>+GooString *_poppler_goo_string_from_utf8(const gchar *source)
>+{
>+  char *s, *utf8;
>+  gsize bytes_written;
>+  GooString *result;
>+
>+  utf8 = g_locale_to_utf8 (source, strlen (source), NULL, &bytes_written, NULL);

you shouldn't need this, we assume all strings used by the poppler-glib api are always utf-8, like glib/gtk+.

>+  s = g_convert (utf8, bytes_written,
>+		 "UTF-16BE", "UTF-8", NULL, &bytes_written, NULL);
>+  
>+  result = new GooString (s, bytes_written);

that's the only part that should be in the glib forntend

>+  if (!result->hasUnicodeMarker()) {
>+    result->insert(0, 0xff);
>+    result->insert(0, 0xfe);
>+  }

This should be in the core.

>+  g_free (s);
>+  g_free (utf8);
>+
>+  return result;
>+}
>+
>+static gboolean
>+mark_as_modified (PopplerDocument *document, Object *obj)
>+{
>+  gboolean result = FALSE;
>+  XRef *xref;
>+  Object info_dict;
>+
>+  xref = document->doc->getXRef ();
>+  if (xref != NULL) {
>+    xref->getDocInfoNF (&info_dict);
>+    xref->setModifiedObject (obj, info_dict.getRef ());

info_dict should be freed here with info_dict.free();

>+    result = TRUE;
>+  }
>+
>+  return result;
>+}

This method should be part fo the core too.

> static gchar *
> info_dict_get_string (Dict *info_dict, const gchar *key)
> {
>@@ -763,6 +803,44 @@ poppler_document_get_title (PopplerDocument *document)
> }
> 

if we add the PDFDocInfo class, all get/set methods would simply use that class.
Comment 3 Jose Aliste 2013-11-25 22:08:55 UTC
*** Bug 71999 has been marked as a duplicate of this bug. ***
Comment 4 Scott West 2014-10-01 19:35:15 UTC
Created attachment 107206 [details] [review]
glib info-dict API

This is a patch that uses Rodrigo's original patch as a guide for usage of the internal API. It adds support for setting/getting strings from the info dict, with the original impetus to make it possible to add custom metadata in a convenient way.

It could be used in a later step to add support for setting title/subject/etc, which would relate it pretty directly to this bug.
Comment 5 Albert Astals Cid 2014-10-02 19:40:48 UTC
I'm going to repeat Carlos here "Common parts should be in the core, so that other frontends could use them too."
Comment 6 Scott West 2014-10-03 17:15:08 UTC
(In reply to Albert Astals Cid from comment #5)
> I'm going to repeat Carlos here "Common parts should be in the core, so that
> other frontends could use them too."

(Sorry about the delay, didn't CC myself before)

Okay, I will give that a try.

The review also recommends creating a PDFDocInfo class, is that still recommended? And if so, should it replace the return type of the "getDocInfo" method?
Comment 7 Scott West 2014-10-04 10:10:53 UTC
Created attachment 107317 [details] [review]
core document info dict API

This adds an API to PDFDoc to get/set string entries in the doc's info dict.
Comment 8 Albert Astals Cid 2014-10-05 18:21:02 UTC
Please use proper style for the core code, i.e. don't add those jumps on function declarations.

Also no need to check if the xref is there or not, if the doc doesn't have an xref you don't really have a doc and should be calling this anyway.
Comment 9 Scott West 2014-10-05 18:34:35 UTC
(In reply to Albert Astals Cid from comment #8)
> Please use proper style for the core code, i.e. don't add those jumps on
> function declarations.
> 
> Also no need to check if the xref is there or not, if the doc doesn't have
> an xref you don't really have a doc and should be calling this anyway.

Thanks for the comments! I've removed the check on the Doc's XRef and modified the type signature appropriately.

What "jumps" on the declarations? Do you mean the newline between the get and set in PDFDoc.h? Is there a style guide somewhere for future reference?
Comment 10 Albert Astals Cid 2014-10-05 18:46:38 UTC
+GBool
+PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)

has to be

+GBool PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)

There is no document guideline, the file you're editing is your style guideline. Also you're not using the same indentation style.

In the core we use the crazy 2 space indentation and 8 spaces tab.
Comment 11 Scott West 2014-10-05 19:12:59 UTC
(In reply to Albert Astals Cid from comment #10)
> +GBool
> +PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)
> 
> has to be
> 
> +GBool PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)
> 
> There is no document guideline, the file you're editing is your style
> guideline. Also you're not using the same indentation style.
> 
> In the core we use the crazy 2 space indentation and 8 spaces tab.

Ah, thanks! Okay I've addressed the useless check and style issues. If I understand that tabs are used every 8 spaces... crazy indeed :).
Comment 12 Scott West 2014-10-05 19:13:38 UTC
Created attachment 107369 [details] [review]
core document info dict API rev 2

This addresses useless checks and style issues.
Comment 13 Albert Astals Cid 2014-10-07 22:13:13 UTC
I don't think you should add the unicode marker, the goostring should come correctly formated by the caller.
Comment 14 Scott West 2014-10-08 04:37:15 UTC
(In reply to Albert Astals Cid from comment #13)
> I don't think you should add the unicode marker, the goostring should come
> correctly formated by the caller.

It seemed that Carlos was asking for this to be in the core in his review of the original patch by Rodrigo. I have no opinion on where it should go, so whatever you feel is best :).
Comment 15 Albert Astals Cid 2014-10-08 20:27:58 UTC
Carlos why do you want the core to add that marker? Not using the marker is fine and in my opinion whoever calls the function should pass a correctly formatted GooString, not relying on the core to fixup the GooString.
Comment 16 Carlos Garcia Campos 2015-12-18 08:26:52 UTC
(In reply to Albert Astals Cid from comment #15)
> Carlos why do you want the core to add that marker? Not using the marker is
> fine and in my opinion whoever calls the function should pass a correctly
> formatted GooString, not relying on the core to fixup the GooString.

I don't remember, but I agree with you.
Comment 17 Carlos Garcia Campos 2015-12-18 08:42:37 UTC
(In reply to Carlos Garcia Campos from comment #16)
> (In reply to Albert Astals Cid from comment #15)
> > Carlos why do you want the core to add that marker? Not using the marker is
> > fine and in my opinion whoever calls the function should pass a correctly
> > formatted GooString, not relying on the core to fixup the GooString.
> 
> I don't remember, but I agree with you.

Looking at the code, my guess is that I proposed it because that's what we do when settings form and annot contents in the core.
Comment 18 Carlos Garcia Campos 2015-12-18 09:01:30 UTC
Comment on attachment 107369 [details] [review]
core document info dict API rev 2

Review of attachment 107369 [details] [review]:
-----------------------------------------------------------------

::: glib/poppler-document.cc
@@ +778,5 @@
>  }
>  
> +gchar *
> +poppler_document_info_get_string (PopplerDocument *document,
> +                                  const gchar *key)

Instead of adding API to extend the info dict with new metadata, I would start adding support for modifying the well known entries, like Rodrigo's patch did. I think we could split this patch, first adding the core part and the glib API, so that tthe glib api doesn't block the core part that could be used by qt.

::: poppler/PDFDoc.cc
@@ +538,5 @@
>  
> +// Set/get doc info string entries
> +GooString *PDFDoc::getDocInfoStringEntry(const char *key)
> +{
> +  Object info_obj;

I agree we are not always consistent, but in the core we usually prefer camel case instead of underscores, so this would be infoObj, instead of info_obj, and the same for all other cases.

@@ +546,5 @@
> +  if (info_obj.isDict()) {
> +    Object result_obj;
> +
> +    if (info_obj.dictLookup(key, &result_obj, 0)->isString()) {
> +      result = new GooString(result_obj.getString());

You can do result = resultObj.takeString() to avoid a memory allocation.

@@ +571,5 @@
> +
> +GBool PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)
> +{
> +  Object info_obj;
> +  GBool success = gFalse;

I'm not sure this could fail, the getter can fail because the entry might not be present, but here we are setting a new entry, how can that fail?

@@ +578,5 @@
> +
> +  if (info_obj.isDict()) {
> +    Object goo_str_obj;
> +
> +    if (value) {

The spec says we shouldn't store entry values, but omit the entries, so we should also check that the string is not empty when not NULL.

@@ +593,5 @@
> +
> +    info_obj.dictSet(key, &goo_str_obj);
> +    setInfoModified(this, &info_obj);
> +    success = gTrue;
> +  }

If the document doesn't have an info dict we should create one instead of failing.
Comment 19 Albert Astals Cid 2016-01-16 14:35:15 UTC
Why do we need getDocInfoStringEntry? there's already ways to get the doc info.

Also on setDocInfoStringEntry I don't see why it should fail if there's no info dict, we can create one, no?
Comment 20 Jakub Alba 2016-01-26 19:38:18 UTC
(In reply to Albert Astals Cid from comment #19)
> Why do we need getDocInfoStringEntry? there's already ways to get the doc
> info.
It could simplify frontend functions for getting document properties.
Comment 21 Jakub Alba 2016-01-28 23:44:47 UTC
Created attachment 121375 [details] [review]
Poppler core patch

I implemented the needed functions for the poppler core for now. A patch for frontend will come later. This way core and frontend patches will be split just as Carlos suggested.

Please review. I don't want to write the frontend part knowing that the core API might still change.
Comment 22 Jakub Alba 2016-01-28 23:58:17 UTC
Created attachment 121377 [details] [review]
Poppler core patch

This patch is the same as the previous. It just adds copyright info. Forgot to add it in the previous one, sorry.
Comment 23 Carlos Garcia Campos 2016-01-31 09:13:33 UTC
Comment on attachment 121377 [details] [review]
Poppler core patch

Review of attachment 121377 [details] [review]:
-----------------------------------------------------------------

Thanks for the patch, it looks good to me in general, I have a just a few comments.

::: poppler/PDFDoc.cc
@@ +569,5 @@
> +void PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)
> +{
> +  Object infoObj;
> +  getDocInfo(&infoObj);
> +  createInfoDictIfNoneExists(&infoObj);

Since the doc info dict is part of the xref, and we already have XRef::getDocInfo(), I think we could move this to XRef as something like XRef::createDocInfo() that internally marks the dict as modified, so that we only need the getter.

@@ +575,5 @@
> +  Object gooStrObj;
> +  gooStrObj.initNull();
> +
> +  if (value && value->getLength() != 0) {
> +    gooStrObj.initString(value);

Nit: instead of unconditionally init the object to null I would add an else here, we init it to string or null.

@@ +579,5 @@
> +    gooStrObj.initString(value);
> +  }
> +
> +  // gooStrObj is set to value or null by now. The latter will cause a removal.
> +  infoObj.dictSet(key, &gooStrObj);

Ok, so if we pass a null or empty string, the entry is removed from the dict. So, it could be the case of calling this with null for a document that doesn't have an info dict, in which case we are creating an info dict, just to remove it. I think we could return early in such case.

@@ +583,5 @@
> +  infoObj.dictSet(key, &gooStrObj);
> +
> +  if (infoObj.dictGetLength() == 0) {
> +    // Info dictionary is empty. Remove it altogether.
> +    removeDocInfo();

hmm, we could probably leave an empty dict, the problem of doing this is that if your API does something like set("Foo", NULL); set("Bar", "baz"); if "Foo" was the only entry, the dictionary is removed and a new one is created when setting "Bar". So, I would just leave the empty dict if the last entry is removed.

::: poppler/PDFDoc.h
@@ +226,5 @@
>    Object *getDocInfo(Object *obj) { return xref->getDocInfo(obj); }
>    Object *getDocInfoNF(Object *obj) { return xref->getDocInfoNF(obj); }
>  
> +  // Remove the document's Info dictionary and update the trailer dictionary.
> +  void removeDocInfo();

Why is this public?
Comment 24 Jakub Alba 2016-01-31 09:55:19 UTC
(In reply to Carlos Garcia Campos from comment #23)
> @@ +583,5 @@
> > +  infoObj.dictSet(key, &gooStrObj);
> > +
> > +  if (infoObj.dictGetLength() == 0) {
> > +    // Info dictionary is empty. Remove it altogether.
> > +    removeDocInfo();
> 
> hmm, we could probably leave an empty dict, the problem of doing this is
> that if your API does something like set("Foo", NULL); set("Bar", "baz"); if
> "Foo" was the only entry, the dictionary is removed and a new one is created
> when setting "Bar". So, I would just leave the empty dict if the last entry
> is removed.

Thanks for the review. So if we don't want to save a document with an empty dict, the dict should be removed in one of PDFDoc::saveSth functions?

> ::: poppler/PDFDoc.h
> @@ +226,5 @@
> >    Object *getDocInfo(Object *obj) { return xref->getDocInfo(obj); }
> >    Object *getDocInfoNF(Object *obj) { return xref->getDocInfoNF(obj); }
> >  
> > +  // Remove the document's Info dictionary and update the trailer dictionary.
> > +  void removeDocInfo();
> 
> Why is this public?

I thought that if someone wanted to strip a PDF of all properties, instead of calling set(key, NULL) for every single entry, they could just remove the info dict. It would be more "direct" and shorter.

I'll fix the things you pointed out. Also we could add functions like set_title, set_author etc. and get_title, get_author etc. I think. At last these are the properties described in the PDF spec. It would make things a bit more elegant in the frontends. What do you think? (The getters should probably be added in a separate patch though, since they don't have anything to do with this bug report.) I think you wrote about it in the comment #2, but I might have misunderstood you.
Comment 25 Jakub Alba 2016-01-31 12:07:57 UTC
(In reply to Carlos Garcia Campos from comment #23)
> ::: poppler/PDFDoc.h
> @@ +226,5 @@
> >    Object *getDocInfo(Object *obj) { return xref->getDocInfo(obj); }
> >    Object *getDocInfoNF(Object *obj) { return xref->getDocInfoNF(obj); }
> >  
> > +  // Remove the document's Info dictionary and update the trailer dictionary.
> > +  void removeDocInfo();
> 
> Why is this public?

It could also be implemented in XRef and changed in PDFDoc to an inline function just like PDFDoc::getDocInfo. So if you don't mind leaving it public, I'd do it.
Comment 26 Jakub Alba 2016-01-31 18:52:24 UTC
Created attachment 121427 [details] [review]
Poppler core patch

So this would look like this, if we agree on several things:

(In reply to Carlos Garcia Campos from comment #23)
> @@ +583,5 @@
> > +  infoObj.dictSet(key, &gooStrObj);
> > +
> > +  if (infoObj.dictGetLength() == 0) {
> > +    // Info dictionary is empty. Remove it altogether.
> > +    removeDocInfo();
> 
> hmm, we could probably leave an empty dict, the problem of doing this is
> that if your API does something like set("Foo", NULL); set("Bar", "baz"); if
> "Foo" was the only entry, the dictionary is removed and a new one is created
> when setting "Bar". So, I would just leave the empty dict if the last entry
> is removed.

I don't think that saving a document with an empty Info dict is "the right way". Apparently the authors of pdftk would agree with me on that. And removing the document's Info dictionary in saveAs/saveCompleteRewrite/saveIncrement doesn't feel right either. So I would prefer to keep it in setDocInfoStringEntry and don't mind the unnecessary steps occuring in special cases.

Then there is the case of removeDocInfo being public, so I'm waiting for your answer.

And, as I said, I will introduce the getters in a separate patch under a seperate bug report once the work with this bug report is done.
Comment 27 Jakub Alba 2016-02-08 20:17:06 UTC
Created attachment 121601 [details] [review]
Poppler core patch

A new patch for core - validated by the glib frontend. (I'll send the glib patch in a second.)
Comment 28 Jakub Alba 2016-02-08 20:20:14 UTC
Created attachment 121602 [details] [review]
Poppler glib patch

The glib frontend part can be tested with these two programs: https://gist.github.com/yakubin/63a7c94d9fee345a33fb

Please keep in mind that date getters may not work as you'd expect, since this: https://bugs.freedesktop.org/show_bug.cgi?id=94035 hasn't been committed yet.

Now, I think, the only thing left is to do the same for other frontends.
Comment 29 Jakub Alba 2016-02-15 14:56:50 UTC
Created attachment 121766 [details] [review]
Poppler cpp patch

Of course the cpp frontend also has a bug concerning timezones, but it doesn't invalidate this patch, so I'll deal with that later.
Comment 30 Jakub Alba 2016-02-23 23:20:38 UTC
Created attachment 121929 [details] [review]
patch

So I've finished writing the code.
Of course even though it's a single file, there are several patches in it. (I made it a single file in order to make the applying easier.)
I split the core patch in order to make it easier to review.

The first core patch is a bit different from the previous one. It's simpler and makes the save functions significantly faster (as can be seen by running this benchmark: https://github.com/yakubin/poppler-bench).

Patches for Bug 94173 may cause marge conflicts with these patches, so I'll be happy if these patches are applied first, since this is a big series of patches and it would be harder to resolve here, but of course if they aren't, I will resolve the conflicts.

I think now is the time for the final review.

Cheers.
Comment 31 Jakub Alba 2016-02-23 23:22:04 UTC
PS. "These patches" meaning "the patches I've just sent".
Comment 32 Jakub Alba 2016-06-10 22:23:27 UTC
Created attachment 124457 [details] [review]
[PATCH 1/7] Added XRef modification flag

Rebased the patches and corrected a couple of things (details in descriptions).

This patch enables detection of modification of trailer dict. It also speeds up the process of saving a document by removing an unnecessary loop.
Comment 33 Jakub Alba 2016-06-10 22:25:16 UTC
Created attachment 124458 [details] [review]
[PATCH 2/7] Added DocInfo setters & getters

This patch adds the necessary functions in core. No recent changes here.
Comment 34 Jakub Alba 2016-06-10 22:27:03 UTC
Created attachment 124459 [details] [review]
[PATCH 3/7] cpp: Added functions to save a document

Made the functions return a bool to give feedback to the user about success or failure.
Comment 35 Jakub Alba 2016-06-10 22:27:56 UTC
Created attachment 124460 [details] [review]
[PATCH 4/7] cpp: Added document property setters & getters

Same as above.
Comment 36 Jakub Alba 2016-06-10 22:29:55 UTC
Created attachment 124461 [details] [review]
[PATCH 5/7] qt4: Added document property setters & getters

Added Document::removeInfo() which I forgot to add to qt frontend in the previous series of patches.
Comment 37 Jakub Alba 2016-06-10 22:30:51 UTC
Created attachment 124462 [details] [review]
[PATCH 6/7] qt5: Added document property setters & getters

Same as above.
Comment 38 Jakub Alba 2016-06-10 22:34:13 UTC
Created attachment 124463 [details] [review]
[PATCH 7/7] glib: Added document property setters & simplified getters

Changed the "Since" versions in comments for the new functions to the next version of poppler.
Comment 39 Jakub Alba 2016-06-10 22:37:51 UTC
If you test the cpp frontend with these patches, you may run into problems resulting from Bug 96426. And aside from that if you run 'tail -n 50' on a modified pdf produced by poppler you may notice that it produces 2 trailer dicts (and 2 metadata objects if you modify it). This bug is not related. Its source is somewhere in saveIncrementalUpdate(). I'm going to investigate it further, but I don't think it should be a blocker for these patches.
Comment 40 Jakub Alba 2016-06-14 19:38:17 UTC
I reported the bug I described (Bug 96529). You can see that it has nothing to do with this one, since it occurs with the code in the master branch.
Comment 41 Jakub Alba 2016-06-17 10:24:11 UTC
Created attachment 124569 [details] [review]
[PATCH 2/7] Added DocInfo setters & getters

Added the lines:
"if (infoObj.isNull()) {
    return NULL;
}"

Without them a program is aborted when it calls a getter if there is no metadata whatsoever.
Comment 42 Jakub Alba 2016-06-17 10:25:22 UTC
Created attachment 124570 [details] [review]
[PATCH 7/7] glib: Added document property setters & simplified getters

Changed the "Since" values in comments to the next version of poppler.
Comment 43 Jakub Alba 2016-06-17 12:30:04 UTC
Created attachment 124572 [details] [review]
[PATCH 1/7] Added XRef modification flag

Moved all XRef::setModified() calls to XRef::addIndirectObject(), XRef::removeIndirectObject() and XRef::setModifiedObject(). This way introducing bugs related to it will be less probable and the code will be cleaner.
Comment 44 Jakub Alba 2016-06-17 12:30:36 UTC
Created attachment 124573 [details] [review]
[PATCH 2/7] Added DocInfo setters & getters

Same as above.
Comment 45 Carlos Garcia Campos 2016-07-03 08:48:57 UTC
Comment on attachment 124572 [details] [review]
[PATCH 1/7] Added XRef modification flag

Review of attachment 124572 [details] [review]:
-----------------------------------------------------------------

This looks goo to me, I'll push it.
Comment 46 Carlos Garcia Campos 2016-07-03 08:55:34 UTC
Comment on attachment 124573 [details] [review]
[PATCH 2/7] Added DocInfo setters & getters

Review of attachment 124573 [details] [review]:
-----------------------------------------------------------------

Looks good to me, I have only a couple of comments, but to not delay this more I'll push a modified patch with my comments addressed.

::: poppler/PDFDoc.cc
@@ +655,5 @@
> +
> +  GooString *result;
> +
> +  if (entryObj.isString()) {
> +    result = new GooString(entryObj.getString());

As I suggested in a previous review, we can save the allocation here, by using GooString::takeString().

::: poppler/PDFDoc.h
@@ +234,5 @@
>  
> +  // Create and return the document's Info dictionary if none exists.
> +  // Otherwise return the existing one.
> +  Object *createDocInfoIfNoneExists(Object *obj) { return xref->createDocInfoIfNoneExists(obj); }
> +  Object *createDocInfoIfNoneExistsNF(Object *obj) { return xref->createDocInfoIfNoneExistsNF(obj); }

This doesn't seem to be used anywhere, I prefer to not add dead code, if we eventually need this we can just add it.

::: poppler/XRef.cc
@@ +1312,5 @@
> +
> +  return obj;
> +}
> +
> +Object *XRef::createDocInfoIfNoneExistsNF(Object *obj) {

Ditto.

@@ +1338,5 @@
> +}
> +
> +void XRef::removeDocInfo() {
> +  Object infoObjRef;
> +  getDocInfoNF(&infoObjRef);

Since this is public in the end, we should check that it's not null.
Comment 47 Carlos Garcia Campos 2016-07-03 09:03:24 UTC
Comment on attachment 124570 [details] [review]
[PATCH 7/7] glib: Added document property setters & simplified getters

Review of attachment 124570 [details] [review]:
-----------------------------------------------------------------

Same here, I have a few more comments, but I'll push a modified patch.

::: glib/poppler-document.cc
@@ +754,3 @@
>  
> +  gchar *utf16 = g_convert (src, -1, "UTF-16BE", "UTF-8", NULL, &outlen, NULL);
> +  GooString *result = new GooString (utf16, outlen);

g_convert returns NULL if the conversion fails for whatever reason, we should handle that case.

@@ +758,2 @@
>  
> +  assert(result->getLength() != 0);

And then this assert is not needed, I would say.

@@ +764,3 @@
>    }
>  
> +  assert((*result).getLength() != 0);

Nor this redundant one.

@@ +886,5 @@
> + * poppler_document_set_title:
> + * @document: A #PopplerDocument
> + * @title: A new title
> + *
> + * Sets the document's title. If @title is NULL or its length is 0, Title entry

NULL -> %NULL
We don't normally use empty strings in glib APIs, so better to only handle the case of NULL explicitly.

@@ +896,5 @@
> +poppler_document_set_title (PopplerDocument *document, const gchar *title)
> +{
> +  g_return_if_fail (POPPLER_IS_DOCUMENT (document));
> +  GooString *goo_title = _poppler_goo_string_from_utf8(title);
> +  document->doc->setDocInfoTitle(goo_title);

We don't want to remove the entry if the conversion fails, so we should handle the case of NULL given by the user separately from the the case of _poppler_goo_string_from_utf8 returning NULL. These comments apply to all setters.

@@ +1122,4 @@
>  time_t
>  poppler_document_get_creation_date (PopplerDocument *document)
>  {
> +  g_return_val_if_fail (POPPLER_IS_DOCUMENT (document), -1);

I think we should always cast (time_t)-1

@@ +2831,5 @@
>    return NULL;
>  }
>  
> +GooString *
> +_poppler_convert_gtime_to_pdf_date (time_t gdate) {

We don't need to add a new function for this, there's timeToDateString() in the core.
Comment 48 Carlos Garcia Campos 2016-07-03 09:04:47 UTC
I've reviewed and pushed the core and glib patches, thanks!. The other patches should be reviewed by Pino and/or Albert.
Comment 49 Jakub Alba 2016-07-03 09:19:44 UTC
(In reply to Carlos Garcia Campos from comment #48)
> I've reviewed and pushed the core and glib patches, thanks!. The other
> patches should be reviewed by Pino and/or Albert.

Thanks a lot for your immense patience! :)
Comment 50 Jakub Alba 2016-07-16 23:46:03 UTC
Created attachment 125116 [details] [review]
glib: make document metatag gobject properties writeable

I'd completely forgotten about the existence of GObject properties. Sorry.
It may be useful to be able to modify the document properties via GObject properties. Thankfully, this patch is small.
Comment 51 Carlos Garcia Campos 2016-07-17 07:51:13 UTC
(In reply to Jakub Kucharski from comment #50)
> Created attachment 125116 [details] [review] [review]
> glib: make document metatag gobject properties writeable
> 
> I'd completely forgotten about the existence of GObject properties. Sorry.
> It may be useful to be able to modify the document properties via GObject
> properties. Thankfully, this patch is small.

Pushed, thanks!
Comment 52 Albert Astals Cid 2016-07-24 10:48:37 UTC
Commited, can you check QDateTime Document::modificationDate() const and QDateTime Document::creationDate() const in the qt frontends? I think they are leaky.
Comment 53 Jakub Alba 2016-07-24 11:48:02 UTC
Created attachment 125283 [details] [review]
[PATCH 1/2] qt4: fix memory leaks in Document::modificationDate() and Document::creationDate()

(In reply to Albert Astals Cid from comment #52)
> Commited, can you check QDateTime Document::modificationDate() const and
> QDateTime Document::creationDate() const in the qt frontends? I think they
> are leaky.

You are right.
Comment 54 Jakub Alba 2016-07-24 11:48:40 UTC
Created attachment 125284 [details] [review]
[PATCH 2/2] qt5: fix memory leaks in Document::modificationDate() and Document::creationDate()
Comment 55 Albert Astals Cid 2016-07-24 21:08:07 UTC
All pushed as far as i can see.
Comment 56 Jakub Alba 2016-07-24 21:18:00 UTC
(In reply to Albert Astals Cid from comment #55)
> All pushed as far as i can see.

Thanks!


Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.