Bug 36653 - Missing functions for setting document's properties
Missing functions for setting document's properties
Status: NEW
Product: poppler
Classification: Unclassified
Component: glib frontend
unspecified
Other All
: medium normal
Assigned To: poppler-bugs
:
: 71999 (view as bug list)
Depends on:
Blocks:
  Show dependency treegraph
 
Reported: 2011-04-28 01:29 UTC by Rodrigo Moya
Modified: 2014-10-08 20:27 UTC (History)
3 users (show)

See Also:


Attachments
Patch to add this new API (5.68 KB, patch)
2011-04-28 01:29 UTC, Rodrigo Moya
Details | Splinter Review
glib info-dict API (3.14 KB, patch)
2014-10-01 19:35 UTC, Scott West
Details | Splinter Review
core document info dict API (6.74 KB, patch)
2014-10-04 10:10 UTC, Scott West
Details | Splinter Review
core document info dict API rev 2 (5.58 KB, patch)
2014-10-05 19:13 UTC, Scott West
Details | Splinter Review

Note You need to log in before you can comment on or make changes to this bug.
Description Rodrigo Moya 2011-04-28 01:29:28 UTC
Created attachment 46136 [details] [review]
Patch to add this new API

The glib frontend misses functions for setting some of the document's properties, like author, subject and title. The attached patch adds them
Comment 1 Rodrigo Moya 2011-04-28 01:30:19 UTC
I wrote a simple test program to test it, so pasting here just in case it's useful:

#include <poppler.h>

int
main (int argc, char *argv[])
{
	PopplerDocument *document;
	GError *error = NULL;

	g_type_init ();

	if (argc != 6) {
		g_print ("Usage: %s filename output new_title new_author new_subject\n", argv[0]);
		return -1;
	}

	document = poppler_document_new_from_file (argv[1], NULL, &error);
	if (!document) {
		g_print ("Error opening %s: %s\n", argv[1], error->message);
		g_error_free (error);

		return -1;
	}

	g_print ("Changing title from %s to %s\n",
		 poppler_document_get_title (document),
		 argv[3]);
	poppler_document_set_title (document, argv[3]);

	g_print ("Changing author from %s to %s\n",
		 poppler_document_get_author (document),
		 argv[4]);
	poppler_document_set_author (document, argv[4]);

	g_print ("Changing subject from %s to %s\n",
		 poppler_document_get_subject (document),
		 argv[5]);
	poppler_document_set_subject (document, argv[5]);

	error = NULL;
	if (!poppler_document_save (document, argv[2], &error)) {
		g_print ("Error saving %s: %s\n", argv[2], error->message);
		g_error_free (error);

		return -1;
	}

	g_object_unref (document);

	return 0;
}
Comment 2 Carlos Garcia Campos 2011-04-28 08:27:52 UTC
Comment on attachment 46136 [details] [review]
Patch to add this new API

Hey rodrigo, patch looks goot to me. Common parts should be in the core, so that other frontends could use them too. Frontends currently get the info dict directly, so it's probably a good time to add a PDFInfo class in the core with setters and getters. See comments inline.

>From 3ce14bcbf198fc93365d664b0db5745680967764 Mon Sep 17 00:00:00 2001
>From: Rodrigo Moya <rodrigo@gnome-db.org>
>Date: Thu, 28 Apr 2011 10:24:05 +0200
>Subject: [PATCH] glib: Add _set functions for some PopplerDocument properties
>
>---
> glib/poppler-document.cc |  154 ++++++++++++++++++++++++++++++++++++++++++++++
> glib/poppler-document.h  |    6 ++
> 2 files changed, 160 insertions(+), 0 deletions(-)
>
>diff --git a/glib/poppler-document.cc b/glib/poppler-document.cc
>index 710a5b3..360aa42 100644
>--- a/glib/poppler-document.cc
>+++ b/glib/poppler-document.cc
>@@ -607,6 +607,46 @@ char *_poppler_goo_string_to_utf8(GooString *s)
>   return result;
> }
> 
>+GooString *_poppler_goo_string_from_utf8(const gchar *source)
>+{
>+  char *s, *utf8;
>+  gsize bytes_written;
>+  GooString *result;
>+
>+  utf8 = g_locale_to_utf8 (source, strlen (source), NULL, &bytes_written, NULL);

you shouldn't need this, we assume all strings used by the poppler-glib api are always utf-8, like glib/gtk+.

>+  s = g_convert (utf8, bytes_written,
>+		 "UTF-16BE", "UTF-8", NULL, &bytes_written, NULL);
>+  
>+  result = new GooString (s, bytes_written);

that's the only part that should be in the glib forntend

>+  if (!result->hasUnicodeMarker()) {
>+    result->insert(0, 0xff);
>+    result->insert(0, 0xfe);
>+  }

This should be in the core.

>+  g_free (s);
>+  g_free (utf8);
>+
>+  return result;
>+}
>+
>+static gboolean
>+mark_as_modified (PopplerDocument *document, Object *obj)
>+{
>+  gboolean result = FALSE;
>+  XRef *xref;
>+  Object info_dict;
>+
>+  xref = document->doc->getXRef ();
>+  if (xref != NULL) {
>+    xref->getDocInfoNF (&info_dict);
>+    xref->setModifiedObject (obj, info_dict.getRef ());

info_dict should be freed here with info_dict.free();

>+    result = TRUE;
>+  }
>+
>+  return result;
>+}

This method should be part fo the core too.

> static gchar *
> info_dict_get_string (Dict *info_dict, const gchar *key)
> {
>@@ -763,6 +803,44 @@ poppler_document_get_title (PopplerDocument *document)
> }
> 

if we add the PDFDocInfo class, all get/set methods would simply use that class.
Comment 3 Jose Aliste 2013-11-25 22:08:55 UTC
*** Bug 71999 has been marked as a duplicate of this bug. ***
Comment 4 Scott West 2014-10-01 19:35:15 UTC
Created attachment 107206 [details] [review]
glib info-dict API

This is a patch that uses Rodrigo's original patch as a guide for usage of the internal API. It adds support for setting/getting strings from the info dict, with the original impetus to make it possible to add custom metadata in a convenient way.

It could be used in a later step to add support for setting title/subject/etc, which would relate it pretty directly to this bug.
Comment 5 Albert Astals Cid 2014-10-02 19:40:48 UTC
I'm going to repeat Carlos here "Common parts should be in the core, so that other frontends could use them too."
Comment 6 Scott West 2014-10-03 17:15:08 UTC
(In reply to Albert Astals Cid from comment #5)
> I'm going to repeat Carlos here "Common parts should be in the core, so that
> other frontends could use them too."

(Sorry about the delay, didn't CC myself before)

Okay, I will give that a try.

The review also recommends creating a PDFDocInfo class, is that still recommended? And if so, should it replace the return type of the "getDocInfo" method?
Comment 7 Scott West 2014-10-04 10:10:53 UTC
Created attachment 107317 [details] [review]
core document info dict API

This adds an API to PDFDoc to get/set string entries in the doc's info dict.
Comment 8 Albert Astals Cid 2014-10-05 18:21:02 UTC
Please use proper style for the core code, i.e. don't add those jumps on function declarations.

Also no need to check if the xref is there or not, if the doc doesn't have an xref you don't really have a doc and should be calling this anyway.
Comment 9 Scott West 2014-10-05 18:34:35 UTC
(In reply to Albert Astals Cid from comment #8)
> Please use proper style for the core code, i.e. don't add those jumps on
> function declarations.
> 
> Also no need to check if the xref is there or not, if the doc doesn't have
> an xref you don't really have a doc and should be calling this anyway.

Thanks for the comments! I've removed the check on the Doc's XRef and modified the type signature appropriately.

What "jumps" on the declarations? Do you mean the newline between the get and set in PDFDoc.h? Is there a style guide somewhere for future reference?
Comment 10 Albert Astals Cid 2014-10-05 18:46:38 UTC
+GBool
+PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)

has to be

+GBool PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)

There is no document guideline, the file you're editing is your style guideline. Also you're not using the same indentation style.

In the core we use the crazy 2 space indentation and 8 spaces tab.
Comment 11 Scott West 2014-10-05 19:12:59 UTC
(In reply to Albert Astals Cid from comment #10)
> +GBool
> +PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)
> 
> has to be
> 
> +GBool PDFDoc::setDocInfoStringEntry(const char *key, GooString *value)
> 
> There is no document guideline, the file you're editing is your style
> guideline. Also you're not using the same indentation style.
> 
> In the core we use the crazy 2 space indentation and 8 spaces tab.

Ah, thanks! Okay I've addressed the useless check and style issues. If I understand that tabs are used every 8 spaces... crazy indeed :).
Comment 12 Scott West 2014-10-05 19:13:38 UTC
Created attachment 107369 [details] [review]
core document info dict API rev 2

This addresses useless checks and style issues.
Comment 13 Albert Astals Cid 2014-10-07 22:13:13 UTC
I don't think you should add the unicode marker, the goostring should come correctly formated by the caller.
Comment 14 Scott West 2014-10-08 04:37:15 UTC
(In reply to Albert Astals Cid from comment #13)
> I don't think you should add the unicode marker, the goostring should come
> correctly formated by the caller.

It seemed that Carlos was asking for this to be in the core in his review of the original patch by Rodrigo. I have no opinion on where it should go, so whatever you feel is best :).
Comment 15 Albert Astals Cid 2014-10-08 20:27:58 UTC
Carlos why do you want the core to add that marker? Not using the marker is fine and in my opinion whoever calls the function should pass a correctly formatted GooString, not relying on the core to fixup the GooString.