99416 – Sign PDF with digital signature

Bug 99416 - Sign PDF with digital signature

Summary: Sign PDF with digital signature

Status:	RESOLVED MOVED

Alias:	None

Product:	poppler
Classification:	Unclassified
Component:	general (show other bugs)
Version:	unspecified
Hardware:	All All

Importance:	medium enhancement
Assignee:	poppler-bugs
QA Contact:

URL:
Whiteboard:
Keywords:

Depends on:
Blocks:

Reported:	2017-01-15 08:52 UTC by Hanno Meyer-Thurow
Modified:	2018-08-21 11:00 UTC (History)
CC List:	7 users (show)

See Also:
i915 platform:
i915 features:

Attachments
Patch fixing this bug (22.71 KB, patch) 2017-06-07 14:24 UTC, Hans-Ulrich Jüttner	Details \| Splinter Review
Patch fixing this bug and adding support for signatures with SubFilter "ETSI.CAdES.detached" (26.65 KB, patch) 2017-06-09 07:29 UTC, Hans-Ulrich Jüttner	Details \| Splinter Review
Patch fixing this bug and adding support for signatures with SubFilter "ETSI.CAdES.detached" (26.86 KB, patch) 2017-06-13 07:26 UTC, Hans-Ulrich Jüttner	Details \| Splinter Review
Added signing of PDF documents via Qt5 interface and with pdfsig (28.41 KB, patch) 2017-08-16 14:40 UTC, Hans-Ulrich Jüttner	Details \| Splinter Review
Added signing of PDF documents via Qt5 interface and with pdfsig (29.24 KB, patch) 2017-08-17 14:02 UTC, Hans-Ulrich Jüttner	Details \| Splinter Review
pdfsig: add -nssdi option (11.26 KB, patch) 2017-09-06 23:19 UTC, Adrian Johnson	Details \| Splinter Review
write document then update byte offsets and sig on disk (16.66 KB, patch) 2017-09-06 23:23 UTC, Adrian Johnson	Details \| Splinter Review
Source code of a small test utility (2.12 KB, application/x-bzip) 2017-09-07 08:02 UTC, Hans-Ulrich Jüttner	Details
Script to generate certificates (463 bytes, text/plain) 2017-09-07 09:43 UTC, Adrian Johnson	Details
Return timezone in timeToDateString (2.76 KB, patch) 2017-09-07 13:32 UTC, Adrian Johnson	Details \| Splinter Review
pdfsig: add -nssdi option v2 (12.19 KB, patch) 2017-09-08 11:44 UTC, Adrian Johnson	Details \| Splinter Review
write document then update byte offsets and sig on disk v2 (22.85 KB, patch) 2017-09-08 11:47 UTC, Adrian Johnson	Details \| Splinter Review
Patch for a bug with missing -nssdir option in utils/pdfsig.cc (1.89 KB, patch) 2017-11-28 09:31 UTC, Hans-Ulrich Jüttner	Details \| Splinter Review
Show Obsolete (6) View All

Description Hanno Meyer-Thurow 2017-01-15 08:52:34 UTC

Since "support for digital signature" bug 16770 is included, how about adding support for actually signing PDF files?

For reference:
LibreOffice 5.3 and newer supports signing PDF 1.5+ format[0]. I wonder if it is possible to reuse their code wrt licensing? Just in case someone is interested, may talk to LibreOffice developers.

I mailed the LibreOffice developer, but got no response (yet).

[0] http://vmiklos.hu/blog/pdf-sign.html (LibreOffice developer blog post)

Comment 1 Hanno Meyer-Thurow 2017-01-17 07:28:27 UTC

I got a response from Miklos Vajna. Below are the interesting parts:

On Mon, Jan 16, 2017 at 10:15:53AM +0100, Miklos Vajna
<vmiklos@collabora.co.uk> wrote:
> On Wed, Jan 11, 2017 at 08:42:41PM +0100, Hanno Meyer-Thurow
> <h.mth@web.de> wrote:
> > I think it would be great to enable the poppler library to sign PDF
> > files. Though, I wonder if it is possible to port the code from
> > LibreOffice to poppler wrt to licensing.
> 
> Should be OK, LibreOffice is MPL licensed (weak copyleft), while
> poppler is GPL (copyleft), so copying code from LO to poppler should
> be fine.
> 
> > A question: is it possible to sign single form fields with
> > LibreOffice or just a complete document?  
> 
> LO can only sign the whole document.


I read on Gentoo forums[0], that MuPDF with gostscript-gpl is able to sign PDF files. One may have a look there, too.

[0] https://forums.gentoo.org/viewtopic-p-8017048.html#8017048

Comment 2 Hans-Ulrich Jüttner 2017-06-07 14:24:14 UTC

Created attachment 131773 [details] [review]
Patch fixing this bug

The proposed patch would fix this bug.

The Qt5 interface in qt5/src/poppler-form.h gets the following new method
in class FormFieldSignature:
/**
   Sign the whole document in this signature field.

   @param certNickname nickname of the signing certificate in the mozilla database
   @param password     password to unlock the private key
                       (may be empty if no password is set)
   @param digestAlg    digest algorithm to be used in the signature
 */
bool sign(const QString& certNickname, const QString& password,
          DigestAlgorithm digestAlg = SHA256);

Comment 3 Hans-Ulrich Jüttner 2017-06-09 07:29:19 UTC

Created attachment 131824 [details] [review]
Patch fixing this bug and adding support for signatures with SubFilter "ETSI.CAdES.detached"

In addition to the bugfix I added support for signatures with SubFilter
"ETSI.CAdES.detached". For this I added to the Qt5 interface in
qt5/src/poppler-form.h the

enum SignatureType {
  AdbePkcs7sha1,
  AdbePkcs7detached,
  EtsiCAdESdetached
}

and getter and setter

SignatureType signatureType();
void setSignatureType(SignatureType type);

Comment 4 Hans-Ulrich Jüttner 2017-06-13 07:26:31 UTC

Created attachment 131917 [details] [review]
Patch fixing this bug and adding support for signatures with SubFilter "ETSI.CAdES.detached"

In some cases the signature must be completely recalculated after
setting the byte range to the final values.

Comment 5 Hanno Meyer-Thurow 2017-06-15 07:51:47 UTC

Readd poppler-bugs to CC for now.

We got a PATCH here.

Comment 6 Hans-Ulrich Jüttner 2017-06-15 09:29:16 UTC

Albert Astals Cid in a comment to bug #99271 pointed out to me that I should not assign bugs to myself.

Comment 7 oliver.sander 2017-06-15 21:05:42 UTC

I tried the patch, but it doesn't apply cleanly.  I suppose that's because it depends on some incarnation of the patches in

  https://bugs.freedesktop.org/show_bug.cgi?id=99271

So getting those in would help with reviewing here.

Anyway, I manually reviewed the code, and found a few (mainly cosmetic) issues. Generally, the code would benefit from a bit of in-code documentation.  

The patch does not add copyright attributions to all the files it touches.

Add your copyright line?


--- a/poppler/Form.cc
+++ b/poppler/Form.cc
@@ -455,6 +467,50 @@ Object* FormWidgetSignature::getByteRange()
   return static_cast<FormFieldSignature*>(field)->getByteRange();
 }

+GBool FormWidgetSignature::signDocument(const char* certNickname, const char* digestName,
+                                        const char* password, const char* reason)
+{
+  GBool ok = gFalse;
+  if (certNickname != nullptr)

Please use if (certNickname), for consistency.


+  {
+    time_t now = time(nullptr);
+    unsigned char tmp_buffer[4];
+    memcpy(tmp_buffer, "PDF", 4);
+    SignatureHandler sigHandler(certNickname, SignatureHandler::getHashOidTag(digestName));
+    sigHandler.updateHash(tmp_buffer, 4);
+    // calculate a signature over tmp_buffer with the certificate to get it's size

typo: it is "its size", not "it's size"


+    GooString* tmpSignature = sigHandler.signDetached(password, now);
+    if (tmpSignature != nullptr)

if (tmpSignature)


+    {
+      FormFieldSignature* signatureField = static_cast<FormFieldSignature*>(field);
+      GooString gReason(reason != nullptr ? reason : "I approve this document");
+      GooString gName(sigHandler.getSignerName());
+      Object vObj;
+      ok = createSignature(vObj, gName, gReason, tmpSignature, &now);
+      if (ok)
+      {
+        // calculate hash with preliminary values for ranges
+        sigHandler.restartHash();
+        prepareSignature(&sigHandler);
+        ok = updateSignature(vObj, tmpSignature);
+      }
+      if (ok)
+      {

I find this style of having consecutive blocks of "if (ok)" confusing.
You think they must all evaluate the same, until you realize that the
value of 'ok' changes in between.  Personally, I'd prefer to have

if (!ok)
  return gFalse;

at each location where 'ok' changes, rather than having these large
conditional blocks.

But that's subjective :-)




+        // recalculate hash with the correct ranges
+        sigHandler.restartHash();
+        prepareSignature(&sigHandler);
+        GooString* signature = sigHandler.signDetached(password, now);
+        if (signature != nullptr)

  if (signature)

@@ -535,7 +591,7 @@ GooString* FormWidgetSignature::getCheckedSignature()
                 break;
               }
             }
-            if (sigLen > 0 && 2*(sigLen+lenBytes) < len-4)
+            if (sigLen > 0 && 2*(sigLen+lenBytes) <= len-4)


This smells like a separate bugfix.  Should this go into a separate patch?

@@ -565,6 +621,158 @@ GooString* FormWidgetSignature::getCheckedSignature()
   return nullptr;
 }

+GBool FormWidgetSignature::prepareSignature(SignatureHandler *handler)
+{
+  GBool ok = gFalse;
+  size_t size;
+  char* membuf;
+  FILE* mfp = open_memstream(&membuf, &size);
+  if (mfp != nullptr)

if (mfp)

+  {
+    FileOutStream* outStr = new FileOutStream(mfp, 0);
+    doc->saveAs(outStr, writeForceIncremental);
+    fclose(mfp);
+    if (membuf != nullptr)
+    {

if (membuf)

+      for (sig_start = 0; sig_start < size; ++sig_start)
+      {

The loop variable 'sig_start' is a data member, right?
Personally, I find this a bit confusing.

The following part would benefit from a bit a documentation.

+        if (strncmp(&membuf[sig_start], "/Contents <308", 14) == 0)
+        {
+          char* p = index(&membuf[sig_start+14], '>');
+          if (p != nullptr)

if (p)

+          {
+            sig_end = ++p - membuf;
+            p = strstr(&membuf[sig_end], "%%%%EOF\r\n");
+            for (file_size = sig_end; file_size < size; ++file_size)
+            {
+              if (membuf[file_size] == '%' && membuf[file_size+1] == '%' &&
+                  membuf[file_size+2] == 'E' && membuf[file_size+3] == 'O' &&
+                  membuf[file_size+4] == 'F' && membuf[file_size+5] == '\r' &&
+                  membuf[file_size+6] == '\n')

Is there not a more C++-like way to do this check?  Something like

  if (std::string(membuf[file_size]) == "%%EOF\r\n") ?

Or use strncmp ?

+      if (ok)
+      {
+        ok = gFalse;
+        for (range_offset = sig_end; range_offset < file_size; ++range_offset)
+        {
+          if (strncmp(&membuf[range_offset], "/ByteRange [0 ", 14) == 0)
+          {
+            char* p = index(&membuf[range_offset+14], ']');
+            if (p != nullptr)

 if (p)

+            {
+              range_size = (++p - membuf) - range_offset;
+              ok = gTrue;
+              break;
+            }
+          }
+        }
+      }
+      if (ok)
+      {
+        char* range_buf = new char[128];

Is it certain that 128 characters will always be enough?

+        Goffset end_size = file_size - sig_end;
+        Goffset s = sprintf(range_buf, "/ByteRange [0 %lld %lld %lld ]",
+                            sig_start, sig_end, end_size);
+        char* buffer = new char[file_size-sig_end+64];

Maybe use end_size here instead of file_size-sig_end ?

+GBool FormWidgetSignature::createSignature(Object &vObj, const GooString& name,
+                                           const GooString& reason,
+                                           const GooString* signature,

Why are two strings passed by-reference, and one as a pointer?

+                                           const time_t* sTime)
+{
+  GBool ok = obj.isDict();
+  if (ok)
+  {
+    file_size = doc->getBaseStream()->getLength();
+    Object bObj, obj1;
+    obj.dictSet("V", vObj.initDict(xref));
+    vObj.dictAdd(copyString("Type"), obj1.initName("Sig"));
+    vObj.dictAdd(copyString("Filter"), obj1.initName("Adobe.PPKLite"));
+    switch (signatureType()) {
+    case adbe_pkcs7_sha1:
+      // we don't support signing with SubFilter "adbe.pkcs7.sha1"

This will fall through.  Shouldn't there be 'return gFalse' here ?

+    case adbe_pkcs7_detached:
+      vObj.dictAdd(copyString("SubFilter"), obj1.initName("adbe.pkcs7.detached"));
+      break;
+    case ETSI_CAdES_detached:
+      vObj.dictAdd(copyString("SubFilter"), obj1.initName("ETSI.CAdES.detached"));
+      break;
+    }
+    vObj.dictAdd(copyString("Name"), obj1.initString(name.copy()));
+    char buf[24];
+    size_t size = strftime(buf, 24, "D:%Y%m%d%H%M%S%z", localtime(sTime));

What does the following code block do?

+    if (size >= 2 && size < 22)
+    {
+      buf[size] = buf[size-1];
+      buf[size-1] = buf[size-2];
+      buf[size-2] = '\'';
+      buf[++size] = '\'';
+      buf[++size] = '\0';
+      GooString gTime(buf, size);
+      vObj.dictAdd(copyString("M"), obj1.initString(gTime.copy()));
+    }

[snip]

+GBool FormWidgetSignature::updateSignature(Object &vObj, const GooString* signature)
+{
+  GBool ok = vObj.isDict();

Why not simply start here with

  if (! vObj.isDict() || !signature)
    return gFalse;

+  if (ok && signature != nullptr)
+  {
+    Object bObj, obj1;
+    vObj.dictSet("Contents", obj1.initString(signature->copy()));
+    bObj.initArray(xref);
+    bObj.arrayAdd(obj1.initInt64(0));
+    bObj.arrayAdd(obj1.initInt64(sig_start));
+    bObj.arrayAdd(obj1.initInt64(sig_end));
+    bObj.arrayAdd(obj1.initInt64(file_size-sig_end));
+    vObj.dictSet("ByteRange", &bObj);
+    obj1.free();
+  }
+  return ok;
+}
+
 void FormWidgetSignature::updateWidgetAppearance()
 {
   // Unimplemented
@@ -1554,7 +1762,16 @@ void FormFieldSignature::parseInfo()

   // check if subfilter is supported for signature validation, only detached signatures work for now
   sig_dict.dictLookup("SubFilter", &subfilterName);
-  if (subfilterName.isName("adbe.pkcs7.detached") || subfilterName.isName("adbe.pkcs7.sha1")) {
+  if (subfilterName.isName("adbe.pkcs7.sha1")) {
+    signature_type = adbe_pkcs7_sha1;
+    signature_info->setSubFilterSupport(true);
+  }
+  if (subfilterName.isName("adbe.pkcs7.detached")) {
+    signature_type = adbe_pkcs7_detached;
+    signature_info->setSubFilterSupport(true);
+  }
+  else if (subfilterName.isName("ETSI.CAdES.detached")) {
+    signature_type = ETSI_CAdES_detached;
     signature_info->setSubFilterSupport(true);
   }

Does the code behave sanely if the subFilter name is something other
than these three possibilities?

--- a/poppler/Form.h
+++ b/poppler/Form.h
@@ -252,18 +258,37 @@ public:
   FormWidgetSignature(PDFDoc *docA, Object *dict, unsigned num, Ref ref, FormField *p);
   void updateWidgetAppearance() override;

+  FormSignatureType signatureType();
+  void setFormSignatureType(FormSignatureType type);
   SignatureInfo *validateSignature(bool doVerifyCert, bool forceRevalidation, time_t validationTime = -1);
-  Object* getByteRange();
+  Object *getByteRange();

Cosmetic change


   // checks the length encoding of the signature and returns the hex encoded signature
   // if the check passed otherwise a nullptr is returned
-  GooString* getCheckedSignature();
+  GooString *getCheckedSignature();

Cosmetic change

   // this method only gives the correct file size if getCheckedSignature()
   // has been called before
   Goffset getCheckedFileSize() const { return file_size; }
+

Cosmetic change

--- a/poppler/PDFDoc.cc
+++ b/poppler/PDFDoc.cc
@@ -1215,6 +1215,14 @@ void PDFDoc::writeString (GooString* s, OutStream* outStr, Guchar *fileKey,
       outStr->printf("%c", unescaped);
     }
     outStr->printf(") ");
+  } else if (s->hasASN1Marker()) {
+    const char* c = s->getCString();
+    outStr->printf("<");
+    for(int i=0; i<s->getLength(); i++) {
+      unsigned char value = *(c+i)&0x000000ff;
+      outStr->printf("%2.2x", value);
+    }
+    outStr->printf("> ");

Add a brief comment explaining what the previous code block does.


--- a/poppler/SignatureHandler.cc
+++ b/poppler/SignatureHandler.cc
@@ -22,6 +22,18 @@
 #include <dirent.h>
 #include <Error.h>

+/* NSS headers */
+#include <secpkcs7.h>
+
+void SignatureHandler::outputCallback(void* arg, const char* buf, unsigned long len)
+{
+  if (arg != nullptr && buf != nullptr)

if (arg && buf)

@@ -129,7 +167,7 @@ GooString *SignatureHandler::getDefaultFirefoxCertDB_Linux()
 /**
  * Initialise NSS
  */
-void SignatureHandler::init_nss()
+void SignatureHandler::init_nss()

Where is the change here?  Trailing whitespace?

@@ -145,11 +183,12 @@ void SignatureHandler::init_nss()


 SignatureHandler::SignatureHandler(unsigned char *p7, int p7_length)
- : hash_context(NULL),
-   CMSMessage(NULL),
-   CMSSignedData(NULL),
-   CMSSignerInfo(NULL),
-   temp_certs(NULL)
+ : hash_context(nullptr),
+   CMSMessage(nullptr),
+   CMSSignedData(nullptr),
+   CMSSignerInfo(nullptr),
+   signing_cert(nullptr),
+   temp_certs(nullptr)

Use nullptr for new code, but don't replace existings NULLs.
That way it is easier to spot new code.

@@ -251,7 +316,6 @@ NSSCMSSignerInfo *SignatureHandler::CMS_SignerInfoCreate(NSSCMSSignedData * cms_
 {
   NSSCMSSignerInfo *signerInfo = NSS_CMSSignedData_GetSignerInfo(cms_sig_data, 0);
   if (!signerInfo) {
-    printf("Error in NSS_CMSSignedData_GetSignerInfo()\n");

Removing this line is unrelated to your main objective.
If you don't like it please propose its removal in a separate patch.

@@ -342,6 +406,54 @@ SECErrorCodes SignatureHandler::validateCertificate(time_t validation_time)
   return retVal;
 }

+GooString* SignatureHandler::signDetached(const char* password, time_t signing_time)
+{
+  if (hash_context == nullptr)

if (!hash_context)

--- a/qt5/src/poppler-form.cc
+++ b/qt5/src/poppler-form.cc
@@ -182,7 +182,6 @@ Link *FormField::additionalAction(AdditionalActionType type) const
   return action;
 }

-

Cosmetic change: one line removed

@@ -564,6 +563,73 @@ FormField::FormType FormFieldSignature::type() const
   return FormField::FormSignature;
 }

+FormFieldSignature::SignatureType FormFieldSignature::signatureType()
+{
+  SignatureType sigType = AdbePkcs7detached;
+  FormWidgetSignature* fws = static_cast<FormWidgetSignature*>(m_formData->fm);
+  switch (fws->signatureType())
+  {
+    case adbe_pkcs7_sha1:
+      sigType = AdbePkcs7sha1;
+      break;
+    case adbe_pkcs7_detached:
+      sigType = AdbePkcs7detached;
+      break;
+    case ETSI_CAdES_detached:
+      sigType = EtsiCAdESdetached;
+      break;
+  }

Do we need to guard against the possibility that fws->signaturType() is
not one of these three types?

+  return sigType;
+}
+
+void FormFieldSignature::setSignatureType(SignatureType type)
+{
+  FormWidgetSignature* fws = static_cast<FormWidgetSignature*>(m_formData->fm);
+  switch (type)
+  {
+    case AdbePkcs7sha1:
+      fws->setFormSignatureType(adbe_pkcs7_sha1);
+      break;
+    case AdbePkcs7detached:
+      fws->setFormSignatureType(adbe_pkcs7_detached);
+      break;
+    case EtsiCAdESdetached:
+      fws->setFormSignatureType(ETSI_CAdES_detached);
+      break;
+  }

Do we need to guard against the possibility that 'type' is
not one of these three types?


@@ -642,9 +708,10 @@ SignatureValidationInfo FormFieldSignature::validate(int opt, const QDateTime& v
       }
     }
   }
-  if (priv->range_bounds.size() == 4)
+  GooString* checkedSignature = fws->getCheckedSignature();
+  if (priv->range_bounds.size() == 4 && checkedSignature != nullptr)

 ... && checkedSignature)


--- a/qt5/src/poppler-form.h
+++ b/qt5/src/poppler-form.h
@@ -487,7 +504,21 @@ namespace Poppler {

 	FormType type() const override;

-	/**
[snip]
+
+       /**

Comment 8 Hans-Ulrich Jüttner 2017-06-16 07:45:19 UTC

(In reply to oliver.sander from comment #7)
> I tried the patch, but it doesn't apply cleanly.  I suppose that's because
> it depends on some incarnation of the patches in
> 
>   https://bugs.freedesktop.org/show_bug.cgi?id=99271
> 
> So getting those in would help with reviewing here.

You are right, this patch depends heavily on the patch #99271.
Unfortunately it started off from that patch as it was in comment 7
before rework after your review on #99271. That results in not applying
cleanly now for this patch. There seems to be hope that we get patch
#99271 in some time soon. Therefore I suggest that I'll revise this
patch then on a clean basis with attention to your comments below.
 
[snip]

> @@ -535,7 +591,7 @@ GooString* FormWidgetSignature::getCheckedSignature()
>                  break;
>                }
>              }
> -            if (sigLen > 0 && 2*(sigLen+lenBytes) < len-4)
> +            if (sigLen > 0 && 2*(sigLen+lenBytes) <= len-4)
> 
> 
> This smells like a separate bugfix.  Should this go into a separate patch?
> 

Yes this should go into patch #99271 as getCheckedSignature() was added there.

[snip]

Thank you for the review!

Comment 9 Albert Astals Cid 2017-08-15 21:54:03 UTC

Can we get a proper description for the patch? Because " Patch fixing this bug and adding support for signatures with SubFilter "ETSI.CAdES.detached" " doesn't seem to do much against "Sign PDF with digital signature".

Also the patch needs updating since it doesn't apply.

Comment 10 Hans-Ulrich Jüttner 2017-08-16 14:40:27 UTC

Created attachment 133553 [details] [review]
Added signing of PDF documents via Qt5 interface and with pdfsig

The patch has been updated to apply cleanly on the master branch
with the comments from Oliver Sander's review in mind.
It adds signing of PDF documents with digital signatures via Qt5 interface.
It also enhances pdfsig to sign PDF documents with parameter -s.
The signature type may be selected with method setSignatureType() or with the
parameter -etsi in pdfsig.

Comment 11 Adrian Johnson 2017-08-17 12:40:52 UTC

I'd like to keep the pdfsig command line usage consistent with the other poppler utils.

Most options in the poppler utils are one word, abbreviated word, or acronym. Single character options are generally only used for some very common options such as first/last page, help, and version.

I would suggest the following:

"-sign" to sign the document
"-nick" to specify cert nickname
"-digest" to specify the digest algorithm. Or maybe "-hash" is more common.
"-reason" to specify the reason for signing

I don't think the default reason is required. Looking at PDF32000, the reason is optional so omit it from the PDF if not specified.

Poppler uses "-opw" and -"-upw" for owner and user password. I suggest "-kpw" for key password.

The "-o" option is not required. Other utils list the output file after the input. eg
  pdfsig [options] [input-file] [output-file].

"-etsi" - how many signature types are available? Maybe "-type <sigtype" would be more flexible.


+  certNickname[0] = '\0';
+  password[0] = '\0';
+  strcpy(digestName, "SHA256");
+  reason[0] = '\0';
+  output[0] = '\0';

The other utils initialise string arguments in the declaration:
eg
  static char ownerPassword[33] = "";

Comment 12 Hans-Ulrich Jüttner 2017-08-17 14:02:21 UTC

Created attachment 133576 [details] [review]
Added signing of PDF documents via Qt5 interface and with pdfsig

I changed the patch accordingly except for -etsi.
There are only two types handled: adbe.pkcs7.detached and ETSI.CAdES.detached.
I believe it would be a burden for the user to specify the exact type string.
Switching from adbe.pkcs7.detached to ETSI.CAdES.detached with boolean parameter
-etsi is easier.

Comment 13 Albert Astals Cid 2017-08-21 22:40:38 UTC

is the ASN1 thing in the PDF spec? I can't find it

This is only useful for documents that already have a signature field, right?

Not that i'm an ultra fan of windows, but open_memstream seems not to work there, is there anything else we could use? (No idea if anyone has tried building the nss3 code in windows but maybe better if we don't make it harder for them than necessary)

Comment 14 Hans-Ulrich Jüttner 2017-08-28 07:37:10 UTC

(In reply to Albert Astals Cid from comment #13)
> is the ASN1 thing in the PDF spec? I can't find it
> 

The document from Adobe
https://www.adobe.com/devnet-docs/acrobatetk/tools/DigSig/Acrobat_DigitalSignatures_in_PDF.pdf
describes the generation of digital signatures in PDF documents. It says
"5. The hash value is encrypted with the signer’s private key and a hex-encoded PKCS#7 object signature object is generated.
6. The signature object is placed in the file on disk, overwriting the placeholder /Contents value. Any space not used for the signature object is overwritten with zeros." RFC2315 (https://tools.ietf.org/html/rfc2315) describes PKCS#7 objects
as ASN.1 structures.

> This is only useful for documents that already have a signature field, right?
> 

Yes, I couldn't figure out how to create such a signature field with poppler.
So I content myself with the case where there is already a signature field in
the PDF document.

> Not that i'm an ultra fan of windows, but open_memstream seems not to work
> there, is there anything else we could use? (No idea if anyone has tried
> building the nss3 code in windows but maybe better if we don't make it
> harder for them than necessary)

I'm sorry about that, but I don't have access to a windows machine and,
moreover, do not know much about windows.

Comment 15 Adrian Johnson 2017-08-28 10:49:36 UTC

+  char buf[24];
+  time_t now = time(nullptr);
+  size_t size = strftime(buf, 24, "D:%Y%m%d%H%M%S%z", localtime(&now));
+  if (size >= 2 && size < 22)
+  {
+    // put timezone info into single quotes
+    buf[size] = buf[size-1];
+    buf[size-1] = buf[size-2];
+    buf[size-2] = '\'';
+    buf[++size] = '\'';
+    buf[++size] = '\0';
+    GooString gTime(buf, size);
+    vObj.dictAdd(copyString("M"), Object(gTime.copy()));
+  }

PDF date format uses a single quote to separate the time zone hour and minute, not enclose the minutes. strftime "%z" on windows returns the timezone name, not the offset.

We already have a time to PDF date function: DateInfo::timeToDateString(). It doesn't currently add the timezone but I'll fix that that when I submit some win32 patches I am working on.

FormWidgetSignature::prepareSignature is writing the entire PDF to memory. It would be better to write it to a temp file then search it on disk. I don't like the way you search the PDF file for matching strings. It could end up matching some random data in another stream. It would be better to get the offset of the dict object you want to search and start the search for there for up to some reasonable maximum.

eg

XRefEntry *entry = doc->getXRef()->getEntry(ref);
Goffset objOffset = entry->offset;

Also, don't search for %%EOF to get the file size. Use doc->getBaseStream()->getLength() or Gfseek(f, 0, SEEK_END), Gftell().

I don't fully understand what signDocument() and prepareSignature() are doing. Which means some comments would be helpful. It appears you are saving the PDF, computing the signature, updating the signature object, then saving the PDF again. This assumes that the second save will produce an identical file except for the updated signature. I'm not sure if we can assume this will always be true.

My naive understanding is that we could just create a signature dict with a dummy signature and offsets (large enough for any size, XRef allows up to 10 digits), save the PDF, compute the signature, and overwrite the dummy signature and offsets in the disk file with the real signature. What am I missing?

Comment 16 Hans-Ulrich Jüttner 2017-08-28 12:20:25 UTC

(In reply to Adrian Johnson from comment #15)
> +  char buf[24];
> +  time_t now = time(nullptr);
> +  size_t size = strftime(buf, 24, "D:%Y%m%d%H%M%S%z", localtime(&now));
> +  if (size >= 2 && size < 22)
> +  {
> +    // put timezone info into single quotes
> +    buf[size] = buf[size-1];
> +    buf[size-1] = buf[size-2];
> +    buf[size-2] = '\'';
> +    buf[++size] = '\'';
> +    buf[++size] = '\0';
> +    GooString gTime(buf, size);
> +    vObj.dictAdd(copyString("M"), Object(gTime.copy()));
> +  }
> 
> PDF date format uses a single quote to separate the time zone hour and
> minute, not enclose the minutes. strftime "%z" on windows returns the
> timezone name, not the offset.
> 
> We already have a time to PDF date function: DateInfo::timeToDateString().
> It doesn't currently add the timezone but I'll fix that that when I submit
> some win32 patches I am working on.
> 

That's fine, I didn't know about DateInfo::timeToDateString(). I'll use it
when you have fixed the timezone issue.

> FormWidgetSignature::prepareSignature is writing the entire PDF to memory.
> It would be better to write it to a temp file then search it on disk. I
> don't like the way you search the PDF file for matching strings. It could
> end up matching some random data in another stream. It would be better to
> get the offset of the dict object you want to search and start the search
> for there for up to some reasonable maximum.
> 
> eg
> 
> XRefEntry *entry = doc->getXRef()->getEntry(ref);
> Goffset objOffset = entry->offset;
> 
> Also, don't search for %%EOF to get the file size. Use
> doc->getBaseStream()->getLength() or Gfseek(f, 0, SEEK_END), Gftell().
> 

Ok, but is it guaranteed that this offset and the length of doc->getBaseStream()
are always updated when some objects in the PDF have been changed?

> I don't fully understand what signDocument() and prepareSignature() are
> doing. Which means some comments would be helpful. It appears you are saving
> the PDF, computing the signature, updating the signature object, then saving
> the PDF again. This assumes that the second save will produce an identical
> file except for the updated signature. I'm not sure if we can assume this
> will always be true.
> 

As the user later wants to save the signed document for example with the
PDFConverter from the Qt5 interface, saving the document again has to produce
an identical byte stream. Otherwise the signature would no longer be valid.

> My naive understanding is that we could just create a signature dict with a
> dummy signature and offsets (large enough for any size, XRef allows up to 10
> digits), save the PDF, compute the signature, and overwrite the dummy
> signature and offsets in the disk file with the real signature. What am I
> missing?

I was striving not to waste to much space for the signature. That's why I
calculated a signature with the user's certificate and a hash over an empty
string first to see how much space it will take. I had the problem that the
byte range object itself changed it's size in the byte stream when some number
in it gets an additional digit. Therefore I recalculated the byte range and
after it the signature once again. May bee this can be avoided if some extra
bytes are added to the space reserved for the signature.

Comment 17 Adrian Johnson 2017-08-28 12:46:27 UTC

(In reply to Hans-Ulrich Jüttner from comment #16)
> > FormWidgetSignature::prepareSignature is writing the entire PDF to memory.
> > It would be better to write it to a temp file then search it on disk. I
> > don't like the way you search the PDF file for matching strings. It could
> > end up matching some random data in another stream. It would be better to
> > get the offset of the dict object you want to search and start the search
> > for there for up to some reasonable maximum.
> > 
> > eg
> > 
> > XRefEntry *entry = doc->getXRef()->getEntry(ref);
> > Goffset objOffset = entry->offset;
> > 
> > Also, don't search for %%EOF to get the file size. Use
> > doc->getBaseStream()->getLength() or Gfseek(f, 0, SEEK_END), Gftell().
> > 
> 
> Ok, but is it guaranteed that this offset and the length of
> doc->getBaseStream()
> are always updated when some objects in the PDF have been changed?

After saving the PDF, open the saved PDF in a new PDFDoc object.

> > I don't fully understand what signDocument() and prepareSignature() are
> > doing. Which means some comments would be helpful. It appears you are saving
> > the PDF, computing the signature, updating the signature object, then saving
> > the PDF again. This assumes that the second save will produce an identical
> > file except for the updated signature. I'm not sure if we can assume this
> > will always be true.
> > 
> 
> As the user later wants to save the signed document for example with the
> PDFConverter from the Qt5 interface, saving the document again has to produce
> an identical byte stream. Otherwise the signature would no longer be valid.

I don't really understand how this works. Do you mean save a copy just after we signed the document? Or save any PDF that already has a signature? If we want to save an identical copy with the signature preserved we should just copy the file. It will be a lot faster. It may work now but what happens if we decide to always update the ModDate each time the PDF is saved?

> > My naive understanding is that we could just create a signature dict with a
> > dummy signature and offsets (large enough for any size, XRef allows up to 10
> > digits), save the PDF, compute the signature, and overwrite the dummy
> > signature and offsets in the disk file with the real signature. What am I
> > missing?
> 
> I was striving not to waste to much space for the signature. That's why I
> calculated a signature with the user's certificate and a hash over an empty
> string first to see how much space it will take. I had the problem that the
> byte range object itself changed it's size in the byte stream when some
> number
> in it gets an additional digit. Therefore I recalculated the byte range and
> after it the signature once again. May bee this can be avoided if some extra
> bytes are added to the space reserved for the signature.

A byte offset can not exceed 10 digits. Typically at least 6 digits will be used. I don't think it is worth trying to save a few bytes.

Comment 18 Hans-Ulrich Jüttner 2017-08-28 13:45:31 UTC

(In reply to Adrian Johnson from comment #17)
> (In reply to Hans-Ulrich Jüttner from comment #16)
> > > FormWidgetSignature::prepareSignature is writing the entire PDF to memory.
> > > It would be better to write it to a temp file then search it on disk. I
> > > don't like the way you search the PDF file for matching strings. It could
> > > end up matching some random data in another stream. It would be better to
> > > get the offset of the dict object you want to search and start the search
> > > for there for up to some reasonable maximum.
> > > 
> > > eg
> > > 
> > > XRefEntry *entry = doc->getXRef()->getEntry(ref);
> > > Goffset objOffset = entry->offset;
> > > 
> > > Also, don't search for %%EOF to get the file size. Use
> > > doc->getBaseStream()->getLength() or Gfseek(f, 0, SEEK_END), Gftell().
> > > 
> > 
> > Ok, but is it guaranteed that this offset and the length of
> > doc->getBaseStream()
> > are always updated when some objects in the PDF have been changed?
> 
> After saving the PDF, open the saved PDF in a new PDFDoc object.
> 

How is this supposed to be done? The user of the library calls the method
sign() from the Qt5 interface or signDocument() from poppler/Form.h via some
other interface. Are you saying that the doc member of type PDFDoc* within the
FormFieldSignature object should be replaced by a new one? I think that can't
work because there are other references to the PDFDoc object in the library
outside the FormFieldSignature object.

> > > I don't fully understand what signDocument() and prepareSignature() are
> > > doing. Which means some comments would be helpful. It appears you are saving
> > > the PDF, computing the signature, updating the signature object, then saving
> > > the PDF again. This assumes that the second save will produce an identical
> > > file except for the updated signature. I'm not sure if we can assume this
> > > will always be true.
> > > 
> > 
> > As the user later wants to save the signed document for example with the
> > PDFConverter from the Qt5 interface, saving the document again has to produce
> > an identical byte stream. Otherwise the signature would no longer be valid.
> 
> I don't really understand how this works. Do you mean save a copy just after
> we signed the document? Or save any PDF that already has a signature? If we
> want to save an identical copy with the signature preserved we should just
> copy the file. It will be a lot faster. It may work now but what happens if
> we decide to always update the ModDate each time the PDF is saved?
> 

The user of the library opens the PDF document, than perhaps makes some changes
and finally calls the method FormFieldSignature::sign() from the Qt5 interface.
Afterwards he saves the changed document with the PDFConverter class of the
Qt5 interface. If he makes other changes to the document after calling sign()
but before saving, the signature will of course be no longer valid.

> > > My naive understanding is that we could just create a signature dict with a
> > > dummy signature and offsets (large enough for any size, XRef allows up to 10
> > > digits), save the PDF, compute the signature, and overwrite the dummy
> > > signature and offsets in the disk file with the real signature. What am I
> > > missing?
> > 
> > I was striving not to waste to much space for the signature. That's why I
> > calculated a signature with the user's certificate and a hash over an empty
> > string first to see how much space it will take. I had the problem that the
> > byte range object itself changed it's size in the byte stream when some
> > number
> > in it gets an additional digit. Therefore I recalculated the byte range and
> > after it the signature once again. May bee this can be avoided if some extra
> > bytes are added to the space reserved for the signature.
> 
> A byte offset can not exceed 10 digits. Typically at least 6 digits will be
> used. I don't think it is worth trying to save a few bytes.

The problem is not saving a few bytes but specifying the final byte offsets.
If a byte offset changes the number of digits may vary at least between 6 and
10. If for example the total length of the document increases, the byte range
object might have to increase in size too because an additional digit is used
for that length. Thus in turn the start offset of the signature is shifted by
one byte because the byte range object is located in front of the signature.

Comment 19 Adrian Johnson 2017-08-28 14:01:05 UTC

(In reply to Hans-Ulrich Jüttner from comment #18)
> > After saving the PDF, open the saved PDF in a new PDFDoc object.
> > 
> 
> How is this supposed to be done? The user of the library calls the method
> sign() from the Qt5 interface or signDocument() from poppler/Form.h via some
> other interface. Are you saying that the doc member of type PDFDoc* within
> the
> FormFieldSignature object should be replaced by a new one? I think that can't
> work because there are other references to the PDFDoc object in the library
> outside the FormFieldSignature object.

Create a temporary PDFDoc* variable for the purpose of opening the PDF and getting the size and offset of the sig object. You can have more than one PDFDoc object at the same time.

> The user of the library opens the PDF document, than perhaps makes some
> changes
> and finally calls the method FormFieldSignature::sign() from the Qt5
> interface.
> Afterwards he saves the changed document with the PDFConverter class of the
> Qt5 interface. If he makes other changes to the document after calling sign()
> but before saving, the signature will of course be no longer valid.

Well yes changing the document after adding a signature will break the signature. I'm not seeing the problem here. Maybe warn the user if he tries to change a signed document.

> The problem is not saving a few bytes but specifying the final byte offsets.
> If a byte offset changes the number of digits may vary at least between 6 and
> 10. If for example the total length of the document increases, the byte range
> object might have to increase in size too because an additional digit is used
> for that length. Thus in turn the start offset of the signature is shifted by
> one byte because the byte range object is located in front of the signature.

When saving, set each offset to 9999999999, then search for and overwrite the 9's with the correct offsets.

Comment 20 Hans-Ulrich Jüttner 2017-08-28 14:20:38 UTC

(In reply to Adrian Johnson from comment #19)
> (In reply to Hans-Ulrich Jüttner from comment #18)
> > > After saving the PDF, open the saved PDF in a new PDFDoc object.
> > > 
> > 
> > How is this supposed to be done? The user of the library calls the method
> > sign() from the Qt5 interface or signDocument() from poppler/Form.h via some
> > other interface. Are you saying that the doc member of type PDFDoc* within
> > the
> > FormFieldSignature object should be replaced by a new one? I think that can't
> > work because there are other references to the PDFDoc object in the library
> > outside the FormFieldSignature object.
> 
> Create a temporary PDFDoc* variable for the purpose of opening the PDF and
> getting the size and offset of the sig object. You can have more than one
> PDFDoc object at the same time.
> 

Sorry, I do not understand what a temporary PDFDoc object should help.

> > The user of the library opens the PDF document, than perhaps makes some
> > changes
> > and finally calls the method FormFieldSignature::sign() from the Qt5
> > interface.
> > Afterwards he saves the changed document with the PDFConverter class of the
> > Qt5 interface. If he makes other changes to the document after calling sign()
> > but before saving, the signature will of course be no longer valid.
> 
> Well yes changing the document after adding a signature will break the
> signature. I'm not seeing the problem here. Maybe warn the user if he tries
> to change a signed document.
> 
> > The problem is not saving a few bytes but specifying the final byte offsets.
> > If a byte offset changes the number of digits may vary at least between 6 and
> > 10. If for example the total length of the document increases, the byte range
> > object might have to increase in size too because an additional digit is used
> > for that length. Thus in turn the start offset of the signature is shifted by
> > one byte because the byte range object is located in front of the signature.
> 
> When saving, set each offset to 9999999999, then search for and overwrite
> the 9's with the correct offsets.

That wouldn't help. The byte range object would shrink when these values later
are replaced by the correct smaller values. If you mean these values should be
overwritten on disk that would also be no solution as this is only a temporary
file and the file written later by the user would differ from it by some
missing blanks that have overwritten your 9's.

Comment 21 Hans-Ulrich Jüttner 2017-08-29 07:28:37 UTC

(In reply to Adrian Johnson from comment #15)
> +  char buf[24];
> +  time_t now = time(nullptr);
> +  size_t size = strftime(buf, 24, "D:%Y%m%d%H%M%S%z", localtime(&now));
> +  if (size >= 2 && size < 22)
> +  {
> +    // put timezone info into single quotes
> +    buf[size] = buf[size-1];
> +    buf[size-1] = buf[size-2];
> +    buf[size-2] = '\'';
> +    buf[++size] = '\'';
> +    buf[++size] = '\0';
> +    GooString gTime(buf, size);
> +    vObj.dictAdd(copyString("M"), Object(gTime.copy()));
> +  }
> 
> PDF date format uses a single quote to separate the time zone hour and
> minute, not enclose the minutes. strftime "%z" on windows returns the
> timezone name, not the offset.
> 
http://www.verypdf.com/pdfinfoeditor/pdf-date-format.htm says that the minutes
of the time zone in PDF date format are also followed by a single quote.

Comment 22 Adrian Johnson 2017-08-29 10:01:37 UTC

The PDF standard "PDF 32000-1:2008" is at http://www.adobe.com/content/dam/Adobe/en/devnet/acrobat/pdfs/PDF32000_2008.pdf

From section 7.9.4:

"
Date values used in a PDF shall conform to a standard date format, which closely follows that of the international standard ASN.1 (Abstract Syntax Notation One), defined in ISO/IEC 8824. A date shall be a text string of the form

(D:YYYYMMDDHHmmSSOHH'mm)
"

Comment 23 Albert Astals Cid 2017-09-06 19:35:31 UTC

Also, i guess you're planning to use this in some app, do you have the patches ready for it? Would make for much easier reviewing/testing :)

Comment 24 Adrian Johnson 2017-09-06 23:19:19 UTC

Created attachment 134029 [details] [review]
pdfsig: add -nssdi option

I had trouble getting the patch to work until I found nss_init() did not like my database format in my firefox profile.

I've created a patch for pdfsig that allows specifying the database location so I could create db in the new format.

All the calls to NSS functions need checking for errors and printing a NSS error message is this can fail in many different ways. I also found I needed to create the db with --empty-password otherwise pdfsig fails to sign with no error message.

Comment 25 Adrian Johnson 2017-09-06 23:23:26 UTC

Created attachment 134030 [details] [review]
write document then update byte offsets and sig on disk

This updates to signing to work the way I requested. ie
- write the PDF to disk with space reserved in the ByteOffsets
- read the PDF back into a new PDFDoc
- get the offset of the signature object from the XRef
- update the byte offsets on disk
- compute the hash from the disk data and update the signature on disk

Comment 26 Adrian Johnson 2017-09-06 23:56:19 UTC

(In reply to Adrian Johnson from comment #25)
> Created attachment 134030 [details] [review] [review]
> write document then update byte offsets and sig on disk

I should add this patch is only a prototype. It needs error checking added.

Comment 27 Hans-Ulrich Jüttner 2017-09-07 07:50:31 UTC

(In reply to Albert Astals Cid from comment #23)
> Also, i guess you're planning to use this in some app, do you have the
> patches ready for it? Would make for much easier reviewing/testing :)

I'm planning to use this in the large and non-public system of my company.
But I also wrote a small test utility which could be made public as an
attachement to this bug.

Comment 28 Hans-Ulrich Jüttner 2017-09-07 08:02:12 UTC

Created attachment 134038 [details]
Source code of a small test utility

The test utility currently works without the patches from Adrian Johnson.

Comment 29 Adrian Johnson 2017-09-07 09:43:24 UTC

Created attachment 134040 [details]
Script to generate certificates

To test signing I used the attached script to generate a self signed certificate. I used this PDF file which contains an unsigned signature form:

https://help.queens.edu/hc/en-us/article_attachments/211567983/Signature_Test_Form.pdf

then signed it with

pdfsig -nssdir sql:$HOME/.nssdb -sign 1 -nick johndoe Signature\ Test\ Form.pdf out.pdf

and checked it with

pdfsig -nssdir sql:$HOME/.nssdb out.pdf

Comment 30 Adrian Johnson 2017-09-07 13:32:00 UTC

Created attachment 134043 [details] [review]
Return timezone in timeToDateString

Here's the fix to make timeToDateString include the time zone in a portable way.

Comment 31 Albert Astals Cid 2017-09-07 21:10:03 UTC

(In reply to Adrian Johnson from comment #24)
> Created attachment 134029 [details] [review] [review]
> pdfsig: add -nssdi option
> 
> I had trouble getting the patch to work until I found nss_init() did not
> like my database format in my firefox profile.
> 
> I've created a patch for pdfsig that allows specifying the database location
> so I could create db in the new format.
> 
> All the calls to NSS functions need checking for errors and printing a NSS
> error message is this can fail in many different ways. I also found I needed
> to create the db with --empty-password otherwise pdfsig fails to sign with
> no error message.

I'd say you can commit this one, i'd reword the man page a bit making extra clear that nssdir is optional

Comment 32 Albert Astals Cid 2017-09-07 21:11:01 UTC

(In reply to Adrian Johnson from comment #25)
> Created attachment 134030 [details] [review] [review]
> write document then update byte offsets and sig on disk
> 
> This updates to signing to work the way I requested. ie
> - write the PDF to disk with space reserved in the ByteOffsets
> - read the PDF back into a new PDFDoc
> - get the offset of the signature object from the XRef
> - update the byte offsets on disk
> - compute the hash from the disk data and update the signature on disk

This is a patch on top of his patch, right?

Comment 33 Albert Astals Cid 2017-09-07 21:12:52 UTC

(In reply to Adrian Johnson from comment #30)
> Created attachment 134043 [details] [review] [review]
> Return timezone in timeToDateString
> 
> Here's the fix to make timeToDateString include the time zone in a portable
> way.

Adrian, does this mean you're relatively happy with the patch?

Comment 34 Adrian Johnson 2017-09-08 11:44:38 UTC

Created attachment 134080 [details] [review]
pdfsig: add -nssdi option v2

Update to improve documentation.

Comment 35 Adrian Johnson 2017-09-08 11:47:07 UTC

Created attachment 134081 [details] [review]
write document then update byte offsets and sig on disk v2

Updated:
 - only read in size of sig object from disk instead of size to end of file
 - handle all possible errors.
 - add comments

Comment 36 Adrian Johnson 2017-09-08 11:54:04 UTC

> This is a patch on top of his patch, right?

Patch order is:

1) Added signing of PDF documents via Qt5 interface and with pdfsig (https://bugs.freedesktop.org/attachment.cgi?id=133576) 

2) pdfsig: add -nssdi option v2 (https://bugs.freedesktop.org/attachment.cgi?id=134080)

3) write document then update byte offsets and sig on disk v2 (https://bugs.freedesktop.org/attachment.cgi?id=134081)

4) Return timezone in timeToDateString (https://bugs.freedesktop.org/attachment.cgi?id=134043)

> I'd say you can commit this one, i'd reword the man page a bit making extra
> clear that nssdir is optional

I've updated this patch but didn't push it to avoid rebasing the other patches.

Comment 37 Adrian Johnson 2017-09-08 12:02:28 UTC

(In reply to Albert Astals Cid from comment #33)
> (In reply to Adrian Johnson from comment #30)
> > Created attachment 134043 [details] [review] [review] [review]
> > Return timezone in timeToDateString
> > 
> > Here's the fix to make timeToDateString include the time zone in a portable
> > way.
> 
> Adrian, does this mean you're relatively happy with the patch?

I'm not sure which patch you are referring to. Current patch status is:

1) Added signing of PDF documents via Qt5 interface and with pdfsig 

I have a number of major issues with this patch. I have fixed these issues in (3).

2) pdfsig: add -nssdi option v2

Updated documentation based on your feedback.

3) write document then update byte offsets and sig on disk v2
 
This fixes all the issues I had with (1).

4) Return timezone in timeToDateString

In comment 15 I requested that the patch use the timeToDateString(). I noted that this function currently does not include the timezone. This patch adds the time zone. Patch (3) has already changed the signing code to use timeToDateString().

Comment 38 Adrian Johnson 2017-09-08 12:09:51 UTC

One thing I'd like to see is every call to an NSS function check for errors and print an error message. As noted in comment 24, it is very easy for the sig code to silently fail due to something not quite right.

Many of the NSS function calls are existing code for the signature verifying so I'll leave it to Albert to decide if it needs to be fixed as part of this bug or if it can be done later.

Comment 39 Albert Astals Cid 2017-09-12 21:48:46 UTC

(In reply to Adrian Johnson from comment #38)
> One thing I'd like to see is every call to an NSS function check for errors
> and print an error message. As noted in comment 24, it is very easy for the
> sig code to silently fail due to something not quite right.
> 
> Many of the NSS function calls are existing code for the signature verifying
> so I'll leave it to Albert to decide if it needs to be fixed as part of this
> bug or if it can be done later.

I agree with you, but the code was already "weak" before this patch, i see no need to block this patch on it.

Comment 40 Albert Astals Cid 2017-09-12 21:49:51 UTC

(In reply to Adrian Johnson from comment #37)
> (In reply to Albert Astals Cid from comment #33)
> > (In reply to Adrian Johnson from comment #30)
> > > Created attachment 134043 [details] [review] [review] [review] [review]
> > > Return timezone in timeToDateString
> > > 
> > > Here's the fix to make timeToDateString include the time zone in a portable
> > > way.
> > 
> > Adrian, does this mean you're relatively happy with the patch?
> 
> I'm not sure which patch you are referring to.

All of them :D

> Current patch status is:
> 
> 1) Added signing of PDF documents via Qt5 interface and with pdfsig 
> 
> I have a number of major issues with this patch. I have fixed these issues
> in (3).

So you're "happy with conditions" :D

> 
> 2) pdfsig: add -nssdi option v2
> 
> Updated documentation based on your feedback.
> 
> 3) write document then update byte offsets and sig on disk v2
>  
> This fixes all the issues I had with (1).
> 
> 4) Return timezone in timeToDateString
> 
> In comment 15 I requested that the patch use the timeToDateString(). I noted
> that this function currently does not include the timezone. This patch adds
> the time zone. Patch (3) has already changed the signing code to use
> timeToDateString().

This looks good to me.  Hans-Ulrich do patches 2) 3) and 4) also look good for you?

Comment 41 Hans-Ulrich Jüttner 2017-09-13 13:42:14 UTC

(In reply to Albert Astals Cid from comment #40)
> (In reply to Adrian Johnson from comment #37)
> > (In reply to Albert Astals Cid from comment #33)
> > > (In reply to Adrian Johnson from comment #30)
> > > > Created attachment 134043 [details] [review] [review] [review] [review] [review]
> > > > Return timezone in timeToDateString
> > > > 
> > > > Here's the fix to make timeToDateString include the time zone in a portable
> > > > way.
> > > 
> > > Adrian, does this mean you're relatively happy with the patch?
> > 
> > I'm not sure which patch you are referring to.
> 
> All of them :D
> 
> > Current patch status is:
> > 
> > 1) Added signing of PDF documents via Qt5 interface and with pdfsig 
> > 
> > I have a number of major issues with this patch. I have fixed these issues
> > in (3).
> 
> So you're "happy with conditions" :D
> 
> > 
> > 2) pdfsig: add -nssdi option v2
> > 
> > Updated documentation based on your feedback.
> > 
> > 3) write document then update byte offsets and sig on disk v2
> >  
> > This fixes all the issues I had with (1).
> > 
> > 4) Return timezone in timeToDateString
> > 
> > In comment 15 I requested that the patch use the timeToDateString(). I noted
> > that this function currently does not include the timezone. This patch adds
> > the time zone. Patch (3) has already changed the signing code to use
> > timeToDateString().
> 
> This looks good to me.  Hans-Ulrich do patches 2) 3) and 4) also look good
> for you?

Patch (2), attachment #134080 [details] [review], looks good to me except for the help message
for option -nssdir:
+  {"-nssdir", argGooString, &nssDir,     0,
+   "don't perform certificate validation"},
should be
+  {"-nssdir", argGooString, &nssDir,     0,
+   "database directory containing the certificate and key database files"},

Patch (4), attachment #134043 [details] [review], is fine.

I have a little conceptual problem with patch (3), attachment #134081 [details] [review].
Calling method sign() from qt5 interface now writes directly to disk with
the file name as new first parameter of that method. But this leaves the
document in memory with an invalid signature and invalid ByteRange parameters.
Poppler::PDFConverter::convert() called afterwards would write this invalid
document to disk and Poppler::FormFieldSignature::validate() called after
signing would tell us that the signature is invalid.

This behaviour can be argued as signing should always be the last thing to do
before writing the signed document to disk. But I think that should be clearly
documented in the header file qt5/src/poppler-form.h saying that the document
has to be reread from disk before doing anything with it after signing.

Moreover, the new parameter saveFilename of method sign() should be added to
the documentation of that method with an @param line just as it was done for
the other parameters.

Comment 42 Adrian Johnson 2017-09-13 14:05:10 UTC

(In reply to Hans-Ulrich Jüttner from comment #41)
> I have a little conceptual problem with patch (3), attachment #134081 [details] [review]
> [details] [review].
> Calling method sign() from qt5 interface now writes directly to disk with
> the file name as new first parameter of that method. But this leaves the
> document in memory with an invalid signature and invalid ByteRange
> parameters.
> Poppler::PDFConverter::convert() called afterwards would write this invalid
> document to disk and Poppler::FormFieldSignature::validate() called after
> signing would tell us that the signature is invalid.

I'm not familiar with the qt5 interface. There a a couple of options:
- reread the document after signing so the in memory copy is consistent with the on disk copy
- document the signing as a "save a copy" operation. ie the saved copy will be different to the in memory copy. And fix the code so the in memory copy is not changed.

> This behaviour can be argued as signing should always be the last thing to do
> before writing the signed document to disk. But I think that should be
> clearly
> documented in the header file qt5/src/poppler-form.h saying that the document
> has to be reread from disk before doing anything with it after signing.

And user interfaces should display a warning of modification of a signed document is attempted to warning the signature will be invalidated.

> Moreover, the new parameter saveFilename of method sign() should be added to
> the documentation of that method with an @param line just as it was done for
> the other parameters.

Comment 43 Albert Astals Cid 2017-09-14 22:08:19 UTC

(In reply to Adrian Johnson from comment #42)
> I'm not familiar with the qt5 interface. There a a couple of options:
> - reread the document after signing so the in memory copy is consistent with
> the on disk copy

This is what needs to happen, be it reread or ideally a less extensive operation, but after sign is called, things need to keep returning valid information.

Hans what do you say would be invalid and why?

> - document the signing as a "save a copy" operation. ie the saved copy will
> be different to the in memory copy. And fix the code so the in memory copy
> is not changed.

This doesn't make sense, remember you're not "signing the document", but signining an anotation of the document, so this is not even a function on the document level, but at the annotation level.

Comment 44 Adrian Johnson 2017-09-14 22:32:43 UTC

(In reply to Albert Astals Cid from comment #43)
> This is what needs to happen, be it reread or ideally a less extensive
> operation, but after sign is called, things need to keep returning valid
> information.

It is probably sufficient to just update the signature and byte offsets in the in memory object.

Comment 45 Hans-Ulrich Jüttner 2017-09-15 07:18:11 UTC

(In reply to Adrian Johnson from comment #44)
> (In reply to Albert Astals Cid from comment #43)
> > This is what needs to happen, be it reread or ideally a less extensive
> > operation, but after sign is called, things need to keep returning valid
> > information.
> 
> It is probably sufficient to just update the signature and byte offsets in
> the in memory object.

(In reply to Albert Astals Cid from comment #43)
> (In reply to Adrian Johnson from comment #42)
> > I'm not familiar with the qt5 interface. There a a couple of options:
> > - reread the document after signing so the in memory copy is consistent with
> > the on disk copy
> 
> This is what needs to happen, be it reread or ideally a less extensive
> operation, but after sign is called, things need to keep returning valid
> information.
> 
> Hans what do you say would be invalid and why?

The in memory ByteRange object contains numbers 9999999999 and generates a
string "/ByteRange [0 9999999999 9999999999 9999999999 ]" if written to disk.
The in memory Contents object contains a string with a signature not calculated
over the byte ranges which should be signed.

> 
> > - document the signing as a "save a copy" operation. ie the saved copy will
> > be different to the in memory copy. And fix the code so the in memory copy
> > is not changed.
> 
> This doesn't make sense, remember you're not "signing the document", but
> signining an anotation of the document, so this is not even a function on
> the document level, but at the annotation level.

Comment 46 Hans-Ulrich Jüttner 2017-09-15 07:30:12 UTC

(In reply to Adrian Johnson from comment #44)
> (In reply to Albert Astals Cid from comment #43)
> > This is what needs to happen, be it reread or ideally a less extensive
> > operation, but after sign is called, things need to keep returning valid
> > information.
> 
> It is probably sufficient to just update the signature and byte offsets in
> the in memory object.

With the Contents object I see no problem replacing it with the correct signature
value. However, the ByteRange object on disk is a string with multiple spaces,
e.g. "/ByteRange [0 103562     108976     311        ]". These multiple spaces
can't be represented in the ByteRange object in memory as it is an array of
integers. But if these multiple spaces are removed the signature will be
invalidated since the hash is calculated over a string including these spaces.
Before the patch (3) of Adrian this problem was avoided by not producing such
multiple spaces.

Comment 47 Adrian Johnson 2017-09-15 10:22:54 UTC

(In reply to Hans-Ulrich Jüttner from comment #46)
> With the Contents object I see no problem replacing it with the correct
> signature
> value. However, the ByteRange object on disk is a string with multiple
> spaces,
> e.g. "/ByteRange [0 103562     108976     311        ]". These multiple
> spaces
> can't be represented in the ByteRange object in memory as it is an array of
> integers. But if these multiple spaces are removed the signature will be
> invalidated since the hash is calculated over a string including these
> spaces.

This doesn't make sense. The signature has to be computed on the disk file.

> Before the patch (3) of Adrian this problem was avoided by not producing such
> multiple spaces.

Before patch (3) the entire PDF file was written to memory which is a non-starter. It also assumed that the document can be saved twice and get an identical file. It may work now but I don't think this assumption is safe given that if only one bit changes the signature breaks.

There is an Adobe document that explains the signing process on page 5.
https://www.adobe.com/devnet-docs/acrobatetk/tools/DigSig/Acrobat_DigitalSignatures_in_PDF.pdf

It is how patch (3) works except for the last line "The PDF file is re-loaded in Acrobat to ensure that the in-memory and on-disk versions are identical.".

Comment 48 Hans-Ulrich Jüttner 2017-09-15 12:32:28 UTC

(In reply to Adrian Johnson from comment #47)
> (In reply to Hans-Ulrich Jüttner from comment #46)
> > With the Contents object I see no problem replacing it with the correct
> > signature
> > value. However, the ByteRange object on disk is a string with multiple
> > spaces,
> > e.g. "/ByteRange [0 103562     108976     311        ]". These multiple
> > spaces
> > can't be represented in the ByteRange object in memory as it is an array of
> > integers. But if these multiple spaces are removed the signature will be
> > invalidated since the hash is calculated over a string including these
> > spaces.
> 
> This doesn't make sense. The signature has to be computed on the disk file.
> 
> > Before the patch (3) of Adrian this problem was avoided by not producing such
> > multiple spaces.
> 
> Before patch (3) the entire PDF file was written to memory which is a
> non-starter. It also assumed that the document can be saved twice and get an
> identical file. It may work now but I don't think this assumption is safe
> given that if only one bit changes the signature breaks.
> 
> There is an Adobe document that explains the signing process on page 5.
> https://www.adobe.com/devnet-docs/acrobatetk/tools/DigSig/
> Acrobat_DigitalSignatures_in_PDF.pdf
> 
> It is how patch (3) works except for the last line "The PDF file is
> re-loaded in Acrobat to ensure that the in-memory and on-disk versions are
> identical.".

I think that re-reading a document which just has been written with poppler
and writing it again whithout changes should produce an identical document.
But with the multiple spaces in the ByteRange on disk this would not be the
case. Moreover, multiple spaces separating objects in PFD files are not
allowed by more restrictive standards like PDF/A.

Comment 49 Adrian Johnson 2017-09-15 13:05:52 UTC

(In reply to Hans-Ulrich Jüttner from comment #48)
> I think that re-reading a document which just has been written with poppler
> and writing it again whithout changes should produce an identical document.

Should does not mean it always will. It introduces a constraint on poppler that prevents any future changes that alter that assumption.

I still have not seen a good explanation for why the you think the signing should work this way.

> But with the multiple spaces in the ByteRange on disk this would not be the
> case. Moreover, multiple spaces separating objects in PFD files are not
> allowed by more restrictive standards like PDF/A.

Then we can use '0's. But in any case we don't support PDF/A.

Comment 50 Hans-Ulrich Jüttner 2017-09-15 14:53:55 UTC

(In reply to Adrian Johnson from comment #49)
> (In reply to Hans-Ulrich Jüttner from comment #48)
> > I think that re-reading a document which just has been written with poppler
> > and writing it again whithout changes should produce an identical document.
> 
> Should does not mean it always will. It introduces a constraint on poppler
> that prevents any future changes that alter that assumption.
> 
> I still have not seen a good explanation for why the you think the signing
> should work this way.

We simply have different ideas about the in memory representation of a PDF file.
In my opinion the bytes on disk should be reproducible from the in memory
representation whenever possible. Especially when dealing with signatures
this is desirably. If some time in the future one wants to support multiple
signatures on PDF documents with the second signature after the ByteRange
of the first one, then the first signature should not be invalidated by
writing the document with the second signature.

In your opinion the bytes on disk need not be reproducible from the in memory
representation.

> 
> > But with the multiple spaces in the ByteRange on disk this would not be the
> > case. Moreover, multiple spaces separating objects in PFD files are not
> > allowed by more restrictive standards like PDF/A.
> 
> Then we can use '0's. But in any case we don't support PDF/A.

Leading '0's on the numbers in ByteRange would be even worse. These numbers
then probably would be mistakenly read as 0 by at least some PDF software.

Comment 51 Albert Astals Cid 2017-09-16 17:17:04 UTC

(In reply to Adrian Johnson from comment #47)
> It is how patch (3) works except for the last line "The PDF file is
> re-loaded in Acrobat to ensure that the in-memory and on-disk versions are
> identical.".

This could be left to the application using poppler if they want to do it i guess

Comment 52 Albert Astals Cid 2017-09-16 17:17:41 UTC

(In reply to Adrian Johnson from comment #47)
> (In reply to Hans-Ulrich Jüttner from comment #46)
> > With the Contents object I see no problem replacing it with the correct
> > signature
> > value. However, the ByteRange object on disk is a string with multiple
> > spaces,
> > e.g. "/ByteRange [0 103562     108976     311        ]". These multiple
> > spaces
> > can't be represented in the ByteRange object in memory as it is an array of
> > integers. But if these multiple spaces are removed the signature will be
> > invalidated since the hash is calculated over a string including these
> > spaces.
> 
> This doesn't make sense. The signature has to be computed on the disk file.
> 
> > Before the patch (3) of Adrian this problem was avoided by not producing such
> > multiple spaces.
> 
> Before patch (3) the entire PDF file was written to memory which is a
> non-starter. 

Agreed, reading the entire PDF file to memory is not a good idea.

Comment 53 Albert Astals Cid 2017-09-16 17:19:39 UTC

(In reply to Hans-Ulrich Jüttner from comment #50)
> (In reply to Adrian Johnson from comment #49)
> > (In reply to Hans-Ulrich Jüttner from comment #48)
> > > I think that re-reading a document which just has been written with poppler
> > > and writing it again whithout changes should produce an identical document.
> > 
> > Should does not mean it always will. It introduces a constraint on poppler
> > that prevents any future changes that alter that assumption.
> > 
> > I still have not seen a good explanation for why the you think the signing
> > should work this way.
> 
> We simply have different ideas about the in memory representation of a PDF
> file.
> In my opinion the bytes on disk should be reproducible from the in memory
> representation whenever possible. Especially when dealing with signatures
> this is desirably. If some time in the future one wants to support multiple
> signatures on PDF documents with the second signature after the ByteRange
> of the first one, then the first signature should not be invalidated by
> writing the document with the second signature.
> 
> In your opinion the bytes on disk need not be reproducible from the in memory
> representation.

Let's not delve into philosophical questions and let's try to finish this patch, we're already at comment #53

> 
> > 
> > > But with the multiple spaces in the ByteRange on disk this would not be the
> > > case. Moreover, multiple spaces separating objects in PFD files are not
> > > allowed by more restrictive standards like PDF/A.
> > 
> > Then we can use '0's. But in any case we don't support PDF/A.
> 
> Leading '0's on the numbers in ByteRange would be even worse. These numbers
> then probably would be mistakenly read as 0 by at least some PDF software.

That'd be stupid, why would anyone parse 000004 as anything different than 4? I'm almost even sure the specification mentions leading 0 have to be ignored.

Comment 54 Hans-Ulrich Jüttner 2017-09-18 07:30:04 UTC

(In reply to Albert Astals Cid from comment #53)
> (In reply to Hans-Ulrich Jüttner from comment #50)
> > (In reply to Adrian Johnson from comment #49)
> > > (In reply to Hans-Ulrich Jüttner from comment #48)
> > > > I think that re-reading a document which just has been written with poppler
> > > > and writing it again whithout changes should produce an identical document.
> > > 
> > > Should does not mean it always will. It introduces a constraint on poppler
> > > that prevents any future changes that alter that assumption.
> > > 
> > > I still have not seen a good explanation for why the you think the signing
> > > should work this way.
> > 
> > We simply have different ideas about the in memory representation of a PDF
> > file.
> > In my opinion the bytes on disk should be reproducible from the in memory
> > representation whenever possible. Especially when dealing with signatures
> > this is desirably. If some time in the future one wants to support multiple
> > signatures on PDF documents with the second signature after the ByteRange
> > of the first one, then the first signature should not be invalidated by
> > writing the document with the second signature.
> > 
> > In your opinion the bytes on disk need not be reproducible from the in memory
> > representation.
> 
> Let's not delve into philosophical questions and let's try to finish this
> patch, we're already at comment #53
> 
> > 
> > > 
> > > > But with the multiple spaces in the ByteRange on disk this would not be the
> > > > case. Moreover, multiple spaces separating objects in PFD files are not
> > > > allowed by more restrictive standards like PDF/A.
> > > 
> > > Then we can use '0's. But in any case we don't support PDF/A.
> > 
> > Leading '0's on the numbers in ByteRange would be even worse. These numbers
> > then probably would be mistakenly read as 0 by at least some PDF software.
> 
> That'd be stupid, why would anyone parse 000004 as anything different than
> 4? I'm almost even sure the specification mentions leading 0 have to be
> ignored.

You are the core developers. It's up to you to decide.

Comment 55 Theofilos Intzoglou 2017-11-26 15:06:29 UTC

Comment on attachment 134080 [details] [review]
pdfsig: add -nssdi option v2

Review of attachment 134080 [details] [review]:
-----------------------------------------------------------------

Omiting the nssdir option doesn't result in nssDir.getCString() being equal to nullptr so the following changes are required.

::: utils/pdfsig.cc
@@ +197,4 @@
>        fws->setSignatureType(ETSI_CAdES_detached);
>      const char* pw = (strlen(password) == 0) ? nullptr : password;
>      const char* rs = (strlen(reason) == 0) ? nullptr : reason;
> +    if (fws->signDocument(nssDir.getCString(), certNickname, digestName, pw, rs)) {

nssDir.getCString() should be :
nssDir.getLength() ? nssDir.getCString() : nullptr

@@ +217,4 @@
>    }
>  
>    for (unsigned int i = 0; i < sigCount; i++) {
> +    sig_info = sig_widgets.at(i)->validateSignature(nssDir.getCString(), !dontVerifyCert, false, -1 /* now */);

nssDir.getCString() should be :
nssDir.getLength() ? nssDir.getCString() : nullptr

Comment 56 Hans-Ulrich Jüttner 2017-11-28 09:31:03 UTC

Created attachment 135747 [details] [review]
Patch for a bug with missing -nssdir option in utils/pdfsig.cc

This patch to be applied on top of the other patches would fix the bug
pointed out by  Theofilos Intzoglou.

Comment 57 Albert Astals Cid 2017-12-02 11:56:37 UTC

I lost track of this, what are the missing stuff for this to get merged?

Comment 58 Hans-Ulrich Jüttner 2017-12-04 08:43:46 UTC

(In reply to Albert Astals Cid from comment #57)
> I lost track of this, what are the missing stuff for this to get merged?

In my opinion there is nothing missing. You already favoured the version with
on disk signing. So patches #133576, #134080, #134081, #134043 and now
#135747 could be merge in that order.

Comment 59 Fahad Al-Saidi 2018-06-24 15:55:54 UTC

any update on this?

Comment 60 Hans-Ulrich Jüttner 2018-06-25 07:43:58 UTC

(In reply to Fahad Al-Saidi from comment #59)
> any update on this?

Unfortunately this seems to have been forgotten.

Comment 61 Albert Astals Cid 2018-06-27 21:26:22 UTC

It's not forgotten, it's sadly part of a very long TODO list that I don't seem to be able to make shorter.

Comment 62 GitLab Migration User 2018-08-21 11:00:32 UTC

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/465.

Use of freedesktop.org services, including Bugzilla, is subject to our Code of Conduct. How we collect and use information is described in our Privacy Policy.