Bug 84722 - Attribute::getName() returns wrong value for /N when UTF-16BE is used
Status: RESOLVED FIXED
Alias: None
Product: poppler
Classification: Unclassified
Component: general
Version: unspecified
Hardware: All All
Importance: medium normal
Assignee: poppler-bugs
 
Reported: 2014-10-06 17:26 UTC by luigi.scarso
Modified: 2014-10-07 20:46 UTC (History)

Attachments
A pdf where the value of /N is encoded as utf-16be (7.47 KB, application/pdf)
2014-10-07 17:40 UTC, luigi.scarso

Description luigi.scarso 2014-10-06 17:26:29 UTC
In the following pdf object 
/N has value 'elementname' encoded as UTF-16BE

19 0 obj
<< /K 18 0 R /S /div /Type /StructElem /A 
<< /P [ << 
/N <feff0065006c0065006d0065006e0074006e0061006d0065> 
/V <feff004100410041> >> ] /O /UserProperties >> /Pg 15 0 R /P 17 0 R >>
endobj
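
To make the encoding of the two hex strings above concrete, here is a minimal sketch that decodes a PDF hex string into raw bytes and then into UTF-8. It is not poppler code: the helper names hexValue, hexStringToBytes, utf16beToUtf8 and checkReportedValues are hypothetical, and the decoder assumes characters in the Basic Multilingual Plane only (no surrogate pairs).

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>

// Hypothetical helper: value of one hexadecimal digit, or -1 for any other character.
int hexValue(char c) {
    if (c >= '0' && c <= '9') return c - '0';
    if (c >= 'a' && c <= 'f') return c - 'a' + 10;
    if (c >= 'A' && c <= 'F') return c - 'A' + 10;
    return -1;
}

// Hypothetical helper: decode the contents of a PDF hex string (without the
// angle brackets) into raw bytes. Non-hex characters such as whitespace are
// skipped; a trailing odd digit is padded with 0.
std::string hexStringToBytes(const std::string &hex) {
    std::string bytes;
    int high = -1;
    for (char c : hex) {
        const int v = hexValue(c);
        if (v < 0) continue;
        if (high < 0) {
            high = v;
        } else {
            bytes.push_back(static_cast<char>((high << 4) | v));
            high = -1;
        }
    }
    if (high >= 0) bytes.push_back(static_cast<char>(high << 4));
    return bytes;
}

// Hypothetical helper: convert UTF-16BE bytes (optionally starting with the
// FE FF byte-order mark) to UTF-8. Assumption: BMP code points only.
std::string utf16beToUtf8(const std::string &bytes) {
    std::string out;
    std::size_t i = 0;
    if (bytes.size() >= 2 && static_cast<unsigned char>(bytes[0]) == 0xFE &&
        static_cast<unsigned char>(bytes[1]) == 0xFF) {
        i = 2; // skip the byte-order mark
    }
    for (; i + 1 < bytes.size(); i += 2) {
        const std::uint32_t cp =
            (static_cast<std::uint32_t>(static_cast<unsigned char>(bytes[i])) << 8) |
            static_cast<unsigned char>(bytes[i + 1]);
        if (cp < 0x80) {
            out.push_back(static_cast<char>(cp));
        } else if (cp < 0x800) {
            out.push_back(static_cast<char>(0xC0 | (cp >> 6)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        } else {
            out.push_back(static_cast<char>(0xE0 | (cp >> 12)));
            out.push_back(static_cast<char>(0x80 | ((cp >> 6) & 0x3F)));
            out.push_back(static_cast<char>(0x80 | (cp & 0x3F)));
        }
    }
    return out;
}

// Usage: the /N and /V hex strings from the object above.
void checkReportedValues() {
    assert(utf16beToUtf8(hexStringToBytes(
               "feff0065006c0065006d0065006e0074006e0061006d0065")) == "elementname");
    assert(utf16beToUtf8(hexStringToBytes("feff004100410041")) == "AAA");
}
```

Note that the decoder needs the full byte length of the string, since every ASCII character contributes a 0x00 byte.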

Attribute::getName() returns a const char * pointing at the bytes 0xfe 0xff 0x00..0x65, but the caller doesn't know the size and has to treat the result as NUL-terminated, so it uses only 0xfe 0xff 0x00.
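
The truncation can be reproduced without poppler. The sketch below is only an illustration: std::string stands in for GooString, and utf16beName, lengthSeenThroughCString and checkLengths are hypothetical names. It compares the real byte length of the /N value with the length a caller sees when it only has a const char *.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// The /N value from the report: the UTF-16BE byte-order mark (FE FF) followed
// by "elementname", two bytes per character. The explicit size keeps the
// embedded 0x00 bytes.
std::string utf16beName() {
    static const char bytes[] = {
        '\xfe', '\xff',
        0, 'e', 0, 'l', 0, 'e', 0, 'm', 0, 'e', 0, 'n',
        0, 't', 0, 'n', 0, 'a', 0, 'm', 0, 'e'
    };
    return std::string(bytes, sizeof bytes);
}

// Length as seen by a caller that receives only a const char * and therefore
// has to assume NUL termination.
std::size_t lengthSeenThroughCString(const std::string &s) {
    return std::strlen(s.c_str());
}

// Usage: 24 bytes are stored, but only the 2 bytes before the first 0x00
// are visible through the C-string interface.
void checkLengths() {
    const std::string name = utf16beName();
    assert(name.size() == 24);
    assert(lengthSeenThroughCString(name) == 2);
}
```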

I propose these patches: 
1) another constructor Attribute(const char *nameA, Object *valueA, int lenA)
2) a new method GooString *getUserPropertyName() to keep backwards compatibility (getName() is in the public interface)
3) a patch to parseUserProperty






--- 0.26.4/poppler/StructElement.cc    2014-09-11 18:28:21.000000000 +0200
+++ p0.26.4/poppler/StructElement.cc        2014-10-06 09:13:31.013375479 +0200
@@ -690,6 +690,23 @@
   valueA->copy(&value);
 }
 
+Attribute::Attribute(const char *nameA, Object *valueA, int lenA):
+  type(UserProperty),
+  owner(UserProperties),
+  revision(0),
+  name(nameA,lenA),
+  value(),
+  hidden(gFalse),
+  formatted(NULL)
+{
+  assert(valueA);
+  valueA->copy(&value);
+}
+
+
+
+
+
 Attribute::Attribute(Type type, Object *valueA):
   type(type),
   owner(UserProperties), // TODO: Determine corresponding owner from Type
@@ -785,13 +802,17 @@
   return entry ? entry->type : Unknown;
 }
 
+
 Attribute *Attribute::parseUserProperty(Dict *property)
 {
   Object obj, value;
   const char *name = NULL;
+  int len = 0 ;
 
-  if (property->lookup("N", &obj)->isString())
+  if (property->lookup("N", &obj)->isString()){
     name = obj.getString()->getCString();
+    len = obj.getString()->getLength();
+  }
   else if (obj.isName())
     name = obj.getName();
   else {
@@ -807,7 +828,7 @@
     return NULL;
   }
 
-  Attribute *attribute = new Attribute(name, &value);
+  Attribute *attribute = new Attribute(name, &value,len) ;
   value.free();
   obj.free();



--- 0.26.4/poppler/StructElement.h     2014-09-11 18:28:20.000000000 +0200
+++ p0.26.4/poppler/StructElement.h 2014-10-06 09:13:53.469376410 +0200
@@ -76,6 +76,9 @@
   // Creates an UserProperty attribute, with an arbitrary name and value.
   Attribute(const char *name, Object *value);
 
+  // Creates an UserProperty attribute, with an arbitrary name of length len and value.
+  Attribute(const char *name, Object *value, int len);
+
   GBool isOk() const { return type != Unknown; }
 
   // Name, type and value can be set only on construction.
@@ -87,6 +90,9 @@
   static Object *getDefaultValue(Type type);
 
   const char *getName() const { return type == UserProperty ? name.getCString() : getTypeName(); }
+  GooString *getUserPropertyName() const { return type == UserProperty ? name.copy() : NULL; }
+
+
 
   // The revision is optional, and defaults to zero.
   Guint getRevision() const { return revision; }
Comment 1 Albert Astals Cid 2014-10-06 17:51:08 UTC
There is no compatibility guarantee for internal poppler classes. I think the class should work on GooString.

Can you please work on such a patch?
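
As a rough illustration of the suggestion above (working on a length-aware string rather than a const char * plus a separate length), here is a minimal sketch. It is not poppler's Attribute class: UserPropertySketch, fromStringObject, fromNameObject and checkSketch are hypothetical names, and std::string is used as a stand-in for GooString.

```cpp
#include <cassert>
#include <cstddef>
#include <string>

// Hypothetical stand-in for a user-property attribute; not poppler's class.
class UserPropertySketch {
public:
    // The name arrives as a length-aware string, so embedded NUL bytes survive.
    explicit UserPropertySketch(const std::string &nameA) : name(nameA) {}

    // Returns the full name together with its length.
    const std::string &getName() const { return name; }

private:
    std::string name;
};

// A PDF string object carries an explicit length, which is passed along.
UserPropertySketch fromStringObject(const char *data, std::size_t length) {
    return UserPropertySketch(std::string(data, length));
}

// A PDF name object is handled here as an ordinary NUL-terminated C string,
// so its length is taken from the terminator.
UserPropertySketch fromNameObject(const char *cstr) {
    return UserPropertySketch(std::string(cstr));
}

// Usage: both kinds of /N value keep their full length.
void checkSketch() {
    const char raw[] = {'\xfe', '\xff', 0, 'e'};
    assert(fromStringObject(raw, sizeof raw).getName().size() == 4);
    assert(fromNameObject("Title").getName() == "Title");
}
```

With this shape, no code path can forget to set the length, because the length travels with the string.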
Comment 2 luigi.scarso 2014-10-07 06:02:29 UTC
This seems to be ok



--- StructElement.h.orig	2014-10-06 21:05:53.439147479 +0200
+++ StructElement.h	2014-10-07 07:51:25.412753215 +0200
@@ -76,6 +76,9 @@
   // Creates an UserProperty attribute, with an arbitrary name and value.
   Attribute(const char *name, Object *value);
 
+  // Creates an UserProperty attribute, with an arbitrary name of length len and value.
+  Attribute(const char *name, Object *value, int len);
+
   GBool isOk() const { return type != Unknown; }
 
   // Name, type and value can be set only on construction.
@@ -86,7 +89,7 @@
   Object *getValue() const { return &value; }
   static Object *getDefaultValue(Type type);
 
-  const char *getName() const { return type == UserProperty ? name.getCString() : getTypeName(); }
+  GooString *getName() const { return type == UserProperty ? name.copy() : new GooString(getTypeName()); }
 
   // The revision is optional, and defaults to zero.
   Guint getRevision() const { return revision; }



--- StructElement.cc.orig	2014-10-06 21:05:47.551147234 +0200
+++ StructElement.cc	2014-10-06 09:13:31.000000000 +0200
@@ -690,6 +690,23 @@
   valueA->copy(&value);
 }
 
+Attribute::Attribute(const char *nameA, Object *valueA, int lenA):
+  type(UserProperty),
+  owner(UserProperties),
+  revision(0),
+  name(nameA,lenA),
+  value(),
+  hidden(gFalse),
+  formatted(NULL)
+{
+  assert(valueA);
+  valueA->copy(&value);
+}
+
+
+
+
+
 Attribute::Attribute(Type type, Object *valueA):
   type(type),
   owner(UserProperties), // TODO: Determine corresponding owner from Type
@@ -785,13 +802,17 @@
   return entry ? entry->type : Unknown;
 }
 
+
 Attribute *Attribute::parseUserProperty(Dict *property)
 {
   Object obj, value;
   const char *name = NULL;
+  int len = 0 ;
 
-  if (property->lookup("N", &obj)->isString())
+  if (property->lookup("N", &obj)->isString()){
     name = obj.getString()->getCString();
+    len = obj.getString()->getLength();
+  }
   else if (obj.isName())
     name = obj.getName();
   else {
@@ -807,7 +828,7 @@
     return NULL;
   }
 
-  Attribute *attribute = new Attribute(name, &value);
+  Attribute *attribute = new Attribute(name, &value,len) ;
   value.free();
   obj.free();
Comment 3 Albert Astals Cid 2014-10-07 17:34:47 UTC
Can you attach a pdf where this is needed?
Comment 4 luigi.scarso 2014-10-07 17:40:40 UTC
Created attachment 107516
A pdf where the value of /N is encoded as utf-16be
Comment 5 Albert Astals Cid 2014-10-07 20:39:45 UTC
So are you using getName() somewhere? As far as I can see, nothing in poppler is.
Comment 6 Albert Astals Cid 2014-10-07 20:46:23 UTC
I've committed a patch inspired by yours, but one that fixes a few things.

