Hi,

$ cat girtest.js
const pop = imports.gi.Poppler;
const doc = pop.Document.new_from_file("file:///path/to/test.pdf", '')
const page=doc.get_page(0)
log(page.get_text_layout())
$ gjs girtest.js
Segmentation fault (core dumped)

Backtrace follows below.

What actually happens is that gir expects get_text_layout to return an array of PopplerRectangle* (an array of pointers to allocated PopplerRectangle objects), while it actually returns an array of PopplerRectangle (one continuous malloc region with all rectangles side-by-side). The confusion occurs naturally in C, as the type "Foo*" can be a pointer to a single Foo object, or an array of Foo objects.

From a cursory glance at gir, it seems the actual data layout currently implemented is not supported by gir, and that the PopplerRectangles have to be allocated separately. This would be an API change.

If one tries to trick gir and change the header file to a PopplerRectangle* (instead of the **), one gets:

$ gjs girtest.js
(gjs:20441): Gjs-WARNING **: JS ERROR: Error: Unsupported type array for (out caller-allocates) @girtest.js:4
JS_EvaluateScript() failed

Here is the backtrace:

(gdb) bt
#0 __memcpy_sse2_unaligned () at ../sysdeps/x86_64/multiarch/memcpy-sse2-unaligned.S:37
#1 0x00007ffff69393a7 in g_slice_copy (mem_size=32, mem_block=0x40519322d0e56042) at gslice.c:1056
#2 0x00007ffff6c166d3 in g_boxed_copy (boxed_type=8036912, src_boxed=0x40519322d0e56042) at gboxed.c:352
#3 0x00007ffff7d9cf54 in gjs_boxed_from_c_struct (context=0x636ff0, info=<optimized out>, gboxed=0x40519322d0e56042, flags=<optimized out>) at gi/boxed.cpp:1236
#4 0x00007ffff7d99d43 in gjs_value_from_g_argument (context=context@entry=0x636ff0, value_p=value_p@entry=0x7fffffffc500, type_info=type_info@entry=0x72aa30, arg=arg@entry=0x7fffffffc510, copy_structs=copy_structs@entry=1) at gi/arg.cpp:2642
#5 0x00007ffff7d9a1fb in gjs_array_from_carray_internal (context=context@entry=0x636ff0, value_p=value_p@entry=0x7fffffffc5c8, param_info=param_info@entry=0x72aa30, length=length@entry=483, array=<optimized out>) at gi/arg.cpp:2143
#6 0x00007ffff7d9a695 in gjs_value_from_explicit_array (context=0x636ff0, value_p=0x7fffffffc5c8, type_info=<optimized out>, arg=0x7fffffffc618, length=483) at gi/arg.cpp:2195
#7 0x00007ffff7d9fe03 in gjs_invoke_c_function (context=context@entry=0x636ff0, function=function@entry=0x6d0de0, obj=obj@entry=0x7fffee735cd0, js_argc=js_argc@entry=0, js_argv=js_argv@entry=0x68e508, js_rval=js_rval@entry=0x7fffffffc970, r_value=r_value@entry=0x0) at gi/function.cpp:1140
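To make the layout mismatch described above concrete, here is a minimal sketch in Python using ctypes. It is an illustration only, not code from Poppler or gjs: `Rect` is a hypothetical stand-in for PopplerRectangle (four doubles), the coordinate values are made up, and 64-bit pointers are assumed. It builds both layouts and then reads the first 8 bytes of the contiguous block the way a consumer expecting an array of pointers would.

```python
import ctypes
import struct


class Rect(ctypes.Structure):
    # Hypothetical stand-in: four doubles, matching the x1/y1/x2/y2
    # fields of PopplerRectangle (32 bytes in total).
    _fields_ = [("x1", ctypes.c_double), ("y1", ctypes.c_double),
                ("x2", ctypes.c_double), ("y2", ctypes.c_double)]


# Layout A: one contiguous block with all rectangles side by side.
contiguous = (Rect * 2)(Rect(70.3, 10.0, 80.0, 20.0),
                        Rect(1.0, 2.0, 3.0, 4.0))

# Layout B: an array of pointers, one per rectangle.
pointers = (ctypes.POINTER(Rect) * 2)(ctypes.pointer(contiguous[0]),
                                      ctypes.pointer(contiguous[1]))

# Layout B works when it is read as an array of pointers.
second_x1 = pointers[1].contents.x1

# Reading layout A as if it were layout B: the first 8 bytes are the
# bits of x1, but they get treated as an address (assumes 64-bit pointers).
bogus_address = ctypes.c_uint64.from_buffer(contiguous).value
print("x1 bits misread as a pointer: 0x%016x" % bogus_address)

# Going the other way: decode the src_boxed value from the backtrace as
# an IEEE 754 double (little-endian, as on x86_64).
decoded = struct.unpack("<d", struct.pack("<Q", 0x40519322d0e56042))[0]
print("backtrace pointer decoded as a double: %r" % decoded)
```

The pointer value in the backtrace decodes to a double a little above 70, which looks like a page coordinate and is consistent with the explanation above. The mem_size=32 passed to g_slice_copy also matches the size of four doubles.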
I have exactly the same issue with Python. I was able to work around the problem using the ctypes foreign function interface of Python. The code is for Python 2.7. It should be possible to create a gi.overrides.Poppler module based on this code. This would fix the issue for Python.

from gi.repository import Poppler, GLib
import ctypes

lib_poppler = ctypes.cdll.LoadLibrary("libpoppler-glib-8")

ctypes.pythonapi.PyCapsule_GetPointer.restype = ctypes.c_void_p
ctypes.pythonapi.PyCapsule_GetPointer.argtypes = [ctypes.py_object, ctypes.c_char_p]
PyCapsule_GetPointer = ctypes.pythonapi.PyCapsule_GetPointer

# Mirror of the C struct PopplerRectangle: four doubles.
class Poppler_Rectangle(ctypes.Structure):
    _fields_ = [
        ("x1", ctypes.c_double),
        ("y1", ctypes.c_double),
        ("x2", ctypes.c_double),
        ("y2", ctypes.c_double)
    ]

LP_Poppler_Rectangle = ctypes.POINTER(Poppler_Rectangle)

poppler_page_get_text_layout = ctypes.CFUNCTYPE(
    ctypes.c_int,
    ctypes.c_void_p,
    ctypes.POINTER(LP_Poppler_Rectangle),
    ctypes.POINTER(ctypes.c_uint)
)(lib_poppler.poppler_page_get_text_layout)

def get_page_layout(page):
    assert isinstance(page, Poppler.Page)
    # Raw address of the underlying PopplerPage.
    capsule = page.__gpointer__
    page_addr = PyCapsule_GetPointer(capsule, None)
    rectangles = LP_Poppler_Rectangle()
    n_rectangles = ctypes.c_uint(0)
    has_text = poppler_page_get_text_layout(page_addr, ctypes.byref(rectangles), ctypes.byref(n_rectangles))
    try:
        result = []
        if has_text:
            assert n_rectangles.value > 0, "n_rectangles.value > 0: {}".format(n_rectangles.value)
            assert rectangles, "rectangles: {}".format(rectangles)
            # The rectangles sit side by side in one contiguous block.
            for i in range(n_rectangles.value):
                r = rectangles[i]
                result.append((r.x1, r.y1, r.x2, r.y2))
        return result
    finally:
        # Free the single block allocated by the C function.
        if rectangles:
            GLib.free(ctypes.addressof(rectangles.contents))
I still have this issue with poppler v0.40 in Python (or at least I assume it is caused by this issue). get_text_layout is returning nonsense numbers like:

5.77367485245e-317
1.44295714099e-312
2.56761490707e-312
2.60372595358e-321

Any chance to get this fixed?

I.
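Values this tiny are subnormal doubles, which is what typically shows up when bytes that are not really doubles (small integers, for example) are read as doubles. A minimal sketch for inspecting the raw bit patterns of the reported numbers follows. `double_bits` is a hypothetical helper, not part of Poppler or PyGObject, and the sketch does not establish where the bytes actually come from.

```python
import struct


def double_bits(value):
    """Return the raw 64-bit pattern of a float as an unsigned integer."""
    return struct.unpack("<Q", struct.pack("<d", value))[0]


# The values quoted in the comment above (printed with limited
# precision, so the low bits may not be recovered exactly).
reported = [5.77367485245e-317, 1.44295714099e-312,
            2.56761490707e-312, 2.60372595358e-321]

for value in reported:
    bits = double_bits(value)
    # Show the pattern as a whole and split into two 32-bit halves.
    print("%r -> 0x%016x (high=%d, low=%d)"
          % (value, bits, bits >> 32, bits & 0xFFFFFFFF))
```

If the halves come out as small integers, that hints that integer or pointer-sized data is being interpreted as rectangle coordinates, which would fit the layout mismatch described in the original report.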
-- GitLab Migration Automatic Message -- This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/612.