Bug 63820

Summary: Search is slow (?infinite?) in this document
Product: poppler
Reporter: Germán Poo-Caamaño <gpoo+bfdo>
Component: general
Assignee: poppler-bugs <poppler-bugs>
Status: RESOLVED MOVED
QA Contact:
Severity: normal    
Priority: medium    
Version: unspecified   
Hardware: Other   
OS: All   
URL: https://bugzilla.gnome.org/show_bug.cgi?id=698621
Whiteboard:
Attachments: PDF test case (page 83)

Description Germán Poo-Caamaño 2013-04-22 23:47:32 UTC
I opened a conference proceedings PDF, searched for 'forge', and evince got
stuck at 59%.  I tried searching in other documents of more than 140 pages,
and in smaller ones, and could not reproduce the problem.

I am not attaching the document because it is big (1.6MB) and I am not sure
about its copyright (usually these are behind a paywall).  However, I found a
link that should do the trick for reproducing the error:

http://opensource.ucc.ie/icse2003/3rd-WS-on-OSS-Engineering.pdf

I interrupted evince because it was taking too long.  The same thing happens
with poppler-glib-demo; here is the stack trace with poppler master:

Starting program: /home/gpoo/code/evince/install/bin/poppler-glib-demo '/home/gpoo/Documentos/Papers/Conference, Portland, May/Conference, Portland, May - 2003 - Automating the measurement of open source projects.pdf'
[Thread debugging using libthread_db enabled]
Using host libthread_db library "/lib/i386-linux-gnu/libthread_db.so.1".

Program received signal SIGINT, Interrupt.
0xb731d450 in TextBlock::isBeforeByRule1 (this=0x93231d0, blk1=0x92ee238)
    at TextOutputDev.cc:1737
warning: Source file is more recent than executable.
1737	       (this->ExMin <= blk1->ExMax));
#0  0xb731d450 in TextBlock::isBeforeByRule1 (this=0x93231d0, blk1=0x92ee238)
    at TextOutputDev.cc:1737
#1  0xb731d743 in TextBlock::visitDepthFirst (this=0x93231d0, 
    blkList=0x8433d00, pos1=908, sorted=0x9d62f90, sortPos=923, 
    visited=0x9d6bf20) at TextOutputDev.cc:1855
#2  0xb731d674 in TextBlock::visitDepthFirst (this=0x93544c8, 
    blkList=0x8433d00, pos1=1054, sorted=0x9d62f90, sortPos=923, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#3  0xb731d674 in TextBlock::visitDepthFirst (this=0x9351fc8, 
    blkList=0x8433d00, pos1=1047, sorted=0x9d62f90, sortPos=917, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#4  0xb731d674 in TextBlock::visitDepthFirst (this=0x934d518, 
    blkList=0x8433d00, pos1=1033, sorted=0x9d62f90, sortPos=917, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#5  0xb731d674 in TextBlock::visitDepthFirst (this=0x934b500, 
    blkList=0x8433d00, pos1=1027, sorted=0x9d62f90, sortPos=909, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#6  0xb731d674 in TextBlock::visitDepthFirst (this=0x9321220, 
    blkList=0x8433d00, pos1=902, sorted=0x9d62f90, sortPos=909, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#7  0xb731d674 in TextBlock::visitDepthFirst (this=0x9320cd8, 
    blkList=0x8433d00, pos1=901, sorted=0x9d62f90, sortPos=902, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#8  0xb731d674 in TextBlock::visitDepthFirst (this=0x93534f0, 
    blkList=0x8433d00, pos1=1051, sorted=0x9d62f90, sortPos=902, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#9  0xb731d674 in TextBlock::visitDepthFirst (this=0x934cfb8, 
    blkList=0x8433d00, pos1=1032, sorted=0x9d62f90, sortPos=901, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#10 0xb731d674 in TextBlock::visitDepthFirst (this=0x934efe0, 
    blkList=0x8433d00, pos1=1038, sorted=0x9d62f90, sortPos=901, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#11 0xb731d674 in TextBlock::visitDepthFirst (this=0x932dcd0, 
    blkList=0x8433d00, pos1=938, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#12 0xb731d674 in TextBlock::visitDepthFirst (this=0x932b288, 
    blkList=0x8433d00, pos1=930, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#13 0xb731d674 in TextBlock::visitDepthFirst (this=0x9329d68, 
    blkList=0x8433d00, pos1=926, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#14 0xb731d674 in TextBlock::visitDepthFirst (this=0x92f9098, 
    blkList=0x8433d00, pos1=781, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#15 0xb731d674 in TextBlock::visitDepthFirst (this=0x93266a0, 
    blkList=0x8433d00, pos1=918, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#16 0xb731d674 in TextBlock::visitDepthFirst (this=0x93036c0, 
    blkList=0x8433d00, pos1=812, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#17 0xb731d674 in TextBlock::visitDepthFirst (this=0x92fdae8, 
    blkList=0x8433d00, pos1=795, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#18 0xb731d674 in TextBlock::visitDepthFirst (this=0x92f6b98, 
    blkList=0x8433d00, pos1=774, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#19 0xb731d674 in TextBlock::visitDepthFirst (this=0x93188d0, 
    blkList=0x8433d00, pos1=876, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#20 0xb731d674 in TextBlock::visitDepthFirst (this=0x9317e40, 
    blkList=0x8433d00, pos1=874, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#21 0xb731d674 in TextBlock::visitDepthFirst (this=0x934ea98, 
    blkList=0x8433d00, pos1=1037, sorted=0x9d62f90, sortPos=860, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#22 0xb731d674 in TextBlock::visitDepthFirst (this=0x932a7f8, 
    blkList=0x8433d00, pos1=928, sorted=0x9d62f90, sortPos=823, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#23 0xb731d674 in TextBlock::visitDepthFirst (this=0x93292d8, 
    blkList=0x8433d00, pos1=924, sorted=0x9d62f90, sortPos=823, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#24 0xb731d674 in TextBlock::visitDepthFirst (this=0x9328860, 
    blkList=0x8433d00, pos1=922, sorted=0x9d62f90, sortPos=823, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#25 0xb731d674 in TextBlock::visitDepthFirst (this=0x92f3178, 
    blkList=0x8433d00, pos1=763, sorted=0x9d62f90, sortPos=823, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#26 0xb731d674 in TextBlock::visitDepthFirst (this=0x92ef210, 
    blkList=0x8433d00, pos1=751, sorted=0x9d62f90, sortPos=823, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#27 0xb731d674 in TextBlock::visitDepthFirst (this=0x9315e90, 
    blkList=0x8433d00, pos1=868, sorted=0x9d62f90, sortPos=823, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#28 0xb731d674 in TextBlock::visitDepthFirst (this=0x930d540, 
    blkList=0x8433d00, pos1=842, sorted=0x9d62f90, sortPos=818, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#29 0xb731d674 in TextBlock::visitDepthFirst (this=0x930c568, 
    blkList=0x8433d00, pos1=839, sorted=0x9d62f90, sortPos=818, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#30 0xb731d674 in TextBlock::visitDepthFirst (this=0x933a750, 
    blkList=0x8433d00, pos1=976, sorted=0x9d62f90, sortPos=818, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#31 0xb731d674 in TextBlock::visitDepthFirst (this=0x9339230, 
    blkList=0x8433d00, pos1=972, sorted=0x9d62f90, sortPos=818, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#32 0xb731d674 in TextBlock::visitDepthFirst (this=0x9337d10, 
    blkList=0x8433d00, pos1=968, sorted=0x9d62f90, sortPos=818, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#33 0xb731d674 in TextBlock::visitDepthFirst (this=0x9336c50, 
    blkList=0x8433d00, pos1=965, sorted=0x9d62f90, sortPos=818, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#34 0xb731d674 in TextBlock::visitDepthFirst (this=0x93311a0, 
    blkList=0x8433d00, pos1=948, sorted=0x9d62f90, sortPos=806, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#35 0xb731d674 in TextBlock::visitDepthFirst (this=0x92ce120, 
    blkList=0x8433d00, pos1=646, sorted=0x9d62f90, sortPos=805, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#36 0xb731d674 in TextBlock::visitDepthFirst (this=0x930cff8, 
    blkList=0x8433d00, pos1=841, sorted=0x9d62f90, sortPos=805, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#37 0xb731d674 in TextBlock::visitDepthFirst (this=0x930b590, 
    blkList=0x8433d00, pos1=836, sorted=0x9d62f90, sortPos=805, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#38 0xb731d674 in TextBlock::visitDepthFirst (this=0x930a070, 
    blkList=0x8433d00, pos1=832, sorted=0x9d62f90, sortPos=805, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#39 0xb731d674 in TextBlock::visitDepthFirst (this=0x93095e0, 
    blkList=0x8433d00, pos1=830, sorted=0x9d62f90, sortPos=805, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#40 0xb731d674 in TextBlock::visitDepthFirst (this=0x9335c60, 
    blkList=0x8433d00, pos1=962, sorted=0x9d62f90, sortPos=805, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#41 0xb731d674 in TextBlock::visitDepthFirst (this=0x9330710, 
    blkList=0x8433d00, pos1=946, sorted=0x9d62f90, sortPos=805, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#42 0xb731d674 in TextBlock::visitDepthFirst (this=0x932f1f0, 
    blkList=0x8433d00, pos1=942, sorted=0x9d62f90, sortPos=804, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#43 0xb731d674 in TextBlock::visitDepthFirst (this=0x932e218, 
    blkList=0x8433d00, pos1=939, sorted=0x9d62f90, sortPos=804, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#44 0xb731d674 in TextBlock::visitDepthFirst (this=0x9309098, 
    blkList=0x8433d00, pos1=829, sorted=0x9d62f90, sortPos=804, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#45 0xb731d674 in TextBlock::visitDepthFirst (this=0x9308608, 
    blkList=0x8433d00, pos1=827, sorted=0x9d62f90, sortPos=804, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#46 0xb731d674 in TextBlock::visitDepthFirst (this=0x92d5040, 
    blkList=0x8433d00, pos1=668, sorted=0x9d62f90, sortPos=804, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#47 0xb731d674 in TextBlock::visitDepthFirst (this=0x92d20b8, 
    blkList=0x8433d00, pos1=659, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#48 0xb731d674 in TextBlock::visitDepthFirst (this=0x92d0b68, 
    blkList=0x8433d00, pos1=654, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#49 0xb731d674 in TextBlock::visitDepthFirst (this=0x92ca1c0, 
    blkList=0x8433d00, pos1=634, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#50 0xb731d674 in TextBlock::visitDepthFirst (this=0x92c5d18, 
    blkList=0x8433d00, pos1=620, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#51 0xb731d674 in TextBlock::visitDepthFirst (this=0x930b048, 
    blkList=0x8433d00, pos1=835, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#52 0xb731d674 in TextBlock::visitDepthFirst (this=0x9309b28, 
    blkList=0x8433d00, pos1=831, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#53 0xb731d674 in TextBlock::visitDepthFirst (this=0x9307630, 
    blkList=0x8433d00, pos1=824, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#54 0xb731d674 in TextBlock::visitDepthFirst (this=0x9306110, 
    blkList=0x8433d00, pos1=820, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#55 0xb731d674 in TextBlock::visitDepthFirst (this=0x9305680, 
    blkList=0x8433d00, pos1=818, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#56 0xb731d674 in TextBlock::visitDepthFirst (this=0x9304bf0, 
    blkList=0x8433d00, pos1=816, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#57 0xb731d674 in TextBlock::visitDepthFirst (this=0x92c7238, 
    blkList=0x8433d00, pos1=625, sorted=0x9d62f90, sortPos=791, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#58 0xb731d674 in TextBlock::visitDepthFirst (this=0x92c4d10, 
    blkList=0x8433d00, pos1=616, sorted=0x9d62f90, sortPos=789, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#59 0xb731d674 in TextBlock::visitDepthFirst (this=0x92bfdc8, 
    blkList=0x8433d00, pos1=601, sorted=0x9d62f90, sortPos=789, 
    visited=0x9d6bf20) at TextOutputDev.cc:1873
#60 0xb7321a86 in TextPage::coalesce (this=0x855d260, physLayout=true, 
    fixedPitch=0, doHTML=false) at TextOutputDev.cc:3426
#61 0xb7324de4 in TextOutputDev::endPage (this=0x855e518)
    at TextOutputDev.cc:5339
#62 0xb7fb381a in poppler_page_get_text_page (page=0x845a680)
    at poppler-page.cc:277
#63 0xb7fb570f in poppler_page_find_text_with_options (page=0x845a680, 
    text=0x83a6848 "forge", options=POPPLER_FIND_DEFAULT)
    at poppler-page.cc:885
#64 0x08054526 in pgd_find_find_text (demo=0x82a2fa8) at find.c:107
#65 0xb762f880 in g_idle_dispatch (source=0x83dfb18, 
    callback=0x8054480 <pgd_find_find_text>, user_data=0x82a2fa8)
    at gmain.c:5205
#66 0xb7632ce6 in g_main_dispatch (context=0x808b8b0) at gmain.c:3054
#67 g_main_context_dispatch (context=0x808b8b0) at gmain.c:3630
#68 0xb7633085 in g_main_context_iterate (dispatch=1, block=-1218179600, 
    context=0x808b8b0, self=<optimized out>) at gmain.c:3701
#69 g_main_context_iterate (context=0x808b8b0, block=-1218179600, dispatch=1, 
    self=<optimized out>) at gmain.c:3638
#70 0xb763355b in g_main_loop_run (loop=0x8347f80) at gmain.c:3895
#71 0xb7c3175d in gtk_main () at gtkmain.c:1156
#72 0x08050a92 in main (argc=2, argv=0xbfffe874) at main.c:380
Comment 1 Jose Aliste 2013-04-23 00:08:04 UTC
The problem is on page 83 of the attached test case, in a table that is rotated 90 degrees. The text-layout extraction algorithm does not cope with the rotation of glyphs. As a by-product, the poppler glib frontend can't render that page.
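
To picture the recursion seen in the stack trace, here is a minimal, generic sketch of a depth-first ordering pass over layout blocks. It is not poppler's code: `order_blocks`, `left_of` and the `Block` tuple are hypothetical names, and the ordering rule is a simplified stand-in for whatever `isBeforeByRule1` really checks. It only shows the general shape, where each block first recurses into every unvisited block that must precede it.

```python
from typing import Callable, List, Sequence, Tuple

# Hypothetical block: just (x_min, x_max) extents, not poppler's TextBlock.
Block = Tuple[float, float]


def order_blocks(blocks: Sequence[Block],
                 is_before: Callable[[Block, Block], bool]
                 ) -> Tuple[List[int], int]:
    """Return (order, predicate_calls) for a depth-first ordering pass."""
    visited = [False] * len(blocks)
    order: List[int] = []
    calls = 0

    def visit(i: int) -> None:
        nonlocal calls
        visited[i] = True
        # Emit every still-unvisited block that must precede block i.
        for j in range(len(blocks)):
            if visited[j]:
                continue
            calls += 1
            if is_before(blocks[j], blocks[i]):
                visit(j)
        order.append(i)

    for i in range(len(blocks)):
        if not visited[i]:
            visit(i)
    return order, calls


def left_of(a: Block, b: Block) -> bool:
    # Assumed rule for illustration: a precedes b if it ends before b starts.
    return a[1] <= b[0]


if __name__ == "__main__":
    sample = [(30.0, 40.0), (0.0, 10.0), (15.0, 25.0)]
    print(order_blocks(sample, left_of))
```

In this sketch each block is visited exactly once, so the predicate is called at most n*(n-1) times for n blocks. Whether the real pass keeps that bound on the rotated table is not established in this report.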
Comment 2 Germán Poo-Caamaño 2013-04-23 18:01:51 UTC
Created attachment 78392 [details]
PDF test case (page 83)

The original file is linked.  Now I am attaching only the page that helps to reproduce the issue.
Comment 3 Albert Astals Cid 2013-04-23 22:00:42 UTC
Another of those TextOutputDev is slow problems, don't think the glib frontend is the culprit, pdftotext takes ages too.
Comment 4 Germán Poo-Caamaño 2013-04-23 22:10:39 UTC
(In reply to comment #3)
> Another of those TextOutputDev is slow problems, don't think the glib
> frontend is the culprit, pdftotext takes ages too.

FWIW, xpdf renders the page very quickly but then gets stuck, too.  I guess it is because it is extracting the text or something like that, which makes sense to me given that pdftocairo renders the page quickly, too.
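
A small harness along these lines could compare rendering and text extraction on the test case without hanging the terminal. It is only a sketch: `time_command` is a hypothetical helper, the 30-second limit is an arbitrary assumption, and it assumes `pdftocairo` and `pdftotext` are on the PATH.

```python
import os
import subprocess
import sys
import tempfile
import time
from typing import List, Optional


def time_command(cmd: List[str], timeout_s: float) -> Optional[float]:
    """Run cmd; return elapsed seconds, or None if it exceeded timeout_s."""
    start = time.monotonic()
    try:
        # subprocess.run kills the child process when the timeout expires.
        subprocess.run(cmd, stdout=subprocess.DEVNULL,
                       stderr=subprocess.DEVNULL,
                       timeout=timeout_s, check=False)
    except subprocess.TimeoutExpired:
        return None
    return time.monotonic() - start


if __name__ == "__main__":
    pdf = sys.argv[1]  # path to the PDF test case
    out_root = os.path.join(tempfile.mkdtemp(), "page")
    commands = [
        ("render (pdftocairo)", ["pdftocairo", "-png", pdf, out_root]),
        ("text (pdftotext)", ["pdftotext", pdf, "-"]),
    ]
    for label, cmd in commands:
        elapsed = time_command(cmd, timeout_s=30.0)  # assumed limit
        if elapsed is None:
            print(f"{label}: still running after 30 s, killed")
        else:
            print(f"{label}: finished in {elapsed:.2f} s")
```

If rendering finishes and text extraction hits the timeout, that would be consistent with the slowness being on the text side rather than in rendering.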
Comment 5 Jose Aliste 2013-04-23 22:39:26 UTC
Albert, of course you are right, the problem is in TextOutputDev. My comment was just about the fact that the poppler-glib render code does two things at the same time: it renders the page AND it gets the text. Thus, evince won't render that page, since poppler_page_render won't return.
Comment 6 GitLab Migration User 2018-08-20 21:52:34 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/poppler/poppler/issues/104.
