colorspace exception during images extraction on a existing pdf

giansluca

Hi all, :)

I'm using iText (5.5.1) to extract images from an existing PDF file. I can extract almost all of the images (CMYK and RGB), but in some cases I get this exception:

com.itextpdf.text.exceptions.UnsupportedPdfException: The color space [/Indexed, /DeviceCMYK, 244, 274 0 R] is not supported.
        at com.itextpdf.text.pdf.parser.PdfImageObject.decodeImageBytes(PdfImageObject.java:323)
        at com.itextpdf.text.pdf.parser.PdfImageObject.<init>(PdfImageObject.java:200)
        at com.itextpdf.text.pdf.parser.PdfImageObject.<init>(PdfImageObject.java:169)
        at com.itextpdf.text.pdf.parser.ImageRenderInfo.prepareImageObject(ImageRenderInfo.java:124)
        at com.itextpdf.text.pdf.parser.ImageRenderInfo.getImage(ImageRenderInfo.java:114)

I use the standard approach to extracting images: implementing the RenderListener interface.

PdfImageObject image = renderInfo.getImage();

I can't get the PdfImageObject, and so I can't get the byte array of the image.
Is this an iText bug?
Do you have any idea how to solve it?

Thanks, all

Gian



Re: colorspace exception during images extraction on a existing pdf

iText mailing list
On 5/27/2014 12:00 AM, giansluca wrote:
> i
> can extract almost all image (CMYK and RGB), but in some cases i have this
> exception :
>
> com.itextpdf.text.exceptions.UnsupportedPdfException
You get this exception if extraction of a specific type of image isn't
supported in iText.

_______________________________________________
iText-questions mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/itext-questions

iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples: http://itextpdf.com/themes/keywords.php
Re: colorspace exception during images extraction on a existing pdf

giansluca
OK, thanks!
Strange, because the image is a JPG with a CMYK profile. I can open the PDF in Photoshop and view or save the image correctly, and in my code (with iText) I can already extract this type of image (JPG, CMYK).

Gian
Re: colorspace exception during images extraction on a existing pdf

iText mailing list
giansluca schreef op 27/05/2014 11:31:
> strange because the image is a JPG with CMYK profile ... i can open pdf in
> photoshop and see or save image correctly ..

Well, there's nothing wrong with the PDF, nor with the image, it's just
that iText relies on the Java ImageIO class and this class doesn't
support that image. Adding support for such images would probably not be
that difficult, but nobody has felt the need so far, so it didn't happen
(yet).

Re: colorspace exception during images extraction on a existing pdf

giansluca
Thanks, yes, I understand!
I know that ImageIO doesn't support images with a CMYK colorspace (JPG, and TIFF as well). If I call getBufferedImage() on the PdfImageObject I get an IIOException, but that's OK because I can still get the byte array from the PdfImageObject :)

I added support for reading that type of image and everything works fine (as long as I have the byte array). But in this case iText can't give me the PdfImageObject at all. I think (maybe) that the standard image reader (ImageIO) is called by iText when I try to get the BufferedImage.

Well, I'm investigating and debugging, and will try to find a solution ;)

Gian

Re: colorspace exception during images extraction on a existing pdf

Leonard Rosenthol-3
In reply to this post by giansluca
Actually, it is NOT a JPEG with a CMYK profile.

The colorspace that you showed in the error: [/Indexed, /DeviceCMYK, 244,
274 0 R]
1 - That means that the data is indexed, which can't be used with JPEG
data.
2 - As it says DeviceCMYK that means that the data is (most likely) raw
CMYK numbers and not a profile. HOWEVER, I'd need to see the object at 274
to be certain.

Leonard
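
To make the colorspace array concrete: in [/Indexed, /DeviceCMYK, 244, 274 0 R], the base space is DeviceCMYK, 244 is the highest valid index (so the palette has 245 entries), and object 274 holds the lookup table, four bytes (C, M, Y, K) per entry. The sketch below shows how such a palette could be expanded to RGB in plain Java. It is not iText code and not the fix used in this thread: the class and method names are hypothetical, it assumes 8 bits per sample, and the CMYK-to-RGB formula is a naive conversion without any ICC color management.

```java
/** Sketch: expand 8-bit samples of an [/Indexed /DeviceCMYK hival lookup] image to packed RGB. */
final class IndexedCmykSketch {

    /** Naive CMYK -> RGB conversion (components 0..255), no ICC color management. */
    static int cmykToRgb(int c, int m, int y, int k) {
        int r = (255 - c) * (255 - k) / 255;
        int g = (255 - m) * (255 - k) / 255;
        int b = (255 - y) * (255 - k) / 255;
        return (r << 16) | (g << 8) | b;
    }

    /** Converts the lookup table (4 bytes per entry: C, M, Y, K) into one RGB int per entry. */
    static int[] paletteToRgb(byte[] lookup, int hival) {
        int entries = hival + 1;
        if (lookup.length < entries * 4) {
            throw new IllegalArgumentException("lookup table too short for hival " + hival);
        }
        int[] rgb = new int[entries];
        for (int i = 0; i < entries; i++) {
            rgb[i] = cmykToRgb(lookup[4 * i] & 0xFF, lookup[4 * i + 1] & 0xFF,
                               lookup[4 * i + 2] & 0xFF, lookup[4 * i + 3] & 0xFF);
        }
        return rgb;
    }

    /** Maps each 8-bit sample to its palette color; out-of-range indices are clamped to hival. */
    static int[] expand(byte[] indices, int[] paletteRgb) {
        int[] pixels = new int[indices.length];
        for (int i = 0; i < indices.length; i++) {
            int index = Math.min(indices[i] & 0xFF, paletteRgb.length - 1);
            pixels[i] = paletteRgb[index];
        }
        return pixels;
    }

    public static void main(String[] args) {
        // Hypothetical two-entry palette: entry 0 = no ink, entry 1 = full black.
        byte[] lookup = {0, 0, 0, 0, 0, 0, 0, (byte) 255};
        int[] palette = paletteToRgb(lookup, 1);
        for (int pixel : expand(new byte[] {0, 1, 1, 0}, palette)) {
            System.out.printf("%06X%n", pixel);
        }
    }
}
```

For the image in this thread, paletteToRgb would be called with hival = 244 and the decoded contents of object 274, which must then be at least 980 bytes long.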

Re: colorspace exception during images extraction on a existing pdf

giansluca
Thanks, Leonard.
Yes, you're right! It isn't a JPG image, and I can't work out what type of image it is! :/

This way (with a direct reference) I can get the byte array of the image:

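PdfReader reader = new PdfReader("filePath");
// Resolve the image's indirect reference to the stream object it points to.
PdfObject o = reader.getPdfObject(renderInfo.getRef());
PdfStream stream = (PdfStream) o;
// Raw bytes: still encoded with whatever /Filter the stream dictionary names.
byte[] barray = PdfReader.getStreamBytesRaw((PRStream) stream);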


But ImageIO can't read it correctly, and I still haven't worked out the type of the image.

The first bytes, in hex, are: 48 89 1c 96 8b. That is not a signature I recognize.

Gian
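
A note on the bytes quoted above: 48 89 does not match a JPEG (FF D8) or PNG file (89 50 4E 47) signature, but it does pass the two-byte zlib header check from RFC 1950, which would be consistent with raw /FlateDecode data. The thread never shows the stream dictionary, so treat that as a guess. A small sketch of such a check in plain Java, with hypothetical class and method names:

```java
/** Sketch: guess what a byte array contains from its first few bytes. */
final class StreamSniffer {

    /** RFC 1950 header check: deflate method, window size code <= 7, (CMF * 256 + FLG) divisible by 31. */
    static boolean isZlibHeader(int cmf, int flg) {
        cmf &= 0xFF;
        flg &= 0xFF;
        boolean deflate = (cmf & 0x0F) == 8;
        boolean windowOk = (cmf >> 4) <= 7;
        boolean checkOk = ((cmf << 8) | flg) % 31 == 0;
        return deflate && windowOk && checkOk;
    }

    /** Returns a rough label for the data; a two-byte match is a hint, not proof. */
    static String sniff(byte[] data) {
        if (data.length >= 2 && (data[0] & 0xFF) == 0xFF && (data[1] & 0xFF) == 0xD8) {
            return "JPEG";
        }
        if (data.length >= 4 && (data[0] & 0xFF) == 0x89
                && data[1] == 'P' && data[2] == 'N' && data[3] == 'G') {
            return "PNG file";
        }
        if (data.length >= 2 && isZlibHeader(data[0], data[1])) {
            return "zlib-compressed data";
        }
        return "unknown";
    }

    public static void main(String[] args) {
        // The first bytes reported in the post above.
        byte[] fromThread = {0x48, (byte) 0x89, 0x1c, (byte) 0x96, (byte) 0x8b};
        System.out.println(sniff(fromThread));
    }
}
```

A header match only says the data could be a zlib stream. The /Filter entry in the stream dictionary is the authoritative source.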
Re: colorspace exception during images extraction on a existing pdf

iText mailing list
On 5/27/2014 11:02 PM, giansluca wrote:
> byte[] barray = PdfReader.getStreamBytesRaw((PRStream) stream);
You're reading the stream, but... this stream is useless without the
corresponding stream dictionary.
The stream dictionary will inform you about the Filter that is used,
e.g. DCTDecode means you have a JPEG, about the number of components,
about the number of bits per component, about the width and the height
of the image.
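
As an illustration of why those dictionary entries matter: once the filter has been undone, the expected size of the sample data follows from the width, height, number of components and bits per component, with each row padded to a whole byte. The helper below is a hypothetical sketch in plain Java, not an iText API. For an /Indexed image the number of components is 1, because each sample is a palette index.

```java
/** Sketch: size of decoded PDF image sample data, derived from stream dictionary values. */
final class ImageStreamMath {

    /** Bytes in one row of samples; each row starts on a byte boundary. */
    static long bytesPerRow(int width, int components, int bitsPerComponent) {
        long bits = (long) width * components * bitsPerComponent;
        return (bits + 7) / 8;
    }

    /** Total bytes expected after all filters have been decoded. */
    static long decodedLength(int width, int height, int components, int bitsPerComponent) {
        return bytesPerRow(width, components, bitsPerComponent) * height;
    }

    public static void main(String[] args) {
        // Hypothetical 100 x 50 indexed image with 8 bits per sample.
        System.out.println(decodedLength(100, 50, 1, 8));
    }
}
```

Comparing this number with the length of the decoded stream is a quick sanity check that the right filter and parameters were applied.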

Re: colorspace exception during images extraction on a existing pdf

giansluca
Thanks a lot.

I need to improve my iText knowledge. A copy of iText in Action, Second Edition arrived today; I hope it helps me a bit ;)

If you have any advice on how to solve my problem, I'm listening :)

Gian
Re: colorspace exception during images extraction on a existing pdf

giansluca
Solved by implementing a custom way to decode a PNG image with the [/Indexed, /DeviceCMYK, 244, 274 0 R] colorspace.

... it was a PNG image.

Thanks, all, for the help.
Gian
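
Gian's decoder is not shown in the thread. As a rough sketch of one step it might have involved, assuming the stream uses /FlateDecode (an assumption, since the stream dictionary was never posted), the raw bytes can be inflated with java.util.zip. The class and method names are hypothetical, and any /DecodeParms predictor would still have to be undone afterwards. In iText 5, PdfReader.getStreamBytes(PRStream) is the call that applies the stream's filters for you.

```java
import java.io.ByteArrayOutputStream;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

/** Sketch: undo zlib/Flate compression on raw stream bytes using only the JDK. */
final class FlateSketch {

    /** Inflates zlib-wrapped data; throws IllegalArgumentException if the data is not valid. */
    static byte[] inflate(byte[] compressed) {
        Inflater inflater = new Inflater();
        try {
            inflater.setInput(compressed);
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!inflater.finished()) {
                int n = inflater.inflate(buffer);
                if (n == 0 && (inflater.needsInput() || inflater.needsDictionary())) {
                    throw new IllegalArgumentException("truncated or dictionary-dependent stream");
                }
                out.write(buffer, 0, n);
            }
            return out.toByteArray();
        } catch (DataFormatException e) {
            throw new IllegalArgumentException("not valid zlib data", e);
        } finally {
            inflater.end();
        }
    }

    /** Compresses data with a zlib wrapper; only here so the sketch can be tried as a round trip. */
    static byte[] deflate(byte[] raw) {
        Deflater deflater = new Deflater();
        try {
            deflater.setInput(raw);
            deflater.finish();
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buffer = new byte[4096];
            while (!deflater.finished()) {
                out.write(buffer, 0, deflater.deflate(buffer));
            }
            return out.toByteArray();
        } finally {
            deflater.end();
        }
    }

    public static void main(String[] args) {
        byte[] samples = {0, 1, 1, 0, 2, 2, 2, 2};
        byte[] roundTrip = inflate(deflate(samples));
        System.out.println(roundTrip.length);
    }
}
```

The inflated bytes are palette indices (possibly still predictor-encoded), not a standalone PNG file, so they cannot be handed to ImageIO directly.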
Re: colorspace exception during images extraction on a existing pdf

rao tran
Hi Gian!

As far as I know, you need a separate OCR tool to extract text from a scanned PDF or image. You can only extract the whole PDF and insert it into some other PDF.

Rao

NHATRANGRNT

> Date: Mon, 2 Jun 2014 15:22:52 -0700

> From: [hidden email]
> To: [hidden email]
> Subject: Re: [iText-questions] colorspace exception during images extraction on a existing pdf
>
> solved implementing a custom way to decode png image which [/Indexed,
> /DeviceCMYK, 244, 274 0 R] colorspace.
>
> ... was a png image
>
> thank all for help
> Gian
Re: colorspace exception during images extraction on a existing pdf

rao tran
In reply to this post by giansluca
Hi Gian,

You still have a problem extracting the images.

Use another method. In my experience, you try another approach, by trial and error, until it works.

Dig in and you'll get better.

If you need some code, please let me know and I'll give it to you.

Rao

NHATRANGRNT

Re: colorspace exception during images extraction on a existing pdf

giansluca
Thank you! I solved the colorspace exception problem, but I have another little problem;
I opened a new thread :)

Gian
Re: colorspace exception during images extraction on a existing pdf

rao tran
In reply to this post by giansluca
Hi,

Did you send me some email? I saw some emails and suddenly they were gone, so I want to make sure: did you send me some mail?

R

NHATRANGRNT

Re: colorspace exception during images extraction on a existing pdf

giansluca
No, I didn't send you any email.
Re: colorspace exception during images extraction on a existing pdf

Gilles
This post has NOT been accepted by the mailing list yet.
In reply to this post by giansluca
Hi Giansluca,

I have the exact same problem, how did you solve it?

Thanks
Re: colorspace exception during images extraction on a existing pdf

blowagie
This post has NOT been accepted by the mailing list yet.
Please read http://itextpdf.com/nabble to find out why your message will not reach the mailing-list.

We are abandoning the mailing-list in favor of http://stackoverflow.com
Re: colorspace exception during images extraction on a existing pdf

pramodgate6
This post has NOT been accepted by the mailing list yet.
In reply to this post by giansluca
Please send the code showing how you resolved your problem.