Convert Text-Based PDF to Image-Based PDF


Convert Text-Based PDF to Image-Based PDF

Balajiprasad

Hi,

 

Can we convert an entire PDF to an image-based PDF? Currently we do it manually by opening the PDF in an Adobe tool (Reader/Acrobat), selecting File -> Print -> Advanced, checking ‘Print as Image’, and selecting Adobe PDF as the printer. This returns the PDF as images. Can we do this in iTextSharp? Please guide me.

Note: Currently we can get the text-based PDF back by using the Recognize Text (OCR) feature.

 

Thanks & Regards,

R.Balajiprasad


------------------------------------------------------------------------------
Benefiting from Server Virtualization: Beyond Initial Workload
Consolidation -- Increasing the use of server virtualization is a top
priority.Virtualization can reduce costs, simplify management, and improve
application availability and disaster protection. Learn more about boosting
the value of server virtualization. http://p.sf.net/sfu/vmware-sfdev2dev
_______________________________________________
iText-questions mailing list
[hidden email]
https://lists.sourceforge.net/lists/listinfo/itext-questions

iText(R) is a registered trademark of 1T3XT BVBA.
Many questions posted to this list can (and will) be answered with a reference to the iText book: http://www.itextpdf.com/book/
Please check the keywords list before you ask for examples: http://itextpdf.com/themes/keywords.php

Re: Convert Text-Based PDF to Image-Based PDF

iText mailing list
On 21/04/2011 11:11, Balajiprasad wrote:

> Can we convert an entire PDF to an image-based PDF?

No, that's not possible with iText.

> Currently we do it manually by opening the PDF in an Adobe tool (Reader/Acrobat), selecting File -> Print -> Advanced, checking ‘Print as Image’, and selecting Adobe PDF as the printer. This returns the PDF as images. Can we do this in iTextSharp? Please guide me.

You need another tool, for instance JPedal.


Re: Convert Text-Based PDF to Image-Based PDF

Balajiprasad

Hi,

How do we convert an image-based PDF to a text-based PDF? I want to run OCR on the PDF, not just extract the text, so that the entire PDF becomes a searchable PDF in which the text can be selected.

 

Regards,

R.Balajiprasad

 

From: 1T3XT BVBA [mailto:[hidden email]]
Sent: Thursday, April 21, 2011 2:45 PM
Cc: 'Post all your questions about iText here'
Subject: Re: [iText-questions] Convert Text-Based PDF to Image-Based PDF

 


Re: Convert Text-Based PDF to Image-Based PDF

iText mailing list
On 27/04/2011 5:32, Balajiprasad wrote:

> I want to do OCR for the PDF (not extracting text only).

iText doesn't do OCR, but if you have OCR output (from another tool),
you can use iText to merge that output with the original PDF to make it searchable.
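
One common way to make a scanned page searchable is to draw the OCR text over the page image in PDF text rendering mode 3 (invisible text). That technique is an assumption here; the thread itself does not name it. The Python sketch below only builds the raw content-stream operators for a single word. The font resource name `F1`, the function names, and the sample values are hypothetical, and the text is assumed to contain only characters covered by the font's encoding. Writing the operators into an existing PDF would still be done with a PDF library.

```python
def escape_pdf_string(text: str) -> str:
    """Escape backslashes and parentheses for a PDF literal string."""
    # Backslashes must be escaped first so the escapes added for
    # parentheses are not escaped a second time.
    return (text.replace("\\", "\\\\")
                .replace("(", "\\(")
                .replace(")", "\\)"))


def invisible_text_ops(text: str, x: float, y: float, font_size: float,
                       font_name: str = "F1") -> str:
    """Return content-stream operators that draw `text` invisibly at (x, y).

    x, y and font_size are in PDF user-space units (points by default).
    font_name is a hypothetical resource name that must exist in the
    page's font resources.
    """
    return "\n".join([
        "BT",                                   # begin text object
        f"/{font_name} {font_size:.2f} Tf",     # select font and size
        "3 Tr",                                 # rendering mode 3: invisible
        f"1 0 0 1 {x:.2f} {y:.2f} Tm",          # text matrix: move to (x, y)
        f"({escape_pdf_string(text)}) Tj",      # show the string
        "ET",                                   # end text object
    ])


if __name__ == "__main__":
    print(invisible_text_ops("Total (EUR)", 72, 700.5, 10))
```

Because the text is never painted, the scanned image stays visually unchanged while viewers can still select and search the words.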


Re: Convert Text-Based PDF to Image-Based PDF

Balajiprasad

I have some other third-party tools that extract text (OCR) from an image-based PDF, but they return only the text of the PDF, not the image. How do we place that text into the existing PDF using iTextSharp? Please give me sample code for this.

 

Regards,

R.Balajiprasad

 

From: 1T3XT BVBA [mailto:[hidden email]]
Sent: Wednesday, April 27, 2011 12:03 PM
To: [hidden email]
Subject: Re: [iText-questions] Convert Text-Based PDF to Image-Based PDF

 


Re: Convert Text-Based PDF to Image-Based PDF

iText mailing list
On 27/04/2011 9:26, Balajiprasad wrote:

> I have some other third-party tools that extract text (OCR) from an image-based PDF, but they return only the text of the PDF, not the image. How do we place that text into the existing PDF using iTextSharp? Please give me sample code for this.


The OCR output should also contain coordinates.
If you have text + coordinates, you should use PdfStamper to put the text at absolute positions.
DO NOT ask for code samples. PURCHASE the book and READ how to do this.
The KNOWLEDGE obtained from the book will SAVE YOU TIME AND MONEY.


Re: Convert Text-Based PDF to Image-Based PDF

Balajiprasad

Hi,

 

I have used MODI (Microsoft Office Document Imaging 12.0 Type Library) to extract text from images, i.e. converting the PDF pages to images (JPG or TIFF) and then extracting the text from those images.

Is it possible to use iTextSharp to place the text into the PDF and, using the MODI object, get coordinates to place it at particular positions so that the PDF becomes searchable? Please confirm how to do this.

 

Regards,

R.Balajiprasad

 

From: 1T3XT BVBA [mailto:[hidden email]]
Sent: Wednesday, April 27, 2011 1:12 PM
To: [hidden email]
Subject: Re: [iText-questions] Convert Text-Based PDF to Image-Based PDF

 


Re: Convert Text-Based PDF to Image-Based PDF

iText mailing list
On 2/05/2011 9:09, Balajiprasad wrote:

> I have used MODI (Microsoft Office Document Imaging 12.0 Type Library) to extract text from images, i.e. converting the PDF pages to images (JPG or TIFF) and then extracting the text from those images.

Good for you. I don't know MODI.

> Is it possible to use iTextSharp to place the Text into the pdf & using Modi object get Coordinates to place in particular places to make it searchable PDF. Pls confirm me, how to use it?


How would we be able to confirm this?
We don't know what "using Modi object get Coordinates" means.

We can only confirm that if you have a page number, text and coordinates,
you can use PdfStamper and put that text on the coordinates of that page.
READ CHAPTER 6 of the documentation to confirm for yourself.
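
OCR tools that work on rendered page images usually report positions in pixels measured from the top-left corner, while absolute positions in a PDF are in points (1/72 inch) measured from the bottom-left corner by default. The Python sketch below shows that conversion for one word. The `OcrWord` record, its field names, and the sample values are hypothetical, and it assumes an unrotated page whose lower-left corner is at (0, 0) and a page image rendered at a known DPI.

```python
from dataclasses import dataclass

POINTS_PER_INCH = 72.0  # default PDF user-space unit is 1/72 inch


@dataclass
class OcrWord:
    """Hypothetical OCR record: page number, text and pixel position."""
    page: int    # 1-based page number
    text: str
    left: int    # pixels from the left edge of the page image
    bottom: int  # pixels from the TOP edge down to the bottom of the word box


def to_pdf_position(word: OcrWord, dpi: float, page_height_pt: float):
    """Return (x, y) in points for the lower-left corner of the word box."""
    x = word.left * POINTS_PER_INCH / dpi
    # Image rows grow downwards, PDF y grows upwards, so flip the axis.
    y = page_height_pt - word.bottom * POINTS_PER_INCH / dpi
    return x, y


if __name__ == "__main__":
    word = OcrWord(page=1, text="Invoice", left=300, bottom=200)
    # US Letter page (792 pt high) rendered at 300 DPI.
    print(word.page, word.text, to_pdf_position(word, 300, 792.0))
```

With these sample values the word lands one inch from the left edge (x = 72) and 48 points below the top of the page (y = 744). The resulting page number, text, and position are the three inputs named in the reply above.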


Re: Convert Text-Based PDF to Image-Based PDF

wenbuyi
In reply to this post by Balajiprasad
You can try this free online PDF OCR to extract text from the PDF.