“How to read PDF content using .NET?” is one of the most common questions you will find in almost every Microsoft forum. Since I have answered this question with sample code many times, I thought I would write a short article with a detailed explanation.
Here I am going to use iTextSharp.dll to read the PDF file. iTextSharp is a C# port of iText, an open source Java library for PDF generation and manipulation. You can download the DLL from sourceforge.net using this download iTextSharp link.
Now we will start the .NET coding part that uses iTextSharp.
As this is a sample program, I am going to add only three controls: one FileUpload control to browse for the PDF file, one button to show the content, and one label to display the PDF content.
First we will see the PDF file and its content that we are going to read.
Now we will design our .ASPX page. As I mentioned above, it has only three controls.
<%@ Page Language="C#" AutoEventWireup="true" CodeBehind="WebForm1.aspx.cs" Inherits="Sample_2012_Web_App.WebForm1" %>
<!DOCTYPE html>
<html xmlns="http://www.w3.org/1999/xhtml">
<head runat="server">
    <title></title>
</head>
<body>
    <form id="form1" runat="server">
        <div></div>
        <asp:Label ID="Label1" runat="server" Text="Please select the PDF File"></asp:Label>
        <asp:FileUpload ID="PDFFileUpload" runat="server" />
        <br /><br />
        <asp:Button ID="btnShowContent" runat="server" OnClick="btnShowContent_Click" Text="Show PDF Content" />
        <br /><br />
        <asp:Label ID="lblPdfContent" runat="server"></asp:Label>
    </form>
</body>
</html>
The image below shows the interface we have created.
Now we will see the C# code to read the PDF content. Before we start writing the code, we need to add a reference to iTextSharp.dll. In Solution Explorer, right-click References, choose Add Reference, and click the Browse button to locate the DLL file that you saved from the download.
Once the reference is added, we have to add the namespaces shown below.
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
Now we will see the complete source code.
using System;
using System.Collections.Generic;
using System.Linq;
using System.Web;
using System.Web.UI;
using System.Web.UI.WebControls;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;
using System.Text;
namespace Sample_2012_Web_App
{
    public partial class WebForm1 : System.Web.UI.Page
    {
        protected void Page_Load(object sender, EventArgs e)
        {
        }

        protected void btnShowContent_Click(object sender, EventArgs e)
        {
            if (PDFFileUpload.HasFile)
            {
                // Save the uploaded file into the application root so PdfReader can open it by path.
                string strPDFFile = PDFFileUpload.FileName;
                PDFFileUpload.SaveAs(Server.MapPath(strPDFFile));

                StringBuilder strPdfContent = new StringBuilder();
                PdfReader reader = new PdfReader(Server.MapPath(strPDFFile));
                try
                {
                    // iTextSharp page numbers start at 1.
                    for (int i = 1; i <= reader.NumberOfPages; i++)
                    {
                        ITextExtractionStrategy objExtractStrategy = new SimpleTextExtractionStrategy();
                        string strLineText = PdfTextExtractor.GetTextFromPage(reader, i, objExtractStrategy);

                        // Re-encode the page text from the system default code page to UTF-8.
                        strLineText = Encoding.UTF8.GetString(ASCIIEncoding.Convert(Encoding.Default, Encoding.UTF8, Encoding.Default.GetBytes(strLineText)));

                        // Encode the text so characters such as '<' display literally in the label.
                        strPdfContent.Append(Server.HtmlEncode(strLineText));
                        strPdfContent.Append("<br/>");
                    }
                }
                finally
                {
                    // Close the reader once, after all pages are read (not inside the loop).
                    reader.Close();
                }

                lblPdfContent.Text = strPdfContent.ToString();
            }
        }
    }
}
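If you do not want to keep a copy of the uploaded file on the server, the same page-by-page extraction can be done directly from the upload stream. The following is only a minimal sketch, assuming iTextSharp 5.x: the class name PdfTextHelper and the method name ExtractTextAsHtml are hypothetical and are not part of iTextSharp or of the sample above. It relies on the PdfReader constructor that accepts a Stream and on WebUtility.HtmlEncode from System.Net.

```csharp
using System;
using System.IO;
using System.Net;
using System.Text;
using iTextSharp.text.pdf;
using iTextSharp.text.pdf.parser;

// Hypothetical helper for illustration; not part of the original sample or of iTextSharp.
public static class PdfTextHelper
{
    // Returns the text of every page, HTML-encoded, with "<br/>" appended after each page.
    public static string ExtractTextAsHtml(Stream pdfStream)
    {
        if (pdfStream == null)
        {
            throw new ArgumentNullException("pdfStream");
        }

        StringBuilder content = new StringBuilder();
        PdfReader reader = new PdfReader(pdfStream);
        try
        {
            // iTextSharp page numbers start at 1.
            for (int page = 1; page <= reader.NumberOfPages; page++)
            {
                ITextExtractionStrategy strategy = new SimpleTextExtractionStrategy();
                string pageText = PdfTextExtractor.GetTextFromPage(reader, page, strategy);

                // Encode the text so it displays literally when assigned to a Label.
                content.Append(WebUtility.HtmlEncode(pageText));
                content.Append("<br/>");
            }
        }
        finally
        {
            // Close the reader exactly once, even if extraction throws.
            reader.Close();
        }

        return content.ToString();
    }
}
```

With a helper like this, the click handler could be reduced to something like lblPdfContent.Text = PdfTextHelper.ExtractTextAsHtml(PDFFileUpload.FileContent); so nothing is written to disk.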
Finally we will see the output.
As usual, you are always welcome to post your comments below.
Comments:
It's good, but it has some error in its code.
The following two lines of code should be outside of your loop.
reader.Close(); strPdfContent.Append("<br/>");