Skip to main content

Extracting Values from PDFs in .NET Core 8 without ASP.NET

Extracting data from PDF files is a common necessity for various tasks such as data analysis, content indexing, and information retrieval. While ASP.NET Core 8 offers robust tools for PDF manipulation, there are instances where developers may prefer alternatives for flexibility or specific project requirements. In this article, we'll explore how to extract values from PDF files within the .NET Core 8 ecosystem without relying on ASP.NET, using the PdfSharpCore library. We'll provide a step-by-step guide along with examples in C# to demonstrate how to accomplish this task effectively.

  1. Understanding PdfSharpCore: PdfSharpCore is a popular .NET library for PDF document manipulation. It provides functionalities to create, modify, and extract content from PDF files. In this guide, we'll focus on utilizing PdfSharpCore to extract text from PDF documents.
  2. Installing PdfSharpCore: Before we can start using PdfSharpCore in our .NET Core application, we need to install the PdfSharpCore NuGet package. This can be done via the NuGet Package Manager Console or the .NET CLI.

Using the NuGet Package Manager Console:
Install-Package PdfSharpCore

Using the .NET CLI
dotnet add package PdfSharpCore

Extracting Text from PDFs in C#: Now we have PdfSharpCore installed, let's dive into how we can extract text from PDF files using C#.

using PdfSharpCore.Pdf;
using PdfSharpCore.Pdf.IO;
using System;

public class PdfTextExtractor
{
    public static string ExtractTextFromPdf(string filePath)
    {
        using (PdfDocument document = PdfReader.Open(filePath, PdfDocumentOpenMode.Import))
        {
            string text = "";
            foreach (PdfPage page in document.Pages)
            {
                text += page.GetText();
            }
            return text;
        }
    }

    // Example usage:
    public static void Main(string[] args)
    {
        string pdfText = ExtractTextFromPdf("sample.pdf");
        Console.WriteLine(pdfText);
    }
}

In this example, we've created a PdfTextExtractor class with a static method ExtractTextFromPdf that takes the file path of the PDF as input and returns the extracted text. Inside the method, we use PdfSharpCore to open the PDF file, iterate through its pages, and extract text from each page. Finally, the extracted text is concatenated and returned.

Comments

Popular posts from this blog

How To See Logs Of Dropped Tables From The Database in MS SQL.

Here, I will explain you how you can see logs of users. Step 1 : First, create a new database with name "test". Step 2 : Create a new table. Step 3 : Now, go and drop the table by running the following command. Step 4 : Now, select your database under Object Explorer and go to Reports >> Standard Reports >> Schema Changes History. Step 5 : You will then see the schema change history. The report will show you who has dropped this table. Finally, you can locate the user activity with the help of log.

How To Implement NLog With WebAPI In Asp.Net(C#).

What is NLog? NLog is a flexible and free logging platform for various .NET platforms, including .NET standard. NLog is easy to apply and it includes several targets (database, file, event viewer). Which platform support it? .NET Framework 3.5, 4, 4.5, 4.6 & 4.7 .NET Framework 4 client profile Xamarin Android Xamarin iOS Windows Phone 8 Silver light 4 and 5 Mono 4 ASP.NET 4 (NLog.Web package) ASP.NET Core (NLog.Web.AspNetCore package) .NET Core (NLog.Extensions.Logging package) .NET Standard 1.x - NLog 4.5 .NET Standard 2.x - NLog 4.5 UWP - NLog 4.5 There are several log levels. Fatal : Something terrible occurred; the application is going down  Error : Something fizzled; the application might possibly proceed Warn : Something surprising; the application will proceed  Info : Normal conduct like mail sent, client refreshed profile and so on.  Debug : For troubleshooting; the executed question, the client confirmed, ...

How to write Unit Tests in .net

Unit tests are automated tests that verify the behavior code like methods and functions. Writing unit tests is crucial to clean coding, as they help ensure your code works as intended and catches bugs early in the development process. I can share some tips for writing effective unit tests: Write tests for all public methods Every public method in your code should have a corresponding unit test. This helps ensure that your code behaves correctly and catches any unexpected behavior early. public class Calculator { public int Add(int a, int b) { return a + b; } } [TestClass] public class CalculatorTests { [TestMethod] public void Add_ShouldReturnCorrectSum() { // Arrange Calculator calculator = new Calculator(); int a = 1; int b = 2; // Act int result = calculator.Add(a, b); // Assert Assert.AreEqual(3, result); } } Test boundary conditions  Make sure to test boundary conditions, such a...