Skip to main content

Extracting Values from PDFs in .NET Core 8 without ASP.NET

Extracting data from PDF files is a common necessity for various tasks such as data analysis, content indexing, and information retrieval. While ASP.NET Core 8 offers robust tools for PDF manipulation, there are instances where developers may prefer alternatives for flexibility or specific project requirements. In this article, we'll explore how to extract values from PDF files within the .NET Core 8 ecosystem without relying on ASP.NET, using the PdfSharpCore library. We'll provide a step-by-step guide along with examples in C# to demonstrate how to accomplish this task effectively.

  1. Understanding PdfSharpCore: PdfSharpCore is a popular .NET library for PDF document manipulation. It provides functionalities to create, modify, and extract content from PDF files. In this guide, we'll focus on utilizing PdfSharpCore to extract text from PDF documents.
  2. Installing PdfSharpCore: Before we can start using PdfSharpCore in our .NET Core application, we need to install the PdfSharpCore NuGet package. This can be done via the NuGet Package Manager Console or the .NET CLI.

Using the NuGet Package Manager Console:
Install-Package PdfSharpCore

Using the .NET CLI
dotnet add package PdfSharpCore

Extracting Text from PDFs in C#: Now we have PdfSharpCore installed, let's dive into how we can extract text from PDF files using C#.

using PdfSharpCore.Pdf;
using PdfSharpCore.Pdf.IO;
using System;

public class PdfTextExtractor
{
    public static string ExtractTextFromPdf(string filePath)
    {
        using (PdfDocument document = PdfReader.Open(filePath, PdfDocumentOpenMode.Import))
        {
            string text = "";
            foreach (PdfPage page in document.Pages)
            {
                text += page.GetText();
            }
            return text;
        }
    }

    // Example usage:
    public static void Main(string[] args)
    {
        string pdfText = ExtractTextFromPdf("sample.pdf");
        Console.WriteLine(pdfText);
    }
}

In this example, we've created a PdfTextExtractor class with a static method ExtractTextFromPdf that takes the file path of the PDF as input and returns the extracted text. Inside the method, we use PdfSharpCore to open the PDF file, iterate through its pages, and extract text from each page. Finally, the extracted text is concatenated and returned.

Comments

Popular posts from this blog

How To See Logs Of Dropped Tables From The Database in MS SQL.

Here, I will explain you how you can see logs of users. Step 1 : First, create a new database with name "test". Step 2 : Create a new table. Step 3 : Now, go and drop the table by running the following command. Step 4 : Now, select your database under Object Explorer and go to Reports >> Standard Reports >> Schema Changes History. Step 5 : You will then see the schema change history. The report will show you who has dropped this table. Finally, you can locate the user activity with the help of log.

How To Deploy .net Core Application On Linux

Here, I can explain steps to deploy .net core application on linux machine. Step 1 - Publish your .net Core application: First, create a .net core application on VS; you can make an MVC project or Web API project and if you already have an existing project, then open it. Right Click on your project Click on publish Now create a new publish profile, and browse the folder where you want to publish your project dll Click on publish so it will create your dll in the folder Step 2 - Install required .net Module on Linux: Now we have our web application dll and now we need to host it on the Linux environment. First, we need to understand how the deployment works in Linux. .Net applications run on Kestrel servers and we run Apache or Nginx server in Linux environments, which acts as a proxy server and handles the traffic from outside the machine and redirects it to the Kestrel server so we will have Apache or Nginx server as the middle layer. In this article, we will use Apache as a proxy ser

List Of Commonly Used Angular Commands

1) To get the npm version,    npm -v 2) To get the node version,    node -v 3) To get the Angular version,    ng v  4) To get the Jasmine version,    jasmine -v  5) To get the Karma version,    karma --version  6) To install Angular CLI,    npm install @angular/cli -g   npm install @angular/cli 7) To install the next version of Angular CLI, v   npm install @angular/cli@next  8) To get help in the terminal,    ng help 9) To create a new project in Angular,    ng new project-name  10) To skip external dependencies while creating a new project,    ng new project-name --skip-install  11) To run the Angular project,   ng serve (or) npm start (or) ng serve --force  12) Dry Run,   ng new project-name --dry-run 13) To create a new component in the Angular Project,   ng generate component component-