Extract Text From Documents (PDF, DOC, XLS, PPT, etc) – docsToText

Category: JavaScript, Text | February 3, 2021
Author: bshopcho
Last Update: February 3, 2021
License: MIT

Description:

An easy-to-use document-to-text converter that extracts text from documents such as PDF and MS Word, Excel, and PowerPoint files.

Supported file types: doc, docx, xls, xlsx, ppt, pptx, pdf, and hwp.
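
Because only the file types listed above are handled, it can be useful to check a file's type before attempting extraction. The following is a minimal sketch; `SUPPORTED_TYPES` and `isSupportedType` are hypothetical names introduced here for illustration and are not part of the library.

```javascript
// Hypothetical helper, not part of docsToText: the list mirrors the
// supported file types stated above.
const SUPPORTED_TYPES = new Set(['doc', 'docx', 'xls', 'xlsx', 'ppt', 'pptx', 'pdf', 'hwp']);

// Returns true when the given extension (with or without a leading dot,
// in any letter case) is one of the supported types.
function isSupportedType(ext) {
  if (typeof ext !== 'string') return false;
  const normalized = ext.trim().toLowerCase().replace(/^\./, '');
  return SUPPORTED_TYPES.has(normalized);
}

console.log(isSupportedType('PDF'));   // true
console.log(isSupportedType('.docx')); // true
console.log(isSupportedType('txt'));   // false
```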

How to use it:

1. To get started, load the JavaScript file docToText.js in the document.

<script src="docToText.js"></script>

2. Create a new instance of DocToText.

const docToText = new DocToText();

3. Extract text from a file you specify.

// The second argument is the file type (extension) of the document
docToText.extractToText('example.pdf', 'pdf')
.then(function (text) {
  console.log(text);
}).catch(function (error) {
  console.error(error);
});

4. Extract text from a local file selected by the user.

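// `files` is assumed to be the FileList of an <input type="file"> element
const file = files[0];
const {name} = file;
// Derive the file type from the file name's extension
const ext = name.toLowerCase().substring(name.lastIndexOf('.') + 1);
docToText.extractToText(file, ext)
.then(function (text) {
  console.log(text);
}).catch(function (error) {
  console.error(error);
});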
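
The extension lookup in step 4 returns the whole file name when the name contains no dot, because `lastIndexOf` returns -1 in that case. A slightly more defensive variant could look like the sketch below; `getExtension` is a hypothetical helper name, not a function provided by the library.

```javascript
// Hypothetical helper, not part of docsToText.
// Returns the lowercase extension of a file name, or an empty string when
// the name has no extension (no dot, a trailing dot, or only a leading dot).
function getExtension(name) {
  const dot = name.lastIndexOf('.');
  if (dot <= 0 || dot === name.length - 1) return '';
  return name.slice(dot + 1).toLowerCase();
}

console.log(getExtension('Report.PDF'));     // "pdf"
console.log(getExtension('archive.tar.gz')); // "gz"
console.log(getExtension('README'));         // ""
```

An empty result can then be treated as an unknown file type before calling `extractToText`.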

5. You can also extract text from multiple files bundled in a zip archive.

docToText.extractZipToText('file.zip')
.then(function (text) {
  console.log(text);
}).catch(function (error) {
  console.error(error);
});

// From a local file. This reuses the docToText instance created in step 2;
// `files` is assumed to be the FileList of an <input type="file"> element.
const zipFile = files[0];
docToText.extractZipToText(zipFile)
.then(function (text) {
  console.log(text);
}).catch(function (error) {
  console.error(error);
});
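
The promise chains in the steps above can also be written with async/await. The sketch below assumes only what the examples show, namely that `extractToText` returns a promise that resolves to the extracted text. `extractOrFallback` and its `fallback` parameter are hypothetical names, and the extractor is passed in as an argument so the function does not depend on a global instance.

```javascript
// Hypothetical wrapper, not part of docsToText.
// `extractor` is any object with an extractToText(source, ext) method that
// returns a promise, such as the DocToText instance created in step 2.
async function extractOrFallback(extractor, source, ext, fallback = '') {
  try {
    return await extractor.extractToText(source, ext);
  } catch (error) {
    // Log the failure and resolve with the fallback instead of rejecting
    console.error('Extraction failed for "' + ext + '":', error);
    return fallback;
  }
}
```

With the instance from step 2, this could be called as `extractOrFallback(docToText, 'example.pdf', 'pdf').then(console.log)`.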
