LIS Links

First and Largest Academic Social Network of LIS Professionals in India


Dear Friends

 

We need information about a PDF metadata extractor, preferably one that is free and works offline. Please share details if anybody is using one. It is meant to be used together with DSpace, but we cannot put our collection online.

Any help will be highly appreciated.

Thank you

Subeesh A C


Replies to This Forum

Try ExifTool

http://www.sno.phy.queensu.ca/~phil/exiftool/

I have been using it to extract metadata from PDFs for use in DSpace. If you are familiar with its command-line options, you can extract metadata from all PDFs in one go.

S. Baskar

Thank you very much, sir.

But in my trial the tool seems to extract data only from the document properties. Are you getting the appropriate data with ExifTool?

Subeesh A C
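ExifTool reads the metadata stored in a PDF (the document information properties and any embedded XMP packet); it does not analyse the text of the pages. One way to check where each value comes from is to print the group name next to every tag. The following is only a minimal sketch: the function name show_sources and the file name thesis.pdf are made up for illustration, and it assumes exiftool is installed and on the PATH.

```shell
#!/bin/sh
# show_sources FILE: list every tag ExifTool finds in FILE, one per line,
# prefixed with the group the value was read from.
#   -a   keep duplicate tags (the same tag name can exist in several groups)
#   -G1  print the specific group name in front of each tag
#   -s   print short tag names, the form used on the command line
show_sources() {
  exiftool -a -G1 -s "$1"
}

# Example call (hypothetical file name):
#   show_sources thesis.pdf
```

Tags read from the document properties typically appear under a PDF group and tags from embedded XMP under XMP groups. If a field is missing in both, ExifTool has nothing to extract for it.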

Hi,

With the command below you can extract all metadata (that is, every metadata tag associated with the PDF document) from hundreds of PDF documents and save it as a CSV file, which can then be used for batch import into DSpace.

If you need only specific tags, you have to name the required tags on the command line. I have given an example below.

To extract all available metadata tags from the PDF documents and save it as a CSV file

---------------------------------------------------------------------------------------------------------------------

exiftool -csv  *.pdf > output.csv

To extract specific metadata tags from the PDF documents and save it as a CSV file

-----------------------------------------------------------------------------------------------------------------------------

exiftool -csv -d %Y-%m-%d -Title -Author -Producer -Subject -Description -Type -Keywords -ISBN -CreateDate -CourseID -FileSize -PageCount -PDFVersion *.pdf > output.csv
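The CSV that ExifTool writes uses its own tag names as column headers, while a DSpace metadata CSV expects Dublin Core field names. A rough sketch of renaming the header row is given below. The function name map_header and the tag-to-field mapping are assumptions chosen for illustration, not something prescribed by ExifTool or DSpace, so adjust them to your own metadata registry. A DSpace batch import may also need further columns (for example an item id and a collection) that are not added here.

```shell
#!/bin/sh
# map_header: read an ExifTool CSV on standard input and rename selected
# column names in the header row (line 1). All other rows pass through
# unchanged, so quoted values that contain commas are not touched.
# The mapping below is an assumed example, not an official one.
map_header() {
  awk 'BEGIN { FS = OFS = "," }
    NR == 1 {
      for (i = 1; i <= NF; i++) {
        if ($i == "Title") $i = "dc.title"
        else if ($i == "Author") $i = "dc.contributor.author"
        else if ($i == "Subject") $i = "dc.subject"
        else if ($i == "CreateDate") $i = "dc.date.issued"
      }
    }
    { print }'
}

# Example use (the folder name pdfs is hypothetical): -r descends into
# subfolders and -ext pdf restricts processing to PDF files.
#   exiftool -csv -r -ext pdf -d %Y-%m-%d -Title -Author -Subject -CreateDate pdfs | map_header > dspace.csv
```

The columns are matched by name rather than by position, so the sketch does not depend on the order in which ExifTool writes them.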

Hope this helps.


S. Baskar

LinuXpert Systems

ExifTool Tag Names

The tables at the link below give the names of all tags recognized by ExifTool.

http://www.sno.phy.queensu.ca/~phil/exiftool/TagNames/index.html

Thank you very much sir

A few years ago I created a small utility for extracting information from PDF files. It extracts data from all files in a folder and saves it in a tab-delimited text file.

You can try it. I hope it helps; please let me know.

I have uploaded the program to Google Drive. Click here to download

With regards

Mujib Rahiman

KV Kanjikode

Thanks sir, I will surely let you know.

Regards 

Subeesh A C

Sir

I have checked your software; it is a great effort if you coded it yourself. As far as I can see, most software cannot identify the metadata of our PDF files in the way we require. I think the problem mostly revolves around the structure of the PDF files themselves. In my case the PDF files do not have any standard structure (and contain OCR text), so an algorithm cannot extract metadata from them as it would from a well-formed file. Since we are in a hurry and need more metadata for the current work, we are thinking of indexing the files now and filtering them later by category. Anyway, thanks for your reply.

Regards 

Subeesh A C
