To test ParsCit, I ran it on my own system. To test Grobid, I downloaded the Grobid source code, but despite many attempts I was unable to run it on my system. There is no guide available on the Grobid website. I have emailed the author but have not received a reply yet.
Efficiency:
ParsCit Efficiency:
System specification on which ParsCit was tested:
Core 2 Duo 2.1 GHz, 2 GB RAM, Operating System: Ubuntu
For header extraction without references, ParsCit took 1 second per document.
Below is the time ParsCit took to process the references of each document:
1110.6200 1s
2006_fulltext-3 1s
2011_full_69640340 1s
04743392 2s
a19-arthur 2s
fulltext-2 1s
GI-Proceedings.37-25 1s (Citation text cannot be found)
journal.pone.0019917 1s
KPM378-FINAL 0s (Citation text cannot be found)
p561-ozenc 2s
p561-ozenc_AmitShresthsconflictedcopy2012-01-11 1s
p784371491 2s
prodeedings_37842 1s
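To put the list above into one number, the times can be summed and averaged. This is only a small sketch: the values are copied from the list, the variable names are my own, and the reported times are whole seconds, so the average is coarse.

```python
from statistics import mean

# Reference-processing times in seconds, copied from the list above.
parscit_seconds = {
    "1110.6200": 1,
    "2006_fulltext-3": 1,
    "2011_full_69640340": 1,
    "04743392": 2,
    "a19-arthur": 2,
    "fulltext-2": 1,
    "GI-Proceedings.37-25": 1,   # citation text could not be found
    "journal.pone.0019917": 1,
    "KPM378-FINAL": 0,           # citation text could not be found
    "p561-ozenc": 2,
    "p561-ozenc_AmitShresthsconflictedcopy2012-01-11": 1,
    "p784371491": 2,
    "prodeedings_37842": 1,
}

total = sum(parscit_seconds.values())
average = mean(parscit_seconds.values())
slowest = max(parscit_seconds.values())

print(f"{len(parscit_seconds)} documents, {total} s total, "
      f"{average:.2f} s on average, {slowest} s at most")
# prints: 13 documents, 16 s total, 1.23 s on average, 2 s at most
```

Two of the 13 documents produced no citation text, so their times probably understate the cost of a successful run.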
Grobid Efficiency:
Processing a document took around 5 to 9 seconds on average. I was using the web application provided on the Grobid website, so this figure includes network overhead.
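For a fairer comparison, both tools would have to be timed locally, without the network in between. A minimal sketch of such a measurement, assuming the tool can be started from the command line. The helper name `time_command` is my own, and the command shown is only a placeholder that starts the Python interpreter, not a real ParsCit or Grobid call.

```python
import subprocess
import sys
import time


def time_command(command):
    """Run a command once and return (elapsed wall-clock seconds, exit code)."""
    start = time.perf_counter()
    completed = subprocess.run(
        command,
        stdout=subprocess.DEVNULL,  # discard output so printing does not distort the timing
        stderr=subprocess.DEVNULL,
    )
    return time.perf_counter() - start, completed.returncode


# Placeholder command so the sketch runs anywhere;
# replace it with the actual tool invocation for one PDF.
placeholder = [sys.executable, "-c", "pass"]

elapsed, code = time_command(placeholder)
print(f"exit code {code}, {elapsed:.3f} s")
```

Running each document several times and taking the median would reduce the effect of disk caching and process start-up.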
Accuracy Comparison:
Red = Not found
Green = Found
PDF FILE | ParsCit | Grobid |
1110.6200 | Title, Author, Affiliation, Address, Date, Abstract, Citation: Average | Title, Author, Affiliation, Address, Date, Abstract, Citation: Good |
2006_fulltext-3 | Title, Author, Affiliation, Address, Abstract, Email, Citation: Good | Title, Author, Affiliation, Address, Abstract, Email, Citation: Good |
2011_full_69640340 | Title, Author, Affiliation, Address, Keywords, Abstract, Email, Citation: Good | Title, Author, Affiliation, Address, Keywords, Abstract, Email, Citation: Good |
04743392 | Title, Author, Keywords, Affiliation, Email, Affiliation, Citation: Good | Title, Author, Keywords, Affiliation, Email, Affiliation, Citation: Good |
A19-arthur | Title, Author, Affiliation, Abstract, Citation: Good | Title, Author, Affiliation, Abstract, Citation: Good |
fulltext-2 | Title, Author, Abstract, Keywords, Citation: Good | Title, Author, Abstract, Keywords, Citation: Good |
GI-Proceedings.37-25 | Title, Authors, Affiliation, Address, Email, Abstract, Citation: Did not process any citations | Title, Authors, Affiliation, Address, Email, Abstract, Citation: Did not process any citations |
journal.pone.0019917 | Title, Authors, Affiliation, Abstract, Citation: Good | Title, Authors, Affiliation, Abstract, Citation: Good |
KPM378-FINAL | Title, Author, Affiliation, Abstract, Citation: Did not process any citations | Title, Author, Affiliation, Abstract, Citation: Good |
P561-ozenc | Title, Author, Address, Affiliation, Email, Citation: Average | Title, Author, Address, Affiliation, Email, Citation: Poor |
p561-ozenc_Amit Shrestha’s | Title, Author, Address, Affiliation, Email, Citation: Average | Title, Author, Address, Affiliation, Email, Citation: Poor |
p784371491 | Title, Authors, Affiliation, Email, Abstract, Citation: Poor | Title, Authors, Affiliation, Email, Abstract, Citation: Poor |
Prodeedings_37842 | Title, Authors, Affiliation, Email, Abstract, Citation: Average | Title, Authors, Affiliation, Email, Abstract, Citation: Average |
shabazza
13.02.2012 at 11:36
Here is the link to the sourceforge site: http://sourceforge.net/projects/grobid/
shabazza
13.02.2012 at 13:42
Hi,
I have checked out Grobid and it works. You need the tool PDF2XML (http://sourceforge.net/projects/pdf2xml/) on your machine, and you also have to set the environment variable PDFTOXML_HOME to the path of PDF2XML.
Then it works properly. I have measured running times between 400 ms and 2500 ms per PDF.
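If you start Grobid from a script, the variable can also be set just for the child process. A small sketch of that idea: the path `/opt/pdf2xml` and the helper name `env_with_pdftoxml` are only assumptions for illustration, and the child command here just prints the variable instead of starting Grobid.

```python
import os
import subprocess
import sys


def env_with_pdftoxml(pdftoxml_home):
    """Return a copy of the current environment with PDFTOXML_HOME set."""
    env = dict(os.environ)
    env["PDFTOXML_HOME"] = pdftoxml_home
    return env


# Hypothetical install location; use the directory where PDF2XML actually lives.
env = env_with_pdftoxml("/opt/pdf2xml")

# Stand-in for the real Grobid start command: the child only echoes the variable
# to show that it arrives in the child process.
child = [sys.executable, "-c", "import os; print(os.environ['PDFTOXML_HOME'])"]
result = subprocess.run(child, env=env, capture_output=True, text=True)

print(result.stdout.strip())
# prints: /opt/pdf2xml
```

Setting it this way leaves the environment of your shell untouched, which helps when several PDF2XML builds are installed side by side.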
kalimawan
13.02.2012 at 15:34
I have the source code from sourceforge. I will give it a try again.
wollepb
20.02.2012 at 20:22
Any news here?
kalimawan
21.02.2012 at 18:09
I haven't been able to run Grobid yet. I will make an appointment with Tobias to get his help. I have also found a new tool and will post about it soon.