EXCITE - Extraction of Citations from PDF Documents


About Excite


Excite Team




WeST - The Institute for Web Science and Technologies

  • Prof. Dr. Steffen Staab

    Team Leader

    staab@uni-koblenz.de

  • Dr. Zeyd Boukhers

    Researcher

    boukhers@uni-koblenz.de

  • Martin Körner

    Researcher

    mkoerner@uni-koblenz.de

GESIS - Leibniz-Institut für Sozialwissenschaften

  • Dr. Philipp Mayr

    Team Leader

    philipp.mayr@gesis.org

  • Behnam Ghavimi

    Researcher

    behnam.ghavimi@gesis.org

  • Azam Hosseini

    Programmer

    azam.hosseini@gesis.org


  
  

External Supporter

Dr. Heinrich Hartmann

heinrich@heinrichhartmann.com


Software

Excite provides several tools to extract and parse citations. All tools are licensed under the GNU General Public License v3.0 (GNU GPL v3.0), and their source code is available on GitHub.

  • EXParser: A Python tool that extracts and segments references from PDF files using a feedback mechanism.



  • EXMatcher: An algorithm that finds, for a given reference string, the corresponding items in a bibliographic corpus (such as Sowiport.org or related-work.net).


  • EXPublisher: A tool that converts EXCITE data to a JSON file that follows the OCC ontology.



  • EXRef-Identifier: An annotation tool for marking reference strings in text files, which can be used to create a gold standard.


    live demo

  • EXRef-Segmentation: An annotation tool for manually parsing reference strings.


    live demo

  • RefExt: A Java tool that extracts references from PDF files using Conditional Random Fields (CRF).




News From Excite


Publications