academicdb: An academic database builder
Why build your CV by hand when you can create it programmatically? This package uses a set of APIs (including Scopus, ORCID, CrossRef, and PubMed) to generate a database of academic achievements, and provides a tool to render those into a professional-looking CV. Perhaps more importantly, it provides a database of collaborators, which can be used to generate the notorious NSF collaborators spreadsheet.
Installing academicdb
To install the current version:
pip install academicdb
In addition to the Python packages required by academicdb (which should be automatically installed), you will also need a MongoDB server to host the database. There are two relatively easy alternatives:
- Install MongoDB on your own system.
- Create a free cloud MongoDB instance here.
Configuring academicdb
To use academicdb, you must first set up some configuration files, which will reside in ~/.academicdb. The most important is config.toml, which contains all of the details about you that are used to retrieve your information. Here are the contents of mine as an example:
[researcher]
lastname = "poldrack"
middlename = "alan"
firstname = "russell"
email = "russpold@stanford.edu"
orcid = "0000-0001-6755-0259"
query = "poldrack-r"
url = "http://poldrack.github.io"
twitter = "@russpoldrack"
github = "http://github.com/poldrack"
phone = "650-497-8488"
scholar_id = "RbmLvDIAAAAJ"
scopus_id = "7004739390"
address = [
"Stanford University",
"Department of Psychology",
"Building 420",
"450 Jane Stanford Way",
"Stanford, CA, 94305-2130",
]
Most of this should be self-explanatory. There are several identifiers that you need to specify:
- ORCID: This is a unique identifier for researchers. If you don’t already have an ORCID, you can get one here. You will need to enter information about your education, employment, invited positions and distinctions, and memberships and service into your ORCID account, since that is where academicdb looks for that information.
- Google Scholar: You will also need to retrieve your Google Scholar ID. Once you have set up your profile, go to the “My Profile” page. The URL from that page contains your id: for example, my URL is https://scholar.google.com/citations?user=RbmLvDIAAAAJ&hl=en and the ID is RbmLvDIAAAAJ.
- Scopus: Scopus is a service run by Elsevier. I know that they are the bad guys, but Scopus provides a service that is not available from anywhere else: for each reference, it provides a set of unique identifiers for the coauthors, which can be used to retrieve their affiliation information. This is essential for generating the NSF collaborators spreadsheet.
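If you would rather not pick the Google Scholar ID out of the URL by eye, it can be extracted from the user query parameter. This is a small illustrative sketch using only the standard library; the function name is hypothetical and not part of academicdb.

```python
from urllib.parse import parse_qs, urlparse


def scholar_id_from_url(profile_url: str) -> str:
    """Return the value of the 'user' query parameter of a Scholar profile URL."""
    query = parse_qs(urlparse(profile_url).query)
    values = query.get("user")
    if not values:
        raise ValueError("no 'user' parameter found in URL")
    return values[0]


url = "https://scholar.google.com/citations?user=RbmLvDIAAAAJ&hl=en"
print(scholar_id_from_url(url))  # prints RbmLvDIAAAAJ
```

The returned string is what goes into the scholar_id field of config.toml.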
Obtaining an API key for Scopus
You will need an API key to access the Scopus database, which you can obtain from http://dev.elsevier.com/myapikey.html. This is used by the pybliometrics package to access the APIs; note that there are weekly limits on the number of records one can access from these APIs without a subscription. If your institution has a subscription and you are on the institution’s network, you may be able to bypass these limits.
The first time you use the package, you will be asked by pybliometrics to enter your API key, which will be stored in ~/.pybliometrics/config.ini for reuse.
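To check whether a key has already been stored, you can inspect that file with the standard-library configparser. This is only a sketch: the function name is hypothetical, and the section and option names used as defaults ("Authentication" and "APIKey") are assumptions about how pybliometrics lays out its file, so compare them against your own config.ini.

```python
import configparser
from pathlib import Path


def has_scopus_key(config_path=None, section="Authentication", option="APIKey") -> bool:
    """Return True if the config file exists and holds a non-empty key.

    The default section/option names are assumptions; adjust them to match
    the file that pybliometrics wrote on your system.
    """
    if config_path is None:
        config_path = Path.home() / ".pybliometrics" / "config.ini"
    parser = configparser.ConfigParser()
    # read() returns the list of files it managed to parse; empty means not found.
    if not parser.read(config_path, encoding="utf-8"):
        return False
    return bool(parser.get(section, option, fallback="").strip())
```

If this returns False, pybliometrics will presumably prompt for the key the next time it is used, as described above.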
Specifying additional information
There are a number of pieces of information that are difficult to reliably obtain from ORCID or other APIs, so they must be specified in a set of text files, which should be saved in the base directory that is specified when the command-line dbbuilder tool is used:
- editorial.csv
- talks.csv
- conference.csv
- teaching.csv
- funding.csv
In addition, there may be references (including books, book chapters, and published conference papers) that are not detected by the automated search and need to be added by hand, using a file called additional_pubs.csv in the base directory.
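A quick way to see which of these files are present in a base directory is sketched below. It assumes each file starts with a header row, and it makes no assumption about column names, which are not listed here. The function name is hypothetical and not part of academicdb.

```python
import csv
from pathlib import Path

# File names taken from the lists above.
EXPECTED_FILES = (
    "editorial.csv",
    "talks.csv",
    "conference.csv",
    "teaching.csv",
    "funding.csv",
    "additional_pubs.csv",
)


def summarize_base_dir(base_dir) -> dict:
    """Map each expected file name to its data-row count, or None if absent."""
    summary = {}
    for name in EXPECTED_FILES:
        path = Path(base_dir) / name
        if not path.is_file():
            summary[name] = None
            continue
        with path.open(newline="", encoding="utf-8") as handle:
            # DictReader treats the first row as the header (an assumption here).
            summary[name] = sum(1 for _ in csv.DictReader(handle))
    return summary
```

A None value in the result marks a file you have not created yet.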
Database setup
By default, academicdb will use a MongoDB server on localhost with the default port number (27017); if you have installed and started MongoDB on your computer, it should just work.
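If the build fails to connect, a first step is to check that something is listening on MongoDB's default port, 27017. The sketch below uses only the standard library and the function name is hypothetical. It confirms that a TCP connection can be opened, not that the listener is actually MongoDB or that authentication will succeed.

```python
import socket


def mongodb_port_open(host: str = "localhost", port: int = 27017, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and name-resolution failures.
        return False
```

For a fuller check you could instead issue a ping command through a MongoDB client library, which also exercises the database protocol.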