Scival Publication Lookup #377

Merged
merged 1 commit into from
Jul 24, 2025
113 changes: 113 additions & 0 deletions docs/reference/scival/PublicationLookup.rst
@@ -0,0 +1,113 @@
pybliometrics.scival.PublicationLookup
======================================

`PublicationLookup()` implements the `SciVal Publication Lookup API <https://dev.elsevier.com/documentation/SciValPublicationAPI.wadl>`_.

It accepts a Scopus ID as the main argument.

.. currentmodule:: pybliometrics.scival
.. contents:: Table of Contents
    :local:

Documentation
-------------

.. autoclass:: PublicationLookup
    :members:
    :inherited-members:

Examples
--------
You initialize the class with the Scopus ID. The argument can be an integer or a string.

.. code-block:: python

    >>> import pybliometrics
    >>> from pybliometrics.scival import PublicationLookup
    >>> pybliometrics.scival.init()
    >>> pub = PublicationLookup('85036568406')
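
The identifier is the numeric Scopus ID, which is the EID without its "2-s2.0-" prefix. If your data holds EIDs instead, a small helper along the following lines could convert them before the lookup. This is only a sketch: `scopus_id_from_eid` is a hypothetical name and not part of pybliometrics.

```python
def scopus_id_from_eid(eid: str) -> str:
    """Strip the "2-s2.0-" prefix from an EID (hypothetical helper)."""
    prefix = "2-s2.0-"
    # Leave the value untouched if it already is a bare Scopus ID
    return eid[len(prefix):] if eid.startswith(prefix) else eid


scopus_id = scopus_id_from_eid("2-s2.0-85036568406")
print(scopus_id)
```

The resulting string can then be passed to `PublicationLookup()` as shown above.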


You can obtain basic information just by printing the object:

.. code-block:: python

    >>> print(pub)
    Document with Scopus Id 85036568406 received:
    - Title: Soft Electrochemical Probes for Mapping the Distribution of Biomarkers and Injected Nanomaterials in Animal and Human Tissues
    - DOI: 10.1002/anie.201709271
    - Type: Article
    - Publication Year: 2017
    - 7 author(s) found
    - 3 institution(s) found


The API response exposes many attributes, which you can access as properties of the object:

.. code-block:: python

    >>> pub.id
    85036568406
    >>> pub.title
    'Soft Electrochemical Probes for Mapping the Distribution of Biomarkers and Injected Nanomaterials in Animal and Human Tissues'
    >>> pub.doi
    '10.1002/anie.201709271'
    >>> pub.publication_year
    2017
    >>> pub.type
    'Article'
    >>> pub.citation_count
    34
    >>> pub.source_title
    'Angewandte Chemie - International Edition'
    >>> pub.topic_id
    7563
    >>> pub.topic_cluster_id
    157
    >>> pub.sdgs
    ['SDG 3: Good Health and Well-being']


You can retrieve the authors as a list of `namedtuples <https://docs.python.org/3/library/collections.html#collections.namedtuple>`_, which pair conveniently with `pandas <https://pandas.pydata.org/>`_:

.. code-block:: python

    >>> pub.authors
    [Author(id=7404861905, name='Lin, T.-E.', uri='Author/7404861905'),
     Author(id=24537666700, name='Lu, Y.-J.', uri='Author/24537666700'),
     Author(id=7404248170, name='Sun, C.-L.', uri='Author/7404248170'),
     Author(id=7004202515, name='Pick, H.', uri='Author/7004202515'),
     Author(id=58307174900, name='Chen, J.-P.', uri='Author/58307174900'),
     Author(id=36246291500, name='Lesch, A.', uri='Author/36246291500'),
     Author(id=7102360867, name='Girault, H.H.', uri='Author/7102360867')]

    >>> import pandas as pd
    >>> print(pd.DataFrame(pub.authors))
                id           name                 uri
    0   7404861905     Lin, T.-E.   Author/7404861905
    1  24537666700      Lu, Y.-J.  Author/24537666700
    2   7404248170     Sun, C.-L.   Author/7404248170
    3   7004202515       Pick, H.   Author/7004202515
    4  58307174900    Chen, J.-P.  Author/58307174900
    5  36246291500      Lesch, A.  Author/36246291500
    6   7102360867  Girault, H.H.   Author/7102360867
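
Because namedtuples are ordinary tuples with named fields, they can also be processed with the standard library when pandas is not available. The sketch below rebuilds two of the author records shown above by hand and writes them as CSV. The `Author` type defined here is a local stand-in that mirrors the fields in the output above; it is not imported from pybliometrics.

```python
import csv
import io
from collections import namedtuple

# Local stand-in mirroring the fields of the namedtuples shown above
Author = namedtuple("Author", "id name uri")
authors = [
    Author(7404861905, "Lin, T.-E.", "Author/7404861905"),
    Author(24537666700, "Lu, Y.-J.", "Author/24537666700"),
]

buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(Author._fields)  # header row taken from the field names
writer.writerows(authors)        # each namedtuple is written as one row
csv_text = buffer.getvalue()
print(csv_text)
```

Names that contain a comma are quoted automatically by `csv.writer`, so the file stays parseable.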


The same structure applies to the `institutions` attribute:

.. code-block:: python

    >>> pub.institutions
    [Institution(id=217002, name='Chang Gung University', country='Taiwan', country_code='TWN'),
     Institution(id=306002, name='Swiss Federal Institute of Technology Lausanne', country='Switzerland', country_code='CHE'),
     Institution(id=725104, name='Chang Gung Memorial Hospital', country='Taiwan', country_code='TWN')]

    >>> import pandas as pd
    >>> print(pd.DataFrame(pub.institutions))
           id                                            name      country country_code
    0  217002                           Chang Gung University       Taiwan          TWN
    1  306002  Swiss Federal Institute of Technology Lausanne  Switzerland          CHE
    2  725104                    Chang Gung Memorial Hospital       Taiwan          TWN
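
The named fields make simple aggregations straightforward. As a minimal sketch, the following counts institutions per country code, using hand-built records that mirror the output above (the `Institution` type is a local stand-in, not imported from pybliometrics).

```python
from collections import Counter, namedtuple

# Local stand-in mirroring the fields of the namedtuples shown above
Institution = namedtuple("Institution", "id name country country_code")
institutions = [
    Institution(217002, "Chang Gung University", "Taiwan", "TWN"),
    Institution(306002, "Swiss Federal Institute of Technology Lausanne",
                "Switzerland", "CHE"),
    Institution(725104, "Chang Gung Memorial Hospital", "Taiwan", "TWN"),
]

# Count how many listed institutions belong to each country
per_country = Counter(inst.country_code for inst in institutions)
print(per_country.most_common())
```

With a real object, note that the implementation in this pull request returns `None` rather than an empty list when no institutions are listed, so iterating over `pub.institutions or []` is the safer pattern.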


Downloaded results are cached to expedite subsequent analyses. This information may become outdated. To refresh the cached results if they exist, set `refresh=True`, or provide an integer that will be interpreted as the maximum allowed number of days since the last modification date. For example, if you want to refresh all cached results older than 100 days, set `refresh=100`. Use `pub.get_cache_file_mdate()` to obtain the date of last modification, and `pub.get_cache_file_age()` to determine the number of days since the last modification.
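
The two accepted forms of `refresh` can be summarized in a few lines. The function below is only an illustration of the rule described above, not the code pybliometrics uses internally; `needs_refresh` and its parameters are hypothetical names.

```python
from datetime import datetime, timedelta
from typing import Union


def needs_refresh(refresh: Union[bool, int], last_modified: datetime,
                  now: datetime) -> bool:
    """Decide whether a cached file should be downloaded again (sketch)."""
    # Check for bool first, because bool is a subclass of int in Python
    if isinstance(refresh, bool):
        return refresh
    # An integer is the maximum allowed age of the cached file in days
    age_in_days = (now - last_modified).days
    return age_in_days > refresh


now = datetime(2025, 7, 24)
cached = now - timedelta(days=120)  # a file last modified 120 days ago
print(needs_refresh(100, cached, now))
print(needs_refresh(False, cached, now))
```

With `refresh=100` the 120-day-old file counts as outdated, while `refresh=False` always keeps the cached copy.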
3 changes: 3 additions & 0 deletions pybliometrics/scival/__init__.py
@@ -0,0 +1,3 @@
from pybliometrics.utils import *

from pybliometrics.scival.publication_lookup import *
133 changes: 133 additions & 0 deletions pybliometrics/scival/publication_lookup.py
@@ -0,0 +1,133 @@
from collections import namedtuple
from typing import Union, Optional

from pybliometrics.superclasses import Retrieval
from pybliometrics.utils import make_int_if_possible, chained_get, list_authors


class PublicationLookup(Retrieval):

    @property
    def authors(self) -> Optional[list[namedtuple]]:
        """A list of namedtuples representing listed authors in
        the form `(id, name, uri)`.
        """
        out = []
        fields = 'id name uri'
        auth = namedtuple('Author', fields)
        for item in chained_get(self._json, ['publication', 'authors'], []):
            new = auth(id=make_int_if_possible(item['id']), name=item.get('name'),
                       uri=item.get('uri'))
            out.append(new)
        return out or None

    @property
    def citation_count(self) -> Optional[int]:
        """Count of citations"""
        return make_int_if_possible(chained_get(self._json, ['publication', 'citationCount']))

    @property
    def doi(self) -> Optional[str]:
        """Digital Object Identifier (DOI)"""
        return chained_get(self._json, ['publication', 'doi'])

    @property
    def id(self) -> Optional[int]:
        """ID of the document (same as EID without "2-s2.0-")."""
        return make_int_if_possible(chained_get(self._json, ['publication', 'id']))

    @property
    def institutions(self) -> Optional[list[namedtuple]]:
        """A list of namedtuples representing listed institutions in
        the form `(id, name, country, country_code)`.
        """
        out = []
        fields = 'id name country country_code'
        auth = namedtuple('Institution', fields)
        for item in chained_get(self._json, ['publication', 'institutions'], []):
            new = auth(id=make_int_if_possible(item['id']), name=item.get('name'),
                       country=item.get('country'), country_code=item.get('countryCode'))
            out.append(new)
        return out or None

    @property
    def link(self) -> Optional[str]:
        """URL link"""
        return chained_get(self._json, ['link', '@href'])

    @property
    def publication_year(self) -> Optional[int]:
        """Year of publication"""
        return make_int_if_possible(chained_get(self._json, ['publication', 'publicationYear']))

    @property
    def sdgs(self) -> Optional[list[str]]:
        """Sustainable Development Goals."""
        return chained_get(self._json, ['publication', 'sdg'])

    @property
    def source_title(self) -> Optional[str]:
        """Title of source"""
        return chained_get(self._json, ['publication', 'sourceTitle'])

    @property
    def title(self) -> Optional[str]:
        """Publication title"""
        return chained_get(self._json, ['publication', 'title'])

    @property
    def topic_cluster_id(self) -> Optional[int]:
        """Topic cluster id"""
        return make_int_if_possible(chained_get(self._json, ['publication', 'topicClusterId']))

    @property
    def topic_id(self) -> Optional[int]:
        """Topic id"""
        return make_int_if_possible(chained_get(self._json, ['publication', 'topicId']))

    @property
    def type(self) -> Optional[str]:
        """Type of publication"""
        return chained_get(self._json, ['publication', 'type'])

    def __str__(self):
        """Return pretty text version of the document."""
        # `authors` and `institutions` return None when nothing is listed,
        # so fall back to an empty list before calling len().
        author_count = len(self.authors or [])
        if author_count >= 1:
            authors = f"{author_count} author(s) found"
        else:
            authors = "(No author found)"

        institution_count = len(self.institutions or [])
        if institution_count >= 1:
            institutions = f"{institution_count} institution(s) found"
        else:
            institutions = "(No institution found)"
        return (
            f"Document with Scopus Id {self.id or 'N/A'} received:\n"
            f"- Title: {self.title}\n"
            f"- DOI: {self.doi}\n"
            f"- Type: {self.type}\n"
            f"- Publication Year: {self.publication_year}\n"
            f"- {authors}\n"
            f"- {institutions}\n"
        )

    def __init__(self,
                 identifier: Union[int, str] = None,
                 refresh: Union[bool, int] = False,
                 **kwds: str
                 ) -> None:
        """Interaction with the Publication Lookup API.

        :param identifier: The Scopus ID of the publication.
        :param refresh: Whether to refresh the cached file if it exists or not.
                        If int is passed, cached file will be refreshed if the
                        number of days since last modification exceeds that value.
        :param kwds: Keywords passed on as query parameters. Must contain
                     fields and values mentioned in the API specification at
                     https://dev.elsevier.com/documentation/SciValPublicationAPI.wadl.
        """
        self._view = ''
        self._refresh = refresh
        Retrieval.__init__(self, identifier=str(identifier), **kwds)
56 changes: 56 additions & 0 deletions pybliometrics/scival/tests/test_PublicationLookup.py
@@ -0,0 +1,56 @@
from pybliometrics.scival.publication_lookup import PublicationLookup
from pybliometrics.utils.startup import init

init()

pub1 = PublicationLookup('85036568406')


def test_publication_authors_count():
    assert len(pub1.authors) >= 7


def test_publication_citation_count():
    assert pub1.citation_count > 0


def test_publication_doi():
    assert pub1.doi == "10.1002/anie.201709271"


def test_publication_first_author():
    assert pub1.authors[0].id == 7404861905
    assert pub1.authors[0].name == "Lin, T.-E."
    assert pub1.authors[0].uri == "Author/7404861905"


def test_publication_first_institution():
    assert pub1.institutions[0].id == 217002
    assert pub1.institutions[0].name == "Chang Gung University"
    assert pub1.institutions[0].country == "Taiwan"
    assert pub1.institutions[0].country_code == "TWN"


def test_publication_id():
    assert pub1.id == 85036568406


def test_publication_institutions_count():
    assert len(pub1.institutions) >= 3


def test_publication_sdgs():
    assert len(pub1.sdgs) >= 1
    assert pub1.sdgs[0] == 'SDG 3: Good Health and Well-being'


def test_publication_source_title():
    assert pub1.source_title == 'Angewandte Chemie - International Edition'


def test_publication_type():
    assert pub1.type == "Article"


def test_publication_year():
    assert pub1.publication_year == 2017
3 changes: 3 additions & 0 deletions pybliometrics/utils/constants.py
@@ -32,6 +32,7 @@
     'ObjectMetadata': CACHE_PATH / "ScienceDirect" / 'object_metadata',
     'ObjectRetrieval': CACHE_PATH / "ScienceDirect" / 'object_retrieval',
     'PlumXMetrics': CACHE_PATH / "Scopus" / 'plumx',
+    'PublicationLookup': CACHE_PATH / "Scival" / "publication_lookup",
     'ScDirSubjectClassifications': CACHE_PATH / "ScienceDirect" / 'subject_classification',
     'ScienceDirectSearch': CACHE_PATH / "ScienceDirect" / 'science_direct_search',
     'ScopusSearch': CACHE_PATH / "Scopus" / 'scopus_search',
@@ -43,6 +44,7 @@
 # URLs for all classes
 RETRIEVAL_BASE = 'https://api.elsevier.com/content/'
 SEARCH_BASE = 'https://api.elsevier.com/content/search/'
+SCIVAL_BASE = 'https://api.elsevier.com/analytics/scival/'
 URLS = {
     'AbstractRetrieval': RETRIEVAL_BASE + 'abstract/',
     'ArticleEntitlement': RETRIEVAL_BASE + 'article/entitlement/',
@@ -56,6 +58,7 @@
     'NonserialTitle': RETRIEVAL_BASE + 'nonserial/title/isbn/',
     'ObjectMetadata': RETRIEVAL_BASE + 'object/',
     'ObjectRetrieval': RETRIEVAL_BASE + 'object/',
+    'PublicationLookup': SCIVAL_BASE + 'publication/',
     'PlumXMetrics': 'https://api.elsevier.com/analytics/plumx/',
     'ScDirSubjectClassifications': RETRIEVAL_BASE + 'subject/scidir/',
     'ScienceDirectSearch': SEARCH_BASE + 'sciencedirect/',
3 changes: 2 additions & 1 deletion pybliometrics/utils/startup.py
@@ -114,7 +114,8 @@ def check_keys_tokens() -> None:
 def create_cache_folders(config: Type[ConfigParser]) -> None:
     """Auxiliary function to create cache folders."""
     for api, path in config.items('Directories'):
-        for view in VIEWS[api]:
+        views = VIEWS.get(api, [""])  # empty string
+        for view in views:
             view_path = Path(path, view)
             view_path.mkdir(parents=True, exist_ok=True)

3 changes: 3 additions & 0 deletions pybliometrics/utils/tests/test_startup.py
@@ -34,6 +34,9 @@ def test_imports():
     import pybliometrics.scopus
     pybliometrics.scopus.init()

+    import pybliometrics.scival
+    pybliometrics.scival.init()
+

 def test_new_config():
     """Test whether a new config file is created."""