Description
Submitting Author Name: Chao Liu
Submitting Author Github Handle: @chaoliu-cl
Other Package Authors Github handles: (comma separated, delete if none)
Repository: https://github.com/chaoliu-cl/LBDiscover
Version submitted:
Submission type: Standard
Editor: TBD
Reviewers: TBD
Archive: TBD
Version accepted: TBD
Language: en
- Paste the full DESCRIPTION file inside a code block below:
Package: LBDiscover
Title: Literature-Based Discovery Tools for Biomedical Research
Version: 0.1.0
Date: 2025-05-14
Authors@R:
person("Chao Liu", email = "chaoliu@cedarville.edu", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-9979-8272"))
Description: A suite of tools for literature-based discovery in biomedical research.
Provides functions for retrieving scientific articles from PubMed and
other NCBI databases, extracting biomedical entities (diseases, drugs, genes, etc.),
building co-occurrence networks, and applying various discovery models
including ABC, AnC, LSI, and BITOLA. The package also includes
visualization tools for exploring discovered connections.
License: GPL-3
URL: https://github.com/chaoliu-cl/LBDiscover, http://liu-chao.site/LBDiscover/, https://liu-chao.site/LBDiscover/
BugReports: https://github.com/chaoliu-cl/LBDiscover/issues
Encoding: UTF-8
LazyData: true
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.3.2
Depends:
R (>= 4.0.0)
Imports:
httr (>= 1.4.0),
xml2 (>= 1.3.0),
igraph (>= 1.2.0),
Matrix (>= 1.3.0),
utils,
stats,
grDevices,
graphics,
tools,
rentrez (>= 1.2.0),
jsonlite (>= 1.7.0)
Suggests:
openxlsx (>= 4.2.0),
SnowballC (>= 0.7.0),
visNetwork (>= 2.1.0),
spacyr (>= 1.2.0),
parallel,
digest (>= 0.6.0),
irlba (>= 2.3.0),
knitr,
rmarkdown,
base64enc,
reticulate,
testthat (>= 3.0.0),
mockery,
covr,
htmltools
VignetteBuilder: knitr
Config/testthat/edition: 3
Scope
- Please indicate which category or categories from our package fit policies this package falls under: (Please check an appropriate box below. If you are unsure, we suggest you make a pre-submission inquiry.):
- data retrieval
- data extraction
- data munging
- data deposition
- data validation and testing
- workflow automation
- version control
- citation management and bibliometrics
- scientific software wrappers
- field and lab reproducibility tools
- database software bindings
- geospatial data
- translation
- Explain how and why the package falls under these categories (briefly, 1-2 sentences):
Data retrieval: The package provides functions for retrieving scientific articles from PubMed and other NCBI databases. It is a tool for systematically accessing biomedical literature from major research repositories.
Data extraction: It extracts biomedical entities (diseases, drugs, genes, etc.) from retrieved literature, performing information extraction from scientific texts.
Citation management and bibliometrics: The package builds co-occurrence networks from literature and applies discovery models (ABC, AnC, LSI, BITOLA) to find hidden connections between concepts, which represents bibliometric analysis for literature-based discovery research.
- Who is the target audience and what are scientific applications of this package?
Target Audience: LBDiscover is designed for biomedical researchers, bioinformaticians, and data scientists working in literature-based discovery (LBD). The primary users include:
- Biomedical researchers seeking hidden connections between diseases, drugs, and genes
- Pharmaceutical researchers exploring drug repurposing opportunities
- Bioinformaticians building knowledge networks from literature
- Graduate students and academics studying computational approaches to hypothesis generation
Scientific Applications:
The package supports several key research applications:
- Drug Discovery and Repurposing: LBD has been used extensively in drug development and repurposing as well as predicting adverse drug reactions
- Disease-Gene Association Discovery: Using literature-based discovery to identify disease candidate genes
- Biomarker Identification: LBD has been explored as a tool for identifying diagnostic and prognostic biomarkers for diseases
- Hypothesis Generation: Creating testable scientific hypotheses by connecting disparate pieces of literature
- Knowledge Network Construction: Building co-occurrence networks to visualize research landscapes
- Are there other R packages that accomplish the same thing? If so, how does yours differ or meet our criteria for best-in-category?
There are several R packages that overlap with LBDiscover's functionality, but none provide the same comprehensive approach to literature-based discovery:
Similar Packages and Key Differences:
- pubmed.mineR
Overlap: PubMed text mining with functions for data visualization and biomedical entity extraction
Difference: Focuses on general text mining and clustering rather than implementing specific LBD models like ABC, AnC, LSI, and BITOLA
- bibliometrix
Overlap: Comprehensive science mapping analysis with network analysis capabilities and bibliometric workflows
Difference: Designed for general scientometric analysis across all disciplines, not specifically for biomedical literature-based discovery or implementing LBD-specific algorithms
- Data Retrieval Packages (rentrez, easyPubMed, RISmed)
Overlap: All provide interfaces to NCBI/PubMed for retrieving biomedical literature
Difference: These focus solely on data retrieval and don't perform LBD analysis, entity extraction, or hypothesis generation
How LBDiscover Meets Best-in-Category Criteria:
- Unique Functionality: LBDiscover is the first R package to specifically implement established LBD models:
- ABC Model: The most basic and widespread type of LBD centered around finding connections between concepts A, B, and C
- BITOLA: An interactive literature-based biomedical discovery support system using semantic prediction
- LSI (Latent Semantic Indexing): A statistical technique for improving information retrieval effectiveness used to assist in literature-based discoveries
- AnC Model: An extension of the ABC model that connects A and C through multiple intermediate B terms rather than a single one
- Integrated Workflow: Unlike other packages that handle only one aspect (retrieval OR analysis OR visualization), LBDiscover provides a complete workflow from data retrieval through entity extraction to discovery model application and network visualization.
- Biomedical Specialization: While bibliometrix serves general scientometrics and pubmed.mineR does general text mining, LBDiscover is specifically designed for biomedical literature-based discovery with domain-specific entity recognition (diseases, drugs, genes).
- Modern Implementation: Recent work has focused on integrating Large Language Models into literature-based discovery, and LBDiscover is positioned to incorporate such advances while retaining its established methodological foundations.
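To make the ABC model mentioned above concrete, here is a minimal, language-agnostic sketch in Python. It is not LBDiscover's R API: the toy corpus, the function names, and the min-count scoring are all assumptions chosen for illustration. The idea is to find B terms that co-occur with a starting term A, then rank C terms that co-occur with those B terms but never with A directly.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical toy corpus: each "document" is the set of entities found in
# one abstract. Loosely modeled on the classic fish oil / Raynaud example.
docs = [
    {"fish oil", "blood viscosity"},
    {"fish oil", "platelet aggregation"},
    {"blood viscosity", "raynaud disease"},
    {"platelet aggregation", "raynaud disease"},
    {"fish oil", "blood viscosity", "platelet aggregation"},
    {"aspirin", "platelet aggregation"},
]


def cooccurrence(docs):
    """Count the documents in which each unordered pair of terms co-occurs."""
    counts = defaultdict(int)
    for terms in docs:
        for a, b in combinations(sorted(terms), 2):
            counts[(a, b)] += 1
    return counts


def neighbors(counts, term):
    """Return {other_term: co-occurrence count} for one term."""
    out = {}
    for (a, b), n in counts.items():
        if a == term:
            out[b] = n
        elif b == term:
            out[a] = n
    return out


def abc_candidates(docs, a_term):
    """Rank C terms linked to a_term only through intermediate B terms."""
    counts = cooccurrence(docs)
    b_terms = neighbors(counts, a_term)  # terms directly linked to A
    scores = defaultdict(int)
    for b, ab in b_terms.items():
        for c, bc in neighbors(counts, b).items():
            # Keep C only if it has no direct link to A (an "undiscovered" link).
            if c != a_term and c not in b_terms:
                # Illustrative score: the weaker of the A-B and B-C links,
                # summed over all B terms that bridge A and C.
                scores[c] += min(ab, bc)
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


candidates = abc_candidates(docs, "fish oil")
print(candidates)  # [('raynaud disease', 2), ('aspirin', 1)]
```

In this toy corpus, "raynaud disease" ranks first because two separate B terms bridge it to "fish oil", even though the two never appear in the same document. Real LBD pipelines typically add entity typing, statistical filtering, and more careful scoring on top of this basic pattern.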
- (If applicable) Does your package comply with our guidance around Ethics, Data Privacy and Human Subjects Research?
NA
- If you made a pre-submission inquiry, please paste the link to the corresponding issue, forum post, or other discussion, or @tag the editor you contacted.
- Explain reasons for any pkgcheck items which your package is unable to pass.
None
Technical checks
Confirm each of the following by checking the box.
- I have read the rOpenSci packaging guide.
- I have read the author guide and I expect to maintain this package for at least 2 years or to find a replacement.
This package:
- does not violate the Terms of Service of any service it interacts with.
- has a CRAN and OSI accepted license.
- contains a README with instructions for installing the development version.
- includes documentation with examples for all functions, created with roxygen2.
- contains a vignette with examples of its essential functions and uses.
- has a test suite.
- has continuous integration, including reporting of test coverage.
Publication options
- Do you intend for this package to go on CRAN?
- Do you intend for this package to go on Bioconductor?
- Do you wish to submit an Applications Article about your package to Methods in Ecology and Evolution? If so:
MEE Options
- The package is novel and will be of interest to the broad readership of the journal.
- The manuscript describing the package is no longer than 3000 words.
- You intend to archive the code for the package in a long-term repository which meets the requirements of the journal (see MEE's Policy on Publishing Code)
- (Scope: Do consider MEE's Aims and Scope for your manuscript. We make no guarantee that your manuscript will be within MEE scope.)
- (Although not required, we strongly recommend having a full manuscript prepared when you submit here.)
- (Please do not submit your package separately to Methods in Ecology and Evolution)
Code of conduct
- I agree to abide by rOpenSci's Code of Conduct during the review process and in maintaining my package should it be accepted.