SearchClient API Reference

The SearchClient is the main interface for searching Europe PMC’s database of scientific literature.

Class Overview

from pyeuropepmc.search import SearchClient

class SearchClient:
    """Client for searching Europe PMC REST API."""

Constructor

SearchClient(rate_limit_delay=1.0, timeout=30, max_retries=3)

Create a new SearchClient instance.

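Parameters:

rate_limit_delay (default 1.0): delay between consecutive requests.
timeout (default 30): request timeout.
max_retries (default 3): maximum number of retry attempts.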

Example:

client = SearchClient(rate_limit_delay=2.0, timeout=60)

Methods

search(query, **kwargs)

Search Europe PMC and return results.

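Parameters:

query: the search query string.
**kwargs: additional search options, such as the pageSize, format, sort, source, and offset options used in the examples on this page.

Returns:

The search results. In the examples on this page the result is read as a dictionary with hitCount and resultList keys.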

Example:

results = client.search(
    "CRISPR gene editing",
    pageSize=50,
    format="json",
    sort="CITED desc"
)

search_and_parse(query, **kwargs)

Search and automatically parse results into structured data.

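Parameters:

query: the search query string.
**kwargs: additional search options, such as pageSize and sort.

Returns:

The parsed paper records. In the example below each record is read with .get(), as a dictionary would be.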

Example:

papers = client.search_and_parse(
    "COVID-19 vaccine",
    pageSize=25,
    sort="CITED desc"
)

for paper in papers:
    print(f"Citations: {paper.get('citedByCount', 0)}")
    print(f"Title: {paper.get('title')}")

get_hit_count(query)

Get the total number of results for a query without retrieving them.

Parameters:

query: the search query string.

Returns:

The total number of results that match the query.

Example:

count = client.get_hit_count("machine learning")
print(f"Found {count} papers")
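
One use for the hit count is to estimate how many requests a full retrieval would take before starting it. The helper below is a hypothetical sketch, not part of pyeuropepmc; pages_needed and the sample count are made up for illustration.

```python
import math


def pages_needed(hit_count: int, page_size: int = 100) -> int:
    """Return how many page requests are needed to retrieve hit_count results."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(hit_count / page_size)


# Illustrative number, not a real Europe PMC hit count.
# In practice this would come from client.get_hit_count(query).
print(pages_needed(2345, page_size=100))  # 24
```

Multiplying the page count by the configured rate_limit_delay gives a rough lower bound on how long the retrieval will take.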

fetch_all_pages(query, max_results=None, **kwargs)

Automatically fetch all pages of results up to a maximum limit.

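Parameters:

query: the search query string.
max_results (default None): upper limit on the number of results to fetch.
**kwargs: additional search options, such as pageSize.

Returns:

The results collected across all fetched pages.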

Example:

all_papers = client.fetch_all_pages(
    "cancer immunotherapy",
    max_results=1000,
    pageSize=100
)
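
To make the idea of automatic pagination concrete, here is a minimal sketch of how such a loop could work. It is not the library's implementation: collect_pages and fake_search are hypothetical names, and the response shape (resultList, result) and the offset keyword are assumed from the examples on this page. The stand-in search function lets the sketch run without network access.

```python
from typing import Any, Callable, Dict, List, Optional


def collect_pages(
    search: Callable[..., Dict[str, Any]],
    query: str,
    page_size: int = 100,
    max_results: Optional[int] = None,
) -> List[Dict[str, Any]]:
    """Call search(query, pageSize=..., offset=...) until results run out."""
    collected: List[Dict[str, Any]] = []
    offset = 0
    while max_results is None or len(collected) < max_results:
        response = search(query, pageSize=page_size, offset=offset)
        page = response.get("resultList", {}).get("result", [])
        if not page:
            break  # an empty page means there is nothing left to fetch
        collected.extend(page)
        offset += len(page)
    # Trim any overshoot from the last page.
    return collected if max_results is None else collected[:max_results]


def fake_search(query, pageSize=25, offset=0):
    """Stand-in for client.search that serves 250 fake records."""
    records = [{"id": str(i)} for i in range(250)]
    return {
        "hitCount": len(records),
        "resultList": {"result": records[offset:offset + pageSize]},
    }


papers = collect_pages(fake_search, "cancer immunotherapy",
                       page_size=100, max_results=220)
print(len(papers))  # 220
```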

Context Manager Usage

SearchClient can be used as a context manager, which cleans up its resources automatically when the block exits:

with SearchClient() as client:
    results = client.search("neural networks")
# The client is closed automatically once the with block exits

Error Handling

The SearchClient raises EuropePMCError for API-related issues:

from pyeuropepmc.search import SearchClient, EuropePMCError

try:
    with SearchClient() as client:
        results = client.search("invalid query syntax")
except EuropePMCError as e:
    print(f"Search failed: {e}")
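
When running several queries in a batch, it can be useful to record failures per query instead of aborting on the first one. The sketch below shows one way to do that. It defines a local stand-in EuropePMCError and a fake_search function so it runs offline; run_queries is a hypothetical helper, not part of pyeuropepmc. With the real library, you would import the exception as shown above and pass client.search instead.

```python
class EuropePMCError(Exception):
    """Stand-in for the library's exception so this sketch runs offline."""


def run_queries(search, queries):
    """Run each query, collecting results and error messages separately."""
    results, failures = {}, {}
    for query in queries:
        try:
            results[query] = search(query)
        except EuropePMCError as exc:
            failures[query] = str(exc)
    return results, failures


def fake_search(query):
    """Stand-in for client.search that rejects blank queries."""
    if not query.strip():
        raise EuropePMCError("empty query")
    return {"hitCount": 1}


ok, failed = run_queries(fake_search, ["CRISPR", " "])
print(sorted(ok))  # ['CRISPR']
print(failed)      # {' ': 'empty query'}
```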

Rate Limiting

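Built-in rate limiting spaces out requests so the client does not overload the API. Set the delay with the rate_limit_delay constructor argument:

# Default delay (recommended)
client = SearchClient(rate_limit_delay=1.0)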

# More conservative
client = SearchClient(rate_limit_delay=2.0)
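
For intuition, a fixed-delay rate limiter can be sketched in a few lines. This is an illustration of the general technique, not the library's internals: MinDelay is a hypothetical class, and it assumes the delay is expressed in seconds, which this page does not state.

```python
import time


class MinDelay:
    """Ensure consecutive calls to wait() are at least `delay` seconds apart."""

    def __init__(self, delay, clock=time.monotonic, sleep=time.sleep):
        self.delay = delay
        self._clock = clock    # injectable for testing
        self._sleep = sleep    # injectable for testing
        self._last = None      # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.delay - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


limiter = MinDelay(0.05)
start = time.monotonic()
for _ in range(3):
    limiter.wait()  # a request would be sent here
elapsed = time.monotonic() - start
print(f"3 calls took {elapsed:.2f}s")
```

The first call goes through immediately; each later call sleeps only for the part of the delay that has not already elapsed.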

Examples

Basic Search

from pyeuropepmc.search import SearchClient

with SearchClient() as client:
    results = client.search("CRISPR", pageSize=10)

    print(f"Total results: {results['hitCount']}")

    for paper in results["resultList"]["result"]:
        print(f"Title: {paper.get('title')}")
        print(f"Authors: {paper.get('authorString')}")
        print(f"Year: {paper.get('pubYear')}")
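
The same fields can be turned into short reference lines with a small helper. The sketch below assumes the response shape used above (resultList, result, title, authorString, pubYear). The summarize function is hypothetical and the sample response is made-up placeholder data, not real Europe PMC output.

```python
def summarize(response, limit=5):
    """Format up to `limit` results as 'Authors (Year). Title' strings."""
    papers = response.get("resultList", {}).get("result", [])
    lines = []
    for paper in papers[:limit]:
        # Fall back to placeholders when a field is missing.
        title = paper.get("title", "(no title)")
        authors = paper.get("authorString", "(no authors)")
        year = paper.get("pubYear", "n.d.")
        lines.append(f"{authors} ({year}). {title}")
    return lines


# Placeholder data shaped like the response in the example above.
sample = {
    "hitCount": 2,
    "resultList": {
        "result": [
            {"title": "Example title A", "authorString": "Doe J, Roe R",
             "pubYear": "2020"},
            {"title": "Example title B"},
        ]
    },
}

for line in summarize(sample):
    print(line)
```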

Advanced Search with Filtering

with SearchClient() as client:
    results = client.search(
        query="cancer immunotherapy",
        source="MED",  # PubMed only
        sort="CITED desc",  # Most cited first
        pageSize=50,
        format="json"
    )

Pagination

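from pyeuropepmc.search import SearchClient

with SearchClient() as client:
    # Manual pagination: request each page by its offset
    page1 = client.search("machine learning", pageSize=100, offset=0)
    page2 = client.search("machine learning", pageSize=100, offset=100)

    # Automatic pagination: collect up to 500 results
    all_results = client.fetch_all_pages("machine learning", max_results=500)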

Citation Analysis

from pyeuropepmc.search import SearchClient

with SearchClient() as client:
    papers = client.search_and_parse(
        "artificial intelligence",
        pageSize=100,
        sort="CITED desc"
    )

# Analyze citation distribution; a missing count is treated as 0
citations = [p.get('citedByCount', 0) for p in papers]
highly_cited = [p for p in papers if p.get('citedByCount', 0) > 100]
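
From there, summary statistics follow directly with the standard library. The sketch below uses made-up placeholder records in place of real search results; citation_summary is a hypothetical helper, not part of pyeuropepmc.

```python
from statistics import mean, median


def citation_summary(papers, threshold=100):
    """Summarize citedByCount values; records without a count are treated as 0."""
    counts = [int(p.get("citedByCount", 0)) for p in papers]
    if not counts:
        return {"n": 0, "mean": 0.0, "median": 0.0, "highly_cited": 0}
    return {
        "n": len(counts),
        "mean": mean(counts),
        "median": median(counts),
        "highly_cited": sum(1 for c in counts if c > threshold),
    }


# Placeholder records, not real Europe PMC data.
sample_papers = [
    {"title": "Paper A", "citedByCount": 250},
    {"title": "Paper B", "citedByCount": 40},
    {"title": "Paper C"},  # no citation count available
    {"title": "Paper D", "citedByCount": 120},
]

summary = citation_summary(sample_papers)
print(summary)
```

Because citation counts are usually heavily skewed, the median is often a more representative figure than the mean.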