auto_research.search.core module
- class AutoSearch(keywords, num_results=30, delay=1, sort_by='relevance', date_cutoff='2024-01-01', score_threshold=0.5, recency_weight=3.5, auto_destination=False, destination_folder='search_results', zip_folder=True)[source]
Bases: object
A class to search for academic papers by keyword(s), retrieve their details, and optionally download them.
- Parameters:
- keywords
The keyword(s) to search for. If a string, performs a single search. If a list, performs multiple searches.
- sort_by
The sorting criteria (“date” or “relevance”).
- score_threshold
The minimum combined score for papers to be displayed/downloaded. The combined score is calculated differently based on the sorting criteria:
If sorting by “date”:
\[\text{combined_score} = \frac{\text{citation_count}}{\left(\frac{365 + \text{days_ago}}{365}\right)^{\text{recency_weight}}}\]
where \(\text{days_ago}\) is the number of days since the paper was published.
If sorting by “relevance”:
\[\text{combined_score} = \frac{\text{citation_count}}{\text{recency}^{\text{recency_weight}}}\]
where \(\text{recency}\) is the number of years since the paper was published.
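- The \(\text{recency_weight}\) parameter controls how much weight is given to the recency of the paper. In both formulas the age term is raised to this power, so for papers more than a year old a larger value lowers the combined score more strongly.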
- Type:
Example
>>> search = AutoSearch("machine learning", num_results=10)
>>> search.run()
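The two combined-score formulas given for score_threshold can be sketched in plain Python. This is a minimal illustration of the documented formulas only, not the library's actual code; the function names score_by_date and score_by_relevance are hypothetical and are not part of auto_research.

```python
def score_by_date(citation_count: float, days_ago: int,
                  recency_weight: float = 3.5) -> float:
    """Combined score per the documented formula for sort_by="date"."""
    # The age term is (365 + days_ago) / 365, i.e. 1.0 for a paper published today.
    age_term = (365 + days_ago) / 365
    return citation_count / age_term ** recency_weight


def score_by_relevance(citation_count: float, recency_years: float,
                       recency_weight: float = 3.5) -> float:
    """Combined score per the documented formula for sort_by="relevance"."""
    # recency_years == 0 would divide by zero here; how the library treats
    # that case is not documented, so this sketch leaves it unguarded.
    return citation_count / recency_years ** recency_weight


if __name__ == "__main__":
    # A paper with 100 citations published one year (365 days) ago.
    print(score_by_date(100, 365))
    # A paper with 100 citations published two years ago.
    print(score_by_relevance(100, 2))
```

With the default recency_weight of 3.5, the age term grows quickly, so under these formulas an older paper needs substantially more citations to reach the same score as a recent one.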
- __init__(keywords, num_results=30, delay=1, sort_by='relevance', date_cutoff='2024-01-01', score_threshold=0.5, recency_weight=3.5, auto_destination=False, destination_folder='search_results', zip_folder=True)[source]
Initialize the AutoSearch class with the given parameters.
- Parameters:
num_results (int) – The number of results to retrieve.
delay (int) – Delay between requests.
sort_by (str) – Sorting criteria (“date” or “relevance”).
date_cutoff (str) – Cutoff date for date-based search (format: “YYYY-MM-DD”).
score_threshold (float) – Minimum combined score for papers.
recency_weight (float) – Weight for recency in combined score calculation.
auto_destination (bool) – Whether to auto-generate the destination folder name.
destination_folder (str) – Folder to save downloaded papers.
zip_folder (bool) – Whether to zip the downloaded papers.
- Return type:
None
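The date_cutoff parameter is documented as a “YYYY-MM-DD” string. A minimal sketch of how such a string can be parsed and compared with the standard library follows. The helper names days_since and is_on_or_after_cutoff are hypothetical, and treating the cutoff as inclusive (papers published on or after it are kept) is an assumption; the documentation above does not say how the library applies it.

```python
from datetime import date


def days_since(published: str, today: date) -> int:
    """Number of days between a "YYYY-MM-DD" publication date and `today`."""
    return (today - date.fromisoformat(published)).days


def is_on_or_after_cutoff(published: str, date_cutoff: str = "2024-01-01") -> bool:
    """True if the publication date is on or after the cutoff (assumed inclusive)."""
    return date.fromisoformat(published) >= date.fromisoformat(date_cutoff)


if __name__ == "__main__":
    print(days_since("2024-01-01", date(2024, 1, 31)))
    print(is_on_or_after_cutoff("2023-12-31"))
```

Parsing into date objects, rather than comparing strings, also rejects malformed values: date.fromisoformat raises ValueError for input that is not a valid ISO date.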
- search_papers_by_keyword(keyword)[source]
Search for papers by a given keyword and retrieve their details.
- Parameters:
keyword (str) – The keyword to search for.
- Returns:
A list of dictionaries containing paper details.
- Return type:
Example
>>> search = AutoSearch("machine learning")
>>> papers = search.search_papers_by_keyword("machine learning")
>>> len(papers) > 0
True
- display_and_download(papers_info, verbose=True)[source]
Display the details of the papers and optionally download them.
- Parameters:
- Return type:
None
Example
>>> search = AutoSearch("machine learning")
>>> papers = search.search_papers_by_keyword("machine learning")
>>> search.display_and_download(papers)
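score_threshold is described above as the minimum combined score for papers to be displayed or downloaded. A minimal sketch of that filtering step over a list of paper dictionaries follows. The function filter_by_threshold and the dictionary keys "title" and "combined_score" are hypothetical, since the keys returned by search_papers_by_keyword are not documented here, and the inclusive comparison is likewise an assumption.

```python
from typing import Any


def filter_by_threshold(papers: list[dict[str, Any]],
                        score_threshold: float = 0.5,
                        score_key: str = "combined_score") -> list[dict[str, Any]]:
    """Keep papers whose score meets the threshold, highest score first."""
    # Papers missing the (hypothetical) score key are treated as scoring 0.0.
    kept = [p for p in papers if p.get(score_key, 0.0) >= score_threshold]
    return sorted(kept, key=lambda p: p.get(score_key, 0.0), reverse=True)


if __name__ == "__main__":
    sample = [
        {"title": "A", "combined_score": 0.2},
        {"title": "B", "combined_score": 1.5},
        {"title": "C", "combined_score": 0.5},
    ]
    for paper in filter_by_threshold(sample):
        print(paper["title"], paper["combined_score"])
```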