auto_research.search.data_retrival module

download_pdf(url, filename, folder=None, timeout=10)[source]

Downloads a PDF file from the specified URL and saves it to the given filename and folder.

Parameters:
  • url (str) – The URL of the PDF file to download.

  • filename (str) – The name of the file to save the PDF as.

  • folder (Optional[str]) – The folder to save the PDF in. If None, saves in the current directory.

  • timeout (int) – The timeout for the request in seconds. Defaults to 10.

Returns:

True if the download was successful and the file is not corrupted, False otherwise.

Return type:

bool

Example

>>> download_pdf("http://example.com/sample.pdf", "sample.pdf", folder="pdfs")
Downloaded: sample.pdf
True
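
The return value above depends on the file being "not corrupted", but this page does not say how that is checked. As a minimal sketch, assuming a cheap structural check is enough, the helper below (looks_like_pdf is a hypothetical name, not part of this module) tests for the %PDF- header and an %%EOF marker near the end of the file:

```python
import os


def looks_like_pdf(path: str) -> bool:
    """Hypothetical sanity check; not necessarily what download_pdf does."""
    try:
        with open(path, "rb") as fh:
            header = fh.read(5)              # a PDF starts with b"%PDF-"
            fh.seek(0, os.SEEK_END)
            size = fh.tell()
            fh.seek(max(0, size - 1024))     # the trailer sits near the end
            tail = fh.read()
    except OSError:
        # A missing or unreadable file is treated as a failed download.
        return False
    return header == b"%PDF-" and b"%%EOF" in tail
```

A check like this catches truncated downloads and HTML error pages saved under a .pdf name, but it does not prove that the file can be rendered.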

get_paper_details_from_semantic_scholar(title, verbose=False)[source]

Retrieves paper details (abstract and venue) from Semantic Scholar based on the paper title.

Parameters:
  • title (str) – The title of the paper to search for.

  • verbose (bool) – If True, prints error messages. Defaults to False.

Returns:

A tuple containing the abstract and venue of the paper.

Returns None if no data is found or if an error occurs.

Return type:

Optional[Tuple[str, str]]

Example

>>> get_paper_details_from_semantic_scholar("Attention is All You Need")
("Abstract text...", "NeurIPS")
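
To illustrate the Optional[Tuple[str, str]] contract, here is a rough sketch of how a search response could be reduced to an (abstract, venue) pair. It assumes a JSON payload with a "data" list of records that carry "abstract" and "venue" keys, as a paper search endpoint might return. Both the function name parse_search_response and the payload shape are assumptions, not taken from this module:

```python
from typing import Any, Dict, Optional, Tuple


def parse_search_response(payload: Dict[str, Any]) -> Optional[Tuple[str, str]]:
    """Hypothetical parser: return (abstract, venue) for the first hit, else None."""
    records = payload.get("data") or []
    if not records:
        # No match for the title: mirror the documented None return.
        return None
    first = records[0]
    # Either field may be missing or null; fall back to an empty string.
    abstract = first.get("abstract") or ""
    venue = first.get("venue") or ""
    return abstract, venue
```

Callers should test the result against None before unpacking it, since unpacking None raises a TypeError.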

get_arxiv_paper_details(title)[source]

Retrieves paper details (title, abstract, PDF link, and venue) from arXiv based on the paper title.

Parameters:
  • title (str) – The title of the paper to search for.

Returns:

A tuple containing the paper title, abstract, PDF link, and venue. Returns None if no data is found.

Return type:

Optional[Tuple[str, str, str, str]]

Example

>>> get_arxiv_paper_details("Attention is All You Need")
("Attention is All You Need", "Abstract text...", "http://arxiv.org/pdf/...", "arXiv")
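
Because both lookup functions return None on a miss, they can be chained. The sketch below is one possible way to do that: it tries the arXiv-style lookup first and falls back to the Semantic Scholar-style lookup. lookup_paper and the dictionary keys are hypothetical. The two lookups are passed in as callables so the example runs without network access:

```python
from typing import Callable, Dict, Optional, Tuple

ArxivLookup = Callable[[str], Optional[Tuple[str, str, str, str]]]
ScholarLookup = Callable[[str], Optional[Tuple[str, str]]]


def lookup_paper(
    title: str,
    arxiv_lookup: ArxivLookup,
    scholar_lookup: ScholarLookup,
) -> Optional[Dict[str, Optional[str]]]:
    """Hypothetical helper: merge both lookups into one record, or None."""
    arxiv = arxiv_lookup(title)
    if arxiv is not None:
        found_title, abstract, pdf_url, venue = arxiv
        return {"title": found_title, "abstract": abstract,
                "pdf_url": pdf_url, "venue": venue}
    scholar = scholar_lookup(title)
    if scholar is not None:
        abstract, venue = scholar
        # This source provides no PDF link, so the field stays None.
        return {"title": title, "abstract": abstract,
                "pdf_url": None, "venue": venue}
    return None
```

In real use, get_arxiv_paper_details and get_paper_details_from_semantic_scholar would be passed as the two callables, and a non-None "pdf_url" could then be handed to download_pdf.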