auto_research.utils.stored_info module

class Storage(path, debug=False)

Bases: LLM_utils.storage.Storage_base

A class for managing and storing information about papers in a structured format.

This class provides functionality to add, retrieve, and check information about papers stored in a JSON-like structure. It supports adding papers by path or name, adding information to specific papers, and retrieving or checking information based on specified criteria.

information

A dictionary storing paper information, where keys are paper names and values are nested dictionaries containing information types and their corresponding trials.

Type:

dict
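As an illustration of the layout described above, the nested dictionary might look like the following. The string trial keys, the empty entry for a paper without information, and the sample content are assumptions made for this sketch, not values confirmed by the library.

```python
# Hypothetical snapshot of Storage.information:
# paper name -> info type -> trial number -> content.
information = {
    "paper1.pdf": {
        "summary": {
            "1": "First attempt at a summary.",
            "2": "Revised summary.",
        },
    },
    "paper2.pdf": {},  # assumed shape for a paper with no information yet
}

# Walk the three levels to reach one stored item.
second_summary = information["paper1.pdf"]["summary"]["2"]
print(second_summary)
```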

Example

>>> storage = Storage("/path/to/storage")
>>> storage.add_papers_by_name(["paper1.pdf", "paper2.pdf"])
>>> storage.add_info_to_a_paper("paper1.pdf", "summary", "This is a summary.")
>>> storage.get_info(paper_list=["paper1.pdf"], type_list=["summary"])
---Paper name: paper1.pdf, Info type: summary, Trial number: 1---
This is a summary.
add_papers_by_path(path_to_paper)

Add the file names of all papers in a specified directory as keys to the info dictionary.

Parameters:

path_to_paper (str) – The path to the directory containing the paper files.

Returns:

None

Return type:

None

Example

>>> storage = Storage("/path/to/storage")
>>> storage.add_papers_by_path("/path/to/papers")
add_papers_by_name(list_of_papers)

Add the file names of all papers in a provided list as keys to the info dictionary.

Parameters:
  • list_of_papers (list[str]) – A list of paper file names to be added to the info dictionary.

Returns:

None

Return type:

None

Example

>>> storage = Storage("/path/to/storage")
>>> storage.add_papers_by_name(["paper1.pdf", "paper2.pdf"])
add_info_to_a_paper(paper_name, info_type, info_content, info_trial=None)

Add information to a specific paper in the info dictionary.

If the paper or info type does not exist, they are initialized. If a trial number is not provided, the next available trial number is automatically assigned.

Parameters:
  • paper_name (str) – The name of the paper to which information will be added.

  • info_type (str) – The type of information to be added (e.g., ‘summary’, ‘analysis’).

  • info_content (str) – The content of the information to be stored.

  • info_trial (Optional[int]) – The trial number for the information. If None, the next available trial number is used.

Returns:

None

Raises:

ValueError – If the specified trial number already exists for the given paper and info type.

Return type:

None

Example

>>> storage = Storage("/path/to/storage")
>>> storage.add_papers_by_name(["paper1.pdf"])
>>> storage.add_info_to_a_paper("paper1.pdf", "summary", "This is a summary.")
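The trial numbering described for add_info_to_a_paper can be sketched on a plain dictionary. The helper name add_info, the string trial keys, and the "highest existing trial plus one" rule are assumptions for illustration; the library's own implementation may differ, and the real method returns None rather than the trial number.

```python
def add_info(information, paper_name, info_type, info_content, info_trial=None):
    """Hypothetical stand-in for add_info_to_a_paper, operating on a plain dict."""
    # Initialize the paper and info type if they do not exist yet.
    trials = information.setdefault(paper_name, {}).setdefault(info_type, {})
    if info_trial is None:
        # Assumed rule: next trial = highest existing trial number + 1.
        info_trial = max((int(k) for k in trials), default=0) + 1
    if str(info_trial) in trials:
        raise ValueError(
            f"Trial {info_trial} already exists for {paper_name!r} / {info_type!r}."
        )
    trials[str(info_trial)] = info_content
    return info_trial  # returned only to make the sketch easy to inspect


information = {}
first = add_info(information, "paper1.pdf", "summary", "Draft.")
second = add_info(information, "paper1.pdf", "summary", "Revision.")
try:
    add_info(information, "paper1.pdf", "summary", "Clash.", info_trial=1)
    duplicate_rejected = False
except ValueError:
    duplicate_rejected = True
```

Here the first two calls receive automatically assigned trial numbers, and the third is rejected because trial 1 is already taken.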
get_info(paper_list=None, type_list=None, trial_list=None, verbose=False)

Retrieve information from the info dictionary based on the specified criteria.

Parameters:
  • paper_list (Optional[list[str]]) – List of paper names to retrieve info for. If None, all papers are included.

  • type_list (Optional[list[str]]) – List of info types to retrieve info for. If None, all types are included.

  • trial_list (Optional[list[int]]) – List of trial numbers to retrieve info for. If None, all trials are included.

  • verbose (bool) – If True, prints messages for missing data.

Returns:

None. Matching entries are printed rather than returned.

Return type:

None

Example

>>> storage = Storage("/path/to/storage")
>>> storage.add_papers_by_name(["paper1.pdf"])
>>> storage.add_info_to_a_paper("paper1.pdf", "summary", "This is a summary.")
>>> storage.get_info(paper_list=["paper1.pdf"], type_list=["summary"])
---Paper name: paper1.pdf, Info type: summary, Trial number: 1---
This is a summary.
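The filtering rules for get_info, where None means "no restriction", can be sketched as a generator over a plain nested dictionary. The name iter_info and the string trial keys are hypothetical; only the printed header format is taken from the example above.

```python
def iter_info(information, paper_list=None, type_list=None, trial_list=None):
    """Hypothetical generator yielding (paper, info type, trial, content) matches."""
    for paper, types in information.items():
        if paper_list is not None and paper not in paper_list:
            continue
        for info_type, trials in types.items():
            if type_list is not None and info_type not in type_list:
                continue
            for trial, content in trials.items():
                if trial_list is not None and int(trial) not in trial_list:
                    continue
                yield paper, info_type, int(trial), content


information = {"paper1.pdf": {"summary": {"1": "This is a summary."}}}
matches = list(iter_info(information, paper_list=["paper1.pdf"], type_list=["summary"]))
for paper, info_type, trial, content in matches:
    print(f"---Paper name: {paper}, Info type: {info_type}, Trial number: {trial}---")
    print(content)
```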
check_info(paper_list=None, type_list=None, trial_list=None, verbose=False)

Check if the info dictionary contains the specified information.

Parameters:
  • paper_list (Optional[list[str]]) – List of paper names to check. If None, checks all papers.

  • type_list (Optional[list[str]]) – List of info types to check. If None, checks all types.

  • trial_list (Optional[list[int]]) – List of trial numbers to check. If None, checks all trials.

  • verbose (bool) – If True, returns a detailed results dictionary; if False, returns a boolean.

Returns:

If verbose=True, returns a nested dictionary with results of the existence check.

If verbose=False, returns a boolean indicating if all checks passed.

Return type:

Union[bool, dict]

Example

>>> storage = Storage("/path/to/storage")
>>> storage.add_papers_by_name(["paper1.pdf"])
>>> storage.add_info_to_a_paper("paper1.pdf", "summary", "This is a summary.")
>>> storage.check_info(paper_list=["paper1.pdf"], type_list=["summary"])
True
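The two return shapes of check_info can be illustrated with a simplified sketch. The function name check is hypothetical, the lists must be given explicitly (the None defaults are omitted for brevity), and the layout of the detailed dictionary is an assumption rather than the library's documented format.

```python
def check(information, paper_list, type_list, trial_list, verbose=False):
    """Hypothetical existence check over explicit paper, type, and trial lists."""
    results = {}
    for paper in paper_list:
        for info_type in type_list:
            trials = information.get(paper, {}).get(info_type, {})
            for trial in trial_list:
                found = str(trial) in trials
                results.setdefault(paper, {}).setdefault(info_type, {})[trial] = found
    if verbose:
        return results  # nested dict of per-item results
    # Boolean mode: True only if every requested item exists.
    return all(
        found
        for types in results.values()
        for trials in types.values()
        for found in trials.values()
    )


information = {"paper1.pdf": {"summary": {"1": "This is a summary."}}}
all_present = check(information, ["paper1.pdf"], ["summary"], [1, 2])
details = check(information, ["paper1.pdf"], ["summary"], [1, 2], verbose=True)
```

With trial 2 missing, the boolean form reports failure while the verbose form shows which item was absent.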
static get_latest_trial(info_papers)

Retrieve the latest trial information from a given dictionary of trials.

Parameters:

info_papers (dict) – A dictionary containing trial information.

Returns:

The value corresponding to the latest trial. Returns None if the dictionary is empty.

Return type:

Optional[str]

Example

>>> Storage.get_latest_trial({"1": "Trial 1", "2": "Trial 2"})
'Trial 2'
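One plausible reading of "latest trial" is the value stored under the numerically largest key. The sketch below assumes that reading and uses a hypothetical function name; the library may instead rely on insertion order or string comparison.

```python
def latest_trial(trials):
    """Hypothetical equivalent: value stored under the numerically largest key."""
    if not trials:
        return None
    # Compare keys as integers so that "10" ranks above "9".
    return trials[max(trials, key=int)]


latest = latest_trial({"9": "Trial 9", "10": "Trial 10"})
empty = latest_trial({})
```

Comparing keys as integers matters once there are ten or more trials, because the string "9" sorts after the string "10".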
__init__(path, debug=False)
classmethod auto_load_save(method)

Decorator to automatically call self.load_info() before the method and self.save_info() after the method.
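A load-before, save-after decorator of this kind is commonly written with functools.wraps. The sketch below is a module-level function rather than a classmethod, and DemoStorage is a hypothetical toy class that only records call order, so it shows the pattern rather than the library's actual code.

```python
import functools


def auto_load_save(method):
    """Hypothetical decorator: load before the wrapped method, save after it."""
    @functools.wraps(method)
    def wrapper(self, *args, **kwargs):
        self.load_info()
        result = method(self, *args, **kwargs)
        self.save_info()
        return result
    return wrapper


class DemoStorage:
    """Toy class that records the call order instead of touching a file."""

    def __init__(self):
        self.calls = []

    def load_info(self):
        self.calls.append("load")

    def save_info(self):
        self.calls.append("save")

    @auto_load_save
    def add(self, value):
        self.calls.append(f"add:{value}")
        return value


demo = DemoStorage()
returned = demo.add("x")
print(demo.calls)
```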

load_info()

Load the information from a JSON file.

save_info()

Save the information to a JSON file in a human-readable format.
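A minimal sketch of the load_info and save_info round trip, written as free functions with the standard json module. The file name, the four-space indent, and the empty-dictionary fallback for a missing file are assumptions; the base class may handle these differently.

```python
import json
import os
import tempfile


def save_info(information, path):
    """Write the dictionary as indented JSON (indent=4 is an assumed choice)."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(information, f, indent=4, ensure_ascii=False)


def load_info(path):
    """Read the dictionary back; assume an empty dict if the file is missing."""
    if not os.path.exists(path):
        return {}
    with open(path, "r", encoding="utf-8") as f:
        return json.load(f)


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "info.json")  # hypothetical file name
    before = load_info(path)
    save_info({"paper1.pdf": {"summary": {"1": "This is a summary."}}}, path)
    after = load_info(path)
```

Because JSON object keys are always strings, integer trial numbers would come back as strings after such a round trip.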