Code Availability Check

This script demonstrates how to use AutoSurvey from the auto_research.survey.core module to:

  • Select a PDF file from a specified folder.

  • Retrieve an API key for the LLM (Large Language Model).

  • Format a base prompt for code availability checks.

  • Test the availability of code on GitHub.

To get started with the package, you need to set up API keys. For detailed instructions, see Setting up API keys for LLMs.

This script assumes that:

  • At least one valid PDF file of an article is available (located in “sample_articles/”).

  • A valid key.json file is available (located in the current working directory, “”).
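
The two assumptions above can be checked before running the script. The following is a minimal sketch using only the standard library; check_prerequisites is a hypothetical helper, not part of the auto_research package, and its default arguments simply mirror the paths used in this example.

```python
from pathlib import Path


def check_prerequisites(pdf_folder: str = "sample_articles/", key_folder: str = "") -> list[str]:
    """Return a list of problems found; an empty list means both assumptions hold.

    Hypothetical helper for illustration only (not part of auto_research).
    """
    problems: list[str] = []

    folder = Path(pdf_folder)
    # Compare suffixes case-insensitively so that "paper.PDF" also counts.
    if folder.is_dir():
        pdfs = [p for p in folder.iterdir() if p.suffix.lower() == ".pdf"]
    else:
        pdfs = []
    if not pdfs:
        problems.append(f"no PDF files found in {folder}")

    # An empty key_folder refers to the current working directory.
    key_file = Path(key_folder) / "key.json"
    if not key_file.is_file():
        problems.append(f"{key_file} not found")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing prerequisite:", problem)
```

This only verifies that the files exist; it does not validate the contents of key.json or whether the PDF can be parsed.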

The process involves user interaction, including selecting a PDF file.
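
The interactive step can be pictured roughly as follows. This is a minimal sketch, not the actual implementation of select_pdf_file: choose_pdf and its ask parameter are hypothetical names, and the assumption that the result is a (file name, file path) pair is inferred only from how the script unpacks the return value.

```python
from pathlib import Path
from typing import Callable


def choose_pdf(folder: str, ask: Callable[[str], str] = input) -> tuple[str, str]:
    """Hypothetical stand-in for an interactive PDF picker.

    Lists the PDF files in `folder`, asks for an index until a valid one is
    given, and returns (file name, file path).
    """
    pdf_names = sorted(p.name for p in Path(folder).iterdir() if p.suffix.lower() == ".pdf")
    if not pdf_names:
        raise FileNotFoundError(f"no PDF files found in {folder!r}")

    print("Available PDF files:")
    for index, name in enumerate(pdf_names):
        print(f"{index}: {name}")

    while True:
        answer = ask("Enter the index of the file you want to process: ")
        try:
            index = int(answer)
        except ValueError:
            index = -1  # Treat non-numeric input as out of range.
        if 0 <= index < len(pdf_names):
            chosen = pdf_names[index]
            return chosen, str(Path(folder) / chosen)
        print(f"Please enter a number between 0 and {len(pdf_names) - 1}.")
```

Passing the prompt function as a parameter keeps the sketch testable without a terminal; in normal use the default, input, reads from the keyboard.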

Below is an example output from the following input:

  • 3

Available PDF files:
0: BOHB Robust and Efficient Hyperparameter Optimization at Scale.pdf
1: A survey on evaluation of large language models.pdf
2: The AI Scientist Towards Fully Automated Open-Ended Scientific Discovery.pdf
3: Large Language Models Synergize with Automated Machine Learning.pdf
4: Active Learning for Distributionally Robust Level-Set Estimation.pdf
Enter the index of the file you want to process:
Sequence generation under testing: attempt 1 of 3
Operation under time limit: attempt 1 of 3
The operation finishes in time
Test passed
The retrieved information is:

https://github.com/JLX0/llm-automl
The total cost is 0.004786649999999999 USD

from __future__ import annotations

from LLM_utils.inquiry import get_api_key

from auto_research.reimplementation.code_availability_check import base_prompt_formatted
from auto_research.reimplementation.code_availability_check import test_github_link
from auto_research.survey.core import AutoSurvey
from auto_research.utils.files import select_pdf_file


def main() -> None:
    """
    Main function to execute the code availability check and survey analysis.

    This function:
    - Selects a PDF file from the specified folder.
    - Retrieves the API key for the LLM.
    - Initializes the AutoSurvey instance.
    - Formats the base prompt for code availability checks.
    - Checks whether a GitHub link is available.
    """
    # Specify the folder containing the target PDF files
    sample_folder = "sample_articles/"

    # Select a PDF file from the specified folder
    selected_file, file_path = select_pdf_file(sample_folder)

    # Retrieve the API key for the LLM
    # This script assumes a valid key.json file is located in the current working directory ("")
    # Modify the path to key.json ("") and the LLM category ("OpenAI") if needed
    key = get_api_key("", "OpenAI")

    # Initialize the AutoSurvey instance
    auto_survey_instance = AutoSurvey(
        key,
        "gpt-4o-mini",
        file_path,
        False,
        "information_retrieval",
    )

    # Format the base prompt for code availability checks
    prompt = base_prompt_formatted()

    # Check whether a GitHub link is available
    auto_survey_instance.run(prompt, test_github_link)


if __name__ == "__main__":
    main()
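
The script passes test_github_link to run as the check applied to the retrieved answer; its implementation is not shown on this page. As a rough illustration of the kind of format check such a test could include, here is a minimal sketch. The name looks_like_github_repo and the pattern are assumptions for illustration, not the package's actual logic.

```python
import re

# Matches http(s)://github.com/<owner>/<repository>, with an optional trailing slash.
# Assumption: owner names use letters, digits, and hyphens; repository names may
# also contain dots and underscores.
GITHUB_REPO_PATTERN = re.compile(r"https?://github\.com/[A-Za-z0-9-]+/[A-Za-z0-9._-]+/?")


def looks_like_github_repo(text: str) -> bool:
    """Hypothetical format check: True if `text` is shaped like a GitHub repository URL."""
    return GITHUB_REPO_PATTERN.fullmatch(text.strip()) is not None


if __name__ == "__main__":
    print(looks_like_github_repo("https://github.com/JLX0/llm-automl"))
```

The sketch only inspects the shape of the string. It does not contact GitHub, so it cannot tell whether the repository actually exists.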

Total running time of the script: (0 minutes 31.762 seconds)
