Code Availability Check
This script demonstrates the usage of AutoSurvey in the auto_research.survey.core module to:
Select a PDF file from a specified folder.
Retrieve an API key for the LLM (Large Language Model).
Format a base prompt for code availability checks.
Test the availability of code on GitHub.
To get started with the package, you need to set up API keys. For detailed instructions, see Setting up API keys for LLMs.
This script assumes that:
At least one valid PDF file of an article is available in “sample_articles/”.
A valid key.json file is available in the current working directory (“”).
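The two assumptions above can be verified before any LLM call is made. The following is a minimal sketch, not part of the auto_research package: the helper name check_prerequisites is hypothetical, and it only assumes the folder name and the key.json location stated above.

```python
from __future__ import annotations

from pathlib import Path


def check_prerequisites(pdf_folder: str = "sample_articles/", key_dir: str = "") -> list[str]:
    """Return a list of human-readable problems; an empty list means all checks passed.

    Hypothetical helper for illustration only; it is not provided by auto_research.
    """
    problems: list[str] = []

    folder = Path(pdf_folder)
    # Compare suffixes case-insensitively so that "paper.PDF" also counts.
    pdfs = [p for p in folder.iterdir() if p.suffix.lower() == ".pdf"] if folder.is_dir() else []
    if not pdfs:
        problems.append(f"no PDF files found in {pdf_folder!r}")

    # An empty key_dir is treated as the current working directory,
    # matching the "" path used for key.json in this example.
    key_file = Path(key_dir or ".") / "key.json"
    if not key_file.is_file():
        problems.append(f"missing {key_file}")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Prerequisite not met:", problem)
```

The check only confirms that the files exist; it does not validate that a PDF is readable or that key.json contains a working key.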
The process involves user interaction, including selecting a PDF file.
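To make the interactive step concrete, here is a rough sketch of an index-based file menu similar to the one shown in the example output. It is an illustration under assumptions, not the implementation of select_pdf_file: the function names are hypothetical, the sorted order is a choice made for this sketch, and the (file name, file path) return value is only inferred from how the script unpacks the result.

```python
from __future__ import annotations

from pathlib import Path


def list_pdf_files(folder: str) -> list[str]:
    """Return the names of the PDF files in `folder`, sorted for a stable order."""
    return sorted(p.name for p in Path(folder).iterdir() if p.suffix.lower() == ".pdf")


def parse_index(raw: str, count: int) -> int:
    """Convert user input to a valid index, raising ValueError otherwise."""
    index = int(raw.strip())  # raises ValueError for non-numeric input
    if not 0 <= index < count:
        raise ValueError(f"index must be between 0 and {count - 1}")
    return index


def choose_pdf(folder: str) -> tuple[str, str]:
    """Print a numbered menu and return (file name, file path) for the chosen entry."""
    names = list_pdf_files(folder)
    if not names:
        raise FileNotFoundError(f"no PDF files found in {folder!r}")
    print("Available PDF files:")
    for i, name in enumerate(names):
        print(f"{i}: {name}")
    raw = input("Enter the index of the file you want to process: ")
    index = parse_index(raw, len(names))
    return names[index], str(Path(folder) / names[index])
```

Keeping the input parsing in its own function makes the range check easy to test without simulating keyboard input.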
Below is an example output from the following input:
3
Available PDF files:
0: BOHB Robust and Efficient Hyperparameter Optimization at Scale.pdf
1: A survey on evaluation of large language models.pdf
2: The AI Scientist Towards Fully Automated Open-Ended Scientific Discovery.pdf
3: Large Language Models Synergize with Automated Machine Learning.pdf
4: Active Learning for Distributionally Robust Level-Set Estimation.pdf
Enter the index of the file you want to process:
Sequence generation under testing: attempt 1 of 3
Operation under time limit: attempt 1 of 3
The operation finishes in time
Test passed
The retrieved information is:
https://github.com/JLX0/llm-automl
The total cost is 0.004786649999999999 USD
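The retrieved information in this run is a GitHub repository URL. As a rough illustration of what a purely syntactic check on such a value might look like, the sketch below parses the owner and repository name with a regular expression. It does not contact GitHub and is not how test_github_link is implemented; the pattern and the function name parse_github_repo are assumptions made for this example.

```python
from __future__ import annotations

import re

# Matches https://github.com/<owner>/<repo>, optionally followed by ".git" or a trailing slash.
# The allowed characters are a simplification chosen for this sketch.
GITHUB_REPO_PATTERN = re.compile(
    r"https?://github\.com/(?P<owner>[A-Za-z0-9-]+)/(?P<repo>[A-Za-z0-9._-]+?)(?:\.git)?/?"
)


def parse_github_repo(text: str) -> tuple[str, str] | None:
    """Return (owner, repo) if `text` looks like a GitHub repository URL, else None."""
    match = GITHUB_REPO_PATTERN.fullmatch(text.strip())
    if match is None:
        return None
    return match.group("owner"), match.group("repo")


if __name__ == "__main__":
    print(parse_github_repo("https://github.com/JLX0/llm-automl"))  # ('JLX0', 'llm-automl')
    print(parse_github_repo("N/A"))  # None
```

A format check like this only filters out obviously malformed answers; confirming that the repository actually exists would require a network request.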
from __future__ import annotations

from LLM_utils.inquiry import get_api_key

from auto_research.reimplementation.code_availability_check import base_prompt_formatted
from auto_research.reimplementation.code_availability_check import test_github_link
from auto_research.survey.core import AutoSurvey
from auto_research.utils.files import select_pdf_file


def main() -> None:
    """
    Main function to execute the code availability check and survey analysis.

    This function:
    - Selects a PDF file from the specified folder.
    - Retrieves the API key for the LLM.
    - Formats the base prompt for code availability checks.
    - Initializes the AutoSurvey instance.
    - Checks whether a GitHub link is available.
    """
    # Specify the folder containing the target PDF files
    sample_folder = "sample_articles/"

    # Select a PDF file from the specified folder
    selected_file, file_path = select_pdf_file(sample_folder)

    # Retrieve the API key for the LLM
    # This script assumes a valid key.json file is located in the current working directory ("")
    # Modify the path to key.json ("") and the value for the LLM category ("OpenAI") if needed
    key = get_api_key("", "OpenAI")

    # Initialize the AutoSurvey instance
    auto_survey_instance = AutoSurvey(
        key,
        "gpt-4o-mini",
        file_path,
        False,
        "information_retrieval",
    )

    # Format the base prompt for code availability checks
    prompt = base_prompt_formatted()

    # Check whether a GitHub link is available
    auto_survey_instance.run(prompt, test_github_link)


if __name__ == "__main__":
    main()
Total running time of the script: (0 minutes 31.762 seconds)