auto_research.search.post_processing module
- class ArticleOrganizer(source_folder, target_folder='top_articles', threshold_type='rank', score_threshold=0.5, rank_threshold=30, organize_files=True, order_by_score=True, zip_folder=True, plotting=True)[source]
Bases: object
A class to organize, filter, and visualize academic papers based on their combined scores.
- __init__(source_folder, target_folder='top_articles', threshold_type='rank', score_threshold=0.5, rank_threshold=30, organize_files=True, order_by_score=True, zip_folder=True, plotting=True)[source]
Initialize the ArticleOrganizer class with the given parameters.
- Parameters:
source_folder (str) – The folder where the original papers and metadata are stored.
target_folder (str) – The folder where organized papers will be saved. Defaults to "top_articles".
threshold_type (str) – The filtering method ("rank" or "score"). Defaults to "rank".
score_threshold (float) – The minimum combined score for filtering when threshold_type is "score". Defaults to 0.5.
rank_threshold (int) – The number of top papers to keep when threshold_type is "rank". Defaults to 30.
organize_files (bool) – Whether to organize files into the target folder. Defaults to True.
order_by_score (bool) – Whether to rename files with their combined score. Defaults to True.
zip_folder (bool) – Whether to zip the target folder and source folder. Defaults to True.
plotting (bool) – Whether to plot the combined scores of papers. Defaults to True.
- Return type:
None
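A minimal construction sketch, using only the module path and keyword arguments documented above. The helper name `build_organizer` and the chosen argument values are illustrative assumptions, not part of the library. The import sits inside the function so the snippet can be loaded even where the package is not installed.

```python
def build_organizer(source_folder: str):
    """Construct an ArticleOrganizer that keeps the 10 highest-ranked papers.

    `build_organizer` is a hypothetical helper written for this example.
    """
    # Lazy import: only needed when the helper is actually called.
    from auto_research.search.post_processing import ArticleOrganizer

    return ArticleOrganizer(
        source_folder=source_folder,     # folder holding papers and metadata
        target_folder="top_articles",    # documented default
        threshold_type="rank",           # filter by rank rather than by score
        rank_threshold=10,               # illustrative value; default is 30
        organize_files=True,
        order_by_score=True,
        zip_folder=False,                # skip archiving in this sketch
        plotting=False,                  # skip plots in this sketch
    )
```

Assuming a folder named `papers` exists (a placeholder name), `build_organizer("papers").organize_and_visualize()` would then run the documented pipeline.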
- draw(paper_list, title)[source]
Plot the combined scores of papers and save the plot as an image.
- Parameters:
paper_list (List[Dict]) – A list of dictionaries containing paper details.
title (str) – The title of the plot.
- Return type:
None
- organize_and_visualize()[source]
Organize, filter, and visualize the papers based on the initialized parameters.
Steps:
1. Read metadata from the source folder.
2. Sort papers by combined score in descending order.
3. Draw a plot of the unfiltered papers if plotting is True.
4. Filter papers based on the selected threshold type ("rank" or "score").
5. Draw a plot of the filtered papers if plotting is True.
6. Organize files into the target folder if required.
7. Zip the target folder and source folder if required.
- Return type:
None
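The sorting and filtering steps (2 and 4) can be illustrated with a small standalone sketch. This is not the library's implementation: the function name `select_papers`, the `"combined_score"` dictionary key, and the inclusive `>=` comparison for the score threshold are all assumptions made for illustration.

```python
from typing import Dict, List


def select_papers(
    papers: List[Dict],
    threshold_type: str = "rank",
    score_threshold: float = 0.5,
    rank_threshold: int = 30,
) -> List[Dict]:
    """Sort papers by combined score (descending), then filter them."""
    # Step 2: highest combined score first.
    # Assumption: each paper dict stores its score under "combined_score".
    ranked = sorted(papers, key=lambda p: p["combined_score"], reverse=True)

    # Step 4: apply the selected threshold type.
    if threshold_type == "rank":
        # Keep the top `rank_threshold` papers.
        return ranked[:rank_threshold]
    if threshold_type == "score":
        # Assumption: a paper exactly at the threshold is kept.
        return [p for p in ranked if p["combined_score"] >= score_threshold]
    raise ValueError(f"unknown threshold_type: {threshold_type!r}")


papers = [
    {"title": "A", "combined_score": 0.42},
    {"title": "B", "combined_score": 0.91},
    {"title": "C", "combined_score": 0.67},
]

top_two = select_papers(papers, threshold_type="rank", rank_threshold=2)
high_scoring = select_papers(papers, threshold_type="score", score_threshold=0.9)

print([p["title"] for p in top_two])       # ['B', 'C']
print([p["title"] for p in high_scoring])  # ['B']
```

With "rank", the result size is fixed regardless of score quality; with "score", the result may be empty or include every paper, depending on the score distribution.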