This project contains a Flask-based web application that integrates various Natural Language Processing (NLP) models to generate text summaries. The app allows users to enter text and receive summaries in different formats such as paragraph or bullet points, utilizing state-of-the-art models for summarization tasks.
Table of Contents
- Project Overview
- Installation
- Usage
- Model Training
- API Endpoints
- Customization
- File Structure
- Contributing
- License
Project Overview
This repository offers a web-based interface for summarizing text using different models including BART, Llama 2, Ollama Llama 3, and PEFT-based fine-tuning techniques like LoRA (Low-Rank Adaptation). The app is designed for flexibility, supporting multiple models and summarization formats.
Key Features:
- Supports BART, Llama 2, and Ollama Llama 3 models.
- Users can choose between paragraph or bullet points format for the summaries.
- Adjustable summary lengths (short or long).
- Utilizes LoRA for efficient fine-tuning with PEFT.
Installation
Prerequisites
- Python 3.8 or higher
- Pip package manager
- CUDA-enabled GPU (for training models)
Steps
- Clone the repository:
git clone https://github.com/shahinur-alam/AI-Powered-Content-Summarizer.git
cd AI-Powered-Content-Summarizer
- Create a virtual environment (optional but recommended):
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
- Install dependencies:
pip install -r requirements.txt
- Run the application:
python app.py
Usage
Running the App
- Start the application by running `python app.py`.
- Open your browser and go to http://127.0.0.1:5000/.
- Input the text you want to summarize, select the summary type (short or long), and the format (paragraph or bullet points).
- Submit the form to get the summary generated by the model.
Example Input
Original Text:
The BART model is widely known for its text generation capabilities, and is particularly effective at summarization tasks.
Example Output (Short Summary in Bullet Points Format)
* BART is known for text generation.
* It is effective at summarization tasks.
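The bullet-point rendering above can be reproduced with a small sentence-splitting helper. This is an illustrative sketch only (the function name and regex are assumptions, not the app's actual code):

```python
import re

def to_bullet_points(summary: str) -> str:
    """Split a paragraph summary into one bullet per sentence (hypothetical helper)."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", summary.strip()) if s.strip()]
    return "\n".join(f"* {s}" for s in sentences)

text = "BART is known for text generation. It is effective at summarization tasks."
print(to_bullet_points(text))
# * BART is known for text generation.
# * It is effective at summarization tasks.
```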
Model Training
This application includes both pre-trained models and fine-tuning capabilities.
BART Summarization
BART is used to generate summaries directly without the need for additional training:
from transformers import BartForConditionalGeneration, BartTokenizer
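The import above can be fleshed out into a full generation call. The helper below is an illustrative sketch rather than the app's actual code: it takes an already-loaded model and tokenizer as arguments so the generation logic stands on its own:

```python
def summarize_with_bart(text, model, tokenizer, max_length=150, min_length=50):
    """Sketch: summarize `text` with a loaded BART model and matching tokenizer.

    `model` is expected to be a BartForConditionalGeneration and `tokenizer`
    its BartTokenizer, e.g. both loaded from "facebook/bart-large-cnn".
    """
    inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=1024)
    summary_ids = model.generate(
        inputs["input_ids"],
        max_length=max_length,
        min_length=min_length,
        num_beams=4,
        early_stopping=True,
    )
    return tokenizer.decode(summary_ids[0], skip_special_tokens=True)
```

Loading would use `BartTokenizer.from_pretrained("facebook/bart-large-cnn")` and `BartForConditionalGeneration.from_pretrained("facebook/bart-large-cnn")`, matching the import shown above.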
Llama 2 with PEFT and LoRA
The Llama 2 model is fine-tuned using LoRA to enable efficient fine-tuning:
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    task_type="CAUSAL_LM"
)
This allows the app to fine-tune the model efficiently on smaller datasets.
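To see why LoRA is parameter-efficient: a rank-r adapter on a d_out × d_in weight matrix trains only r × (d_in + d_out) parameters while the original matrix stays frozen. A quick back-of-the-envelope check (the 4096 dimension is an assumption matching Llama 2 7B's hidden size, not a value from this repository):

```python
def lora_trainable_params(d_in: int, d_out: int, r: int) -> int:
    """Parameters in one rank-r LoRA adapter: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)

# One 4096x4096 attention projection (hidden size assumed from Llama 2 7B):
full = 4096 * 4096                               # 16,777,216 frozen weights
added = lora_trainable_params(4096, 4096, r=16)  # 131,072 trainable weights
print(f"trainable fraction: {added / full:.4%}")  # trainable fraction: 0.7813%
```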
Ollama Llama 3
For Ollama Llama 3, a LangChain implementation is used to summarize text:
from langchain_community.llms import Ollama

llm = Ollama(model="llama3")
Fine-Tuning and Training
- The Llama 2 model can be fine-tuned using Hugging Face's `Trainer` API, with training arguments and LoRA applied for parameter-efficient tuning.
- The fine-tuned model can then be saved for future inference:
trainer.save_model("./llama2-finetuned-final")
API Endpoints
GET /
- Description: Displays the main page where users can input text for summarization.
- Response: Renders the `index.html` page containing a form.
POST /
- Description: Handles form submissions and returns the generated summary.
- Parameters:
  - `text`: The text to summarize.
  - `summary_type`: The desired summary type (`short` or `long`).
  - `format_type`: The desired format (`paragraph` or `bullet_points`).
- Response: Displays the summarized text.
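The two endpoints can be sketched as a single Flask view. This is a minimal standalone sketch, not the repository's actual `app.py`: the inline form and placeholder summary stand in for `index.html` and the real model call:

```python
from flask import Flask, render_template_string, request

app = Flask(__name__)

# Minimal stand-in for templates/index.html
FORM = """
<form method="post">
  <textarea name="text"></textarea>
  <select name="summary_type"><option>short</option><option>long</option></select>
  <select name="format_type"><option>paragraph</option><option>bullet_points</option></select>
  <button type="submit">Summarize</button>
</form>
"""

@app.route("/", methods=["GET", "POST"])
def index():
    if request.method == "POST":
        text = request.form["text"]
        summary_type = request.form.get("summary_type", "short")
        format_type = request.form.get("format_type", "paragraph")
        # Placeholder: the real app would call its summarization model here.
        summary = f"({summary_type}, {format_type}) {text}"
        return summary
    return render_template_string(FORM)

if __name__ == "__main__":
    app.run(debug=True)
```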
Customization
Summary Length
You can adjust the maximum and minimum summary length in the code:
max_length = 150 if summary_type == 'short' else 300
min_length = 50 if summary_type == 'short' else 100
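One way to keep the two thresholds in sync across model calls is a small helper (hypothetical, not in the repository):

```python
def length_params(summary_type: str) -> tuple[int, int]:
    """Return (max_length, min_length) for the chosen summary type."""
    if summary_type == "short":
        return 150, 50
    return 300, 100

print(length_params("short"))  # (150, 50)
print(length_params("long"))   # (300, 100)
```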
Models
- To change the model, adjust the initialization in `app.py`. For example, to switch between BART and Llama 2, update the respective model import and initialization.
Output Format
Summaries can be returned in either paragraph or bullet points format. This is controlled by user input through the form and processed within the `summarize_text` function.
File Structure
AI-Powered-Content-Summarizer/
│
├── templates/
│ └── index.html # HTML template for the web interface
├── app.py # Main Flask application logic
├── requirements.txt # Python dependencies
└── README.md # Project documentation
Contributing
We welcome contributions to improve the project! Here’s how you can get started:
- Fork the repository.
- Create a new feature branch (`git checkout -b feature-branch`).
- Commit your changes (`git commit -m 'Add new feature'`).
- Push to the branch (`git push origin feature-branch`).
- Open a pull request.
License
This project is licensed under the MIT License. You are free to use, modify, and distribute it.