github-process-manager

🤖 GitHub Process Manager - AI-Powered Workflow and Documentation Assistant

A lightweight, local AI-powered assistant that combines Retrieval-Augmented Generation (RAG) with the Gemini API and GitHub repository integration. Upload reference documents, connect to your GitHub repositories, and get intelligent responses for process documentation, SOX compliance, MLOps workflows, DevOps pipelines, and more.

✨ Features

🎯 Use Cases

SOX Compliance & Auditing

MLOps Workflows

DevOps Pipelines

General Process Documentation

📋 Prerequisites

🚀 Quick Start

1. Clone the Repository

git clone <your-repo-url>
cd github-process-manager

2. Create Virtual Environment

# Windows
python -m venv venv
venv\Scripts\activate

# macOS/Linux
python3 -m venv venv
source venv/bin/activate

3. Install Dependencies

pip install -r requirements.txt

4. Configure Environment Variables

Copy the template and edit with your credentials:

# Windows
copy .env.template .env

# macOS/Linux
cp .env.template .env

Edit .env file:

# Required: Gemini API Key
GEMINI_API_KEY=your_gemini_api_key_here

# Optional: GitHub Integration
GITHUB_TOKEN=your_github_personal_access_token_here
GITHUB_REPO_URL=https://github.com/username/repository

# Flask Configuration
FLASK_SECRET_KEY=your_secret_key_here
FLASK_DEBUG=True

Getting Your API Keys:

5. Run the Application

python app.py

The application will be available at: http://localhost:5000

🐳 Docker Setup

For a consistent, isolated environment, use Docker:

Quick Start with Docker Compose

# 1. Configure environment
cp .env.template .env
# Edit .env with your API keys

# 2. Start the application
docker-compose up -d

# 3. View logs
docker-compose logs -f app

# 4. Access at http://localhost:5000

Development with VS Code Dev Container

  1. Install Remote - Containers extension
  2. Open project in VS Code
  3. Press F1 → “Remote-Containers: Reopen in Container”
  4. Environment is automatically configured with all dependencies

Docker Commands

# Stop the application
docker-compose down

# Rebuild after changes
docker-compose up -d --build

# Production mode
docker-compose -f docker-compose.prod.yml up -d

# View container shell
docker-compose exec app /bin/bash

For detailed Docker setup, see README.docker.md

📖 Usage Guide

Upload Reference Documents

  1. Navigate to the main Chat page
  2. Click “Choose File” in the upload section
  3. Select a document (.txt, .pdf, or .docx)
  4. Click “Upload” to process the document
  5. The document will be chunked, embedded, and stored in ChromaDB
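The chunking step above can be sketched in a few lines. This is a simplified illustration using the CHUNK_SIZE (800) and CHUNK_OVERLAP (200) defaults from the configuration section, not the exact implementation in rag_engine.py:

```python
def chunk_text(text: str, chunk_size: int = 800, overlap: int = 200) -> list[str]:
    """Split text into fixed-size chunks with overlap, so content that
    spans a chunk boundary still appears intact in at least one chunk."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + chunk_size])
        if start + chunk_size >= len(text):
            break
    return chunks
```

Each chunk is then embedded and stored in ChromaDB; the overlap means consecutive chunks share their last/first 200 characters.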

Connect to GitHub Repository

  1. Go to the Settings page
  2. Enter your GitHub repository URL (e.g., https://github.com/username/repo)
  3. Click “Connect Repository”
  4. Once connected, the chatbot can access PRs, issues, and workflows

Chat with the AI

  1. Type your question in the chat input
  2. The chatbot will:
    • Retrieve relevant document chunks from your uploaded files
    • Fetch related GitHub repository data (if connected)
    • Generate a response using Gemini AI with all context
  3. Responses cite sources from documents and GitHub data
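The "retrieve relevant document chunks" step is a similarity search over embeddings. In the application this is delegated to ChromaDB, but conceptually it is a cosine-similarity ranking like the sketch below (TOP_K_RESULTS defaults to 3):

```python
import math

def top_k(query_vec: list[float], chunk_vecs: list[list[float]], k: int = 3) -> list[int]:
    """Rank stored chunk embeddings by cosine similarity to the query
    embedding and return the indices of the k best matches."""
    def cosine(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
        return dot / norm if norm else 0.0
    scores = [(cosine(query_vec, v), i) for i, v in enumerate(chunk_vecs)]
    return [i for _, i in sorted(scores, reverse=True)[:k]]
```

The text of the winning chunks is then prepended to the Gemini prompt as context.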

Trigger GitHub Actions

  1. Go to Settings → GitHub Actions
  2. Click “Load Workflows”
  3. Click “Trigger” on any workflow to manually start it

Customize AI Behavior

The application supports customizable system prompts to tailor AI responses to your needs:

Using Pre-defined Templates

  1. Go to Settings → AI System Prompt Configuration
  2. Select a template from the dropdown:
    • Default - Balanced assistant for general queries
    • Technical Expert - Deep technical explanations with code examples
    • Security Auditor - Security-focused analysis and compliance
    • Developer Assistant - Code-heavy responses with best practices
    • Data Analyst - Structured analysis with metrics and insights
    • Technical Educator - Clear explanations for learning purposes
  3. Click “Update Prompt” to apply (changes last for your session)
  4. See the preview to verify the selected template

Creating Custom Prompts

  1. Go to Settings → AI System Prompt Configuration
  2. Select “Custom Prompt” from the dropdown
  3. Write your own system instruction in the text editor
  4. Click “Update Prompt” to apply
  5. Example custom prompt:
    You are a helpful assistant specializing in cloud infrastructure.
    Focus on AWS best practices, security, and cost optimization.
    Provide actionable recommendations with specific service names.
    

Permanent Configuration (via .env)

For persistent customization across server restarts:

  1. Edit your .env file
  2. Set one of these variables:
    # Use a pre-defined template
    SYSTEM_PROMPT_TEMPLATE=technical_expert
       
    # Or set a custom prompt
    CUSTOM_SYSTEM_PROMPT="Your custom system instruction here"
    
  3. Restart the application

Available Templates: default, technical_expert, security_auditor, developer_assistant, data_analyst, technical_educator

Note: Session-based changes (via UI) take priority over .env settings until the server restarts.

Customize Document Templates

The application supports configurable Word document templates with custom branding:

Available Document Templates

  1. SOX Audit - 5-section compliance reports (Control Objective, Risks, Testing, Results, Conclusion)
  2. MLOps Workflow - ML pipeline documentation (Model Overview, Data Pipeline, Training, Validation, Deployment)
  3. DevOps Pipeline - CI/CD documentation (Pipeline Overview, Build Steps, Quality Gates, Deployment, Monitoring)
  4. Generic - General purpose documentation (Overview, Components, Procedures, Results, Recommendations)

Customize Branding

Edit your .env file to personalize generated documents:

# Project name for document headers
PROJECT_NAME=GitHub Process Manager

# Optional: Add company name to headers
COMPANY_NAME=Your Company Name

# Brand color (hex format #RRGGBB)
BRAND_COLOR=#4A90E2

# Optional: Add logo to document headers (.png, .jpg, .jpeg)
DOCUMENT_LOGO_PATH=/path/to/your/logo.png

# Default template type
DEFAULT_TEMPLATE_TYPE=generic
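BRAND_COLOR must be a #RRGGBB string, which is typically converted to an (R, G, B) triple before being handed to a document library such as python-docx. A quick validation/conversion sketch (not the project's actual parsing code):

```python
def parse_brand_color(value: str) -> tuple[int, int, int]:
    """Validate a #RRGGBB hex string and return its (R, G, B) components."""
    value = value.strip()
    if not (value.startswith("#") and len(value) == 7):
        raise ValueError(f"expected #RRGGBB, got {value!r}")
    return tuple(int(value[i:i + 2], 16) for i in (1, 3, 5))
```

For example, the default #4A90E2 parses to (74, 144, 226).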

Create Custom Templates

Modify document_templates.json to add new templates:

{
  "templates": {
    "your_template": {
      "name": "Your Template Name",
      "report_title": "Your Report Title",
      "sections": [
        {"number": 1, "title": "Section 1", "key": "Section 1"},
        {"number": 2, "title": "Section 2", "key": "Section 2"}
      ],
      "keywords": ["keyword1", "keyword2"]
    }
  }
}
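When adding a template it helps to sanity-check the file before restarting the app. The minimal validator below matches the shape of the example above; the required field names are taken from that example, not from any schema checks the application itself may perform:

```python
import json

def validate_templates(raw: str) -> list[str]:
    """Parse document_templates.json content and return the template
    keys, raising ValueError if a required field is missing."""
    data = json.loads(raw)
    required = {"name", "report_title", "sections", "keywords"}
    for key, tpl in data["templates"].items():
        missing = required - tpl.keys()
        if missing:
            raise ValueError(f"template {key!r} is missing {sorted(missing)}")
        for section in tpl["sections"]:
            if not {"number", "title", "key"} <= section.keys():
                raise ValueError(f"bad section in template {key!r}")
    return list(data["templates"])
```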

Template Features:

Using for MLOps

The application includes specialized MLOps templates and workflows for managing machine learning operations.

MLOps Documentation Templates

Located in templates/mlops/, these guides provide comprehensive MLOps best practices:

  1. mlops_guide.md - Complete MLOps lifecycle guide covering:
    • Model development and version control
    • Experiment tracking (MLflow, Weights & Biases)
    • Training best practices and reproducibility
    • Model validation strategies
    • Deployment strategies (Blue-Green, Canary, Shadow)
    • Monitoring and drift detection
    • Model retraining triggers
  2. model_validation_template.md - Structured validation report template:
    • Model overview and business context
    • Validation methodology (unit, integration, performance, regression)
    • Performance metrics and comparison with baseline
    • Bias and fairness analysis
    • Failure pattern analysis
    • Deployment recommendations
  3. deployment_checklist.md - Comprehensive pre-deployment checklist:
    • Model readiness verification
    • Security and compliance checks
    • Monitoring and observability setup
    • Testing requirements (functional, performance, integration)
    • Deployment strategy selection
    • Rollback procedures
  4. monitoring_guide.md - Production monitoring strategies:
    • Performance metrics tracking
    • Data drift detection methods
    • Infrastructure monitoring
    • Alert configuration
    • Incident response procedures
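As a concrete illustration of the drift-detection topic covered in monitoring_guide.md, here is a minimal check that flags a feature whose production mean has moved too far from its training baseline. Production setups usually rely on proper statistical tests or PSI; treat this purely as a sketch:

```python
import statistics

def mean_shift_alert(baseline: list[float], current: list[float],
                     threshold: float = 2.0) -> bool:
    """Flag drift when the current mean deviates from the baseline mean
    by more than `threshold` baseline standard deviations."""
    mu = statistics.mean(baseline)
    sigma = statistics.stdev(baseline)
    if sigma == 0:
        return statistics.mean(current) != mu
    z = abs(statistics.mean(current) - mu) / sigma
    return z > threshold
```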

Using MLOps Templates as RAG Documents

  1. Navigate to the Chat page
  2. Upload MLOps template files from templates/mlops/
  3. Ask questions about ML workflows:
    • “What metrics should I track for a classification model?”
    • “How do I implement canary deployment for my model?”
    • “What are the best practices for detecting data drift?”
    • “Create a validation checklist for my model deployment”

MLOps GitHub Actions Workflows

Located in .github/workflows/mlops/, trigger workflows for automated documentation:

Model Validation Report (mlops-model-validation.yml):

Deployment Documentation (mlops-deployment-doc.yml):

Example MLOps Queries

Try these queries with MLOps templates uploaded:

Model Training:

"Document the training process for a fraud detection model with 95% accuracy"

Deployment Planning:

"Create a deployment checklist for deploying a recommendation model to production"

Monitoring Setup:

"What alerts should I configure for monitoring a prediction model in production?"

Validation Reporting:

"Generate a validation report for model version 2.1.0 with accuracy 94.2%, precision 93.8%, recall 94.5%"

Integration with ML Tools

The MLOps templates include guidance for integrating with popular ML platforms:

Export metrics from these tools and use the GitHub Actions workflows to generate documentation with your actual performance data.

MLOps Workflow Best Practices

  1. Version Everything: Code, data, models, configurations
  2. Track All Experiments: Log hyperparameters, metrics, and artifacts
  3. Validate Before Deploying: Run all tests (unit, integration, performance)
  4. Monitor Continuously: Set up drift detection and performance alerts
  5. Document Thoroughly: Use templates for consistency
  6. Plan Rollbacks: Always have a tested rollback strategy

🏗️ Project Structure

github-process-manager/
├── app.py                  # Main Flask application
├── config.py               # Configuration management
├── logger.py               # Logging setup
├── rag_engine.py           # RAG document processing
├── gemini_client.py        # Gemini API integration
├── github_client.py        # GitHub API integration
├── word_generator.py       # Word document generation
├── requirements.txt        # Python dependencies
├── .env.template           # Environment variable template
├── .gitignore             # Git ignore rules
├── document_templates.json # Document template configuration
├── templates/
│   ├── base.html          # Base template
│   ├── index.html         # Chat interface
│   └── settings.html      # Settings page
├── static/
│   └── css/
│       └── style.css      # Application styling
├── .github/
│   └── workflows/
│       ├── process-analysis-doc.yml  # Generic process workflow
│       └── sox-analysis-doc.yml      # SOX-specific workflow (legacy)
├── chroma_db/             # ChromaDB storage (auto-created)
├── uploads/               # Temporary upload folder (auto-created)
├── generated_reports/     # Generated Word documents (auto-created)
└── README.md              # This file

🔧 Configuration Options

Edit config.py or set environment variables:

| Variable | Description | Default |
|----------|-------------|---------|
| GEMINI_API_KEY | Google Gemini API key | Required |
| GEMINI_TEMPERATURE | AI response randomness (0.0-1.0) | 0.7 |
| GEMINI_MAX_TOKENS | Maximum response length | 2048 |
| SYSTEM_PROMPT_TEMPLATE | Pre-defined prompt template | default |
| CUSTOM_SYSTEM_PROMPT | Custom system instruction | None |
| PROJECT_NAME | Project name for documents | GitHub Process Manager |
| COMPANY_NAME | Company name for documents | None |
| BRAND_COLOR | Document brand color (hex) | #4A90E2 |
| DOCUMENT_LOGO_PATH | Path to logo for documents | None |
| DEFAULT_TEMPLATE_TYPE | Default document template | generic |
| DOCUMENT_TEMPLATES_PATH | Template config file path | document_templates.json |
| GITHUB_TOKEN | GitHub personal access token | Optional |
| GITHUB_REPO_URL | GitHub repository URL | Optional |
| FLASK_SECRET_KEY | Flask session secret | Auto-generated |
| CHROMA_DB_PATH | ChromaDB storage location | ./chroma_db |
| CHUNK_SIZE | Characters per document chunk | 800 |
| CHUNK_OVERLAP | Overlap between chunks | 200 |
| TOP_K_RESULTS | RAG chunks to retrieve | 3 |
| MLOPS_FEATURES_ENABLED | Enable MLOps features | false |
| MLOPS_TEMPLATES_DIR | MLOps templates directory | templates/mlops |
| MLOPS_WORKFLOWS_DIR | MLOps workflows directory | .github/workflows/mlops |

🛠️ API Endpoints

Chat

Document Management

GitHub Integration

AI Prompt Management

MLOps (Optional - requires MLOPS_FEATURES_ENABLED=true)

System

❗ Troubleshooting

“Configuration validation failed: GEMINI_API_KEY is not set”

Documents not being processed

GitHub connection failing

ChromaDB errors

📝 Features in Detail

RAG (Retrieval-Augmented Generation)

Gemini Integration

GitHub Features

🤝 Contributing

This is a personal project, but suggestions and improvements are welcome!

📄 License

This project is provided as-is for educational and personal use.

🙏 Acknowledgments

📧 Support

For issues or questions, please check the logs in app.log or review the troubleshooting section above.


Built with ❤️ using Python, Flask, and ChromaDB