This project implements a production-ready customer support chatbot that combines a fine-tuned LLaMA model with a Retrieval-Augmented Generation (RAG) architecture, deployed on AWS SageMaker with a FastAPI backend.
## Features

- Context-aware responses using fine-tuned LLaMA-3
- Retrieval-Augmented Generation (RAG) (see the sketch below)
- Multi-turn conversation handling
- Responsible AI guardrails
- SageMaker and Lambda deployment
- API Gateway integration
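
At inference time the RAG pipeline retrieves relevant documents and prepends them to the model prompt. The sketch below illustrates this flow with a toy keyword-overlap retriever standing in for the real vector store and a boto3 call to a SageMaker endpoint; the endpoint name, payload schema, and prompt format are assumptions for illustration, not this repository's actual interface.

```python
import json

import boto3

# Assumed endpoint name; replace with the endpoint deployed by this project.
ENDPOINT_NAME = "llama3-support-chatbot"

runtime = boto3.client("sagemaker-runtime")


def retrieve_context(query: str, documents: list[str], top_k: int = 3) -> list[str]:
    """Toy keyword-overlap retriever; a real deployment would use a vector store."""
    query_terms = set(query.lower().split())
    return sorted(
        documents,
        key=lambda doc: len(query_terms & set(doc.lower().split())),
        reverse=True,
    )[:top_k]


def answer(query: str, documents: list[str]) -> str:
    # Prepend retrieved context to the prompt before calling the model.
    context = "\n".join(retrieve_context(query, documents))
    prompt = f"Context:\n{context}\n\nQuestion: {query}\nAnswer:"
    response = runtime.invoke_endpoint(
        EndpointName=ENDPOINT_NAME,
        ContentType="application/json",
        # Payload schema is an assumption; match it to the deployed container.
        Body=json.dumps({"inputs": prompt, "parameters": {"max_new_tokens": 256}}),
    )
    return json.loads(response["Body"].read())
```
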
## Project Structure

- `src/`: Core application code and modules
- `infra/`: Infrastructure as Code (CDK)
- `docker/`: Dockerfile for SageMaker deployment
- `scripts/`: Helper scripts
## License

MIT © January 2025 Fereydoon Boroojerdi
## Installation

Install the Python dependencies:

```bash
pip install -r requirements.txt
```

Run the FastAPI app:

```bash
uvicorn src.appmain:app --host 0.0.0.0 --port 8000
```
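
Once the server is up, the API can be exercised from any HTTP client. The snippet below assumes a `/chat` route and JSON schema for illustration; check the FastAPI routes under `src/` for the actual paths and request body.

```python
import requests

# Hypothetical route and payload; see the FastAPI app in src/ for the real API.
resp = requests.post(
    "http://localhost:8000/chat",
    json={"message": "How do I reset my password?", "session_id": "demo-1"},
    timeout=30,
)
resp.raise_for_status()
print(resp.json())
```
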
## Testing

Run the test suite:

```bash
pytest tests/
```

## Contributing

Please read CONTRIBUTING.md for details on our code of conduct and the process for submitting pull requests.
## Docker

To run the app locally with Docker:

```bash
docker-compose up --build
```

## Configuration

Create a `.env` file or use `sample.env` as a template.
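
At startup the app can read these values from the environment. Below is a minimal sketch using python-dotenv; the variable names are placeholders, and `sample.env` remains the authoritative list of keys.

```python
import os

from dotenv import load_dotenv  # python-dotenv

load_dotenv()  # loads key=value pairs from .env into os.environ

# Placeholder keys for illustration; see sample.env for the real ones.
AWS_REGION = os.getenv("AWS_REGION", "us-east-1")
SAGEMAKER_ENDPOINT = os.getenv("SAGEMAKER_ENDPOINT")
```
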
## Continuous Integration

GitHub Actions will automatically run tests and lint checks on each pull request.
