Nava is a Python- and web-based AI browser automation agent that lets you control the web with natural language.

Abdulmuiz44/Nava

Nava - Intelligent Browser Automation Platform

A professional-grade browser automation platform with both web and CLI interfaces. Built with Next.js 14, TypeScript, React, Tailwind CSS, Playwright, and Python.

📂 Project Structure

This repository contains two main components:

  • Web Application (Root): Enterprise-ready Next.js web interface with advanced features
  • CLI Tool (nava-cli/): Python-based command-line interface for browser automation

Choose the interface that best fits your workflow!

✨ What's New in v2.0

🎉 Major Update: Nava has been transformed into a professional-grade automation platform!

🆕 New Features

  • 🔐 API Key Authentication - Secure your instance with optional API keys
  • ⚡ 8 New Task Types - Scroll, hover, dropdown selection, text extraction, smart waits, tab switching, file upload, downloads
  • 📚 Workflow Library - Save, manage, and reuse automation workflows
  • 📸 Screenshot Gallery - Automatic screenshot management with visual gallery
  • 🔄 Task History & Replay - Persistent command history with one-click replay
  • 🎨 Enhanced UI - Modern navigation, better feedback, mobile responsive
  • 🛡️ Production Ready - TypeScript strict mode, comprehensive error handling, optimized builds

🚀 Features

Web Interface Features

  • Modern Navigation Bar: Quick access to Workflows, Screenshots, and Documentation
  • Beautiful UI: Responsive interface with Tailwind CSS and modern design
  • Real-time Execution: Instant task execution with live feedback
  • Workflow Management: Save and reuse complex automation sequences
  • Screenshot Gallery: Visual gallery with lightbox, download, and metadata
  • Task History: Persistent history with replay functionality
  • API Security: Optional API key authentication for production deployments
  • RESTful API: Comprehensive API endpoints for programmatic access
  • Vercel Ready: Optimized for serverless deployment
  • TypeScript: Fully typed codebase with strict mode enabled

CLI Features

  • Python-Based: Powerful command-line automation tool
  • Rich Terminal UI: Interactive command-line interface
  • Scheduling: Automate tasks with built-in scheduler
  • Workflow Management: Create and execute complex automation workflows
  • Integration Support: Extensible integration system

Core Automation Features (Both Interfaces)

  • Natural Language Commands: Control browsers with simple English (26+ command types)
  • Smart Text-Based Clicking: Click buttons and links by their visible text
  • Intelligent Form Filling: Fill forms by field labels automatically
  • Multi-Step Workflows: Chain multiple commands with comma separation
  • Advanced Interactions: Scroll, hover, select dropdowns, extract text, smart waits
  • Tab Management: Switch between browser tabs
  • File Operations: Upload files and capture downloads
  • Visible/Headless Toggle: Watch automation or run in background
  • Playwright Integration: Robust browser automation powered by Playwright

📋 Prerequisites

For Web Application

  • Node.js 18.17.0 or higher
  • pnpm, npm, or yarn package manager
  • Vercel account (optional, for deployment)

For CLI Tool

  • Python 3.8 or higher
  • pip package manager

🛠️ Quick Start

Web Application (Next.js)

The web application is now at the root of the repository.

1. Install Dependencies

# Using pnpm (recommended)
pnpm install

# Or using npm
npm install

2. Install Playwright Browsers

npx playwright install chromium

3. Run Development Server

pnpm run dev
# Or: npm run dev

Open http://localhost:3000 in your browser.

CLI Tool (Python)

The CLI tool is located in the nava-cli/ directory.

1. Navigate to CLI Directory

cd nava-cli

2. Install Dependencies

pip install -r requirements.txt

3. Run CLI

python cli.py

For detailed CLI documentation, see nava-cli/README_PRO.md

📦 Build for Production (Web App)

pnpm run build
pnpm start
# Or: npm run build && npm start

🌐 Deploy to Vercel (Web App)

Important: The web application is now at the repository root, so deployment is simpler.

Method 1: Vercel CLI (Recommended)

  1. Install Vercel CLI:
npm i -g vercel
  2. Log in to Vercel:
vercel login
  3. Deploy from the repository root:
vercel

For production deployment:

vercel --prod

Method 2: GitHub Integration (Easiest)

  1. Push your code to GitHub
  2. Go to vercel.com
  3. Click "Import Project"
  4. Select your GitHub repository
  5. Vercel will auto-detect Next.js configuration
  6. Configure build settings (optional):
    • Framework Preset: Next.js (auto-detected)
    • Root Directory: . (leave as root)
    • Build Command: pnpm run build or npm run build
    • Install Command: pnpm install && npx playwright install chromium
  7. Click "Deploy"

Method 3: Vercel Dashboard

  1. Go to vercel.com/new
  2. Import your Git repository
  3. Vercel will auto-detect Next.js from the root
  4. Add build settings if needed:
    • Build Command: pnpm run build
    • Install Command: pnpm install && npx playwright install chromium
  5. Deploy

Note: The vercel.json configuration is already set up at the repository root.

⚙️ Configuration

Environment Variables

Create a .env.local file in the root directory:

# Node Environment
NODE_ENV=production

# Playwright Configuration
PLAYWRIGHT_SKIP_BROWSER_DOWNLOAD=0

# API Security (set to 'none' to disable authentication)
NAVA_API_KEY=your_secure_api_key_here

# Optional: Rate Limiting
MAX_REQUESTS_PER_MINUTE=60

# Optional: Browser Configuration
DEFAULT_TIMEOUT=30000
MAX_CONCURRENT_SESSIONS=5

# Optional: Features
ENABLE_SCREENSHOTS=true
ENABLE_FILE_UPLOAD=true
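
Next.js loads .env.local on its own, but a helper script may want to read the same values. The following is a minimal sketch of that, not part of Nava itself: `parse_env` and `load_settings` are hypothetical names, the fallback values simply mirror the sample above, and treating DEFAULT_TIMEOUT as milliseconds is an assumption.

```python
from typing import Dict, Union


def parse_env(text: str) -> Dict[str, str]:
    """Parse KEY=VALUE lines, skipping blank lines and # comments."""
    values: Dict[str, str] = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop optional surrounding quotes around the value.
        values[key.strip()] = value.strip().strip("\"'")
    return values


def load_settings(text: str) -> Dict[str, Union[int, bool]]:
    """Convert the optional settings to typed values (defaults are assumptions)."""
    env = parse_env(text)
    return {
        "max_requests_per_minute": int(env.get("MAX_REQUESTS_PER_MINUTE", "60")),
        "default_timeout_ms": int(env.get("DEFAULT_TIMEOUT", "30000")),
        "max_concurrent_sessions": int(env.get("MAX_CONCURRENT_SESSIONS", "5")),
        "enable_screenshots": env.get("ENABLE_SCREENSHOTS", "true").lower() == "true",
        "enable_file_upload": env.get("ENABLE_FILE_UPLOAD", "true").lower() == "true",
    }
```

Usage: `load_settings(open(".env.local").read())` returns a plain dictionary that a script can pass around.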

Vercel Configuration

The vercel.json file is pre-configured with:

  • Optimized memory (2GB) for Hobby plan compatibility
  • Extended timeout (60s) for automation tasks
  • Playwright Chromium installation
  • Environment variable management

Note: For production deployments requiring longer timeouts or more memory, consider upgrading to the Vercel Pro plan.

🎯 Which Interface Should You Use?

Use the Web Application if you:

  • Want a visual interface with real-time feedback
  • Need to deploy on Vercel or other serverless platforms
  • Prefer TypeScript/JavaScript ecosystem
  • Want easy API access for integrations
  • Need a production-ready web UI

Use the CLI Tool if you:

  • Prefer command-line interfaces
  • Need advanced scheduling capabilities
  • Want local Python-based automation
  • Need workflow management features
  • Want to integrate with Python scripts

Both interfaces share the same core automation features!

📚 API Documentation (Web App)

Authentication

All API endpoints support optional API key authentication via the x-api-key header:

curl -X POST http://localhost:3000/api/execute \
  -H "Content-Type: application/json" \
  -H "x-api-key: your_api_key_here" \
  -d '{"task": "go to github.com"}'

Set NAVA_API_KEY=none in .env.local to disable authentication.
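
The check described above can be pictured roughly as follows. This is an illustrative Python sketch, not the code in middleware.ts: `is_authorized` is a hypothetical name, and rejecting a request with no header is an assumption about how the middleware behaves.

```python
import hmac
from typing import Optional


def is_authorized(configured_key: str, header_key: Optional[str]) -> bool:
    """Decide whether a request may proceed, given the x-api-key header value."""
    # The literal value "none" disables authentication, as documented above.
    if configured_key == "none":
        return True
    # Assumption: a missing header is rejected when a key is configured.
    if header_key is None:
        return False
    # Constant-time comparison avoids leaking key contents through timing.
    return hmac.compare_digest(configured_key.encode(), header_key.encode())
```

A caller would answer with HTTP 401 whenever this returns False.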

Execute Single Task

Endpoint: POST /api/execute

Headers:

Content-Type: application/json
x-api-key: your_api_key (if authentication enabled)

Request Body:

{
  "task": "go to github.com, scroll down, screenshot",
  "headless": true
}

Response:

{
  "success": true,
  "result": {
    "success": true,
    "taskType": "screenshot",
    "detail": "Screenshot captured successfully",
    "data": {
      "screenshot": "base64_encoded_image...",
      "pageUrl": "https://github.com"
    }
  },
  "timestamp": "2024-01-01T12:00:00.000Z"
}
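
The same call can be made from Python using only the standard library. This is a sketch under stated assumptions: `build_execute_request` and `run_task` are hypothetical helper names, the base URL is the local development server, and the 90-second client timeout is an arbitrary choice.

```python
import json
import urllib.request
from typing import Any, Dict, Optional


def build_execute_request(
    task: str,
    headless: bool = True,
    api_key: Optional[str] = None,
    base_url: str = "http://localhost:3000",
) -> urllib.request.Request:
    """Build a POST request for /api/execute without sending it."""
    body = json.dumps({"task": task, "headless": headless}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if api_key:
        # Only needed when authentication is enabled.
        headers["x-api-key"] = api_key
    return urllib.request.Request(
        f"{base_url}/api/execute", data=body, headers=headers, method="POST"
    )


def run_task(task: str, **kwargs: Any) -> Dict[str, Any]:
    """Send the request and return the decoded JSON response."""
    request = build_execute_request(task, **kwargs)
    with urllib.request.urlopen(request, timeout=90) as response:
        return json.load(response)
```

Usage: `run_task("go to github.com, screenshot", api_key="your_api_key")` returns a dictionary shaped like the response above.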

Execute Task Chain

Endpoint: POST /api/execute-chain

Headers:

Content-Type: application/json
x-api-key: your_api_key (if authentication enabled)

Request Body:

{
  "tasks": [
    "go to github.com",
    "scroll down 500",
    "hover over .menu",
    "click search button",
    "wait for #results to appear"
  ],
  "headless": true
}

Response:

{
  "success": true,
  "results": [...],
  "totalTasks": 5,
  "successfulTasks": 5,
  "failedTasks": 0,
  "timestamp": "2024-01-01T12:00:00.000Z"
}
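
A client will usually want a one-line verdict from the counters in this response. The sketch below assumes only the field names shown above; `summarize_chain` is a hypothetical helper.

```python
from typing import Any, Dict


def summarize_chain(response: Dict[str, Any]) -> str:
    """Turn the counters of an execute-chain response into a short summary."""
    total = response.get("totalTasks", 0)
    succeeded = response.get("successfulTasks", 0)
    failed = response.get("failedTasks", 0)
    # Guard against a malformed or truncated response.
    if succeeded + failed != total:
        raise ValueError("task counts do not add up")
    status = "all tasks succeeded" if failed == 0 else f"{failed} task(s) failed"
    return f"{succeeded}/{total} succeeded; {status}"
```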

Get Workflow Templates

Endpoint: GET /api/workflows

Headers:

x-api-key: your_api_key (if authentication enabled)

Response:

{
  "success": true,
  "templates": [
    {
      "id": "template_1",
      "name": "Login Flow",
      "description": "Standard login workflow",
      "tasks": ["go to https://example.com", "click login button", ...],
      "tags": ["authentication", "login"]
    }
  ]
}
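
Once the templates are fetched, picking the ones for a given purpose is a simple filter on the tags field. A small sketch, assuming the response shape shown above (`templates_with_tag` is a hypothetical helper):

```python
from typing import Any, Dict, List


def templates_with_tag(response: Dict[str, Any], tag: str) -> List[Dict[str, Any]]:
    """Return the workflow templates whose tags include the given tag."""
    return [
        template
        for template in response.get("templates", [])
        if tag in template.get("tags", [])
    ]
```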

Get Screenshots

Endpoint: GET /api/screenshots

Headers:

x-api-key: your_api_key (if authentication enabled)

Response:

{
  "success": true,
  "screenshots": [
    {
      "filename": "screenshot_123.png",
      "url": "/screenshots/screenshot_123.png",
      "size": 245678,
      "created": "2024-01-01T12:00:00.000Z"
    }
  ],
  "count": 15
}
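
The listing carries enough metadata to find the most recent capture or to check how much space the gallery uses. The sketch below assumes the timestamp format shown above; the helper names are hypothetical.

```python
from datetime import datetime
from typing import Any, Dict, Optional


def parse_created(value: str) -> datetime:
    """Parse timestamps such as 2024-01-01T12:00:00.000Z."""
    return datetime.strptime(value, "%Y-%m-%dT%H:%M:%S.%fZ")


def newest_screenshot(response: Dict[str, Any]) -> Optional[Dict[str, Any]]:
    """Return the most recently created screenshot entry, or None if empty."""
    shots = response.get("screenshots", [])
    return max(shots, key=lambda shot: parse_created(shot["created"]), default=None)


def total_size_bytes(response: Dict[str, Any]) -> int:
    """Sum the reported file sizes of all listed screenshots."""
    return sum(shot.get("size", 0) for shot in response.get("screenshots", []))
```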

🎯 Supported Commands (26+ Types)

Navigation

  • go to <url> - Navigate to a URL
  • visit <url> - Navigate to a URL
  • navigate to <url> - Navigate to a URL
  • access <page> - Navigate by clicking a link/button with text

Examples:

go to https://example.com
go to github.com
access my dashboard

Search

  • search for <query> - Google search
  • search <query> - Google search

Examples:

search for react tutorials
search Tradia AI

Clicking (Smart Text-Based)

  • click <button text> - Click by visible text (button, link, etc.)
  • click <selector> - Click by CSS selector

Examples:

click login button
click menu
click Sign In
click #submit-btn

Scrolling ✨ NEW

  • scroll down - Scroll down 500px
  • scroll up - Scroll up 500px
  • scroll down <pixels> - Scroll down custom amount
  • scroll to top - Scroll to page top
  • scroll to bottom - Scroll to page bottom
  • scroll to <selector> - Scroll to element

Examples:

scroll down
scroll down 1000
scroll to top
scroll to #footer
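
To make the scroll grammar concrete, here is one way these phrases could be mapped to structured actions. It is a Python sketch for illustration only, not Nava's parser: `parse_scroll` and the returned dictionary shape are hypothetical, and accepting a pixel amount after "scroll up" is an assumption.

```python
import re
from typing import Dict, Union

DEFAULT_SCROLL_PX = 500  # documented default for "scroll down" / "scroll up"


def parse_scroll(command: str) -> Dict[str, Union[str, int]]:
    """Map a scroll phrase to a small action description."""
    text = command.strip()
    match = re.fullmatch(r"scroll (down|up)(?: (\d+))?", text, flags=re.IGNORECASE)
    if match:
        pixels = int(match.group(2)) if match.group(2) else DEFAULT_SCROLL_PX
        sign = 1 if match.group(1).lower() == "down" else -1
        return {"kind": "by", "dy": sign * pixels}
    match = re.fullmatch(r"scroll to (top|bottom)", text, flags=re.IGNORECASE)
    if match:
        return {"kind": "edge", "edge": match.group(1).lower()}
    # Anything else after "scroll to" is treated as a CSS selector.
    match = re.fullmatch(r"scroll to (.+)", text, flags=re.IGNORECASE)
    if match:
        return {"kind": "element", "selector": match.group(1)}
    raise ValueError(f"not a scroll command: {command!r}")
```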

Hovering ✨ NEW

  • hover over <selector> - Hover over element
  • hover <selector> - Hover over element

Examples:

hover over .menu-item
hover #dropdown-trigger

Dropdown Selection ✨ NEW

  • select "<option>" from <selector> - Select dropdown option

Examples:

select "United States" from #country
select "Blue" from select[name="color"]

Text Extraction ✨ NEW

  • get text from <selector> - Extract text from element
  • extract text from <selector> - Extract text from element

Examples:

get text from h1
extract text from .description

Smart Waits ✨ NEW

  • wait for <selector> to appear - Wait for element
  • wait for <selector> for <seconds> seconds - Wait with timeout
  • wait <seconds> - Wait for duration

Examples:

wait for #success to appear
wait for .loading for 10 seconds
wait 5
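
The three wait forms differ only in whether they name an element and a timeout. The following Python sketch shows one way to tell them apart; it is not Nava's parser, `parse_wait` is a hypothetical name, and falling back to a 30000 ms timeout (the DEFAULT_TIMEOUT sample value) is an assumption.

```python
import re
from typing import Dict, Union


def parse_wait(command: str, default_timeout_ms: int = 30000) -> Dict[str, Union[str, int]]:
    """Map a wait phrase to either a fixed sleep or an element wait."""
    text = command.strip()
    # "wait 5": pause for a fixed number of seconds.
    match = re.fullmatch(r"wait (\d+(?:\.\d+)?)", text)
    if match:
        return {"kind": "sleep", "ms": int(float(match.group(1)) * 1000)}
    # "wait for <selector> for <seconds> seconds": element wait with a timeout.
    match = re.fullmatch(r"wait for (.+) for (\d+(?:\.\d+)?) seconds?", text)
    if match:
        return {
            "kind": "element",
            "selector": match.group(1),
            "timeout_ms": int(float(match.group(2)) * 1000),
        }
    # "wait for <selector> to appear": element wait with the default timeout.
    match = re.fullmatch(r"wait for (.+?)(?: to appear)?", text)
    if match:
        return {"kind": "element", "selector": match.group(1), "timeout_ms": default_timeout_ms}
    raise ValueError(f"not a wait command: {command!r}")
```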

Tab Management ✨ NEW

  • switch to tab <index> - Switch to browser tab (0-indexed)

Examples:

switch to tab 0
switch to tab 1

File Operations ✨ NEW

  • upload <filepath> to <selector> - Upload file to input
  • download - Capture download event

Examples:

upload /path/to/file.pdf to input[type="file"]
download

Form Filling (Smart Label Detection)

  • fill <field name> with <value> - Fill by label text
  • fill <selector> with <value> - Fill by CSS selector

Examples:

fill email with test@example.com
fill password with mypassword
fill #username with john_doe

Keyboard Actions

  • type <text> in <selector> - Type text
  • press <key> - Press a keyboard key

Examples:

type hello world in #search
press Enter

Data Extraction

  • extract links - Get all links from page
  • screenshot - Capture page screenshot (auto-saved to gallery)

Complex Multi-Step Workflows

Separate commands with commas to execute them in sequence:

Examples:

go to example.com, scroll down, hover over button, click menu

go to store.com, select "Blue" from #color, click add to cart, screenshot

go to form.com, fill name with Jane Doe, wait for #confirmation to appear, screenshot

go to github.com, scroll to bottom, extract links
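
Splitting on commas is easy to get wrong when a quoted option itself contains a comma. The sketch below shows one way a client could pre-split a chain before sending it to /api/execute-chain. It is not Nava's own parser, and whether Nava treats commas inside quotes this way is an assumption; `split_commands` is a hypothetical helper.

```python
from typing import List


def split_commands(chain: str) -> List[str]:
    """Split a comma-separated chain, keeping commas inside double quotes."""
    commands: List[str] = []
    current: List[str] = []
    in_quotes = False
    for char in chain:
        if char == '"':
            in_quotes = not in_quotes
            current.append(char)
        elif char == "," and not in_quotes:
            # An unquoted comma ends the current command.
            commands.append("".join(current).strip())
            current = []
        else:
            current.append(char)
    commands.append("".join(current).strip())
    # Drop empty pieces produced by stray or trailing commas.
    return [command for command in commands if command]
```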

🔧 Troubleshooting

Playwright Installation Issues

If Playwright browsers aren't installing:

npx playwright install --with-deps chromium

Memory Issues on Vercel

The configuration is optimized for the Vercel Hobby plan (2GB limit). On the Pro plan:

  1. Upgrade your Vercel plan
  2. Adjust vercel.json:
{
  "functions": {
    "app/api/**/*.ts": {
      "memory": 3008,
      "maxDuration": 300
    }
  }
}

Build Errors

If you encounter ESLint or TypeScript errors:

# Run build locally to see errors
pnpm run build

# Common fixes already implemented:
# - Suspense boundaries for useSearchParams
# - TypeScript strict mode compliance
# - ESLint rule compliance

API Authentication Issues

If API calls fail with 401:

  1. Check NAVA_API_KEY is set in .env.local
  2. Include x-api-key header in requests
  3. Or set NAVA_API_KEY=none to disable auth

📁 Project Structure

Nava/
├── app/                          # Next.js app directory (Web App)
│   ├── api/
│   │   ├── execute/
│   │   │   └── route.ts          # Single task execution API
│   │   ├── execute-chain/
│   │   │   └── route.ts          # Task chain execution API
│   │   ├── workflows/            # ✨ NEW
│   │   │   └── route.ts          # Workflow templates API
│   │   └── screenshots/          # ✨ NEW
│   │       └── route.ts          # Screenshot management API
│   ├── workflows/                # ✨ NEW
│   │   └── page.tsx              # Workflow management page
│   ├── screenshots/              # ✨ NEW
│   │   └── page.tsx              # Screenshot gallery page
│   ├── globals.css               # Global styles
│   ├── layout.tsx                # Root layout
│   └── page.tsx                  # Enhanced home page with all features
├── lib/                          # Web app utilities
│   ├── browser.ts                # Browser session management (enhanced)
│   ├── task-executor.ts          # Task parsing and execution (enhanced)
│   ├── workflow-manager.ts       # ✨ NEW - Workflow storage & operations
│   └── screenshot-manager.ts     # ✨ NEW - Screenshot storage & operations
├── public/
│   └── screenshots/              # ✨ NEW - Screenshot storage directory
├── middleware.ts                 # ✨ NEW - API authentication middleware
├── nava-cli/                     # Python CLI Tool
│   ├── cli.py                    # Main CLI entry point
│   ├── browser.py                # Browser automation core
│   ├── task_executor.py          # Task execution logic
│   ├── scheduler.py              # Task scheduling
│   ├── workflow.py               # Workflow management
│   ├── integrations.py           # External integrations
│   ├── api_server.py             # API server for CLI
│   ├── setup.bat                 # Windows setup script
│   ├── run.bat                   # Windows run script
│   └── README_PRO.md             # CLI documentation
├── .env.example                  # Environment variables template (enhanced)
├── next.config.js                # Next.js configuration
├── package.json                  # Web app dependencies
├── pnpm-lock.yaml                # pnpm lock file
├── tailwind.config.ts            # Tailwind CSS configuration
├── tsconfig.json                 # TypeScript configuration (strict mode)
├── vercel.json                   # Vercel deployment config (optimized)
├── netlify.toml                  # ✨ NEW - Netlify deployment config
├── QUICKSTART.md                 # Quick start guide
├── FEATURES.md                   # ✨ NEW - Detailed feature documentation
├── QUICK-REFERENCE.md            # ✨ NEW - Command cheat sheet
├── IMPLEMENTATION-COMPLETE.md    # ✨ NEW - Technical implementation details
├── NEW-FEATURES-README.md        # ✨ NEW - v2.0 feature overview
└── README.md                     # This file

🚀 Performance Optimization

1. Edge Runtime (Optional)

For faster cold starts, consider Edge runtime for non-browser tasks:

export const runtime = 'edge';

2. Caching

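Implement response caching for repeated read-only requests. In Next.js this route option applies to GET handlers such as /api/workflows; POST handlers such as /api/execute are not cached: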

export const revalidate = 60; // Cache for 60 seconds

3. Parallel Execution

Use the task chain API with parallel execution for multiple tasks.
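
One client-side way to get parallelism is to send several independent chains at the same time. The Python sketch below assumes that approach; it does not rely on any parallel option in the API itself. `run_chains_in_parallel` is a hypothetical helper, `send` is whatever function posts one chain to /api/execute-chain, and the worker count of 5 mirrors the MAX_CONCURRENT_SESSIONS sample value.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Any, Callable, Dict, List


def run_chains_in_parallel(
    chains: List[List[str]],
    send: Callable[[List[str]], Dict[str, Any]],
    max_workers: int = 5,
) -> List[Dict[str, Any]]:
    """Send each chain on its own thread and return results in input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves the order of the input chains.
        return list(pool.map(send, chains))
```

Tasks inside a single chain still depend on each other, so only chains that do not share browser state are good candidates for this.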

🔒 Security Considerations

✅ Implemented Security Features

  1. API Key Authentication: Optional middleware-based authentication (middleware.ts)
  2. CORS Configuration: Properly configured CORS headers
  3. Input Validation: TypeScript strict mode with comprehensive validation
  4. Environment Variables: Secure configuration management
  5. Type Safety: Full TypeScript coverage with strict mode

🔐 Production Security Checklist

  1. Set API Key: Always set a strong NAVA_API_KEY in production
  2. Enable HTTPS: Vercel provides automatic HTTPS
  3. Rate Limiting: Consider implementing rate limiting middleware
  4. Input Sanitization: Validate all user inputs before processing
  5. Error Handling: Ensure production error messages don't expose sensitive data
  6. Monitoring: Set up logging and monitoring for suspicious activity
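
For the rate limiting item, a sliding-window counter is one common approach. The sketch below is written in Python for illustration and is not part of Nava: `SlidingWindowLimiter` is a hypothetical class, and the default of 60 requests per 60 seconds mirrors the MAX_REQUESTS_PER_MINUTE sample value.

```python
import time
from collections import defaultdict, deque
from typing import Callable, Deque, Dict


class SlidingWindowLimiter:
    """Allow at most max_requests per client within a rolling time window."""

    def __init__(
        self,
        max_requests: int = 60,
        window_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._max = max_requests
        self._window = window_seconds
        self._clock = clock
        self._hits: Dict[str, Deque[float]] = defaultdict(deque)

    def allow(self, client_id: str) -> bool:
        """Record the request and return True if it is within the limit."""
        now = self._clock()
        hits = self._hits[client_id]
        # Forget requests that have fallen out of the window.
        while hits and now - hits[0] >= self._window:
            hits.popleft()
        if len(hits) >= self._max:
            return False
        hits.append(now)
        return True
```

In-memory counters like this are per process, so on serverless platforms each instance would keep its own counts; a shared store would be needed for a strict global limit.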

🛡️ API Security Best Practices

# Production .env.local
NAVA_API_KEY=use_a_strong_random_key_here_min_32_chars
NODE_ENV=production
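
A key of that strength can be generated with Python's standard library. A minimal sketch (`generate_api_key` is a hypothetical helper name):

```python
import secrets


def generate_api_key(num_bytes: int = 32) -> str:
    """Return a URL-safe random key built from num_bytes of randomness."""
    return secrets.token_urlsafe(num_bytes)
```

With the default of 32 random bytes the resulting string is longer than 32 characters, so it meets the minimum suggested above. Paste the output into NAVA_API_KEY.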

Always include the API key in requests:

curl -X POST https://your-app.vercel.app/api/execute \
  -H "Content-Type: application/json" \
  -H "x-api-key: your_secure_key" \
  -d '{"task": "go to github.com"}'

📈 Monitoring

Vercel Analytics

Vercel Analytics is integrated. View metrics at:

https://vercel.com/your-username/nava-web/analytics

Custom Logging

Add custom logging in API routes:

console.log('Task executed:', result);

📖 Additional Documentation

📘 User Guides

🔧 Technical Documentation

🖥️ CLI Documentation

🎯 Quick Links

  • Web Pages: Visit /workflows and /screenshots after running the app
  • API Endpoints: /api/execute, /api/execute-chain, /api/workflows, /api/screenshots
  • Environment Config: See .env.example for all configuration options

🤝 Contributing

  1. Fork the repository
  2. Create your feature branch (git checkout -b feature/amazing-feature)
  3. Commit your changes (git commit -m 'Add amazing feature')
  4. Push to the branch (git push origin feature/amazing-feature)
  5. Open a Pull Request

📄 License

MIT License - feel free to use this project for any purpose.

🆘 Support

For issues and questions:

  • Open an issue on GitHub
  • Check the troubleshooting section
  • Review Vercel deployment logs (for web app)
  • See CLI documentation for CLI-specific issues

🎉 Credits

Web Application

CLI Tool

Core


Made with ❤️ by Abdulmuiz44

⭐ Star this repo if you find it useful!
