diff --git a/.assets/perplexica-preview.gif b/.assets/perplexica-preview.gif
index 5dae084..61fe968 100644
Binary files a/.assets/perplexica-preview.gif and b/.assets/perplexica-preview.gif differ
diff --git a/.github/copilot-instructions.md b/.github/copilot-instructions.md
new file mode 100644
index 0000000..1d601a5
--- /dev/null
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,145 @@
+# Project Overview
+
+Perplexica is an open-source AI-powered search engine that uses advanced machine learning to provide intelligent search results. It combines web search capabilities with LLM-based processing to understand and answer user questions, similar to Perplexity AI but fully open source.
+
+## Architecture
+
+The system works through these main steps:
+
+- User submits a query
+- The system determines if web search is needed
+- If needed, it searches the web using SearXNG
+- Results are ranked using embedding-based similarity search
+- LLMs are used to generate a comprehensive response with cited sources
+
+## Architecture Details
+
+### Technology Stack
+
+- **Frontend**: React, Next.js, Tailwind CSS
+- **Backend**: Node.js
+- **Database**: SQLite with Drizzle ORM
+- **AI/ML**: LangChain + LangGraph for orchestration
+- **Search**: SearXNG integration
+- **Content Processing**: Mozilla Readability, Cheerio, Playwright
+
+### Database (SQLite + Drizzle ORM)
+
+- Schema: `src/lib/db/schema.ts`
+- Tables: `messages`, `chats`, `systemPrompts`
+- Configuration: `drizzle.config.ts`
+- Local file: `data/db.sqlite`
+
+### AI/ML Stack
+
+- **LLM Providers**: OpenAI, Anthropic, Groq, Ollama, Gemini, DeepSeek, LM Studio
+- **Embeddings**: Xenova Transformers, similarity search (cosine/dot product)
+- **Agents**: `webSearchAgent`, `analyzerAgent`, `synthesizerAgent`, `taskManagerAgent`
+
+### External Services
+
+- **Search Engine**: SearXNG integration (`src/lib/searxng.ts`)
+- **Configuration**: TOML-based config file
+
+### Data Flow
+
+1. User query → Task Manager Agent
+2. Web Search Agent → SearXNG → Content extraction
+3. Analyzer Agent → Content processing + embedding
+4. Synthesizer Agent → LLM response generation
+5. Response with cited sources
+
+## Project Structure
+
+- `/src/app`: Next.js app directory with page components and API routes
+  - `/src/app/api`: API endpoints for search and LLM interactions
+- `/src/components`: Reusable UI components
+- `/src/lib`: Backend functionality
+  - `lib/search`: Search functionality and meta search agent
+  - `lib/db`: Database schema and operations
+  - `lib/providers`: LLM and embedding model integrations
+  - `lib/prompts`: Prompt templates for LLMs
+  - `lib/chains`: LangChain chains for various operations
+  - `lib/agents`: LangGraph agents for advanced processing
+  - `lib/utils`: Utility functions and types including web content retrieval and processing
+
+## Focus Modes
+
+Perplexica supports multiple specialized search modes:
+
+- All Mode: General web search
+- Local Research Mode: Research and interact with local files with citations
+- Chat Mode: Have a creative conversation
+- Academic Search Mode: For academic research
+- YouTube Search Mode: For video content
+- Wolfram Alpha Search Mode: For calculations and data analysis
+- Reddit Search Mode: For community discussions
+
+## Core Commands
+
+- **Development**: `npm run dev` (uses Turbopack for faster builds)
+- **Build**: `npm run build` (includes automatic DB push)
+- **Production**: `npm run start`
+- **Linting**: `npm run lint` (Next.js ESLint)
+- **Formatting**: `npm run format:write` (Prettier)
+- **Database**: `npm run db:push` (Drizzle migrations)
+
+## Configuration
+
+The application uses a `config.toml` file (created from `sample.config.toml`) for configuration, including:
+
+- API keys for various LLM providers
+- Database settings
+- Search engine configuration
+- Similarity measure settings
+
+## Common Tasks
+
+When working on this codebase, you might need to:
+
+- Add new API endpoints in `/src/app/api`
+- Modify UI components in `/src/components`
+- Extend search functionality in `/src/lib/search`
+- Add new LLM providers in `/src/lib/providers`
+- Update database schema in `/src/lib/db/schema.ts`
+- Create new prompt templates in `/src/lib/prompts`
+- Build new chains in `/src/lib/chains`
+- Implement new LangGraph agents in `/src/lib/agents`
+
+## AI Behavior Guidelines
+
+- Focus on factual, technical responses without unnecessary pleasantries
+- Avoid conciliatory language and apologies
+- Ask for clarification when requirements are unclear
+- Do not add dependencies unless explicitly requested
+- Only make changes relevant to the specific task
+- Do not create test files or run the application unless requested
+- Prioritize existing patterns and architectural decisions
+- Use the established component structure and styling patterns
+
+## Code Style & Standards
+
+### TypeScript Configuration
+
+- Strict mode enabled
+- ES2017 target
+- Path aliases: `@/*` → `src/*`
+- No test files (testing not implemented)
+
+### Formatting & Linting
+
+- ESLint: Next.js core web vitals rules
+- Prettier: Use `npm run format:write` before commits
+- Import style: Use `@/` prefix for internal imports
+
+### File Organization
+
+- Components: React functional components with TypeScript
+- API routes: Next.js App Router (`src/app/api/`)
+- Utilities: Grouped by domain (`src/lib/`)
+- Naming: camelCase for functions/variables, PascalCase for components
+
+### Error Handling
+
+- Use try/catch blocks for async operations
+- Return structured error responses from API routes
diff --git a/.gitignore b/.gitignore
index 9fb5e4c..dd76d84 100644
--- a/.gitignore
+++ b/.gitignore
@@ -38,4 +38,6 @@
 Thumbs.db
 db.sqlite
 /searxng
+# AI stuff for planning and implementation
+.ai/
 certificates
\ No newline at end of file
diff --git a/README.md b/README.md
index 5eb0713..106a57e 100644
--- a/README.md
+++ b/README.md
@@ -1,22 +1,6 @@
 # 🚀 Perplexica - An AI-powered search engine 🔎
-
- Automatic Image Search + Automatic Suggestions
- Automatically search for relevant images in chat - responses + Automatically show related suggestions after responses
- Automatic Video Search -
-- Automatically search for relevant videos in chat - responses -
-+ {prompt.content} +
++ Optimization Mode +
++ Chat Model +
++ Chat Context Window Size +
+Model Name
- {Custom OpenAI API Key
- {Custom OpenAI Base URL
- { )} -+ No models available +
+ ); + })()} +OpenAI API Key
- {Ollama API URL
- {GROQ API Key
- { />+ OpenRouter API Key +
+Anthropic API Key
- {Gemini API Key
- {Deepseek API Key
- {AI/ML API Key
- { + setConfig((prev) => ({ + ...prev!, + aimlApiKey: e.target.value, + })); + }} + onSave={(value) => saveConfig('aimlApiKey', value)} + /> ++ AI/ML API Key +
+LM Studio API URL
- { )}{event.message}
+ )} + + {/* Display relevant details based on event type */} + {event.details && Object.keys(event.details).length > 0 && ( +- {suggestion} -
-+ {progress.message} +
+ {progress.subMessage && ( ++ {progress.subMessage} +
+ )} +