A fork of "swapi" (https://github.com/Juriy/swapi) rewritten in Node.js and extended with natural language query support using RAG (Retrieval-Augmented Generation), backed by a vector database, OpenAI models, and LangChain tools to improve answer accuracy.
- Node.js with Express.js - RESTful API server
- LanceDB - Embedded vector database for semantic search
- OpenAI API - GPT models and text embeddings
- LangChain - AI agent system with specialized tools
- Swagger - API documentation
- Passport.js - Authentication
- JSON Server - Alternative data serving
- React.js - Modern web interface for natural language queries
- HTML/CSS - Classic web interface for API documentation
- Server-Sent Events (SSE) - Real-time progress updates
- JSON file storage - Star Wars data in database.json
- Vector embeddings - Semantic search capabilities
- Relationship mapping - Many-to-many entity relationships
- RESTful API - Access to all Star Wars data
- Swagger Documentation - Interactive API documentation at /api-docs
- CORS Support - Cross-origin requests enabled
- Natural Language Queries - Ask questions about Star Wars in plain English
- Embedded Vector Database - Uses LanceDB embedded in the Node.js server
- OpenAI Integration - Leverages GPT models for intelligent responses
- LangChain Agent System - Uses tools to make API calls for additional details
- Real-time Context - Provides relevant context sources for each answer
- Enhanced Accuracy - Combines vector search with direct API access
Access the main Star Wars data through these endpoints:
- Films: /api/films - All Star Wars movies
- Characters: /api/characters - All characters
- Planets: /api/planets - All planets
- Species: /api/species - All species
- Starships: /api/starships - All starships
- Vehicles: /api/vehicles - All vehicles
- Transports: /api/transports - All transport vehicles
Many-to-many relationships between entities:
URL | Junction table | Explanation |
---|---|---|
/api/characters/:character_id/films | characters_films | Films featuring a character |
/api/characters/:character_id/starships | starships_characters | Starships piloted by a character |
/api/characters/:character_id/vehicles | vehicles_characters | Vehicles used by a character |
/api/films/:film_id/characters | films_characters | Characters appearing in a film |
/api/films/:film_id/planets | films_planets | Planets featured in a film |
/api/films/:film_id/starships | films_starships | Starships appearing in a film |
/api/films/:film_id/species | films_species | Species featured in a film |
/api/films/:film_id/vehicles | films_vehicles | Vehicles appearing in a film |
/api/planets/:planet_id/characters | planets_characters | Characters from a planet |
/api/planets/:planet_id/films | films_planets | Films featuring a planet |
/api/species/:species_id/characters | species_characters | Characters of a species |
/api/species/:species_id/films | films_species | Films featuring a species |
/api/starships/:starship_id/characters | starships_characters | Pilots of a starship |
/api/starships/:starship_id/films | films_starships | Films featuring a starship |
/api/vehicles/:vehicle_id/characters | vehicles_characters | Drivers of a vehicle |
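To make the junction-table idea concrete, here is a minimal sketch of how a relationship endpoint such as /api/characters/:character_id/films could be resolved. The row shapes and field names (`character_id`, `film_id`, `id`, `title`) and the sample data are assumptions for illustration; the actual layout of database.json may differ.

```javascript
// Hypothetical data: field names and rows are illustrative assumptions,
// not the real contents of database.json.
const db = {
  films: [
    { id: 1, title: 'A New Hope' },
    { id: 2, title: 'The Empire Strikes Back' },
  ],
  characters_films: [
    { character_id: 1, film_id: 1 },
    { character_id: 1, film_id: 2 },
    { character_id: 2, film_id: 1 },
  ],
};

// Join through the junction table: collect the film IDs linked to the
// character, then keep only the films whose ID is in that set.
function filmsForCharacter(database, characterId) {
  const filmIds = new Set(
    database.characters_films
      .filter((row) => row.character_id === characterId)
      .map((row) => row.film_id)
  );
  return database.films.filter((film) => filmIds.has(film.id));
}

console.log(filmsForCharacter(db, 1).map((film) => film.title));
// → [ 'A New Hope', 'The Empire Strikes Back' ]
```

The same pattern applies in the other direction (for example, films to characters) by swapping which column is filtered and which is collected.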
Returns the system status including vector database initialization state.
Response:
{
"api": "running",
"vectorDatabase": "ready|not_initialized|error",
"timestamp": "2025-06-24T19:24:19.514Z"
}
Fetches available OpenAI models for the provided API key.
Request:
{
"apiKey": "sk-your-openai-api-key"
}
Response:
{
"models": [
{
"id": "gpt-4",
"name": "gpt-4",
"created": 1687882411
}
]
}
Process a natural language query using RAG.
Request:
{
"apiKey": "sk-your-openai-api-key",
"model": "gpt-3.5-turbo",
"query": "Who is Luke Skywalker?"
}
Response:
{
"query": "Who is Luke Skywalker?",
"answer": "Luke Skywalker is a main character...",
"context": [
{
"content": "This is a character from Star Wars...",
"metadata": {"entity_type": "characters", "name": "Luke Skywalker"},
"relevance": 0.95
}
],
"model": "gpt-3.5-turbo",
"timestamp": "2025-06-24T19:24:19.514Z"
}
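As a rough illustration of consuming this response on the client, the sketch below turns the `context` array into short source lines, highest relevance first. It relies only on the fields shown in the sample response above; the function name `summarizeSources` is hypothetical and not part of the project.

```javascript
// Formats the `context` array of a query response into readable lines.
// Missing fields fall back to placeholders instead of throwing.
function summarizeSources(response) {
  const context = Array.isArray(response.context) ? response.context : [];
  return [...context]
    .sort((a, b) => (b.relevance ?? 0) - (a.relevance ?? 0))
    .map((item) => {
      const { entity_type: type = 'unknown', name = 'unnamed' } = item.metadata ?? {};
      const score = typeof item.relevance === 'number' ? item.relevance.toFixed(2) : 'n/a';
      return `${name} (${type}), relevance ${score}`;
    });
}

// Sample taken from the response shape documented above.
const sampleResponse = {
  query: 'Who is Luke Skywalker?',
  answer: 'Luke Skywalker is a main character...',
  context: [
    {
      content: 'This is a character from Star Wars...',
      metadata: { entity_type: 'characters', name: 'Luke Skywalker' },
      relevance: 0.95,
    },
  ],
  model: 'gpt-3.5-turbo',
};

console.log(summarizeSources(sampleResponse));
// → [ 'Luke Skywalker (characters), relevance 0.95' ]
```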
- Swagger UI: /api-docs - Interactive API documentation
- Swagger JSON: /api/swagger.json - OpenAPI specification
- Natural Language Interface: /nlq - Web interface for natural language queries
- OpenAI API Key: You need a valid OpenAI API key that starts with sk-
- Internet Connection: Required for OpenAI API calls and embedding generation
- Start the Backend Server:
    cd server
    npm install
    npm start
  Server will run on http://localhost:3000
  The LanceDB vector database will be automatically created and embedded in the server - no separate database server setup required!
- Start the Frontend (Optional - for Natural Language Queries):
    cd app/swapi
    npm install
    npm start
  Frontend will run on http://localhost:3001 (or another available port)
- Use the Interface:
  - Classic API: Visit http://localhost:3000 for API documentation and endpoints
  - Natural Language Queries: Visit http://localhost:3001/nlq
    - Enter your OpenAI API key
    - Click "Load Models" to fetch available GPT models
    - Select a model from the dropdown
    - Ask your question about Star Wars
    - View the AI-generated response with context sources
- "Who is Luke Skywalker?"
- "Tell me about Darth Vader"
- "What characters are from Naboo?"
- "What movies is Luke Skywalker in?" (Enhanced with API tool calls)
- "What characters have red eyes?" (Enhanced with search tool)
- "Which characters have blue hair?"
- "Characters with yellow skin"
- "Who has brown eyes?"
- "What is the Death Star?"
- "Tell me about the planet Tatooine"
- "What movies feature Darth Vader?" (Enhanced with API tool calls)
- "What starships appear in A New Hope?" (Enhanced with API tool calls)
- "Who pilots the Millennium Falcon?"
- "What characters are from Tatooine?"
- "Which films feature the Death Star?"
- "What species appear in the original trilogy?"
Note: The system now uses LangChain agents with API tools to provide much more comprehensive and accurate answers by combining vector search with direct API calls to get complete, up-to-date information.
The system uses a hybrid approach combining vector search with intelligent API tool selection via LangChain agents:
- Vector Search - Initial context retrieval using semantic similarity from embedded Star Wars data
- LangChain Agent - Intelligent analysis and tool selection using OpenAI's function calling capabilities
- Custom API Tools - Direct access to detailed Star Wars data via 12 specialized tools (see detailed breakdown below)
- Comprehensive Response - Combines initial vector context with fresh, complete API data
User Query → Vector Search → Initial Context → LangChain Agent Analysis
↓
Agent Tool Selection ← Tool Descriptions ← Available Tools (12 custom tools)
↓
API Calls via Selected Tools → Fresh Data → Combined with Vector Context
↓
Comprehensive AI Response
- Accuracy: Direct API access ensures up-to-date, complete information
- Relationships: Tools can follow entity relationships (e.g., character → films → other characters)
- Attribute Search: Advanced filtering capabilities beyond basic vector similarity
- Structured Data: Returns properly formatted JSON data for reliable AI processing
- Error Handling: Robust error management ensures graceful degradation
The system includes 12 specialized tools built with LangChain's DynamicStructuredTool class, each designed for specific Star Wars data access patterns:
- get_character - Get detailed character information by ID
  - Schema: { id: number }
  - Purpose: Retrieves complete character data including name, physical attributes, homeworld, etc.
- get_film - Get detailed film information by ID
  - Schema: { id: number }
  - Purpose: Retrieves movie details including title, director, release date, opening crawl, etc.
- get_planet - Get detailed planet information by ID
  - Schema: { id: number }
  - Purpose: Retrieves planet data including climate, terrain, population, etc.
- get_starship - Get detailed starship information by ID
  - Schema: { id: number }
  - Purpose: Retrieves starship specifications including model, class, speed, capacity, etc.
- get_species - Get detailed species information by ID
  - Schema: { id: number }
  - Purpose: Retrieves species data including classification, language, lifespan, etc.
- get_vehicle - Get detailed vehicle information by ID
  - Schema: { id: number }
  - Purpose: Retrieves vehicle specifications including model, class, crew capacity, etc.
- get_character_films - Get all films featuring a specific character
- get_film_characters - Get all characters appearing in a specific film
- get_planet_characters - Get all characters from a specific planet (homeworld)
- get_starship_characters - Get all characters who pilot a specific starship
- get_species_characters - Get all characters belonging to a specific species
- search_characters - Advanced character attribute search
  - Schema: { attribute: string }
  - Purpose: Searches all characters for specific attributes like "red eyes", "blue hair", "yellow skin"
  - Supported Attributes: eye color, hair color, skin color, gender
  - Intelligence: Uses pattern matching to extract colors and attributes from natural language queries
// Built using LangChain's DynamicStructuredTool with Zod validation
import { DynamicStructuredTool } from '@langchain/core/tools';
import { z } from 'zod';
// Each tool follows this pattern:
export const getCharacterTool = new DynamicStructuredTool({
  name: 'get_character',
  description: 'Get detailed information about a Star Wars character by their ID...',
  schema: z.object({
    id: z.number().describe('The ID of the character to retrieve')
  }),
  func: async ({ id }) => {
    // API call implementation with error handling
  }
});
- Base URL: http://localhost:3000/api - All tools connect to the local SWAPI server
- HTTP Client: Uses Axios for reliable HTTP requests with automatic JSON parsing
- Response Format: All tools return JSON.stringify'd responses for consistent LangChain consumption
- Error Handling: Comprehensive try-catch blocks with descriptive error messages
The LangChain agent automatically selects appropriate tools based on:
- Query Analysis: Natural language processing to identify entities and relationships
- Context Awareness: Uses initial vector search results to determine which IDs to query
- Tool Descriptions: Rich descriptions guide the agent's tool selection decisions
- Schema Validation: Zod schemas ensure proper parameter types and validation
// Helper function extracts relevant entity IDs from vector search context
export function extractEntityIds(context) {
  const entityIds = {
    characters: [], films: [], planets: [],
    starships: [], species: [], vehicles: []
  };
  // Parses metadata from vector search results to identify relevant entities
}
- Network Error Handling: Graceful handling of API connection failures
- Invalid ID Handling: Descriptive error messages for non-existent entities
- Rate Limiting Awareness: Built to handle API rate limits without breaking agent flow
- Fallback Responses: Returns structured error messages that the agent can interpret and relay to users
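One way to produce structured error messages of this kind is sketched below. The checks on `error.response.status` and `error.code` follow the general shape of Axios errors; the function name `describeToolError` and the output fields (`error`, `retryable`, and so on) are hypothetical and not taken from the project, whose tools return plain error strings.

```javascript
// Converts a failed HTTP call into a JSON string the agent can interpret.
// Output field names are illustrative assumptions.
function describeToolError(entity, id, error) {
  const status = error?.response?.status;
  let kind = 'unknown';
  let retryable = false;

  if (status === 404) {
    kind = 'not_found'; // the requested ID does not exist
  } else if (status === 429) {
    kind = 'rate_limited'; // the caller may retry later
    retryable = true;
  } else if (status !== undefined) {
    kind = 'http_error'; // any other HTTP status
  } else if (error?.code) {
    kind = 'network'; // no response at all, e.g. ECONNREFUSED
    retryable = true;
  }

  return JSON.stringify({
    error: kind,
    entity,
    id,
    retryable,
    message: error?.message ?? String(error),
  });
}

console.log(describeToolError('character', 999, {
  response: { status: 404 },
  message: 'Request failed with status code 404',
}));
```

Returning a string rather than throwing keeps the agent loop running, so the model can report the failure or try a different tool.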
All custom tools follow a consistent implementation pattern using LangChain's DynamicStructuredTool:
import axios from 'axios';

// Base URL of the local SWAPI server
const BASE_URL = 'http://localhost:3000/api';

export const getCharacterTool = new DynamicStructuredTool({
  name: 'get_character',
  description: 'Get detailed information about a Star Wars character by their ID. Use this when you need more details about a specific character.',
  schema: z.object({
    id: z.number().describe('The ID of the character to retrieve')
  }),
  func: async ({ id }) => {
    try {
      const response = await axios.get(`${BASE_URL}/characters/${id}`);
      // Tools return strings, so the JSON payload is serialized for the agent
      return JSON.stringify(response.data, null, 2);
    } catch (error) {
      // Return the error as text so the agent can relay it instead of crashing
      return `Error fetching character ${id}: ${error.message}`;
    }
  }
});
The search_characters tool demonstrates sophisticated attribute parsing:
export const searchCharactersTool = new DynamicStructuredTool({
  name: 'search_characters',
  description: 'Search all characters to find those with specific attributes like eye color, hair color, etc.',
  schema: z.object({
    attribute: z.string().describe('The attribute to search for (e.g., "red eyes", "blue hair")')
  }),
  func: async ({ attribute }) => {
    // Intelligent attribute parsing with regex pattern matching
    const attributeLower = attribute.toLowerCase();
    if (attributeLower.includes('eye')) {
      const color = attributeLower.match(/(red|blue|green|yellow|brown|black|white|orange|purple|pink)/)?.[1];
      // Filter characters by eye color
    }
    // Similar logic for hair color, skin color, gender, etc.
  }
});
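The elided filtering step could look roughly like the sketch below. It covers only the color attributes (not gender), adds word boundaries to the color pattern so that "red" does not match inside longer words, and assumes the character records use the field names `eye_color`, `hair_color`, and `skin_color`; those names and the sample records are assumptions, not confirmed by the project.

```javascript
const COLOR_PATTERN = /\b(red|blue|green|yellow|brown|black|white|orange|purple|pink)\b/;

// Keyword in the query → character field (field names are assumed).
const FIELD_BY_KEYWORD = { eye: 'eye_color', hair: 'hair_color', skin: 'skin_color' };

// Turns "red eyes" into { field: 'eye_color', value: 'red' }, or null
// when no known keyword or color is present.
function parseAttribute(attribute) {
  const text = attribute.toLowerCase();
  const keyword = Object.keys(FIELD_BY_KEYWORD).find((k) => text.includes(k));
  const color = text.match(COLOR_PATTERN)?.[1];
  if (!keyword || !color) return null;
  return { field: FIELD_BY_KEYWORD[keyword], value: color };
}

// Keeps characters whose field lists the color. Values such as
// "brown, grey" are split so each listed color is matched individually.
function filterCharacters(characters, attribute) {
  const parsed = parseAttribute(attribute);
  if (!parsed) return [];
  return characters.filter((character) =>
    String(character[parsed.field] ?? '')
      .toLowerCase()
      .split(/[,\s]+/)
      .includes(parsed.value)
  );
}

// Hypothetical sample records.
const sampleCharacters = [
  { name: 'Character A', eye_color: 'red', hair_color: 'none' },
  { name: 'Character B', eye_color: 'blue', hair_color: 'brown, grey' },
];

console.log(filterCharacters(sampleCharacters, 'red eyes').map((c) => c.name));
// → [ 'Character A' ]
```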
// All tools exported as a single array for easy agent integration
export const swapiTools = [
  getCharacterTool, getCharacterFilmsTool, getFilmTool,
  getFilmCharactersTool, getPlanetTool, getPlanetCharactersTool,
  getStarshipTool, getStarshipCharactersTool, getSpeciesTool,
  getSpeciesCharactersTool, getVehicleTool, searchCharactersTool
];
// Used in agent initialization (from index.js):
const agent = await createToolCallingAgent({
  llm, tools: swapiTools, prompt
});
The system includes intelligent entity ID extraction from vector search results:
export function extractEntityIds(context) {
  const entityIds = { characters: [], films: [], planets: [], starships: [], species: [], vehicles: [] };
  context.forEach(item => {
    if (item.metadata && item.metadata.entity_type && item.metadata.entity_id) {
      const entityType = item.metadata.entity_type;
      const entityId = item.metadata.entity_id;
      if (entityIds[entityType] && !entityIds[entityType].includes(entityId)) {
        entityIds[entityType].push(entityId);
      }
    }
  });
  return entityIds;
}
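A short usage example may help show what the helper returns. The function is restated (without `export`) so the snippet runs on its own, and the sample context entries are made up for illustration. Duplicates, unknown entity types, and entries without metadata are skipped.

```javascript
// Restated from above so this snippet is self-contained.
function extractEntityIds(context) {
  const entityIds = { characters: [], films: [], planets: [], starships: [], species: [], vehicles: [] };
  context.forEach((item) => {
    if (item.metadata && item.metadata.entity_type && item.metadata.entity_id) {
      const entityType = item.metadata.entity_type;
      const entityId = item.metadata.entity_id;
      if (entityIds[entityType] && !entityIds[entityType].includes(entityId)) {
        entityIds[entityType].push(entityId);
      }
    }
  });
  return entityIds;
}

// Hypothetical vector search results.
const sampleContext = [
  { metadata: { entity_type: 'characters', entity_id: 1, name: 'Luke Skywalker' } },
  { metadata: { entity_type: 'films', entity_id: 1, name: 'A New Hope' } },
  { metadata: { entity_type: 'characters', entity_id: 1, name: 'Luke Skywalker' } }, // duplicate
  { metadata: { entity_type: 'transports', entity_id: 4 } }, // type not tracked
  { content: 'entry without metadata' },
];

console.log(extractEntityIds(sampleContext));
// → characters: [ 1 ], films: [ 1 ], all other lists empty
```

Note that the truthiness check on `entity_id` also skips an ID of 0, which only matters if the data uses zero-based IDs.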
Tools are integrated into the agent system with comprehensive guidance:
const prompt = ChatPromptTemplate.fromMessages([
['system', `You are an expert on Star Wars data with access to detailed information through API tools.
Available tools can help you get:
- Detailed character information and their films
- Film details and cast information
- Planet information and inhabitants
- Starship and vehicle details
- Species information and characters
- Search functionality for attribute-based queries
Use the tools strategically to provide the most complete and accurate answer possible.`],
['human', '{input}'],
['placeholder', '{agent_scratchpad}']
]);
- Uses LanceDB embedded in the Node.js server for storing and searching vector embeddings
- Data is automatically ingested from database.json on first query
- Uses OpenAI's text-embedding-3-small model for embeddings
- Stores metadata for entity type, names, and other searchable fields
- No external database server required - fully embedded solution
- Query Embedding: User query is converted to a vector embedding using OpenAI's text-embedding-3-small
- Vector Similarity Search: LanceDB performs semantic search to find initially relevant Star Wars data
- Context Analysis: System extracts entity IDs from vector search metadata using the extractEntityIds() helper
- LangChain Agent Initialization: Agent is created with access to all 12 custom tools and a context-aware prompt
- Intelligent Tool Selection: Agent analyzes query and context to determine which tools to use:
  - Single entity queries → Direct entity tools (get_character, get_film, etc.)
  - Relationship queries → Relationship mapping tools (get_character_films, etc.)
  - Attribute queries → Advanced search tool (search_characters)
- API Tool Execution: Selected tools make targeted API calls to local SWAPI endpoints
- Response Synthesis: OpenAI generates comprehensive response combining:
  - Initial vector search context (for semantic understanding)
  - Fresh API tool results (for accurate, complete data)
  - Tool execution metadata (for transparency)
- Structured Output: Response includes answer, context sources, tool usage, and relevance scores
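The vector similarity step can be pictured with the plain JavaScript sketch below, which ranks stored records by cosine similarity to a query vector. This is a conceptual illustration only: it does not use LanceDB's API, LanceDB's actual index and distance metric may differ, and the three-dimensional vectors are toy values rather than real text-embedding-3-small output.

```javascript
// Cosine similarity between two equal-length numeric vectors.
// Returns 0 when either vector has zero length to avoid dividing by zero.
function cosineSimilarity(a, b) {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denominator = Math.sqrt(normA) * Math.sqrt(normB);
  return denominator === 0 ? 0 : dot / denominator;
}

// Scores every record against the query and returns the k best matches.
function topK(queryVector, records, k) {
  return records
    .map((record) => ({ ...record, relevance: cosineSimilarity(queryVector, record.vector) }))
    .sort((x, y) => y.relevance - x.relevance)
    .slice(0, k);
}

// Toy records standing in for embedded Star Wars entries.
const sampleRecords = [
  { name: 'Luke Skywalker', entity_type: 'characters', vector: [0.9, 0.1, 0.0] },
  { name: 'Tatooine', entity_type: 'planets', vector: [0.1, 0.9, 0.0] },
  { name: 'Death Star', entity_type: 'starships', vector: [0.0, 0.1, 0.9] },
];

const bestMatches = topK([1, 0, 0], sampleRecords, 2);
console.log(bestMatches.map((r) => r.name));
// → [ 'Luke Skywalker', 'Tatooine' ]
```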
- API keys are not stored on the server
- Input validation for all parameters
- Rate limiting awareness and error handling
- CORS enabled for frontend communication
The system uses the existing Star Wars API data from database.json, which includes:
- Films
- Characters
- Planets
- Species
- Starships
- Vehicles
All data is processed and vectorized to enable semantic search capabilities.
The system includes comprehensive error handling for:
- Invalid API keys
- Rate limiting
- Model availability
- Vector database issues
- Network connectivity problems
- JSON parsing errors with detailed debugging information
- Vector database initialization errors: The system will automatically create the LanceDB database on first use. Check the console logs for any OpenAI API key or network issues.
- OpenAI API issues: Make sure your API key is valid and has sufficient credits
- Port conflicts: If port 3000 is in use, the server will fail to start. Use a different port:
    PORT=3001 npm start
- JSON parsing errors: If you see "Unexpected non-whitespace character after JSON", this usually indicates the database.json file has formatting issues or multiple JSON objects. Check the server console for detailed error information.
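To check a data file for this kind of problem outside the server, a small helper along these lines can be useful. It is a sketch, not part of the project: `checkJson` is a hypothetical name, and it takes the file contents as a string so it can be tried without touching the real database.json.

```javascript
// Tries to parse the text as a single JSON document and reports the result.
// The exact wording of the error message depends on the Node.js version.
function checkJson(text) {
  try {
    const value = JSON.parse(text);
    const topLevelKeys = value && typeof value === 'object' ? Object.keys(value) : [];
    return { ok: true, topLevelKeys };
  } catch (error) {
    return { ok: false, message: error.message };
  }
}

// A single valid document parses cleanly.
console.log(checkJson('{"films": []}'));

// Two objects back to back are not one valid JSON document.
console.log(checkJson('{"films": []}{"characters": []}'));
```

To run it against the real file, read the file into a string first (for example with `fs.readFileSync(path, 'utf8')`) and pass the result to `checkJson`.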
To extend or modify the system:
- Backend: Edit files in the /server directory
  - index.js: Main Express server with API endpoints and LangChain agent integration
  - langchain-tools.js: Custom LangChain tools - Add new tools or modify existing ones here
  - vector-db-setup.js: Vector database operations using LanceDB
- Frontend: Edit files in the /app/swapi/src directory
  - App.js: Main React component
  - App.css: Styling
- Testing: Use the provided test scripts or browser interface
To add new custom tools to the LangChain agent:
- Create New Tool in langchain-tools.js:
    export const getNewEntityTool = new DynamicStructuredTool({
      name: 'get_new_entity',
      description: 'Description of what this tool does',
      schema: z.object({
        id: z.number().describe('Parameter description')
      }),
      func: async ({ id }) => {
        // Implementation
      }
    });
- Add to Tools Array:
    export const swapiTools = [
      // ... existing tools
      getNewEntityTool
    ];
- Update Agent Prompt in index.js to describe the new tool's capabilities
- Test Tool Integration by running queries that would benefit from the new tool
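As a worked example of the Implementation step, the sketch below builds the `func` body for a hypothetical tool covering the /api/films/:film_id/planets endpoint from the relationship table, which has no dedicated tool among the existing 12. The names `relationshipUrl` and `makeFilmPlanetsFunc` are illustrative assumptions, and the HTTP client is passed in as a parameter so the logic can be exercised without a running server; in langchain-tools.js it would presumably be `axios.get`.

```javascript
const BASE_URL = 'http://localhost:3000/api';

// Builds the URL of a relationship endpoint, e.g. /api/films/1/planets.
// Rejects IDs that are not positive integers before any request is made.
function relationshipUrl(entity, id, related) {
  if (!Number.isInteger(id) || id < 1) {
    throw new Error(`Invalid ${entity} id: ${id}`);
  }
  return `${BASE_URL}/${entity}/${id}/${related}`;
}

// Returns an async function with the same contract as the existing tools:
// a JSON string on success, a descriptive error string on failure.
// `httpGet` is injected (hypothetical design choice) to keep this testable.
function makeFilmPlanetsFunc(httpGet) {
  return async ({ id }) => {
    try {
      const response = await httpGet(relationshipUrl('films', id, 'planets'));
      return JSON.stringify(response.data, null, 2);
    } catch (error) {
      return `Error fetching planets for film ${id}: ${error.message}`;
    }
  };
}

// Usage with a stub client standing in for axios.get.
const stubGet = async (url) => ({ data: [{ name: 'Tatooine', requestedFrom: url }] });
makeFilmPlanetsFunc(stubGet)({ id: 1 }).then((result) => console.log(result));
```

The returned function can then be supplied as the `func` property of a new DynamicStructuredTool and added to the swapiTools array as described in the steps above.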