Architecture
Overview
This document provides a high-level overview of the architecture of our Large Language Model (LLM) application, which assists back-office employees by answering their queries intelligently. The application combines an LLM, document embeddings, semantic search, and caching to deliver fast, accurate answers grounded in company documents. Each component is containerized, enabling scalability, modularity, and ease of deployment.
Architecture Diagram
The following is a simplified view of the system’s core components and their interactions:
- User Interface layer
- API & Backend layer
- LLM & Embedding layer
- Data Management layer
- Document Processing layer
Components
1. User Interface layer
Web Frontend
- Service: NextJS
- Function: Provides the primary user interface through which employees interact with the assistant. The NextJS frontend is responsive and optimized for a smooth user experience.
Zalo Chatbot
- Service: Zalo API
- Function: Acts as an additional chat interface through Zalo, a popular messaging platform in Vietnam. Chatbot traffic is proxied through Nginx for better performance and security.
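As a rough illustration only, the sketch below shows how an incoming chat message might be received by the backend. The `/zalo/webhook` route, the payload fields, and the `answer_query` helper are hypothetical placeholders, not the actual Zalo API contract.

```python
# Hypothetical webhook receiver for chat messages. The payload shape is an
# assumption for illustration, not the real Zalo API schema.
from fastapi import FastAPI, Request

app = FastAPI()

async def answer_query(text: str) -> str:
    # Placeholder for the cache -> retrieval -> LLM pipeline described below.
    return "..."

@app.post("/zalo/webhook")
async def zalo_webhook(request: Request):
    event = await request.json()
    user_id = event.get("sender", {}).get("id")       # assumed field name
    text = event.get("message", {}).get("text", "")   # assumed field name
    answer = await answer_query(text)
    # The reply would be sent back to the user through the Zalo messaging API here.
    return {"user_id": user_id, "answer": answer}
```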
2. API & Backend layer
Web Server
- Service: Nginx
- Function: Operates as a reverse proxy, routing requests from the frontend and chatbot to the backend API server. Nginx provides an additional layer of security and load balancing, helping the system handle high request volumes.
API Server
- Service: FastAPI
- Function: The core backend server, handling API requests, managing cache checks, interfacing with the LLM, and orchestrating document retrieval. FastAPI’s asynchronous capabilities allow the API to handle multiple requests efficiently, making it suitable for real-time applications.
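A minimal sketch of how the query endpoint might be wired in FastAPI. The `/query` route, the request model, and the `check_cache` / `retrieve_context` / `generate_answer` helpers are illustrative names standing in for the components described in the following sections, not the actual codebase.

```python
# Minimal sketch of the query endpoint; the helpers are placeholders for the
# cache, retrieval, and LLM components described below.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class QueryRequest(BaseModel):
    question: str

async def check_cache(question: str) -> str | None:
    return None  # placeholder: Redis lookup (see LLM Cache)

async def retrieve_context(question: str) -> list[str]:
    return []    # placeholder: embedding + Qdrant search (see Vector Database)

async def generate_answer(question: str, context: list[str]) -> str:
    return ""    # placeholder: LLM call (see Large language model)

@app.post("/query")
async def query(req: QueryRequest):
    cached = await check_cache(req.question)
    if cached is not None:
        return {"answer": cached, "cached": True}
    context = await retrieve_context(req.question)
    answer = await generate_answer(req.question, context)
    return {"answer": answer, "cached": False}
```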
3. LLM & Embedding layer
Large language model
- Service: OpenAI API or a self-hosted LLM
- Function: The primary LLM for answering user queries. The model receives user queries (with context provided from relevant documents) and generates human-like responses.
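A sketch of how a query plus retrieved context could be sent to the model via the OpenAI Python client. The model name and prompt wording are assumptions; a self-hosted model would replace this call with its own client or an OpenAI-compatible endpoint.

```python
# Sketch of a context-augmented chat completion (model name is an assumption).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def generate_answer(question: str, context_chunks: list[str]) -> str:
    context = "\n\n".join(context_chunks)
    response = client.chat.completions.create(
        model="gpt-4o-mini",  # assumption; configure per deployment
        messages=[
            {"role": "system",
             "content": "Answer using only the provided company documents.\n\n" + context},
            {"role": "user", "content": question},
        ],
    )
    return response.choices[0].message.content
```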
Embedding model
- Service: OpenAI API or a self-hosted embedding model
- Function: Maps text to vector embeddings, a process crucial for similarity search. This component enables the system to find and rank relevant documents based on user queries. Choosing between the OpenAI API and a self-hosted model such as PhoBERT allows flexibility in handling multilingual or domain-specific embedding needs.
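A sketch of the embedding call using the OpenAI client; the model name is an assumption, and a self-hosted model (for example PhoBERT served behind its own API) would expose an equivalent text-to-vector function.

```python
# Sketch of text-to-vector embedding (model name is an assumption).
from openai import OpenAI

client = OpenAI()

def embed(text: str) -> list[float]:
    response = client.embeddings.create(
        model="text-embedding-3-small",  # assumption; any fixed-dimension embedding model works
        input=text,
    )
    return response.data[0].embedding
```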
4. Data Management layer
LLM Cache
- Service: Redis
- Function: Caches responses from the LLM based on query embeddings, minimizing redundant API calls. This significantly reduces response latency for frequently asked questions and optimizes API usage costs.
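A simplified caching sketch with redis-py. For brevity it keys the cache on a hash of the normalized query text rather than on embedding similarity, and the key prefix and TTL are assumptions.

```python
# Simplified exact-match cache (the real design keys on query embeddings;
# key prefix and TTL are assumptions).
import hashlib
import redis

r = redis.Redis(host="redis", port=6379, decode_responses=True)

def cache_key(question: str) -> str:
    return "llm:answer:" + hashlib.sha256(question.strip().lower().encode()).hexdigest()

def get_cached_answer(question: str) -> str | None:
    return r.get(cache_key(question))

def cache_answer(question: str, answer: str, ttl_seconds: int = 24 * 3600) -> None:
    r.setex(cache_key(question), ttl_seconds, answer)
```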
Vector Database
- Service: Qdrant
- Function: Stores vector embeddings generated from documents and supports semantic search. When a query is received, Qdrant enables the system to quickly find and return similar documents based on vector similarity.
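A sketch of a semantic search against Qdrant using the official Python client; the collection name and payload field are assumptions.

```python
# Sketch of a vector similarity search (collection name and payload field are assumptions).
from qdrant_client import QdrantClient

qdrant = QdrantClient(host="qdrant", port=6333)

def search_documents(query_vector: list[float], limit: int = 5) -> list[str]:
    hits = qdrant.search(
        collection_name="company_documents",  # assumption
        query_vector=query_vector,
        limit=limit,
    )
    return [(hit.payload or {}).get("text", "") for hit in hits]
```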
Relational Database
- Service: PostgreSQL
- Function: Manages structured data, including user information, permissions, and system metadata. This database is essential for tracking user access and managing query history and other operational data.
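A sketch of how query history might be modeled with SQLAlchemy; the table, column names, and connection URL are assumptions rather than the actual schema.

```python
# Sketch of a query-history table (table/column names and connection URL are assumptions).
from datetime import datetime
from sqlalchemy import DateTime, String, Text, create_engine
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class QueryLog(Base):
    __tablename__ = "query_log"

    id: Mapped[int] = mapped_column(primary_key=True)
    user_id: Mapped[str] = mapped_column(String(64), index=True)
    question: Mapped[str] = mapped_column(Text)
    answer: Mapped[str] = mapped_column(Text)
    created_at: Mapped[datetime] = mapped_column(DateTime, default=datetime.utcnow)

if __name__ == "__main__":
    engine = create_engine("postgresql+psycopg2://app:app@postgres:5432/app")  # assumption
    Base.metadata.create_all(engine)
```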
5. Document Processing layer
Document Storage
- Service: MinIO
- Function: Stores documents (in PDF format) sourced from the company’s website or other repositories. MinIO, an S3-compatible object storage solution, offers scalable and private storage for company documents.
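A sketch of storing and fetching a PDF with the MinIO Python client; the bucket name, object path, and credentials are assumptions.

```python
# Sketch of document upload/download against MinIO (bucket and credentials are assumptions).
from minio import Minio

minio_client = Minio(
    "minio:9000",
    access_key="minioadmin",   # assumption; use real credentials in deployment
    secret_key="minioadmin",
    secure=False,
)

BUCKET = "company-documents"  # assumption

if not minio_client.bucket_exists(BUCKET):
    minio_client.make_bucket(BUCKET)

# Upload a local PDF, then download it again for processing.
minio_client.fput_object(BUCKET, "policies/handbook.pdf", "handbook.pdf")
minio_client.fget_object(BUCKET, "policies/handbook.pdf", "/tmp/handbook.pdf")
```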
Background Task Queue
- Service: Celery
- Function: Orchestrates background tasks, including document fetching, processing, and embedding. This allows the system to handle these tasks asynchronously, ensuring that document indexing does not impact the main query processing flow.
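A sketch of the background task queue; the broker URL, task name, and processing steps are assumptions standing in for the real ingestion logic.

```python
# Sketch of a background processing task (broker URL and task name are assumptions).
from celery import Celery

celery_app = Celery("worker", broker="redis://redis:6379/0")

@celery_app.task
def process_document(object_name: str) -> None:
    """Placeholder: fetch the PDF from MinIO, extract and chunk its text,
    embed the chunks, and store the vectors in Qdrant."""
    ...

# The API or a scheduler enqueues work without blocking the query path, e.g.:
# process_document.delay("policies/handbook.pdf")
```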
Workflow
1. User Query:
   - A user submits a query through the Web Frontend or Zalo Chatbot.
2. Request Handling:
   - The query is routed via the Web Server (Nginx) to the API Server (FastAPI).
3. Cache Lookup:
   - The API Server checks the LLM Cache (Redis) for an existing response to the query.
   - If a cached response is found, it is returned to the user immediately, reducing latency and API costs.
4. Embedding and Document Retrieval:
   - On a cache miss, the query is embedded by the Embedding Model and sent to the Vector Database (Qdrant).
   - Qdrant performs a similarity search and returns the most relevant documents.
5. LLM Query:
   - The API Server forwards the user’s query and the relevant document context to the Large Language Model (OpenAI API or self-hosted) for response generation.
   - The response is then cached in Redis for future use.
6. Background Document Processing:
   - Periodically, Celery fetches new documents from the company’s website, processes them into vector embeddings using the Embedding Model, and stores these embeddings in the Vector Database for future query matching (a sketch of this step follows the list).
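The background ingestion step might look like the self-contained sketch below, which schedules a periodic Celery task that embeds new document chunks and upserts them into Qdrant. The schedule, collection name, and `embed` helper are assumptions for illustration.

```python
# Sketch of periodic document ingestion (schedule, collection name, and helpers are assumptions).
import uuid

from celery import Celery
from celery.schedules import crontab
from qdrant_client import QdrantClient
from qdrant_client.models import PointStruct

celery_app = Celery("worker", broker="redis://redis:6379/0")
celery_app.conf.beat_schedule = {
    "refresh-documents-nightly": {
        "task": "ingest.refresh_documents",
        "schedule": crontab(hour=2, minute=0),  # assumption: run nightly at 02:00
    },
}

qdrant = QdrantClient(host="qdrant", port=6333)

def embed(text: str) -> list[float]:
    return [0.0] * 1536  # placeholder dimension; replaced by the real Embedding Model

@celery_app.task(name="ingest.refresh_documents")
def refresh_documents() -> None:
    # Placeholder: list new PDFs in MinIO, extract their text, and split it into chunks.
    chunks = ["example chunk of document text"]
    for chunk in chunks:
        qdrant.upsert(
            collection_name="company_documents",  # assumption
            points=[PointStruct(id=str(uuid.uuid4()), vector=embed(chunk), payload={"text": chunk})],
        )
```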
Acknowledgements
This application builds on the following open-source projects:
- NextJS for a dynamic and responsive front-end.
- FastAPI for a high-performance, asynchronous API backend.
- Redis for caching responses and reducing latency.
- Qdrant as a vector database for semantic search.
- MinIO for scalable object storage of documents.
- Celery for managing background tasks, ensuring real-time user experience is not impacted by document processing.
This high-level architecture is designed to support a fast, scalable, and cost-effective solution for assisting employees with document-based query responses. It leverages modern microservices architecture principles, providing a robust foundation for further enhancements.