
Meeting Summary: CS 486-686 Lecture/Lab (Spring 2025)

Date: February 27, 2025
Time: 08:08 AM Pacific Time (US and Canada)
Meeting ID: 893 0161 6954


Quick Recap

In the session, Greg addressed several key topics:

  • Logistics & Schedule:
    • Discussed upcoming tasks and the class schedule that includes paper discussions, extended content on ForPaPy, and a transition to agent-related topics.
  • Project and Coding Concerns:
    • Addressed student decisions on working individually or in groups.
    • Tackled coding issues and shared troubleshooting steps for embedding models.
  • Performance and Experimentation:
    • Compared embedding models, specifically Ollama versus OpenAI.
    • Explored the potential of local embedding models and alternatives (e.g., Jina Embeddings, Transformers, and Voyage AI).
  • Future Directions:
    • Discussed plans for using AI embeddings and integrating the Ollama model.
    • Raised the importance of metadata (e.g., source file, function name, lines of code) for improving retrieval quality.

Next Steps

Both students and Greg have specific action items to complete:

For Students

  • Install Ollama:
    Install Ollama on their laptops.

  • Finalize Repository Setup:
    Pull the in-class repository and run the example using the Ollama embedding model locally.

  • Project Work Mode:
    Inform Greg by the end of the day whether they will work individually or as a team.

  • Prepare for RAG Discussion:
    Read and prepare to discuss the RAG (Retrieval-Augmented Generation) paper for Tuesday’s class.

  • Paper Preparation:
    Create a paper02.md file based on the provided template for the second paper discussion.

  • Investigate the Code Splitter:
    Understand how the code splitter functions and verify its context-providing capabilities.

  • Review RAG Example:
    Examine a linked example of a similar RAG system for code search.

  • Enhance RAG System:
    Consider adding metadata (e.g., source file, function name, lines of code) to improve the base RAG system.

  • Tune RAG Parameters:
    Think about effective parameter settings for the RAG system.
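
The splitter- and metadata-related action items above can be prototyped in a few lines. Below is a minimal sketch, not the course's actual splitter, of a function-level code splitter that attaches the metadata discussed in class (source file, function name, line range) to each chunk; the `split_code` name and metadata keys are illustrative:

```python
import ast

def split_code(source: str, file_path: str) -> list[dict]:
    """Split Python source into per-function chunks, attaching the
    metadata discussed in class: source file, function name, line range."""
    tree = ast.parse(source)
    chunks = []
    for node in ast.walk(tree):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            chunks.append({
                # The exact source text of the function definition
                "text": ast.get_source_segment(source, node),
                "metadata": {
                    "source_file": file_path,
                    "function_name": node.name,
                    "lines": (node.lineno, node.end_lineno),
                },
            })
    return chunks

example = '''def add(a, b):
    return a + b

def sub(a, b):
    return a - b
'''
chunks = split_code(example, "math_utils.py")
```

Each chunk's text would then be embedded, with the metadata stored alongside it so retrieved results can point back to the exact file and lines.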

For Greg

  • Website Update:
    Update the website with past content and include materials for week 4.

Below is a Mermaid diagram that outlines the workflow for these next steps:

```mermaid
flowchart TD
    A[Install Ollama on Laptop] --> B[Pull In-Class Repository]
    B --> C[Run Example with Ollama Embedding Model]
    C --> D[Decide: Team or Individual Project]
    D --> E[Notify Greg by End-of-Day]
    E --> F[Read and Prepare for RAG Paper Discussion]
    F --> G[Create paper02.md File Using Template]
    G --> H[Investigate Code Splitter Functionality]
    H --> I[Review RAG System Example for Code Search]
    I --> J[Consider Metadata Enhancements in RAG System]
    J --> K[Tune Parameters for RAG System]

    subgraph Website Update
      L[Greg Updates Website with Past Content & Week 4 Materials]
    end
```

Detailed Summary

Upcoming Tasks and Class Schedule

  • Logistics:
    Greg reviewed logistical matters and provided an overview of the upcoming tasks.

  • Project Decisions:
    Students must decide by the end of the day whether they will work individually or in groups.

  • Learning Materials:
    The website will be updated with new and past materials, and a second paper has been assigned for Tuesday’s discussion on RAG.

  • Upcoming Topics:
    Future sessions will cover agent discussions, fine-tuning, and preparation for the final group project.

  • Coding Troubleshooting:
    Resolution of previous coding issues was discussed.

Embedding Model Performance and Ollama

  • Model Focus:
    The discussion centered on embedding model performance, focusing on Ollama versus OpenAI.

  • Local Execution:
    Emphasis was placed on the benefits of running models locally rather than relying on external APIs.

  • Performance Concerns:
    Although Ollama was installed on a local machine, concerns were raised about its performance impact.

  • Exploration of Alternatives:
    Alternatives such as Jina Embeddings and Transformers were considered, along with the potential use of Voyage AI.

The following Mermaid diagram illustrates the embedding models assessment:

```mermaid
flowchart TD
    A[Ollama] --> B[Local Execution]
    A --> C[Performance Concerns]
    D[OpenAI] --> E[External API Reliance]
    F[Jina Embeddings & Transformers] --> G[Alternative Options]
    H[Voyage AI] --> I[Embedding Experimentation]
```

Python Program Troubleshooting and Setup

  • Development Process:
    Greg developed a Python program to interact with an embedding API.

  • Encountered Issues:
    • Incorrect URL endpoints.
    • Improper payload formats.
  • Resolution Steps:
    Adjustments were made to the code by fixing the model name and correcting the embedding endpoint. Once modified, the API provided the expected response.

  • Next Steps:
    All students are expected to set up and run Ollama locally following the troubleshooting process.
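
As a reference for that setup, here is a minimal sketch of calling Ollama's local embedding endpoint using only the standard library. The endpoint path and payload keys follow Ollama's REST API (`/api/embeddings` with `model` and `prompt`); the model name in the usage note below is an assumption:

```python
import json
import urllib.request

# Ollama's default local embedding endpoint
OLLAMA_URL = "http://localhost:11434/api/embeddings"

def build_request(model: str, text: str) -> dict:
    # The payload format the endpoint expects: "model" and "prompt" keys.
    # (A wrong key here produces the kind of error discussed in class.)
    return {"model": model, "prompt": text}

def get_embedding(model: str, text: str) -> list[float]:
    """POST to a locally running Ollama server and return the
    embedding vector from the JSON response."""
    body = json.dumps(build_request(model, text)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["embedding"]
```

With the Ollama server running and a model pulled (e.g., `ollama pull nomic-embed-text`), `get_embedding("nomic-embed-text", "hello world")` returns the embedding vector.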

Ollama Integration and Performance Discussion

  • System Integration:
    Greg detailed the process of connecting LlamaIndex with the Ollama embedding API.

  • Performance Analysis:
    Performance differences were noted between computer types, including Apple Silicon and Intel systems.

  • Local Setup Advancements:
    Initiatives to use LlamaIndex to run the Ollama embedding model locally were discussed.

  • Graphics Note:
    Earlier systems used integrated Intel graphics rather than the GPU, impacting performance.

Exploring Local Embedding Models and Alternatives

  • Experimentation Goals:
    Utilizing local embedding models aims to reduce API calls while facilitating experimentation.

  • Benchmarking:
    Different models were compared for speed and efficiency against OpenAI’s solution.

  • Voyage AI Integration:
    Greg set up the necessary API key and packages to test Voyage AI as an alternative.

  • Collaborative Efforts:
    The group shared insights and troubleshooting tips for implementing various embedding models.
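
A benchmarking harness for these speed comparisons can be sketched as follows. This is an illustrative outline, not the comparison actually run in class: `fake_embed` is a stand-in, and in practice `embed_fn` would wrap a call to Ollama, OpenAI, or Voyage AI.

```python
import time
import random

def benchmark(embed_fn, texts):
    """Time an embedding function over a batch and report throughput."""
    start = time.perf_counter()
    for t in texts:
        embed_fn(t)
    elapsed = time.perf_counter() - start
    return {"seconds": elapsed, "texts_per_sec": len(texts) / elapsed}

def fake_embed(text: str) -> list[float]:
    # Stand-in embedder returning a random 384-dimensional vector
    return [random.random() for _ in range(384)]

stats = benchmark(fake_embed, ["sample text"] * 100)
```

Running the same batch through each candidate model gives a rough texts-per-second figure for comparing local execution against the external APIs.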

Embedding Models and Sequence Limits

  • Challenges:
    The session highlighted challenges such as rate limits and sequence limits impacting model performance.

  • Proposals for Improvement:
    • Use AI embeddings.
    • Evaluate the Ollama model.
    • Improve retrieval quality by incorporating metadata (file path, function name, code lines).
  • Investigation:
    Students were tasked to explore the functionality of the code splitter and assess its contextual accuracy.
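
One common way to work around sequence limits, shown here as a sketch rather than the approach settled on in class, is to split text into overlapping windows so each piece fits under the model's limit. The version below counts characters for simplicity; real splitters count tokens.

```python
def chunk_text(text: str, max_len: int = 512, overlap: int = 64) -> list[str]:
    """Split text into overlapping windows of at most max_len characters,
    so each window fits under an embedding model's sequence limit."""
    if max_len <= overlap:
        raise ValueError("max_len must exceed overlap")
    chunks = []
    step = max_len - overlap  # each window starts this far past the last
    for start in range(0, len(text), step):
        chunks.append(text[start:start + max_len])
        if start + max_len >= len(text):
            break  # the final window already reaches the end of the text
    return chunks
```

The `overlap` keeps context that straddles a boundary from being lost to both neighboring chunks, at the cost of embedding some text twice.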

This flowchart outlines the process of addressing embedding model challenges:

```mermaid
flowchart TD
    A[Rate Limits & Sequence Issues] --> B[Performance Challenges]
    B --> C[Proposal: Use AI Embeddings]
    C --> D[Evaluate Ollama Model]
    D --> E[Enhance with Metadata]
    E --> F[Investigate Code Splitter Effectiveness]
```

Conclusion

The meeting concentrated on practical logistics, upcoming tasks, and technical challenges related to embedding models and system integrations. Greg provided thorough instructions and troubleshooting strategies, establishing a clear roadmap for both individual tasks and collaborative efforts. This approach aims to enhance the efficiency and performance of the course’s embedding systems while fostering proactive experimentation.
